OpenAI, one of the prominent players in artificial intelligence research, is forming a dedicated team to handle what might be one of the most significant challenges in AI – steering and controlling 'superintelligent' AI systems. This team will be led by none other than the company's co-founder and chief scientist, Ilya Sutskever.
In a blog post penned by Sutskever and Jan Leike, a lead on the alignment team at OpenAI, they propose that AI systems with intelligence surpassing that of humans could become a reality within this decade. However, the duo stresses the potential risks, acknowledging that these advanced AI systems might not always be benevolent. Consequently, research to control and restrict these superintelligent entities becomes imperative.
They underline the shortcomings of current AI alignment techniques such as reinforcement learning from human feedback (RLHF), which rely heavily on human supervision. The problem is that humans may not be able to effectively oversee AI systems significantly smarter than themselves, hence the need for novel approaches to keep such AI from going astray.
Addressing these concerns, OpenAI is setting up a new Superalignment team, whose primary objective over the next four years is to solve the core technical challenges of controlling superintelligent AI. The team, granted access to 20% of the compute the company has secured to date, comprises members of OpenAI's previous alignment division along with researchers from other teams across the organization.
Sutskever and Leike are looking to build a "human-level automated alignment researcher," with the overarching goal of training AI systems on human feedback and enabling AI to assist in evaluating other AI systems. Ultimately, they aim to develop AI capable of conducting alignment research itself – that is, research into ensuring that AI systems pursue their intended goals and achieve intended results.
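To make the idea of AI assisting in evaluation concrete, here is a deliberately simplified sketch: an automated "judge" scores candidate answers so that humans only need to audit the judge's verdicts rather than review every output. The function names and the trivial scoring heuristic are invented for illustration; they do not represent OpenAI's actual methods, where the judge would itself be a trained model.

```python
# Toy sketch of AI-assisted evaluation ("scalable oversight"):
# an automated judge ranks candidate answers, and humans spot-check
# the judge rather than grading every answer themselves.
# The heuristic below is a placeholder for a trained judge model.

def judge_score(answer: str) -> float:
    """Stand-in for an AI judge: rewards answers with more distinct words.
    In practice this would be a learned reward or critique model."""
    words = answer.split()
    return len(set(words)) / max(len(words), 1)

def rank_answers(answers: list[str]) -> list[str]:
    """Order candidate answers by the judge's score, best first."""
    return sorted(answers, key=judge_score, reverse=True)

candidates = [
    "Paris Paris Paris",
    "The capital of France is Paris",
]
ranked = rank_answers(candidates)
# A human reviewer would now audit only the judge's top pick,
# instead of reading every candidate answer.
```

The key design point this sketch illustrates is leverage: human effort shifts from evaluating outputs directly to verifying the evaluator, which is what makes oversight plausible at scales where humans cannot check everything.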
OpenAI believes that AI can conduct alignment research faster and more efficiently than humans. The envisioned process has AI systems progressively taking over alignment work, devising improved alignment techniques, and working collaboratively with humans to ensure their successors are better aligned with human interests. However, the team also recognizes potential limitations, such as the risk that relying on AI for evaluation could scale up that AI's own inconsistencies, biases, or vulnerabilities.
Despite these concerns, OpenAI considers the effort worthwhile and necessary. The company views this initiative as fundamentally a machine learning problem and intends to share the results of this project broadly. They also see contributing to the alignment and safety of non-OpenAI models as a crucial part of their mission.