OpenAI is opening a new alignment research division focused on developing training techniques to stop superintelligent AI (artificial intelligence that could outthink humans and become misaligned with human ethics) from causing serious harm.
“Currently, we don’t have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue,” Jan Leike and Ilya Sutskever wrote in a blog post for OpenAI, the company behind ChatGPT, the best-known generative AI chatbot. They added that although superintelligence might seem far off, some experts believe it could arrive this decade.
Current techniques for aligning AI include reinforcement learning from human feedback, but Leike and Sutskever said that once AI systems become smarter than humans, people will no longer be able to supervise the technology reliably.
“Current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs,” they wrote.
Sutskever is a co-founder and chief scientist at OpenAI, and Leike is a machine learning researcher. They will co-lead OpenAI’s new superalignment team. To carry out its mission, the division will have access to 20% of the company’s processing capacity over the next four years, which it will use to build a “human-level automated alignment researcher” that can be scaled up to supervise superintelligence.
To align the automated researcher with human ethics, Leike and Sutskever said the team will take a three-step approach: develop a scalable training method, validate the resulting model, and stress test the entire alignment pipeline.
“We expect our research priorities will evolve substantially as we learn more about the problem and we’ll likely add entirely new research areas,” they wrote, adding there were plans to share more of the division’s roadmap in the future.
OpenAI acknowledges the need to mitigate potential AI harm
This isn’t the first time OpenAI has publicly acknowledged the need to mitigate the risks posed by unregulated AI. In May, company CEO Sam Altman signed an open letter stating that controlling the tech should be a top global priority as the evolution of AI could lead to an extinction event.
“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war,” the letter read.
OpenAI also has a dedicated section on its website, where the public can access materials related to what the company calls the development of “safe and responsible AI,” alongside a charter outlining the principles it adheres to in order to execute its mission. However, these largely relate to the concept of artificial general intelligence (AGI) — highly autonomous systems that outperform humans at most economically valuable work.
“We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome,” the charter, which was published in 2018, reads.