Taming the Overmind: OpenAI recruits a team of AI overseers
The company wants to ensure that humans remain in control of AI.
Ilya Sutskever, Chief Scientist and one of the company’s co-founders, will lead the group. OpenAI predicts that AI with greater-than-human intelligence could emerge within a decade. If it does, it will not necessarily be benevolent, so ways to control and constrain such AI need to be explored.
Currently, OpenAI has no solution for managing a potentially superintelligent AI and preventing it from spiraling out of control. Today’s alignment methods, such as reinforcement learning from human feedback (RLHF), rely on humans’ ability to supervise AI. But humans will not be able to supervise AI systems that are much smarter than they are.
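At the core of RLHF is a reward model trained on human preference comparisons. A minimal sketch of the standard Bradley-Terry preference formulation (the function names here are illustrative, not OpenAI’s code): given scalar reward scores for two model responses, the probability that a rater prefers one over the other is a sigmoid of their score difference, and the reward model is trained to minimize the resulting negative log-likelihood.

```python
import math

def preference_probability(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry model: probability that a human rater prefers
    response A over response B, given scalar reward-model scores."""
    return 1.0 / (1.0 + math.exp(reward_b - reward_a))

def preference_loss(reward_a: float, reward_b: float) -> float:
    """Negative log-likelihood when the human preferred A; minimizing
    this pushes the reward model to score A above B."""
    return -math.log(preference_probability(reward_a, reward_b))

# Equal scores -> the model is indifferent (probability 0.5).
p_equal = preference_probability(1.0, 1.0)

# Scoring the preferred answer higher yields a smaller loss than
# scoring it lower, which is what gradient descent exploits.
good = preference_loss(2.0, 0.0)
bad = preference_loss(0.0, 2.0)
```

The point of the article’s observation is that this loop breaks down when humans can no longer reliably produce the preference labels the loss depends on.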
To address this, OpenAI is creating a new Superalignment team with access to 20% of the company’s compute. Together with scientists and engineers from OpenAI, as well as researchers from the company’s other divisions, the team will tackle the core technical problems of controlling superintelligent AI over the next four years.
The team will train AI systems using human feedback, train AI to help evaluate other AI systems, and ultimately build AI that can oversee neural networks so they don’t go off the rails. OpenAI’s hypothesis is that AI can do such research faster and better than humans can.
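The middle step, AI helping to evaluate other AI, can be sketched as a triage loop: an AI judge scores candidate outputs and escalates only low-confidence cases to humans. This is a hypothetical illustration, not OpenAI’s implementation; the `judge` callable stands in for a real evaluator model.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Verdict:
    answer: str
    score: float
    needs_human_review: bool

def ai_assisted_evaluation(
    answers: List[str],
    judge: Callable[[str], float],  # hypothetical AI judge: answer -> score in [0, 1]
    review_threshold: float = 0.6,
) -> List[Verdict]:
    """Score each candidate answer with an AI judge; escalate only
    low-scoring cases to human reviewers instead of having humans
    grade every output themselves."""
    return [
        Verdict(answer, score, score < review_threshold)
        for answer, score in ((a, judge(a)) for a in answers)
    ]

# Toy stand-in judge: longer answers score higher (placeholder for a real model).
toy_judge = lambda a: min(1.0, len(a) / 40)
verdicts = ai_assisted_evaluation(
    ["short", "a much longer and more detailed answer text"],
    toy_judge,
)
```

Under this scheme, human attention concentrates on the cases the AI evaluator is least sure about, which is the division of labor the next paragraph describes.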
Neural networks will work alongside people to make their own successors better aligned with human values. Human researchers will focus more on reviewing AI-produced alignment research than on doing it themselves.
OpenAI representatives acknowledge that using AI for evaluation can amplify inconsistencies, biases, or vulnerabilities in the systems being assessed. And the hardest parts of the AI control problem may turn out not to be engineering problems at all.