OpenAI announces Superalignment team

OpenAI has announced a Superalignment team and a four-year project to create an automated alignment researcher. They believe superintelligence (an AI more intelligent than humans) is possible within a decade and that we therefore need to accelerate research into alignment. They believe developing an AI alignment researcher that is itself an AGI will give them a way to scale up and “iteratively align superintelligence.” In other words, they want to set an AI to aligning more powerful AIs.

Alignment is an approach to AI safety that tries to develop AIs so they act as we would want and expect them to. The idea is to make sure that right out of the box AIs would behave in ways aligned with our values.

Needless to say, there are issues with this approach, as Aaron Snoswell outlines in a nice piece in The Conversation, “What is ‘AI alignment’? Silicon Valley’s favourite way to think about AI safety misses the real issues.”

First, and importantly, OpenAI has to figure out how to align an AGI so that it can tune the superintelligences to come.
You can’t get superalignment without alignment, and we don’t really know what that is or how to get it. There is no consensus as to what our values should be, so any alignment would have to be to some particular ethical position.
Why is OpenAI focusing only on superalignment? Why not try a number of the approaches from promoting regulation to developing more ethical training datasets? How can they be so sure about one approach? What do they know that we don’t? Or … what do they think they know?
Snoswell believes we should start by “acknowledging and addressing existing harms”. There are plenty of immediate difficult problems that should be addressed rather than “kicking the meta-ethical can one block down the road, and hoping we don’t trip over it later on.”
Technical safety isn’t a problem that can be solved. It is an ongoing process of testing and refining, as this Tweet from Yann LeCun puts it.

Anyway, I wish them well. No doubt interesting research will come out of this initiative which I hope OpenAI will share. In the meantime the rest of us can carry on with the boring safety research.