"The government takes the long term risk of non-aligned Artificial General Intelligence, and the unforeseeable changes that it would mean for the UK and the world, seriously." - UK National AI Strategy
Resources
Why AI alignment could be hard with modern deep learning - Cotra (2021) (20 mins)
This article introduces two ways in which modern AI techniques may create misaligned AI systems.
Specification gaming: the flip side of AI ingenuity (Krakovna et al., 2020) (15 mins)
DeepMind researchers elaborate on the difficulty of adequately specifying objectives for AI systems. Failures to solve this may lead to what the first reading refers to as “sycophants.”
“Inner Alignment: Explain like I'm 12 Edition” (Harth, 2020) (15 mins)
This piece elaborates on the risk of “deceptive alignment,” another potential challenge in aligning AI systems with human values. It may lead to what the first reading refers to as “schemers.”
AI alignment landscape (Christiano, 2020) (30 mins)
This talk outlines a strategic landscape of how people (including governance people) can contribute to solving AI alignment.