Some Strategy and Policy Ideas

ℹ️

This is a link post for the 2023 AI Governance Curriculum

As we have seen, future developments in AI may pose major risks. For the remainder of the course, we will focus on what to do about all this—how we can help society mitigate the risks and realize the benefits of AI. To begin, we will attempt to spell out some context on potential paths to impact.

We can think of AI governance as the challenge of ensuring that leading AI developers are willing and able to develop and deploy AI in safe and beneficial ways. As discussed earlier, one way that might fail to happen—and a major concern of the long-term AI governance field—is that global vulnerability to unilateral action, alongside incentives to cut corners, could make it likely for advanced AI to be deployed before safety problems are solved. (More generally, these factors could make it more likely for AI to be deployed in harmful ways before systems to mitigate harms are put into place.)

Considering the potential importance of cooperation for several approaches to these problems—especially if they are to be taken widely enough to make a big dent in risk—some AI governance work focuses on advancing various forms of cooperation on AI. From a different angle, some of the field focuses on shaping who leads in AI (e.g., informing actors, or boosting actors), sometimes with the aim of enabling the above approaches to be taken (e.g., by increasing leaders’ cautiousness, beneficence, ability to have competition constrained, [22] or lead size).[23]

Arguably, given deep strategic uncertainty, lack of expert consensus, and how under-explored many of these issues are, [24] the field cannot yet offer anything close to a comprehensive AI policy wish list. So instead, we will study a short, very incomplete, and mostly preliminary list of ideas. We hope that many readers help improve this current state of affairs, by working with others to help identify, refine, and realize promising ideas.

Core Readings

🔗

How to make the best of the most important century? (Karnofsky, 2021)

🔗

Racing through a minefield: the AI deployment problem (Karnofsky, 2022) (15 minutes)

🔗

Update on ARC's recent eval efforts (2023)

🔗

“International Security” and “AI Ideal Governance” [25] sections of “AI Governance: A Research Agenda” (Dafoe, 2018) (pages 42-51 in the PDF)

🔗

Filling gaps in trustworthy development of AI (Avin, et al., 2021)

🔗

Cooperative AI: machines must learn to find common ground (Dafoe et al., 2021) (15 minutes)

Additional Recommendations

🔗

Strategic Perspectives on Long-term AI Governance: Introduction (Maas, 2022)

Some more potential framings of AI safety problems from a governance angle:

🔗

The Vulnerable World Hypothesis (Bostrom, 2019)

🔗

Discussion with Eliezer Yudkowsky on AGI interventions (Yudkowsky and Anonymous, 2021)

🔗

Thinking About Risks From AI: Accidents, Misuse and Structure (Zwetsloot and Dafoe, 2019)—A short piece proposing three lenses on risks from AI

On prestige motivations in AI competition:

🔗

Emerging Technologies, Prestige Motivations and the Dynamics of International Competition (Barnhart, 2022)

🔗

AI, the space race, and prestige (Barnhart, 2021) (40 minutes)

On AI research publication norms:

🔗

The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse? (Shevlane and Dafoe, 2021)

🔗

Strategic Implications of Openness in AI Development (Bostrom, 2017)

🔗

Sharing Powerful AI Models (Shevlane, 2022)

On corporate self-regulation:

🔗

Why companies should be leading on AI governance (Leung, 2018)

🔗

Antitrust-Compliant AI Industry Self-Regulation (O’Keefe, 2021)

🔗

How technical safety standards could promote TAI safety (O’Keefe et al., 2022)

🔗

The Windfall Clause — sharing the benefits of advanced AI [video] (O’Keefe, 2020)

🔗

Ideal governance (for companies, countries and more) (Karnofsky, 2022)

On advancing certain kinds of AI:

🔗

Open Problems in Cooperative AI (Dafoe et al., 2020)

🔗

Truthful AI: Developing and governing AI that does not lie (Evans et al., 2021) (20 minutes)

🔗

Law-Following AI 3: Lawless AI Agents Undermine Stabilizing Agreements (O’Keefe, 2022)

Game theoretic models of AI competition:

🔗

Racing to the Precipice: a Model of Artificial Intelligence Development (Armstrong et al., 2013)

🔗

A Regulation Dilemma in Artificial Intelligence Development (Han et al., 2021)

Miscellaneous:

🔗

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments (Critch, 2022)

🔗

Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg (Belfield, 2022)

🔗

A Tour of Emerging Cryptographic Technologies: What They Are and How They Could Matter (Garfinkel, 2021)

🔗

The role of existing institutions in AI strategy (Leung and Baum, 2018)

🔗

Open-source learning — a bargaining approach [video] (Clifton, 2019)

🔗

Regulatory Markets for AI Safety (Clark and Hadfield, 2019) (and associated podcast episode)

🔗

Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims (Brundage et al., 2020)

🔗

Why Responsible AI Development Needs Cooperation on Safety (Askell et al., 2019) (summary of their paper on The Role of Cooperation in Responsible AI Development)

🔗

Leo Szilard and the danger of nuclear weapons: a case study in risk mitigation (Grace, 2015) (60 minutes)

🔗

Propositions Concerning Digital Minds and Society (Bostrom and Shulman, 2022)

🔗

Why and How Governments Should Monitor AI Development (Whittlestone and Clark, 2021)

[22] As hypothetical examples, leading AI developers may be more able to have competition among them constrained if they are all in one jurisdiction (so that just one government can regulate them), or if they are all in a few countries that are open to international coordination.

[23] One way of thinking about this is that, if an AI developer has a bigger lead, they may be more comfortable taking the time for responsible AI development (rather than rushing and cutting corners).

[24] See e.g., Dafoe, 2018; Muehlhauser, 2021; Karnofsky, 2021 for discussion of the preliminary nature of some of these ideas.

[25] People sometimes read “ideal governance” as referring to perfect governance, but it may be more useful to interpret it as referring to good governance.

Some Strategy and Policy Ideas

Core Readings

Additional Recommendations

Next in the AI Governance curriculum

Topics