What’s Covered?
This report from UC Berkeley’s Center for Long-Term Cybersecurity offers a structured approach to one of the most difficult questions in AI governance: what counts as an intolerable risk, and how should thresholds be set so that such outcomes are prevented before they occur? The team focuses on frontier AI (models with advanced general-purpose capabilities) and identifies how these systems can enable harm through misuse, systemic failure, or unintended downstream consequences.
The paper begins with a detailed background analysis of intolerable risks and the logic behind creating “thresholds” to preempt them. It looks at prior frameworks from cybersecurity, nuclear regulation, and dual-use research to help define what “intolerable” means. Risks are grouped into several categories:
- Misuse risks, such as enabling chemical, biological, radiological, and nuclear (CBRN) attacks or cyberweapons.
- Systemic or cascading harms, including misinformation, economic disruption, and automated discrimination.
- Manipulation risks, such as deceptive model behavior or persuasion at scale.
Section 3 delivers seven key principles for operationalizing thresholds, such as designing with wide safety margins, comparing risks to meaningful baselines, and combining qualitative and quantitative risk estimates. These are framed not as abstract ideals but as practical decision-making tools.
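To make these principles slightly more concrete, here is a minimal, purely hypothetical sketch (in Python; the measurements, numbers, margin factor, and decision rule are all assumptions, not taken from the paper) of how a wide safety margin against a meaningful baseline could translate into a pre-deployment evaluation gate:

```python
from dataclasses import dataclass


@dataclass
class ThresholdCheck:
    baseline_success_rate: float   # e.g., red-teamers succeeding with only public resources
    model_success_rate: float      # same task, assisted by the frontier model
    intolerable_uplift: float      # uplift judged intolerable (set by policy, not here)
    margin_factor: float = 0.5     # act well before the line: escalate at 50% of the limit

    def uplift(self) -> float:
        """Absolute increase in task success attributable to the model."""
        return self.model_success_rate - self.baseline_success_rate

    def decision(self) -> str:
        """Return an ex ante action rather than a post hoc judgment."""
        if self.uplift() >= self.intolerable_uplift:
            return "halt: intolerable threshold crossed"
        if self.uplift() >= self.intolerable_uplift * self.margin_factor:
            return "pause: inside the safety margin, escalate for qualitative review"
        return "proceed: keep monitoring against the baseline"


# Hypothetical numbers: 5% baseline success, 17% with model assistance,
# and 20 percentage points of uplift deemed intolerable.
check = ThresholdCheck(0.05, 0.17, 0.20)
print(check.decision())  # "pause: ..." because 12 points of uplift exceeds half the limit
```

The escalation branch that hands borderline cases to qualitative review is the code-level analogue of the paper’s call to combine quantitative estimates with expert judgment rather than rely on a single number.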
Section 4 then proposes threshold recommendations for specific categories of intolerable outcomes. For example, a model that can reliably assist in building a bioweapon or deceive evaluators during alignment tests would exceed the threshold. The goal is to build a culture of ex ante action: prohibiting, pausing, or modifying development before a capability crosses into unacceptable territory.
Three case studies show how this could work in practice:
- CBRN capabilities (e.g., fine-tuned LLMs helping adversaries create nerve agents)
- Evaluation deception (e.g., models gaming safety benchmarks)
- AI-generated misinformation (e.g., large-scale political influence operations)
These cases illustrate how thresholds might be implemented within industry-level evaluations or national safety frameworks.
💡 Why It Matters?
This is one of the most pragmatic efforts yet to bridge long-term AI risk thinking with concrete governance strategies. As public agencies and companies scramble to build AI safety evaluations, this paper makes a strong case for integrating risk thresholds into both design and deployment—not just audits and assessments. It also shifts the discussion from speculative doom to real-world policy options.
What’s Missing?
The paper avoids directly naming companies or models, which might make it feel less urgent for frontline developers. It also doesn’t provide numerical examples of thresholds—how many red team hits is too many? What constitutes a “substantial” increase in risk? That’s partly intentional (each use case is different), but it may leave implementers asking, “OK, but where’s the line for us?” Finally, there’s less emphasis on governance infrastructure—who sets these thresholds and ensures compliance?
Best For:
Policy teams in AI companies, regulatory bodies working on frontier model oversight, and research teams developing evaluation methodologies. Also useful for civil society stakeholders pushing for enforceable safety standards. If your job is to operationalize “AI safety,” this paper helps clarify what that should look like.
Source Details:
Full Citation:
Deepika Raman, Nada Madkour, Evan R. Murphy, Krystal Jackson, Jessica Newman. Intolerable Risk Threshold Recommendations for Artificial Intelligence: Key Principles, Considerations, and Case Studies to Inform Frontier AI Safety Frameworks for Industry and Government. UC Berkeley Center for Long-Term Cybersecurity, February 2025.
Context: Developed by the UC Berkeley Center for Long-Term Cybersecurity in collaboration with CSET and a wide range of experts from academia, government, and civil society.
Author Credentials:
- Jessica Newman: Founding director of Berkeley’s AI Security Initiative; her work focuses on AI governance and international coordination.
- Evan Murphy: CSET affiliate and policy researcher focused on technical AI safety.
- Deepika Raman, Krystal Jackson, and Nada Madkour contribute policy and cyber-technical perspectives from law, ethics, and national security.
The paper was shaped by workshops with top experts from the AI safety ecosystem—like Stuart Russell, Chris Meserole, and Florence G’sell—giving it weight in both academic and policy circles.