Introduction to AI Safety, Ethics, and Society

This is the book you’d hand to someone serious about understanding AI risk but unsure where to start. With clarity and precision, it lays out how AI could cause major harm—through misalignment, misuse, or sheer scale—and what we can do about it.

What’s Covered?

Dan Hendrycks pulls together a range of disciplines—machine learning, ethics, economics, complex systems—to offer a clear and structured introduction to the full spectrum of AI risk. The book is split into three sections: Societal-Scale Risks, Safety, and Ethics & Society, each with dozens of subtopics that unpack both technical and social fault lines.

It opens with scenarios where AI could go wrong on a massive scale:

  • Malicious use, such as AI-assisted bioterrorism or large-scale persuasive manipulation
  • Racing dynamics, where developers or states cut corners on safety to stay ahead
  • Organizational failure, including unprepared institutions and poor risk culture
  • Rogue AI, exploring misaligned goals, power-seeking behavior, and deceptive systems

The second section dives into technical safety: monitoring opaque systems, robustness to adversarial attacks, proxy gaming, and tail-risk scenarios like treacherous turns. Hendrycks connects these to safety engineering principles like redundancy, defense-in-depth, and risk decomposition. The book also explores why deep learning models should be treated as complex systems rather than mere code artifacts, highlighting systemic failure modes that can't be fixed with better training data alone.
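To make proxy gaming concrete, here's a minimal toy sketch (my illustration, not the book's; the 2D setup, target, and proxy are all invented) of how selecting hard on a correlated proxy drifts away from the true objective:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented setup: each candidate "policy" is a point in 2D. The true
# objective rewards being near a target; the proxy just rewards larger
# coordinates, which tracks the true objective only for small values.
target = np.array([1.0, 1.0])

def true_score(x):
    return -np.sum((x - target) ** 2)  # higher is better; max at target

def proxy_score(x):
    return float(x.sum())  # crude, correlated stand-in

candidates = rng.normal(size=(10_000, 2))
by_proxy = max(candidates, key=proxy_score)  # select on the proxy
by_truth = max(candidates, key=true_score)   # select on the real goal

print("proxy-selected:", by_proxy, "true score:", true_score(by_proxy))
print("truth-selected:", by_truth, "true score:", true_score(by_truth))
```

The proxy-selected candidate lands far from the target while scoring highest on the proxy, the same divergence-under-optimization pattern the book files under proxy gaming.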

The final section deals with moral uncertainty, governance, fairness, economic dynamics, and coordination failures. It walks through game theory, social welfare models, AI compute governance, and institutional design—providing an accessible yet rigorous framework to analyze the challenges of aligning advanced AI with human goals. The inclusion of evolutionary pressures and collective action dilemmas offers a rare systems-thinking angle that’s missing in most ML or law-focused AI books.
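To see why those coordination failures resist easy fixes, the racing dynamic from the first section can be restated as a two-player game. This sketch is mine rather than the book's, and the payoff numbers are purely illustrative:

```python
# Invented payoff matrix: each lab chooses to invest in safety ("safe")
# or cut corners ("race"). Tuples are (player 0 payoff, player 1 payoff).
payoffs = {
    ("safe", "safe"): (3, 3),   # both careful: shared benefit
    ("safe", "race"): (0, 4),   # the racer captures the market
    ("race", "safe"): (4, 0),
    ("race", "race"): (1, 1),   # both cut corners: worst joint outcome
}

def best_response(their_choice: str) -> str:
    """Player 0's payoff-maximizing action given the rival's choice."""
    return max(("safe", "race"),
               key=lambda mine: payoffs[(mine, their_choice)][0])

for theirs in ("safe", "race"):
    print(f"if the rival plays {theirs!r}, best response is {best_response(theirs)!r}")
```

Racing is the best response whatever the rival does, yet mutual racing leaves both labs worse off than mutual caution, which is the structure behind many of the collective action dilemmas the book analyzes.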

There are helpful summaries and reading lists at the end of each chapter. Though grounded in frontier risk, the book doesn’t feel alarmist. It’s more like a guidebook for people who want to build, shape, or regulate safe AI—rooted in both scientific caution and institutional realism.

💡 Why It Matters

This is one of the most integrated accounts of AI safety available in 2025. It builds a shared conceptual foundation for developers, ethicists, and policymakers to talk to each other—not just within silos. With AI deployment accelerating in high-stakes sectors, that shared language matters more than ever. Hendrycks’ focus on both technical mechanisms and social context helps bridge a key gap in many policy efforts.

What’s Missing?

The book offers a deep and broad introduction but stays light on institutional case studies. We hear about government, corporate, and international governance in the abstract, but there are few concrete examples of how actors are currently responding, or failing to respond, to the risks outlined. The book also leans toward catastrophic risk, giving less attention to ongoing, cumulative harms like surveillance, disinformation, or labor displacement, which may leave readers working on those issues wanting more.

Another gap: the ethical frameworks are pluralistic but don’t deeply confront tensions between them. The book encourages a moral parliament model, but sidesteps how messy, contested, or politically fraught real-world implementation would be. Lastly, there’s little attention to how race, gender, or inequality shape both AI development and its impacts—something many ethics scholars would flag as a major omission.
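For readers wondering what a moral parliament looks like mechanically, here is one possible aggregation rule, sketched with invented delegates, credences, and stances (the book treats the model conceptually; this particular credence-weighted scoring is an assumption of mine):

```python
# Invented example: each delegate represents an ethical theory with a
# credence (weight) and a stance on each option, scored in [-1, 1].
delegates = {
    "utilitarian":   {"weight": 0.40, "stances": {"deploy": 0.6,  "pause": -0.2}},
    "deontological": {"weight": 0.35, "stances": {"deploy": -0.8, "pause": 0.5}},
    "virtue_ethics": {"weight": 0.25, "stances": {"deploy": -0.1, "pause": 0.3}},
}

def parliament_vote(delegates, options):
    # One simple rule: credence-weighted sum of stances per option.
    return {opt: sum(d["weight"] * d["stances"][opt] for d in delegates.values())
            for opt in options}

scores = parliament_vote(delegates, ["deploy", "pause"])
print(scores)  # roughly {'deploy': -0.065, 'pause': 0.17}
print("chosen:", max(scores, key=scores.get))  # 'pause'
```

Even this tiny example surfaces the difficulty the review flags: the outcome hinges entirely on contestable weights and on the choice of aggregation rule.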

Best For:

  • Advanced undergrads, grad students, and early-career researchers
  • Policy advisors building foundations for AI regulation
  • Safety-conscious ML developers
  • Readers who want a systems view of AI, not just technical or ethical silos

Source Details:

Dan Hendrycks is Director of the Center for AI Safety (CAIS) and a leading voice in existential AI risk research. He has contributed to both technical work (e.g., the GELU activation function and the MMLU benchmark) and public awareness through op-eds in TIME and The Wall Street Journal. He holds a PhD in computer science from UC Berkeley, has advised the UK government, and has spoken at institutions like OpenAI and Stanford.

His credibility across technical and policy domains lends weight to the book. Published by CRC Press (Taylor & Francis) in 2025, it's also available under a Creative Commons license for broader access. ISBNs: 9781032869926 (hardcover), 9781032917221 (paperback), 9781003530336 (ebook). DOI: 10.1201/9781003530336.

About the author
Jakub Szarmach
