AI Governance Library

Applying Data Protection Principles to Generative AI: Practical Approaches for Organizations and Regulators

This CIPL paper offers 14 practical recommendations for applying data protection principles to generative AI systems.

What’s Covered?

Published in December 2024, this discussion paper by the Centre for Information Policy Leadership (CIPL) focuses on how existing privacy and data protection frameworks can be adapted to fit generative AI (genAI) technologies—particularly those that rely on large-scale personal data during model training, fine-tuning, and deployment.

The paper maps privacy principles (like fairness, data minimization, purpose limitation, and individual rights) onto the AI lifecycle, recognizing that each stage—data collection, training, deployment—raises different privacy issues and responsibilities.

Key highlights:

  • Legitimate Interests as a Legal Basis: CIPL argues that genAI developers should be allowed to rely on legitimate interests to collect and process publicly available or first-party personal data—especially where obtaining consent or applying strict data minimization is impractical at scale.
  • Sensitive Data Processing: The paper stresses the need to permit sensitive data processing for bias mitigation, accessibility tools, and systems aimed at protected populations. Over-restriction, it says, may reduce fairness or accuracy.
  • Use of PETs/PPTs: It promotes privacy-enhancing technologies like differential privacy, synthetic data, and encryption to mitigate risk while enabling effective model development.
  • Redefining Data Minimization: Rather than limiting data volume, minimization should mean collecting what’s needed for a model to function fairly and accurately—particularly when diverse datasets are key to preventing bias.
  • Purpose and Use Flexibility: Developers often can’t predict future uses of general-purpose AI. Purpose limitation principles should be applied with this uncertainty in mind—allowing room for innovation.
  • Transparency: Context-appropriate transparency is encouraged—both for users and regulators. CIPL supports techniques like model cards and public disclosures over granular individual notices in web-scraping contexts.
  • Individual Rights: The paper recommends flexible mechanisms for honoring rights like erasure and objection, acknowledging technical limitations of genAI systems while promoting reasonable alternatives (e.g., output filters).
  • Organizational Accountability: It emphasizes the need for risk-based, evolving AI governance programs. Regulators should encourage and reward these programs, not just punish non-compliance.
  • Cross-Border Data: National restrictions on cross-border flows are framed as a major threat to fairness and innovation. The report backs privacy certifications and technical safeguards to enable lawful international transfers.

💡 Why it matters?

Data protection laws weren’t written with genAI in mind—but the need to protect rights and foster innovation is immediate. This paper offers a nuanced way forward, urging regulators to lean into flexible interpretations, risk-based tools, and practical measures that reflect how genAI actually works.

What’s Missing?

While CIPL offers a practical framework, some core tensions remain unresolved:

  • It underplays the challenge of verifying fairness or privacy claims at scale, especially with limited regulatory access to training data or model internals.
  • There’s minimal discussion of oversight gaps between developers and deployers—especially in open-source or API-heavy systems.
  • Although PETs are recommended, the paper avoids tough questions about how effective they really are at scale, particularly for large multimodal systems.
  • It suggests using legal flexibility to protect innovation, but doesn’t deeply engage with power asymmetries—e.g., how smaller actors might struggle with accountability costs, or how transparency alone may not check Big Tech dominance.
  • It doesn’t explore how AI-specific harms (like hallucinations, deepfakes, or emergent behaviors) stretch the limits of conventional privacy tools.

Best For:

Policy teams in tech companies, privacy officers designing AI governance, and regulators drafting guidance or participating in AI standardization efforts. It’s also relevant for lawmakers considering how to update data protection rules in light of genAI’s needs.

Source Details:

Title: Applying Data Protection Principles to Generative AI: Practical Approaches for Organizations and Regulators

Author: Centre for Information Policy Leadership (CIPL), a global privacy think tank hosted by law firm Hunton Andrews Kurth LLP

Date: December 2024

Credentials: CIPL is backed by over 85 corporate members and has a track record of influencing global privacy frameworks through dialogue with regulators, academia, and civil society. Its work has been cited in discussions around GDPR, the AI Act, and OECD policy guidelines.

About the author
Jakub Szarmach
