
Advancing AI Governance: The Case for Comprehensive GPAI Model Evaluations


As the European Union prepares to implement the AI Act, ensuring that General Purpose AI (GPAI) models are evaluated robustly is crucial. This article makes the case for more comprehensive evaluation approaches that go beyond simple black-box testing, balancing thorough assessment with intellectual property protection.



The EU AI Office's upcoming Codes of Practice for GPAI model deployment should prioritise multi-faceted evaluation approaches that extend beyond black-box testing. Key recommendations include:


1. Encouraging 'de facto' white-box access for independent third-party evaluators through custom APIs, balancing thorough evaluation with IP protection (an illustrative sketch of such an access arrangement follows this list).


2. Facilitating access to contextual information for comprehensive audits, including relevant code, technical documentation, and internal evaluation findings.


3. Implementing a multi-layered framework of safeguards for model evaluation, combining technical, physical, and legal measures to protect proprietary information while enabling meaningful assessments.
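
To make the first recommendation more concrete, the sketch below illustrates, in Python, one way a provider-operated "structured access" gateway could mediate evaluator queries: vetted third parties receive scoped, rate-limited, and audited access to derived model outputs (for example, per-token log-probabilities or activation statistics) without the model weights ever leaving the provider's infrastructure. This is a minimal, purely illustrative sketch; the class names, endpoints, and safeguards shown (StructuredAccessGateway, EvaluatorSession, query budgets) are assumptions for the purpose of illustration, not drawn from the brief or from any existing system.

```python
"""Illustrative sketch of a 'structured access' gateway for third-party
model evaluation. All names are hypothetical."""

import hashlib
import hmac
import time
from dataclasses import dataclass


@dataclass
class EvaluatorSession:
    """Credentials and scope granted to a vetted third-party evaluator."""
    evaluator_id: str
    allowed_endpoints: set     # e.g. {"logprobs", "activation_stats"}
    expires_at: float          # Unix timestamp after which access lapses
    query_budget: int          # cap on total queries (volume limit)


class StructuredAccessGateway:
    """Mediates evaluator queries so that only derived quantities are
    returned; every request is authenticated, scope-checked, and logged
    for audit, while weights stay on the provider's infrastructure."""

    def __init__(self, secret_key: bytes):
        self._secret_key = secret_key
        self._sessions = {}    # token -> EvaluatorSession
        self.audit_log = []    # append-only record of evaluator activity

    def register_session(self, session: EvaluatorSession) -> str:
        """Issue a signed token binding the evaluator to its scope."""
        token = hmac.new(self._secret_key,
                         session.evaluator_id.encode(),
                         hashlib.sha256).hexdigest()
        self._sessions[token] = session
        return token

    def query(self, token: str, endpoint: str, prompt: str) -> dict:
        """Run one scoped query and return derived outputs only."""
        session = self._sessions.get(token)
        if session is None or time.time() > session.expires_at:
            raise PermissionError("invalid or expired evaluator token")
        if endpoint not in session.allowed_endpoints:
            raise PermissionError(f"endpoint '{endpoint}' outside granted scope")
        if session.query_budget <= 0:
            raise PermissionError("query budget exhausted")
        session.query_budget -= 1

        # Placeholder for the provider-side model call; a real gateway
        # would forward `prompt` to the model and return, for instance,
        # per-token log-probabilities or named activation statistics.
        result = {"endpoint": endpoint,
                  "prompt_hash": hashlib.sha256(prompt.encode()).hexdigest()}

        self.audit_log.append({"evaluator": session.evaluator_id,
                               "endpoint": endpoint,
                               "timestamp": time.time()})
        return result


if __name__ == "__main__":
    gateway = StructuredAccessGateway(secret_key=b"provider-managed-secret")
    token = gateway.register_session(EvaluatorSession(
        evaluator_id="independent-lab-01",
        allowed_endpoints={"logprobs"},
        expires_at=time.time() + 3600,
        query_budget=100,
    ))
    print(gateway.query(token, "logprobs", "Example evaluation prompt"))
```

In practice, the technical gatekeeping sketched here would sit alongside the physical and legal safeguards mentioned in the third recommendation; the point of the example is simply that deeper-than-black-box access can be granted in a controlled, auditable way without handing over proprietary model weights.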


These recommendations aim to enhance the EU's capacity to assess and mitigate GPAI model risks, reinforcing its position as a global leader in AI governance. By prioritising transparency, fairness, and robust evaluation methods, the EU can foster trust and innovation in its AI sector while safeguarding citizens and businesses.


The article emphasises the limitations of black-box testing and advocates for a spectrum of access levels, from black-box to white-box and even "outside-the-box" evaluations. It also addresses concerns about intellectual property protection and suggests various mechanisms to minimise risks associated with more comprehensive evaluations.


By adopting these recommendations, the EU can ensure that its AI governance framework remains adaptable to the rapidly evolving field of GPAI model evaluation, setting a global standard for responsible AI development and deployment.


Read the full brief here.


For questions, please reach out to our AI policy lead David Marti at
