OpenZeppelin Audit Reveals Data Contamination in OpenAI EVMbench
Blockchain security firm OpenZeppelin has completed an audit of OpenAI's EVMbench, an AI benchmark designed to evaluate smart contract security capabilities, and reports significant methodological flaws and data contamination. EVMbench was launched in mid-February through a partnership between OpenAI and crypto investment firm Paradigm, with the stated goal of measuring how effectively different AI models can identify, patch, and exploit smart contract vulnerabilities.
OpenZeppelin's audit identified two critical issues: training data contamination and misclassification of high-severity vulnerabilities. The security firm stated that after reviewing the dataset, it found methodological flaws and invalid vulnerability classifications, including at least four issues labeled as high severity that are not actually exploitable in practice.
Regarding data contamination, OpenZeppelin noted that the fundamental capability being tested in AI security is the ability to find novel vulnerabilities in code that the model has never encountered before. However, its analysis indicated that the AI agents scoring highest on EVMbench had likely been exposed to the benchmark's vulnerability reports during their pre-training phase. The benchmark tested AI agents with internet access disabled, so they could not simply search for solutions. But it was built on curated vulnerabilities from 120 audits conducted between 2024 and mid-2025, while the training cutoffs for these agents were generally mid-2025. This created a significant risk that the agents already held the answers to all test problems in memory.
OpenZeppelin emphasized that while AI will significantly impact blockchain security, it is crucial to apply and test the technology correctly to maximize its potential.
The question is not whether AI will transform smart contract security, but whether the data and benchmarks used to build and evaluate these tools meet the same standards as the contracts they are designed to protect.