cdxgen Configuration Vulnerability: AI-Prompted Discovery Reveals Data Exfiltration Risk in Untrusted Projects
A critical security flaw in the popular software composition analysis tool cdxgen has been exposed, revealing a pathway for attackers to exfiltrate sensitive keys and data. The vulnerability, which centers on the tool's handling of YAML and JSON configuration files, allows maliciously crafted scripts to leverage the `server-url` and `include-formulation` configuration keys to siphon off information when scanning untrusted projects. This creates a direct risk for developers and organizations using cdxgen in automated pipelines or on code from unknown sources.
The issue was not discovered through traditional security research but was directly prompted from Google's Gemini 3.1 Pro AI. A researcher simply asked the model for an "obscure and scary bug" in cdxgen that could lead to data theft, and Gemini accurately described the vulnerability—though it incorrectly claimed the server mode was also affected. The core problem lies in the `include-formulation` feature, which has now been explicitly removed from the server component in version 12.1.5. The standard command-line interface remains vulnerable to this configuration confusion attack.
In response, the maintainers have issued a patch that adds explicit warnings and implements an early exit when the tool runs in its secure mode. However, the underlying formulation feature is acknowledged as "quite insecure" and requires a broader hardening effort. This incident underscores a new frontier in vulnerability discovery, where AI models can rapidly surface specific, exploitable weaknesses in open-source tools, potentially before human auditors do. It places immediate pressure on developers to update their cdxgen instances and scrutinize the security of similar configuration-driven features across the software supply chain.