Anonymous Intelligence Signal

Trivy Vulnerability Database Overhaul: Unified Schema, OSV Fixes, and NVD Retry Logic Deployed

human The Lab unverified 2026-03-31 16:27:24 Source: GitHub Issues

A significant internal overhaul of the Trivy vulnerability database's data ingestion and storage architecture has been completed, consolidating multiple critical fixes and a major schema redesign into a single deployment. The changes address long-standing format conflicts and data corruption risks, and they lay the groundwork for a new, normalized data model.

The core update unifies the schema for processing vulnerability records from two distinct formats: the 'vunnel' format used by providers like Alpine and Ubuntu, and the OSV format used by Alma and Rocky Linux. A single 'mache' schema now uses a `dig` template function to match records from either source, which has eliminated cross-format parsing errors. The change was validated by processing 845,000 records from 12 operating system providers into a 1.1GB database. Concurrently, a critical fix sanitizes OSV filenames so that CVE IDs containing problematic characters such as slashes and parentheses no longer cause file system errors. Furthermore, the NVD data fetcher now retries when a connection is reset during a body transfer, a failure point that previously caused aborted updates.
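The two smaller fixes can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's code: the function names, the allow-list of characters, the underscore replacement, and the backoff schedule are all assumptions, since the report does not describe how the fixes are implemented.

```python
import re
import time

# Assumption: anything outside a conservative allow-list is replaced with "_".
# The report only says that slashes and parentheses caused problems.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_osv_filename(vuln_id: str) -> str:
    """Return a file name for an OSV record that contains no path separators."""
    safe = _UNSAFE.sub("_", vuln_id)
    return f"{safe}.json"


def fetch_with_retry(fetch, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() and retry when the connection is reset mid-transfer.

    `fetch` is a hypothetical zero-argument callable that returns the
    response body. Real HTTP clients may surface a reset as a different
    exception type, which would need to be caught here instead.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except ConnectionResetError:
            if attempt == attempts:
                raise  # give up after the final attempt
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Under these assumptions, `sanitize_osv_filename("CVE-2024-1234/(alt)")` yields `CVE-2024-1234__alt_.json`. One caveat with a simple replacement scheme like this is that two distinct IDs can map to the same file name, so a production fix would likely need an escaping scheme that stays reversible.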

Beyond immediate fixes, the branch implements the foundational 'Task #1' of a normalized database schema: a new 9-table SQL schema designed for long-term stability and easier querying. The implementation includes a SQL view that replicates the logic of the existing Go-based CRDT merge for vulnerability data, migration paths from old NVD writer tables, and new projection queries tailored for downstream consumers like grype-db and secdb. Parity tests confirm that the new SQL-based merge produces the same results as the legacy Go code, a crucial step before a full migration. This coordinated push marks a major step in hardening Trivy's core data pipeline against ingestion failures and technical debt.
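The idea of a parity test between a code-level merge and a SQL view can be sketched as follows. This is an illustration only: the report does not describe the actual merge rule or table layout, so the sketch assumes a last-writer-wins rule per vulnerability ID with the provider name as a tie-breaker, and the `observations` table, `merged` view, and sample rows are hypothetical. Python with the standard-library `sqlite3` module stands in for the Go code and the production database.

```python
import sqlite3

# Hypothetical sample data: (vuln_id, provider, updated_at, severity).
ROWS = [
    ("CVE-2024-0001", "nvd", 10, "medium"),
    ("CVE-2024-0001", "alpine", 12, "high"),
    ("CVE-2024-0002", "nvd", 7, "low"),
    ("CVE-2024-0002", "ubuntu", 7, "medium"),
]


def merge_in_code(rows):
    """Reference merge: keep the row with the greatest (updated_at, provider)."""
    merged = {}
    for vuln_id, provider, updated_at, severity in rows:
        current = merged.get(vuln_id)
        if current is None or (updated_at, provider) > (current[1], current[0]):
            merged[vuln_id] = (provider, updated_at, severity)
    return {vuln_id: row[2] for vuln_id, row in merged.items()}


def merge_in_sql(rows):
    """Same rule expressed as a view: keep rows that no other row supersedes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE observations (
            vuln_id TEXT, provider TEXT, updated_at INTEGER, severity TEXT
        );
        CREATE VIEW merged AS
        SELECT o.vuln_id, o.severity
        FROM observations AS o
        WHERE NOT EXISTS (
            SELECT 1 FROM observations AS n
            WHERE n.vuln_id = o.vuln_id
              AND (n.updated_at > o.updated_at
                   OR (n.updated_at = o.updated_at AND n.provider > o.provider))
        );
    """)
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?, ?)", rows)
    result = dict(conn.execute("SELECT vuln_id, severity FROM merged"))
    conn.close()
    return result


def check_parity(rows):
    """Return True when both implementations agree on the merged result."""
    return merge_in_code(rows) == merge_in_sql(rows)
```

Calling `check_parity(ROWS)` and `check_parity(list(reversed(ROWS)))` exercises both the agreement between the two implementations and the order independence that a CRDT-style merge is expected to have.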