HIGH-Severity Bug in Stellar History Crate: Race Condition Allows Archive Overwrite on Any Read Error
A high-severity race condition in the Stellar ecosystem's `history` crate can cause a live archive to be overwritten following any transient read error. The flaw, identified in the `HistoryArchiveManager::initialize_history_archive` function, treats *any* failure to read the archive's root History Archive State (HAS), including network timeouts, server errors, and permission issues, as proof that the archive is uninitialized. This faulty logic bypasses the intended safety check, so the system unconditionally uploads a fresh `.well-known/stellar-history.json` file with `current_ledger: 0`, effectively erasing the existing archive's root metadata.
The vulnerability is a classic Time-Of-Check-Time-Of-Use (TOCTOU) bug in asynchronous code. The guard condition only checks `if archive.get_root_has().await.is_ok()`, which collapses two different situations, "the root file does not exist" and "the root file could not be read", into a single "not initialized" branch. The subsequent write is not atomic with that check and depends entirely on a single, fragile read. This design flaw opens a direct path to data destruction, not through complex cryptographic attacks, but through simple operational instability or targeted interference with the read operation.
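The difference between the flawed guard and a stricter one can be sketched as follows. This is a minimal, synchronous model written for illustration, not the crate's actual code: `Archive`, `RootState`, `ReadError`, `InitOutcome`, `put_root_has`, and both `initialize_*` functions are hypothetical stand-ins, and only the `get_root_has().is_ok()` shape of the guard is taken from the finding. The real function is async and its error type may look quite different.

```rust
/// Hypothetical error type. The real crate's errors may be shaped differently;
/// the only assumption here is that "not found" can be told apart from other failures.
#[derive(Debug, Clone, PartialEq)]
enum ReadError {
    NotFound,
    Transient(String),
}

/// Hypothetical stand-in for the root state stored in `.well-known/stellar-history.json`.
#[derive(Debug, Clone, PartialEq)]
struct RootState {
    current_ledger: u32,
}

/// Hypothetical in-memory archive. `fail_reads` simulates a timeout or server error.
struct Archive {
    root: Option<RootState>,
    fail_reads: bool,
}

impl Archive {
    fn get_root_has(&self) -> Result<RootState, ReadError> {
        if self.fail_reads {
            return Err(ReadError::Transient("timeout".to_string()));
        }
        self.root.clone().ok_or(ReadError::NotFound)
    }

    fn put_root_has(&mut self, state: RootState) {
        self.root = Some(state);
    }
}

#[derive(Debug, PartialEq)]
enum InitOutcome {
    AlreadyInitialized,
    Initialized,
}

/// Models the flawed guard: any `Err` is treated as "archive is empty".
fn initialize_flawed(archive: &mut Archive) -> InitOutcome {
    if archive.get_root_has().is_ok() {
        return InitOutcome::AlreadyInitialized;
    }
    // Reached on NotFound *and* on transient failures, so a live root is replaced.
    archive.put_root_has(RootState { current_ledger: 0 });
    InitOutcome::Initialized
}

/// One possible stricter guard: write only when the read positively reports
/// that the root is missing, and propagate every other error to the caller.
fn initialize_checked(archive: &mut Archive) -> Result<InitOutcome, ReadError> {
    match archive.get_root_has() {
        Ok(_) => Ok(InitOutcome::AlreadyInitialized),
        Err(ReadError::NotFound) => {
            archive.put_root_has(RootState { current_ledger: 0 });
            Ok(InitOutcome::Initialized)
        }
        Err(other) => Err(other),
    }
}
```

With a populated archive whose reads fail, `initialize_flawed` resets the root to `current_ledger: 0`, while `initialize_checked` returns the error and leaves the root untouched. Note that separating the error cases only removes the false "uninitialized" signal. Closing the remaining gap between check and write would additionally need something like a conditional, create-only upload, which this sketch does not model.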
The exploit scenario is straightforward: an attacker who controls a network path, or who can cause a service disruption, could induce a read error. The system would then interpret the archive as empty and proceed to overwrite its root metadata, corrupting the archive's historical ledger record. This high-severity finding underscores a fundamental weakness in the archive initialization logic, where error handling is conflated with state validation, creating a single point of failure for the integrity of critical historical data.