imkafka Module Exposes Critical Data Race Vulnerability in Global Metrics Counters (CWE-366)
A critical race condition vulnerability (CWE-366) has been identified in the imkafka module, exposing module-global metrics counters to concurrent modification without atomic protection. The flaw resides in `plugins/imkafka/imkafka.c`, where metrics including `rtt_avg_usec`, `throttle_avg_msec`, and `int_latency_avg_usec` are declared as plain `uint64` variables and updated from multiple Kafka worker threads without synchronization. This design oversight creates a window for data corruption under high concurrency, and it is most serious on 32-bit platforms, where 64-bit loads and stores are generally not atomic.
The vulnerability manifests through the `statsCallback()` function, which multiple Kafka consumer threads invoke while writing to the same shared global variables without synchronization primitives. The trigger is straightforward: concurrent calls to `statsCallback()` perform unsynchronized writes to the same memory locations, so a reader can observe a torn value in which only part of a 64-bit counter has been updated. On architectures lacking atomic 64-bit operations, this can produce completely corrupted metric values that operators rely on for monitoring and alerting decisions.
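To make the torn-value scenario concrete, the following is a minimal, deterministic sketch of what can happen on a 32-bit target, assuming the compiler emits a 64-bit store as two separate 32-bit stores. It is a single-threaded simulation, not code from imkafka; the type `split_u64` and the function `torn_read_demo` are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit counter held as two 32-bit words,
 * as it would be laid out on a typical 32-bit target. */
typedef struct {
    uint32_t lo; /* low 32 bits */
    uint32_t hi; /* high 32 bits */
} split_u64;

/* Reassemble the 64-bit value from both halves. */
static uint64_t split_read(const split_u64 *v) {
    return ((uint64_t)v->hi << 32) | v->lo;
}

/* The two half-stores that together make up one 64-bit write. */
static void store_lo(split_u64 *v, uint64_t x) { v->lo = (uint32_t)x; }
static void store_hi(split_u64 *v, uint64_t x) { v->hi = (uint32_t)(x >> 32); }

/* Simulate a reader that runs between the writer's two half-stores.
 * Returns the value that reader would observe. */
static uint64_t torn_read_demo(uint64_t old_val, uint64_t new_val) {
    split_u64 counter;

    /* Counter starts out holding old_val in full. */
    store_lo(&counter, old_val);
    store_hi(&counter, old_val);

    store_lo(&counter, new_val);           /* writer: first half done   */
    uint64_t seen = split_read(&counter);  /* reader interleaves here   */
    store_hi(&counter, new_val);           /* writer: second half done  */

    return seen;
}
```

With `old_val = 0x00000000FFFFFFFF` and `new_val = 0x0000000100000000`, the interleaved reader sees 0: the new low word combined with the old high word, a value that was never written by anyone. In a real data race the interleaving is nondeterministic, and the C standard treats the unsynchronized access as undefined behavior, so this model shows only one of the possible outcomes.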
The implications extend beyond mere metric inaccuracy. Operators monitoring Kafka ingestion performance may receive inconsistent or corrupted data during peak load, undermining observability and potentially masking genuine system issues. The availability impact is notable: corrupted metrics could trigger false alerts or fail to surface real problems. The fix is to update these counters with atomic operations, ensuring thread-safe updates regardless of concurrent access patterns. Organizations running imkafka in high-throughput environments should prioritize applying this hardening patch, particularly those operating on 32-bit infrastructure, where the risk of torn reads is most acute.
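The general shape of such a fix can be sketched with standard C11 atomics. This is an illustration under stated assumptions, not the actual patch: the real module may use project-specific atomic helpers instead of `<stdatomic.h>`, and the functions `publish_stats` and `read_counter` are hypothetical. Only the three counter names come from the description above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Counters named in the advisory, declared here as C11 atomics
 * (assumption: the real code may use different atomic primitives). */
static _Atomic uint64_t rtt_avg_usec;
static _Atomic uint64_t throttle_avg_msec;
static _Atomic uint64_t int_latency_avg_usec;

/* Hypothetical helper standing in for the writes done by the stats
 * callback: each counter is stored as one indivisible operation. */
static void publish_stats(uint64_t rtt, uint64_t throttle, uint64_t latency) {
    atomic_store_explicit(&rtt_avg_usec, rtt, memory_order_relaxed);
    atomic_store_explicit(&throttle_avg_msec, throttle, memory_order_relaxed);
    atomic_store_explicit(&int_latency_avg_usec, latency, memory_order_relaxed);
}

/* Hypothetical reader used by the metrics reporter: the load is also
 * atomic, so it never observes a half-written counter. */
static uint64_t read_counter(_Atomic uint64_t *counter) {
    return atomic_load_explicit(counter, memory_order_relaxed);
}
```

Relaxed ordering is used in this sketch because each counter is an independent gauge; it guarantees that every individual value is untorn, but not that the three counters are mutually consistent at any instant. On 32-bit targets, 64-bit atomics may not be lock-free and, with some toolchains, may require linking against an atomics support library.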