kafka/storage
Ritika Reddy 0a483618b9
KAFKA-19690-Add epoch check before verification guard check to prevent unexpected fatal error (#20534)
We are seeing cases where a Kafka Streams (KS) thread stalls for ~20
seconds. During this stall, the broker correctly aborts the open
transaction (triggered by the 10-second transaction timeout). However,
when the KS thread resumes, it does not receive the expected
InvalidProducerEpochException (which we already handle gracefully as
part of transaction abort). The client is hit with an
InvalidTxnStateException instead. KS currently treats this as a fatal
error, causing the application to fail.

To fix this, we've added an epoch check before the verification guard
check, so that the broker sends the recoverable
InvalidProducerEpochException instead of the fatal
InvalidTxnStateException. This helps safeguard both tv1 and tv2 clients.
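
The ordering described above can be illustrated with a minimal,
self-contained sketch. It is not the code from this change: the class,
the ProducerState holder, validateAppend, and outcome are hypothetical
names, and the exception classes are local stand-ins for Kafka's own.
It also assumes that the broker bumps the producer epoch when it aborts
a timed-out transaction, and that the verification guard is cleared at
that point.

```java
// Sketch only: every name here is hypothetical, not Kafka's actual code.
class EpochCheckSketch {
    // Local stand-ins for Kafka's exception types of the same names.
    static class InvalidProducerEpochException extends RuntimeException {
        InvalidProducerEpochException(String message) { super(message); }
    }

    static class InvalidTxnStateException extends RuntimeException {
        InvalidTxnStateException(String message) { super(message); }
    }

    /** Hypothetical broker-side state for one producer on one partition. */
    static final class ProducerState {
        final short currentEpoch;        // assumed to be bumped on a timeout abort
        final Object verificationGuard;  // assumed null once no transaction is ongoing

        ProducerState(short currentEpoch, Object verificationGuard) {
            this.currentEpoch = currentEpoch;
            this.verificationGuard = verificationGuard;
        }
    }

    /** Validates a transactional append: epoch first, then verification guard. */
    static void validateAppend(ProducerState state, short batchEpoch, Object requestGuard) {
        // 1. Epoch check: a stale producer gets the recoverable error.
        if (batchEpoch < state.currentEpoch) {
            throw new InvalidProducerEpochException(
                "Batch epoch " + batchEpoch + " is older than current epoch " + state.currentEpoch);
        }
        // 2. Verification guard check: reached only when the epoch is current.
        if (state.verificationGuard == null || state.verificationGuard != requestGuard) {
            throw new InvalidTxnStateException("No matching verification for this transaction");
        }
    }

    /** Returns "OK" or the simple name of the exception that validateAppend threw. */
    static String outcome(ProducerState state, short batchEpoch, Object requestGuard) {
        try {
            validateAppend(state, batchEpoch, requestGuard);
            return "OK";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        Object guard = new Object();

        // The stalled thread resumes after the broker aborted the transaction.
        ProducerState afterTimeoutAbort = new ProducerState((short) 6, null);
        System.out.println(outcome(afterTimeoutAbort, (short) 5, guard));

        // A healthy producer whose transaction is still ongoing.
        ProducerState ongoing = new ProducerState((short) 5, guard);
        System.out.println(outcome(ongoing, (short) 5, guard));
    }
}
```

In this sketch, if the two checks were swapped, the stalled producer in
the first case would hit the guard check first and surface
InvalidTxnStateException, which is the failure mode described above.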

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-09-23 13:45:42 -07:00
api/src KAFKA-19523: Gracefully handle error while building remoteLogAuxState (#20201) 2025-07-23 19:29:31 +05:30
src KAFKA-19690-Add epoch check before verification guard check to prevent unexpected fatal error (#20534) 2025-09-23 13:45:42 -07:00