Kafka Streams checkpoints the wrong offset when a task is closed during
restoration. If under exactly-once processing guarantees a
TaskCorruptedException happens, the affected task is closed dirty, its
state content is wiped out and the task is re-initialized. If during
the following restoration the task is closed cleanly, the task writes
the offsets that it stores in its record collector to the checkpoint
file. Those offsets are the offsets that the task wrote to the changelog
topics. In other words, the task writes the end offsets of its changelog
topics to the checkpoint file. Consequently, when the task is
initialized again on the same Streams client, the checkpoint file is
read and the task assumes it is fully restored although the records
between the last offsets the task restored before closing clean and
the end offset of the changelog topics are missing locally.
The fix is to clear the offsets in the record collector on close.
Cherry-pick of 19727f8
Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
This patch fixes a bug in the LeaveGroupResponse construction. Basically, when a top level error is set, no members are expected but the current check always requires one for versions prior to version 3.
Reviewers: David Jacot <djacot@confluent.io>
(cherry picked from commit 3fd6293449)
git gc moves commit hashes from individual .git/refs/heads/ to .git/packed-refs which is not read
by the determineCommitId function.
Replace the existing lookup within the .git directory with a GrGit lookup that handles packed and
unpacked refs transparently.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Fixing bad test setup. We tried to fix an upgrade bug for FK-joins in 3.1 release, but it later turned out that the PR was not sufficient to fix it. We finally fixed in 3.4 release.
This PR updates the system test matrix to only test working versions with FK-joins, limited to available test versions.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Hao Li <hli@confluent.io>, Mickael Maison <mickael.maison@gmail.com>
We have observed an issue where inter broker SSL listener is not coming up when running with TLSv3/JDK 17 .
SSL debug logs shows that TLSv3 post handshake messages >16K are not getting read and causing SslEngineValidator process to stuck while validating the provided trust/key store.
- Right now, WRAP returns if there is already data in the buffer. But if we need more data to be wrapped for UNWRAP to succeed, we end up looping forever. To fix this, now we always attempt WRAP and only return early on BUFFER_OVERFLOW.
- Update SslEngineValidator to unwrap post-handshake messages from peer when local handshake status is FINISHED.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
When running in kraft mode, LogManager.startup is called in a different thread than the main broker (#14239)
startup thread (by BrokerMetadataPublisher when the first metadata update is received.) If a fatal
error happens during broker startup, before LogManager.startup is completed, LogManager.shutdown may
mark log dirs as clean shutdown improperly.
This PR includes following change:
1. During LogManager startup time:
- track hadCleanShutdwon info for each log dir
- track loadLogsCompleted status for each log dir
2. During LogManager shutdown time:
- do not write clean shutdown marker file for log dirs which have hadCleanShutdown==false and loadLogsCompleted==false
Reviewers: Colin P. McCabe <cmccabe@apache.org>
The KRaft client uses an expiration service to complete FETCH requests that have timed out. This expiration service uses a different thread from the KRaft polling thread. This means that it is unsafe for the expiration service thread to call tryCompleteFetchRequest. tryCompleteFetchRequest reads and updates a lot of states that is assumed to be only be read and updated from the polling thread.
The KRaft client now does not call tryCompleteFetchRequest when the FETCH request has expired. It instead will send the FETCH response that was computed when the FETCH request was first handled.
This change also fixes a bug where the KRaft client was not sending the FETCH response immediately, if the response contained a diverging epoch or snapshot id.
Reviewers: Jason Gustafson <jason@confluent.io>
A HistoricalIterator at epoch N is supposed to only reveal elements at epoch N or earlier. However,
due to a bug, we sometimes will reveal elements which are at a newer epoch than N. The bug does
not affect elements that are in the latest epoch (aka topTier). It only affects elements that are
newer than N, but which do not persist until the latest epoch. This PR fixes the bug and adds a
unit test for this case.
Reviewers: David Arthur <mumrah@gmail.com>
On ext4 file systems we have seen snapshots with zero-length files. This is possible if
the file is closed and moved before forcing the channel to write to disk.
Reviewers: Ron Dagostino <rndgstn@gmail.com>, Alok Thatikunta <athatikunta@confluent.io>
Fixed a regression described in KAFKA-15053 that security.protocol only allows uppercase values like PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. With this fix, both lower case and upper case values will be supported (e.g. PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL, plaintext, ssl, sasl_plaintext, sasl_ssl)
Reviewers: Chris Egerton <chrise@aiven.io>, Divij Vaidya <diviv@amazon.com>
Update "requests" lib used in system tests to version "2.31.0" to fix CVE-2023-32681: Unintended leak of Proxy-Authorization header in requests
Reviewers: Divij Vaidya <diviv@amazon.com>
In SharedServer, fix some cases where a volatile variable could change to null while we were using
it, during shutdown. This is mainly a junit test issue, although it could also cause ugly error
messages during shutdown when running the server in a production context.
Fix a race in KafkaEventQueueTest.testSize.
Reviewers: David Arthur <mumrah@gmail.com>
All dependency upgrades in the PR are minor upgrades with backward compatible changes. Note that no major version for dependencies have been changed to make it a low risk change. No code changes are required for any of these dependencies.
Reviewers: Luke Chen <showuon@gmail.com>