This PR fixes a leak in StreamsMetricImpl not removing a
store-level-metric correctly, and thus leaking objects.
Reviewers: Eduwer Camacaro <eduwerc@gmail.com>, Bill Bejeck
<bbejeck@apache.org>
```
commit ec37eb538b (HEAD ->
KAFKA-19719-cherry-pick-41, origin/KAFKA-19719-cherry-pick-41)
Author: Kevin Wu <kevin.wu2412@gmail.com>
Date: Thu Sep 25 11:56:16 2025 -0500
KAFKA-19719: --no-initial-controllers should not assume
kraft.version=1 (#20551)
Just because a controller node sets --no-initial-controllers flag
does not mean it is necessarily running kraft.version=1. The more
precise meaning is that the controller node being formatted does not
know what kraft version the cluster should be in, and therefore it
is only safe to assume kraft.version=0. Only by setting
--standalone,--initial-controllers, or --no-initial-controllers AND
not specifying the controller.quorum.voters static config, is it
known kraft.version > 0.
For example, it is a valid configuration (although confusing) to run
a static quorum defined by controller.quorum.voters but have all
the controllers format with --no-initial-controllers. In this
case, specifying --no-initial-controllers alongside a metadata
version that does not support kraft.version=1 causes formatting to
fail, which is does not support kraft.version=1 causes formatting
to fail, which is a regression.
Additionally, the formatter should not check the kraft.version
against the release version, since kraft.version does not actually
depend on any release version. It should only check the
kraft.version against the static voters config/format arguments.
This PR also cleans up the integration test framework to match the
semantics of formatting an actual cluster.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Kuan-Po Tseng
<brandboat@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, José Armando
García Sancio <jsancio@apache.org> Conflicts:
core/src/main/scala/kafka/tools/StorageTool.scala Minor conflicts. Keep
changes from cherry-pick.
core/src/test/java/kafka/server/ReconfigurableQuorumIntegrationTest.java
Remove auto-join tests, since 4.1 does not support it. docs/ops.html
Keep docs section from cherry-pick.
metadata/src/test/java/org/apache/kafka/metadata/storage/FormatterTest.java
Minor conflicts. Keep cherry-picked changes.
test-common/test-common-runtime/src/main/java/org/apache/kafka/common/test/KafkaClusterTestKit.java
Conflicts due to integration test framework changes. Keep new changes.
commit 02d58b176c (upstream/4.1)
```
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This PR performs a refactoring of LockUtils and improves inline
comments, as a follow-up to https://github.com/apache/kafka/pull/19961.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>
https://issues.apache.org/jira/browse/KAFKA-19390
The AbstractIndex.resize() method does not release the old memory map
for both index and time index files. In some cases, Mixed GC may not
run for a long time, which can cause the broker to crash when the
vm.max_map_count limit is reached.
The root cause is that safeForceUnmap() is not being called on Linux
within resize(), so we have changed the code to unmap old mmap on all
operating systems.
The same problem was reported in
[KAFKA-7442](https://issues.apache.org/jira/browse/KAFKA-7442), but the
PR submitted at that time did not acquire all necessary locks around the
mmap accesses and was closed without fixing the issue.
Reviewers: Jun Rao <junrao@gmail.com>
- Fix typo in `process.role`
- Fix formatting of quorum description commands
Reviewers: Lan Ding <isDing_L@163.com>, Ken Huang <s7133700@gmail.com>,
TengYao Chi <frankvicky@apache.org>
Update the KRaft dynamic voter set documentation. In Kafka 4.1, we
introduced a powerful new feature that enables seamless migration from a
static voter set to a dynamic voter set.
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
When nested Timeline collections are created and discarded while loading
a coordinator partition, references to them accumulate in the current
snapshot. Allow the GC to reclaim them by starting a new snapshot and
discarding previous snapshots every 16,384 records.
Small intervals degrade loading times for non-transactional offset
commit workloads while large intervals degrade loading times for
transactional workloads. A default of 16,384 was chosen as a compromise.
Cherry pick of d067c6c040.
Reviewers: David Jacot <djacot@confluent.io>
Cherry-pick KAFKA-19546 to 4.1.
During online downgrade, when a static member using the consumer
protocol which is also the last member using the consumer protocol is
replaced by another static member using the classic protocol with the
same instance id, the latter will take the assignment of the former and
an online downgrade will be triggered.
In the current implementation, if the replacing static member has a
different subscription, no rebalance will be triggered when the
downgrade happens. The patch checks whether the static member has
changed subscription and triggers a rebalance when it does.
Reviewers: Sean Quah <squah@confluent.io>, David Jacot <djacot@confluent.io>
Fix to avoid flakiness in verifiable producer system test. The test
lists running processes and greps to find the VerifiableProducer one,
but wasn't providing an specific pattern to grep (so flaky if there were
more than one process containing the default grep pattern "kafka")
Fix by passing a "proc_grep_string" to filter when looking for the
VerifiableProducer process.
All test pass successfully after the change.
Reviewers: PoAn Yang <payang@apache.org>, Andrew Schofield
<aschofield@confluent.io>
We need to only pass in the reset strategy, as the `logMessage`
parameter was removed.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Lucas Brutschy
<lbrutschy@confluent.io>
Querying the oldest-open-iterator metric can result in a
NoSuchElementException when the last open iterator gets removed, due to
a race condition between the query and the metric update.
To avoid this race condition, this PR caches the metric result, to avoid
accessing the list of open iterator directly. We don't need to clear
this cache, because the entire metric is removed when the last iterator
gets removed.
Reviewers: Matthias J. Sax <matthias@confluent.io>
With "merge.repartition.topic" optimization enabled, Kafka Streams tries
to push repartition topics upstream, to be able to merge multiple
repartition topics from different downstream branches together.
However, it is not safe to push a repartition topic if the parent node
is value-changing: because of potentially changing data types, the
topology might become invalid, and fail with serde error at runtime.
The optimization itself work correctly, however, processValues() is not
correctly declared as value-changing, what can lead to invalid
topologies.
Reviewers: Bill Bejeck <bill@confluent.io>, Lucas Brutschy
<lbrutschy@confluent.io>
It looks like we are checking for properties that are not guaranteed
under at_least_once, for example, exact counting (not allowing for
overcounting).
This change relaxes the validation constraint:
The TAGG topic contains effectively count-by-count results. So for
example, if we have the input without duplication
0 -> 1,2,3 we will get in TAGG 3 -> 1, since 1 key had 3 values.
with duplication:
0 -> 1,1,2,3 we will get in TAGG 4 -> 1, since 1 key had 4 values.
This makes the result difficult to compare. Since we run the smoke test
also with Exactly_Once, I propose to disable validation off TAGG under
ALOS.
Similarly, the topic AVG may overcount or undercount. The test case is
extremely similar to DIF, both performing a join and two streams, the
only difference being the mathematical operation performed, so we can
also disable this validation under ALOS with minimal loss of coverage.
Finally, the change fixes a bug that would throw a NPE when validation
of a windowed stream would fail.
Reviewers: Kirk True <kirk@kirktrue.pro>, Matthias J. Sax
<matthias@confluent.io>
The previous URL http://lambda-architecture.net/ seems to now be controlled by spammers
Co-authored-by: Shashank <hsshashank.grad@gmail.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>
The `state-change.log` file is being incorrectly rotated to
`stage-change.log.[date]`. This change fixes the typo to have the log
file correctly rotated to `state-change.log.[date]`
_No functional changes._
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Christo Lolov
<lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Ken Huang
<s7133700@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Along with the change: https://github.com/apache/kafka/pull/17952
([KIP-966](https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas)),
the semantics of `min.insync.replicas` config has small change, and add
some constraints. We should document them clearly.
Reviewers: Jun Rao <junrao@gmail.com>, Calvin Liu <caliu@confluent.io>,
Mickael Maison <mickael.maison@gmail.com>, Paolo Patierno
<ppatierno@live.com>, Federico Valeri <fedevaleri@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
All state updater threads use the same metrics instance, but do not use
unique names for their sensors. This can have the following symptoms:
1) Data inserted into one sensor by one thread can affect the metrics of
all state updater threads.
2) If one state updater thread is shutdown, the metrics associated to
all state updater threads are removed.
3) If one state updater thread is started, while another one is removed,
it can happen that a metric is registered with the `Metrics` instance,
but not associated to any `Sensor` (because it is concurrently removed),
which means that the metric will not be removed upon shutdown. If a
thread with the same name later tries to register the same metric, we
may run into a `java.lang.IllegalArgumentException: A metric named ...
already exists`, as described in the ticket.
This change fixes the bug giving unique names to the sensors. A test is
added that there is no interference of the removal of sensors and
metrics during shutdown.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Follow up on https://github.com/apache/kafka/pull/20241.
Delete the instruction that manually set `streams.version=1` for running
Kafka 4.1 since it is already achieved in previous setup steps.
Reviewers: Lucas Brutschy <lucasbru@apache.org>
The changes update the OpenJDK base image from 17-buster to 17-bullseye:
- Updates tests/docker/Dockerfile to use openjdk:17-bullseye instead of
openjdk:17-buster
- Updates tests/docker/ducker-ak script to use the new default image
- Updates documentation in tests/README.md with the new image name
examples
Reviewers: Federico Valeri <fedevaleri@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
This patch fixes the bug that allows the last known leader to be elected as a partition leader while still in a fenced state, before the next heartbeat removes the fence.
https://issues.apache.org/jira/browse/KAFKA-19522
Reviewers: Jun Rao <junrao@gmail.com>, TengYao Chi
<frankvicky@apache.org>
* Coordinator starts with a smaller buffer, which can grow as needed.
* In freeCurrentBatch, release the appropriate buffer:
* The Coordinator recycles the expanded buffer
(`currentBatch.builder.buffer()`), not `currentBatch.buffer`, because
`MemoryBuilder` may allocate a new `ByteBuffer` if the existing one
isn't large enough.
* There are two cases that buffer may exceeds `maxMessageSize` 1.
If there's a single record whose size exceeds `maxMessageSize` (which,
so far, is derived from `max.message.bytes`) and the write is in
`non-atomic` mode, it's still possible for the buffer to grow beyond
`maxMessageSize`. In this case, the Coordinator should revert to using a
smaller buffer afterward. 2. Coordinator do not recycles the buffer
that larger than `maxMessageSize`. If the user dynamically reduces
`maxMessageSize` to a value even smaller than `INITIAL_BUFFER_SIZE`, the
Coordinator should avoid recycling any buffer larger than
`maxMessageSize` so that Coordinator can allocate the smaller buffer in
the next round.
* Add tests to verify the above scenarios.
Reviewers: David Jacot <djacot@confluent.io>, Sean Quah
<squah@confluent.io>, Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, TaiJuWu <tjwu1217@gmail.com>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Removing the isEligibleLeaderReplicasV1Enabled to let ELR be enabled if
MV is at least 4.1IV1. Also bump the Latest Prod MV to 4.1IV1
Reviewers: Jun Rao <junrao@gmail.com>
The `AdminClient` adds a telemetry reporter to the metrics reporters
list in the constructor. The problem is that the reporter was already
added in the `createInternal` method. In the `createInternal` method
call, the `clientTelemetryReporter` is added to a
`List<MetricReporters>` which is passed to the `Metrics` object, will
get closed when `Metrics.close()` is called. But adding a reporter to
the reporters list in the constructor is not used by the `Metrics`
object and hence doesn't get closed, causing a memory leak.
All related tests pass after this change.
Reviewers: Apoorv Mittal <apoorvmittal10@apache.org>, Matthias J. Sax
<matthias@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>,
Jhen-Yung Hsu <jhenyunghsu@gmail.com>
Update the documentation to describe how to upgrade the kraft feature
version from 0 to 1.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Alyssa Huang
<ahuang@confluent.io>