Now that Kafka supports Java 17, this PR makes some changes in the
connect module. The changes in this PR are limited to a subset of files;
future PRs will follow.
The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()
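For illustration, the substitution pattern looks like this (a minimal
sketch; the class name is a placeholder, and note that the `of`
factories reject null elements, unlike `Arrays.asList` and
`Collections.singletonList`):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CollectionFactoryExample {
    public static void main(String[] args) {
        // Before: pre-Java 9 helpers
        List<String> oldList = Arrays.asList("a", "b");
        Map<String, Integer> oldMap = Collections.singletonMap("a", 1);
        Set<String> oldSet = Collections.singleton("x");

        // After: Java 9+ immutable factories
        List<String> newList = List.of("a", "b");
        Map<String, Integer> newMap = Map.of("a", 1);
        Set<String> newSet = Set.of("x");
    }
}
```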
Modules targeted: test-plugins, transforms
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Changes: Rename `waitForTopic` to `waitTopicCreation` for better clarity
Reasons: To align with `waitTopicDeletion`
Reference: https://github.com/apache/kafka/pull/20108/files#r2221659660
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<frankvicky@apache.org>
This feature adds maintenance burden and potential security concerns
while providing no apparent value to the Kafka community. See
[KIP-1193](https://cwiki.apache.org/confluence/x/dAxJFg) for more
details.
Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
<s7133700@gmail.com>
---------
Signed-off-by: Federico Valeri <fedevaleri@gmail.com>
Now that Kafka supports Java 17, this PR makes some changes in the tools
module. The changes in this PR are limited to a subset of files; future
PRs will follow.
The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()
Submodules targeted: tools/src/main
Reviewers: Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
1. JMH benchmarks should return a value rather than return void (the
compiler can eliminate an unused result, making the benchmark
incorrect); see the sketch below.
2. Also move constant variables from the method to the class, to prevent
the JIT from constant-folding them.
3. Increase warm-up iterations.
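A minimal JMH sketch of points 1 and 2 (benchmark and field names are
illustrative):

```java
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class ReturnValueBenchmark {
    // Kept as a non-final field at class level: a method-local constant
    // could be constant-folded by the JIT, skewing the measurement.
    private int input = 42;

    @Benchmark
    public int measureCompute() {
        // Returning the result feeds JMH's implicit Blackhole, so
        // dead-code elimination cannot remove the computation.
        return compute(input);
    }

    private int compute(int x) {
        return x * x + 31;
    }
}
```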
Reviewers: Lucas Brutschy <lucasbru@apache.org>
There is a typo in the unit test: it calls
`runOnceWithoutProcessingThreads` while it should call
`runOnceWithProcessingThreads` instead.
Reviewers: Lucas Brutschy <lucasbru@apache.org>
* We log an INFO message if a share partition could be cold snapshotted.
* However, this may create noise if the topic backing the share
partition is highly partitioned. This is further exacerbated by multiple
share groups using that topic.
* To reduce log pollution, this PR changes the level to DEBUG.
Reviewers: ShivsundarR <shr@confluent.io>, Andrew Schofield
<aschofield@confluent.io>
Implements KIP-1034 to add support for a Dead Letter
Queue in Kafka Streams.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Bruno Cadonna
<cadonna@apache.org>
Co-authored-by: Sebastien Viale <sebastien.viale@michelin.com>
The PR refactors the findNextFetchOffset variable from AtomicBoolean to
a plain boolean, since access always happens while holding a lock
(sketched below). It also improves handling of the
`writeShareGroupState` method response: the full lock is no longer
required, only a lock over the relevant sub-section.
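A minimal sketch of the pattern (names are illustrative, not the actual
SharePartition code):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class FetchOffsetState {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // A plain boolean suffices: every access happens under the lock, so
    // the extra memory semantics of AtomicBoolean buy nothing here.
    private boolean findNextFetchOffset = false;

    void markFindNextFetchOffset() {
        lock.writeLock().lock();
        try {
            findNextFetchOffset = true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean shouldFindNextFetchOffset() {
        lock.readLock().lock();
        try {
            return findNextFetchOffset;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```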
Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield
<aschofield@confluent.io>
1. Add a missing-leader check in
`ConsumerGroupCommand.ConsumerGroupService#prepareOffsetsToReset` in
order to fail quickly
2. Add some tests
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Lan Ding <isDing_L@163.com>,
Ken Huang <s7133700@gmail.com>, Andrew Schofield
<aschofield@confluent.io>
Add integration tests for `Admin.deleteShareGroupOffsets`,
`Admin.alterShareGroupOffsets` and `Admin.listShareGroupOffsets` to
`PlaintextAdminIntegrationTest`.
Reviewers: Andrew Schofield <aschofield@confluent.io>
## Summary
jira: https://issues.apache.org/jira/browse/KAFKA-19517
Ensure `LoadSummary#numRecords` counts all records, including control
batches, to maintain consistency with numBytes.
## Test
`testLoading` now verifies `numRecords`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>
This patch adds an API level integration test for the producer epoch
verification when processing transactional offset commit and end txn
markers.
Reviewers: PoAn Yang <payang@apache.org>, TengYao Chi
<kitingiao@gmail.com>, Sean Quah <squah@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>
This patch fixes the bug that allows the last known leader to be elected as a partition leader while still in a fenced state, before the next heartbeat removes the fence.
https://issues.apache.org/jira/browse/KAFKA-19522
Reviewers: Jun Rao <junrao@gmail.com>, TengYao Chi
<frankvicky@apache.org>
see https://github.com/apache/kafka/pull/19769#issuecomment-3065869429
This patch adds a test to `ProtocolTest` to ensure the Protocol page displays the correct API version range.
Reviewers: Yung <yungyung7654321@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Gaurav Narula <gaurav_narula2@apple.com>, Ken
Huang <s7133700@gmail.com>, Jimmy Wang <wangzhiwang@qq.com>
This PR removes the dependencies on `core` and `scala-library` from the
`coordinator-common` module, as a follow-up to
https://github.com/apache/kafka/pull/20089.
These dependencies have been removed from tests, and the previously
added import-control relaxations have been reverted accordingly.
Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
<s7133700@gmail.com>
Temporarily fix it by disabling the new protocol; we will take a deeper
look at it in the consumer protocol.
Reviewers: Matthias J. Sax <matthias@confluent.io>
The comment on the RemoteLogManager.getLeaderEpochEntries method has a
small error in its description; it should be start (inclusive) and end
(exclusive).
Reviewers: Ken Huang <s7133700@gmail.com>, Lan Ding <isDing_L@163.com>,
Chia-Ping Tsai <chia7712@gmail.com>
Improve the error message in kafka-storage.sh when an incorrect
release-version is given. Specifically, following the behavior of
kafka-feature.sh, when an incorrect release-version is entered, the tool
now returns the currently supported versions to the user.
Reviewers: TengYao Chi <frankvicky@apache.org>, Yung
<yungyung7654321@gmail.com>
The MetadataImage has a lot of stuff in it and it gets passed around in
many places in the new GroupCoordinator. This makes it difficult to
understand what metadata the group coordinator actually relies on and
makes it too easy to use metadata in ways it wasn't meant to be used.
This change encapsulates the MetadataImage in an interface
(`CoordinatorMetadataImage`) that indicates and controls what metadata
the group coordinator actually uses. Now it is much easier to see at a
glance what dependencies the GroupCoordinator has on the metadata. Also,
now we have a level of indirection that allows more flexibility in how
the GroupCoordinator is provided the metadata it needs.
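A hypothetical sketch of the shape of such an interface (the method
names are illustrative, not the actual API):

```java
import java.util.Optional;
import java.util.Set;

// The coordinator programs against this narrow view instead of the
// full MetadataImage, making its metadata dependencies explicit.
public interface CoordinatorMetadataImage {
    Set<String> topicNames();
    Optional<Integer> partitionCount(String topicName);
    long version();
}
```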
`ReplicaManager#alterReplicaLogDirs` does not resume log cleaner while
handling an `AlterReplicaLogDirs` request for a topic partition which
already has an `AlterReplicaLogDirs` in progress, leading to a resource
leak where the cleaning for topic partitions remains paused even after
the log directory has been altered.
This change ensures we invoke `LogManager#resumeCleaning` if the future
replica directory has changed.
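The leak mechanics, reduced to a minimal sketch (hypothetical names,
assuming pauses are reference-counted as in the log cleaner): a pause
without a matching resume leaves the partition paused forever.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CleanerPauseTracker {
    private final Map<String, Integer> pauseCounts = new ConcurrentHashMap<>();

    void pauseCleaning(String topicPartition) {
        pauseCounts.merge(topicPartition, 1, Integer::sum);
    }

    void resumeCleaning(String topicPartition) {
        // Decrement the pause count; remove the entry when it reaches zero.
        pauseCounts.computeIfPresent(topicPartition,
            (tp, count) -> count > 1 ? count - 1 : null);
    }

    boolean isPaused(String topicPartition) {
        return pauseCounts.getOrDefault(topicPartition, 0) > 0;
    }
}
```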
Reviewers: Jun Rao <junrao@gmail.com>
Remove isEligibleLeaderReplicasV1Enabled to let ELR be enabled if the
MV is at least 4.1IV1. Also bump the latest production MV to 4.1IV1.
Reviewers: Paolo Patierno <ppatierno@live.com>, Jun Rao <junrao@gmail.com>
### Summary of Changes
- Rewrote both `CoordinatorLoaderImpl` and `CoordinatorLoaderImplTest`
in Java, replacing their original Scala implementations.
- Removed the direct dependency on `ReplicaManager` and replaced it with
functional interfaces for `partitionLogSupplier` and
`partitionLogEndOffsetSupplier`
- Preserved original logic and test coverage during migration.
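A sketch of the resulting constructor shape (type names simplified; the
log type is left generic):

```java
import java.util.Optional;
import java.util.function.Function;

import org.apache.kafka.common.TopicPartition;

public class CoordinatorLoaderSketch<L> {
    private final Function<TopicPartition, Optional<L>> partitionLogSupplier;
    private final Function<TopicPartition, Optional<Long>> partitionLogEndOffsetSupplier;

    public CoordinatorLoaderSketch(
            Function<TopicPartition, Optional<L>> partitionLogSupplier,
            Function<TopicPartition, Optional<Long>> partitionLogEndOffsetSupplier) {
        // The loader no longer needs a ReplicaManager; any component able
        // to resolve a partition to its log and log end offset will do.
        this.partitionLogSupplier = partitionLogSupplier;
        this.partitionLogEndOffsetSupplier = partitionLogEndOffsetSupplier;
    }
}
```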
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Ken Huang <s7133700@gmail.com>,
TengYao Chi <frankvicky@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
Added required ACLs for new streams operations:
- STREAMS_GROUP_HEARTBEAT (88) requires:
• READ on Group
• DESCRIBE on Topics
• [Conditional] CREATE on Cluster or Topics
- STREAMS_GROUP_DESCRIBE (89) requires:
• DESCRIBE on Group
• DESCRIBE on Topic
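For illustration, granting the STREAMS_GROUP_HEARTBEAT ACLs through the
Admin API might look like this (a sketch; the principal and resource
names are placeholders):

```java
import java.util.List;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

public class StreamsAclExample {
    public static void grant(Admin admin) {
        // READ on the streams group
        AclBinding readGroup = new AclBinding(
            new ResourcePattern(ResourceType.GROUP, "streams-app-group", PatternType.LITERAL),
            new AccessControlEntry("User:streams-app", "*",
                AclOperation.READ, AclPermissionType.ALLOW));
        // DESCRIBE on the topics the topology reads and writes
        AclBinding describeTopic = new AclBinding(
            new ResourcePattern(ResourceType.TOPIC, "input-topic", PatternType.LITERAL),
            new AccessControlEntry("User:streams-app", "*",
                AclOperation.DESCRIBE, AclPermissionType.ALLOW));

        admin.createAcls(List.of(readGroup, describeTopic)).all();
    }
}
```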
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Co-authored-by: Lucas Brutschy <lbrutschy@gmail.com>
The mocked value for `UnifiedLog#topicId` was incorrectly set up which
caused test failure.
Reviewers: Luke Chen <showuon@gmail.com>, PoAn Yang <payang@apache.org>, Satish Duggana <satishd@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
The changes update the OpenJDK base image from 17-buster to 17-bullseye:
- Updates tests/docker/Dockerfile to use openjdk:17-bullseye instead of
openjdk:17-buster
- Updates tests/docker/ducker-ak script to use the new default image
- Updates documentation in tests/README.md with the new image name
examples
Reviewers: Federico Valeri <fedevaleri@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
* The Coordinator starts with a smaller buffer, which can grow as
needed.
* In freeCurrentBatch, release the appropriate buffer: the Coordinator
recycles the expanded buffer (`currentBatch.builder.buffer()`), not
`currentBatch.buffer`, because `MemoryBuilder` may allocate a new
`ByteBuffer` if the existing one isn't large enough.
* There are two cases where the buffer may exceed `maxMessageSize`:
  1. If there's a single record whose size exceeds `maxMessageSize`
(which, so far, is derived from `max.message.bytes`) and the write is in
`non-atomic` mode, it's still possible for the buffer to grow beyond
`maxMessageSize`. In this case, the Coordinator should revert to using a
smaller buffer afterward.
  2. The Coordinator does not recycle buffers larger than
`maxMessageSize`. If the user dynamically reduces `maxMessageSize` to a
value even smaller than `INITIAL_BUFFER_SIZE`, the Coordinator should
avoid recycling any buffer larger than `maxMessageSize` so that it can
allocate the smaller buffer in the next round.
* Add tests to verify the above scenarios (a sketch of the recycling
rule follows the list).
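A minimal sketch of the recycling rule described above (names and the
initial size are illustrative):

```java
import java.nio.ByteBuffer;

class BufferRecycler {
    static final int INITIAL_BUFFER_SIZE = 16 * 1024; // placeholder value
    private ByteBuffer cachedBuffer;

    // Recycle the builder's (possibly re-allocated) buffer, but only if
    // it does not exceed the current maxMessageSize; otherwise drop it
    // so the next round can allocate a smaller one.
    void release(ByteBuffer expandedBuffer, int maxMessageSize) {
        cachedBuffer = expandedBuffer.capacity() <= maxMessageSize
            ? expandedBuffer
            : null;
    }

    ByteBuffer acquire(int maxMessageSize) {
        if (cachedBuffer != null && cachedBuffer.capacity() <= maxMessageSize) {
            ByteBuffer buffer = cachedBuffer;
            cachedBuffer = null;
            buffer.clear();
            return buffer;
        }
        return ByteBuffer.allocate(Math.min(INITIAL_BUFFER_SIZE, maxMessageSize));
    }
}
```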
Reviewers: David Jacot <djacot@confluent.io>, Sean Quah
<squah@confluent.io>, Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, TaiJuWu <tjwu1217@gmail.com>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
### Background
As part of KIP-932 implementation, ShareFetch requests need to properly
integrate with Kafka's quota system. This requires that ShareFetch
requests extract and pass the correct session information (Principal,
client address, client ID) to quota managers, ensuring consistent quota
enforcement between ShareFetch and traditional Fetch requests.
### Changes
This PR adds `testHandleShareFetchRequestQuotaTagsVerification()`,
`testHandleShareAcknowledgeRequestQuotaTagsVerification` and
`testHandleShareFetchWithAcknowledgementQuotaTagsVerification` to
`KafkaApisTest`, which provide verification of quota tag extraction and
session handling for ShareFetch and ShareAcknowledge requests.
- Ensures ShareFetch/ShareAck requests are properly constructed with
the correct client ID, principal, client address, and API key
- Verifies the request context contains the expected session
information
- Uses `ArgumentCaptor` to capture the exact `Session` and
`RequestChannel.Request` objects passed to quota managers
- Verifies both `quotas.fetch.maybeRecordAndGetThrottleTimeMs()` and
`quotas.request.maybeRecordAndGetThrottleTimeMs()` are called with
correct parameters as and when needed.
- Validates that the captured `RequestChannel.Request` object
maintains the correct request context information
- Ensures the client ID passed to quota managers matches the
test-defined value
- Verifies that when acknowledgements are piggybacked on fetch
requests, the quotas are applied only once, not twice.
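The capture pattern itself, reduced to a self-contained sketch (the
`Session` and `QuotaManager` types below are stand-ins, not the actual
Kafka classes):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import org.junit.jupiter.api.Test;
import org.mockito.ArgumentCaptor;

class QuotaTagCaptureExample {
    interface Session { String clientId(); }
    interface QuotaManager { void record(Session session); }

    @Test
    void capturesSessionPassedToQuotaManager() {
        QuotaManager quotas = mock(QuotaManager.class);
        Session session = () -> "test-client";

        quotas.record(session); // the code under test would do this

        // Capture the exact Session handed to the quota manager and
        // assert on its client ID.
        ArgumentCaptor<Session> captor = ArgumentCaptor.forClass(Session.class);
        verify(quotas).record(captor.capture());
        assertEquals("test-client", captor.getValue().clientId());
    }
}
```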
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
- Updated `ClientQuotaImage` and `TopicImage` by using
`Collections.unmodifiableMap` or `ImmutableMap` to prevent accidental or
intentional mutations after construction.
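The defensive pattern, as a minimal sketch (class and field names are
illustrative):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class TopicImageSketch {
    private final Map<Integer, String> partitions;

    TopicImageSketch(Map<Integer, String> partitions) {
        // Copy, then wrap: callers of partitions() get a read-only view,
        // and later changes to the caller's map cannot leak in either.
        this.partitions = Collections.unmodifiableMap(new HashMap<>(partitions));
    }

    Map<Integer, String> partitions() {
        return partitions; // any put() here throws UnsupportedOperationException
    }
}
```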
Reviewers: Alyssa Huang <ahuang@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>
This PR performs a refactoring of LockUtils and improves inline
comments, as a follow-up to https://github.com/apache/kafka/pull/19961.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>
For the Kafka Streams group commands, if delete-topic requests fail due
to a version mismatch, the user has to remove the topics manually by
first retrieving the relevant internal topics.
To assist the user, the internal topic names are now included in the
error message, so that the user can delete the internal topics
associated with this application directly.
Reviewers: TengYao Chi <frankvicky@apache.org>, Alieh Saeedi
<asaeedi@confluent.io>
The `AdminClient` adds a telemetry reporter to the metrics reporters
list in the constructor. The problem is that the reporter was already
added in the `createInternal` method. In the `createInternal` method
call, the `clientTelemetryReporter` is added to a
`List<MetricsReporter>` that is passed to the `Metrics` object and gets
closed when `Metrics.close()` is called. But the reporter added to the
reporters list in the constructor is not used by the `Metrics` object
and hence doesn't get closed, causing a memory leak.
All related tests pass after this change.
Reviewers: Apoorv Mittal <apoorvmittal10@apache.org>, Matthias J. Sax
<matthias@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>,
Jhen-Yung Hsu <jhenyunghsu@gmail.com>
The new "streams" protocol behaves slightly different to the "classic"
protocol, and thus we need to update the test to avoid race conditions.
In particular, the first call to `poll()` won't "block" and return after
task assignment completed if we need to create internal topics, but
returns early without a task assignment, and only a consecutive
rebalance will assign tasks.
This implies, that KafkaStreams transits to RUNNING state even if the
group is still in NOT_READY state broker side, but this NOT_READY state
is not reflected in the client side state machine.
Disabling the combination of "complex-topology + streams" for now,
until this difference in behavior of the client state machine is fixed.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Add documentation for Batch Format to explain the meaning of
batchLength.
Reviewers: Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
### Summary
Extends RequestQuotaTest to include ShareFetch API quota testing,
ensuring compliance with KIP-932.
### Key Changes
- New test: testShareFetchUsesSameFetchSensor() - Verifies ShareFetch
and Fetch use the same FETCH quota sensor
- New test:
testResponseThrottleTimeWhenBothShareFetchAndRequestQuotasViolated() -
Tests ShareFetch throttling behaviour
- Request builder: Added ApiKeys.SHARE_FETCH case with proper ShareFetch
request construction
- Some minor cleanup wrt use of Collections
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
This PR adds the following metrics for each of the supported production
features (`metadata.version`, `kraft.version`, `transaction.version`,
etc.):
`kafka.server:type=MetadataLoader,name=FinalizedLevel,featureName=X`
`kafka.server:type=node-metrics,name=maximum-supported-level,feature-name=X`
`kafka.server:type=node-metrics,name=minimum-supported-level,feature-name=X`
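For example, the finalized level for `metadata.version` could be read
over JMX like this (a sketch; the "Value" attribute name is an
assumption based on the usual gauge convention):

```java
import java.lang.management.ManagementFactory;

import javax.management.MBeanServer;
import javax.management.ObjectName;

public class FeatureLevelJmxExample {
    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName gauge = new ObjectName(
            "kafka.server:type=MetadataLoader,name=FinalizedLevel,featureName=metadata.version");
        // Read the gauge's current value from a broker's platform MBean server.
        Object level = server.getAttribute(gauge, "Value");
        System.out.println("metadata.version finalized level: " + level);
    }
}
```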
Reviewers: Josep Prat <josep.prat@aiven.io>, PoAn Yang
<payang@apache.org>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Lan Ding
<isDing_L@163.com>, Chia-Ping Tsai <chia7712@gmail.com>
**Problem Description**
In the `RemoteIndexCache.cleanup()` method, the asynchronous invocation
of `index.deleteIfExists()` may cause a conflict. When the
`getIndexFileFromRemoteCacheDir()` method is executed, it utilizes
`Files.walk()` to traverse all files in the directory path. If
`index.deleteIfExists()` is triggered during this traversal, a
`NoSuchFileException` will be thrown.
**Solution**
To resolve this issue, ensure that `index.deleteIfExists()` has been
fully executed before invoking `getIndexFileFromRemoteCacheDir()`.
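A sketch of the ordering fix (names are illustrative): wait for the
asynchronous deletion to finish before traversing the directory, so
`Files.walk()` never races with `deleteIfExists()`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;

class RemoteCacheDirScan {
    static Optional<Path> findIndexFile(CompletableFuture<Void> pendingDeletion,
                                        Path cacheDir,
                                        String suffix) throws IOException {
        pendingDeletion.join(); // ensure deleteIfExists() has completed
        try (Stream<Path> paths = Files.walk(cacheDir)) {
            return paths
                .filter(p -> p.getFileName().toString().endsWith(suffix))
                .findFirst();
        }
    }
}
```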
Reviewers: Jun Rao <junrao@gmail.com>
The `testHWCheckpointWithFailuresMultipleLogSegments` test in
`LogRecoveryTest` was failing intermittently due to a race condition
during its failure simulation.
In successful runs, the follower broker would restart and rejoin the
In-Sync Replica (ISR) set before the old leader's failure was fully
processed. This allowed for a clean and timely leader election to the
now in-sync follower.
However, in the failing runs, the follower did not rejoin the ISR before
the leader election was triggered. With no replicas in the ISR and
unclean leader election disabled by default for the test, the controller
correctly refused to elect a new leader, causing the test to time out.
This commit fixes the flakiness by overriding the controller
configuration for this test to explicitly enable unclean leader
election. This allows the out-of-sync replica to be promoted to leader,
making the test deterministic and stable.
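The override itself is small; a sketch (the config key is the real
`unclean.leader.election.enable`; the harness around it is whatever the
test framework provides):

```java
import java.util.Properties;

public class UncleanElectionOverride {
    public static Properties controllerOverrides() {
        Properties props = new Properties();
        // Allow an out-of-sync replica to be elected leader in this test.
        props.setProperty("unclean.leader.election.enable", "true");
        return props;
    }
}
```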
Reviewers: Jun Rao <junrao@gmail.com>
1. Optimize the corresponding logic in the `ConsumerGroupCommand` by
first checking whether a leader exists for the partition before invoking
`admin.listOffsets`, and finally concatenating the data and returning it
2. Add an integration test: create a cluster with 3 brokers, then shut
down a broker and verify that the output meets expectations
Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
This PR enables reading remote storage for multiple partitions in one
fetchRequest. The main changes are:
1. In `DelayedRemoteFetch`, we now accept multiple remoteFetchTasks and
other metadata.
2. In `DelayedRemoteFetch`, we'll wait until all remote fetches are
done, whether they succeeded or failed (see the sketch after this list).
3. In `ReplicaManager#fetchMessage`, we'll create one
`DelayedRemoteFetch`, pass multiple remoteFetch metadata to it, and
watch all of them.
4. Added tests
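A minimal sketch of the completion rule in point 2 (names are
illustrative): the delayed operation completes only once every
partition's remote fetch has finished, treating failure the same as
success for completion purposes.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class RemoteFetchCompletion {
    public static CompletableFuture<Void> allDone(List<CompletableFuture<?>> tasks) {
        return CompletableFuture.allOf(
            tasks.stream()
                 // handle() completes normally on both success and failure
                 .map(t -> t.handle((result, error) -> null))
                 .toArray(CompletableFuture[]::new));
    }
}
```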
Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Federico Valeri <fedevaleri@gmail.com>, Satish Duggana <satishd@apache.org>
Task pairs is an optimization that is enabled in the current sticky task
assignor.
The basic idea is that every time we add a task A to a client that has
tasks B, C, we add pairs (A, B) and (A, C) to a global collection of
pairs. When adding a standby task, we then prioritize creating standby
tasks that create new task pairs. If this does not work, we fall back to
the original behavior.
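A minimal sketch of this bookkeeping (task ids simplified to strings):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TaskPairs {
    private final Set<String> pairs = new HashSet<>();

    // Assigning newTask to a client that already owns clientTasks
    // records one pair per existing task: (A,B), (A,C), ...
    void addPairs(String newTask, List<String> clientTasks) {
        for (String existing : clientTasks) {
            pairs.add(pairKey(newTask, existing));
        }
    }

    // Standby placement prefers clients where the candidate would
    // create at least one pair not seen before.
    boolean wouldCreateNewPair(String candidate, List<String> clientTasks) {
        return clientTasks.stream()
            .anyMatch(t -> !pairs.contains(pairKey(candidate, t)));
    }

    private static String pairKey(String a, String b) {
        return a.compareTo(b) < 0 ? a + "|" + b : b + "|" + a;
    }
}
```

The global pair set grows with the square of the task count in the worst
case, which is the space cost discussed below.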
The complexity of this optimization is fairly significant, and its
usefulness is questionable: the HighAvailabilityAssignor does not seem
to have such an optimization, and the absence of this optimization does
not seem to have caused any problems that I know of. I could not find
any record of what this optimization is actually trying to achieve.
A side effect of it is that we will sometimes avoid “small loops”, such
as:
Node A: ActiveTask1, StandbyTask2
Node B: ActiveTask2, StandbyTask1
Node C: ActiveTask3, StandbyTask4
Node D: ActiveTask4, StandbyTask3
In a small loop like this, losing two nodes will in the worst case cause
2 tasks to go down, so the assignor prefers:
Node A: ActiveTask1, StandbyTask4
Node B: ActiveTask2, StandbyTask1
Node C: ActiveTask3, StandbyTask2
Node D: ActiveTask4, StandbyTask3
which is a “big loop” assignment, where losing two nodes will in the
worst case cause at most 1 task to be unavailable. However, this
optimization seems fairly niche, and the current implementation does not
implement it in a direct form, but as a more relaxed constraint that
usually, but not always, avoids small loops. So it remains unclear
whether this is really the intention behind the optimization. The
current unit tests of the StickyTaskAssignor pass even after removing
the optimization.
The pairs optimization has worst-case quadratic space and time
complexity in the number of tasks, and makes a lot of other
optimizations impossible, so I’d suggest we remove it. I don’t think, in
its current form, it is suitable to be implemented in a broker-side
assignor. Note, however, that if we identify a useful effect of the code
in the future, we can work on finding an efficient algorithm that can
bring the optimization to our broker-side assignor.
This reduces the runtime of our worst case benchmark by 10x.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
While walking through the source code I confirmed that the broker checks
`replica.fetch.min.bytes` exactly the same way it checks
`fetch.min.bytes`, so this patch updates the wording for both config
keys.
Co-authored-by: yangxuze <xuze_yang@163.com>
Reviewers: Luke Chen <showuon@gmail.com>
Move the rollback out of the lock: if the persister returns a completed
future for write state, then the same data-plane-request-handler thread
should not call the purgatory's safeTryAndComplete while holding the
SharePartition's write lock.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit
<adixit@confluent.io>