Commit Graph

6031 Commits

Author SHA1 Message Date
PoAn Yang cff10e6541
KAFKA-19302 Move ReplicaState and Replica to server module (#19755)
CI / build (push) Waiting to run Details
1. Move `ReplicaState` and `Replica` to server module.
2. Rewrite `ReplicaState` and `Replica` in Java.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-05-19 23:59:12 +08:00
Ken Huang 6573b4ace1
KAFKA-19042 Move PlaintextConsumerCommitTest to client-integration-tests module (#19389)
Use Java to rewrite `PlaintextConsumerCommitTest` by new test infra and
move it to client-integration-tests module.

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-05-19 21:51:42 +08:00
Hong-Yi Chen ce4940f989
MINOR: Refactor shared-group request handle methods to return CompletableFuture for consistent error handling (#19724)
CI / build (push) Waiting to run Details
This PR is based on the discussion here:
https://github.com/apache/kafka/pull/18929#discussion_r2083238645

Currently, the handle methods related to `shared‐group` requests  are
inconsistent: some return `Unit` while others return
`CompletableFuture[Unit]` without a clear rationale. This PR
standardizes all of them to return `CompletableFuture[Unit]` and ensures
consistent error handling by chaining `.exceptionally(handleError)` to
each call site.

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
 <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-19 01:45:50 +08:00
Yunchi Pang 6596ba3a78
MINOR: Remove unnecessary test conditions where ListOffsetsRequest version is 0 (#19738)
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-19 00:20:24 +08:00
Jhen-Yung Hsu ced56a320b
MINOR: Move logDirs config out of KafkaConfig (#19579)
CI / build (push) Waiting to run Details
Follow up https://github.com/apache/kafka/pull/19460/files#r2062664349

Reviewers: Ismael Juma <ismael@juma.me.uk>, PoAn Yang
<payang@apache.org>, TaiJuWu <tjwu1217@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-17 00:52:20 +08:00
Bolin Lin 21ffc07212
MINOR: migrate MinInSyncReplicasConfigTest to server module (#19727)
Migrate MinInSyncReplicasConfigTest to the server module


Reviewers: PoAn Yang <payang@apache.org>, Yung
 <yungyung7654321@gmail.com>, TengYao Chi <frankvicky@apache.org>, Ken
 Huang <s7133700@gmail.com>
2025-05-16 17:18:28 +08:00
PoAn Yang c26b09c609
KAFKA-18904: [1/N] Change ListClientMetricsResources API to ListConfigResources (#19493)
* Change `ListClientMetricsResourcesRequest.json` to
`ListConfigResourcesRequest.json`.
* Change `ListClientMetricsResourcesResponse.json` to
`ListConfigResourcesResponse.json`.
* Change `ListClientMetricsResourcesRequest.java` to
`ListConfigResourcesRequest.java`.
* Change `ListClientMetricsResourcesResponse.java` to
`ListConfigResourcesResponsejava`.
* Change `KafkaApis` to handle both `ListClientMetricsResourcesRequest`
v0 and v1 requests.

Reviewers: Andrew Schofield <aschofield@confluent.io>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-05-15 23:39:00 +01:00
Andrew Schofield 567a03dd14
HOTFIX: Parameter mismatch in test for BrokerMetadataPublisher (#19733)
Recent changes to BrokerMetadataPublisher seem to have caused a mismatch
between the BrokerMetadataPublisher and its test.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Sushant Mahajan
 <smahajan@confluent.io>
2025-05-15 23:20:29 +01:00
Abhinav Dixit 2e0e9b9056
MINOR: Improvements for Persister APIs with respect to share group protocol enablement (#19719)
CI / build (push) Waiting to run Details
### About
Minor improvements to Persister APIs in KafkaApis with respect to share
group protocol enablement

### Testing
The new code added has been tested with the help of unit tests

Reviewers: Sushant Mahajan <sushant.mahajan88@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-15 18:29:16 +01:00
David Jacot c612cfff29
KAFKA-19274; Group Coordinator Shards are not unloaded when `__consumer_offsets` topic is deleted (#19713)
Group Coordinator Shards are not unloaded when `__consumer_offsets`
topic is deleted. The unloading is scheduled but it is ignored because
the epoch is equal to the current epoch:

```
[2025-05-13 08:46:00,883] INFO [GroupCoordinator id=1]
Scheduling  unloading of metadata for __consumer_offsets-0 with epoch
OptionalInt[0]
(org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
[2025-05-13 08:46:00,883] INFO [GroupCoordinator id=1] Scheduling
unloading of metadata for __consumer_offsets-1 with epoch OptionalInt[0]
(org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
[2025-05-13 08:46:00,883] INFO [GroupCoordinator id=1] Ignored unloading
metadata for __consumer_offsets-0 in epoch OptionalInt[0] since current
epoch is 0.
(org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
[2025-05-13 08:46:00,883] INFO [GroupCoordinator id=1] Ignored unloading
metadata for __consumer_offsets-1 in epoch OptionalInt[0] since current
epoch is 0.
(org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
```

This patch fixes the issue by not setting the leader epoch in this case.
The coordinator expects the leader epoch to be incremented when the
resignation code is called. When the topic is deleted, the epoch is not
incremented. Therefore, we must not use it. Note that this is aligned
with deleted partitions are handled too.

Reviewers: Dongnuo Lyu <dlyu@confluent.io>, José Armando García Sancio <jsancio@apache.org>
2025-05-15 19:04:38 +02:00
Lianet Magrans 8a577fa5af
MINOR: adding consumer fenced test and log (#19723)
Adding test to specifically force the fencing path due to delayed
rebalance, and validate how the consumer recovers automatically.

Running this test and DEBUG log enabled, allows to see the details of
the fencing flow: consumer getting fenced due to rebalance exceeded,
resetting to epoch 0, rejoining on the next poll with the existing
subscription, and being accepted back in the group (so consumption
resumes)

This is aimed to help understand
[KAFKA-19233](https://issues.apache.org/jira/browse/KAFKA-19233)

Will add another one in separate PR to also involve commits in similar
fencing scenarios.

Reviewers: TengYao Chi <frankvicky@apache.org>
2025-05-15 12:57:11 -04:00
Chirag Wadhwa f9064f8bcd
KAFKA-19231-1: Handle fetch request when share session cache is full (#19701)
According to the recent changes in KIP-932, when the share session cache
is full and a broker receives a Share Fetch request with Initial Share
Session Epoch (0), then the error code `SHARE_SESSION_LIMIT_REACHED` is
returned after a delay of maxWaitMs. This PR implements this logic. In
order to add a delay between subsequent share fetch requests, the timer
is delayed operation purgatory is used. A new `IdleShareFetchTimeTask`
has been added which takes in a CompletableFuture<Void>. Upon the
expiration, this future is completed with null. When the future is
completes, a response is sent back to the client with the error code
`SHARE_SESSION_LIMIT_REACHED`

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-15 14:36:44 +01:00
Sushant Mahajan 847968e530
KAFKA-19281: Add share enable flag to periodic jobs. (#19721)
* We have a few periodic timer tasks in `ShareCoordinatorService` which
run continuously.
* With the recent introduction of share group enabled config at feature
level, we would like these jobs to stop when the aforementioned feature
is disabled.
* In this PR, we have added the functionality to make that possible.
* Additionally the service has been supplemented with addition of a
static share group config supplier.
* New test has been added.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal
 <apoorvmittal10@gmail.com>
2025-05-15 14:05:06 +01:00
Apoorv Mittal 6d45657cde
MINOR: Cleaning up code for share feature listener (#19715)
CI / build (push) Waiting to run Details
The PR is a minor follow up on
https://github.com/apache/kafka/pull/19659.

KafkaApis.scala already have a check which denies new share fetch
related calls if the share group is not supported. Hence no new sessions
shall be created so the requirement to have share group enabled flag in
ShareSessionCache is not needed, unless I am missing something.

Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-05-15 10:10:54 +01:00
xijiu ecb5b6bd7e
KAFKA-19269 `Unexpected error ..` should not happen when the delete.topic.enable is false (#19698)
Do not print the `"Unexpected error .."` log when the config
`delete.topic.enable` is false

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
 <s7133700@gmail.com>, TengYao Chi <frankvicky@apache.org>, Chia-Ping
 Tsai <chia7712@gmail.com>
2025-05-15 15:17:49 +08:00
Ming-Yen Chung 7d4acedc27
KAFKA-19270: Remove Optional from ClusterInstance#controllerListenerName() return type (#19718)
CI / build (push) Waiting to run Details
In KRaft mode, controllerListenerName must always be specified, so we don't need an `Optional` to wrap it.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ken Huang
<s7133700@gmail.com>, TengYao Chi <frankvicky@apache.org>
2025-05-15 10:47:47 +08:00
Kevin Wu 37963256d1
KAFKA-18666: Controller-side monitoring for broker shutdown and startup (#19586)
CI / build (push) Waiting to run Details
This PR introduces the following per-broker metrics:

-`kafka.controller:type=KafkaController,name=BrokerRegistrationState,broker=X`

-`kafka.controller:type=KafkaController,name=TimeSinceLastHeartbeatReceivedMs,broker=X`

and this metric:

`kafka.controller:type=KafkaController,name=ControlledShutdownBrokerCount`

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-05-14 10:59:47 -07:00
PoAn Yang cbfbbe833d
KAFKA-19234: broker should return UNAUTHORIZATION error for non-existing topic in produce request (#19635)
Since topic name is sensitive information, it should return a
TOPIC_AUTHORIZATION_FAILED error for non-existing topic. The Fetch
request also follows this pattern.

Co-authored-by: John Doe <zh2725284321@gmail.com>, Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu <yungyung7654321@gmail.com>

Reviewers: Jun Rao  <junrao@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>, TaiJuWu  <tjwu1217@gmail.com>, TengYao Chi
 <frankvicky@apache.org>, Ken Huang  <s7133700@gmail.com>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>
2025-05-14 09:56:09 -07:00
YuChia Ma 05169aa201
MINOR: Add deprecation warning for `log.cleaner.enable` and `log.cleaner.threads` (#19674)
Add a warning message when using log.cleaner.enable to remind users that
this configuration is deprecated. Also, add a warning message for
log.cleaner.threads=0 because in version 5.0, the value must be greater
than zero.

Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, TengYao Chi <frankvicky@apache.org>, TaiJuWu
<tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-15 00:18:27 +08:00
Xuan-Zhang Gong 15331bbfc4
KAFKA-19273 Ensure the delete policy is configured when the tiered storage is enabled (#19702)
We updated the validation rule for cleanup.policy in remote storage
mode.

If remote log storage is enabled, only cleanup.policy=delete is allowed.
Any other value (e.g. compact, compact,delete) will now result in a
config validation error.

Reviewers: Luke Chen <showuon@gmail.com>, Ken Huang
 <s7133700@gmail.com>, PoAn Yang <payang@apache.org>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-05-15 00:13:55 +08:00
Apoorv Mittal ec70c44362
KAFKA-19116, KAFKA-19258: Handling share group member change events (#19666)
CI / build (push) Waiting to run Details
The PR adds ShareGroupListener which triggers on group changes, such as
when member leaves or group is empty. Such events are specific to
connection on respective broker. There events help to clean specific
states managed for respective member or group in various caches.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-14 09:52:47 +01:00
Bolin Lin f01e5aa964
KAFKA-19145 Move LeaderEndPoint to Server module (#19630)
Move LeaderEndPoint to Server module

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
<s7133700@gmail.com>, TengYao Chi <frankvicky@apache.org>, Chia-Ping
Tsai <chia7712@gmail.com>
2025-05-14 13:47:51 +08:00
José Armando García Sancio a619c6bb95
MINOR; Remove cast for Records' slice (#19661)
CI / build (push) Waiting to run Details
In Java return types are covariant. This means that method override can
override the return type with a subclass.

Reviewers: Jun Rao <junrao@apache.org>, Chia-Ping Tsai
 <chia7712@apache.org>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-14 01:06:38 +01:00
Abhinav Dixit d5ce463ed3
KAFKA-19253: Improve metadata handling for share version using feature listeners (1/N) (#19659)
This PR creates a listener for `SharePartitionManager` to listen to any
changes in `ShareVersion` feature. In case, there is a toggle, we need
to change the attributes in `SharePartitionManager` accordingly.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-13 15:01:03 +01:00
Sushant Mahajan bf53561d16
KAFKA-19201: Handle deletion of user topics part of share partitions. (#19559)
* Currently even if a user topic is deleted, its related records are not
deleted with respect to subscribed share groups from the GC and the SC.
* The event of topic delete is propagated from the
BrokerMetadataPublisher to the coordinators via the
`onPartitionsDeleted` method. This PR leverages this method to issue
cleanup calls to the GC and SC respectively.
* To prevent chaining of futures in the GC, we issue async calls to both
GC and SC independently and the methods take care of the respective
cleanups unaware of the other.
* This method is more efficient and transcends issues related to
timeouts/broker restarts resulting in chained future execution issues.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-13 14:22:17 +01:00
David Jacot d154a314e7
KAFKA-14691; Add TopicId to OffsetFetch API (#19515)
This patch extends the OffsetFetch API to support topic ids. From
version 10 of the API, topic ids must be used.

The patch only contains the server side changes and it keeps the version
10 as unstable for now. We will mark the version as stable when the
client side changes are merged in.

Reviewers: TengYao Chi <frankvicky@apache.org>, Lianet Magrans <lmagrans@confluent.io>
2025-05-13 15:10:10 +02:00
Ming-Yen Chung 57ae6d6706
KAFKA-18695 Remove quorum=kraft and kip932 from all integration tests (#19633)
CI / build (push) Waiting to run Details
Currently, the quorum uses kraft by default, so there's no need to
specify it explicitly.

For kip932 and isShareGroupTest, they are no longer used after #19542 .

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
 <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-12 01:28:30 +08:00
Yunchi Pang f588fa0643
MINOR: Move TxnTransitMetadata to transaction-coordinator (#19662)
Migrates the `TxnTransitMetadata` class from scala to java, moving it
from to the `transaction-coordinator` module.

Reviewers: PoAn Yang <payang@apache.org>, Nick Guo
 <lansg0504@gmail.com>, Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-05-11 22:36:47 +08:00
Alyssa Huang 042be5b9ac
MINOR: Fix some Request toString methods (#19655)
CI / build (push) Waiting to run Details
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-05-09 23:42:34 -07:00
Andrew Schofield 70c0aca4b7
KAFKA-17897: Deprecate Admin.listConsumerGroups [2/N] (#19508)
CI / build (push) Waiting to run Details
Admin.listConsumerGroups() was able to use the early versions of
ListGroups RPC with the version used dependent upon the filters the user
specified. Admin.listGroups(ListGroupsOptions.forConsumerGroups())
inadvertently required ListGroups v5 because it always set a types
filter. This patch handles the UnsupportedVersionException and winds
back the complexity of the request unless the user has specified filters
which demand a higher version.

It also adds ListGroupsOptions.forShareGroups() and forStreamsGroups().
The usability of Admin.listGroups() is much improved as a result.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, PoAn Yang
 <payang@apache.org>
2025-05-09 08:38:16 +01:00
Uladzislau Blok 0076b65f99
KAFKA-19182 Move SchedulerTest to server module (#19608)
CI / build (push) Waiting to run Details
This PR moves SchedulerTest to server module and rewrite it with java.

Please also check updated import control config!

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-05-09 00:02:38 +08:00
PoAn Yang 9e785cee8d
KAFKA-19087 Move TransactionState to transaction-coordinator module (#19568)
Move TransactionState to transaction-coordinator module and rewrite it
as Java.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-05-08 23:51:51 +08:00
David Jacot 98e535b524
MINOR: Simplify OffsetFetchResponse (#19642)
While working on https://github.com/apache/kafka/pull/19515, I came to
the conclusion that the OffsetFetchResponse is quite messy and overall
too complicated. This patch rationalize the constructors.
OffsetFetchResponse has a single constructor accepting the
OffsetFetchResponseData. A builder is introduced to handle the down
conversion. This will also simplify adding the topic ids. All the
changes are mechanical, replacing data structures by others.

Reviewers: Lianet Magrans <lmagrans@confluent.io>
2025-05-08 14:57:45 +02:00
Apoorv Mittal 2dd6126b5d
KAFKA-18855 Slice API for MemoryRecords (#19581)
CI / build (push) Waiting to run Details
The PR adds `slice` API in `Records.java` and further implementation in
`MemoryRecords`. With the addition of ShareFetch and it's support to
read from TieredStorage, where ShareFetch might acquire subset of fetch
batches and TieredStorage emits MemoryRecords, hence a slice API is
needed for MemoryRecords as well to limit the bytes transferred (if
subset batches are acquired).

MemoryRecords are sliced using `duplicate` and `slice` API of
ByteBuffer, which are backed by the original buffer itself hence no-copy
is created rather position, limit and offset are changed as per the new
position and length.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Jun Rao
 <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-08 14:02:25 +08:00
Chirag Wadhwa f3a4a1b185
KAFKA-19241: Updated tests in ShareFetchAcknowledgeRequestTest to reuse the socket for subsequent requests (#19640)
Currently in the tests in ShareFetchAcknowledgeRequestTest, subsequent
share fetch / share acknowledge requests creates a new socket everytime,
even when the requests are sent by the same member. In reality, a single
share consumer clisnet will reuse the same socket for all the share
related requests in its lifetime. This PR changes the behaviour in the
tests to align with reality and reuse the same socket for all requests
by the same share group member.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-07 14:00:41 +01:00
Abhinav Dixit 33abb655eb
KAFKA-19215: Handle share partition fetch lock cleanly using tokens (#19598)
### About
Added code to handle share partition fetch lock cleanly in
`DelayedShareFetch` to avoid a member incorrectly releasing a share
partition's fetch lock

### Testing
The code has been tested with the help of unit tests and integration
tests.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-07 11:13:53 +01:00
José Armando García Sancio 2df14b1190
MINOR; Log message for unexpected buffer allocation (#19596)
Log a message when reading a batch that is larger than the currently
allocated batch.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>, PoAn Yang
 <payang@apache.org>
2025-05-06 12:01:49 -04:00
PoAn Yang 424e7251d6
KAFKA-19207 Move ForwardingManagerMetrics and ForwardingManagerMetricsTest to server module (#19574)
1. Move `ForwardingManagerMetrics` and `ForwardingManagerMetricsTest` to
server module.
2. Rewrite them in Java.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-05-06 20:03:17 +08:00
yunchi 4e77466f6a
KAFKA-19170 Move MetricsDuringTopicCreationDeletionTest to client-integration-tests module (#19528)
rewrite `MetricsDuringTopicCreationDeletionTest` to `ClusterTest` infra
and move it to clients-integration-tests module.

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
<s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
2025-05-06 19:57:16 +08:00
Apoorv Mittal ac9520b922
KAFKA-19227: Piggybacked share fetch acknowledgements performance issue (#19612)
The PR fixes the issue when ShareAcknowledgements are piggybacked on
ShareFetch. The current default configuration in clients sets `batch
size` and `max fetch records` as per the `max.poll.records` config,
default 500. Which means all records in a single poll will be fetched
and acknowledged. Also the default configuration for inflight records in
a partition is 200. Which means prior fetch records has to be
acknowledged prior fetching another batch from share partition.

The piggybacked share fetch-acknowledgement calls from KafkaApis are
async and later the response is combined. If respective share fetch
starts waiting in purgatory because all inflight records are currently
full, hence when startOffset is moved as part of acknowledgement, then a
trigger should happen which should try completing any pending share
fetch requests in purgatory. Else the share fetch requests wait in
purgatory for timeout though records are available, which dips the share
fetch performance.

The regular fetch has a single criteria to land requests in purgatory,
which is min bytes criteria, hence any produce in respective topic
partition triggers to check any pending fetch requests. But share fetch
can wait in purgatory because of multiple reasons: 1) Min bytes 2)
Inflight records exhaustion 3) Share partition fetch lock competition.
The trigger already happens for 1 and current PR fixes 2. We will
investigate further if there should be any handling required for 3.

Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield
<aschofield@confluent.io>
2025-05-06 09:58:25 +01:00
Abhinav Dixit caf4a6cc5f
KAFKA-19216: Eliminate flakiness in kafka.server.share.SharePartitionTest (#19639)
### About
11 of the test cases in `SharePartitionTest` have failed at least once
in the past 28 days.

https://develocity.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe%2FLondon&tests.container=kafka.server.share.SharePartitionTest
Observing the flakiness, they seem to be caused due to the usage of
`SystemTimer` for various acquisition lock timeout related tests. I have
replaced the usage of `SystemTimer` with `MockTimer` and also improved
the `MockTimer` API with regard to removing the timer task entries that
have already been cancelled.
Also, this has reduced the time taken to run `SharePartitionTest` from
~6 sec to ~1.5 sec

### Testing
The testing has been done with the help of already present unit tests in
Apache Kafka.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-05 20:04:22 +01:00
Abhinav Dixit 81c3a285a4
KAFKA-19133: Support fetching for multiple remote fetch topic partitions in a single share fetch request (#19592)
### About
This PR removes the limitation in remote storage fetch for share groups
of only performing remote fetch for a single topic partition in a share
fetch request. With this PR, share groups can now fetch multiple remote
storage topic partitions in a single share fetch request.

### Testing
I have followed the [AK

documentation](https://kafka.apache.org/documentation/#tiered_storage_config_ex)
to test my code locally (by adopting `LocalTieredStorage.java`) and
verify with the help of logs that remote storage is happening for
multiple topic partitions in a single share fetch request. Also,
verified it with the help of unit tests.

Reviewers: Jun Rao <junrao@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-05 19:42:02 +01:00
TaiJuWu 19530738c4
KAFKA-19240 Move MetadataVersionIntegrationTest to clients-integration-tests module (#19641)
The PR do following:
1. Move MetadataVersionIntegrationTest to clients-integration-tests
module
2. rewrite to java from scala

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-05-06 00:12:57 +08:00
xijiu b5cceb43e5
KAFKA-19205: inconsistent result of beginningOffsets/endoffset between classic and async consumer with 0 timeout (#19578)
CI / build (push) Waiting to run Details
In the return results of the methods beginningOffsets and endOffset, if
timeout == 0, then an empty Map should be returned uniformly instead of
in the form of <TopicPartition, null>

Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
 <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet
 Magrans <lmagrans@confluent.io>
2025-05-03 13:12:20 -04:00
YuChia Ma 979f49f967
KAFKA-19146 Merge OffsetAndEpoch from raft to server-common (#19475)
CI / build (push) Waiting to run Details
1. remove org.apache.kafka.raft.OffsetAndEpoch
2. rewrite org.apache.kafka.server.common.OffsetAndEpoch by record
keyword
3. rename OffsetAndEpoch#leaderEpoch to OffsetAndEpoch#epoch

Reviewers: PoAn Yang <payang@apache.org>, Xuan-Zhang Gong
 <gongxuanzhangmelt@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-02 03:12:52 +08:00
Calvin Liu 0c1fbf3aeb
KAFKA-19073 add transactional ID pattern filter to ListTransactions (#19355)
Propose adding a new filter TransactionalIdPattern. This transaction ID pattern filter works as AND with the other transaction filters. Also, it is empowered with Re2j.

KIP: https://cwiki.apache.org/confluence/x/4gm9F

Reviewers: Justine Olshan <jolshan@confluent.io>, Ken Huang
<s7133700@gmail.com>, Kuan-Po Tseng <brandboat@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
2025-05-02 00:52:21 +08:00
Lan Ding 8dbf56e4b5
KAFKA-17541:[1/2] Improve handling of delivery count (#19430)
For records which are automatically released as a result of closing a
share session normally, the delivery count should not be incremented.
These records were fetched but they were not actually delivered to the
client since the disposition of the delivery records is carried in the
ShareAcknowledge which closes the share session. Any remaining records
were not delivered, only fetched.
This PR releases the delivery count for records when closing a share
session normally.

Co-authored-by: d00791190 <dinglan6@huawei.com>

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-05-01 14:40:03 +01:00
Chirag Wadhwa 800612e4a7
KAFKA-19015: Remove share session from cache on share consumer connection drop (#19329)
Up till now, the share sessions in the broker were only attempted to
evict when the share session cache was full and a new session was trying
to get registered. With the changes in this PR, whenever a share
consumer gets disconnected from the broker, the corresponding share
session would be evicted from the cache.

Note - `connectAndReceiveWithoutClosingSocket` has been introduced in
`GroupCoordinatorBaseRequestTest`. This method creates a socket
connection, sends the request, receives a response but does not close
the connection. Instead, these sockets are stored in a ListBuffer
`openSockets`, which are closed in tearDown method after each test is
run. Also, all the `connectAndReceive` calls in
`ShareFetchAcknowledgeRequestTest` have been replaced by
`connectAndReceiveWithoutClosingSocket`, because these tests depends
upon the persistence of the share sessions on the broker once
registered. But, with the new code introduced, as soon as the socket
connection is closed, a connection drop is assumed by the broker,
leading to session eviction.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-05-01 14:36:18 +01:00
Lan Ding e3c456ff0f
KAFKA-19169: Enhance AuthorizerIntegrationTest for share group APIs (#19540)
CI / build (push) Waiting to run Details
Enhance AuthorizerIntegrationTest for share group APIs

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-01 10:13:43 +01:00
Andrew Schofield ce97b1d5e7
KAFKA-16894: Exploit share feature [3/N] (#19542)
This PR uses the v1 of the ShareVersion feature to enable share groups
for KIP-932.

Previously, there were two potential configs which could be used -
`group.share.enable=true` and including "share" in
`group.coordinator.rebalance.protocols`. After this PR, the first of
these is retained, but the second is not. Instead, the preferred switch
is the ShareVersion feature.

The `group.share.enable` config is temporarily retained for testing and
situations in which it is inconvenient to set the feature, but it should
really not be necessary, especially when we get to AK 4.2. The aim is to
remove this internal config at that point.

No tests should be setting `group.share.enable` any more, because they
can use the feature (which is enabled in test environments by default
because that's how features work). For tests which need to disable share
groups, they now set the share feature to v0. The majority of the code
changes were related to correct initialisation of the metadata cache in
tests now that a feature is used.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-04-30 13:27:01 +01:00
Matthias J. Sax b0a26bc2f4
KAFKA-19173: Add `Feature` for "streams" group (#19509)
Add new StreamsGroupFeature, disabled by default,  and add "streams" as
default value to `group.coordinator.rebalance.protocols`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot
<david.jacot@gmail.com>, Lucas Brutschy <lbrutschy@confluent.io>,
Justine Olshan <jolshan@confluent.io>, Andrew Schofield
<aschofield@confluent.io>, Jun Rao <jun@confluent.io>
2025-04-29 22:51:10 -07:00
PoAn Yang 81881dee83
KAFKA-18760: Deprecate Optional<String> and return String from public Endpoint#listener (#19191)
* Deprecate org.apache.kafka.common.Endpoint#listenerName.
* Add org.apache.kafka.common.Endpoint#listener to replace
org.apache.kafka.common.Endpoint#listenerName.
* Replace org.apache.kafka.network.EndPoint with
org.apache.kafka.common.Endpoint.
* Deprecate org.apache.kafka.clients.admin.RaftVoterEndpoint#name
* Add org.apache.kafka.clients.admin.RaftVoterEndpoint#listener to
replace org.apache.kafka.clients.admin.RaftVoterEndpoint#name

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TaiJuWu
 <tjwu1217@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TengYao
 Chi <frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>, Bagda
 Parth  , Kuan-Po Tseng <brandboat@gmail.com>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-04-30 12:15:33 +08:00
Omnia Ibrahim 6f783f8536
KAFKA-10551: Add topic id support to produce request and response (#15968)
- Add support topicId in `ProduceRequest`/`ProduceResponse`. Topic name
and Topic Id will become `ignorable` following the footstep of
`FetchRequest`/`FetchResponse`
- ReplicaManager still look for `HostedPartition` using `TopicPartition`
and doesn't check topic id. This is an **[OPEN QUESTION]** if we should
address this in this pr or wait for
[KAFKA-16212](https://issues.apache.org/jira/browse/KAFKA-16212) as this
will update `ReplicaManager::getPartition` to use `TopicIdParittion`
once we update the cache. Other option is that we compare provided
`topicId` with `Partition` topic id and return `UNKNOW_TOPIC_ID` or
`UNKNOW_TOPIC_PARTITION` if we can't find partition with matched topic
id.

Reviewers: Jun Rao <jun@confluent.io>, Justine Olshan
 <jolshan@confluent.io>
2025-04-29 15:37:28 -07:00
Andrew Schofield 019459e950
MINOR: Remove unused erroneous code from test (#19585)
CI / build (push) Waiting to run Details
This PR removes a small piece of unused code which is also not correct.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-04-28 18:41:50 +01:00
David Jacot be194f5dba
MINOR: Simplify OffsetFetchRequest (#19572)
While working on https://github.com/apache/kafka/pull/19515, I came to
the conclusion that the OffsetFetchRequest is quite messy and overall
too complicated. This patch rationalize the constructors.
OffsetFetchRequest has a single constructor accepting the
OffsetFetchRequestData. This will also simplify adding the topic ids.
All the changes are mechanical, replacing data structures by others.

Reviewers: PoAn Yang <payang@apache.org>, TengYao Chi <frankvicky@apache.org>, Lianet Magran <lmagrans@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-27 18:58:30 +02:00
PoAn Yang 7293f3a90e
KAFKA-19183 Replace Pool with ConcurrentHashMap (#19535)
1. Replace `Pool.scala` with `ConcurrentHashMap`.
2. Remove `PoolTest.scala`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-27 23:19:49 +08:00
Chirag Wadhwa 2f9c2dd828
KAFKA-16718-3/n: Added the ShareGroupStatePartitionMetadata record during deletion of share group offsets (#19478)
This is a follow up PR for implementation of DeleteShareGroupOffsets
RPC. This PR adds the ShareGroupStatePartitionMetadata record to
__consumer__offsets topic to make sure the topic is removed from the
initializedTopics list. This PR also removes partitions from the request
and response schemas for DeleteShareGroupState RPC

Reviewers: Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-04-25 22:01:48 +01:00
Ken Huang b4b80731c1
KAFKA-19042 Move PlaintextConsumerFetchTest to client-integration-tests module (#19520)
Use Java to rewrite `PlaintextConsumerFetchTest` by new test infra and
move it to client-integration-tests module.

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-04-26 00:09:23 +08:00
Colin Patrick McCabe 22b89b6413
KAFKA-19192; Old bootstrap checkpoint files cause problems updated servers (#19545)
Old bootstrap.metadata files cause problems with server that include
KAFKA-18601. When the server tries to read the bootstrap.checkpoint
file, it will fail if the metadata.version is older than 3.3-IV3
(feature level 7). This causes problems when these clusters are
upgraded.

This PR makes it possible to represent older MVs in BootstrapMetadata
objects without causing an exception. An exception is thrown only if we
attempt to access the BootstrapMetadata. This ensures that only the code
path in which we start with an empty metadata log checks that the
metadata version is 7 or newer.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Ismael Juma
 <ismael@juma.me.uk>, PoAn Yang <payang@apache.org>, Liu Zeyu
 <zeyu.luke@gmail.com>, Alyssa Huang <ahuang@confluent.io>
2025-04-24 15:43:35 -04:00
Apoorv Mittal 3c05dfdf0e
KAFKA-18889: Make records in ShareFetchResponse non-nullable (#19536)
This PR marks the records as non-nullable for ShareFetch.

This PR is as per the changes for Fetch:
https://github.com/apache/kafka/pull/18726 and some work for ShareFetch
was done here: https://github.com/apache/kafka/pull/19167. I tested with
marking `records` as non-nullable in ShareFetch, which required
additional handling. The same has been fixed in current PR.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>, TengYao Chi <frankvicky@apache.org>, PoAn Yang
 <payang@apache.org>
2025-04-24 16:32:08 +01:00
Vikas Singh f4ab3a2275
MINOR: Use readable interface to parse response (#19353)
The generated response data classes take Readable as input to parse the
Response. However, the associated response objects take ByteBuffer as
input and thus convert them to Readable using `new ByteBufferAccessor`
call.

This PR changes the parse method of all the response classes to take the
Readable interface instead so that no such conversion is needed.

To support parsing the ApiVersionsResponse twice for different version
this change adds the "slice" method to the Readable interface.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Truc Nguyen
<[trnguyen@confluent.io](mailto:trnguyen@confluent.io)>, Aadithya
Chandra <[aadithya.c@gmail.com](mailto:aadithya.c@gmail.com)>
2025-04-24 11:09:06 -04:00
Andrew Schofield f0f5571dbb
MINOR: Change KIP-932 log messages from early access to preview (#19547)
Change the log messages which used to warn that KIP-932 was an Early
Access feature to say that it is now a Preview feature. This will make
the broker logs far less noisy when share groups are enabled.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-04-24 11:22:17 +01:00
Chirag Wadhwa 43350274e8
KAFKA-19156: Streamlined share group configs, with usage in ShareSessionCache (#19505)
This PR removes the group.share.max.groups config. This config was used
to calculate the maximum size of share session cache. But with the new
config group.share.max.share.sessions in place with exactly this
purpose, the ShareSessionCache initialization has also been passed the
new config.

Refer: [KAFKA-19156](https://issues.apache.org/jira/browse/KAFKA-19156)

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-23 10:52:02 +01:00
David Jacot 71d08780d1
KAFKA-14690; Add TopicId to OffsetCommit API (#19461)
This patch extends the OffsetCommit API to support topic ids. From
version 10 of the API, topic ids must be used. Originally, we wanted to
support both using topic ids and topic names from version 10 but it
turns out that it makes everything more complicated. Hence we propose to
only support topic ids from version 10. Clients which only support using
topic names can either lookup the topic ids using the Metadata API or
stay on using an earlier version.

The patch only contains the server side changes and it keeps the version
10 as unstable for now. We will mark the version as stable when the
client side changes are merged in.

Reviewers: Lianet Magrans <lmagrans@confluent.io>, PoAn Yang <payang@apache.org>
2025-04-23 08:22:09 +02:00
Chirag Wadhwa 22c5794bc3
KAFKA-19159: Removed time based evictions for share sessions (#19500)
Currently the share session cache is desgined like the fetch session
cache. If the cache is full and a new share session is trying to get get
initialized, then the sessions which haven't been touched for more than
2minutes are evicted. This wouldn't be right for share sessions as the
members also hold locks on the acquired records, and session eviction
would mean theose locks will need to be dropped and the corresponding
records re-delivered. This PR removes the time based eviction logic for
share sessions.

Refer: [KAFKA-19159](https://issues.apache.org/jira/browse/KAFKA-19159)

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-22 14:59:35 +01:00
Andrew Schofield 66147d5de7
KAFKA-19057: Stabilize KIP-932 RPCs for AK 4.1 (#19378)
This PR removes the unstable API flag for the KIP-932 RPCs.

The 4 RPCs which were exposed for the early access release in AK 4.0 are
stabilised at v1. This is because the RPCs have evolved over time and AK
4.0 clients are not compatible with AK 4.1 brokers. By stabilising at
v1, the API version checks prevent incompatible communication and
server-side exceptions when trying to parse the requests from the older
clients.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-04-22 11:43:32 +01:00
Abhinav Dixit a8f49999cc
KAFKA-19019: Add support for remote storage fetch for share groups (#19437)
This PR adds the support for remote storage fetch for share groups.

There is a limitation in remote storage fetch for consumer groups that
we can only perform remote fetch for a single topic partition in a fetch
request. Since, the logic of share fetch requests is largely based on
how consumer groups work, we are following similar logic in implementing 
remote storage fetch. However, this problem should be addressed as 
part of KAFKA-19133 which should help us perform fetch for multiple 
remote fetch topic partition in a single share fetch request.

Reviewers: Jun Rao <junrao@gmail.com>
2025-04-21 15:30:24 -07:00
Jhen-Yung Hsu a04c2fed04
KAFKA-19180 Fix the hanging testPendingTaskSize (#19526)
The check for `scheduler.pendingTaskSize()` may fail if the thread pool
is too slow to consume the runnable objects

Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
 <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-21 20:19:42 +08:00
Mickael Maison 7710d1c951
KAFKA-14487: Move LogManager static methods/fields to storage module (#19302)
Move the static fields/methods

Reviewers: Luke Chen <showuon@gmail.com>
2025-04-21 12:03:30 +02:00
Xuan-Zhang Gong a78a931ce0
KAFKA-18854 remove `DynamicConfig` inner class (#19487)
This PR is a umbrella of [KAFKA-18854.
](https://issues.apache.org/jira/browse/KAFKA-18854)

The previous PR encountered some compatibility issues, so we decided to
split it and proceed with the migration step by step.

see https://github.com/apache/kafka/pull/19019

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-04-20 23:37:18 +08:00
TaiJuWu 6e4e0df057
KAFKA-18891: Add KIP-877 support to RemoteLogMetadataManager and RemoteStorageManager (#19286)
1. Remove `RemoteLogManager#startup` and
`RemoteLogManager#onEndpointCreated`
2. Move endpoint creation to `BrokerServer`
3. Move `RemoteLogMetadataManager#configure` and
`RemoteLogStorageManager#configure` to RemoteLogManager constructor

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ken Huang
 <s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>
2025-04-18 15:04:37 +02:00
Kamal Chandraprakash 2cd733c9b3
KAFKA-17184: Fix the error thrown while accessing the RemoteIndexCache (#19462)
For segments that are uploaded to remote, RemoteIndexCache caches the
fetched offset, timestamp, and transaction index entries on the first
invocation to remote, then the subsequent invocations are accessed from
local.

The remote indexes that are cached locally gets removed on two cases:

1. Remote segments that are deleted due to breach by retention size/time
and start-offset.
2. The number of cached indexes exceed the remote-log-index-cache size
limit of 1 GB (default).

There are two layers of locks used in the RemoteIndexCache. First-layer
lock on the RemoteIndexCache and the second-layer lock on the
RemoteIndexCache#Entry.

**Issue**

1. The first-layer of lock coordinates the remote-log reader and deleter
threads. To ensure that the reader and deleter threads are not blocked
on each other, we only take `lock.readLock()` when accessing/deleting
the cached index entries.
2. The issue happens when both the reader and deleter threads took the
readLock, then the deleter thread marked the index as
`markedForCleanup`. Now, the reader thread which holds the `indexEntry`
gets an IllegalStateException when accessing it.
3. This is a concurrency issue, where we mark the entry as
`markedForCleanup` before removing it from the cache. See
RemoteIndexCache#remove, and RemoteIndexCache#removeAll methods.
4. When an entry gets evicted from cache due to breach by maxSize of 1
GB, then the cache remove that entry before calling the evictionListener
and all the operations are performed atomically by caffeine cache.

**Solution**

1. When the deleter thread marks an Entry for deletion, then we rename
the underlying index files with ".deleted" as suffix and add a job to
the remote-log-index-cleaner thread which perform the actual cleanup.
Previously, the indexes were not accessible once it was marked for
deletion. Now, we allow to access those renamed files (from entry that
is about to be removed and held by reader thread) until those relevant
files are removed from disk.
2. Similar to local-log index/segment deletion, once the files gets
renamed with ".deleted" as suffix then the actual deletion of file
happens after `file.delete.delay.ms` delay of 1 minute. The renamed
index files gets deleted after 30 seconds.
3. During this time, if the same index entry gets fetched again from
remote, then it does not have conflict with the deleted entry as the
file names are different.

Reviewers: Satish Duggana <satishd@apache.org>
2025-04-18 16:43:37 +05:30
Milly c199418cfa
MINOR: Remove the unused parameter from FetchSession update method (#19414)
Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
 <s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping
 Tsai <chia7712@gmail.com>
2025-04-18 00:17:24 +08:00
Andrew Schofield 8d66481a83
KAFKA-17897 Deprecate Admin.listConsumerGroups (#19477)
The final part of KIP-1043 is to deprecate Admin.listConsumerGroups() in
favour of Admin.listGroups() which works for all group types.

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-04-17 23:00:57 +08:00
Logan Zhu 50fb993ce0
KAFKA-19136 Move metadata-related configs from KRaftConfigs to MetadataLogConfig (#19465)
Separates metadata-related configurations from the `KRaftConfigs` into
the `MetadataLogConfig` class.

Previously, metadata-related configs were placed in `KRaftConfigs`,
which mixed server-related configs (like process.roles) with
metadata-specific ones (like metadata.log.*), leading to confusion and
tight coupling.

In this PR:
- Extract metadata-related config definitions and variables from
`KRaftConfig` into `MetadataLogConfig`.
- Move `node.id` out of `MetadataLogConfig` into `KafkaMetadataLog’s
constructor` to avoid redundant config references.
- Leave server-related configurations in `KRaftConfig`, consistent with
its role.

This separation makes `KafkaConfig` and `KRaftConfig` cleaner, and
aligns with the goal of having a dedicated MetadataLogConfig class for
managing metadata-specific configurations.

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
 <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-17 22:17:11 +08:00
Chirag Wadhwa db62c7cdff
KAFKA-19157: added group.share.max.share.sessions config (#19503)
This PR adds the config group.share.max.share.sessions to
ShareGroupConfig

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-04-17 13:17:58 +01:00
Mickael Maison c73d97de0c
KAFKA-14523: Move kafka.log.remote classes to storage (#19474)
Pretty much a straight forward move of these classes. I just updated
`RemoteLogManagerTest` to not use `KafkaConfig`

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-17 11:05:14 +02:00
TaiJuWu 23e7158665
KAFKA-19002 Rewrite ListOffsetsIntegrationTest and move it to clients-integration-test (#19460)
the following tasks should be addressed in this ticket  rewrite it by
1. new test infra
2. use java
3. move it to clients-integration-test

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-04-17 02:26:23 +08:00
TengYao Chi 3d353eb92a
MINOR: Remove unused fields in KafkaConfig (#19481)
This is a follow-up clean of #19387   Since we no longer access the log
cleaner config from `KafkaConfig`, we should remove these unused fields.

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-04-17 00:42:34 +08:00
David Jacot 6e26ec06bb
MINOR: Update GroupCoordinator interface to use AuthorizableRequestContext instead of RequestContext (#19485)
This patch updates the `GroupCoordinator` interface to use
`AuthorizableRequestContext` instead of using `RequestContext`. It makes
the interface more generic. The only downside is that the request
version in `AuthorizableRequestContext` is an `int` instead of a `short`
so we had to adapt it in a few places. We opted for using `int` directly
wherever possible.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
2025-04-16 09:12:11 -07:00
PoAn Yang 18e4608d1c
MINOR: remove unused DelayedElectLeader (#19490)
The `DelayedElectLeader` is only used in `TestReplicaManager`, but there
is no reference in it, so we can safely remove it.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-16 23:56:31 +08:00
Rajini Sivaram 4b2a3102da
KAFKA-19147: Start authorizer before group coordinator to ensure coordinator authorizes regex topics (#19488)
[KAFKA-18813](https://issues.apache.org/jira/browse/KAFKA-18813) added
`Topic:Describe` authorization of topics matching regex patterns to the
group coordinator since it was difficult to authorize these in the
broker when processing consumer heartbeats using the new protocol. But
group coordinator is started in `BrokerServer` before the authorizer is
created. And hence group coordinator doesn't have an authorizer and
never performs authorization. As a result, topics that are not
authorized for `Describe` may be assigned to consumers. This potentially
leaks information about topic existence, topic id and partition count to
users who are not authorized to describe a topic. This PR starts
authorizer earlier to ensure that authorization is performed by the
group coordinator. Also adds integration tests for verification.

Note that we still have a second issue when members have different
permissions. If regex is resolved by a member with permission to more
topics, unauthorized topics may be assigned to members with lower
permissions. In this case, we still return assignment containing topic
id and partitions to the member without `Topic:Describe` access. This is
not addressed by this PR, but an integration test that illustrates the
issue has been added so that we can verify when the issue is fixed.

Reviewers: David Jacot <david.jacot@gmail.com>
2025-04-16 12:57:44 +01:00
Ken Huang ae608c1cb2
KAFKA-19042 Move PlaintextConsumerCallbackTest to client-integration-tests module (#19298)
Use Java to rewrite `PlaintextConsumerCallbackTest` by new test infra
and move it to client-integration-tests module.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-04-16 11:57:14 +08:00
TengYao Chi 73afcc9b69
KAFKA-13610: Deprecate log.cleaner.enable configuration (#19472)
JIRA: KAFKA-13610  This patch deprecates the `log.cleaner.enable`
configuration. It's part of
[KIP-1148](https://cwiki.apache.org/confluence/x/XAyWF).

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, PoAn Yang
 <payang@apache.org>, Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>
2025-04-16 10:27:44 +08:00
Mickael Maison fb2ce76b49
KAFKA-18888: Add KIP-877 support to Authorizer (#19050)
This also adds metrics to StandardAuthorizer

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ken Huang
 <s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TaiJuWu
 <tjwu1217@gmail.com>
2025-04-15 19:40:24 +02:00
Ritika Reddy 598eb13d07
KAFKA-15370: ACL changes to support 2PC (KIP-939) (#19364)
This patch adds ACL support for 2PC as a part of KIP-939

A new value will be added to the enum AclOperation: TWO_PHASE_COMMIT
((byte) 15 .  When InitProducerId comes with enable2Pc=true, it would
have to have both WRITE and TWO_PHASE_COMMIT operation enabled on the
transactional id resource.

The kafka-acls.sh tool is going to support a new --operation
TwoPhaseCommit.

Reviewers: Artem Livshits <alivshits@confluent.io>, PoAn Yang
 <poan.yang@suse.com>, Justine Olshan <jolshan@confluent.io>
2025-04-15 08:39:46 -07:00
Xuan-Zhang Gong c527530e80
KAFKA-19042 Move ProducerCompressionTest, ProducerFailureHandlingTest, and ProducerIdExpirationTest to client-integration-tests module (#19319)
include three test case
- ProducerCompressionTest
- ProducerFailureHandlingTest
- ProducerIdExpirationTest

Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-15 16:34:47 +08:00
Mickael Maison 321a380d0a
KAFKA-14523: Decouple RemoteLogManager and Partition (#19391)
Remove the last dependency in the core module.

Reviewers: Luke Chen <showuon@gmail.com>, PoAn Yang <poan.yang@suse.com>
2025-04-15 09:56:27 +02:00
Mickael Maison d183cf9ac1
KAFKA-18172 Move RemoteIndexCacheTest to the storage module (#19469)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-15 15:53:41 +08:00
Mickael Maison 5f2a68b150
KAFKA-19119 Move ApiVersionManager/SimpleApiVersionManager to server (#19426)
Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
 <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-15 14:32:44 +08:00
Milly 80b209d2a0
MINOR: remove unused parameter from KafkaMetadataLog (#19458)
1. Remove unused parameter from KafkaMetadataLog.
2. Give Utils.closeQuietly a meaningful name when closing reader.


Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang
 <s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping
 Tsai <chia7712@gmail.com>

---------

Co-authored-by: TengYao Chi <kitingiao@gmail.com>
2025-04-15 10:26:55 +08:00
PoAn Yang b18f00b449
KAFKA-19121 Move AddPartitionsToTxnConfig and TransactionStateManagerConfig out of KafkaConfig (#19439)
Both AddPartitionsToTxnConfig and TransactionStateManagerConfig are
static configs and they don't have specific config check. We can move
them out of KafkaConfig to simplify KafkaConfig.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-15 01:16:30 +08:00
PoAn Yang 8827ce4701
KAFKA-19113: Migrate DelegationTokenManager to server module (#19424)
1. Migrate DelegationTokenManager to server module.
2. Rewrite DelegationTokenManager in Java.
3. Move DelegationTokenManagerConfigs out of KafkaConfig.

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2025-04-14 16:49:45 +02:00
Azhar Ahmed 4cdd4b617c
KAFKA-19071: Fix doc for remote.storage.enable (#19345)
As of 3.9, Kafka allows disabling remote storage on a topic after it was
enabled. It allows subsequent enabling and disabling too.

However the documentation says otherwise and needs to be corrected.

Doc:
https://kafka.apache.org/39/documentation/#topicconfigs_remote.storage.enable

Reviewers: Luke Chen <showuon@gmail.com>, PoAn Yang <payang@apache.org>, Ken Huang <s7133700@gmail.com>
2025-04-14 11:08:49 +08:00
Parker Chang c8fe551139
KAFKA-19030 Remove metricNamePrefix from RequestChannel (#19374)
As described in the JIRA ticket, `controlPlaneRequestChannelOpt` was
removed from KRaft mode, so there's no need to use the metrics prefix
anymore.

This change removes `metricNamePrefix` from RequestChannel and the
related files.

It also removes `DataPlaneAcceptor#MetricPrefix`, since
`DataPlaneAcceptor`  is the only implementation of `Acceptor`.

Since the implementation of KIP-291 is essentially removed, we can also
remove `logAndThreadNamePrefix` and `DataPlaneAcceptor#ThreadPrefix`.

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-12 23:22:40 +08:00
Dmitry Werner 7863b35064
KAFKA-14485: Move LogCleaner to storage module (#19387)
Move LogCleaner and related classes to storage module and rewrite in
Java.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Jun Rao <junrao@gmail.com>
2025-04-11 09:21:05 -07:00
Andrew Schofield 21a080f08c
KAFKA-16894: Define feature to enable share groups (#19293)
This PR proposes a switch to enable share groups for 4.1 (preview) and
4.2 (GA).

* `share.version=1` to indicate that share groups are enabled. This is
used as the switch for turning share groups on and off.

In 4.1, the default will be `share.version=0`. Then a user wanting to
evaluate the preview of KIP-932 would use `bin/kafka-features.sh
--bootstrap.server xxxx upgrade --feature share.version=1`.

In 4.2, the default will be `share.version=1`.

Reviewers: Jun Rao <junrao@gmail.com>
2025-04-11 12:14:38 +01:00
PoAn Yang 34a87d3477
KAFKA-19042 Move TransactionsWithMaxInFlightOneTest to client-integration-tests module (#19289)
Use Java to rewrite `TransactionsWithMaxInFlightOneTest` by new test
infra and move it to client-integration-tests module.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-11 12:04:19 +08:00
Ken Huang 588d107ec2
KAFKA-19101 Remove ControllerMutationQuotaManager#throttleTimeMs unused parameter (#19410)
It seems `timeMs` this parameter never used in Kafka project, the method
init commit is

b5f90daf13

Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, PoAn Yang
<payang@apache.org>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-04-11 11:31:08 +08:00
Sushant Mahajan c3b7aa6e64
KAFKA-18170: Add create and write timestamp fields in share snapshot [1/N] (#19432)
* We wish to track the time of creation of the `ShareSnapshot` records
so that automated jobs could force their creation if a share partition
has gone cold (no updates for a specified time interval).
* To accomplish this, we have added 2 new fields `CreateTimestamp` and
`WriteTimestamp` in the `ShareSnapshot` record.
* The former tracks snapshot creation due to regular RPC calls while the
latter will track snapshots created by periodic jobs.
* In this PR we have made the requisite changes.
* This is a first of a series of PRs to create the automated jobs and
associated scaffolding.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-04-10 15:56:58 +01:00
TengYao Chi b649b1ed5d
KAFKA-18935: Ensure brokers do not return null records in FetchResponse (#19167)
JIRA: KAFKA-18935  This patch ensures the broker will not return null
records in FetchResponse.   For more details, please refer to the
ticket.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai
 <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>
2025-04-10 22:21:00 +08:00
Lucas Brutschy 6430fb5d45
MINOR: Add note that streams groups are in early access (#19434)
Add a note to the group protocol configuration that streams groups are
in early access and should not be used in production.

Also update an outdated comment related to disabling the protocol.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2025-04-10 13:46:31 +02:00
Abhinav Dixit 699ae1b75b
KAFKA-16729: Support isolation level for share consumer (#19261)
This PR adds the share group dynamic config `share.isolation.level`.
Until now, share groups only supported `READ_UNCOMMITTED` isolation
level type. With this PR, we aim to support `READ_COMMITTED` isolation
type to share groups.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-04-10 09:00:03 +01:00
PoAn Yang 56591d2d07
KAFKA-19090: Move DelayedFuture and DelayedFuturePurgatory to server module (#19390)
Rewrite these classes in Java and move them to the server module

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2025-04-09 11:52:56 +02:00
Chirag Wadhwa 5148174196
KAFKA-16718-2/n: KafkaAdminClient and GroupCoordinator implementation for DeleteShareGroupOffsets RPC (#18976)
This PR contains the implementation of KafkaAdminClient and
GroupCoordinator for DeleteShareGroupOffsets RPC.

- Added `deleteShareGroupOffsets` to `KafkaAdminClient`
- Added implementation for `handleDeleteShareGroupOffsetsRequest` in
`KafkaApis.scala`
- Added `deleteShareGroupOffsets` to `GroupCoordinator` as well.
internally this makes use of `persister.deleteState` to persist the
changes in share coordinator

Reviewers: Andrew Schofield <aschofield@confluent.io>, Sushant Mahajan <smahajan@confluent.io>
2025-04-09 07:31:06 +01:00
Nick Guo 43e22ef5d6
KAFKA-19093 Change the "Handler on Broker" to "Handler on Controller" for controller server (#19384)
> INFO [data-plane Kafka Request Handler on Broker 3000], Resizing
request handler thread pool size from 8 to 10
(kafka.server.KafkaRequestHandlerPool)

it should be "Controller" rather than "Broker"

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-04-09 14:09:36 +08:00
Ken Huang 2f086d188f
KAFKA-18892: Add KIP-877 support for ClientQuotaCallback (#19068)
Allow ClientQuotaCallback to implement Monitorable and register metrics.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, TaiJuWu 
<tjwu1217@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>
2025-04-08 16:58:29 +02:00
Xuan-Zhang Gong 375ed19fba
KAFKA-19100: Use ProcessRole instead of String in AclApis (#19406)
Use the ProcessRole enum instead of hardcoding the role

Reviewers: Mickael Maison <mickael.maison@gmail.com>, PoAn Yang
<poan.yang@suse.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang
<s7133700@gmail.com>
2025-04-08 11:09:55 +02:00
Nick Guo fcf6da0a0d
KAFKA-19098 Remove `lastOffset` from PartitionResponse (#19398)
The `lastOffset` is not used actually, so it can be removed.

Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-08 00:06:02 +08:00
Stanislav Kozlovski b8c095074d
MINOR: Rename RemoteLogStorageManager variable to RemoteStorageManager (#19401)
This patch renames the KIP-405 Plugin variable from
`remoteLogStorageManager` to `remoteStorageManager`. After [writing
about

it](https://aiven.io/blog/apache-kafka-tiered-storage-in-depth-how-writes-and-metadata-flow),
I realized I got swayed by the code and called the component incorrectly
- the official name doesn't have `Log` in it. I thought i'd go ahead and
change the code so it's consistent with the naming too

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-08 00:02:38 +08:00
Sanskar Jhajharia 2ae4ffb5e0
MINOR: Cleanup Core Module (#19372)
Now that Kafka Brokers support Java 17, this PR makes some changes in
core module. The changes in this PR are limited to only the Java files
in the Core module. Scala related changes may follow next.
The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()
- Some changes to use enhanced switch statement.

Reviewers: Andrew Schofield <aschofield@confluent.io>, PoAn Yang <payang@apache.org>, Ken Huang <s7133700@gmail.com>
2025-04-07 16:57:52 +01:00
Hong-Yi Chen 6dd2cc70c3
MINOR: Clean up comments and remove unused code in RecordVersion and CreateTopicsRequestTest (#19342)
## Summary

This PR updates the `RecordVersion` javadoc for clarity. It removes
outdated references to `message.format.version` mentioned in the [Kafka
4.0 upgrade
documentation](48f06981ee/40/upgrade.html (L135))
and aligns with feedback from a previous discussion in [#19325
](https://github.com/apache/kafka/pull/19325).

## Changes
- Cleaned up javadoc in `RecordVersion`
- Removed outdated or deprecated references

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-07 07:47:06 +08:00
TengYao Chi 6d68f8a82c
MINOR: Move BrokerReconfigurable to the sever-common module (#19383)
This patch moves `BrokerReconfigurable` to the `server-common module`
and decouples the `TransactionLogConfig` and `KafkaConfig` to unblock
KAFKA-14485.

Reviewers: PoAn Yang <payang@apache.org>, TaiJuWu <tjwu1217@gmail.com>,
Chia-Ping Tsai <chia7712@gmail.com>
2025-04-07 07:39:01 +08:00
Gaurav Narula 3f0e14a3e8
MINOR: rename metric variable name in Processor#accept (#19361)
`Processor#accept` accepts a metric which tracks the amount of time for
which the Acceptor thread was blocked. It's misleading to name it
`acceptorIdlePercentMeter` and this change updates its naming to align
with the call site.

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-04-05 22:54:21 +08:00
PoAn Yang 3d96b20630
KAFKA-19042 Move TransactionsExpirationTest to client-integration-tests module (#19288)
Use Java to rewrite `TransactionsExpirationTest` by new test infra and
move it to client-integration-tests module.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-05 20:01:31 +08:00
Mickael Maison 08a93fe12a
KAFKA-14523: Move DelayedRemoteListOffsets to the storage module (#19285)
Decouple RemoteLogManager and ReplicaManager.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-05 19:51:13 +08:00
Abhinav Dixit 30c511d640
KAFKA-19085: SharePartitionManagerTest testMultipleConcurrentShareFetches throws silent exception and works incorrectly (#19370)
The test `testMultipleConcurrentShareFetches` is throwing a silent
exception.
`ERROR Error processing delayed share fetch request
(kafka.server.share.DelayedShareFetch:225)`

This is due to incomplete mocks setup for the test and also requires
changes in timeout.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-04-04 20:20:47 +01:00
Andrew Schofield d4d9f11816
KAFKA-18761: [2/N] List share group offsets with state and auth (#19328)
This PR approaches completion of Admin.listShareGroupOffsets() and
kafka-share-groups.sh --describe --offsets.

Prior to this patch, kafka-share-groups.sh was only able to describe the
offsets for partitions which were assigned to active members. Now, the
Admin.listShareGroupOffsets() uses the persister's knowledge of the
share-partitions which have initialised state. Then, it uses this list
to obtain a complete set of offset information.

The PR also implements the topic-based authorisation checking. If
Admin.listShareGroupOffsets() is called with a list of topic-partitions
specified, the authz checking is performed on the supplied list,
returning errors for any topics to which the client is not authorised.
If Admin.listShareGroupOffsets() is called without a list of
topic-partitions specified, the list of topics is discovered from the
persister as described above, and then the response is filtered down to
only show the topics to which the client is authorised. This is
consistent with other similar RPCs in the Kafka protocol, such as
OffsetFetch.

Reviewers: David Arthur <mumrah@gmail.com>, Sushant Mahajan <smahajan@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-04-04 13:25:19 +01:00
Abhinav Dixit 98c0f3024d
MINOR: Added trace logs to help debug SharePartition (#19358)
Added `trace` logs to help debug `nextFetchOffset` functionality within
SharePartition. We did not have a way to figure out the fetch offsets of
a share partition through logs. Forward moving `fetchOffset` confirms
that the consumption from a given share partition is happening correctly
on the broker.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-04-04 09:52:17 +01:00
TaiJuWu f1bb29b93a
MINOR: migrate BrokerCompressionTest to storage module (#19277)
There are two change for this PR.

1. Move `BrokerCompressionTest ` from core to storage
2. Rewrite `BrokerCompressionTest ` from scala to java

Reviewers: TengYao Chi <kitingiao@gmail.com>, PoAn Yang
<payang@apache.org>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-04-03 22:43:42 +08:00
Bruno Cadonna 6ef42d1524
MINOR: Deduplicate topics of a topology for authorization check (#19352)
With the new Streams rebalance protocol, the Streams client sends the
topology with the used topics to the broker for initialization. For the
initialization the broker needs to describe the topics in the topology
and consequently the Streams application needs to be authorized to
describe the topics.

The broker checks the authorization by filtering the topics in the
topology by authorization. This filtering implicitly deduplicates the
topics of the topology if they appear multiple times in the topology
send to the brokers. After that the broker compares the size of the
authorized topics with the topics in the topology. If the authorized
topics are less than the topics in the topology a
TOPIC_AUTHORIZATION_FAILED error is returned.

In Streams a topology that is sent to the brokers likely has duplicate
topics because a repartition topic appears as a sink for one subtopology
and as a source for another subtopology.

This commit deduplicates the topics of a topology before the
verification of the authorization.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-04-03 13:39:55 +02:00
PoAn Yang be80e3cb8a
KAFKA-18923: resource leak in RSM fetchIndex inputStream (#19111)
Fix resource leak in RSM inputStream.

Reviewers: Luke Chen <showuon@gmail.com>
2025-04-03 15:18:05 +08:00
PoAn Yang 5c01fd0b76
KAFKA-18949 add consumer protocol to testDeleteRecordsAfterCorruptRecords (#19317)
The `PlaintextAdminIntegrationTest#testDeleteRecordsAfterCorruptRecords`
was only enabled for classic protocol. Add consumer protocol to it.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-03 13:24:25 +08:00
Xuan-Zhang Gong 2994e5eff3
KAFKA-19004 Move DelayedDeleteRecords to server-common module (#19226)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-03 00:06:27 +08:00
Ismael Juma ccf2510fdd
MINOR: Remove dead code `maybeWarnIfOversizedRecords` (#19316)
The `metadataVersionSupplier` is unused after this - remove it.

Also remove redundant `metadataVersion.fetchRequestVersion >= 13` check
in `RemoteLeaderEndPoint` - the minimum version returned by this method
is `13`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-01 06:36:25 -07:00
Chirag Wadhwa 0c97338959
KAFKA-18796-2: Corrected the check for acquisition lock timeout in Sh… (#19338)
Minor PR to correct the check for the presence of acquisition lock in
`assertionFailedMessage` method in `SharePartitionTest`

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-04-01 13:49:47 +01:00
Apoorv Mittal 4aa81204ff
KAFKA-19018,KAFKA-19063: Implement maxRecords and acquisition lock timeout in share fetch request and response resp. (#19334)
PR add `MaxRecords` to share fetch request and also adds
`AcquisitionLockTimeout` to share fetch response. PR also removes
internal broker config of `max.fetch.records`.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-04-01 12:23:06 +01:00
David Jacot d038f44848
MINOR: Small cleanups in ReplicaManager (#19322)
This is a small follow-up of https://github.com/apache/kafka/pull/19290.
The `actionQueue` argument is only used by the
`CoordinatorPartitionWriter` so we can remove it from the other methods
now.

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-03-31 07:39:40 -07:00
TengYao Chi 20546930ae
KAFKA-19042 Move ConsumerTopicCreationTest to client-integration-tests module (#19283)
This patch moves `ConsumerTopicCreationTest` to the
`client-integration-tests` and rewrite it as Java.
The patch also streamlines the test flow. 
In the Scala version, there is a producer that produces messages, but
this is not the main purpose of the `ConsumerTopicCreationTest`.

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-03-31 20:15:54 +08:00
Dmitry Werner 4144290335
MINOR: Cleanup metadata module (#18937)
Removed unused code and fixed IDEA warnings.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-31 17:46:21 +08:00
PoAn Yang 4a5ae144ea
KAFKA-19032 Remove TestInfoUtils.TestWithParameterizedQuorumAndGroupProtocolNames (#19270)
The zookeeper mode was removed in 4.0. The test cases don't need to
specify quorum. Following variable and functions can be replaced:
- TestWithParameterizedQuorumAndGroupProtocolNames
- getTestQuorumAndGroupProtocolParametersClassicGroupProtocolOnly
- getTestQuorumAndGroupProtocolParametersConsumerGroupProtocolOnly
- getTestQuorumAndGroupProtocolParametersAll

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-30 02:11:07 +08:00
PoAn Yang c125cc7dd1
KAFKA-19036 Rewrite LogAppendTimeTest and move it to storage module (#19282)
Use Java to rewrite `LogAppendTimeTest` by new test infra and move it to
storage module.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-29 03:14:53 +08:00
Nick Guo 9292a22606
KAFKA-19049 Remove the `@ExtendWith(ClusterTestExtensions.class)` from code base (#19299)
jira: https://issues.apache.org/jira/browse/KAFKA-19049

[KAFKA-18617](https://issues.apache.org/jira/browse/KAFKA-18617)
introduced the mechanism to inject the cluster test at runtime, so the
integration tests don't need to use
`@ExtendWith(ClusterTestExtensions.class)` any more.

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-03-29 02:15:16 +08:00
David Jacot 28de78bcba
MINOR: Refactor GroupCoordinator write path (#19290)
This patch addresses a weirdness on the GroupCoordinator write path. The
`CoordinatorPartitionWriter` uses the `ReplicaManager#appendRecords`
method with `acks=1` and it expects it to completes
immediately/synchronously. It works because this is effectively what the
method does with `acks=1`. The issue is that fundamentally the method is
asynchronous so the contract is really fragile. This patch changes it by
introducing new method `ReplicaManager.appendRecordsToLeader`, which is
synchronous. It also refactors `ReplicaManager#appendRecords` to use
`ReplicaManager.appendRecordsToLeader` so we can benefits from all the
existing tests.

Reviewers: Fred Zheng <fzheng@confluent.io>, Jeff Kim <jeff.kim@confluent.io>
2025-03-27 08:58:47 -07:00
Lucas Brutschy 2267902b40
MINOR: Mark streams RPCs as unstable (#19292)
Streams groups RPCs are not enabled by default, but they should also be
marked as unstable.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2025-03-27 14:22:01 +01:00
David Jacot 9e42b76147
MINOR: Some cleanups in group coordinator's intergration tests (#19281)
This patch applies a few cleanups to uniformize how group coordinator's
integration tests are setup.

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-27 06:06:36 -07:00
Dmitry Werner 84b8fec089
KAFKA-14486 Move LogCleanerManager to storage module (#19216)
Move LogCleanerManager and related classes to storage module and rewrite
in Java.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Jun Rao
<junrao@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
2025-03-27 12:35:38 +08:00
Sushant Mahajan eb88e78373
KAFKA-18827: Initialize share group state group coordinator impl. [3/N] (#19026)
* This PR adds impl for the initialize share groups call from the Group
Coordinator perspective.
* The initialize call on persister instance will be invoked by the
`GroupCoordinatorService`, based on the response of the
`GroupCoordinatorShard.shareGroupHeartbeat`. If there is new topic
subscription or member assignment change (topic paritions incremented),
the delta share partitions corresponding to the share group in question
are returned as an optional initialize request.
* The request is then sent to the share coordinator as an encapsulated
timer task because we want the heartbeat response to go asynchronously.
* Tests have been added for `GroupCoordinatorService` and
`GroupMetadataManager`. Existing tests have also been updated.
* A new formatter `ShareGroupStatePartitionMetadataFormatter` has been
added for debugging.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-03-26 19:40:23 +00:00
Vikas Singh 56d1dc1b6e
MINOR: Use readable interface to parse requests (#19163)
The generated request data type's constructors take Readable as an input. However, the parse method in the
AbstractRequest takes a ByteBuffer as input. So to create the corresponding request data objects, each individual
concrete Request classes wraps the ByteBuffer into a ByteBufferAccessor.

This is boilerplate code present in all the concrete request classes. This changes AbstractRequest's parse method so that subclasses can simply pass the `Readable` they get directly to request data classes.

The same change is made to the serialize method to maintain symmetry.

Reviewers: Ismael Juma <ismael@juma.me.uk>, José Armando García Sancio
<jsancio@apache.org>, Artem Livshits <alivshits@confluent.io>,
Truc Nguyen <trnguyen@confluent.io>
2025-03-26 10:13:13 -04:00
ClarkChen 1547204baa
KAFKA-18914 Migrate ConsumerRebootstrapTest to use new test infra (#19154)
Migrate ConsumerRebootstrapTest to the new test infra and remove the old
Scala test.

The PR changed three things.
* Migrated `ConsumerRebootstrapTest` to new test infra and removed the
old Scala test.
* Updated the original test case to cover rebootstrap scenarios.
* Integrated `ConsumerRebootstrapTest` into `ClientRebootstrapTest` in
the `client-integration-tests` module.
* Removed the `RebootstrapTest.scala`.

Default `ConsumerRebootstrap` config:
> properties.put(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG,
"rebootstrap");

properties.put(CommonClientConfigs.METADATA_RECOVERY_REBOOTSTRAP_TRIGGER_MS_CONFIG,
"300000");

properties.put(CommonClientConfigs.SOCKET_CONNECTION_SETUP_TIMEOUT_MS_CONFIG,
"10000");

properties.put(CommonClientConfigs.SOCKET_CONNECTION_SETUP_TIMEOUT_MAX_MS_CONFIG,
"30000");
properties.put(CommonClientConfigs.RECONNECT_BACKOFF_MS_CONFIG, "50L");
properties.put(CommonClientConfigs.RECONNECT_BACKOFF_MAX_MS_CONFIG,
"1000L");

The test case for the consumer with enabled rebootstrap
![Screenshot 2025-03-22 at 9 48
13 PM](https://github.com/user-attachments/assets/8470549f-a24c-43fa-ae44-789cbf422a63)


The test case for the consumer with disabled rebootstrap
![Screenshot 2025-03-22 at 9 47
22 PM](https://github.com/user-attachments/assets/0a183464-6a74-449f-8e71-d641a6ea5bb1)

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-26 01:53:42 +08:00
TengYao Chi 80d99ea2ba
KAFKA-18991: FetcherThread should match leader epochs between fetch request and fetch state (#19223)
This PR fixes a potential issue where the `FetchResponse` returns
`divergingEndOffsets` with an older leader epoch. This can lead to
committed records being removed from the follower's log, potentially
causing data loss.

In detail:
`processFetchRequest` gets the requested leader epoch of partition data
by `topicPartition` and compares it with the leader epoch of the current
fetch state. If they don't match, the response is ignored.

Reviewers: Jun Rao <junrao@gmail.com>
2025-03-25 09:14:01 -07:00
David Jacot 9db5888609
MINOR: FindCoordinator API does not lookup partition for share partition key correctly (#19273)
This patch fixes another bug in the FindCoordinator API handling for
share partition key. `shareCoordinator.foreach` returns `Unit` so
`shareCoordinator.foreach(coordinator =>
coordinator.partitionFor(SharePartitionKey.getInstance(key)))` does not
return the partition for the key.

Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-24 19:43:23 +01:00
TengYao Chi 20bad6efb3
KAFKA-18576 Convert ConfigType to Enum (#18711)
JIRA: KAFKA-18576
After removing ZooKeeper, we no longer need to exclude `client_metrics`
and `group` from `ConfigType#ALL`.

Since it's a common pattern to provide a mechanism to know all values in
enumeration ( Java enum provides ootb), we should convert ConfigType to
enum.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-25 01:10:59 +08:00
David Jacot 95ef344940
MINOR: FindCoordinator API should return INVALID_REQUEST when share partition key is invalid (#19272)
At the moment, the FindCoordinator API returns an `UNKNOWN_SERVER_ERROR`
error when the share partition key is invalid. It seems that the aim was
to return an `INVALID_REQUEST` error but the code has a small bug
preventing it from working as expected.

Reviewers: Apoorv Mittal <amittal@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-03-24 08:29:20 -07:00
Chirag Wadhwa b5f5265864
KAFKA-18796: Added more information to error message when assertion fails for acquisition lock timeout (#19247)
This PR adds extra information in assertion failed messages for tests in
SharePartitionTest revolving around acquisition lock timeouts
functionality.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-03-24 16:14:29 +05:30
ClarkChen fef9aebb19
KAFKA-18276 Migrate ProducerRebootstrapTest to new test infra (#19046)
The PR changed three things.
* Migrated `ProducerRebootstrapTest` to new test infra and removed the
old Scala test.
* Updated the original test case to cover rebootstrap scenarios.
* Integrated `ProducerRebootstrapTest` into `ClientRebootstrapTest` in
the `client-integration-tests` module.

Default `ProducerRebootstrap` config:
> properties.put(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG,
"rebootstrap");

properties.put(CommonClientConfigs.METADATA_RECOVERY_REBOOTSTRAP_TRIGGER_MS_CONFIG,
"300000");

properties.put(CommonClientConfigs.SOCKET_CONNECTION_SETUP_TIMEOUT_MS_CONFIG,
"10000");

properties.put(CommonClientConfigs.SOCKET_CONNECTION_SETUP_TIMEOUT_MAX_MS_CONFIG,
"30000");
properties.put(CommonClientConfigs.RECONNECT_BACKOFF_MS_CONFIG, "50L");
properties.put(CommonClientConfigs.RECONNECT_BACKOFF_MAX_MS_CONFIG,
"1000L");
       
The test case for the producer with enabled rebootstrap
<img width="1549" alt="Screenshot 2025-03-17 at 10 46 03 PM"
src="https://github.com/user-attachments/assets/547840a6-d79d-4db4-98c0-9b05ed04cf60"
/>

The test case for the producer with disabled rebootstrap
<img width="1552" alt="Screenshot 2025-03-17 at 10 46 47 PM"
src="https://github.com/user-attachments/assets/2248e809-d9d5-4f3b-a24f-ba1aa0fef728"
/>

Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-24 01:09:17 +08:00
PoAn Yang d497250c22
KAFKA-18999 Remove BrokerMetadata (#19227)
* Replace `BrokerMetadata` with `UsableBroker` in KRaftMetadataCache and
ReassignPartitionsCommand.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-22 19:30:28 +08:00
David Jacot ca20e9cd92
KAFKA-18329; [3/3] Delete old group coordinator (KIP-848) (#19255)
This patch is the third of a series of patches to remove the old group
coordinator. With the release of Apache Kafka 4.0, the so-called new
group coordinator is the default and only option available now.

It removes the old group coordinator and cleans up the
`GroupCoordinator` interface.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-21 08:07:42 -07:00
TaiJuWu 79fe1305b6
KAFKA-18893: Add KIP-877 support to ReplicaSelector (#19064)
ReplicaSelector implementations can implement Monitorable to register their own metrics.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ken Huang <s7133700@gmail.com>
2025-03-21 15:39:50 +01:00
David Jacot 0c5e5c5d2d
KAFKA-18329; [2/3] Delete old group coordinator (KIP-848) (#19251)
This patch is the second of a series of patches to remove the old group
coordinator. With the release of Apache Kafka 4.0, the so-called new
group coordinator is the default and only option available now.

The patch removes `group.coordinator.new.enable` (internal config) and
all its usages (integration tests, unit tests, etc.). It also cleans up
`KafkaApis` to remove logic only used by the old group coordinator.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-21 05:54:41 -07:00
Mickael Maison 121ec2a662
KAFKA-15599 Move MetadataLogConfig to raft module (#19246)
Rewrite the class in Java and move it to the raft module.

Reviewers: PoAn Yang <payang@apache.org>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-21 13:44:20 +08:00
Jorge Esteban Quilcate Otoya f24945b519
KAFKA-15931: Cancel RemoteLogReader gracefully (#19197)
Reverts commit
2723dbf3a0
and
269e8892ad.

Instead of reopening the transaction index, it cancels the
RemoteFetchTask without interrupting it--avoiding to close the
TransactionIndex channel.

This will lead to complete the execution of the remote fetch but ignoring
the results. Given that this is considered a rare case, we could live
with this. If it becomes a performance issue, it could be optimized.

Reviewers: Jun Rao <junrao@gmail.com>
2025-03-20 10:20:44 -07:00
TengYao Chi b83a23a4f9
KAFKA-18946 Move BrokerReconfigurable and DynamicProducerStateManagerConfig to server module (#19174)
This patch is to move `DynamicProducerStateManagerConfig` and `BrokerReconfigurable` to the server module.

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-20 21:30:19 +08:00
Lan Ding e73719d962
KAFKA-18819 StreamsGroupHeartbeat API and StreamsGroupDescribe API check topic describe (#19183)
This patch filters out the topic describe unauthorized topics from the
StreamsGroupHeartbeat and StreamsGroupDescribe response.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-03-19 20:42:05 +01:00
PoAn Yang fcca4056fd
KAFKA-18975 Move clients-integration-test out of core module (#19217)
Move following tests from core to clients-integration-test module.

- ClientTelemetryTest
- DeleteTopicTest
- DescribeAuthorizedOperationsTest
- ConsumerIntegrationTest
- CustomQuotaCallbackTest
- RackAwareAutoTopicCreationTest

Move following tests from core to server module.

- BootstrapControllersIntegrationTest
- LogManagerIntegrationTest

Reviewers: Kirk True <kirk@kirktrue.pro>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-20 02:43:19 +08:00
Ritika Reddy 3a3159b01e
KAFKA-18953: [1/N] Add broker side handling for 2 PC (KIP-939) (#19193)
This patch adds logic to enable and handle two phase commit (2PC)
transactions following KIP-939.
The changes made are as follows:
1) Add a new broker config called
**transaction.two.phase.commit.enable** which is set to false by default
2) Add new flags **enableTwoPCFlag** and **keepPreparedTxn** to
handleInitProducerId
3) Return an error if keepPreparedTxn is set to true (for now)

Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan
<jolshan@confluent.io>
2025-03-19 09:22:00 -07:00
Kevin Wu a5325e029e
KAFKA-17431: Support invalid static configs for KRaft so long as dynamic configs are valid (#18949)
During broker startup, attempt to read dynamic configurations from latest local snapshot on disk. This will avoid most situations where the static configuration is not sufficient to start up, but the dynamic configuration would have been. The PR includes an integration test.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-03-18 14:23:23 -07:00
Ken Huang b805877705
KAFKA-18969 Rewrite ShareConsumerTest#setup and move to clients-integration-tests module (#19202)
Move share consumer to clients-integration-tests module and use `@BeforeEach` to setup

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-18 14:47:38 +08:00
Nick Guo e9ffe0ba7c
KAFKA-18808 add test to ensure the name=<default> is not equal to default quota (#18966)
see discussion in
[KAFKA-18735](https://issues.apache.org/jira/browse/KAFKA-18735) - the
test should include following check.

1. Using name=<default> does not create default quota
2. the returned entity should have name=<default>
2. the filter `ClientQuotaFilterComponent.ofDefaultEntity` should return
nothing

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-18 01:57:24 +08:00
TengYao Chi a6a0ea56d8
KAFKA-17171 Add test cases for `STATIC_BROKER_CONFIG`in kraft mode (#18463)
Given that the `core` module will be separated into other small modules,
this test will not be added to the core module.
Instead, I added it to the `clients-integration-tests` module since it
focuses on the admin client test. The patch should include following test cases.

1. a topic-related static config is added to quorum controller. The
configs from topic creation should include it, but `describeConfigs`
does not.

2. a topic-related static config is added to quorum controller. The
configs from topic creation should include it, and `describeConfigs`
does if admin is using controller.bootstrap

3. a topic-related static config is added to broker. The configs from
topic creation should NOT include it, but `describeConfigs` does.

4. a topic-related static config is added to broker. The configs from
topic creation should NOT include it, and `describeConfigs` does not
also if admin is using controller.bootstrap

for another, the docs of `STATIC_BROKER_CONFIG` should remind the impact of "controller.properties" BTW, those test cases should leverage new test infra, since new test infra allow us to define configs to broker/controller individually.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-18 00:30:53 +08:00
PoAn Yang da46cf6e79
KAFKA-17565 Move MetadataCache interface to metadata module (#18801)
### Changes

* Move MetadataCache interface to metadata module and change Scala
function to Java.
* Remove functions `getTopicPartitions`, `getAliveBrokers`,
`topicNamesToIds`, `topicIdInfo`, and `getClusterMetadata` from
MetadataCache interface, because these functions are only used in test
code.

### Performance

* ReplicaFetcherThreadBenchmark
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark
  ```
  * trunk
  ```
Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 2 4775.490 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 2 25730.790 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 2 55334.206 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 2 488427.547 ns/op
  ```
  * branch
  ```
Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 2 4825.219 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 2 25985.662 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 2 56056.005 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 2 497138.573 ns/op
  ```

* KRaftMetadataRequestBenchmark
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.metadata.KRaftMetadataRequestBenchmark
  ```
  * trunk
  ```
Benchmark (partitionCount) (topicCount) Mode Cnt Score Error Units
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 500
avgt 2 884933.558 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 1000
avgt 2 1910054.621 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 5000
avgt 2 21778869.337 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 500
avgt 2 1537550.670 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 1000
avgt 2 3168237.805 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 5000
avgt 2 29699652.466 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 500
avgt 2 3501483.852 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 1000
avgt 2 7405481.182 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 5000
avgt 2 55839670.124 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 500 avgt 2 333.667
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 1000 avgt 2 339.685
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 5000 avgt 2 334.293
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 500 avgt 2 329.899
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 1000 avgt 2 347.537
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 5000 avgt 2 332.781
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 500 avgt 2 327.085
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 1000 avgt 2 325.206
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 5000 avgt 2 316.758
ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 500 avgt 2 7.569 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 1000 avgt 2 7.565 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 5000 avgt 2 7.574 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 500 avgt 2 7.568 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 1000 avgt 2 7.557 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 5000 avgt 2 7.585 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 500 avgt 2 7.560 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 1000 avgt 2 7.554 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 5000 avgt 2 7.574 ns/op
  ```
  * branch
  ```
Benchmark (partitionCount) (topicCount) Mode Cnt Score Error Units
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 500
avgt 2 910337.770 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 1000
avgt 2 1902351.360 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 5000
avgt 2 22215893.338 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 500
avgt 2 1572683.875 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 1000
avgt 2 3188560.081 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 5000
avgt 2 29984751.632 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 500
avgt 2 3413567.549 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 1000
avgt 2 7303174.254 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 5000
avgt 2 54293721.640 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 500 avgt 2 318.335
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 1000 avgt 2 331.386
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 5000 avgt 2 332.944
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 500 avgt 2 340.322
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 1000 avgt 2 330.294
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 5000 avgt 2 342.154
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 500 avgt 2 341.053
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 1000 avgt 2 335.458
ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 5000 avgt 2 322.050
ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 500 avgt 2 7.538 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 1000 avgt 2 7.548 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 5000 avgt 2 7.545 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 500 avgt 2 7.597 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 1000 avgt 2 7.567 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 5000 avgt 2 7.558 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 500 avgt 2 7.559 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 1000 avgt 2 7.615 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 5000 avgt 2 7.562 ns/op
  ```

* PartitionMakeFollowerBenchmark
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.partition.PartitionMakeFollowerBenchmark
  ```
  * trunk
  ```
Benchmark Mode Cnt Score Error Units
PartitionMakeFollowerBenchmark.testMakeFollower avgt 2 158.816 ns/op
  ```
  * branch
  ```
Benchmark Mode Cnt Score Error Units
PartitionMakeFollowerBenchmark.testMakeFollower avgt 2 160.533 ns/op
  ```

* UpdateFollowerFetchStateBenchmark
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.partition.UpdateFollowerFetchStateBenchmark
  ```
  * trunk
  ```
Benchmark Mode Cnt Score Error Units
UpdateFollowerFetchStateBenchmark.updateFollowerFetchStateBench avgt 2
4975.261 ns/op
UpdateFollowerFetchStateBenchmark.updateFollowerFetchStateBenchNoChange
avgt 2 4880.880 ns/op
  ```
  * branch
  ```
Benchmark Mode Cnt Score Error Units
UpdateFollowerFetchStateBenchmark.updateFollowerFetchStateBench avgt 2
5020.722 ns/op
UpdateFollowerFetchStateBenchmark.updateFollowerFetchStateBenchNoChange
avgt 2 4878.855 ns/op
  ```


* CheckpointBench
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.server.CheckpointBench
  ```
  * trunk
  ```
Benchmark (numPartitions) (numTopics) Mode Cnt Score Error Units
CheckpointBench.measureCheckpointHighWatermarks 3 100 thrpt 2 0.997
ops/ms
CheckpointBench.measureCheckpointHighWatermarks 3 1000 thrpt 2 0.703
ops/ms
CheckpointBench.measureCheckpointHighWatermarks 3 2000 thrpt 2 0.486
ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 100 thrpt 2 1.038
ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 1000 thrpt 2 0.734
ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 2000 thrpt 2 0.637
ops/ms
  ```
  * branch
  ```
Benchmark (numPartitions) (numTopics) Mode Cnt Score Error Units
CheckpointBench.measureCheckpointHighWatermarks 3 100 thrpt 2 0.990
ops/ms
CheckpointBench.measureCheckpointHighWatermarks 3 1000 thrpt 2 0.659
ops/ms
CheckpointBench.measureCheckpointHighWatermarks 3 2000 thrpt 2 0.508
ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 100 thrpt 2 0.923
ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 1000 thrpt 2 0.736
ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 2000 thrpt 2 0.637
ops/ms
  ```

* PartitionCreationBench
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.server.PartitionCreationBench
  ```
  * trunk
  ```
Benchmark (numPartitions) (useTopicIds) Mode Cnt Score Error Units
PartitionCreationBench.makeFollower 20 false avgt 2 5.997 ms/op
PartitionCreationBench.makeFollower 20 true avgt 2 6.961 ms/op
  ```
  * branch
  ```
Benchmark (numPartitions) (useTopicIds) Mode Cnt Score Error Units
PartitionCreationBench.makeFollower 20 false avgt 2 6.212 ms/op
PartitionCreationBench.makeFollower 20 true avgt 2 7.005 ms/op
  ```

Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-17 23:59:11 +08:00
Ming-Yen Chung f7d07d62d9
KAFKA-18990 Avoid redundant MetricName creation in BaseQuotaTest#produceUntilThrottled (#19215)
Avoid redundant MetricName creation in BaseQuotaTest#produceUntilThrottled via moving metrics creation out of loop.

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-16 21:06:04 +08:00
Ken Huang 7bff678699
KAFKA-18859 honor the error message of UnregisterBrokerResponse (#19027)
Reviewers: Ismael Juma <ismael@juma.me.uk>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-16 03:06:01 +08:00
ClarkChen e05b0e68e4
KAFKA-18915 Rewrite AdminClientRebootstrapTest to cover the current scenario (#19187)
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-16 02:35:41 +08:00
Andrew Schofield 5e7445a6d6
KAFKA-17516 Synonyms for client metrics configs (#17264)
This PR brings client metrics configuration resources in line with the
other config resources in terms of handling synonyms and defaults.
Specifically, configs which are not explicitly set take their hard-coded
default values, and these are reported by `kafka-configs.sh --describe`
and `Kafka-client-metrics.sh --describe`. Previously, they were omitted
which means the administrator needed to know the default values.

The ConfigHelper was changed so that the handling of client metrics
configuration matches that of group configuration.

Reviewers: poorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-14 16:05:40 +08:00
Alieh Saeedi ff785ac251
KAFKA-18651: Add Streams-specific broker configurations (#19176)
This change implements the broker-side configs proposed in KIP-1071.
The configurations implemented by this PR are only those that were specifically aimed to be included in `AK 4.1`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-03-13 18:05:24 +01:00
Mickael Maison 759fbbba8b
KAFKA-14484: Move UnifiedLog to storage module (#19030)
Rewrite UnifiedLog in Java

Reviewers: Jun Rao <jun@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-13 10:49:55 +01:00
Mickael Maison 55d65cb3ba
MINOR: Cleanups in CoreUtils (#19175)
Delete unused methods in CoreUtils and switch to Utils.newInstance().

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-12 19:43:30 +01:00
TengYao Chi e1d980a3d1
MINOR: Remove unused ConfigCommandOptions#forceOpt (#19170)
This field is unused, and we should remove it.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-13 00:04:22 +08:00
Abhinav Dixit c07c59ad24
KAFKA-18932: Removed usage of partition max bytes from share fetch requests (#19148)
This PR aims to remove the usage of partition max bytes from share fetch
requests. Partition Max Bytes is being defined by
`PartitionMaxBytesStrategy` which was added to the broker as part of PR
https://github.com/apache/kafka/pull/17870

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-03-12 13:19:19 +00:00
Apoorv Mittal f3da8f500e
KAFKA-18936: Fix share fetch when records are larger than max bytes (#19145)
The PR fixes the behaviour when records are fetched which are larger
than `fetch.max.bytes` config.

The usage of `hardMaxBytesLimit` is in ReplicaManager where it decides
whether to fetch a single record or not. The file records get sliced
based on the bytes requested. However, if `hardMaxBytesLimit` is false
then at least one record is fetched and bytes are adjusted accordingly in
`localLog`.

Reviewers: Jun Rao <junrao@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>
2025-03-12 09:03:35 +00:00
David Arthur 701573366f
KAFKA-18933 Add client integration tests module (#19144)
Adds a new ":clients:integration-test" Gradle module. Relocates one
example test from ":core"

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-11 16:36:23 -04:00
Lucas Brutschy fc2e3dfce9
MINOR: Disallow unused local variables (#18963)
Recently, we found a regression that could have been detected by static
analysis, since a local variable wasn't being passed to a method during
a refactoring, and was left unused. It was fixed in
[7a749b5](7a749b589f),
but almost slipped into 4.0. Unused variables are typically detected by
IDEs, but this is insufficient to prevent these kinds of bugs. This
change enables unused local variable detection in checkstyle for Kafka.

A few notes on the usage:
- There are two situations in which people actually want to have a local
variable but not use it. First, there are `for (Type ignored:
collection)` loops which have to loop `collection.length` number of
times, but that do not use `ignored` in the loop body. These are
typically still easier to read than a classical `for` loop. Second, some
IDEs detect it if a return value of a function such as `File.delete` is
not being used. In this case, people sometimes store the result in an
unused local variable to make ignoring the return value explicit and to
avoid the squiggly lines.
- In Java 22, unsued local variables can be omitted by using a single
underscore `_`. This is supported by checkstyle. In pre-22 versions,
IntelliJ allows such variables to be named `ignored` to suppress the
unused local variable warning. This pattern is often (but not
consistently) used in the Kafka codebase. This is, however, not
supported by checkstyle.

Since we cannot switch to Java 22, yet, and we want to use automated
detection using checkstyle, we have to resort to prefixing the unused
local variables with `@SuppressWarnings("UnusedLocalVariable")`. We have
to apply this in 11 cases across the Kafka codebase. While not being
pretty, I'd argue it's worth it to prevent bugs like the one fixed in
[7a749b5](7a749b589f).

Reviewers: Andrew Schofield <aschofield@confluent.io>, David Arthur
<mumrah@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Bruno
Cadonna <cadonna@apache.org>, Kirk True <ktrue@confluent.io>
2025-03-10 09:37:35 +01:00
Azhar Ahmed 832dfa36da
KAFKA-18637: Fix max connections per ip and override reconfigurations (#19099)
Reviewers: Christo Lolov <lolovc@amazon.com>, TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>
2025-03-10 07:27:48 +00:00
Ken Huang d5413fdb48
KAFKA-17856 Move ConfigCommandTest and ConfigCommandIntegrationTest to tool module (#17767)
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-09 21:05:36 +08:00
PoAn Yang a5e5e2dcd5
KAFKA-18706 Move AclPublisher to metadata module (#18802)
Move AclPublisher to org.apache.kafka.metadata.publisher package.

Reviewers: Christo Lolov <lolovc@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-09 21:00:33 +08:00
ClarkChen 1584d49470
KAFKA-18944 Remove unused setters from ClusterConfig (#19166)
Remove unused `saslServerProperties`, `saslClientProperties`,
`adminClientProperties`, `producerProperties`, and `consumerProperties`
in ClusterConfig.

First, I quickly fixed the unused adminClientProperties, and then I will
move on to https://github.com/apache/kafka/pull/19094 to fix the related
issues.

Pass AdminClientRebootstrapTest
<img width="1398" alt="Screenshot 2025-03-09 at 12 54 57 PM"
src="https://github.com/user-attachments/assets/73c50376-6602-493d-8abd-0eb2bb304114"
/>

Pass ClusterConfigTest
<img width="1117" alt="Screenshot 2025-03-09 at 12 55 28 PM"
src="https://github.com/user-attachments/assets/b4da59da-dfdf-4698-9077-5086854360ab"
/>

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-09 17:49:28 +08:00
ClarkChen 2a0dbd8e0b
KAFKA-18909 Move DynamicThreadPool to server module (#19081)
* Add `DynamicThreadPool.java` to the server module.
* Remove the old DynamicThreadPool object in the `DynamicBrokerConfig.scala`.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-09 17:42:51 +08:00
Colin Patrick McCabe 343bc995f4
KAFKA-18920: The kcontrollers must set kraft.version in ApiVersionsResponse (#19127)
The kafka controllers need to set kraft.version in their
ApiVersionsResponse messages according to the current kraft.version
reported by the Raft layer. Instead, currently they always set it to 0.

Also remove FeatureControlManager.latestFinalizedFeatures. It is not
needed and it does a lot of copying.

Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-07 13:46:46 -08:00
Dániel Urbán 40db001588
KAFKA-18929: Log a warning when time based segment delete is blocked by a future timestamp (#19137)
When producers send future timestamps, time retention based log segments
may get blocked from removal for an extended period of time. Log
cleaning should should warn in the logs when this scenario occurs.

Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>
2025-03-07 14:31:22 +01:00
ClarkChen 870db5d811
KAFKA-18915: Migrate AdminClientRebootstrapTest to use new test infra (#19094)
Migrate AdminClientRebootstrapTest to the new test infra and remove the
old Scala test.

Reviewers: TengYao Chi <kitingiao@gmail.com>, David Arthur <mumrah@gmail.com>
2025-03-06 16:05:51 -05:00
Andrew Schofield 1da30bdedf
KAFKA-18900: Experimental share consumer acknowledge mode config (#19113)
User testing of the `KafkaShareConsumer` interface has revealed some
areas which confuse people. One of these is that way that it decides
whether you want to use implicit or explicit acknowledgement of records
by observing which calls the application issues. We are taking the
opportunity to refine the interface before it is finalised.

This PR introduces an experimental configuration called
`internal.share.acknowledgement.mode` which can be used to make the
application declare which kind of acknowledgement it wishes to use. We
plan to try out the configuration, assess whether it has helped, and
then create a proper consumer configuration that makes this area better.
That would require a lot of change in the tests, which explains why this
initial PR only has a small number of tests.

Reviewers: David Arthur <mumrah@gmail.com>
2025-03-06 17:57:11 +00:00
Ken Huang 041d8019d6
KAFKA-18910 Remove kafka.utils.json (#19112)
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-06 14:11:20 +08:00
co63oc 3d7ac0c3d1
MINOR: Fix typos in multiple files (#19102)
Fix typos in multiple files

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-03-05 14:27:32 +00:00
Xuan-Zhang Gong 18eca0229d
KAFKA-18882 Remove BaseKey, TxnKey, and UnknownKey (#19054)
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi <kitingiao@gmail.com>, PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-05 21:16:18 +08:00
Lan Ding 69ff5d1e70
KAFKA-18817: ShareGroupHeartbeat and ShareGroupDescribe API must check topic describe (#19083)
Reviewers: Christo Lolov <lolovc@amazon.com>
2025-03-05 11:25:08 +00:00
Kuan-Po Tseng cbd72cc216
KAFKA-14121: AlterPartitionReassignments API should allow callers to specify the option of preserving the replication factor (#18983)
Reviewers: Christo Lolov <lolovc@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi <kitingiao@gmail.com>
2025-03-05 11:23:12 +00:00
Logan Zhu 011f256c86
KAFKA-18886 add behavior change of CreateTopicPolicy and AlterConfigPolicy to zk2kraft (#19087)
1. Updated JavaDoc to reflect that CreateTopicPolicy and AlterConfigPolicy run on the controller in KRaft mode.
2. Modified Behavioral Change Reference in the HTML docs to include this change.
3. add warning message to KafkaConfig if the config of broker node has policy configs 


Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-05 15:15:03 +08:00
co63oc e4ece37dbf
Fix typos in multiple files (#19086)
Fix typos in multiple files

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-03-04 16:05:51 +00:00
Apoorv Mittal c1fc59fc23
KAFKA-18918: Correcting releasing of locks on exception (#19091)
The PR corrects the way the locks are released on exception. As
`partitionsAcquired` can be a reference to `topicPartitionData`, hence
the locks should released prior clearing `partitionsAcquired`.

Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-03-04 16:04:45 +00:00
David Jacot 1df4a42b40
KAFKA-18916; Resolved regular expressions must update the group by topics data structure (#19088)
When regular expressions are resolved, they do not update the group by
topics data structure. Hence, topic changes (e.g. deletion) do not
trigger a rebalance of the group.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-03-04 06:31:08 -08:00
Nick Guo 101e15bb1c
KAFKA-18867 add tests to describe topic configs with empty name (#19075)
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-04 14:56:25 +08:00
Mahsa Seifikar 2154e55abf
MINOR: Prevent broker fencing by adjusting resendExponentialBackoff in BrokerLifecycleManager (#19061)
This PR reduces `maxInterval` for `resendExponentialBackoff` in
`BrokerLifecycleManager` class from `broker.session.timeout.ms` to half
of its value. Setting `maxInterval` to `broker.session.timeout.ms`
caused brokers to be fenced if a resend attempt occurred near the
timeout threshold, leading to unnecessary broker fencing.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-03-03 12:03:15 -08:00
Apoorv Mittal a6c53d0c37
KAFKA-18878: Added share session cache and delayed share fetch metrics (KIP-1103) (#19059)
The PR implements the ShareSessionCache and DelayedShareFetchMetrics as
defined in KIP-1103.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-03-03 16:44:34 +00:00
Lucas Brutschy a04dd21f26
KAFKA-18613: Auto-creation of internal topics in streams group heartbeat (#18981)
Implements auto-topic creation when handling the streams group
heartbeat.

Inside KafkaApis, the handler for streamsGroupHeartbeat uses the result
of the streams group heartbeat inside the group coordinator to attempt
to create all missing internal topics using AutoTopicCreationManager.
CREATE TOPIC ACLs are checked. The unit tests class
AutoTopicCreationManagerTest is brought back (it was recently deleted
during a ZK removal PR), but testing only the kraft-related
functionality.

Reviewers: Bruno Cadonna <bruno@confluent.io>

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
2025-03-03 08:48:00 +01:00
Xuan-Zhang Gong ceac4f0a1d
KAFKA-18880 Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException (#19047)
Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException as they were used by zk path.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-02 10:54:32 +08:00
TengYao Chi e0c77140b2
KAFKA-17039 KIP-919 supports for unregisterBroker (#19063)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-01 23:55:35 +08:00
Nick Guo 98bb79e732
KAFKA-17981 add Integration test for ConfigCommand to add config `key=[val1,val2]` (#17771)
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-01 13:15:25 +08:00
Apoorv Mittal 8cf969e00a
KAFKA-18734: Implemented share partition metrics (KIP-1103) (#19045)
The PR implements the SharePartitionMetrics as defined in KIP-1103, with
one change. The metric `FetchLockRatio` is defined as `Meter` in KIP but
is implemented as `HIstogram`. There was a discussion about same on
KIP-1103 discussion where we thought that `FetchLockRatio` is
pre-aggregated but while implemeting the rate from `Meter` can go above
100 as `Meter` defines rate per time period. Hence it makes more sense
to implement metric `FetchLockRatio` as `Histogram`.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-28 14:22:27 +00:00
Apoorv Mittal 8b605bd362
MINOR: Removing share partition manager flaky annotation (#19053)
There isn't any flaky test for SharePartitionManager in last 7 days, removing flaky annotation.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-28 08:49:59 +00:00
Dongnuo Lyu 36f19057e1
KAFKA-18813: ConsumerGroupHeartbeat API and ConsumerGroupDescribe API must check topic describe (#18989)
This patch filters out the topic describe unauthorized topics from the
ConsumerGroupHeartbeat and ConsumerGroupDescribe response.

In ConsumerGroupHeartbeat, 
- if the request has `subscribedTopicNames` set, we directly check the
authz in `KafkaApis` and return a topic auth failure in the response if
any of the topics is denied.
- Otherwise, we check the authz only if a regex refresh is triggered and
we do it based on the acl of the consumer that triggered the refresh. If
any of the topic is denied, we filter it out from the resolved
subscription.

In ConsumerGroupDescribe, we check the authz of the coordinator
response. If any of the topic in the group is denied, we remove the
described info and add a topic auth failure to the described group.
(similar to the group auth failure)

Reviewers: David Jacot <djacot@confluent.io>, Lianet Magrans
<lmagrans@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>,
Chia-Ping Tsai <chia7712@gmail.com>, TaiJuWu <tjwu1217@gmail.com>,
TengYao Chi <kitingiao@gmail.com>
2025-02-26 13:05:36 -05:00
Lucas Brutschy cb7c54ccd3
KAFKA-18614, KAFKA-18613: Add streams group request plumbing (#18979)
This change implements the basic RPC handling StreamsGroupHeartbeat and
StreamsGroupDescribe. This includes:
 - Adding an option to enable streams groups on the broker
- Passing describe and heartbeats to the right shard of the group
coordinator
- The handler inside the GroupMetadatManager for StreamsGroupDescribe is
fairly trivial, and is included directly in this PR.
- The handler for StreamsGroupHeartbeat is complex and not included in
this PR yet. Instead, a UnsupportedOperationException is thrown.
However, the interface is already defined: The result of a
streamsGroupHeartbeat is a response, together with a list of internal
topics to be created.

The heartbeat implementation inside the `GroupMetadataManager`, which
actually implements the assignment / reconciliation logic, will come in
a follow-up PR. Also, automatic creation of internal topics will be
created in a follow-up PR.

Reviewers: Bill Bejeck <bill@confluent.io>
2025-02-26 16:33:26 +01:00
Abhinav Dixit 4b5a16bf6f
KAFKA-18757: Create full-function SimpleAssignor to match KIP-932 description (#18864)
### About
The current `SimpleAssignor` in AK assigned all subscribed topic
partitions to all the share group members. This does not match the
description given in
[KIP-932](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255070434#KIP932:QueuesforKafka-TheSimpleAssignor).
Here are the rules as mentioned in the KIP by which the assignment
should happen. We have changed the step 3 implementation here due to the
reasons
[described](https://github.com/apache/kafka/pull/18864#issuecomment-2659266502)
-

1. The assignor hashes the member IDs of the members and maps the
partitions assigned to the members based on the hash. This gives
approximately even balance.
2. If any partitions were not assigned any members by (1) and do not
have members already assigned in the current assignment, members are
assigned round-robin until each partition has at least one member
assigned to it.
3. We combine the current and new assignment. (Original rule - If any
partitions were assigned members by (1) and also have members in the
current assignment assigned by (2), the members assigned by (2) are
removed.)

### Tests
The added code has been verified with unit tests and the already present
integration tests.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, TaiJuWu <tjwu1217@gmail.com>
2025-02-26 11:02:23 +00:00
José Armando García Sancio 4a8a0637e0
KAFKA-18723; Better handle invalid records during replication (#18852)
For the KRaft implementation there is a race between the network thread,
which read bytes in the log segments, and the KRaft driver thread, which
truncates the log and appends records to the log. This race can cause
the network thread to send corrupted records or inconsistent records.
The corrupted records case is handle by catching and logging the
CorruptRecordException. The inconsistent records case is handle by only
appending record batches who's partition leader epoch is less than or
equal to the fetching replica's epoch and the epoch didn't change
between the request and response.

For the ISR implementation there is also a race between the network
thread and the replica fetcher thread, which truncates the log and
appends records to the log. This race can cause the network thread send
corrupted records or inconsistent records. The replica fetcher thread
already handles the corrupted record case. The inconsistent records case
is handle by only appending record batches who's partition leader epoch
is less than or equal to the leader epoch in the FETCH request.

Reviewers: Jun Rao <junrao@apache.org>, Alyssa Huang <ahuang@confluent.io>, Chia-Ping Tsai <chia7712@apache.org>
2025-02-25 20:09:19 -05:00
Apoorv Mittal df5839a9f4
KAFKA-17351: Improved handling of compacted topics in share partition (2/N) (#19010)
The PR handles fetch for `compacted` topics. The fix was required only
when complete batch disappears from the topic log, and same batch is
marked re-available in Share Partition state cache. Subsequent log reads
will not result the disappeared batch in read response hence respective
batch will be left as available in the state cache.

The PR checks for the first fetched/read batch base offset and if it's
greater than the position from where the read occurred (fetch offset)
then if there exists any `available` batches in the state cache then
they will be archived.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>
2025-02-25 14:11:39 +00:00
xijiu 1edc30bf30
KAFKA-17836 Move RackAwareTest to server module (#19021)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-25 18:15:34 +08:00
TaiJuWu 1c82b89b4c
KAFKA-18712 Move Endpoint to server module (#18803)
Reviewers: Ismael Juma <ismael@juma.me.uk>, Mickael Maison <mickael.maison@gmail.com>, Christo Lolov <lolovc@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-25 14:02:51 +08:00
PoAn Yang 10873e4210
KAFKA-18281: Kafka is improperly validating non-advertised listeners for routable controller addresses (#18387)
When a cluster is configured with a dynamic controller quorum, KRaft replica's endpoint are computed using the advertised.listeners property and not the quorum.controller.voters property. This change in the configuration makes it difficult to keeping all previous node configurations compatible with the new endpoint discovery functionality.

The least intrusive solution is to rely on Kafka's reverse hostname lookup when the hostname is not specified. The effective advertised controller listener now remove '0.0.0.0' hostname if the endpoint came from the listener configuration and not the advertised.listener configuration.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Alyssa Huang <ahuang@confluent.io>
2025-02-24 21:51:28 -05:00
Nick Guo d23a61738a
KAFKA-17937 Cleanup AbstractFetcherThreadTest (#18900)
- Remove AbstractFetcherThreadWithIbp26Test as it tests unsupported IBP
- cleanup AbstractFetcherThreadTest to remove unreachable paths, variables, and code

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-25 07:45:47 +08:00
Apoorv Mittal 48a506b7b8
KAFKA-18522: Slice records for share fetch (#18804)
The PR handles slicing of fetched records based on acquire response for
share fetch. There could be additional bytes fetched from log but
acquired offsets can be a subset, typically with `max fetch records`
configuration. Rather sending additional bytes of fetched data to client
we should slice the file and wire only needed batches.

Note: If the acquired offsets are within a batch then we need to send
the entire batch within the file record. Hence rather checking for
individual batches, PR finds the first and last acquired offset, and
trims the file for all batches between (inclusive) these two offsets.

Reviewers: Christo Lolov <lolovc@amazon.com>, Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>
2025-02-24 09:55:24 -08:00
mingdaoy 289e958c39
MINOR: Fix validateResourceNameIsNodeId's exception message (#19017)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-24 09:30:02 +08:00
Ismael Juma 13cb87c2d0
MINOR: Remove request log space added inadvertently (#19011)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-23 11:30:19 +08:00
Apoorv Mittal 6e45ab7d84
KAFKA-17351: Update tests and acquire API to allow discard batches from compacted topics (1/N) (#18978)
The PR does following:
1.  Adds `fetchOffset` to `acquire` API in SharePartition.
2. Adds a ShareFetchPartitionData class efficiently handle the
propagation of fetchOffset information.
3. Updates SharePartitionTests to make common code so such improvements
does not require all tests changes for future PRs.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-22 16:14:09 +00:00
Sushant Mahajan 4f28973bd1
KAFKA-18827: Initialize share state, share coordinator impl. [1/N] (#18968)
In this PR, we have added the share coordinator and KafkaApis side impl
of the intialize share group state RPC.
ref:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka#KIP932:QueuesforKafka-InitializeShareGroupStateAPI

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-22 16:12:08 +00:00
Apoorv Mittal f543eac4fe
KAFKA-18733: Implemented fetch ratio and partition acquire time metrics (3/N) (#18959)
PR implements the final set of ShareGroupMetrics,
RequestTopicPartitionsFetchRatio and TopicPartitionsAcquireTimeMs, as
defined in KIP-1103:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1103%3A+Additional+metrics+for+cooperative+consumption

Note: Metric `RequestTopicPartitionsFetchRatio` is calculated as
percentage as Histogram API doesn't record double.


Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>
2025-02-21 17:01:39 +00:00
Calvin Liu 8f13e7c207
MINOR: Move the ELR default version to 4.1 (#18954)
- ELR is enabled (ELRV_1) by default if the cluster is created with its bootstrap metadata version >= IBP_4_1_IV0.
- ELRV_1 can be manually enabled iff the metadata version is >= IBP_4_0_IV1.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Colin P. McCabe <cmccabe@apache.org>, David Jacot <djacot@confluent.io>
2025-02-21 16:13:11 +01:00
Shivsundar R 7da1a6cbff
KAFKA-18033: Remove flaky tag in ShareConsumerTest (#18995)
3 tests which were marked flaky in ShareConsumerTest do not have any
failure on trunk since the test was converted to use `ClusterTestExtensions`.

Reviewers: Sushant Mahajan <smahajan@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-02-21 13:50:08 +00:00
TengYao Chi 767a62ade6
KAFKA-18737 KafkaDockerWrapper setup functions fails due to storage format command (#18844)
The current Docker Hub documentation for Kafka is based on the use of static voters. Since Kafka 4.0 utilizes dynamic voters, users following the doc of docker hub may encounter unexpected behavior. Due to the limited time available for the 4.0.0 release, a simple and quick solution is to revert to using static voters within the Docker image. This can be achieved by adding a configuration file with static voter definitions to the kafka/docker folder, keeping it separate from the main kafka/config directory. This approach allows us to encourage the use of dynamic voters in typical deployments while maintaining compatibility within the Docker image.

Reviewers: Vedarth Sharma <142404391+VedarthConfluent@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-21 20:43:41 +08:00
TengYao Chi d31cbf59de
KAFKA-18831 Migrating to log4j2 introduce behavior changes of adjusting level dynamically (#18969)
fix the following behavior changes.

1) in log4j 1, users can't change the logger by parent if the logger is declared by properties explicitly. For example, `org.apache.kafka.controller` has level explicitly in the properties. Hence, we can't use "org.apache.kafka=INFO" to change the level of `org.apache.kafka.controller` to INFO. By contrast, log4j2 allows us to change all child loggers by the parent logger.

2) in log4j2, we can change the level of root to impact all loggers' level. By contrast, log4j 1 can't. 

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-21 16:12:58 +08:00
Calvin Liu 1eecd02ce8
MINOR: Deflake EligibleLeaderReplicasIntegrationTest (#18923)
Make sure to give enough time for the partition ISR updates.

Reviewers: David Jacot <djacot@confluent.io>
2025-02-20 05:14:15 -08:00
Matthias J. Sax 538a60e1b3
MINOR: disallow rawtypes and fail build (#18877)
Cleanup code to avoid rawtype, and add suppressions where necessary.
Change the build to fail on rawtype warning.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-02-19 13:11:49 -08:00
Ismael Juma 3a59a526d9
MIINOR: Remove redundant quorum parameter from *AdminIntegrationTest classes (#18965)
Reviewers: Lianet Magrans <lmagrans@confluent.io>
2025-02-19 15:57:47 -05:00
Shivsundar R 3603c8fe35
KAFKA-18829: Added check before converting to IMPLICIT mode (#18964)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-19 17:34:28 +00:00
Ismael Juma 3dba3125e9
KAFKA-18601: Assume a baseline of 3.3 for server protocol versions (#18845)
3.3.0 was the first KRaft release that was deemed production-ready and also
when KIP-778 (KRaft to KRaft upgrades) landed. Given that, it's reasonable
for 4.x to only support upgrades from 3.3.0 or newer (the metadata version also
needs to be set to "3.3" or newer before upgrading).

Noteworthy changes:
1. `AlterPartition` no longer includes topic names, which makes it possible to
simplify `AlterParitionManager` logic.
2. Metadata versions older than `IBP_3_3_IV3` have been removed and
`IBP_3_3_IV3` is now the minimum version.
3. `MINIMUM_BOOTSTRAP_VERSION` has been removed.
4. Removed `isLeaderRecoverySupported`, `isNoOpsRecordSupported`,
`isKRaftSupported`, `isBrokerRegistrationChangeRecordSupported` and
`isInControlledShutdownStateSupported` - these are always `true` now.
Also removed related conditional code.
5. Removed default metadata version or metadata version fallbacks in
multiple places - we now fail-fast instead of potentially using an incorrect
metadata version.
6. Update `MetadataBatchLoader.resetToImage` to set `hasSeenRecord`
based on whether image is empty - this was a previously existing issue that
became more apparent after the changes in this PR.
7. Remove `ibp` parameter from `BootstrapDirectory`
8. A number of tests were not useful anymore and have been removed.

I will update the upgrade notes via a separate PR as there are a few things that
need changing and it would be easier to do so that way.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>, David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Justine Olshan <jolshan@confluen.io>, Ken Huang <s7133700@gmail.com>
2025-02-19 05:35:42 -08:00
xijiu 4c4458c17a
KAFKA-18799 Remove AdminUtils (#18946)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-19 06:25:43 +08:00
PoAn Yang 1132f08c57
KAFKA-18773 Migrate the log4j1 config to log4j 2 for native image and README (#18872)
- update reflection-config.json and resource-config.json to include log4j2 and jackson
- remove unused jackson scala library
- fix the incorrect path of log4j2.yaml
- adopt workaround (--standalone) to make this PR work and it will be fixed by KAFKA-18737)

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-19 00:48:46 +08:00
TaiJuWu 934b0159bb
KAFKA-18089: Upgrade Caffeine lib to 3.1.8 (#18004)
- Fixed the RemoteIndexCacheTest that fails with caffeine > 3.1.1

Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
2025-02-18 21:51:38 +05:30
Parker Chang ed366e6b89
MINOR: Align assertFutureThrows method signature with JUnit conventions (#18825)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-02-18 15:56:42 +00:00
Mickael Maison 0a2fab9310
KAFKA-14484: Decouple UnifiedLog and RemoteLogManager (#18460)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2025-02-18 15:10:31 +01:00
Andrew Schofield 6c14f64245
MINOR: Rename NoOpShareStatePersister for consistency (#18933)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-18 14:07:59 +00:00
Chirag Wadhwa 63229a768c
KAFKA-16718 [1/n]: Added DeleteShareGroupOffsets request and response schema (#18927)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-18 14:06:24 +00:00
Andrew Schofield 385b7ad355
MINOR: Align share group admin authz with consumer group (#18936)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2025-02-18 09:12:07 +00:00
Kamal Chandraprakash da3643c6b4
KAFKA-18787: RemoteIndexCache fails to delete invalid files on init (#18888)
The stale/invalid files that ends-with ".deleted" and ".tmp" should be cleaned when the broker gets restarted.

- fix the remote-index-cache test to use the logDir instead of topicDir
- fix the flaky test

Reviewers: Luke Chen <showuon@gmail.com>
2025-02-18 12:56:03 +05:30
Apoorv Mittal 06ce3e890b
KAFKA-18733: Updating share group record acks metric (2/N) (#18924)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-17 18:12:58 +00:00
PoAn Yang 2b6e868538
KAFKA-18784 Fix ConsumerWithLegacyMessageFormatIntegrationTest (#18889)
In PR #18267, we removed old message format for cases in ConsumerWithLegacyMessageFormatIntegrationTest. Although test cases can pass, they don't fulfill original purpose. We can't send old message format since 4.0, so I change cases to append old records by ReplicaManager directly.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 20:43:29 +08:00
Andrew Schofield 9b7ad6ec32
MINOR: Mark testQuotaOverrideDelete as flaky (#18925)
Reviewers: poorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 15:20:35 +08:00
TengYao Chi 5cbe00e375
MINOR: Remove unused member in DynamicBrokerConfig (#18915)
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 04:46:25 +08:00
Ming-Yen Chung e828767062
KAFKA-18790 Fix testCustomQuotaCallback (#18906)
Frequently updating the trust store can cause unexpected termination of the AsyncConsumer background thread.

1. To resolve this issue, reuse the same AdminClient instead of recreating it.
2. Add error logging when fail to initialize resources for the consumer network thread.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-15 03:07:59 +08:00
Jimmy Wang 6a6b80215d
KAFKA-16717 [1/2]: Add AdminClient.alterShareGroupOffsets (#18819)
KAFKA-16720 aims to add the support for the AlterShareGroupOffsets AdminClient. Key Changes in the PR:

1. Added handing of alterShareGroupOffsets() in KafkaAdminClient and introduce AlterShareGroupOffsetRequest/AlterShareGroupOffsetResponse/AlterShareGroupOffsetsOptions classes.
2. Corresponding test in KafkaAdminClientTest.
3. Added ALTER_SHARE_GROUP_OFFSETS API (will finish it in next PR and the share coordinator pieces)

Reviewers: poorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-15 02:35:46 +08:00
Apoorv Mittal 53543bcf63
KAFKA-18733: Updating share group metrics (1/N) (#18826)
Reviewers: Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-02-14 08:48:41 +00:00
陳昱霖(Yu-Lin Chen) 2bbd25841e
KAFKA-18298 Fix flaky testConsumerGroupsDeprecatedConsumerGroupState and testConsumerGroups in PlaintextAdminIntegrationTest (#18513)
It's related to KAFKA-18298 and KAFKA-18297. The root cause of the flaky tests is member rejoin after member removal. To prevent members from rejoining after being removed, before removing group members, calling `consumers.close` in ConsumerThread . This fix also extract the flaky member removal test  to new test `testConsumerGroupWithMemberRemoval`.

Flow of member removal test: 
1. Set 2 static consumer + 1 dynamic consumer
2. Close all consumers.
3. remove one static member
4. remove remaining members
 
Before KIP-1092, the member count is different between ClassicConsumer/AsyncConsumer. (AsyncConsumer will remove dynamic member after consumer closed.)

To get more details, please refer to the discussion under KAFKA-18297 and this PR:
- discussion : [Link](https://issues.apache.org/jira/browse/KAFKA-18297?focusedCommentId=17912537&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17912537)
- review: https://github.com/apache/kafka/pull/18513#pullrequestreview-2589110367

This PR fixed below flaky errors:

1. **PlaintextAdminIntegrationTest#testConsumerGroups**
  a.  `org.opentest4j.AssertionFailedError: expected: <2> but was: <3>` ([Report](https://ge.apache.org/s/lt3lpviv45cns/tests/task/:core:test/details/kafka.api.PlaintextAdminIntegrationTest/testConsumerGroups(String%2C%20String)%5B1%5D?top-execution=1))
  b.  `org.opentest4j.AssertionFailedError: expected: <true> but was: <false>` ([Report](https://ge.apache.org/s/jlxo446xalpoa/tests/task/:core:test/details/kafka.api.PlaintextAdminIntegrationTest/testConsumerGroups(String%2C%20String)%5B1%5D?top-execution=1))

2. **PlaintextAdminIntegrationTest#testConsumerGroupsDeprecatedConsumerGroupState**
  a.  `org.opentest4j.AssertionFailedError: expected: <2> but was: <3>` ([Report](https://ge.apache.org/s/ndoj6s2stb446/tests/task/:core:test/details/kafka.api.PlaintextAdminIntegrationTest/testConsumerGroupsDeprecatedConsumerGroupState(String%2C%20String)%5B1%5D?top-execution=1))
  b. `org.opentest4j.AssertionFailedError: expected: <true> but was: <false>` ([Report](https://ge.apache.org/s/kh3jze2tc5qeu/tests/task/:core:test/details/kafka.api.PlaintextAdminIntegrationTest/testConsumerGroupsDeprecatedConsumerGroupState(String%2C%20String)%5B1%5D?top-execution=1))

Reviewers: David Jacot <djacot@confluent.io>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-14 07:28:45 +08:00
Andrew Schofield 952113e8e0
KAFKA-16720: Support multiple groups in DescribeShareGroupOffsets RPC (#18834)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2025-02-13 18:27:05 +00:00
Calvin Liu 9cb271f1e1
KAFKA-18654[2/2]: Transction V2 retry add partitions on the server side when handling produce request. (#18810)
During the transaction commit phase, it is normal to hit CONCURRENT_TRANSACTION error before the transaction markers are fully propagated. Instead of letting the client to retry the produce request, it is better to retry on the server side.

Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan <jolshan@confluent.io>
2025-02-13 09:30:58 -08:00
Apoorv Mittal a13d815a0d
MINOR: Updated share partition manager tests to close and other fixes (#18862)
Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-02-13 13:37:37 +00:00
Ken Huang 9494bebee6
KAFKA-18728 Move ListOffsetsPartitionStatus to server module (#18807)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
2025-02-13 10:36:46 +05:30
Jhen-Yung Hsu b0e5cdfc57
KAFKA-18777 add `PartitionsWithLateTransactionsCount` to BrokerMetricNamesTest (#18869)
Rewrite BrokerMetricNamesTest using ReplicaManager.MetricNames, ensuring that all metrics are always included. This helps prevent issues like PartitionsWithLateTransactionsCount not being correctly included in the test before.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-12 22:09:42 +08:00
PoAn Yang 63fc9b3cb8
KAFKA-18771: fix Flaky test KRaftClusterTest .testDescribeQuorumRequestToControllers (#18859)
The case testDescribeQuorumRequestToControllers shutdowns raft client but not the controller. This makes client has chance to send a request to the controller and get NOT_LEADER_OR_FOLLOWER error. However, if the raft client finishes shutdown before handling the request, the request will not be handled. Shutdown the controller before doing KafkaFuture#get for the client request, so we can make sure the request is handled by another controller eventually.

Signed-off-by: PoAn Yang <payang@apache.org>

Reviewers: Luke Chen <showuon@gmail.com>
2025-02-12 16:16:43 +08:00
Justine Olshan 400363b7e2
KAFKA-18035: TransactionsTest testBumpTransactionalEpochWithTV2Disabled failed on trunk (#18451)
Sometimes we didn't get into abortable state before aborting, so the epoch didn't get bumped. Now we force abortable state with an attempt to send before aborting so the epoch bump occurs as expected.

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-02-11 14:01:43 -08:00
Edoardo Comar 7e405ccc65
KAFKA-18758: NullPointerException in shutdown following InvalidConfigurationException (#18833)
* KAFKA-18758:  NullPointerException in shutdown following InvalidConfigurationException

Add checks for null in shutdown as BrokerLifecycleManager is not instantiaited if LogManager constructor throws an Exception
2025-02-11 10:06:55 +00:00
Sushant Mahajan 675a0889de
KAFKA-18764: Throttle on share state RPCs auth failure. (#18855)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-11 09:54:24 +00:00