Commit Graph

15707 Commits

Author SHA1 Message Date
Sean Quah c16c240bd1
KAFKA-18688: Fix uniform homogeneous assignor stability (#19677)
When the number of partitions is not divisible by the number of members,
some members will end up with one more partition than others.
Previously, we required these to be the members at the start of the
iteration order, which meant that partitions could be reassigned even
when the previous assignment was already balanced.

Allow any member to have the extra partition, so that we do not move
partitions around when the previous assignment is already balanced.
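
The balance condition described above can be sketched as follows. This is a minimal illustration, not the Kafka assignor code: the class and method names are hypothetical, and it assumes a non-empty member list. It accepts an assignment as balanced no matter which members hold the extra partition.

```java
import java.util.List;

// Hypothetical illustration; not the Kafka assignor implementation.
class BalanceCheckSketch {
    // Returns true when every member holds floor(P/M) or floor(P/M) + 1
    // partitions and exactly P % M members hold the extra one,
    // regardless of where those members sit in the iteration order.
    static boolean isBalanced(List<Integer> countsPerMember, int totalPartitions) {
        int members = countsPerMember.size(); // assumed to be > 0
        int minQuota = totalPartitions / members;
        int expectedExtras = totalPartitions % members;
        int extras = 0;
        for (int count : countsPerMember) {
            if (count == minQuota + 1) {
                extras++;
            } else if (count != minQuota) {
                return false;
            }
        }
        return extras == expectedExtras;
    }

    public static void main(String[] args) {
        // 7 partitions over 3 members; the extra partition is on the last member.
        System.out.println(isBalanced(List.of(2, 2, 3), 7)); // true
        System.out.println(isBalanced(List.of(4, 2, 1), 7)); // false
    }
}
```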

Before the PR
```
Benchmark                                 (assignmentType)  (assignorType)  (isRackAware)  (memberCount)  (partitionsToMemberRatio)  (subscriptionType)  (topicCount)  Mode  Cnt    Score   Error  Units
ServerSideAssignorBenchmark.doAssignment              FULL           RANGE          false          10000                         50         HOMOGENEOUS          1000  avgt    2   26.175          ms/op
ServerSideAssignorBenchmark.doAssignment              FULL           RANGE          false          10000                         50       HETEROGENEOUS          1000  avgt    2  123.955          ms/op
ServerSideAssignorBenchmark.doAssignment       INCREMENTAL           RANGE          false          10000                         50         HOMOGENEOUS          1000  avgt    2   24.408          ms/op
ServerSideAssignorBenchmark.doAssignment       INCREMENTAL           RANGE          false          10000                         50       HETEROGENEOUS          1000  avgt    2  114.873          ms/op
```
After the PR
```
Benchmark                                 (assignmentType)  (assignorType)  (isRackAware)  (memberCount)  (partitionsToMemberRatio)  (subscriptionType)  (topicCount)  Mode  Cnt    Score   Error  Units
ServerSideAssignorBenchmark.doAssignment              FULL           RANGE          false          10000                         50         HOMOGENEOUS          1000  avgt    2   24.259          ms/op
ServerSideAssignorBenchmark.doAssignment              FULL           RANGE          false          10000                         50       HETEROGENEOUS          1000  avgt    2  118.513          ms/op
ServerSideAssignorBenchmark.doAssignment       INCREMENTAL           RANGE          false          10000                         50         HOMOGENEOUS          1000  avgt    2   24.636          ms/op
ServerSideAssignorBenchmark.doAssignment       INCREMENTAL           RANGE          false          10000                         50       HETEROGENEOUS          1000  avgt    2  115.503          ms/op
```

Reviewers: David Jacot <djacot@confluent.io>
2025-05-13 08:01:14 -07:00
Abhinav Dixit d5ce463ed3
KAFKA-19253: Improve metadata handling for share version using feature listeners (1/N) (#19659)
This PR creates a listener for `SharePartitionManager` to listen for
changes in the `ShareVersion` feature. When the feature is toggled, the
attributes in `SharePartitionManager` must be updated accordingly.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-13 15:01:03 +01:00
Lianet Magrans 33b9471ac3
MINOR: Improve docs for retries & cleanup (#19595)
Improve docs for retries config, mainly to clarify the expected
behaviour on retries=0. Also remove unused funcs and fix a typo.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Ming-Yen Chung
 <mingyen066@gmail.com>, PoAn Yang <payang@apache.org>
2025-05-13 09:36:49 -04:00
Sushant Mahajan bf53561d16
KAFKA-19201: Handle deletion of user topics part of share partitions. (#19559)
* Currently, even if a user topic is deleted, its related records for
subscribed share groups are not deleted from the group coordinator (GC)
and the share coordinator (SC).
* The event of topic delete is propagated from the
BrokerMetadataPublisher to the coordinators via the
`onPartitionsDeleted` method. This PR leverages this method to issue
cleanup calls to the GC and SC respectively.
* To prevent chaining of futures in the GC, we issue async calls to both
GC and SC independently and the methods take care of the respective
cleanups unaware of the other.
* This approach is more efficient and avoids the issues that timeouts and
broker restarts cause for chained future execution.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-13 14:22:17 +01:00
David Jacot d154a314e7
KAFKA-14691; Add TopicId to OffsetFetch API (#19515)
This patch extends the OffsetFetch API to support topic ids. From
version 10 of the API, topic ids must be used.

The patch only contains the server side changes and it keeps the version
10 as unstable for now. We will mark the version as stable when the
client side changes are merged in.

Reviewers: TengYao Chi <frankvicky@apache.org>, Lianet Magrans <lmagrans@confluent.io>
2025-05-13 15:10:10 +02:00
Apoorv Mittal b4b73c604b
KAFKA-19245: Updated default locks config for share group (#19705)
Updated the default locks config for share groups from 200 to 2000. The
increase follows from tests which showed that, with the client's default
maxFetchRecords of 500 and a locks limit of 200, there can't be parallel
fetches for the same partition. The tests also showed that sharing a
partition by a factor of 3-4 is easily achievable, hence the limit is
raised to 4 times the default maxFetchRecords (500).

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-13 13:46:25 +01:00
Sean Quah 1293658cca
KAFKA-19163: Avoid deleting groups with pending transactional offsets (#19496)
When a group has pending transactional offsets but no committed offsets,
we can accidentally delete it while cleaning up expired offsets. Add a
check to avoid this case.
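
A minimal sketch of such a check is shown below. The class, method, and parameter names are hypothetical and the data layout is an assumption made for illustration; it is not the group coordinator's actual code.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; names and data layout are assumptions.
class GroupCleanupSketch {
    // committedOffsets: partition -> committed offset
    // pendingTxnOffsets: producer id -> partitions with uncommitted transactional offsets
    static boolean canDeleteGroup(Map<String, Long> committedOffsets,
                                  Map<Long, Set<String>> pendingTxnOffsets) {
        boolean hasPending = pendingTxnOffsets.values().stream()
                .anyMatch(partitions -> !partitions.isEmpty());
        // A group is only eligible for deletion when nothing is committed
        // and no transaction may still commit offsets for it.
        return committedOffsets.isEmpty() && !hasPending;
    }

    public static void main(String[] args) {
        System.out.println(canDeleteGroup(Map.of(), Map.of(42L, Set.of("topic-0")))); // false
        System.out.println(canDeleteGroup(Map.of(), Map.of())); // true
    }
}
```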

Reviewers: David Jacot <djacot@confluent.io>
2025-05-13 05:10:26 -07:00
Andrew Schofield 86baac103b
MINOR: Improve client error messages for share groups not enabled (#19688)
As mentioned in
https://github.com/apache/kafka/pull/19378#pullrequestreview-2775598123,
the error messages for a 4.1 share consumer could be clearer for the
different cases for when it cannot successfully join a share group.

This PR uses different error messages for the different cases such as
out-of-date cluster or share groups just not enabled.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-13 10:42:40 +01:00
Bolin Lin 6eafe407bd
MINOR: Fix unchecked type warnings in several test classes (#19679)
* In ConsoleShareConsumerTest, add `@SuppressWarnings("unchecked")`
annotation in method shouldUpgradeDeliveryCount
* In ListConsumerGroupOffsetsHandlerTest, add generic parameters to
HashSet constructors
* In TopicsImageTest, add explicit generic type to Collections.EMPTY_MAP
to fix raw type usage

Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-13 14:59:22 +08:00
Alyssa Huang c28f46459a
KAFKA-18345; Prevent livelocked elections (#19658)
At the retry limit binaryExponentialElectionBackoffMs it becomes
statistically likely that the exponential backoff returned
electionBackoffMaxMs. This is an issue as multiple replicas can get
stuck starting elections at the same cadence.

This change fixes that by adding a random jitter to the max election
backoff.
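
The effect of jitter at the cap can be sketched as below. This is an illustration under stated assumptions, not the KRaft implementation: the class name, the method signature, and the choice of a jitter range of 20% of the maximum are all hypothetical.

```java
import java.util.Random;

// Hypothetical illustration; the 20% jitter range is an assumption.
class ElectionBackoffSketch {
    static long backoffMs(int retries, long baseMs, long maxMs, Random random) {
        // Binary exponential backoff: pick one of 2^retries slots at random.
        int exponent = Math.max(0, Math.min(retries, 20));
        long candidate = baseMs * (1 + random.nextInt(1 << exponent));
        if (candidate < maxMs) {
            return candidate;
        }
        // At the cap, subtract a random jitter so that replicas do not all
        // wait exactly maxMs and retry at the same cadence.
        long jitterRange = Math.max(1, maxMs / 5);
        return maxMs - (long) (random.nextDouble() * jitterRange);
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 3; i++) {
            System.out.println(backoffMs(20, 50, 1000, random));
        }
    }
}
```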

Reviewers: José Armando García Sancio <jsancio@apache.org>, TaiJuWu
 <tjwu1217@gmail.com>, Yung <yungyung7654321@gmail.com>
2025-05-12 16:23:18 -04:00
Jonah Hooper 13fa4537f5
KAFKA-18905; Disable idempotent producer to remove test flakiness (#19644)
As a result of KAFKA-18905 the reassign test will often have test
failures which are unrelated to the actual reassignment of partitions.
This failure is mentioned in KAFKA-9199.

Quote from KAFKA-9199:  "This issue popped up in the reassignment system
test. It ultimately caused the test to fail because the producer was
stuck retrying the duplicate batch repeatedly until ultimately giving
up."

Disabling the idempotent producer circumvents this issue and allows the
reassignment system tests to succeed reliably. The reassignment test
still checks that produce batches were not lost.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2025-05-12 15:41:02 -04:00
Andrew Schofield 7b8633e36f
MINOR: Add deprecation of listConsumerGroups to upgrade.html (#19684)
As part of KIP-1043, `Admin.listConsumerGroups()` and variants have been
deprecated. This is because there are now 4 types of group and listing
has been consolidated under `Admin.listGroups()`. This PR adds the
deprecation information to the upgrade documentation.

Reviewers: Lianet Magrans <lmagrans@confluent.io>
2025-05-12 18:08:11 +01:00
ChickenchickenLove 62bec20aef
KAFKA-19242: Fix commit bugs caused by race condition during rebalancing. (#19631)
### Motivation
While investigating “events skipped in group rebalancing”
([spring‑projects/spring‑kafka#3703](https://github.com/spring-projects/spring-kafka/issues/3703)),
I discovered a race condition between
- the main poll/commit thread, and
- the consumer‑coordinator heartbeat thread.

If the main thread enters
`ConsumerCoordinator.sendOffsetCommitRequest()` while the heartbeat
thread is finishing a rebalance (`SyncGroupResponseHandler.handle()`),
the group state transitions in the following order:

```
COMPLETING_REBALANCE  →  (race window)  →  STABLE
```
Because we read the state twice without a lock:
1. `generationIfStable()` returns `null` (state still
`COMPLETING_REBALANCE`),
2. the heartbeat thread flips the state to `STABLE`,
3. the main thread re‑checks with `rebalanceInProgress()` and wrongly
decides that a rebalance is still active,
4. a spurious `CommitFailedException` is returned even though the commit
could succeed.

For more details, please refer to the sequence diagram below.

<img width="1494" alt="image" src="https://github.com/user-attachments/assets/90f19af5-5e2d-4566-aece-ef764df2d89c" />

### Impact
- The exception is semantically wrong: the consumer is in a stable
group, but reports failure.
- Frameworks and applications that rely on the semantics of
`CommitFailedException` and `RetryableCommitException` (for example
`Spring Kafka`) take the wrong code path, which can ultimately skip the
events and break “at‑least‑once” guarantees.

### Fix
We enlarge the synchronized block in
`ConsumerCoordinator.sendOffsetCommitRequest()` so that the consumer
group state is examined atomically with respect to the heartbeat thread.
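
The general idea of reading the state twice under a single lock can be sketched as follows. This is a simplified, hypothetical model and not the `ConsumerCoordinator` code: the class and its methods are invented for illustration, with `isStable()` standing in for the `generationIfStable()` check.

```java
// Hypothetical model of the state checks; not the ConsumerCoordinator code.
class GroupStateSketch {
    enum State { COMPLETING_REBALANCE, STABLE }

    private State state = State.COMPLETING_REBALANCE;

    // Called by the heartbeat thread when the rebalance completes.
    synchronized void completeRebalance() {
        state = State.STABLE;
    }

    synchronized boolean isStable() {
        return state == State.STABLE;
    }

    synchronized boolean rebalanceInProgress() {
        return state != State.STABLE;
    }

    // Both reads happen while holding the same (reentrant) lock, so the
    // heartbeat thread cannot flip the state between them.
    synchronized boolean[] consistentSnapshot() {
        return new boolean[] { isStable(), rebalanceInProgress() };
    }

    public static void main(String[] args) {
        GroupStateSketch group = new GroupStateSketch();
        boolean[] snapshot = group.consistentSnapshot();
        System.out.println(snapshot[0] + " " + snapshot[1]); // false true
    }
}
```

Without the enclosing `synchronized` method, the two calls would each take and release the lock, leaving a window in which the state can change between them.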

### Jira
https://issues.apache.org/jira/browse/KAFKA-19242

https://github.com/spring-projects/spring-kafka/issues/3703

Signed-off-by: chickenchickenlove <ojt90902@naver.com>

Reviewers: David Jacot <david.jacot@gmail.com>
2025-05-12 17:01:29 +02:00
Sean Quah eb3714f022
KAFKA-19160;KAFKA-19164; Improve performance of fetching stable offsets (#19497)
When fetching stable offsets in the group coordinator, we iterate over
all requested partitions. For each partition, we iterate over the
group's ongoing transactions to check if there is a pending
transactional offset commit for that partition.

This can get slow when there are a large number of partitions and a
large number of pending transactions. Instead, maintain a list of
pending transactions per partition to speed up lookups.
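
A per-partition index of this kind might look like the sketch below. The class and method names are hypothetical, and the partition is represented as a plain string for brevity; the real coordinator data structures differ.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not the group coordinator's data structures.
class PendingTxnIndexSketch {
    // partition -> producer ids with a pending transactional offset commit
    private final Map<String, Set<Long>> pendingByPartition = new HashMap<>();

    void addPending(long producerId, String partition) {
        pendingByPartition.computeIfAbsent(partition, k -> new HashSet<>()).add(producerId);
    }

    // Called when the transaction commits or aborts.
    void complete(long producerId, String partition) {
        Set<Long> producers = pendingByPartition.get(partition);
        if (producers != null) {
            producers.remove(producerId);
            if (producers.isEmpty()) {
                pendingByPartition.remove(partition);
            }
        }
    }

    // Constant-time lookup instead of scanning every ongoing transaction.
    boolean hasPending(String partition) {
        return pendingByPartition.containsKey(partition);
    }

    public static void main(String[] args) {
        PendingTxnIndexSketch index = new PendingTxnIndexSketch();
        index.addPending(1L, "topic-0");
        System.out.println(index.hasPending("topic-0")); // true
        index.complete(1L, "topic-0");
        System.out.println(index.hasPending("topic-0")); // false
    }
}
```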

Reviewers: Shaan, Dongnuo Lyu <dlyu@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
2025-05-12 00:32:17 -07:00
Matthias J. Sax b66729e231
MINOR: fix HTML markup (#19676)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-05-11 16:20:25 -07:00
Ming-Yen Chung 57ae6d6706
KAFKA-18695 Remove quorum=kraft and kip932 from all integration tests (#19633)
Currently, the quorum uses kraft by default, so there's no need to
specify it explicitly.

For kip932 and isShareGroupTest, they are no longer used after #19542.

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
 <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-12 01:28:30 +08:00
Kuan-Po Tseng 54fd1361e5
KAFKA-19264 Remove fallback for thread pool sizes in RemoteLogManagerConfig (#19673)
The fallback mechanism for `remote.log.manager.copier.thread.pool.size`
and `remote.log.manager.expiration.thread.pool.size` defaulting to
`remote.log.manager.thread.pool.size` was introduced in KIP-950. This
approach was abandoned in KIP-1030, where default values were changed
from -1 to 10, and a configuration validator enforcing a minimum value
of 1 was added. As a result, this commit removes the fallback mechanism
from `RemoteLogManagerConfig.java` to align with the new defaults and
validation.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-05-11 23:48:45 +08:00
Yunchi Pang f588fa0643
MINOR: Move TxnTransitMetadata to transaction-coordinator (#19662)
Migrates the `TxnTransitMetadata` class from Scala to Java and moves it
to the `transaction-coordinator` module.

Reviewers: PoAn Yang <payang@apache.org>, Nick Guo
 <lansg0504@gmail.com>, Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-05-11 22:36:47 +08:00
Nick Guo 707a44a6cb
KAFKA-19068 Eliminate the duplicate type check in creating ControlRecord (#19346)
jira: https://issues.apache.org/jira/browse/KAFKA-19068

`RecordsIterator#decodeControlRecord` does the type check, and then the
`ControlRecord` constructor does it again.

We should add a static method to `ControlRecord` that creates a
`ControlRecord` with the type check, and the `ControlRecord` constructor
should be made private to ensure all instances are created by the static
method.
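
The pattern described above (a private constructor behind a validating static factory) can be sketched as follows. The class, enum values, and message classes here are hypothetical stand-ins, not the real `ControlRecord` types.

```java
// Hypothetical illustration of the pattern; not the real ControlRecord class.
class ControlRecordSketch {
    enum Type {
        LEADER_CHANGE(String.class),
        SNAPSHOT_HEADER(Integer.class);

        final Class<?> messageClass;

        Type(Class<?> messageClass) {
            this.messageClass = messageClass;
        }
    }

    private final Type type;
    private final Object message;

    // Private: instances can only be created through the checked factory.
    private ControlRecordSketch(Type type, Object message) {
        this.type = type;
        this.message = message;
    }

    // The single place where the type check happens.
    static ControlRecordSketch of(Type type, Object message) {
        if (!type.messageClass.isInstance(message)) {
            throw new IllegalArgumentException(
                "Expected " + type.messageClass.getSimpleName() + " for " + type);
        }
        return new ControlRecordSketch(type, message);
    }

    Type type() {
        return type;
    }

    Object message() {
        return message;
    }

    public static void main(String[] args) {
        System.out.println(of(Type.LEADER_CHANGE, "leader-1").type()); // LEADER_CHANGE
    }
}
```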

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-05-11 00:07:00 +08:00
PoAn Yang 61cb33f347
KAFKA-19109 Don't print null in kafka-metadata-quorum describe status (#19543)
If the directory id is `Uuid.ZERO_UUID`, the command doesn't print the result.

Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
 <frankvicky@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-10 23:48:05 +08:00
xijiu 3696c49788
KAFKA-19220 Add tests to ensure the internal configs don't return by public APIs by default (#19650)
Add tests to check whether the results returned by the API
`createTopics` and `describeConfigs` contain internal configurations.

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
 <s7133700@gmail.com>, TengYao Chi <frankvicky@apache.org>, TaiJuWu
 <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-10 23:13:58 +08:00
Hong-Yi Chen a7eae28a67
MINOR: Replaced internal KafkaConfig field in TransactionLogConfig (#19482)
Retaining a reference to `AbstractConfig` introduced coupling and
potential inconsistencies with dynamic config updates. This change
simplifies `TransactionLogConfig` into a POJO by removing the
internal `AbstractConfig` field, and aligns with feedback from
#19439.

Reviewers: PoAn Yang <payang@apache.org>, TengYao Chi
 <frankvicky@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-10 23:06:18 +08:00
Stanislav Kozlovski 0bc8d0c962
MINOR: Add documentation about KIP-405 remote reads serving just one partition per FetchRequest (#19336)
[As discussed in the mailing
list](https://lists.apache.org/thread/m03mpkm93737kk6d1nd6fbv9wdgsrhv9),
the broker only fetches remote data for ONE partition in a given
FetchRequest. In other words, if a consumer sends a FetchRequest
requesting 50 topic-partitions, and each partition's requested offset is
not stored locally - the broker will fetch and respond with just one
partition's worth of data from the remote store, and the rest will be
empty.

Given that our defaults are 50 MiB for the total fetch response and
1 MiB per partition, this can limit throughput. This patch documents the
behavior in 3 configs: `fetch.max.bytes`, `max.partition.fetch.bytes` and
`remote.fetch.max.wait.ms`.

Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash
 <kamal.chandraprakash@gmail.com>, Satish Duggana <satishd@apache.org>
2025-05-10 16:48:55 +05:30
Alyssa Huang 042be5b9ac
MINOR: Fix some Request toString methods (#19655)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-05-09 23:42:34 -07:00
Chia-Ping Tsai 064afe2c65
MINOR: add xijiu from asf.yaml in order to resend invitation (#19663)
@xijiu's invitation has timed out, so we have to remove the name, then
re-add it in a new commit.

see https://issues.apache.org/jira/browse/INFRA-26796

Reviewers: PoAn Yang <payang@apache.org>, xijiu <422766572@qq.com>,
 Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang <s7133700@gmail.com>,
 TengYao Chi <frankvicky@apache.org>
2025-05-10 05:42:54 +08:00
Matthias J. Sax 0b81d6c780
MINOR: avoid double brace initialization (#19667)
Reviewers: Bill Bejeck <bill@confluent.io>
2025-05-09 11:52:01 -07:00
Shivsundar R 58c08441d1
KAFKA-19229: Ignore background errors while closing share consumers. (Fix flaky test) (#19647)
- A couple of newly added tests were found to be flaky in
`AuthorizerIntegrationTest.scala`.
- `testShareGroupDescribeWithGroupDescribeAndTopicDescribeAcl` and
`testShareGroupDescribeWithoutGroupDescribeAcl`. These tests pass
locally, so could not replicate the failure.
- But logs from develocity indicated that the test fails under the
following condition:
   When the background error event arrives after the consumer has
unsubscribed, the event is processed in the
`handleCompletedAcknowledgements` method and the exception from the
event is thrown, preventing `close()` from completing.

- We need to handle this race condition where we might get the
background event after unsubscribe and before processing the callbacks.
- PR fixes this by ignoring the exceptions in the background queue when
the `handleCompletedAcknowledgements` method is called during `close()`.
This ensures `close()` completes successfully.
- Have added a unit test which mimics the race condition as well.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-09 11:20:09 +01:00
Andrew Schofield 70c0aca4b7
KAFKA-17897: Deprecate Admin.listConsumerGroups [2/N] (#19508)
Admin.listConsumerGroups() was able to use the early versions of
ListGroups RPC with the version used dependent upon the filters the user
specified. Admin.listGroups(ListGroupsOptions.forConsumerGroups())
inadvertently required ListGroups v5 because it always set a types
filter. This patch handles the UnsupportedVersionException and winds
back the complexity of the request unless the user has specified filters
which demand a higher version.

It also adds ListGroupsOptions.forShareGroups() and forStreamsGroups().
The usability of Admin.listGroups() is much improved as a result.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, PoAn Yang
 <payang@apache.org>
2025-05-09 08:38:16 +01:00
ShihYuan Lin 1ccaddaa70
KAFKA-19209: Clarify index.interval.bytes impact on offset and time index (#19657)
Update docs to note index.interval.bytes sets entry frequency for offset index and, conditionally, time index. Improve clarity and readability of index.interval.bytes description.

Reviewers: Luke Chen <showuon@gmail.com>
2025-05-09 09:48:55 +08:00
Manoj b5c468fd7c
KAFKA-18115; Fix for loading big files while performing load tests (#18391)
When performing perf tests, we can specify a payload file using the
"--payloadFile" flag. The entire file is loaded into a String and split
using the delimiter. However, if the file is large, this may result in
a NegativeArraySizeException.

This change moves the file loading logic to Scanner, which doesn't have
this issue.
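
A minimal sketch of delimiter-based reading with `java.util.Scanner` is shown below. The class and method names are hypothetical and this is not the perf tool's actual code; for a real file, an input stream from `Files.newInputStream(path)` could be passed instead of the in-memory stream used here.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical illustration; not the perf tool's actual code.
class PayloadReaderSketch {
    // Reads payloads one token at a time, so the whole input never has to
    // fit into a single String.
    static List<byte[]> readPayloads(InputStream in, String delimiter) {
        List<byte[]> payloads = new ArrayList<>();
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            // Quote the delimiter so it is matched literally, not as a regex.
            scanner.useDelimiter(Pattern.quote(delimiter));
            while (scanner.hasNext()) {
                payloads.add(scanner.next().getBytes(StandardCharsets.UTF_8));
            }
        }
        return payloads;
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
            "first|second|third".getBytes(StandardCharsets.UTF_8));
        System.out.println(readPayloads(in, "|").size()); // 3
    }
}
```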

Reviewers: José Armando García Sancio <jsancio@apache.org>, Ken Huang
 <s7133700@gmail.com>, Zhe Guang <zheguang.zhao@alumni.brown.edu>
2025-05-08 17:08:36 -04:00
Chia-Ping Tsai 99ecd5ca08
MINOR: remove xijiu from asf.yaml in order to resend invitation (#19660)
@xijiu's invitation has timed out, so we have to remove the name, then
re-add it in a new commit.

see https://issues.apache.org/jira/browse/INFRA-26796

Reviewers: Chih-Yuan Chien <joshua2519@gmail.com>, Kuan-Po Tseng
 <brandboat@gmail.com>, PoAn Yang <payang@apache.org>, yunchi
 <yunchipang@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Hong-Yi Chen
 <apalan60@gmail.com>, Bolin Lin <linbolin1230@gmail.com>, Shih-Yuan Lin
 <shmily7829@gmail.com>, Mirai1129 <minecraftmiku831@gmail.com>
2025-05-09 02:07:14 +08:00
Uladzislau Blok 0076b65f99
KAFKA-19182 Move SchedulerTest to server module (#19608)
This PR moves SchedulerTest to the server module and rewrites it in Java.

Please also check the updated import control config!

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-05-09 00:02:38 +08:00
PoAn Yang 9e785cee8d
KAFKA-19087 Move TransactionState to transaction-coordinator module (#19568)
Move TransactionState to the transaction-coordinator module and rewrite
it in Java.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-05-08 23:51:51 +08:00
David Jacot 98e535b524
MINOR: Simplify OffsetFetchResponse (#19642)
While working on https://github.com/apache/kafka/pull/19515, I came to
the conclusion that the OffsetFetchResponse is quite messy and overall
too complicated. This patch rationalizes the constructors.
OffsetFetchResponse has a single constructor accepting the
OffsetFetchResponseData. A builder is introduced to handle the down
conversion. This will also simplify adding the topic ids. All the
changes are mechanical, replacing data structures by others.

Reviewers: Lianet Magrans <lmagrans@confluent.io>
2025-05-08 14:57:45 +02:00
Apoorv Mittal 2dd6126b5d
KAFKA-18855 Slice API for MemoryRecords (#19581)
The PR adds a `slice` API in `Records.java` and its implementation in
`MemoryRecords`. With the addition of ShareFetch and its support for
reading from TieredStorage, ShareFetch might acquire a subset of the
fetched batches while TieredStorage emits MemoryRecords. Hence a slice
API is needed for MemoryRecords as well, to limit the bytes transferred
(if a subset of batches is acquired).

MemoryRecords are sliced using the `duplicate` and `slice` APIs of
ByteBuffer, which are backed by the original buffer itself, so no copy
is created; only the position, limit and offset are changed as per the
new position and length.
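
The no-copy behaviour of `duplicate` and `slice` can be illustrated with plain `java.nio.ByteBuffer`. The helper below is a hypothetical sketch, not the `MemoryRecords` implementation; it assumes `position` is an absolute index into the buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not the MemoryRecords implementation.
class BufferSliceSketch {
    // Returns a view of `size` bytes starting at absolute index `position`.
    static ByteBuffer slice(ByteBuffer buffer, int position, int size) {
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer is left untouched.
        ByteBuffer view = buffer.duplicate();
        view.limit(position + size);
        view.position(position);
        // slice() creates a buffer whose index 0 maps to `position`.
        return view.slice();
    }

    public static void main(String[] args) {
        ByteBuffer original = ByteBuffer.wrap(new byte[] {0, 1, 2, 3, 4, 5, 6, 7});
        ByteBuffer sliced = slice(original, 2, 3);
        original.put(2, (byte) 42); // write through the original buffer
        System.out.println(sliced.remaining()); // 3
        System.out.println(sliced.get(0)); // 42, since the memory is shared
    }
}
```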

Reviewers: Andrew Schofield <aschofield@confluent.io>, Jun Rao
 <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-08 14:02:25 +08:00
Calvin Liu 3094ce2c20
KAFKA-19212: Correct the unclean leader election metric calculation (#19590)
The current ElectionWasClean checks if the new leader is in the previous
ISR. However, there is a corner case in the partition reassignment.  The
partition reassignment can change the partition replicas. If the new
preferred leader (the first one in the new replicas) is the last one to
join ISR, this preferred leader will be elected in the same partition
change.

For example:  In the previous state, the partition is  Leader: 0,
Replicas (2,1,0), ISR (1,0), Adding(2), removing(0).  Then replica 2
joins the ISR. The new partition would be like:  Leader: 2, Replicas
(2,1), ISR(1,2).  The new leader 2 is not in the previous ISR (1,0) but
it is still a clean election.

Reviewers: Jun Rao <junrao@gmail.com>
2025-05-07 13:26:53 -07:00
Lianet Magrans 67b46fec15
MINOR: introduce structure to keep member assignment with topic Ids (#19645)
- Add a new data structure to wrap the member assignment (containing
topic IDs, names and partitions), to easily access the data as needed.
This will be used in a following PR to integrate assignment with topic
IDs into the subscription state.
- Improve logging on the client assignment/reconciliation path

No changes in logic.

Reviewers: TengYao Chi <frankvicky@apache.org>, Andrew Schofield
 <aschofield@confluent.io>
2025-05-07 13:57:56 -04:00
Kirk True d3707fc815
KAFKA-19214: Clean up use of Optionals in RequestManagers.entries() (#19609)
Change:

`public List<Optional<? extends RequestManager>> entries();`

to:

`public List<RequestManager> entries();`

and clean up the callers.
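
The shape of this change can be sketched as below: absent managers are filtered out once, so callers iterate a plain list. The `RequestManager` interface here is a hypothetical stand-in with a single invented method, and the helper is an illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration; RequestManager here is a stand-in interface.
class EntriesSketch {
    interface RequestManager {
        String name();
    }

    // Unwraps the optionals once so callers never have to.
    static List<RequestManager> entries(List<Optional<? extends RequestManager>> managers) {
        List<RequestManager> result = new ArrayList<>();
        for (Optional<? extends RequestManager> manager : managers) {
            if (manager.isPresent()) {
                result.add(manager.get());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RequestManager heartbeat = () -> "heartbeat";
        List<Optional<? extends RequestManager>> raw = new ArrayList<>();
        raw.add(Optional.of(heartbeat));
        raw.add(Optional.empty());
        System.out.println(entries(raw).size()); // 1
    }
}
```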

Reviewers: TengYao Chi <kitingiao@gmail.com>, Andrew Schofield
 <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-07 17:18:12 +01:00
Kevin Wu 6cb6aa2030
MINOR; Add `--standalone --ignore-formatted` formatter test (#19643)
This PR adds an additional test case to `FormatterTest` that checks that
formatting with `--standalone` and then formatting again with
`--standalone --ignore-formatted` is indeed a no-op.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2025-05-07 10:41:18 -04:00
Chirag Wadhwa f3a4a1b185
KAFKA-19241: Updated tests in ShareFetchAcknowledgeRequestTest to reuse the socket for subsequent requests (#19640)
Currently in the tests in ShareFetchAcknowledgeRequestTest, subsequent
share fetch / share acknowledge requests create a new socket every time,
even when the requests are sent by the same member. In reality, a single
share consumer client will reuse the same socket for all the share
related requests in its lifetime. This PR changes the behaviour in the
tests to align with reality and reuse the same socket for all requests
by the same share group member.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-07 14:00:41 +01:00
yunchi d034268312
MINOR: Remove ConstantBrokerOrActiveKController (#19654)
`ConstantBrokerOrActiveKController` was introduced in #14399, to provide
a mechanism for selecting the least loaded broker or the active
controller when using `bootstrap.controllers`.

Usage was removed in #18002, after `alterConfigs` was deprecated in
Kafka 2.4.0.

Reviewers: PoAn Yang <payang@apache.org>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>, Ken Huang <s7133700@gmail.com>, TengYao Chi
 <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-07 20:23:29 +08:00
Abhinav Dixit 33abb655eb
KAFKA-19215: Handle share partition fetch lock cleanly using tokens (#19598)
### About
Added code to handle the share partition fetch lock cleanly in
`DelayedShareFetch`, to avoid a member incorrectly releasing a share
partition's fetch lock.
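
One way a token can guard a lock release is sketched below. This is an assumption about the general technique, not the `DelayedShareFetch` code: the class and method names are hypothetical, and a `UUID` is used as the token.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of a token-guarded lock; not the Kafka code.
class FetchLockSketch {
    // Holds the token of the current owner, or null when the lock is free.
    private final AtomicReference<UUID> owner = new AtomicReference<>();

    // Returns a token when the lock was acquired, or null when it is held.
    UUID tryAcquire() {
        UUID token = UUID.randomUUID();
        return owner.compareAndSet(null, token) ? token : null;
    }

    // Only the holder of the matching token object can release the lock.
    // compareAndSet compares references, so the exact token returned by
    // tryAcquire() must be passed back.
    boolean release(UUID token) {
        return token != null && owner.compareAndSet(token, null);
    }

    public static void main(String[] args) {
        FetchLockSketch lock = new FetchLockSketch();
        UUID token = lock.tryAcquire();
        System.out.println(lock.release(UUID.randomUUID())); // false
        System.out.println(lock.release(token)); // true
    }
}
```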

### Testing
The code has been tested with the help of unit tests and integration
tests.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-07 11:13:53 +01:00
Lucas Brutschy 3f465fc1b6
KAFKA-19202: Enable KIP-1071 in streams_standby_replica_test.py (#19625)
New system test for KIP-1071.

Standby replicas need to be enabled via `kafka-configs.sh`.

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax
 <matthias@confluent.io>
2025-05-07 09:43:11 +02:00
Lan Ding e1da318722
MINOR: add boundary IT for delivery count (#19649)
see
https://github.com/apache/kafka/pull/19430#pullrequestreview-2809619176
Add boundary IT for delivery count.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-05-06 22:05:02 +01:00
Kevin Wu 7953092108
MINOR: support ipv6 in ducker-ak (#19537)
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Kirk True <kirk@kirktrue.pro>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ian McDonald <ian_mcdonald@rocketmail.com>
2025-05-06 13:55:18 -07:00
José Armando García Sancio 2df14b1190
MINOR; Log message for unexpected buffer allocation (#19596)
Log a message when reading a batch that is larger than the currently
allocated buffer.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>, PoAn Yang
 <payang@apache.org>
2025-05-06 12:01:49 -04:00
Andrew Schofield 7d027a4d83
KAFKA-19218: Add missing leader epoch to share group state summary response (#19602)
When the persister is responding to a read share-group state summary
request, it has no way of including the leader epoch in its response,
even though it has the information to hand. This means that the leader
epoch information is not initialised in the admin client operation to
list share group offsets, and this then means that the information
cannot be displayed in kafka-share-groups.sh.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Sushant Mahajan
 <smahajan@confluent.io>
2025-05-06 14:53:12 +01:00
Dmitry Werner 0810650da1
MINOR: Small cleanups in clients tests (#19634)
- Removed unused fields and methods in clients tests
- Fixed IDEA code inspection warnings

Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
 <payang@apache.org>, Andrew Schofield <aschofield@confluent.io>,
 Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
 <frankvicky@apache.org>
2025-05-06 20:19:21 +08:00
PoAn Yang 424e7251d6
KAFKA-19207 Move ForwardingManagerMetrics and ForwardingManagerMetricsTest to server module (#19574)
1. Move `ForwardingManagerMetrics` and `ForwardingManagerMetricsTest` to
server module.
2. Rewrite them in Java.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-05-06 20:03:17 +08:00
yunchi 4e77466f6a
KAFKA-19170 Move MetricsDuringTopicCreationDeletionTest to client-integration-tests module (#19528)
Rewrite `MetricsDuringTopicCreationDeletionTest` to use the `ClusterTest`
infra and move it to the clients-integration-tests module.

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
<s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
2025-05-06 19:57:16 +08:00