Commit Graph

16313 Commits

Author SHA1 Message Date
Hong-Yi Chen a9bce0647f
KAFKA-19535 add integration tests for DescribeProducersOptions#brokerId (#20420)
Add tests for producer state listing with, without, and invalid
brokerId.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-09-04 03:15:21 +08:00
Nick Guo ef10a52a52
KAFKA-19011 Improve EndToEndLatency Tool with argument parser and message key/header support (#20301)
jira: https://issues.apache.org/jira/browse/KAFKA-19011

kip: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1172%3A+Improve+EndToEndLatency+tool

This PR improves the usability and maintainability of the
`kafka-e2e-latency.sh` tool:

- Replaces fixed-index argument parsing with a proper argument parser
(joptsimple)
- Adds support for configuring:
    - --record-key-size: size of the message key
    - --num-headers: number of headers per message
    - --record-header-key-size: size of each header key
    - --record-header-size: size of each header value
- Renames existing arguments to align with Kafka CLI conventions:
    - broker_list → bootstrap-server
    - num_messages → num-records
    - message_size_bytes → record-size
    - properties_file → command-config

Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang
 <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-09-04 02:29:53 +08:00
Lucas Brutschy 6247fd9eb3
KAFKA-19478 [3/N]: Use heaps to discover the least loaded process (#20172)
The original implementation uses a linear search to find the least
loaded process in O(n). We can replace this with look-ups in a heap,
which take O(log(n)), as described below.

Active tasks: For active tasks, we can do exactly the same assignment as
in the original algorithm by first building a heap (by load) of all
processes. When we assign a task, we pick the head off the heap, assign
the task to it, update the load, and re-insert it into the heap in
O(log(n)).
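The active-task step described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `ProcessLoad` and `ActiveTaskAssignmentSketch` are hypothetical names, load is assumed to be the plain task count, and ties are broken by process id only to keep the result deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for a Streams process; not the real Kafka class.
final class ProcessLoad {
    final String id;
    final List<String> tasks = new ArrayList<>();

    ProcessLoad(String id) {
        this.id = id;
    }

    // Assumption: load is simply the number of assigned tasks.
    int load() {
        return tasks.size();
    }
}

final class ActiveTaskAssignmentSketch {
    // Assigns every task to the currently least loaded process.
    // Assumes processIds is non-empty.
    static List<ProcessLoad> assign(List<String> processIds, List<String> taskIds) {
        // Min-heap ordered by load, with ties broken by id.
        PriorityQueue<ProcessLoad> heap = new PriorityQueue<>(
            Comparator.comparingInt(ProcessLoad::load)
                .thenComparing((ProcessLoad p) -> p.id));
        List<ProcessLoad> all = new ArrayList<>();
        for (String id : processIds) {
            ProcessLoad p = new ProcessLoad(id);
            all.add(p);
            heap.add(p);
        }
        for (String task : taskIds) {
            ProcessLoad least = heap.poll(); // O(log n): take the head off the heap
            least.tasks.add(task);           // load changes only while off the heap
            heap.add(least);                 // O(log n): re-insert with the new load
        }
        return all;
    }
}
```

The process is mutated only between `poll()` and `add()`, so the heap never holds an element whose ordering key has changed underneath it.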

Standby tasks: For standby tasks, we cannot do this optimization
directly, because of the order in which we assign tasks:

1. We first try to assign task A to a process that previously owned A.
2. If we did not find such a process, we assign A to the least loaded
node.
3. We now try to assign task B to a process that previously owned B
4. If we did not find such a process, we assign B to the least loaded
node
   ...

The problem is that we cannot efficiently keep a heap (by load)
throughout this process, because finding and removing the process that
previously owned A (and B and…) in the heap is O(n). We therefore need
to change the order of evaluation to be able to use a heap:

1. Try to assign all tasks A, B.. to a process that previously owned the
task
2. Build a heap.
3. Assign all remaining tasks to the least-loaded process that does not
yet own the task. Since at most NumStandbyReplicas already own the task,
we can do it by removing up to NumStandbyReplicas from the top of the
heap in O(log(n)), so we get O(log(NumProcesses)*NumStandbyReplicas).
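The reordered standby assignment (sticky pass, then heap, then least-loaded non-owner) could look roughly like the sketch below. All names here (`StandbyAssignmentSketch`, `Proc`, `previousOwners`) are hypothetical, load is assumed to be the number of standby tasks, and the real assignor tracks considerably more state.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.TreeSet;

final class StandbyAssignmentSketch {
    // Hypothetical stand-in for a Streams process.
    static final class Proc {
        final String id;
        final Set<String> standbys = new TreeSet<>();
        Proc(String id) { this.id = id; }
        int load() { return standbys.size(); }
    }

    static Map<String, Proc> assign(List<String> processIds,
                                    List<String> tasks,
                                    Map<String, List<String>> previousOwners,
                                    int numStandbyReplicas) {
        Map<String, Proc> procs = new LinkedHashMap<>();
        for (String id : processIds) {
            procs.put(id, new Proc(id));
        }

        // Step 1: sticky pass. No heap exists yet, so loads can change freely.
        Map<String, Integer> placed = new HashMap<>();
        for (String task : tasks) {
            int count = 0;
            for (String owner : previousOwners.getOrDefault(task, List.of())) {
                Proc p = procs.get(owner);
                if (count < numStandbyReplicas && p != null && p.standbys.add(task)) {
                    count++;
                }
            }
            placed.put(task, count);
        }

        // Step 2: build the heap once, after the sticky pass has settled the loads.
        PriorityQueue<Proc> heap = new PriorityQueue<>(
            Comparator.comparingInt(Proc::load).thenComparing((Proc p) -> p.id));
        heap.addAll(procs.values());

        // Step 3: place the remaining replicas on the least loaded non-owner.
        for (String task : tasks) {
            for (int n = placed.get(task); n < numStandbyReplicas; n++) {
                List<Proc> popped = new ArrayList<>();
                while (!heap.isEmpty()) {
                    Proc head = heap.poll();
                    popped.add(head);
                    // add() returns false if this process already owns the task,
                    // in which case we keep popping.
                    if (head.standbys.add(task)) {
                        break;
                    }
                }
                heap.addAll(popped); // re-insert everything that was removed
            }
        }
        return procs;
    }
}
```

Because at most `numStandbyReplicas` processes can already own a given task, the inner loop pops only a bounded number of owners before it finds a candidate.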

Note that the change in order changes the resulting standby assignments
(although this difference does not show up in the existing unit tests).
I would argue that the new order of assignment will actually yield
better assignments, since the assignment will be more sticky, which has
the potential to reduce the amount of store we have to restore from the
changelog topic after assignments.

In our worst-performing benchmark, this improves the runtime by ~107x.

Reviewers: Bill Bejeck <bbejeck@apache.org>
2025-09-03 17:13:01 +02:00
Andrew Schofield 4b9075b506
KAFKA-19653: Improve metavariable names in usage messages (#20438)
This trivial PR improves the so-called metavariable names in the usage
messages of the verifiable producer/consumer command-line tools. These
are the names of the replacement variables that appear solely in the
usage messages.

Verifiable producer (before):
```
usage: verifiable-producer [-h] --topic TOPIC
    [--max-messages MAX-MESSAGES] [--throughput THROUGHPUT]
    [--acks ACKS] [--producer.config CONFIG_FILE]
    [--message-create-time CREATETIME] [--value-prefix VALUE-PREFIX]
    [--repeating-keys REPEATING-KEYS] [--command-config CONFIG_FILE]
    --bootstrap-server HOST1:PORT1[,HOST2:PORT2[...]]
```

(after)
```
usage: verifiable-producer [-h] --topic TOPIC
    [--max-messages MAX-MESSAGES] [--throughput THROUGHPUT]
    [--acks ACKS] [--producer.config CONFIG-FILE]
    [--message-create-time CREATE-TIME] [--value-prefix VALUE-PREFIX]
    [--repeating-keys REPEATING-KEYS] [--command-config CONFIG-FILE]
    --bootstrap-server HOST1:PORT1[,HOST2:PORT2[...]]
```

Verifiable consumer (before):
```
usage: verifiable-consumer [-h] --topic TOPIC
    [--group-protocol GROUP_PROTOCOL]
    [--group-remote-assignor GROUP_REMOTE_ASSIGNOR]
    --group-id GROUP_ID
    [--group-instance-id GROUP_INSTANCE_ID]
    [--max-messages MAX-MESSAGES]
    [--session-timeout TIMEOUT_MS] [--verbose]
    [--enable-autocommit] [--reset-policy RESETPOLICY]
    [--assignment-strategy ASSIGNMENTSTRATEGY]
    [--consumer.config CONFIG_FILE] [--command-config CONFIG_FILE]
    --bootstrap-server HOST1:PORT1[,HOST2:PORT2[...]]
```

(after)
```
usage: verifiable-consumer [-h] --topic TOPIC
    [--group-protocol GROUP-PROTOCOL]
    [--group-remote-assignor GROUP-REMOTE-ASSIGNOR]
    --group-id GROUP-ID
    [--group-instance-id GROUP-INSTANCE-ID]
    [--max-messages MAX-MESSAGES]
    [--session-timeout TIMEOUT-MS] [--verbose]
    [--enable-autocommit] [--reset-policy RESET-POLICY]
    [--assignment-strategy ASSIGNMENT-STRATEGY]
    [--consumer.config CONFIG-FILE] [--command-config CONFIG-FILE]
    --bootstrap-server HOST1:PORT1[,HOST2:PORT2[...]]
```

Verifiable share consumer (before):
```
usage: verifiable-share-consumer
       [-h] --topic TOPIC --group-id GROUP_ID
       [--max-messages MAX-MESSAGES] [--verbose]
       [--acknowledgement-mode ACKNOWLEDGEMENTMODE]
       [--offset-reset-strategy OFFSETRESETSTRATEGY]
       [--command-config CONFIG_FILE]
       --bootstrap-server HOST1:PORT1[,HOST2:PORT2[...]]
```

(after):
```
usage: verifiable-share-consumer
       [-h] --topic TOPIC --group-id GROUP-ID
       [--max-messages MAX-MESSAGES] [--verbose]
       [--acknowledgement-mode ACKNOWLEDGEMENT-MODE]
       [--offset-reset-strategy OFFSET-RESET-STRATEGY]
       [--command-config CONFIG-FILE]
       --bootstrap-server HOST1:PORT1[,HOST2:PORT2[...]]
```

Reviewers: Kirk True <kirk@kirktrue.pro>, Ken Huang
 <s7133700@gmail.com>, Lianet Magrans <lmagrans@confluent.io>
2025-09-03 15:38:42 +01:00
Shivsundar R d226b43597
KAFKA-18220: Refactor AsyncConsumerMetrics to not extend KafkaConsumerMetrics (#20283)
*What*
https://issues.apache.org/jira/browse/KAFKA-18220

- Currently, `AsyncConsumerMetrics` extends `KafkaConsumerMetrics`, but
is being used both by `AsyncKafkaConsumer` and `ShareConsumerImpl`.

- `ShareConsumerImpl` only needs the async consumer metrics (the metrics
associated with the new consumer threading model).
- This needs to be fixed: `KafkaConsumerMetrics` is unnecessarily a
parent class for `ShareConsumer` metrics.

Fix :
- In this PR, we have removed the dependency of `AsyncConsumerMetrics`
on `KafkaConsumerMetrics` and made it an independent class which both
`AsyncKafkaConsumer` and `ShareConsumerImpl` will use.
- The "`asyncConsumerMetrics`" field represents the metrics associated
with the new consumer threading model (like application event queue
size, background queue size, etc).
- The "`kafkaConsumerMetrics`" and "`kafkaShareConsumerMetrics`" fields
denote the actual consumer metrics for `KafkaConsumer` and
`KafkaShareConsumer` respectively.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-09-03 12:35:55 +01:00
dependabot[bot] 8448c288fa
MINOR: Bump requests from 2.31.0 to 2.32.4 in /tests (#19940)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2025-09-03 12:29:18 +02:00
Mickael Maison dd52058466
MINOR: Cleanups in Connect (#20077)
A few cleanups including Java 17 syntax, collections and assertEquals() order

Reviewers: Luke Chen <showuon@gmail.com>, Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>
2025-09-03 11:11:57 +02:00
Kirk True 4271fd8c8b
KAFKA-19564: Close Consumer in ConsumerPerformance only after metrics displayed (#20267)
Ensure that metrics are retrieved and displayed (when requested) before
`Consumer.close()` is called. This is important because metrics are
technically supposed to be removed on `Consumer.close()`, which means
retrieving them _after_ `close()` would yield an empty map.
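The ordering concern can be illustrated with a small stand-in. `FakeMeteredConsumer` below is a hypothetical class that only models "metrics are removed on close"; it is not the Kafka `Consumer` interface, and the metric name and value are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a consumer whose metrics are removed on close().
final class FakeMeteredConsumer implements AutoCloseable {
    private final Map<String, Double> metrics = new HashMap<>();

    FakeMeteredConsumer() {
        metrics.put("records-consumed-total", 42.0); // illustrative value
    }

    Map<String, Double> metrics() {
        return Map.copyOf(metrics); // snapshot of whatever is registered right now
    }

    @Override
    public void close() {
        metrics.clear(); // models metric removal on close
    }
}

final class MetricsBeforeCloseSketch {
    // Read the metrics first, then close: the snapshot keeps its contents.
    static Map<String, Double> readThenClose(FakeMeteredConsumer consumer) {
        Map<String, Double> snapshot = consumer.metrics();
        consumer.close();
        return snapshot;
    }

    // Close first, then read: nothing is left to report.
    static Map<String, Double> closeThenRead(FakeMeteredConsumer consumer) {
        consumer.close();
        return consumer.metrics();
    }
}
```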

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-09-03 09:25:21 +01:00
Federico Valeri 2ba30cc466
KAFKA-19574: Improve producer and consumer config files (#20302)
This is an attempt at improving the client configuration files. We now
have sections and comments similar to the other properties files.

Reviewers: Kirk True <ktrue@confluent.io>, Luke Chen <showuon@gmail.com>

---------

Signed-off-by: Federico Valeri <fedevaleri@gmail.com>
2025-09-02 11:24:35 +09:00
Matthias J. Sax 342a8e6773
MINOR: suppress build warning (#20424)
Suppress build warning.

Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-09-01 11:12:11 -07:00
Lan Ding 4f2114a49e
KAFKA-19645 add a lower bound to num.replica.fetchers (#20414)
Add a lower bound to num.replica.fetchers.

Reviewers: PoAn Yang <payang@apache.org>, TaiJuWu <tjwu1217@gmail.com>,
 Ken Huang <s7133700@gmail.com>, jimmy <wangzhiwang611@gmail.com>,
 Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-08-31 11:12:57 +08:00
Yunchi Pang 5441f5e3e1
KAFKA-19616 Add compression type and level support to LogCompactionTester (#20396)
issue: [KAFKA-19616](https://issues.apache.org/jira/browse/KAFKA-19616)

**why**: validate log compaction works correctly with compressed data.
**what**: adds compression config options to `LogCompactionTester` tool
and extends test coverage to validate log compaction with different
compression types and levels.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-08-30 10:21:22 +08:00
Bill Bejeck e389484697
MINOR: Prepend streams to group configs specific to Kafka Streams groups (#20448)
In the 4.1 `upgrade-guide` section describing the new KIP-1071 protocol,
it would be helpful to display the configs you can set via
`kafka-configs.sh` with `streams` prepended, because the command will
fail otherwise.

Reviewers: Andrew J Schofield <aschofield@apache.org>, Matthias J
 Sax <mjsax@apache.org>, Genseric Ghiro <gghiro@confluent.io>
2025-08-29 16:57:23 -04:00
Kuan-Po Tseng 26fea78ae1
MINOR: Remove default config in creating internal stream topic (#20421)
Clean up default configs in
AutoTopicCreationManager#createStreamsInternalTopics. The streams
protocol should be consistent with Kafka Streams using the classic
protocol, which creates the internal topics using CreateTopic and
therefore uses the controller config.

Reviewers: Lucas Brutschy <lucasbru@apache.org>
2025-08-29 15:23:53 +02:00
Jhen-Yung Hsu 65f789f560
KAFKA-19626: KIP-1147 Consistency of command-line arguments for remaining CLI tools (#20431)
This implements [KIP-1147](https://cwiki.apache.org/confluence/x/DguWF)
for kafka-cluster.sh, kafka-leader-election.sh and
kafka-streams-application-reset.sh.

Jira: https://issues.apache.org/jira/browse/KAFKA-19626

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-08-29 12:04:03 +01:00
knoxy5467 2dd2db7a1e
KAFKA-8350 Fix stack overflow when batch size is larger than cluster max.message.byte (#20358)
### Summary
This PR fixes two critical issues related to producer batch splitting
that can cause infinite retry loops and stack overflow errors when batch
sizes are significantly larger than broker-configured message size
limits.

### Issues Addressed
- **KAFKA-8350**: Producers endlessly retry batch splitting when
`batch.size` is much larger than topic-level `message.max.bytes`,
leading to infinite retry loops with "MESSAGE_TOO_LARGE" errors
- **KAFKA-8202**: Stack overflow errors in
`FutureRecordMetadata.chain()` due to excessive recursive splitting
attempts

### Root Cause
The existing batch splitting logic in
`RecordAccumulator.splitAndReenqueue()` always used the configured
`batchSize` parameter for splitting, regardless of whether the batch had
already been split before. This caused:

1. **Infinite loops**: When `batch.size` (e.g., 8MB) >>
`message.max.bytes` (e.g., 1MB), splits would never succeed since the
split size was still too large
2. **Stack overflow**: Repeated splitting attempts created deep call
chains in the metadata chaining logic

### Solution
Implemented progressive batch splitting logic:

```java
int maxBatchSize = this.batchSize;
if (bigBatch.isSplitBatch()) {
    maxBatchSize = Math.max(bigBatch.maxRecordSize,
        bigBatch.estimatedSizeInBytes() / 2);
}
```

__Key improvements:__

- __First split__: Uses original `batchSize` (maintains backward
compatibility)

- __Subsequent splits__: Uses the larger of:

  - `maxRecordSize`: Ensures we can always split down to individual
    records
  - `estimatedSizeInBytes() / 2`: Provides geometric reduction for
    faster convergence
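To see why halving converges, the size rule can be simulated in isolation. This is a sketch under simplifying assumptions: `ProgressiveSplitSketch` and its parameters are hypothetical, and each split round is assumed to yield batches of exactly the target size, which the real `RecordAccumulator` does not guarantee.

```java
final class ProgressiveSplitSketch {
    // Mirrors the size rule quoted above; parameter names are illustrative.
    static int splitTarget(int configuredBatchSize, boolean alreadySplit,
                           int maxRecordSize, int estimatedSizeInBytes) {
        if (!alreadySplit) {
            return configuredBatchSize; // first split keeps the configured size
        }
        return Math.max(maxRecordSize, estimatedSizeInBytes / 2);
    }

    // Counts split rounds until a batch fits under the limit, or until it
    // cannot shrink further because it is down to a single record.
    static int roundsUntilFits(int configuredBatchSize, int initialSize,
                               int maxRecordSize, int limit) {
        int size = initialSize;
        boolean alreadySplit = false;
        int rounds = 0;
        while (size > limit && size > maxRecordSize) {
            int target = splitTarget(configuredBatchSize, alreadySplit, maxRecordSize, size);
            size = Math.min(size, target); // assumption: the split hits the target exactly
            alreadySplit = true;
            rounds++;
        }
        return rounds;
    }
}
```

With an 8 MB batch, 1 KB records, and a 1 MB limit, the first round leaves the size unchanged and each later round halves it, so the loop ends after a handful of rounds rather than retrying at 8 MB forever.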

### Testing

Added comprehensive test
`testSplitAndReenqueuePreventInfiniteRecursion()` that:

- Creates oversized batches with 100 records of 1KB each
- Verifies splitting can reduce batches to single-record size
- Ensures no infinite recursion (safety limit of 100 operations)
- Validates no data loss or duplication during splitting
- Confirms all original records are preserved with correct keys

### Backward Compatibility

- No breaking changes to public APIs
- First split attempt still uses original `batchSize` configuration
- Progressive splitting only engages for retry scenarios

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson
<jason@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>

---------

Co-authored-by: Michael Knox <mrknox@amazon.com>
2025-08-29 11:51:47 +01:00
Matthias J. Sax c7154b8bf8
MINOR: improve RLMQuotaMetricsTest (#20425)
Adds metrics description verification to RLMQuotaMetricsTest.

Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-29 17:35:48 +08:00
Apoorv Mittal 7eeb5c8344
MINOR: Removing incorrect multi threaded state transition tests (#20436)
These tests were written while finalizing the approach for making the
inflight state class thread safe, but the approach later changed and the
lock is now always required by SharePartition to change inflight state.
Hence these tests are incorrect and do not add any value.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-08-29 07:45:07 +01:00
Andrew Schofield e6f3efc914
KAFKA-19635: Minor docs tweaks (#20434)
Improve the wording in the upgrade doc slightly. Also fix a tiny
annoyance in the output from the message generator.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-08-28 18:52:04 +01:00
Andrew Schofield 50009cc76a
KAFKA-19635: KIP-1147 changes for upgrade.html (#20415)
Updates to `docs/upgrade.html` for

https://cwiki.apache.org/confluence/display/KAFKA/KIP-1147:+Improve+consistency+of+command-line+arguments.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-08-28 16:24:45 +01:00
Lucas Brutschy 3c378dab7d
KAFKA-19647: Implement integration test for offline migration (#20412)
In KAFKA-19570 we implemented offline migration between groups, that is,
the following integration test or system test should be possible:

Test A:

 - Start a streams application with classic protocol, process up to a
certain offset and commit the offset and shut down.
 - Start the same streams application with streams protocol (same app ID!).
 - Make sure that the offsets before the one committed in the first run
are not reprocessed in the second run.

Test B:

 - Start a streams application with streams protocol, process up to a
certain offset and commit the offset and shut down.
 - Start the same streams application with classic protocol (same app ID!).
 - Make sure that the offsets before the one committed in the first run
are not reprocessed in the second run.

We have unit tests that make sure that non-empty groups will not be
converted. This should be enough.

Reviewers: Bill Bejeck <bbejeck@apache.org>
2025-08-28 17:07:58 +02:00
Apoorv Mittal 6956417a3e
MINOR: Updated name from messages to records for consistency in share partition (#20416)
Minor PR to update name of maxInFlightMessages to maxInFlightRecords to
maintain consistency in share partition related classes.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-08-28 13:52:04 +01:00
Ken Huang 2cc66f12c3
MINOR: Remove OffsetsForLeaderEpochRequest unused static field (#20418)
This field was used for replica_id, but after 51c833e795, the
OffsetsForLeaderEpochRequest directly relies on the internal structs
generated by the automated protocol. Therefore, we can safely remove it.

Reviewers: Lan Ding <isDing_L@163.com>, TengYao Chi
<frankvicky@apache.org>
2025-08-28 17:24:01 +08:00
Kirk True 1fc25d8389
MINOR: remove arguments from AsyncKafkaConsumerTest.newConsumer() that are identical (#20426)
Very minor cleanup of redundant arguments in `AsyncKafkaConsumerTest`.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-08-28 09:56:39 +01:00
TengYao Chi 74b2228dd7
MINOR: Add a missing @Test to test case "shouldCallOldImplementationExceptionHandler" (#20427)
`shouldCallOldImplementationExceptionHandler` should be a test case, but
it is missing the `@Test` annotation.

Reviewers: Ken Huang <s7133700@gmail.com>, TaiJuWu <tjwu1217@gmail.com>,
 Chia-Ping Tsai <chia7712@gmail.com>
2025-08-28 15:22:23 +08:00
Sanskar Jhajharia 7527a8bac0
MINOR: Cleanup Connect Module (4/n) (#20389)
Now that Kafka supports Java 17, this PR makes some changes in the
connect module. The changes in this PR are limited to only some files.
Further PRs will follow.

The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()
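The replacements are equal by `equals()`, but not identical in behaviour, which is worth keeping in mind when reviewing such cleanups. The class below is a hypothetical illustration (not code from the PR) of two differences: `List.of()` rejects `set()` and `null` elements, while `Arrays.asList()` allows both.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ImmutableFactorySketch {
    static List<String> before() {
        return Arrays.asList("a", "b"); // fixed-size, but set() is allowed
    }

    static List<String> after() {
        return List.of("a", "b");       // fully unmodifiable
    }

    // Returns true if the list refuses in-place modification.
    static boolean rejectsSet(List<String> list) {
        try {
            list.set(0, "z");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Returns true because List.of() does not accept null elements.
    static boolean listOfRejectsNull() {
        try {
            List.of("a", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // The empty and singleton variants compare equal to their replacements.
    static boolean factoriesAreEqual() {
        return Collections.emptyList().equals(List.of())
            && Collections.singletonList("x").equals(List.of("x"))
            && Collections.emptyMap().equals(Map.of())
            && Collections.singletonMap("k", "v").equals(Map.of("k", "v"))
            && Collections.singleton("x").equals(Set.of("x"));
    }
}
```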

Modules target: runtime/src/main

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-08-28 05:23:22 +08:00
anonymous 92fe00e184
KAFKA-19497; Topic replay code does not handle creation and deletion in the same delta (#20242)
There is a small logic bug in topic replay. If a topic is created and
then removed before the TopicsDelta is applied, we end up with the
deleted topic in createdTopics on the delta. This issue is fixed by
removing the topicName from createdTopics when replaying
RemoveTopicRecord to make sure the deleted topics won't appear in
createdTopics.
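The shape of the fix can be sketched with a heavily simplified model. `MiniTopicsDelta` is hypothetical and is not the real `TopicsDelta` class; in particular, tracking deleted topics in a second set is an assumption made only for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of a topics delta.
final class MiniTopicsDelta {
    private final Set<String> createdTopics = new HashSet<>();
    private final Set<String> deletedTopics = new HashSet<>();

    void replayCreate(String topicName) {
        createdTopics.add(topicName);
    }

    void replayRemove(String topicName) {
        // The fix: a topic created earlier in this same delta must no longer
        // be reported as created once it has been removed.
        createdTopics.remove(topicName);
        deletedTopics.add(topicName); // assumption: deletions are tracked separately
    }

    Set<String> createdTopics() {
        return Set.copyOf(createdTopics);
    }

    Set<String> deletedTopics() {
        return Set.copyOf(deletedTopics);
    }
}
```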

Reviewers: José Armando García Sancio <jsancio@apache.org>, Kevin Wu
 <kevin.wu2412@gmail.com>, Alyssa Huang <ahuang@confluent.io>
2025-08-27 16:34:12 -04:00
Lucas Brutschy 0412be9e9d
KAFKA-19641: Fix flaky RestoreIntegrationTest#shouldInvokeUserDefinedGlobalStateRestoreListener (#20419)
What I observed is that if I run both variations, useNewProtocol=true
and useNewProtocol=false, the test would often fail the second time. If
I run only the second variation, useNewProtocol=false, it works, and
running only the first variation, useNewProtocol=true, also works. This
points to some state that is not cleared between the tests. Indeed, the
test creates a topic “inputTopic”, produces to it, but doesn’t delete
it, so the second variation will produce to it again and then run with
twice the data.

I also reduced the heartbeat interval and session timeout, since some of
the tests need to wait for the old consumer to leave, which (sigh) Kafka
Streams doesn't do, so we have to wait until it gets kicked out by
session timeout. Previously we waited for 45 seconds; now we wait only
1 second.

Reviewers: Bill Bejeck <bbejeck@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-08-27 20:18:50 +02:00
Sebastien Viale 04518f4ce1
KAFKA-19531 Add an end-to-end integration test for the DLQ feature (#20236)
This PR adds an end-to-end integration test that validates the Dead
Letter Queue (DLQ) feature introduced in
[KIP-1034](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1034%3A+Dead+letter+queue+in+Kafka+Streams)

Reviewers: Lucas Brutschy <lucasbru@apache.org>
2025-08-27 18:54:21 +02:00
jimmy a931f85835
KAFKA-19625: Consistency of command-line arguments for verifiable producer/consumer (#20390)
As described in
[jira](https://issues.apache.org/jira/browse/KAFKA-19625), this PR
replaces `consumer.config` and `producer.config` with
`command-config` for kafka-verifiable-producer.sh and
kafka-verifiable-consumer.sh.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-08-27 10:53:26 +01:00
Apoorv Mittal c5d0ddd6f7
MINOR: Refactored gap window names in share partition (#20411)
As per the suggestion by @adixitconfluent and @chirag-wadhwa5
[here](https://github.com/apache/kafka/pull/20395#discussion_r2300810004),
I have refactored the code with variable and method names.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chirag Wadhwa
<cwadhwa@confluent.io>
2025-08-27 10:06:43 +01:00
Chang-Chi Hsu c797f85de4
KAFKA-19642 Replace dynamicPerBrokerConfigs with dynamicDefaultConfigs (#20405)
- **Changes**: Replace misused dynamicPerBrokerConfigs with
dynamicDefaultConfigs
- **Reasons**: KRaft servers don't handle the cluster-level configs at
startup

from: https://github.com/apache/kafka/pull/18949/files#r2296809389

Reviewers: Jun Rao <junrao@gmail.com>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>

---------

Co-authored-by: PoAn Yang <payang@apache.org>
2025-08-27 14:35:31 +08:00
Evanston Zhou 7ffd6934ad
MINOR: Add example integration test commands to README (#20413)
Adds example commands for running integration tests from the command
line.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-27 13:59:31 +08:00
Abhijeet Kumar 8d93d1096c
KAFKA-17108: Add EarliestPendingUpload offset spec in ListOffsets API (#16584)
This is the first part of the implementation of
[KIP-1023](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1023%3A+Follower+fetch+from+tiered+offset)

The purpose of this pull request is for the broker to start returning
the correct offset when it receives a -6 as a timestamp in a ListOffsets
API request.

Added unit tests for the new timestamp.

Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
2025-08-27 08:34:31 +05:30
jimmy 5fcbf3d3b1
KAFKA-18853 Add documentation to remind users to use valid LogLevelConfig constants (#20249)
This PR aims to add documentation to `alterLogLevelConfigs` method to
remind users to use valid LogLevelConfig constants.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-27 10:52:02 +08:00
Ken Huang 08057eac53
KAFKA-18600 Cleanup NetworkClient zk related logging (#18644)
This PR removes associated logging within NetworkClient to reduce noise
and streamline the client code.

Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur
 <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-27 03:51:28 +08:00
Ken Huang a9b2a6d9b6
MINOR: Optimize the entriesWithoutErrorsPerPartition when errorResults is empty (#20410)
If `errorResults` is empty, there’s no need to create a new
`entriesPerPartition` map.
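The short-circuit can be sketched generically. `FilterErrorsSketch` and its type parameters are hypothetical; the real method works on Kafka-specific partition and record types.

```java
import java.util.HashMap;
import java.util.Map;

final class FilterErrorsSketch {
    // Returns the entries whose key has no recorded error.
    static <K, V, E> Map<K, V> entriesWithoutErrors(Map<K, V> entriesPerPartition,
                                                    Map<K, E> errorResults) {
        if (errorResults.isEmpty()) {
            // Nothing to filter out, so reuse the input map instead of copying it.
            return entriesPerPartition;
        }
        Map<K, V> filtered = new HashMap<>();
        for (Map.Entry<K, V> entry : entriesPerPartition.entrySet()) {
            if (!errorResults.containsKey(entry.getKey())) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }
}
```

Returning the input map is only safe if callers treat the result as read-only; that is an assumption of this sketch.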

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-27 03:11:35 +08:00
Apoorv Mittal 49ee1fb4f9
KAFKA-19632: Handle overlap batch on partition re-assignment (#20395)
The PR fixes the batch alignment issue when partitions are re-assigned.
During the initial read of state, the batches can be broken arbitrarily.
Say the start offset is 10 and the cache contains the [15-18] batch
during initialization. When a fetch happens at offset 10 and the fetched
batch contains 10 records, i.e. [10-19], the correct batches will be
created if maxFetchRecords is greater than 10. But if maxFetchRecords is
less than 10, the last offset of the batch is determined, which will be
19. Hence the acquire method will incorrectly create a batch of [10-19]
while [15-18] already exists. The check below is required to resolve the
issue:
```
if (isInitialReadGapOffsetWindowActive() && lastAcquiredOffset > lastOffset) {
     lastAcquiredOffset = lastOffset;
}
```

While testing other cases, further issues were found in updating the gap
offset, acquiring records prior to the share partition's end offset, and
determining the next fetch offset with compacted topics. All of these
issues arise mainly during the initial read window after partition
re-assignment.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit
 <adixit@confluent.io>, Chirag Wadhwa <cwadhwa@confluent.io>
2025-08-26 13:25:57 +01:00
Chang-Chi Hsu 71fdab1c5d
MINOR: describeTopics should pass the timeout to the describeCluster call (#20375)
This PR ensures that describeTopics correctly propagates its timeoutMs
setting to the underlying describeCluster call. Integration tests were
added to verify that the API now fails with a TimeoutException when
brokers do not respond within the configured timeout.

Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
 <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-26 19:38:53 +08:00
Kirk True 4e0d8c984b
MINOR: renamed testAsyncConsumerClassicConsumerSubscribeInvalidTopicCanUnsubscribe to align test case (#20407)
`testAsyncConsumerClassicConsumerSubscribeInvalidTopicCanUnsubscribe` does not align with the test case. This patch renames the test to describe the test case more precisely.
Reviewers: TengYao Chi <frankvicky@apache.org>
2025-08-26 17:00:37 +08:00
Abhijeet Kumar 614bc3a19d
KAFKA-17344: Add empty replica FollowerFetch tests (#16884)
Add Unit Tests for an empty follower fetch for various Leader states.

| TieredStorage Enabled | Leader Log Start Offset | Leader Local Log Start Offset | Leader Log End Offset | Remarks |
|-----------------------|-------------------------|--------------------------------|-----------------------|---------------------------------------|
| N | 0 | - | 200 | - |
| N | 10 | - | 200 | - |
| Y | 0 | 200 | 200 | No segments deleted locally |
| Y | 0 | 200 | 100 | Segments uploaded and deleted locally |
| Y | 0 | 200 | 200 | All segments deleted locally |
| Y | 10 | 10 | 200 | No segments deleted locally |
| Y | 10 | 100 | 200 | Segments uploaded and deleted locally |
| Y | 10 | 200 | 200 | All segments deleted locally |

Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
2025-08-26 14:19:11 +05:30
jimmy d2f162a071
MINOR: kafka-stream-groups.sh should fail quickly if the partition leader is unavailable (#20271)
This PR applies the same partition leader check for `StreamsGroupCommand` as
`ShareGroupCommand`  and `ConsumerGroupCommand` to avoid the command
execution timeout.

Reviewers: Lucas Brutschy <lucasbru@apache.org>
2025-08-26 10:08:35 +02:00
Lucas Brutschy f621a635c1
KAFKA-19570: Implement offline migration for streams groups (#20288)
Offline migration essentially preserves offsets and nothing else. So we
effectively write tombstones for the classic group type when a streams
heartbeat is sent with the group ID of an empty classic group, and
write tombstones for the streams group type when a classic consumer
attempts to join with the group ID of an empty streams group.

Reviewers: Bill Bejeck <bbejeck@apache.org>, Sean Quah
 <squah@confluent.io>, Dongnuo Lyu <dlyu@confluent.io>
2025-08-26 10:05:30 +02:00
Mickael Maison 30ffd42b26
MINOR: Cleanups in the release scripts (#20308)
A bunch of cleanups in the release scripts

Reviewers: Luke Chen <showuon@gmail.com>
2025-08-26 09:57:49 +02:00
PoAn Yang 5bbc421a13
MINOR: update TransactionLog#readTxnRecordValue to initialize TransactionMetadata with non-empty topic partitions (#20370)
This is a follow-up PR for https://github.com/apache/kafka/pull/19699.

* Update TransactionLog#readTxnRecordValue to initialize
TransactionMetadata with non-empty topic partitions
* Update `TxnTransitMetadata` comment, because it's not immutable.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Justine Olshan
 <jolshan@confluent.io>, Kuan-Po Tseng <brandboat@gmail.com>, Chia-Ping
 Tsai <chia7712@gmail.com>
2025-08-26 10:36:45 +08:00
PoAn Yang b2c1a0fb9f
KAFKA-18841: Enable to test docker image locally (#19028)
### Case 1: no --kafka-url and --kafka-archive

Should fail. One of the arguments (--kafka-url/--kafka-archive) is required.

```
> python docker_build_test.py apache/kafka --image-tag KAFKA-18841 --image-type jvm --build
usage: docker_build_test.py [-h] [--image-tag TAG] [--image-type {jvm,native}] [--build] [--test] (--kafka-url KAFKA_URL | --kafka-archive KAFKA_ARCHIVE) image
docker_build_test.py: error: one of the arguments --kafka-url/-u --kafka-archive/-a is required
```

### Case 2: --kafka-url with native

```
> python docker_build_test.py apache/kafka --image-tag KAFKA-18841 --image-type native --kafka-url https://dist.apache.org/repos/dist/dev/kafka/4.0.0-rc0/kafka_2.13-4.0.0.tgz --build
```

### Case 3: --kafka-url with jvm

```
> python docker_build_test.py apache/kafka --image-tag KAFKA-18841 --image-type jvm --kafka-url https://dist.apache.org/repos/dist/dev/kafka/4.0.0-rc0/kafka_2.13-4.0.0.tgz --build
```

### Case 4: --kafka-archive with native

```
> ./gradlew clean releaseTarGz
> cd docker
> python docker_build_test.py apache/kafka --image-tag KAFKA-18841 --image-type native --kafka-archive </absolute/path/to/core/build/distributions/kafka_2.13-4.1.0-SNAPSHOT.tgz> --build
```

### Case 5: --kafka-archive with jvm

```
> ./gradlew clean releaseTarGz
> cd docker
> python docker_build_test.py apache/kafka --image-tag KAFKA-18841 --image-type jvm --kafka-archive </absolute/path/to/core/build/distributions/kafka_2.13-4.1.0-SNAPSHOT.tgz> --build
```

Reviewers: Vedarth Sharma <vesharma@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>, TengYao Chi <frankvicky@apache.org>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-08-26 10:30:14 +08:00
Kuan-Po Tseng 0242d1c58a
MINOR: update kraft dynamic voter set doc (#20401)
Update the KRaft dynamic voter set documentation. In Kafka 4.1, we
introduced a powerful new feature that enables seamless migration from a
static voter set to a dynamic voter set.

Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-26 01:31:07 +08:00
Federico Valeri f97b95c60a
KAFKA-19498 Add include argument to ConsumerPerformance tool (#20221)
This patch adds the include argument to ConsumerPerformance tool.

ConsoleConsumer and ConsumerPerformance serve different purposes but
share common functionality for message consumption. Currently, there's
an inconsistency in their command-line interfaces:

- ConsoleConsumer supports an --include argument that allows users to
specify a regular expression pattern to filter topics for consumption
- ConsumerPerformance lacks this topic filtering capability, requiring
users to specify a single topic explicitly via --topic argument

This inconsistency creates two problems:

- Similar tools should provide similar topic selection capabilities for
better user experience
- Users cannot test consumer performance across multiple topics or
dynamically matching topic sets, making it difficult to test realistic
scenarios

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-25 04:15:37 +08:00
Kuan-Po Tseng ecd5b4c157
MINOR: enhance DescribeClusterResponse ControllerId description (#20400)
Enhance the description of ControllerId in DescribeClusterResponse.json.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-25 02:41:02 +08:00
Michael Morris 732b22daff
MINOR: Upgrade log4j to version 2.25.1 (#20132)
Upgrade log4j to version 2.25.1

Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-08-25 02:35:12 +08:00