Commit Graph

15892 Commits

Author SHA1 Message Date
Nick Guo 465b01cd2c
KAFKA-19382:Upgrade junit from 5.10 to 5.13 (#19919)
jira: https://issues.apache.org/jira/browse/KAFKA-19382

Upgrade junit from 5.10.2 to
[5.13.1](https://github.com/junit-team/junit5/releases).

A new behavior was introduced in JUnit 5.12 (89a46dfa10) that disallows
a provider such as `ClusterTestExtensions` from generating empty
invocation contexts. However, `ClusterTestExtensions` is invoked by the
JUnit extension machinery, so it can produce empty contexts for some
tests.

```
> Configure project :
Starting build with version 4.1.0-SNAPSHOT (commit id c4a769bc) using
Gradle 8.14.1, Java 17 and Scala 2.13.16
Build properties: ignoreFailures=false, maxParallelForks=10,
maxScalacThreads=8, maxTestRetries=0

> Task :core:test kafka.api.ConsumerBounceTest.initializationError
failed, log available in
/Users/lansg/Project/OpenSource/kafka/kafka-fork/kafka/core/build/reports/testOutput/kafka.api.ConsumerBounceTest.initializationError.test.stdout

Gradle Test Run :core:test > Gradle Test Executor 5 > ConsumerBounceTest
> testCloseDuringRebalance(String) > initializationError FAILED
org.junit.platform.commons.PreconditionViolationException: Provider
[ClusterTestExtensions] did not provide any invocation contexts, but was
expected to do so. You may override
mayReturnZeroTestTemplateInvocationContexts() to allow this.         at
java.base@17.0.13/java.util.ArrayList.forEach(ArrayList.java:1511) at
java.base@17.0.13/java.util.ArrayList.forEach(ArrayList.java:1511)
kafka.api.ConsumerBounceTest.initializationError failed, log available
in
/Users/lansg/Project/OpenSource/kafka/kafka-fork/kafka/core/build/reports/testOutput/kafka.api.ConsumerBounceTest.initializationError.test.stdout

```

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
 <frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>
2025-06-10 15:35:30 +08:00
Kuan-Po Tseng 3a0a1705a1
KAFKA-18486 Remove becomeLeaderOrFollower from readFromLogWithOffsetOutOfRange and other related methods. (#19929)
Refactor `becomeLeaderOrFollower` out of the following tests:
- readFromLogWithOffsetOutOfRange
- testBecomeFollowerWhileNewClientFetchInPurgatory
- testBecomeFollowerWhileOldClientFetchInPurgatory
- testBuildRemoteLogAuxStateMetricsThrowsException

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ken Huang
 <s7133700@gmail.com>, TengYao Chi <frankvicky@apache.org>
2025-06-10 12:39:32 +08:00
Kaushik Raina dbfda79951
KAFKA-19283: Update transaction exception handling documentation (#19931)
Added docs on enhancements to transactional producer error handling:

* Added standardized exception categories (`RetriableException`,
`RefreshRetriableException`, `AbortableException`,
`ApplicationRecoverableException`, `InvalidConfigurationException`,
`KafkaException`) to ensure clearer error handling patterns.
* Included a link to example template code for handling transaction
exceptions: [Transaction Client Demo](https://github.com/apache/kafka/blob/trunk/examples/src/main/java/kafka/examples/TransactionalClientDemo.java).

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-06-09 17:07:29 -07:00
Calvin Liu b420e4092e
MINOR: ELR release note for 4.1 (#19909)
Mention that ELR will be enabled by default on new clusters in 4.1

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-06-09 17:05:20 -07:00
Matthias J. Sax 0adc6fa3e1
KAFKA-19271: allow intercepting internal method call (#19832)
To allow intercepting the internal subscribe call to the async-consumer,
we need to extend ConsumerWrapper interface accordingly, instead of
returning the wrapped async-consumer back to the KS runtime.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-06-09 07:28:13 -07:00
YuChia Ma 948a91dfdf
MINOR: Safe update dependencies (#19897)
These dependencies have been updated across both files:

- caffeine: From 3.1.8 to 3.2.0
- javassist: From 3.29.2-GA to 3.30.2-GA
- Jetty-related: All Jetty components have been updated from 12.0.15 to 12.0.22, including:
  - jetty-alpn-client
  - jetty-client
  - jetty-ee10-servlet
  - jetty-ee10-servlets
  - jetty-http
  - jetty-io
  - jetty-security
  - jetty-server
  - jetty-session
  - jetty-util
- jose4j: From 0.9.4 to 0.9.6
- Jersey-related: All Jersey components have been updated from 3.1.9 to 3.1.10, including:
  - jersey-client
  - jersey-common
  - jersey-container-servlet
  - jersey-container-servlet-core
  - jersey-hk2
  - jersey-server
- classgraph: From 4.8.173 to 4.8.179
- jline: From 3.25.1 to 3.30.4
- pcollections: From 4.0.1 to 4.0.2
- re2j: From 1.7 to 1.8
- snappy-java: From 1.1.10.5 to 1.1.10.7

New Dependency (LICENSE-binary only)

    A new dependency, jspecify-1.0.0, has been added to LICENSE-binary.

gradle/dependencies.gradle Specific Updates

These updates are only reflected in the gradle/dependencies.gradle file:

- bcpkix: From 1.78.1 to 1.80
- bndlib: From 7.0.0 to 7.1.0
- jacoco: From 0.8.10 to 0.8.13
- hamcrest: From 2.2 to 3.0
- jqwik: From 1.8.3 to 1.9.2

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-06-09 21:52:58 +08:00
Apoorv Mittal d07aa37412
KAFKA-19386: Correcting ExpirationReaper thread names from Purgatory (#19918)
PR https://github.com/apache/kafka/pull/17636 migrated
DelayedOperationPurgatory from Scala to Java and instantiated
`expirationReaper` at instance level, where `purgatoryName` is still
`null`. As a result, the expiration threads of all Purgatories have
incorrect names.

<img width="216" alt="Screenshot 2025-06-07 at 01 28 58" src="https://github.com/user-attachments/assets/fd1b8137-b290-42e0-9a95-258fde5737d2" />

This PR fixes the instantiation of ExpirationReaper by moving it into
the constructor, where `purgatoryName` is already defined.

<img width="296" alt="Screenshot 2025-06-07 at 01 31 27" src="https://github.com/user-attachments/assets/9912311b-ddf6-4554-8e04-d0b8ad208abc" />

This issue also affects version 4.0, though its impact is minor.
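
The underlying cause is Java's initialization order: instance field initializers run before the constructor body, so anything built at field level cannot see values the constructor assigns later. The following is a minimal sketch of that effect. The class and method names are hypothetical stand-ins, not the actual `DelayedOperationPurgatory` code.

```java
/** Hypothetical stand-ins for illustration; these are not the actual Kafka classes. */
final class ReaperNamingDemo {

    /** Buggy shape: the name is built in a field initializer, before the constructor body runs. */
    static final class BuggyPurgatory {
        private final String purgatoryName;
        // Evaluated before the constructor assigns purgatoryName, so it sees null.
        private final String reaperName = buildReaperName();

        BuggyPurgatory(String purgatoryName) {
            this.purgatoryName = purgatoryName;
        }

        private String buildReaperName() {
            return "ExpirationReaper-" + purgatoryName;
        }

        String reaperName() {
            return reaperName;
        }
    }

    /** Fixed shape: the name is built in the constructor, after purgatoryName is assigned. */
    static final class FixedPurgatory {
        private final String purgatoryName;
        private final String reaperName;

        FixedPurgatory(String purgatoryName) {
            this.purgatoryName = purgatoryName;
            this.reaperName = buildReaperName();
        }

        private String buildReaperName() {
            return "ExpirationReaper-" + purgatoryName;
        }

        String reaperName() {
            return reaperName;
        }
    }

    public static void main(String[] args) {
        System.out.println(new BuggyPurgatory("Fetch").reaperName()); // ExpirationReaper-null
        System.out.println(new FixedPurgatory("Fetch").reaperName()); // ExpirationReaper-Fetch
    }
}
```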

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-06-09 12:10:59 +01:00
Ken Huang d6861f3f15
MINOR: Use `pollUntilTrue` instead of `waitForCondition` (#19911)
We can use `pollUntilTrue` instead of `waitForCondition`, so this is a
small refactor to reduce duplicate code.

Reviewers: TengYao Chi <frankvicky@apache.org>, Lan Ding
 <isDing_L@163.com>, TaiJuWu <tjwu1217@gmail.com>
2025-06-09 15:33:00 +08:00
Okada Haruki e2500186cb
KAFKA-19334 MetadataShell execution unintentionally deletes lock file (#19817)
## Summary
- MetadataShell may unintentionally delete the lock file when it exits
or fails to acquire the lock. If a server is running, this causes
unexpected results such as:
  * MetadataShell unexpectedly succeeds on the 2nd run
  * Even worse, the LogManager/RaftManager lock no longer protects
against concurrent Kafka process startup

Reviewers: TengYao Chi <frankvicky@apache.org>
2025-06-09 12:24:26 +08:00
Ken Huang df73133f3b
MINOR: Follow up KAFKA-19080 MetadataLogConfig (#19842)
See Discussion:
https://github.com/apache/kafka/pull/19371#discussion_r2109549343

This PR makes the following changes:
- Update the internal config name with the metadata prefix
- Add a warning message for setting
`INTERNAL_METADATA_LOG_SEGMENT_BYTES_CONFIG`

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-06-09 03:13:03 +08:00
Nick Guo e23c8cea07
KAFKA-18486 Remove ReplicaManager#becomeLeaderOrFollower from `testReplicaAlterLogDirs` (#19922)
Use `applyDelta` to replace `becomeLeaderOrFollower`.

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-06-09 03:00:00 +08:00
Ken Huang 8fd0d33670
KAFKA-19042 Move PlaintextConsumerSubscriptionTest to client-integration-tests module (#19827)
Rewrite PlaintextConsumerSubscriptionTest in Java using the new test
infra and move it to the client-integration-tests module.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-06-08 05:06:03 +08:00
Jing-Jia Hung c5e06f6e7a
KAFKA-18486 Update testExceptionWhenUnverifiedTransactionHasMultipleProducerIds (#19883)
- Replace the deprecated `becomeLeaderOrFollower` with the
metadata-based `applyDelta` method.
- Add overloaded `topicsCreateDelta` to support custom topic name and
topicId.

Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
 <kitingiao@gmail.com>, Nick Guo <lansg0504@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-06-08 00:55:20 +08:00
Bolin Lin 8a7e4a1423
KAFKA-18486 Update activeProducerState wih KRaft mechanism in ReplicaManagerTest (#19890)
Description:
* replace RPC with KRaft mechanism to test activeProducerState in
ReplicaManagerTest

Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-06-08 00:26:52 +08:00
Kuan-Po Tseng 83fb40d743
KAFKA-14895 [1/N] Move AddPartitionsToTxnManager files to java (#19879)
Move AddPartitionsToTxnManager to server module and convert to Java.
This patch moves AddPartitionsToTxnManager from the core module to the
server module, with its package updated from `kafka.server` to
`org.apache.kafka.server.transaction`. Additionally, several
configurations used by AddPartitionsToTxnManager are moved from
KafkaConfig.scala to AbstractKafkaConfig.java:
- brokerId
- requestTimeoutMs
- controllerListenerNames
- interBrokerListenerName
- interBrokerSecurityProtocol
- effectiveListenerSecurityProtocolMap

The next PR will move AddPartitionsToTxnManagerTest.scala to java

Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-06-08 00:16:55 +08:00
Kirk True 861eeb859d
KAFKA-19295: Remove AsyncKafkaConsumer event ID generation (#19915)
Remove the event IDs from the ApplicationEvent and BackgroundEvent as
they serve no functional purpose other than uniquely identifying events
in the logs.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-06-07 13:08:22 +01:00
hgh1472 c4a769bc8b
MINOR: Rename ambiguous method name (#19875)
While reading through the code, I found the method name to be somewhat
ambiguous and not fully descriptive of its purpose.

So I renamed the method to make its purpose clearer and more
self-explanatory. If there was another reason for the original naming,
I’d be happy to hear about it.

Reviewers: Lianet Magrans <lmagrans@confluent.io>
2025-06-06 15:03:51 -04:00
Ritika Reddy 3479ce793b
KAFKA-18202: Add rejection for non-zero sequences in TV2 (KIP-890) (#19902)
This change handles rejecting non-zero sequences when there is an empty
producerIDState with TV2.  The scenario will be covered with the
re-triable OutOfOrderSequence error.

For Transactions V2 with empty state:
* Only sequence 0 is allowed for new producers or after state cleanup
(new validation added)
* Any non-zero sequence is rejected with our specific error message
* Epoch bumps still require sequence 0 (existing validation remains)

For Transactions V1 with empty state:
* ANY sequence number is allowed (0, 5, 100, etc.)
* Epoch bumps still require sequence 0 (existing validation)
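
The rules above can be read as a small predicate. The sketch below only encodes what this message states. The class, method and parameter names are hypothetical, and the actual validation in the broker is likely structured differently.

```java
/** Hypothetical sketch of the empty-state sequence rules described in this commit message. */
final class EmptyStateSequenceCheck {

    /**
     * Decides whether the first batch from a producer with no stored state is acceptable.
     *
     * @param transactionsV2 true when the batch is written under Transactions V2
     * @param epochBump      true when the batch carries a bumped producer epoch
     * @param firstSequence  first sequence number of the incoming batch
     */
    static boolean acceptWithEmptyState(boolean transactionsV2, boolean epochBump, int firstSequence) {
        if (transactionsV2) {
            // TV2: only sequence 0 is accepted, with or without an epoch bump.
            return firstSequence == 0;
        }
        // TV1: any sequence is accepted unless the epoch was bumped.
        return !epochBump || firstSequence == 0;
    }
}
```

For example, `acceptWithEmptyState(true, false, 5)` is rejected under TV2, while the same call with `transactionsV2 = false` is accepted.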

Reviewers: Justine Olshan <jolshan@confluent.io>, Artem Livshits
 <alivshits@confluent.io>
2025-06-06 09:23:10 -07:00
Kaushik Raina 574516dcbf
KAFKA-19309 : Add transaction client template code in kafka examples (#15913)
This pull request introduces a new example application,
`TransactionalClientDemo`, which demonstrates how to use Kafka's
transactional capabilities for exactly-once processing semantics. The
application consumes messages from an input topic, processes them to
generate word count statistics, and produces the results to an output
topic. It also includes robust error handling and transaction
management.

### Key Changes:
* Added `TransactionalClientDemo` class to demonstrate a transactional
Kafka client application. It handles consuming messages, processing
them, and producing results to an output topic while ensuring
exactly-once processing semantics.
* Implements transactional error handling based on KIP-1050 guidelines,
including handling `TransactionAbortableException`,
`InvalidConfigurationException`, `ApplicationRecoverableException`, and
generic `KafkaException`.

Ref :

[KIP-1050](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1050%3A+Consistent+error+handling+for+Transactions)

Reviewers: Justine Olshan <jolshan@confluent.io>, Artem Livshits
 <alivshits@confluent.io>
2025-06-06 08:13:54 -07:00
PoAn Yang 844b0e651b
KAFKA-19369: Add group.share.assignors config and integration test (#19900)
* Add `group.share.assignors` config to `GroupCoordinatorConfig`.
* Send `rackId` in share group heartbeat request if it's not null.
* Add integration test `testShareConsumerWithRackAwareAssignor`.

Reviewers: Lan Ding <53332773+DL1231@users.noreply.github.com>, Andrew
 Schofield <aschofield@confluent.io>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-06-06 14:20:56 +01:00
PoAn Yang e0adec5549
KAFKA-19290: Exploit mapKey optimisation in protocol requests and responses (wip) (#19815)
The mapKey optimisation can be used in some KIP-932 RPC schemas to
improve efficiency of some key-based accesses.

* AlterShareGroupOffsetsResponse
* ShareFetchRequest
* ShareFetchResponse
* ShareAcknowledgeRequest
* ShareAcknowledgeResponse

Reviewers: Andrew Schofield <aschofield@confluent.io>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-06-06 14:19:08 +01:00
Janindu Pathirana 4d6cf3efef
KAFKA-18913: Start state updater in task manager (#19889)
Updated the code to start the State Updater Thread only after the Stream
Thread is started.

Changes done :
1. Moved the starting of the StateUpdater thread to a new init method in
the TaskManager.
2. Called the init of TaskManager in the run method of the StreamThread.
3. Updated the test cases in the StreamThreadTest to mimic the
aforementioned behaviour.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2025-06-06 11:14:41 +02:00
Hong-Yi Chen aaed164be6
MINOR: Refactor brokerContactTimesMs and brokerRegistrationStates to use Long and Integer (#19888)
This PR simplifies two ConcurrentHashMap fields by removing their Atomic
wrappers:

- Change `brokerContactTimesMs` from `ConcurrentHashMap<Integer,
AtomicLong>` to `ConcurrentHashMap<Integer, Long>`.

- Change `brokerRegistrationStates` from `ConcurrentHashMap<Integer,
AtomicInteger>` to `ConcurrentHashMap<Integer, Integer>`.

This removes mutable holders without affecting thread safety (see
discussion in #19828).
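
As an illustration of why the Atomic wrappers can be dropped, the sketch below stores boxed `Long` values directly and relies on `ConcurrentHashMap`'s per-key atomic operations. The class and method names are hypothetical, and the commit message does not say whether the actual code uses `put`, `merge` or `compute`.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not the actual Kafka class that owns brokerContactTimesMs. */
final class BrokerContactTimes {
    private final ConcurrentHashMap<Integer, Long> contactTimesMs = new ConcurrentHashMap<>();

    /** Records a contact time. merge() is atomic per key, so no AtomicLong holder is needed. */
    void recordContact(int brokerId, long nowMs) {
        // Long::max keeps the most recent timestamp if two updates race.
        contactTimesMs.merge(brokerId, nowMs, Long::max);
    }

    /** Returns the last recorded contact time, or -1 if the broker has never been seen. */
    long lastContactMs(int brokerId) {
        return contactTimesMs.getOrDefault(brokerId, -1L);
    }
}
```

Immutable `Long` values are replaced as a whole by the map, so readers never observe a half-updated holder.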

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Kevin Wu <kevin.wu2412@gmail.com>, Ken Huang
<s7133700@gmail.com>
2025-06-06 16:03:23 +08:00
Andrew Schofield 5cf8b2abb0
KAFKA-19370: Create JMH benchmark for share group assignor (#19907)
As part of readying share groups for production, we want to ensure that
the performance of the server-side assignor is optimal. In common with
consumer group assignors, a JMH benchmark is used for the analysis.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-06-06 08:29:18 +01:00
Sanskar Jhajharia a090dc3ba5
MINOR: Cleanup Core Module- Scala Modules (4/n) (#19805)
Now that Kafka Brokers support Java 17, this PR makes some changes in
core module. The changes in this PR are limited to only some Scala files
in the Core module's tests. The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()
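
A short Java sketch of the replacements listed above (the affected files are Scala tests, but they call the same Java collection APIs). The class name is hypothetical. It also shows one behavioural difference worth checking during such a cleanup: `List.of` rejects `null` elements.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical demo class comparing the Java 9+ factories with the older helpers. */
final class ImmutableFactoryDemo {

    /** The factory methods produce collections equal to those of the older helpers. */
    static boolean factoriesAreEquivalent() {
        return List.of().equals(Collections.emptyList())
            && List.of("a").equals(Collections.singletonList("a"))
            && List.of("a", "b").equals(Arrays.asList("a", "b"))
            && Map.of().equals(Collections.emptyMap())
            && Map.of("k", 1).equals(Collections.singletonMap("k", 1))
            && Set.of("x").equals(Collections.singleton("x"));
    }

    /** Unlike Arrays.asList and Collections.singletonList, List.of throws on null elements. */
    static boolean listOfRejectsNull() {
        try {
            List.of((String) null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```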

To be clear, the directories being targeted in this PR from unit.kafka
module:
- log
- network
- security
- tools
- utils

Reviewers: TengYao Chi <frankvicky@apache.org>
2025-06-06 14:49:16 +08:00
TaiJuWu f86659423d
KAFKA-19042 Move PlaintextConsumerAssignTest to clients-integration-tests module (#19773)
This PR does the following:
1. Rewrite to the new test infra
2. Rewrite to Java
3. Move to clients-integration-tests

Reviewers: Ken Huang <s7133700@gmail.com>, Kuan-Po Tseng
<brandboat@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-06-05 23:08:20 +08:00
Matthias J. Sax a662bc5634
MINOR: clean KafkaConsumer tests (#19669)
- Moving off deprecated methods
- Fixing argument order for assertEquals(...)
- Few other minor cleanups

Reviewers: PoAn Yang <payang@apache.org>, Lianet Magrans
 <lmagrans@confluent.io>, Ken Huang <s7133700@gmail.com>
2025-06-05 06:09:21 -07:00
Loïc GREFFIER 3edb406f98
KAFKA-16505: Add source raw key and value (#18739)
This PR is part of the KIP-1034.

It adds support for the source raw key and the source raw value in the
`ErrorHandlerContext`. This is required by the DLQ routing implemented
in https://github.com/apache/kafka/pull/17942.

Reviewers: Bruno Cadonna <cadonna@apache.org>

Co-authored-by: Damien Gasparina <d.gasparina@gmail.com>
2025-06-05 10:35:03 +02:00
PoAn Yang 8eb84399f6
MINOR: rackId is Optional#empty if input string is empty (#19906)
Add test case `testRackIdIsEmptyIfValueIsEmptyString`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-06-05 14:41:40 +08:00
Lianet Magrans 7cd99ea66d
KAFKA-19373 Fix protocol name comparison (#19903)
Fix to ensure protocol name comparison in integration tests ignores case
(group protocol from param is lower case, vs enum name upper case).

The tests were not failing, but the custom configs/expectations were not
being applied depending on the protocol (the tests' check for
"groupProtocol.equals(CLASSIC)" would never be true).

Found all comparisons with equals against the constant name and fixed
them (not too many luckily).

I did consider changing the protocol param that is passed to every test
(that is now lowercase), but still, seems more robust to have the tests
ignore case.
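
A minimal sketch of the failure mode, assuming a local stand-in enum rather than Kafka's own `GroupProtocol` type. The lower-case parameter never equals the upper-case constant name, so the comparison has to ignore case.

```java
/** Hypothetical sketch; the enum is a local stand-in, not Kafka's GroupProtocol. */
final class GroupProtocolCheck {

    enum Protocol { CLASSIC, CONSUMER }

    /** Broken check: a lower-case parameter such as "classic" never equals "CLASSIC". */
    static boolean isClassicCaseSensitive(String groupProtocol) {
        return Protocol.CLASSIC.name().equals(groupProtocol);
    }

    /** Fixed check: compares the parameter with the constant name while ignoring case. */
    static boolean isClassic(String groupProtocol) {
        return Protocol.CLASSIC.name().equalsIgnoreCase(groupProtocol);
    }
}
```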

Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Ken Huang
 <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
 <frankvicky@apache.org>
2025-06-05 11:48:26 +08:00
snehashisp 2694d7aad9
KAFKA-19248: Multiversioning in Kafka Connect - Plugin Loading Isolation Tests (#18325)
This adds tests for [KIP-891](https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+Connector+plugins).
It primarily focuses on tests for the new additions in plugin loading
isolation. It depends on the actual KIP implementation PRs and
should be merged after https://github.com/apache/kafka/pull/17742

Reviewers: Greg Harris <greg.harris@aiven.io>
2025-06-04 18:01:18 -07:00
Ritika Reddy cc25d217da
KAFKA-18042: Reject the produce request with lower producer epoch early (KIP-890) (#19844)
With the transaction V2, replica manager checks whether the incoming
producer request produces to a partition belonging to a transaction.
ReplicaManager figures this out by checking the producer epoch stored in
the partition log. However, the current code does not reject the produce
request if its producer epoch is lower than the stored producer epoch.
It is an optimization to reject such requests earlier instead of sending
an AddPartitionToTxn request and getting rejected in the response.

Reviewers: Justine Olshan <jolshan@confluent.io>, Artem Livshits
 <alivshits@confluent.io>
2025-06-04 13:21:53 -07:00
Lucas Brutschy 25bc5f2cfa
KAFKA-19372: StreamsGroup not subscribed to a topic when empty (#19901)
We should behave more like a consumer group and make sure to not be
subscribed to the input topics anymore when the last member leaves the
group. We don't do this right now because our topology is still
initialized even after the last member leaves the group.

This will allow:
* Offsets to expire and be cleaned up.
* Offsets to be deleted through admin API calls.

Reviewers: Bill Bejeck <bbejeck@apache.org>
2025-06-04 20:55:14 +02:00
Lucas Brutschy 678d456ad7
KAFKA-19044: Handle tasks that are not present in the current topology (#19722)
A heartbeat might be sent to the group coordinator, claiming to own
tasks that we do not know about. We need some logic to handle those
requests. In KIP-1071, we propose to return `INVALID_REQUEST` error
whenever this happens, effectively letting the clients crash.

This behavior will, however, make topology updates impossible. Bruno
Cadonna proposed to only check that owned tasks match our set of
expected tasks if the topology epochs between the group and the client
match. The aim of this change is to implement a check and a behavior
for the first version of the protocol, which is to always return
`INVALID_REQUEST` if an unknown task is sent to the group coordinator.

We can relax this constraint once we allow topology updating with
topology epochs.

To efficiently check this whenever we receive a heartbeat containing
tasks, we precompute the number of tasks for each subtopology. This also
benefits the performance of the assignor.
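
The precomputation can be pictured as a map from subtopology to task count that is consulted for every owned task in a heartbeat. This is only a sketch under stated assumptions: the names are hypothetical, and it assumes a task is identified by a subtopology ID plus a partition index in the range `0..count-1`.

```java
import java.util.Map;

/** Hypothetical sketch of checking owned tasks against precomputed per-subtopology task counts. */
final class OwnedTaskValidator {
    private final Map<String, Integer> taskCountBySubtopology;

    OwnedTaskValidator(Map<String, Integer> taskCountBySubtopology) {
        // Defensive immutable copy of the counts, computed once per topology.
        this.taskCountBySubtopology = Map.copyOf(taskCountBySubtopology);
    }

    /** Returns true if the task exists in the current topology; otherwise the request is invalid. */
    boolean isKnownTask(String subtopologyId, int partition) {
        Integer taskCount = taskCountBySubtopology.get(subtopologyId);
        return taskCount != null && partition >= 0 && partition < taskCount;
    }
}
```

Each lookup is constant time, so validating a heartbeat costs time proportional to the number of tasks it claims.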

Reviewers: Bill Bejeck <bbejeck@apache.org>
2025-06-04 20:22:52 +02:00
PoAn Yang 949617b0b2
KAFKA-17747: [7/N] Add consumer group integration test for rack aware assignment (#19856)
* Add `RackAwareAssignor`. It uses `racksForPartition` to check the rack
id of a partition and assign it to a member which has the same rack id.
* Add `ConsumerIntegrationTest#testRackAwareAssignment` to check
`racksForPartition` works correctly.
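
The assignment idea can be sketched as follows. This is not the `RackAwareAssignor` added by the PR: the class, method and parameter names are hypothetical, partitions are reduced to plain integers, and rack IDs are assumed to be non-null.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch: give each partition to a member located in one of the partition's racks. */
final class RackAwareAssignmentSketch {

    /**
     * @param racksForPartition partition index to the racks hosting its replicas
     * @param rackByMember      member ID to the rack the member runs in
     * @return partition index to the chosen member ID; partitions without a match are omitted
     */
    static Map<Integer, String> assign(Map<Integer, Set<String>> racksForPartition,
                                       Map<String, String> rackByMember) {
        // Sorted copies keep the result deterministic regardless of input map ordering.
        Map<String, String> members = new TreeMap<>(rackByMember);
        Map<Integer, String> assignment = new TreeMap<>();
        for (Map.Entry<Integer, Set<String>> partition : new TreeMap<>(racksForPartition).entrySet()) {
            for (Map.Entry<String, String> member : members.entrySet()) {
                if (partition.getValue().contains(member.getValue())) {
                    assignment.put(partition.getKey(), member.getKey());
                    break;
                }
            }
        }
        return assignment;
    }
}
```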

Reviewers: David Jacot <djacot@confluent.io>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-06-04 19:32:17 +02:00
David Arthur 70b672b808
KAFKA-19347 Deduplicate ACLs when creating (#19898)
In #19840, we broke de-duplication during ACL creation. This patch fixes
that and adds a test to cover this case.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2025-06-04 12:38:54 -04:00
Ji-Seung Ryu cfd18132e8
KAFKA-19328: SharePartitionManagerTest testMultipleConcurrentShareFetches doAnswer chaining needs verification (#19872)
Hi, I've created pull request.

jira: [19328](https://issues.apache.org/jira/browse/KAFKA-19328)

problem:

1. doAnswer chaining works as intended only when calls are made
sequentially. In a multithreaded environment, its behavior is
unpredictable.
2. Errors in a thread can be swallowed and never seen in the main thread.
3. A chain of 5 doAnswer calls is not enough for 100 threads. The last
answer in the chain is returned in most cases.
4. nextFetchOffset seems to be called before the doAnswer chain, so the
last values (25, 5, 26, 16) were always found in the doAnswer chain.

solution:

Delete the doAnswer chain so that the above four problems disappear.

Reviewers: Abhinav Dixit <adixit@confluent.io>, Apoorv Mittal
 <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-06-04 17:15:18 +01:00
Alieh Saeedi 2a2626b3d8
KAFKA-19244: Add support for kafka-streams-groups.sh options (delete all groups) [2/N] (#19758)
This PR implements all the options for `--delete --group grpId` and
`--delete --all-groups`

Tests:  Integration tests and unit tests.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Andrew Schofield
 <aschofield@confluent.io>
2025-06-04 17:30:07 +02:00
Andrew Schofield 2919478d00
MINOR: LIST_CONFIG_RESOURCES in security.html (#19896)
The `LIST_CLIENT_METRICS_RESOURCES` RPC was generalised to all config
resources in AK 4.1 and the RPC was renamed to `LIST_CONFIG_RESOURCES`.
This PR updates the RPC authorisation table in the documentation.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-06-04 15:44:20 +01:00
Abhinav Dixit 9671cff291
MINOR: replaced DelayedShareFetchTest mockStatic .close functionality with try-with-resources (#19892)
### About
Replaced `.close` functionality with `try-with-resources` for a few
tests in `DelayedShareFetchTest` that require `mockStatic`.
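
Mockito's `MockedStatic` is `AutoCloseable`, which is what makes this refactor possible. The sketch below uses a hypothetical stand-in resource instead of Mockito so that it runs with the JDK alone. It shows that try-with-resources closes the resource even when the test body throws, which a trailing explicit `.close()` call would not guarantee.

```java
/** Hypothetical sketch; FakeStaticMock stands in for Mockito's MockedStatic. */
final class TryWithResourcesDemo {

    static final class FakeStaticMock implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Returns true if the resource was closed although the body threw. */
    static boolean closedAfterFailure() {
        FakeStaticMock mock = new FakeStaticMock();
        try (FakeStaticMock scoped = mock) {
            throw new IllegalStateException("simulated assertion failure");
        } catch (IllegalStateException expected) {
            // close() has already run by the time this catch block executes.
        }
        return mock.closed;
    }
}
```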

### Testing
The code has been tested by running the unit tests.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-06-04 15:32:04 +01:00
Kirk True 1e917906ab
KAFKA-18573: Add support for OAuth jwt-bearer grant type (#19754)
Adding support for the `urn:ietf:params:oauth:grant-type:jwt-bearer`
grant type (AKA `jwt-bearer`). Includes further refactoring of the
existing OAuth layer and addition of generic JWT assertion layer that
can be leveraged in the future.

This constitutes the main piece of the JWT Bearer grant type support.
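
For orientation, RFC 7523 defines the token request for this grant type as a form-encoded body carrying `grant_type` and `assertion`, plus an optional `scope`. The sketch below only builds that body. The class and method names are hypothetical and it does not reflect how Kafka's OAuth layer is implemented.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the RFC 7523 token request body; not Kafka's implementation. */
final class JwtBearerRequestBody {
    static final String GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer";

    /**
     * @param assertion the signed JWT presented to the token endpoint
     * @param scope     optional scope; omitted from the body when null or empty
     */
    static String formBody(String assertion, String scope) {
        StringBuilder body = new StringBuilder()
            .append("grant_type=").append(encode(GRANT_TYPE))
            .append("&assertion=").append(encode(assertion));
        if (scope != null && !scope.isEmpty()) {
            body.append("&scope=").append(encode(scope));
        }
        return body.toString();
    }

    private static String encode(String value) {
        // application/x-www-form-urlencoded encoding, UTF-8
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```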

Forthcoming commits/PRs will include improvements for both the
`client_credentials` and `jwt-bearer` grant types in the following
areas:

* Integration test coverage (KAFKA-19153)
* Unit test coverage (KAFKA-19308)
* Top-level documentation (KAFKA-19152)
* Improvements to and documentation for `OAuthCompatibilityTool`
(KAFKA-19307)

Reviewers: Manikumar Reddy <manikumar@confluent.io>, Lianet Magrans
 <lmagrans@confluent.io>

---------

Co-authored-by: Zachary Hamilton <77027819+zacharydhamilton@users.noreply.github.com>
Co-authored-by: Lianet Magrans <98415067+lianetm@users.noreply.github.com>
2025-06-04 09:01:05 -04:00
Alieh Saeedi ee3f80eda8
MINOR: Add test for kafka-streams-groups.sh list (#19894)
This PR adds integration tests for `--list`

(Transferred from the feature branch `kip1071`)  related ticket:
[KAFKA-18887](https://issues.apache.org/jira/browse/KAFKA-18887)

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-06-04 10:41:28 +02:00
Xuan-Zhang Gong d783f73288
MINOR: Remove unnecessary checks. (#19891)
The `String.split` method never returns an array containing null
elements.
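
A quick way to convince oneself of this, sketched with a hypothetical helper class: `split` may return empty strings, and it drops trailing empty strings, but it never produces `null` entries.

```java
/** Hypothetical demo of String.split behaviour around empty and null elements. */
final class SplitDemo {

    /** Returns true if any element of the array is null. */
    static boolean containsNull(String[] parts) {
        for (String part : parts) {
            if (part == null) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] parts = ",a,,b,,".split(",");
        // Leading and inner empty strings are kept, trailing ones are removed.
        System.out.println(parts.length);        // 4
        System.out.println(containsNull(parts)); // false
    }
}
```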

Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
<s7133700@gmail.com>, Lan Ding <isDing_L@163.com>
2025-06-04 15:33:01 +08:00
Lucas Brutschy b47e2bbed8
KAFKA-19155: Update docs/security.html for streams-related RPCs (#19887)
We need to add the correct ACLs for the streams-related RPCs in
docs/security.html.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-06-04 09:14:18 +02:00
Kaushik Raina b1ea280ab1
KAFKA-19250 : txnProducer.abortTransaction() API should not return abortable exception (#19656)
## Problem
When a `txnProducer.abortTransaction()` operation encounters a
`TRANSACTION_ABORTABLE` error, it currently tries to transition to
`ABORTABLE_ERROR` state. This can create an infinite retry loop since:
1. The abort operation fails with `TRANSACTION_ABORTABLE`
2. We transition to `ABORTABLE_ERROR` state
3. The application receives an instance of
`TransactionAbortableException` and retries the abort
4. The cycle repeats

## Solution
For the `txnProducer.abortTransaction()` API, convert
`TRANSACTION_ABORTABLE` errors to fatal errors (`KafkaException`) during
abort operations to ensure clean transaction termination. This prevents
retry loops by:
1. Treating abort failures as fatal errors at application layer
2. Ensuring the transaction can be cleanly terminated
3. Providing clear error messages to the application

## Changes
- Modified `EndTxnHandler.handleResponse()` to convert
`TRANSACTION_ABORTABLE` errors to `KafkaException` during abort
operations
- Set TransactionManager state to FATAL
- Updated test `testAbortableErrorIsConvertedToFatalErrorDuringAbort` to
verify this behavior

## Testing
- Added test case verifying that abort operations convert
`TRANSACTION_ABORTABLE` errors to `KafkaException`
    - Verified that Commit API with TRANSACTION_ABORTABLE error should
set TM to Abortable state
    - Verified that Abort API with TRANSACTION_ABORTABLE error should
convert to Fatal error i.e. KafkaException

## Impact
At application layer, this change improves transaction reliability by
preventing infinite retry loops during abort operations.

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-06-03 17:27:15 -07:00
Kaushik Raina 8c71ab03b5
KAFKA-19176: Update Transactional producer to translate retriable into abortable exceptions (#19522)
### Problem
- Currently, when a transactional producer encounters retriable errors
(like `COORDINATOR_LOAD_IN_PROGRESS`) and exhausts all retries, it
finally returns a retriable error to the application layer.
- Application retries can cause duplicate records. As a fix, we are
transitioning all retriable errors to abortable errors in the
transactional producer path.

- Additionally added InvalidTxnStateException as part of
https://issues.apache.org/jira/browse/KAFKA-19177

### Solution
- Modified the TransactionManager to automatically transition retriable
errors to abortable errors after all retries are exhausted. This ensures
that applications can abort transaction when they encounter
`TransactionAbortableException`

- A `RefreshRetriableException` such as `CoordinatorNotAvailableException`
will be refreshed internally
[[code](6c26595ce3/clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java (L1702-L1705))]
until retries are exhausted; it will then be treated as a retriable
error and translated to `TransactionAbortableException`

- Similarly for InvalidTxnStateException

### Testing
Added test `testSenderShouldTransitionToAbortableAfterRetriesExhausted`
to verify in sender thread:
- Retriable errors are properly converted to abortable state after
retries
- Transaction state transitions correctly and subsequent operations fail
appropriately with TransactionAbortableException

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-06-03 10:21:22 -07:00
PoAn Yang 2977cb17d0
KAFKA-17747: [6/N] Replace subscription metadata with metadata hash in share group (#19796)
* Use metadata hash to replace subscription metadata.
* Remove `ShareGroupPartitionMetadataKey` and
`ShareGroupPartitionMetadataValue`.
* Use `subscriptionTopicNames` and `metadataImage` to replace
`subscriptionMetadata` in `subscribedTopicsChangeMap` function.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot
<djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-06-03 16:30:39 +01:00
Kaushik Raina 82ea9d0fce
MINOR : Handle error for client telemetry push (#19881)
Update catch to handle compression errors

Before :

![image](https://github.com/user-attachments/assets/c5ca121e-ba0c-4664-91f1-20b54abf67cc)

After
```
Sent message: KR Message 376
[kafka-producer-network-thread | kr-kafka-producer] INFO
org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter -
KR: Failed to compress telemetry payload for compression: zstd, sending
uncompressed data
Sent message: KR Message 377
```

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Bill Bejeck <bbejeck@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-06-03 14:29:44 +01:00
Andrew Schofield 016a4a6c4c
KAFKA-19353: Upgrade note and initial docs for KIP-932 (#19863)
This is the initial documentation for KIP-932 preview in AK 4.1. The aim
is to get very minimal docs in before the cutoff. Longer term, more
comprehensive documentation will be provided for AK 4.2.

The PR includes:
* Generation of group-level configuration documentation
* Add link to KafkaShareConsumer to API docs
* Add a summary of share group rationale to design docs
* Add basic operations information for share groups to ops docs
* Add upgrade note describing arrival of KIP-932 preview in 4.1

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>

---------

Co-authored-by: Apoorv Mittal <apoorvmittal10@gmail.com>
2025-06-03 13:23:11 +01:00
PoAn Yang 425f028556
KAFKA-17747: [5/N] Replace subscription metadata with metadata hash in stream group (#19802)
* Use metadata hash to replace subscription metadata.
* Remove `StreamsGroupPartitionMetadataKey` and
`StreamsGroupPartitionMetadataValue`.
* Check whether `configuredTopology` is empty. If it is, call
`InternalTopicManager.configureTopics` and set the result to the group.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-06-03 13:21:34 +02:00