Adds a test dependency on
[mock-oauth2-server](https://github.com/navikt/mock-oauth2-server/) for
integration tests of the OAuth layer. Also includes fixes for some
regressions that were caught by the integration tests.
Reviewers: Manikumar Reddy <manikumar@confluent.io>, Lianet Magrans
<lmagrans@confluent.io>
When InitProducerId is handled on the transaction coordinator, the
producer epoch is incremented (so that we fence stale requests), then if
a transaction was ongoing during this time, it's aborted. With
transaction version 2 (a.k.a. KIP-890 part 2), abort increments the
producer epoch again (it's part of the new abort / commit protocol),
so the epoch ends up incremented twice.
In most cases, this is benign, but in the case where the epoch of the
ongoing transaction is 32766, it's incremented to 32767, which is the
maximum value for a short. Then, when it's incremented for the second
time, it goes negative, causing an illegal argument exception.
To fix this we just avoid bumping the epoch a second time.
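The wrap-around described above can be reproduced with plain `short` arithmetic. This is only an illustrative sketch: `EpochOverflowSketch` and `bump` are hypothetical names and do not mirror the coordinator code.

```java
// Hypothetical illustration of the double bump; not the coordinator code.
final class EpochOverflowSketch {
    // Increment an epoch stored in a short, wrapping on overflow.
    static short bump(short epoch) {
        return (short) (epoch + 1);
    }

    public static void main(String[] args) {
        short ongoing = 32766;
        short afterInit = bump(ongoing);    // reaches Short.MAX_VALUE
        short afterAbort = bump(afterInit); // second bump wraps to a negative value
        System.out.println(afterInit + " then " + afterAbort);
    }
}
```

Skipping the second bump keeps the epoch at `Short.MAX_VALUE` instead of letting it go negative.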
Reviewers: Justine Olshan <jolshan@confluent.io>, Artem Livshits
<alivshits@confluent.io>
This PR uses the topic IDs received in the assignment (under the new
protocol) to ensure that only these assigned topics are included in the
consumer metadata requests performed when the user subscribes to a
broker-side regex (RE2J).
To handle the edge case where the consumer needs metadata for both topic
IDs (from RE2J) and topic names (from transient topics), the approach is
to first send a request for the transient topics needed temporarily.
Once those are resolved, the request for the topic IDs needed for RE2J
follows. This is because the broker doesn't accept requests for names
and IDs at the same time.
With the changes we also end up fixing another issue (KAFKA-18729) aimed
at avoiding iterating the full set of assigned partitions when checking
if a topic should be retained from the metadata response when using
RE2J.
Reviewers: David Jacot <djacot@confluent.io>
Fixed a long-standing issue where the client JWT validation was decoding
the JWT sections using base 64 instead of URL-safe base 64.
Note: server-side validation leverages the jose4j library for parsing
JWTs, hence no fix is needed there.
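The difference between the two alphabets can be seen with the JDK's `java.util.Base64` alone. This is a minimal sketch, not the client's validation code; `JwtSectionDecodeSketch` and its methods are hypothetical names.

```java
import java.util.Base64;

// Hypothetical illustration; not the client's JWT validation code.
final class JwtSectionDecodeSketch {
    // Decode one JWT section with the URL-safe alphabet ('-' and '_').
    static byte[] decodeUrlSafe(String section) {
        return Base64.getUrlDecoder().decode(section);
    }

    // Returns true if the standard decoder ('+' and '/') rejects the section.
    static boolean standardDecoderRejects(String section) {
        try {
            Base64.getDecoder().decode(section);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Bytes 0xFB 0xFF encode to "-_8" in unpadded URL-safe base 64.
        String section = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(new byte[] {(byte) 0xFB, (byte) 0xFF});
        System.out.println(section + " rejected by standard decoder: "
                + standardDecoderRejects(section));
    }
}
```

A section only trips the standard decoder when it happens to contain `-` or `_`, which may be why the issue went unnoticed for a long time.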
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Manikumar Reddy
<manikumar@confluent.io>
---------
Co-authored-by: Lianet Magrans <98415067+lianetm@users.noreply.github.com>
Consumers can subscribe to an RE2J SubscriptionPattern that will be
resolved and maintained on the server-side (KIP-848). Currently, those
regexes are refreshed on the coordinator when a consumer subscribes to a
new regex, or if there is a new topic metadata image (to ensure regex
resolution stays up-to-date with existing topics).
But with
[KAFKA-18813](https://issues.apache.org/jira/browse/KAFKA-18813), the
topics matching a regex are filtered based on ACLs. This creates a new
situation, as regex resolution does not stay up-to-date as topics become
visible (ACLs added/deleted).
This patch introduces time-based refresh for the subscribed regex by:
- Adding an internal `group.consumer.regex.batch.refresh.max.interval.ms`
config that controls the refresh interval.
- Scheduling a regex refresh when updating a regex subscription if the
latest refresh is older than the max interval.
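The "older than the max interval" check can be sketched as below. The class, method, and parameter names are hypothetical and the coordinator's actual bookkeeping is not shown in this message.

```java
// Hypothetical sketch of the staleness check; names are not from the patch.
final class RegexRefreshSketch {
    // True when the last refresh is older than the configured max interval.
    static boolean isRefreshDue(long lastRefreshMs, long nowMs, long maxIntervalMs) {
        return nowMs - lastRefreshMs > maxIntervalMs;
    }

    public static void main(String[] args) {
        long maxIntervalMs = 500L; // stand-in for the internal config value
        System.out.println(isRefreshDue(0L, 1_000L, maxIntervalMs));
        System.out.println(isRefreshDue(900L, 1_000L, maxIntervalMs));
    }
}
```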
Reviewers: David Jacot <djacot@confluent.io>
Refactor testPartitionMetadataFile to use applyDelta and share
class-level partitions
- Replace deprecated becomeLeaderOrFollower with topicsCreateDelta +
applyDelta
- Test still asserts partition exists, local log exists, and verifies
partitionMetadataFile version (0) and topicId
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Ken Huang <s7133700@gmail.com>,
Chia-Ping Tsai <chia7712@gmail.com>
Update the following tests to avoid using `becomeLeaderOrFollower`:
- testClearPurgatoryOnBecomingFollower
- testDelayedFetchIncludesAbortedTransactions
- testDisabledTransactionVerification
- testFailedBuildRemoteLogAuxStateMetrics
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Remove ReplicaManager#becomeLeaderOrFollower in
`testVerificationErrorConversionsTV1` and
`testVerificationErrorConversionsTV2`.
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Simplify Set initialization and reduce the overhead of creating extra
collections.
The changes mostly include:
- new HashSet<>(List.of(...))
- new HashSet<>(Arrays.asList(...)) / new HashSet<>(asList(...))
- new HashSet<>(Collections.singletonList()) / new
HashSet<>(singletonList())
- new HashSet<>(Collections.emptyList())
- new HashSet<>(Set.of())
This change takes the following into account, and we will not change to
Set.of in these scenarios:
- Require `mutability` (UnsupportedOperationException).
- Allow `duplicate` elements (IllegalArgumentException).
- Allow `null` elements (NullPointerException).
- Depend on `Ordering`. `Set.of` does not guarantee order, so it could
make tests flaky or break public interfaces.
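The first three caveats above can be observed directly with the JDK. The sketch below is illustrative only; `SetOfCaveatsSketch` and `thrownBy` are hypothetical helper names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of why some call sites keep new HashSet<>(...).
final class SetOfCaveatsSketch {
    // Runs the action and returns the simple name of the exception it throws, or "none".
    static String thrownBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Mutation is rejected by the immutable set.
        System.out.println(thrownBy(() -> Set.of("a").add("b")));
        // Duplicate elements are rejected at construction time.
        System.out.println(thrownBy(() -> Set.of("a", "a")));
        // Null elements are rejected at construction time.
        System.out.println(thrownBy(() -> Set.of("a", null)));
        // The HashSet form accepts duplicates and stays mutable.
        System.out.println(thrownBy(() -> new HashSet<>(List.of("a", "a")).add("b")));
    }
}
```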
Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
Bump commons-beanutils for CVE-2025-48734. Since `commons-validator`
hasn't had a new release with a newer `commons-beanutils` version, we
manually bump it in Kafka.
Reviewers: Mickael Maison <mickael.maison@gmail.com>
Update opentelemetry-proto from 1.0.0-alpha to 1.3.2-alpha.
OpenTelemetry-Proto versions from v1.0.0 up to and including v1.3.2
introduce no breaking changes.
[release
note](https://github.com/open-telemetry/opentelemetry-proto/releases)
For example, starting with v1.4.0, protobuf-java was updated to version
4.28.3. To mitigate the risk of protobuf compatibility issues, upgrading
to v1.3.2 first allows the existing protobuf version to remain unchanged
for now.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Log segment closure results in right sizing the segment on disk along
with the associated index files.
This is especially important for TimeIndexes, where a failure to right
size may eventually cause log roll failures leading to under replication
and log cleaner failures.
This change uses `Utils.closeAll` which propagates exceptions, resulting
in an "unclean" shutdown. That would then cause the broker to attempt to
recover the log segment and the index on next startup, thereby avoiding
the failures described above.
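The close-everything-then-propagate pattern can be sketched as follows. This is an assumption-laden illustration of the idea, not the body of `Utils.closeAll`; `CloseAllSketch` is a hypothetical name.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical sketch of the pattern; not Kafka's Utils.closeAll.
final class CloseAllSketch {
    // Closes every resource, then rethrows the first failure with later ones suppressed.
    static void closeAll(Closeable... closeables) throws IOException {
        IOException first = null;
        for (Closeable c : closeables) {
            try {
                if (c != null) c.close();
            } catch (IOException e) {
                if (first == null) first = e;
                else first.addSuppressed(e);
            }
        }
        if (first != null) throw first;
    }

    public static void main(String[] args) {
        try {
            closeAll(
                () -> { throw new IOException("time index resize failed"); },
                () -> System.out.println("log file closed"));
        } catch (IOException e) {
            // The failure surfaces to the caller instead of being swallowed.
            System.out.println("propagated: " + e.getMessage());
        }
    }
}
```

Every resource still gets a close attempt, but a failure in any of them is visible to the caller, which is what allows the shutdown to be treated as unclean.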
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Jun Rao
<junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
For ShareFetch requests, the fetch happens through a DelayedShareFetch
operation. A completed operation still holds a reference to the data
being sent as the response. The operation is watched over multiple keys,
i.e. DelayedShareFetchGroupKey and DelayedShareFetchPartitionKey, so
when it is completed through one watched key, a reference to it is still
present under the other watched key. This means the memory can only be
freed once DelayedOperationPurgatory triggers a purge, which removes the
already completed operation from the remaining keys.
The purge operation depends on the config
`ShareGroupConfig#SHARE_FETCH_PURGATORY_PURGE_INTERVAL_REQUESTS_CONFIG`.
If its value is not smaller than the number of share fetch requests
that can consume the complete memory of the broker, the broker can run
out of memory. This can also be avoided by using a lower fetch max bytes
for requests, but that value is client dependent and so can't be relied
on to protect the broker.
This PR triggers the completion on both watched keys, so the
DelayedShareFetch operation is removed from both keys, which frees the
broker memory as soon as the share fetch response is sent.
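The lingering-reference problem can be modelled with a toy watch list. This is a simplified sketch under stated assumptions: `WatchedKeysSketch` is hypothetical, uses plain strings as keys, and does not reflect how DelayedOperationPurgatory is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; not DelayedOperationPurgatory.
final class WatchedKeysSketch {
    // Watch lists per key; the same operation may appear under several keys.
    private final Map<String, List<Object>> watchers = new HashMap<>();

    void watch(Object operation, String... keys) {
        for (String key : keys) {
            watchers.computeIfAbsent(key, k -> new ArrayList<>()).add(operation);
        }
    }

    // Removes the operation from each given key, dropping empty watch lists.
    void completeOnKeys(Object operation, String... keys) {
        for (String key : keys) {
            List<Object> ops = watchers.get(key);
            if (ops != null) {
                ops.remove(operation);
                if (ops.isEmpty()) watchers.remove(key);
            }
        }
    }

    // Counts how many watch lists still reference the operation.
    int referencesTo(Object operation) {
        int count = 0;
        for (List<Object> ops : watchers.values()) {
            for (Object o : ops) {
                if (o == operation) count++;
            }
        }
        return count;
    }
}
```

Completing through only one key leaves one reference behind until a purge; completing through both keys leaves none, so the response data becomes collectable right away.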
#### Testing
Tested with LocalTieredStorage, where the broker goes OOM after reading
some 8040 messages before the fix, with default configurations as
mentioned in the doc
[here](https://kafka.apache.org/documentation/#tiered_storage_config_ex).
After the fix, consumption continues without any issue and the memory is
released instantaneously.
Reviewers: Jun Rao <junrao@gmail.com>, Andrew Schofield
<aschofield@confluent.io>
Previously, the confirmation prompt for updating the PR body treated any
input other than 'n' as approval, which could lead to unintended
actions.
With this change, the update will only proceed if the user enters 'y',
'Y', or presses Enter. For any other input, the operation is canceled
and an `Abort.` message is printed. This makes the prompt behavior clearer
and more predictable.
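The accepted inputs can be expressed as a small predicate. This sketch is written in Java for illustration only; the class and method are hypothetical, and whether the real script trims whitespace is not stated here, so the sketch does not trim.

```java
// Hypothetical sketch of the prompt rule; the actual script is not shown here.
final class ConfirmPromptSketch {
    // Proceed only on "y", "Y", or an empty line (Enter); anything else aborts.
    static boolean shouldProceed(String input) {
        return input.isEmpty() || input.equalsIgnoreCase("y");
    }

    public static void main(String[] args) {
        for (String input : new String[] {"y", "Y", "", "n", "yes", "x"}) {
            System.out.println("'" + input + "' -> "
                    + (shouldProceed(input) ? "proceed" : "Abort."));
        }
    }
}
```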
Reviewers: TengYao Chi <frankvicky@apache.org>, PoAn Yang
<payang@apache.org>, Kuan-Po Tseng <brandboat@gmail.com>, Ken Huang
<s7133700@gmail.com>, Lan Ding <isDing_L@163.com>
We should be mindful of our users and let them know early if they are
using an unsupported feature in 4.1.
Unsupported features:
- Regular expressions
- Warm-up replicas (high availability assignor)
- Static membership
- Standby replicas enabled through local config
- Named topologies (already checked)
- Non-default kafka-client supplier
Reviewers: Bill Bejeck <bbejeck@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
The original `props.setProperty(TopicConfig.SEGMENT_MS_CONFIG,
config.logSegmentMillis.toString)` in the `KafkaMetadataLog` constructor
was accidentally removed in #19371. Add a test to ensure this property
is properly assigned.
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
jira: https://issues.apache.org/jira/browse/KAFKA-19382
Upgrade junit from 5.10.2 to
[5.13.1](https://github.com/junit-team/junit5/releases).
JUnit 5.12 introduced a new behavior
(89a46dfa10)
that disallows `ClusterTestExtensions` from generating empty invocation
contexts. However, `ClusterTestExtensions` is invoked by a JUnit
extension, so it can produce empty contexts for some tests:
```
> Configure project :
Starting build with version 4.1.0-SNAPSHOT (commit id c4a769bc) using
Gradle 8.14.1, Java 17 and Scala 2.13.16
Build properties: ignoreFailures=false, maxParallelForks=10,
maxScalacThreads=8, maxTestRetries=0
> Task :core:test kafka.api.ConsumerBounceTest.initializationError
failed, log available in
/Users/lansg/Project/OpenSource/kafka/kafka-fork/kafka/core/build/reports/testOutput/kafka.api.ConsumerBounceTest.initializationError.test.stdout
Gradle Test Run :core:test > Gradle Test Executor 5 > ConsumerBounceTest
> testCloseDuringRebalance(String) > initializationError FAILED
org.junit.platform.commons.PreconditionViolationException: Provider
[ClusterTestExtensions] did not provide any invocation contexts, but was
expected to do so. You may override
mayReturnZeroTestTemplateInvocationContexts() to allow this. at
java.base@17.0.13/java.util.ArrayList.forEach(ArrayList.java:1511) at
java.base@17.0.13/java.util.ArrayList.forEach(ArrayList.java:1511)
kafka.api.ConsumerBounceTest.initializationError failed, log available
in
/Users/lansg/Project/OpenSource/kafka/kafka-fork/kafka/core/build/reports/testOutput/kafka.api.ConsumerBounceTest.initializationError.test.stdout
```
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>
Added docs on Enhancements to transactional producer error handling:
* Added standardized exception categories (`RetriableException`,
`RefreshRetriableException`, `AbortableException`,
`ApplicationRecoverableException`, `InvalidConfigurationException`,
`KafkaException`) to ensure clearer error handling patterns.
* Included a link to example template code for handling transaction
exceptions: [Transaction Client
Demo](https://github.com/apache/kafka/blob/trunk/examples/src/main/java/kafka/examples/TransactionalClientDemo.java).
Reviewers: Justine Olshan <jolshan@confluent.io>
To allow intercepting the internal subscribe call to the async-consumer,
we need to extend the ConsumerWrapper interface accordingly, instead of
returning the wrapped async-consumer back to the KS runtime.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
These dependencies have been updated across both files:
- caffeine: From 3.1.8 to 3.2.0
- javassist: From 3.29.2-GA to 3.30.2-GA
- Jetty-related: All Jetty components have been updated from 12.0.15 to
12.0.22, including:
  - jetty-alpn-client
  - jetty-client
  - jetty-ee10-servlet
  - jetty-ee10-servlets
  - jetty-http
  - jetty-io
  - jetty-security
  - jetty-server
  - jetty-session
  - jetty-util
- jose4j: From 0.9.4 to 0.9.6
- Jersey-related: All Jersey components have been updated from 3.1.9 to
3.1.10, including:
  - jersey-client
  - jersey-common
  - jersey-container-servlet
  - jersey-container-servlet-core
  - jersey-hk2
  - jersey-server
- classgraph: From 4.8.173 to 4.8.179
- jline: From 3.25.1 to 3.30.4
- pcollections: From 4.0.1 to 4.0.2
- re2j: From 1.7 to 1.8
- snappy-java: From 1.1.10.5 to 1.1.10.7
New Dependency (LICENSE-binary only)
A new dependency, jspecify-1.0.0, has been added to LICENSE-binary.
gradle/dependencies.gradle Specific Updates
These updates are only reflected in the gradle/dependencies.gradle file:
- bcpkix: From 1.78.1 to 1.80
- bndlib: From 7.0.0 to 7.1.0
- jacoco: From 0.8.10 to 0.8.13
- hamcrest: From 2.2 to 3.0
- jqwik: From 1.8.3 to 1.9.2
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
We can use `pollUntilTrue` instead of `waitForCondition`, so this does a
small refactor to reduce duplicated code.
Reviewers: TengYao Chi <frankvicky@apache.org>, Lan Ding
<isDing_L@163.com>, TaiJuWu <tjwu1217@gmail.com>
## Summary
- MetadataShell may delete the lock file unintentionally when it exits or
fails to acquire the lock. If there's a running server, this causes
unexpected results, as below:
* MetadataShell succeeds on the 2nd run unexpectedly
* Even worse, LogManager/RaftManager's locks also no longer work against
concurrent Kafka process startup
Reviewers: TengYao Chi <frankvicky@apache.org>
See Discussion:
https://github.com/apache/kafka/pull/19371#discussion_r2109549343
Make the following changes:
- Update the internal config name with the metadata prefix
- Add a warning message for setting
`INTERNAL_METADATA_LOG_SEGMENT_BYTES_CONFIG`
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Rewrite PlaintextConsumerSubscriptionTest in Java using the new test
infra and move it to the clients-integration-tests module.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
- Replace the deprecated `becomeLeaderOrFollower` with the
metadata-based `applyDelta` method.
- Add overloaded `topicsCreateDelta` to support custom topic name and
topicId.
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Nick Guo <lansg0504@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Description:
* replace RPC with KRaft mechanism to test activeProducerState in
ReplicaManagerTest
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Move AddPartitionsToTxnManager to server module and convert to Java.
This patch moves AddPartitionsToTxnManager from the core module to the
server module, with its package updated from `kafka.server` to
`org.apache.kafka.server.transaction`. Additionally, several
configurations used by AddPartitionsToTxnManager are moved from
KafkaConfig.scala to AbstractKafkaConfig.java:
- brokerId
- requestTimeoutMs
- controllerListenerNames
- interBrokerListenerName
- interBrokerSecurityProtocol
- effectiveListenerSecurityProtocolMap
The next PR will move AddPartitionsToTxnManagerTest.scala to Java.
Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>
Remove the event IDs from the ApplicationEvent and BackgroundEvent as
they serve no functional purpose other than uniquely identifying events
in the logs.
Reviewers: Andrew Schofield <aschofield@confluent.io>
While reading through the code, I found the method name to be somewhat
ambiguous and not fully descriptive of its purpose.
So I renamed the method to make its purpose clearer and more
self-explanatory. If there was another reason for the original naming,
I’d be happy to hear about it.
Reviewers: Lianet Magrans <lmagrans@confluent.io>
This change rejects non-zero sequences when there is an empty
producerIDState with TV2. The scenario is covered by the retriable
OutOfOrderSequence error.
For Transactions V2 with empty state:
- ✅ Only sequence 0 is allowed for new producers or after state cleanup
(new validation added)
- ❌ Any non-zero sequence is rejected with our specific error message
- ❌ Epoch bumps still require sequence 0 (existing validation remains)
For Transactions V1 with empty state:
- ✅ ANY sequence number is allowed (0, 5, 100, etc.)
- ❌ Epoch bumps still require sequence 0 (existing validation)
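The rules listed above can be transliterated into a single predicate. This is only a sketch of the stated rules; `EmptyStateSequenceSketch` and its parameters are hypothetical and do not correspond to the broker's validation code.

```java
// Hypothetical transliteration of the listed rules; not the broker's code.
final class EmptyStateSequenceSketch {
    // Decides whether an append is accepted when no producer state exists.
    static boolean accepts(boolean transactionV2, boolean epochBump, int sequence) {
        if (epochBump) return sequence == 0;     // existing rule, both versions
        if (transactionV2) return sequence == 0; // new rule for TV2
        return true;                             // TV1 accepts any sequence
    }

    public static void main(String[] args) {
        System.out.println("TV2, seq 0: " + accepts(true, false, 0));
        System.out.println("TV2, seq 5: " + accepts(true, false, 5));
        System.out.println("TV1, seq 100: " + accepts(false, false, 100));
        System.out.println("TV1, epoch bump, seq 5: " + accepts(false, true, 5));
    }
}
```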
Reviewers: Justine Olshan <jolshan@confluent.io>, Artem Livshits
<alivshits@confluent.io>
This pull request introduces a new example application,
`TransactionalClientDemo`, which demonstrates how to use Kafka's
transactional capabilities for exactly-once processing semantics. The
application consumes messages from an input topic, processes them to
generate word count statistics, and produces the results to an output
topic. It also includes robust error handling and transaction
management.
### Key Changes:
* Added `TransactionalClientDemo` class to demonstrate a transactional
Kafka client application. It handles consuming messages, processing
them, and producing results to an output topic while ensuring
exactly-once processing semantics.
* Implements transactional error handling based on KIP-1050 guidelines,
including handling `TransactionAbortableException`,
`InvalidConfigurationException`, `ApplicationRecoverableException`, and
generic `KafkaException`.
Ref :
[KIP-1050](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1050%3A+Consistent+error+handling+for+Transactions)
Reviewers: Justine Olshan <jolshan@confluent.io>, Artem Livshits
<alivshits@confluent.io>
* Add `group.share.assignors` config to `GroupCoordinatorConfig`.
* Send `rackId` in share group heartbeat request if it's not null.
* Add integration test `testShareConsumerWithRackAwareAssignor`.
Reviewers: Lan Ding <53332773+DL1231@users.noreply.github.com>, Andrew
Schofield <aschofield@confluent.io>
---------
Signed-off-by: PoAn Yang <payang@apache.org>
The mapKey optimisation can be used in some KIP-932 RPC schemas to
improve efficiency of some key-based accesses.
* AlterShareGroupOffsetsResponse
* ShareFetchRequest
* ShareFetchResponse
* ShareAcknowledgeRequest
* ShareAcknowledgeResponse
Reviewers: Andrew Schofield <aschofield@confluent.io>
---------
Signed-off-by: PoAn Yang <payang@apache.org>
Updated the code to start the State Updater Thread only after the Stream
Thread is started.
Changes done :
1. Moved the starting of the StateUpdater thread to a new init method in
the TaskManager.
2. Called the init of TaskManager in the run method of the StreamThread.
3. Updated the test cases in the StreamThreadTest to mimic the
aforementioned behaviour.
Reviewers: Bruno Cadonna <cadonna@apache.org>
This PR simplifies two ConcurrentHashMap fields by removing their Atomic
wrappers:
- Change `brokerContactTimesMs` from `ConcurrentHashMap<Integer,
AtomicLong>` to `ConcurrentHashMap<Integer, Long>`.
- Change `brokerRegistrationStates` from `ConcurrentHashMap<Integer,
AtomicInteger>` to `ConcurrentHashMap<Integer, Integer>`.
This removes mutable holders without affecting thread safety (see
discussion in #19828).
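The simplified shape can be sketched as below, assuming values are only ever replaced wholesale per key. `BrokerContactSketch` and its methods are hypothetical names, not the class touched by this PR.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the class changed in this PR.
final class BrokerContactSketch {
    private final ConcurrentHashMap<Integer, Long> contactTimesMs = new ConcurrentHashMap<>();

    // put() replaces the value atomically per key, so no AtomicLong holder is needed.
    void recordContact(int brokerId, long nowMs) {
        contactTimesMs.put(brokerId, nowMs);
    }

    // Returns the last contact time, or -1 if the broker was never seen.
    long lastContactMs(int brokerId) {
        return contactTimesMs.getOrDefault(brokerId, -1L);
    }
}
```

If a read-modify-write were ever needed, `ConcurrentHashMap.compute` or `merge` would keep the update atomic per key without reintroducing the wrappers.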
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Kevin Wu <kevin.wu2412@gmail.com>, Ken Huang
<s7133700@gmail.com>
As part of readying share groups for production, we want to ensure that
the performance of the server-side assignor is optimal. In common with
consumer group assignors, a JMH benchmark is used for the analysis.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
Now that Kafka brokers support Java 17, this PR makes some changes in
the core module. The changes are limited to some Scala files in the core
module's tests. The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()
To be clear, the directories targeted in this PR, from the unit.kafka
module, are:
- log
- network
- security
- tools
- utils
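For the replacements listed above, the factory methods produce collections that compare equal to the older forms. The sketch below is in Java for illustration (the PR itself touches Scala test files), and `ImmutableFactorySketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the replacements; not code from the PR.
final class ImmutableFactorySketch {
    // True when each factory method yields a collection equal to the older form.
    static boolean factoriesAreEquivalent() {
        return List.of().equals(Collections.emptyList())
            && List.of("a").equals(Collections.singletonList("a"))
            && List.of("a", "b").equals(Arrays.asList("a", "b"))
            && Map.of().equals(Collections.emptyMap())
            && Map.of("k", "v").equals(Collections.singletonMap("k", "v"))
            && Set.of("a").equals(Collections.singleton("a"));
    }

    public static void main(String[] args) {
        System.out.println("equivalent: " + factoriesAreEquivalent());
    }
}
```

One difference to keep in mind: `Arrays.asList` permits null elements and `set`, while `List.of` rejects both, so the swap is only safe where neither is relied on.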
Reviewers: TengYao Chi <frankvicky@apache.org>
This PR does the following:
1. Rewrite to the new test infra
2. Rewrite in Java
3. Move to clients-integration-tests
Reviewers: Ken Huang <s7133700@gmail.com>, Kuan-Po Tseng
<brandboat@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
- Moving off deprecated methods
- Fixing argument order for assertEquals(...)
- Few other minor cleanups
Reviewers: PoAn Yang <payang@apache.org>, Lianet Magrans
<lmagrans@confluent.io>, Ken Huang <s7133700@gmail.com>
This PR is part of KIP-1034.
It adds support for the source raw key and the source raw value in the
`ErrorHandlerContext`, which is required by the DLQ routing implemented
in https://github.com/apache/kafka/pull/17942.
Reviewers: Bruno Cadonna <cadonna@apache.org>
Co-authored-by: Damien Gasparina <d.gasparina@gmail.com>
Fix to ensure protocol name comparisons in integration tests ignore case
(the group protocol from the param is lower case, vs the enum name in
upper case).
The tests were not failing, but the custom configs/expectations were not
being applied depending on the protocol (the tests' checks for
"groupProtocol.equals(CLASSIC)" would never be true).
Found all comparisons with equals against the constant name and fixed
them (not too many luckily).
I did consider changing the protocol param that is passed to every test
(that is now lowercase), but still, it seems more robust to have the
tests ignore case.
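The silent mismatch can be sketched as follows. The nested `GroupProtocol` enum is a hypothetical stand-in for the one the tests use, and the method names are illustrative only.

```java
// Hypothetical sketch; the enum is a stand-in for the one used by the tests.
final class ProtocolCompareSketch {
    enum GroupProtocol { CLASSIC, CONSUMER }

    // Case-sensitive check: never true for a lowercase test parameter.
    static boolean isClassicStrict(String param) {
        return param.equals(GroupProtocol.CLASSIC.name());
    }

    // Case-insensitive check: matches both "classic" and "CLASSIC".
    static boolean isClassic(String param) {
        return param.equalsIgnoreCase(GroupProtocol.CLASSIC.name());
    }

    public static void main(String[] args) {
        System.out.println("strict: " + isClassicStrict("classic"));
        System.out.println("ignore case: " + isClassic("classic"));
    }
}
```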
Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>