Commit Graph

16390 Commits

Author SHA1 Message Date
OuO 27647c7c7c
MINOR: Remove the MetaLogShim namings (#20357)
Correct parameter name from `logManager` to `raftClient` (leftover from
PR #10705)

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-16 02:02:56 +08:00
majialong 5078b228f5
MINOR: Remove broken link in CoordinatorMetricsShard javadoc (#20355)
CoordinatorMetricsShard was split into a separate module in
(https://github.com/apache/kafka/pull/16883), causing the link in the
javadoc to become invalid.

So, remove broken link in CoordinatorMetricsShard javadoc.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Sanskar Jhajharia
<sjhajharia@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-16 00:02:32 +08:00
Kevin Wu 833e25f015
KAFKA-19605; Fix the busy loop occurring in kraft client observers (#20354)
The broker observer should not read the update voter set timer value when
polling to determine its backoff, since brokers cannot auto-join the
KRaft voter set. If auto-join or kraft.version=1 is not supported,
controller observers should not read this timer either when polling.

The updateVoterSetPeriodMs timer is not something that should be
considered when calculating the backoff returned by polling, since this
timer does not represent the same thing as the fetchTimeMs timer.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, José Armando García
 Sancio <jsancio@apache.org>, Alyssa Huang <ahuang@confluent.io>,
 Kuan-Po Tseng <brandboat@gmail.com>
2025-08-15 10:43:46 -04:00
Jhen-Yung Hsu 55260e9835
KAFKA-19042: Move AdminClientWithPoliciesIntegrationTest to clients-integration-tests module (#20339)
This PR does the following:

- Rewrite to the new test infra.
- Rewrite in Java.
- Move to clients-integration-tests.
- Add an `ensureConsistentMetadata` method to `ClusterInstance`,
similar to `ensureConsistentKRaftMetadata` in the old infra, and
refactor related code.

Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
 <s7133700@gmail.com>
2025-08-15 17:44:47 +08:00
stroller 58d894170a
MINOR: Fix typo in `AdminBootstrapAddresses` (#20352)
Fix the typo in `AdminBootstrapAddresses`.

Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
<s7133700@gmail.com>
2025-08-15 14:19:40 +08:00
Ming-Yen Chung c4fb1008c4
MINOR: Use lambda expressions instead of ImmutableValue for Gauges (#20351)
Refactor metric gauges instantiation to use lambda expressions instead
of ImmutableValue.
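
As a rough illustration of this kind of refactoring, here is a minimal sketch. It does not use the real Kafka metrics classes: `SimpleGauge` and `FixedValue` are hypothetical stand-ins, assuming the gauge type is a single-method interface that a lambda can implement directly.

```java
final class GaugeSketch {
    // Hypothetical single-method gauge interface; not the real Kafka metrics API.
    @FunctionalInterface
    interface SimpleGauge<T> {
        T value();
    }

    // Hypothetical wrapper class of the kind a lambda can replace.
    static final class FixedValue<T> implements SimpleGauge<T> {
        private final T value;

        FixedValue(T value) {
            this.value = value;
        }

        @Override
        public T value() {
            return value;
        }
    }

    // Before: a dedicated wrapper object holds the value.
    static SimpleGauge<Integer> withWrapper() {
        return new FixedValue<>(42);
    }

    // After: the lambda implements the interface directly.
    static SimpleGauge<Integer> withLambda() {
        return () -> 42;
    }

    public static void main(String[] args) {
        System.out.println(withWrapper().value() + " " + withLambda().value());
    }
}
```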

Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
 <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-14 20:35:21 +08:00
Robert Young 3067f15caf
KAFKA-19596: Improve visibility when topic auto-creation fails (#20340)
Log a warning for each topic that failed to be created as a result of an
automatic creation. This makes the underlying cause more visible so
users can take action.

Previously, at the default log level, you could only see logs that the
broker was attempting to autocreate topics. If the creation failed, then
it was logged at debug.

Signed-off-by: Robert Young <robertyoungnz@gmail.com>

Reviewers: Luke Chen <showuon@gmail.com>, Kuan-Po Tseng <brandboat@gmail.com>
2025-08-14 10:47:12 +08:00
Kevin Wu 92d8cb562a
KAFKA-19078 Automatic controller addition to cluster metadata partition (#19589)
Add the `controller.quorum.auto.join.enable` configuration. When enabled
with KIP-853 supported, follower controllers who are observers (their
replica id + directory id are not in the voter set) will:

- Automatically remove voter set entries which match their replica id
but not directory id by sending the `RemoveVoterRPC` to the leader.
- Automatically add themselves as a voter when their replica id is not
present in the voter set by sending the `AddVoterRPC` to the leader.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Chia-Ping
 Tsai <chia7712@gmail.com>
2025-08-13 23:20:18 +08:00
Sanskar Jhajharia dbf3808f53
MINOR: Add test coverage for StorageTool format command feature validation (#20303)
### Summary
Adds comprehensive test coverage for the StorageTool format command
feature validation, including tests for valid feature overrides, invalid
feature detection, and multiple feature specifications. Also adds debug
output to help with troubleshooting format operations.

### Changes Made

#### Test Coverage Improvements
- **`testFormatWithReleaseVersionAndFeatureOverride()`**: Tests that
feature overrides work correctly when specified with `--feature` flag

- **`testFormatWithInvalidFeatureThrowsError()`**: Tests error handling
for unsupported features

- **`testFormatWithMultipleFeatures()`**: Tests multiple feature
specifications in a single format command

#### Debug Output Enhancement
- **Formatter.java**: Added debug output to print bootstrap metadata
during format operations
  - Helps with troubleshooting format issues by showing the complete
bootstrap metadata being written
  - Improves visibility into what features and configurations are being
applied

#### Test Updates
- **FormatterTest.java**: Updated existing tests to account for the new
debug output

### Related
- KIP-1022: [Formatting and Updating Features](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1022%3A+Formatting+and+Updating+Features)

Reviewers: Kevin Wu <kevin.wu2412@gmail.com>, Justine Olshan
 <jolshan@confluent.io>
2025-08-12 12:51:39 -07:00
Sanskar Jhajharia 9983331d91
Revert "MINOR: Update apacheds to 2.0.0.AM26 (#19565)" (#20341)
This reverts commit 9a0239a032.

The version bump causes build failure on trunk:
https://github.com/apache/kafka/pull/19565#issuecomment-3177841140

Reviewers: Luke Chen <showuon@gmail.com>
2025-08-12 16:11:09 +08:00
Xuan-Zhang Gong 2e774cca4f
MINOR: Remove unnecessary checks in `TopicDelta` (#20337)
The `directories` field has already been validated in the constructor, so
there’s no need to check for null.


7d2ad18520/metadata/src/main/java/org/apache/kafka/metadata/PartitionRegistration.java (L221)

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Yung <yungyung7654321@gmail.com>, Ken Huang
<s7133700@gmail.com>
2025-08-12 12:37:18 +08:00
Gantigmaa Selenge 9a0239a032
MINOR: Update apacheds to 2.0.0.AM26 (#19565)
Dependencies such as api-ldap-client-api and mina-core used by the
current version of apacheds have several critical CVEs: CVE-2018-1337,
CVE-2024-52046 and CVE-2019-0231.

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-12 11:29:37 +08:00
TaiJuWu 18045c6ac3
KAFKA-19592: testGenerateAssignmentWithBootstrapServer uses wrong JSON format (#20336)
This PR does the following:
1. Uses the correct JSON format in the test.
2. Converts `PartitionReassignmentState` and `VerifyAssignmentResult`
into records.

Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
 <s7133700@gmail.com>, PoAn Yang <payang@apache.org>
2025-08-11 20:14:06 +08:00
Stig Døssing 7d2ad18520
KAFKA-19580 Upgrade to spotbugs 4.9.4 (#20333)
Use Java 24 for the Spotbugs checks, now that Spotbugs works on Java
24.

Added some more warning exclusions for warnings that are new to 4.9.4.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-10 23:38:15 +08:00
Chirag Wadhwa 43a25043dd
KAFKA-19567: Added the check for underlying partition being the leader in delayedShareFetch tryComplete method (#20280)
In the current implementation, some delayed share fetch operations get
trapped in the delayed share fetch purgatory when partition leadership
changes during share consumption. This is because there is no check in
the code to make sure the current broker is still the leader of the
partitions corresponding to the share partitions. So, when leadership
changes, the share partitions cannot be acquired, because they have
already been fenced, and tryComplete returns false. The operation does
get completed when its timer expires, but by then it is too late, and
the operation stays stuck in the watchers list waiting to be purged,
which happens once the estimated operations increase to more than
1000. This PR resolves this by adding the required check so that if
partition leadership changes, the delayed share fetches waiting on
it are completed immediately.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield
 <aschofield@confluent.io>
2025-08-10 10:14:58 +01:00
Clemens Hutter 8deb6c6911
MINOR: Remove SPAM URL in Streams Documentation (#20321)
The previous URL http://lambda-architecture.net/ seems to now be controlled by spammers

Co-authored-by: Shashank <hsshashank.grad@gmail.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2025-08-08 11:17:36 +02:00
OuO ba97558bfe
MINOR: Improve RLMM doc (#20306)
Improve RLMM doc:
1. Distinguish RLMM configs from other tiered storage configs. All RLMM
configs need to start with a specific prefix, but the original
documentation does not describe this.
2. Add a description of the additional client configs, which are required
when configuring authentication information. The missing description can
confuse users, for example: Aiven-Open/tiered-storage-for-apache-kafka#681

Reviewers: Luke Chen <showuon@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-08 16:30:42 +08:00
Shivsundar R bf42924126
KAFKA-19572: Added check to prevent NPE logs during ShareConsumer::close (#20290)
*What*
https://issues.apache.org/jira/browse/KAFKA-19572

- If a `ShareConsumer` constructor failed due to any exception, then we
call `close()` in the catch block.

- If there were uninitialized members accessed during `close()`, then it
would throw an NPE. Currently there are no null checks, hence we were
attempting to use these fields during `close()` execution.

- To avoid this, the PR adds null checks in the `close()` function before we
access fields which could possibly be null.
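
A minimal sketch of the pattern described above, not the actual `ShareConsumer` code: the class `SketchConsumer` and its fields are hypothetical and only illustrate guarding `close()` against members that the constructor never got to initialize.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a consumer whose constructor may fail part-way.
final class SketchConsumer implements Closeable {
    // Records which resources were closed, purely for demonstration.
    static final List<String> CLOSED = new ArrayList<>();

    private Closeable network;   // may still be null if construction failed
    private Closeable metrics;   // may still be null if construction failed

    SketchConsumer(boolean failEarly) {
        try {
            network = () -> CLOSED.add("network");
            if (failEarly) {
                throw new IllegalStateException("simulated constructor failure");
            }
            metrics = () -> CLOSED.add("metrics");
        } catch (RuntimeException e) {
            close();             // clean up whatever was initialized so far
            throw e;
        }
    }

    @Override
    public void close() {
        // Null checks keep close() from throwing an NPE on a half-built instance.
        closeQuietly(metrics);
        closeQuietly(network);
    }

    private static void closeQuietly(Closeable c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (Exception e) {
            // Best effort: ignore failures while cleaning up.
        }
    }
}
```

With `new SketchConsumer(true)`, the constructor still throws the original exception, but `close()` only touches the field that was actually set.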

Reviewers: Apoorv Mittal <amittal@confluent.io>, Lianet Magrans
 <lmagrans@confluent.io>
2025-08-07 15:33:28 -04:00
Ming-Yen Chung 9d5dd079fe
KAFKA-18068 Document deprecation of PARTITIONER_ADPATIVE_PARTITIONING_ENABLE_CONFIG (#20322)
Document deprecation of PARTITIONER_ADPATIVE_PARTITIONING_ENABLE_CONFIG
in `upgrade.html`, which was missed in #20317

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-08-08 02:30:24 +08:00
Apoorv Mittal dc96e29499
KAFKA-19476: Correcting max delivery on write state failure and lock timeout (#20310)
Fixing max delivery check on acquisition lock timeout and write state
RPC failure.

When the acquisition lock has already timed out and a write state RPC
failure occurs, we need to check whether records need to be archived.
However, with this fix we do not persist that information immediately,
even though it is relevant because some records may be archived or have
their delivery count bumped. The information will be persisted eventually.

The persister call has already failed, so issuing another persister
call in response to that failure is not correct. Instead, let
the data be persisted by future persister calls.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Abhinav Dixit
 <adixit@confluent.io>
2025-08-07 19:22:00 +01:00
Sanskar Jhajharia cfe483b728
MINOR: Cleanup Trogdor Module (#20214)
Now that Kafka supports Java 17, this PR makes some changes in the `trogdor`
module. The changes mostly include:

- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()

It also includes some minor cleanups around the use of enhanced switch
blocks and the conversion of classes to record classes.
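
The collection replacements listed above can be sketched as follows. This is an illustrative example only; `CollectionFactoriesSketch` and its values are hypothetical and not taken from the `trogdor` module.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CollectionFactoriesSketch {
    // Before: Collections.emptyList()
    static List<String> none() {
        return List.of();
    }

    // Before: Collections.singletonList("a")
    static List<String> one() {
        return List.of("a");
    }

    // Before: Arrays.asList("a", "b")
    static List<String> two() {
        return List.of("a", "b");
    }

    // Before: Collections.singletonMap("k", 1)
    static Map<String, Integer> single() {
        return Map.of("k", 1);
    }

    // Before: Collections.singleton("x")
    static Set<String> only() {
        return Set.of("x");
    }

    public static void main(String[] args) {
        System.out.println(none() + " " + one() + " " + two() + " " + single() + " " + only());
    }
}
```

Note that the `of()` factories return unmodifiable collections and reject `null` elements, whereas `Arrays.asList()` permits `set()` and `null` entries, so the swap is not always a pure drop-in.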

Reviewers: Ken Huang <s7133700@gmail.com>, Vincent Jiang
 <vpotucek@me.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-07 23:22:32 +08:00
Apoorv Mittal ddab943b0b
KAFKA-18265: Move persister call outside of the lock (3/N) (#20316)
Minor PR to move persister call outside of the lock. The lock is not
required while making the persister call.
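
A minimal sketch of the idea, assuming a simple `ReentrantLock`-guarded state; `LockScopeSketch`, `persist` and the fields are hypothetical names, not the share partition code. The state is mutated and snapshotted under the lock, and the (potentially slow) persister call is made only after the lock is released.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantLock;

final class LockScopeSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int pendingState = 0;

    // Records whether the lock was held when persist() ran, for demonstration.
    boolean lockHeldDuringPersist = true;

    // Hypothetical stand-in for a slow remote persister call.
    private CompletableFuture<Integer> persist(int snapshot) {
        lockHeldDuringPersist = lock.isHeldByCurrentThread();
        return CompletableFuture.completedFuture(snapshot);
    }

    CompletableFuture<Integer> update(int delta) {
        int snapshot;
        lock.lock();
        try {
            // Only the in-memory state change needs the lock.
            pendingState += delta;
            snapshot = pendingState;
        } finally {
            lock.unlock();
        }
        // The persister call happens after the lock has been released.
        return persist(snapshot);
    }
}
```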

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Abhinav Dixit <adixit@confluent.io>
2025-08-07 13:11:49 +01:00
Apoorv Mittal f12a9d8413
KAFKA-19464: Remove unnecessary update for find next fetch offset (#20315)
This PR removes unnecessary updates of the find next fetch offset. While a
state transition is in progress and not yet completed, the respective
offsets should not be considered for acquisition anyway. The find next
fetch offset is updated once the transition is completed.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Abhinav Dixit
<adixit@confluent.io>
2025-08-07 13:11:07 +01:00
Ming-Yen Chung 2329def2ff
KAFKA-18068: Fix the typo `PARTITIONER_ADPATIVE_PARTITIONING_ENABLE` in ProducerConfig (#20317)
Fixes a typo in ProducerConfig: renames
`PARTITIONER_ADPATIVE_PARTITIONING_ENABLE_CONFIG` →
`PARTITIONER_ADAPTIVE_PARTITIONING_ENABLE_CONFIG`

The old key is retained for backward compatibility.

See: [KIP-1175: Fix the typo `PARTITIONER_ADPATIVE_PARTITIONING_ENABLE`
in ProducerConfig](https://cwiki.apache.org/confluence/x/KYogFQ)

Reviewers: Yung <yungyung7654321@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>, Nick Guo
<lansg0504@gmail.com>, Ranuga Disansa <go2ranuga@gmail.com>
2025-08-07 17:28:17 +08:00
Federico Valeri 21db7e00bc
KAFKA-19573: Update num.recovery.threads.per.data.dir configs (#20299)
The default value of `num.recovery.threads.per.data.dir` is now 2
according to KIP-1030. We should update config files which are still
setting 1.

---------

Signed-off-by: Federico Valeri <fedevaleri@gmail.com>

Reviewers: Luke Chen <showuon@gmail.com>
2025-08-07 14:30:10 +08:00
Jinhe Zhang 03190e4c22
MINOR: retry upon missing source topic (#20284)
Implements a timeout mechanism (using maxPollTimeMs) that waits for
missing source topics to be created before failing, instead of
immediately throwing exceptions in the new Streams protocol.
Additionally, throws a TopologyException when a partition count mismatch
is detected.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Alieh Saeedi
 <asaeedi@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2025-08-06 15:32:48 -07:00
Chih-Yuan Chien 1b588afb96
MINOR: Fix the link for the heading Errant Record Reporter in connect.html (#20313)
The link for the heading Errant Record Reporter is missing the # symbol,
which is causing it to redirect to a 404 Not Found page. Please refer
to the updated preview (screenshot: kafka-site-preview).

Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
 <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-07 00:31:46 +08:00
Lan Ding 71442bf42f
MINOR: cleanup in QuotaFactory (#20312)
cleanup in QuotaFactory.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-07 00:05:45 +08:00
Matthias J. Sax 4a6a5466fa
MINOR: add missing section to TOC (#20305)
Add new group coordinator metrics section to TOC.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-05 14:27:09 -07:00
Chia-Ping Tsai eab771fae3
MINOR: update GitHub collaborators (#20309)
based on
https://github.com/apache/kafka/graphs/contributors?from=2024%2F8%2F3

Reviewers: PoAn Yang <payang@apache.org>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Ken Huang
 <s7133700@gmail.com>, Nick Guo <lansg0504@gmail.com>
2025-08-06 04:03:59 +08:00
Stig Døssing 25da705178
MINOR: Run CI with Java 24 (#20295)
This commit updates CI to test against Java 24 instead of Java 23 which
is EOL.

Due to Spotbugs not having released version 4.9.4 yet, we can't run
Spotbugs on Java 24. Instead, we are choosing to run Spotbugs, and the
rest of the compile and validate build step, on Java 17 for now.

Once 4.9.4 has been released, we will switch to using Java 24 for this.

Exclude Spotbugs from the run-tests gradle action. Spotbugs is already
being run once in the build by "compile and validate", so there is no
reason to run it again as part of executing tests.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-05 21:26:13 +08:00
Sanskar Jhajharia b9413ea4d6
MINOR: Cleanup Tools Module (2/n) (#20096)
Now that Kafka supports Java 17, this PR makes some changes in the tools
module. The changes in this PR are limited to only some files; future
PRs will follow. The changes mostly include:

- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()

Some minor changes to use the enhanced switch.

Sub modules targeted: tools/src/test

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-05 21:16:01 +08:00
Ken Huang a12d38f091
MINOR: Cleanup RaftClient#upgradeKRaftVersion java document (#20275)
The compiler warning is due to a lack of import. This patch imports the ApiException to fix it.

Reviewers: TengYao Chi <frankvicky@apache.org>, Yung
<yungyung7654321@gmail.com>
2025-08-05 19:06:17 +08:00
Priyanka K U eb9f5189f5
KAFKA-16913: Support external schemas in JSONConverter (#19449)
When using a connector that requires a schema, such as JDBC connectors,
with JSON messages, the current JSONConverter necessitates including the
schema within every message. To address this, we are introducing a new
parameter, schema.content, which allows you to provide the schema
externally. This approach not only reduces the size of the messages but
also facilitates the use of more complex schemas.

KIP :
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1054%3A+Support+external+schemas+in+JSONConverter

Reviewers: Mickael Maison <mickael.maison@gmail.com>, TengYao Chi <frankvicky@apache.org>, Edoardo Comar <ECOMAR@uk.ibm.com>
2025-08-05 10:00:14 +02:00
majialong e78977e505
KAFKA-19579 Add missing metrics for doc tiered storage (#20304)
Add missing metrics to the tiered storage documentation:

- `kafka.log.remote:type=RemoteLogManager,name=RemoteLogReaderFetchRateAndTimeMs`: introduced in [KIP-1018](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests)
- `kafka.server:type=DelayedRemoteListOffsetsMetrics,name=ExpiresPerSec,topic=([-.\w]),partition=([0-9])`: introduced in [KIP-1075](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1075%3A+Introduce+delayed+remote+list+offsets+purgatory+to+make+LIST_OFFSETS+async)

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Lan Ding
 <isDing_L@163.com>, Kamal Chandraprakash
 <kamal.chandraprakash@gmail.com>
2025-08-05 13:25:24 +08:00
Jared Harley 66b3c07954
KAFKA-19576 Fix typo in state-change log filename after rotate (#20269)
The `state-change.log` file is being incorrectly rotated to
`stage-change.log.[date]`. This change fixes the typo to have the log
file correctly rotated to `state-change.log.[date]`

_No functional changes._

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Christo Lolov
 <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Ken Huang
 <s7133700@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-08-05 12:22:54 +08:00
Federico Valeri cd9dde11de
MINOR: Improve skip-record-metadata description (#20291)
This flag also skips control records, so the description needs to be
updated.

---------

Signed-off-by: Federico Valeri <fedevaleri@gmail.com>

Reviewers: Luke Chen <showuon@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Vincent Potucek
2025-08-05 08:50:50 +08:00
Stig Døssing cb9e7fe1c6
MINOR: Upgrade Spotbugs to 4.9.1 (#20294)
Add exclusions for new warnings to allow this upgrade.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-05 02:50:47 +08:00
Luke Chen 657b496f3c
MINOR: improve the min.insync.replicas doc (#20237)
Along with the change https://github.com/apache/kafka/pull/17952
([KIP-966](https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas)),
the semantics of the `min.insync.replicas` config changed slightly and
some constraints were added. We should document them clearly.

Reviewers: Jun Rao <junrao@gmail.com>, Calvin Liu <caliu@confluent.io>,
 Mickael Maison <mickael.maison@gmail.com>, Paolo Patierno
 <ppatierno@live.com>, Federico Valeri <fedevaleri@gmail.com>, Chia-Ping
 Tsai <chia7712@gmail.com>
2025-08-05 00:22:13 +08:00
Sean Quah 904ee87b85
MINOR: Add missing test coverage for OffsetFetchResponse.errorCounts() (#20263)
OffsetFetchResponses can have three different error structures depending
on the version. Version 2 adds a top level error code for group-level
errors. Version 8 adds support for querying multiple groups at a time
and nests the fields within a groups array. Add a test for the
errorCounts implementation since it varies depending on the version.

Reviewers: Dongnuo Lyu <dlyu@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-08-04 17:50:37 +08:00
Chang-Chi Hsu 888861d803
MINOR: Replace boundPort with brokerBoundPort (#20297)
## Changes:
- Replaced all references to boundPort with brokerBoundPort.

## Reasons
- boundPort and brokerBoundPort share the same definition and behavior.

Reviewers: TaiJuWu <tjwu1217@gmail.com>, Ken Huang <s7133700@gmail.com>,
 Chia-Ping Tsai <chia7712@gmail.com>
2025-08-04 16:44:12 +08:00
Federico Valeri e67c042f7f
KAFKA-18607: Update jfreechart dependency (#20264)
This patch updates the code and the dependency with the latest namespace
and version.

Signed-off-by: Federico Valeri <fedevaleri@gmail.com>

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2025-08-04 10:37:36 +02:00
PoAn Yang ea771563e0
KAFKA-14604: SASL session expiration time will be overflowed when calculation (#18526)
The timeout value may overflow if users set a large expiration
time.

```
sessionExpirationTimeNanos = authenticationEndNanos + 1000 * 1000 *
sessionLifetimeMs;
```

Fixed by throwing an exception if the value overflows.
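
One way to detect this kind of overflow is with the JDK's exact-arithmetic helpers. The sketch below is an assumption about how such a check could look, not the code from the PR: the class name `SessionExpirySketch` and the choice of `IllegalArgumentException` are illustrative.

```java
final class SessionExpirySketch {
    // Computes authenticationEndNanos + 1000 * 1000 * sessionLifetimeMs,
    // failing loudly instead of silently wrapping around on overflow.
    static long sessionExpirationTimeNanos(long authenticationEndNanos, long sessionLifetimeMs) {
        try {
            long lifetimeNanos = Math.multiplyExact(1000L * 1000L, sessionLifetimeMs);
            return Math.addExact(authenticationEndNanos, lifetimeNanos);
        } catch (ArithmeticException e) {
            // The exception type thrown here is an assumption for illustration.
            throw new IllegalArgumentException(
                "Session lifetime of " + sessionLifetimeMs + " ms overflows the expiration time", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sessionExpirationTimeNanos(10L, 2L));
    }
}
```

`Math.multiplyExact` and `Math.addExact` throw `ArithmeticException` when the `long` result would overflow, which is what the plain `*` and `+` operators in the original expression do not do.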

Reviewers: TaiJuWu <tjwu1217@gmail.com>, Luke Chen <showuon@gmail.com>,
 TengYao Chi <frankvicky@apache.org>

Signed-off-by: PoAn Yang <payang@apache.org>
2025-08-03 19:12:04 +08:00
Tsung-Han Ho (Miles Ho) 3f1d830174
MINOR: Remove duplicate renewTimePeriodOpt in DelegationTokenCommand validation (#20177)
The bug was a duplicate parameter validation in the
`DelegationTokenCommand` class.  The `checkInvalidArgs` method for the
`describeOpt` was incorrectly including `renewTimePeriodOpt` twice in
the set of invalid arguments.

This bug caused unexpected command errors during E2E testing.
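
The failure mode can be reproduced in isolation, since `Set.of` rejects duplicate elements at construction time. This is a minimal sketch with plain strings; `DuplicateSetSketch` and the `"other-option"` value are hypothetical and do not mirror the real option objects in `DelegationTokenCommand`.

```java
import java.util.Set;

final class DuplicateSetSketch {
    // Returns the exception message if Set.of rejects the arguments.
    static String tryBuild() {
        try {
            // "renew-time-period" is listed twice, as in the bug described above.
            Set<String> invalid = Set.of("renew-time-period", "other-option", "renew-time-period");
            return "built a set of size " + invalid.size();
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryBuild());
    }
}
```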

### Before the fix:
The following command would fail due to the duplicate validation logic:
```
TC_PATHS="tests/kafkatest/tests/core/delegation_token_test.py::DelegationTokenTest"
/bin/bash tests/docker/run_tests.sh
```

### Error output:
```
ducktape.cluster.remoteaccount.RemoteCommandError: ducker@ducker03:
Command
'KAFKA_OPTS="-Djava.security.auth.login.config=/mnt/security/jaas.conf
-Djava.security.krb5.conf=/mnt/security/krb5.conf"
/opt/kafka-dev/bin/kafka-delegation-tokens.sh --bootstrap-server
ducker03:9094  --create  --max-life-time-period -1  --command-config
/mnt/kafka/client.properties > /mnt/kafka/delegation_token.out' returned
non-zero exit status 1. Remote error message: b'duplicate element:
[renew-time-period]\njava.lang.IllegalArgumentException: duplicate
element: [renew-time-period]\n\tat
java.base/java.util.ImmutableCollections$SetN.<init>(ImmutableCollections.java:918)\n\tat
java.base/java.util.Set.of(Set.java:544)\n\tat
org.apache.kafka.tools.DelegationTokenCommand$DelegationTokenCommandOptions.checkArgs(DelegationTokenCommand.java:304)\n\tat
org.apache.kafka.tools.DelegationTokenCommand.execute(DelegationTokenCommand.java:79)\n\tat
org.apache.kafka.tools.DelegationTokenCommand.mainNoExit(DelegationTokenCommand.java:57)\n\tat
org.apache.kafka.tools.DelegationTokenCommand.main(DelegationTokenCommand.java:52)\n\n'

[INFO:2025-07-31 11:27:25,531]: RunnerClient:
kafkatest.tests.core.delegation_token_test.DelegationTokenTest.test_delegation_token_lifecycle.metadata_quorum=ISOLATED_KRAFT:
Data: None
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id:       2025-07-31--002
run time:         33.213 seconds
tests run:        1
passed:           0
flaky:            0
failed:           1
ignored:          0
================================================================================
test_id:
kafkatest.tests.core.delegation_token_test.DelegationTokenTest.test_delegation_token_lifecycle.metadata_quorum=ISOLATED_KRAFT
status:     FAIL
run time:   33.090 seconds
```

### After the fix:
The same command now executes successfully:
```
TC_PATHS="tests/kafkatest/tests/core/delegation_token_test.py::DelegationTokenTest"
/bin/bash tests/docker/run_tests.sh
```

### Success output:
```
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id:       2025-07-31--001
run time:         35.488 seconds
tests run:        1
passed:           1
flaky:            0
failed:           0
ignored:          0
================================================================================
test_id:
kafkatest.tests.core.delegation_token_test.DelegationTokenTest.test_delegation_token_lifecycle.metadata_quorum=ISOLATED_KRAFT
status:     PASS
run time:   35.363 seconds
--------------------------------------------------------------------------------
```

Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TengYao Chi
 <frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>, PoAn Yang
 <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-08-03 16:40:18 +08:00
Apoorv Mittal 05d71ad1a8
KAFKA-19476: Concurrent execution fixes for lock timeout and lso movement (#20286)
The PR fixes the following:

1. If a share partition arrives at a state which should be treated as the
final state of a batch/offset (for example, LSO movement which causes the
offset/batch to be ARCHIVED permanently), the result of pending write
state RPCs for that offset/batch overrides the ARCHIVED state. Hence,
track such updates and apply them when the transition is completed.

2. If an acquisition lock timeout occurs while an offset/batch is
undergoing transition, followed by a write state RPC failure, then the
respective batch/offset can land in a scenario where the offset stays in
the ACQUIRED state with no acquisition lock timeout task.

3. If a timer task is cancelled, concurrent execution of the timer task
and an acknowledgement can lead to a scenario where the timer task is
processed after cancellation. Hence, it can mark the offset/batch
re-available despite it already being acknowledged.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit
 <adixit@confluent.io>
2025-08-01 23:20:25 +01:00
Andrew Schofield b909544e99
MINOR: Improve consistency of acknowledge type terminology (#20282)
The code had a mixture of "acknowledgement type" and "acknowledge type".
The latter is preferred.

Reviewers: TengYao Chi <frankvicky@apache.org>, Lan Ding
 <isDing_L@163.com>
2025-08-01 21:17:22 +01:00
Shivsundar R e1f45218c9
KAFKA-19485 (II) : Complete any pending acknowledgements in ShareFetch on an error response. (#20247)
*What*
Previously, when we received a top-level error response in ShareFetch, we
would log the error, update the share session epoch and proceed to the
next request.
However, any pending acknowledgements were not completed and the callback
was not processed.

This PR addresses this by completing these acknowledgements with the
error code from the response in this case.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-07-31 22:07:44 +01:00
Andrew Schofield f38359300b
MINOR: Fix javadoc mistake (#20281)
Fixes a couple of tiny mistakes in the javadoc for KafkaShareConsumer.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-07-31 19:25:04 +01:00
majialong 6b96735872
MINOR: Fix typo in GetOffsetShell (#20277)
Fix typo in GetOffsetShell : `visible for tseting` -> `Visible for
testing`

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-08-01 00:48:31 +08:00
Manikumar Reddy f73e3bcd6a
KAFKA-19561: Set OP_WRITE interest after SASL reauthentication to resume pending writes (#20258)
https://issues.apache.org/jira/browse/KAFKA-19561

Addresses a race condition during SASL reauthentication where the
server-side `KafkaChannel.send()` queues a response, but OP_WRITE is
removed before the channel becomes writable — resulting in stuck
responses and client timeouts.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2025-07-31 21:59:21 +05:30