15214 Commits

Author SHA1 Message Date
Ming-Yen Chung 34e7136b7a
MINOR: Fix wrong config property in KafkaConfigTest (#18815)
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 17:09:52 +08:00
Shahbaz Aamir a565d8fdac
MINOR: removed unwanted line breaks (#18744)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 17:04:57 +08:00
Steven Schlansker 852f14065b
KAFKA-18689: Improve metric calculation to avoid NoSuchElementException (#18771)
Reviewers: Nick Telford <nick.telford@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2025-02-05 21:39:30 -08:00
Matthias J. Sax 9774635bfd
MINOR: update Kafka Streams `Topology` JavaDocs (#18778)
Reviewers: Bill Bejeck <bill@confluent.io>
2025-02-05 20:24:14 -08:00
Joao Pedro Fonseca Dantas 8be2a8ed4e
MINOR: Add javadocs to AbstractMergedSortedCacheStoreIterator (#18772)
While reviewing PR #18287, I wrote some javadocs to help me understand the AbstractMergedSortedCacheStoreIterator. Maybe we could add them to help the next developers who work on it.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2025-02-05 17:20:53 -08:00
Apoorv Mittal 45c02d7fe3
MINOR: Removing share module from settings (#18806)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 02:04:30 +08:00
Kuan-Po Tseng b99be961b8
KAFKA-18206: EmbeddedKafkaCluster must set features (#18189)
Related to KAFKA-18206: set features in EmbeddedKafkaCluster in both the streams and connect modules. Note that this PR also fixes a potential transaction with empty records in the sendPrivileged method, as transaction version 2 doesn't allow this kind of scenario.

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-02-05 09:14:36 -08:00
Lucas Brutschy 102de21355
KAFKA-17379: Fix unexpected state transition from ERROR to PENDING_SHUTDOWN (#18765)
The exception stack trace shown in the ticket can happen when we are
concurrently closing the producer because of an error and doing a
regular close. This is not a bug in the test, but a real race condition
that can happen.

The sequence is this:

Thread1: Enter PENDING_ERROR
Thread2: Check if state is already ERROR
Thread1: Transition to ERROR
Thread2: Check if state is already PENDING_ERROR
Thread2: Transition to PENDING_SHUTDOWN

One idea to fix this would be to synchronize the sequence performed by
Thread1 using the state lock. However, this would require more changes,
since we cannot use the normal state transition method `setState` while
owning the lock, as it calls user-defined callbacks, which may create
deadlocks. To avoid adding more synchronization, we can also fix it by
first attempting to transition to PENDING_SHUTDOWN, and _then_ checking
whether another thread is already attempting to shut down (states
PENDING_SHUTDOWN, PENDING_ERROR, ERROR, NOT_RUNNING). This is safe because we never
transition from a shutdown state back to a non-shutdown state.
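
The "transition first, then check" idea can be illustrated with a minimal sketch. This is not the actual Kafka Streams code: the class `ShutdownStateSketch`, its methods, and the use of an atomic compare-and-set in place of `setState` are assumptions made purely for illustration. Only the state names come from the description above.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical, simplified state machine; not the real KafkaStreams class. */
class ShutdownStateSketch {
    enum State { RUNNING, PENDING_SHUTDOWN, PENDING_ERROR, ERROR, NOT_RUNNING }

    // Once in one of these states, the machine never returns to RUNNING.
    static final Set<State> SHUTDOWN_STATES = EnumSet.of(
        State.PENDING_SHUTDOWN, State.PENDING_ERROR, State.ERROR, State.NOT_RUNNING);

    private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);

    State state() {
        return state.get();
    }

    /** Error path (Thread1 above): enter PENDING_ERROR, then ERROR. */
    void fail() {
        // Only the caller that wins this transition continues to ERROR.
        if (state.compareAndSet(State.RUNNING, State.PENDING_ERROR)) {
            state.set(State.ERROR);
        }
    }

    /**
     * Regular close (Thread2 above): attempt the transition first and
     * inspect the state only afterwards.
     * Returns true if this caller started the shutdown.
     */
    boolean close() {
        if (state.compareAndSet(State.RUNNING, State.PENDING_SHUTDOWN)) {
            return true;
        }
        // The attempt failed, so some shutdown state was already entered.
        // Shutdown states are never left for RUNNING, so this observation
        // cannot be invalidated by a concurrent error path.
        return false;
    }
}
```

Because the attempt and the check are no longer separated by a window in which the other thread can move from PENDING_ERROR to ERROR, the ERROR to PENDING_SHUTDOWN transition cannot occur in this sketch.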

Reviewers: Matthias J. Sax <matthias@confluent.io>
2025-02-05 17:09:14 +01:00
Chirag Wadhwa 01587d09d8
KAFKA-18494-3: solution for the bug relating to gaps in the share partition cachedStates post initialization (#18696)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Abhinav Dixit <adixit@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-02-05 15:16:25 +00:00
Sushant Mahajan 0bd1ff936f
KAFKA-18629: Add persister impl and tests for DeleteShareGroupState RPC. [2/N] (#18748)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-05 14:51:19 +00:00
Sanskar Jhajharia 7dbed2f6e8
[KAFKA-16720] AdminClient Support for ListShareGroupOffsets (2/2) (#18671)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-02-05 14:38:09 +00:00
TengYao Chi 66363160c5
KAFKA-18645: New consumer should align close timeout handling with classic consumer (#18702)
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-05 09:08:51 -05:00
Kamal Chandraprakash cb9c6718fa
KAFKA-18722: Remove the unreferenced methods in TBRLMM and ConsumerManager (#18791)
Reviewers: Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com>
2025-02-05 13:13:24 +00:00
Nick Guo 22d4248fba
KAFKA-18694: Migrate suitable classes to records in coordinator-common module (#18782)
Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Christo Lolov <lolovc@amazon.com>
2025-02-05 10:50:55 +00:00
PoAn Yang 21645ebf0b
KAFKA-18705: Move ConfigRepository to metadata module (#18784)
Reviewers: TengYao Chi <kitingiao@gmail.com>, Christo Lolov <lolovc@amazon.com>
2025-02-05 10:13:36 +00:00
TengYao Chi aac62a32d9
KAFKA-18698: Migrate suitable classes to records in server and server-common modules (#18783)
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Christo Lolov <lolovc@amazon.com>
2025-02-05 10:00:11 +00:00
Ming-Yen Chung d830179375
KAFKA-18675 Add tests for valid and invalid broker addresses (#18781)
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-05 17:01:51 +08:00
Matthias J. Sax a1d5dc0f9e
HOTFIX: compilation error
Two merged PRs overlapped in a non-conflicting way, breaking compilation:
 - https://github.com/apache/kafka/pull/18722
 - https://github.com/apache/kafka/pull/18755
2025-02-04 20:32:52 -08:00
Matthias J. Sax 5988ee551e
MINOR: cleanup KStream JavaDocs (6/N) - map[Values] (#18755)
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-02-04 19:57:59 -08:00
Justine Olshan 00dddee347
MINOR: Add missing test tag to UnifiedLogTest.scala (#18794)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-04 13:56:14 -08:00
Sean Quah 42e7cbb67e
KAFKA-18690: Keep leader metadata for RE2J-assigned partitions (#18777)
Reviewers: Lianet Magrans <lmagrans@confluent.io>
2025-02-04 13:22:28 -05:00
Matthias J. Sax 8e3a001bf5
MINOR: disable "processing threads" in SmokeTestDriverIntegrationTest (#18773)
Reviewers: Bruno Cadonna <bruno@confluent.io>
2025-02-04 09:59:14 -08:00
Justine Olshan 822b8ab3d7
KAFKA-18691: Flaky test testFencingOnTransactionExpiration (#18793)
It appears this test was failing because the transaction was never aborting and the concurrent transactions errors would not go away.

ccab9eb introduced the test failure because it requires the transaction to complete, but I suspect the lack of completion was happening before the change.

The timeout for the write is based on the transactional timeout, and 100ms seemed too small -- thus the requests to update the state would often repeatedly time out.

Also removed the loop since it was not necessary.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Calvin Liu <caliu@confluent.io>
2025-02-04 08:45:34 -08:00
Bruno Cadonna b998189b00
KAFKA-18538: Add Streams membership manager (#18551)
The Streams membership manager is used client-side in the
background thread of the async consumer. For each member/consumer,
it is responsible for:
* keeping the member state,
* keeping assignments for the member,
* reconciling the assignments of the member -- for example
when tasks need to be revoked before other tasks are assigned
* requesting invocations of assignment and revocation callbacks
by the stream thread.

The Streams membership manager is called by the background thread of
the async consumer, directly in its event loop and from the
Streams group heartbeat request manager. The Streams membership
manager uses the Streams rebalance events processor to request
assignment/revocation callbacks in the stream thread.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Bill Bejeck <bill@confluent.io>
2025-02-04 17:32:26 +01:00
Luke Chen 612e1299e4
KAFKA-18230: Handle not controller or not leader error in admin client (#18165)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-04 16:51:24 +01:00
David Arthur 9b793dc1f9
MINOR increase max flaky tests allowed (#18792)
Increase the maximum number of flaky tests we tolerate for the main test suite from 3 to 10. This will result in fewer failed builds.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-04 10:35:41 -05:00
David Jacot 676293d709
MINOR: Fix TestBounce sys test (#18798)
```
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id:       2025-02-04--005
run time:         4 minutes 0.023 seconds
tests run:        4
passed:           4
flaky:            0
failed:           0
ignored:          0
================================================================================
```

Reviewers: Lianet Magrans <lmagrans@confluent.io>
2025-02-04 14:49:20 +01:00
David Jacot 4c6af67eb1
MINOR: Fix PerformanceService sys test (#18797)
This patch fixes the PerformanceService system test which was still using ZK.

```
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id:       2025-02-04--003
run time:         1 minute 42.629 seconds
tests run:        4
passed:           4
flaky:            0
failed:           0
ignored:          0
================================================================================
```

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-04 14:39:24 +01:00
David Jacot 17d1447f9c
MINOR: Fix Benchmark sys tests (#18796)
This patch fixes the Benchmark system tests. We misconfigured the quorum in bc7b87001b.

```
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id:       2025-02-04--001
run time:         57 minutes 27.169 seconds
tests run:        62
passed:           62
flaky:            0
failed:           0
ignored:          0
================================================================================
```

Reviewers: PoAn Yang <payang@apache.org>, Christo Lolov <lolovc@amazon.com>
2025-02-04 14:34:57 +01:00
Albert f6d9ce2bcd
MINOR: Add missing MirrorMaker2 metrics to docs (#18691)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: u0184996 <alozano@caixabanktech.com>
2025-02-04 14:22:54 +01:00
Ming-Yen Chung 27b46f9a30
MINOR: Correct the link in the Javadoc for test-common-internal-api (#18788)
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-04 16:52:18 +08:00
Calvin Liu ad031b99d3
KAFKA-18635: reenable the unclean shutdown detection (#18277)
We need to re-enable the unclean shutdown detection when in ELR mode, which was inadvertently removed during the development process.

Reviewers: David Mao <dmao@confluent.io>, Jun Rao <junrao@gmail.com>
2025-02-03 22:26:57 -08:00
Matthias J. Sax 7719b5f70d
KAFKA-18644: improve generic type names for internal FK-join classes (#18700)
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-02-03 22:20:47 -08:00
Ming-Yen Chung 9f78771a1f
KAFKA-18693 Remove PasswordEncoder (#18790)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-04 13:18:41 +08:00
Matthias J. Sax 65961516fd
MINOR: cleanup KStream JavaDocs (4/N) - stream-table-inner-join (#18721)
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Bill Bejeck <bill@confluent.io>
2025-02-03 17:48:49 -08:00
Matthias J. Sax b8cafbfe2d
MINOR: Session windows should accept zero as session gap (#18734)
Reviewers: Almog Gavra <almog@responsive.dev>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2025-02-03 17:45:27 -08:00
Matthias J. Sax ce6f078192
MINOR: fix NPE in KS `Topology` for new `AutoOffsetReset` (#18780)
Introduced via KIP-1106.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-02-03 17:24:47 -08:00
Justine Olshan ab8ef87c7f
KAFKA-18654 [1/2]: Transaction Version 2 performance regression due to early return (#18720)
https://issues.apache.org/jira/browse/KAFKA-18575 solved a critical race condition by returning with CONCURRENT_TRANSACTIONS early when the transaction was still completing.
In testing, it was discovered that this early return could cause performance regressions.

Prior to KIP-890 the addpartitions call was a separate call from the producer. There was a previous change https://issues.apache.org/jira/browse/KAFKA-5477 that decreased the retry backoff to 20ms. With KIP-890 and making the call through the produce path, we go back to the default retry backoff, which takes longer. Prior to KAFKA-18575 we introduced a slight delay when sending to the coordinator, so we were less likely to return quickly and get stuck in this backoff. However, based on results from produce benchmarks, we can still run into the default backoff in some scenarios.

This PR reverts KAFKA-18575: it no longer returns early, and instead waits until the request reaches the coordinator to check whether a transaction is ongoing. It fixes the handling with the verification guard so that we don't hit the edge condition.

Also cleans up some of the verification text that was unclear.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Artem Livshits <alivshits@confluent.io>
2025-02-03 15:24:34 -08:00
Kamal Chandraprakash 87b536d5ec
MINOR: Remove the noisy log in consumer manager (#18787)
The statement is logged at the INFO level and is printed for every message produced to the __remote_log_metadata topic. Removed the log statement, as it is needed only during a debug session, and we have another log at DEBUG level that captures this information.

Reviewers: Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com>
2025-02-03 22:51:41 +05:30
Ken Huang 272d947f96
KAFKA-18545: Remove Zookeeper logic from LogManager (#18592)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Mickael Maison <mickael.maison@gmail.com>
2025-02-03 17:16:35 +00:00
Lucas Brutschy 4ca24a7dbf
KAFKA-18325: Add TargetAssignmentBuilder (#18676)
A class to build a new target assignment based on the provided parameters. As a result, it yields the records that must be persisted to the log and the new member assignments as a map.

Compared to the feature branch, I extended the unit tests (testing also standby and warm-up task logic) and adopted simplifications due to the TasksTuple class.

Reviewers: Bruno Cadonna <cadonna@apache.org>, Bill Bejeck <bbejeck@apache.org>
2025-02-03 17:35:28 +01:00
Dongnuo Lyu 1a106e4538
KAFKA-18655: Implement the consumer group size counter with scheduled task (#18717)
During testing we discovered that the empty group count is not updated during group conversion, but when the new group transitions to another state, the empty group count is decremented. This could result in a negative empty group count.

We can have a new consumer group count implementation that follows the pattern we used for the classic group count. The timeout task periodically refreshes the metrics based on the current groups' soft state.
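
As a rough illustration of why recomputing the counts from soft state avoids negative values, consider the sketch below. All names here (`GroupCountSketch`, `GroupState`, `refresh`) are hypothetical and are not the coordinator's actual classes. In a real setup, `refresh()` would be invoked by a periodic timer task rather than by hand.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a per-state group counter rebuilt from soft state. */
class GroupCountSketch {
    // Illustrative subset of states; the real coordinator defines its own.
    enum GroupState { EMPTY, STABLE, DEAD }

    // Soft state: the current state of every known group.
    private final Map<String, GroupState> groups = new ConcurrentHashMap<>();

    // Published snapshot of counts; replaced wholesale on each refresh.
    private volatile Map<GroupState, Long> counts = new EnumMap<>(GroupState.class);

    void setState(String groupId, GroupState newState) {
        groups.put(groupId, newState);
    }

    void remove(String groupId) {
        groups.remove(groupId);
    }

    /** Recomputes all counts from scratch; intended to run periodically. */
    void refresh() {
        Map<GroupState, Long> fresh = new EnumMap<>(GroupState.class);
        for (GroupState s : groups.values()) {
            fresh.merge(s, 1L, Long::sum);
        }
        counts = fresh;
    }

    /** Count as of the last refresh; never negative by construction. */
    long count(GroupState s) {
        return counts.getOrDefault(s, 0L);
    }
}
```

The trade-off in this sketch is that the gauge can be stale between refreshes, but a missed increment or decrement on some code path can no longer skew it permanently.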

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-02-03 10:50:21 -05:00
Ken Huang 7fdd11295c
KAFKA-18685: Cleanup DynamicLogConfig constructor (#18764)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Christo Lolov <lolovc@amazon.com>
2025-02-03 15:38:05 +00:00
PoAn Yang bc7b87001b
KAFKA-18676; Update Benchmark system tests (#18785)
Update `benchmark_test.py` to use KRaft.

```
> TC_PATHS="tests/kafkatest/benchmarks/core/benchmark_test.py" /bin/bash tests/docker/run_tests.sh

================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id:       2025-02-03--001
run time:         96 minutes 48.900 seconds
tests run:        120
passed:           120
flaky:            0
failed:           0
ignored:          0
================================================================================
```

Reviewers: David Jacot <djacot@confluent.io>
2025-02-03 14:42:22 +01:00
PoAn Yang f6f41dc5eb
KAFKA-17631 Convert SaslApiVersionsRequestTest to kraft (#18330)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-03 21:01:38 +08:00
Jhen-Yung Hsu 9ba2621620
MINOR: Remove the test for ZooKeeper metrics used by ZooKeeperClient (#18775)
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-03 20:06:01 +08:00
David Jacot bf05d2c914
KAFKA-18672; CoordinatorRecordSerde must validate value version (#18749)
CoordinatorRecordSerde does not validate the version of the value to check whether the version is supported by the current version of the software. This is problematic if a future and unsupported version of the record is read by an older version of the software because it would misinterpret the bytes. Hence CoordinatorRecordSerde must throw an error if the version is unknown. This is also consistent with the handling in the old coordinator.
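
A minimal sketch of this kind of check is shown below. It is not the actual `CoordinatorRecordSerde` code: the class name, the version bounds, the assumption that the value starts with a 16-bit version, and the exception type are all illustrative.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of validating a version prefix before parsing. */
class VersionedValueSketch {
    // Assumed bounds for illustration; real records declare their own.
    static final short LOWEST_SUPPORTED_VERSION = 0;
    static final short HIGHEST_SUPPORTED_VERSION = 1;

    /**
     * Reads the leading 16-bit version and rejects unknown versions
     * instead of interpreting the remaining bytes with the wrong schema.
     */
    static short readValueVersion(ByteBuffer value) {
        if (value.remaining() < Short.BYTES) {
            throw new IllegalArgumentException("Value is too short to contain a version");
        }
        short version = value.getShort();
        if (version < LOWEST_SUPPORTED_VERSION || version > HIGHEST_SUPPORTED_VERSION) {
            throw new IllegalArgumentException("Unsupported value version: " + version);
        }
        return version;
    }
}
```

Failing fast here means an older build that reads a record written by a newer one reports an error rather than silently misreading the payload.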

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-02-03 02:19:27 -08:00
Alieh Saeedi eb01221dc0
KAFKA-17125: Streams Sticky Task Assignor (#18652)
Implements streams sticky assignor on the broker-side.

Reviewers: Bill Bejeck <bbejeck@apache.org>, Lucas Brutschy <lbrutschy@confluent.io>
2025-02-03 10:43:26 +01:00
PoAn Yang 5268fcdc98
KAFKA-18678 Update TestVerifiableProducer system test (#18768)
Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-03 14:14:54 +08:00
Ming-Yen Chung 9d6faf0283
KAFKA-18674 Document the incompatible changes in parsing --bootstrap-server (#18751)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-03 13:57:32 +08:00