Commit Graph

13792 Commits

Author SHA1 Message Date
TaiJuWu 359ddce3b2
KAFKA-17137 Ensure Admin APIs are properly tested (token-related) (#16905)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-03 00:22:55 +08:00
Apoorv Mittal 89418b66ae
KAFKA-17442: Handled persister errors with write async calls (KIP-932) (#16956)
The PR makes the persister write RPC async. Also handles the errors from persister as per the review comment here:
Addressing review comment for PR: #16397 (comment)

Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>, Jun Rao <junrao@gmail.com>
2024-09-01 16:36:26 -07:00
Matthias J. Sax b527691e0a MINOR: add missing @Override annotations 2024-09-01 11:37:59 -07:00
Dmitry Werner 6836fa259e
KAFKA-17320 Move SensorAccess to server-common module (#16864)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 23:41:28 +08:00
Kuan-Po Tseng cb835d0d6d
KAFKA-17461 move the .github/README.md to .github/workflows/README.md (#17069)
According to GitHub rule:

If a repository contains more than one README file, then the file shown is chosen from locations in the following order: the .github directory, then the repository's root directory, and finally the docs directory.

the file .github/readme will override root/readme. Hence, we should move .github/readme to ./github/workflows/readme

Reviewers: David Arthur <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 23:24:25 +08:00
TengYao Chi 6a2789cf70
KAFKA-17293: New consumer HeartbeatRequestManager should rediscover disconnected coordinator (#16844)
Reviewers: Lianet Magrans <lianetmr@gmail.com>, TaiJuWu <tjwu1217@gmail.com>
2024-09-01 09:59:21 -04:00
PoAn Yang 0fde28049a
KAFKA-17331 Throw unsupported version exception if the server does NOT support EarliestLocalSpec and LatestTieredSpec (#16873)
Add the version check to server side for the specific timestamp:
- the version must be >=8 if timestamp=-4L
- the version must be >=9 if timestamp=-5L

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 21:13:40 +08:00
Matthias J. Sax 1f5aea2a86
MINOR: remove get prefix for internal DSL methods (#17050)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 17:14:51 +08:00
Viktor Somogyi-Vass 59d3d7021a
KAFKA-17437 Upgrade commons-validator from 1.7 to 1.9.0 (#17028)
Reviewers: Josep Prat <josep.prat@aiven.io>, Bertalan Kondrat <kb.pcre@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 13:15:18 +08:00
Matthias J. Sax 4189a36b41
MINOR: fix JavaDocs of Kafka Streams context classes (#17049)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 12:46:39 +08:00
Arnav Dadarya 4b5df1f8e9
KAFKA-12826: Remove Deprecated Class Serdes (#17023)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-08-31 19:42:46 -07:00
Chia Chuan Yu d44627da99
MINOR: remove get prefix for internal KeyValueStoreWrapper (#17065)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 10:35:08 +08:00
TaiJuWu dc650fbd73
MINOR: add UT for consumer.poll (#17047)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 10:09:18 +08:00
PoAn Yang 4a2577b801
KAFKA-17395 Flaky test testMissingOffsetNoResetPolicy for new consumer (#17056)
In AsyncKafkaConsumer, FindCoordinatorRequest is sent by background thread. In MockClient#prepareResponseFrom, it only matches the response to a future request. If there is some race condition, FindCoordinatorResponse may not match to a FindCoordinatorRequest. It's better to put MockClient#prepareResponseFrom before the request to avoid flaky test.

Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 23:57:19 +08:00
David Arthur 3efa785a65
MINOR: Handle test re-runs in junit.py (#17034)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 23:34:29 +08:00
Matthias J. Sax fc720d33a0
MINOR: remove get prefix for internal state methods (#17053)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 20:02:06 +08:00
TaiJuWu 8f4d856977
MINOR: add helper function `createTopic` to ClusterInstance (#16852)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 19:49:59 +08:00
Mickael Maison 1841c07d4a
KAFKA-17449 Move Quota classes to server-common module (#17060)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 12:41:34 +08:00
Eric Chang a6062d0868
KAFKA-17137 Feat admin client it acl configs (#16648)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 12:29:39 +08:00
David Arthur 7005a1fa4b
KAFKA-17433 Add a deflake Github action (#17019)
This patch adds a "deflake" github action which can be used to run a single JUnit test or suites. It works by parameterizing the --tests Gradle option. If the test extends ClusterTest, the "deflake" workflow can repeat number of times by setting the kafka.cluster.test.repeat system property.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 11:26:33 +08:00
PoAn Yang 70dd577286
KAFKA-15909 Throw error when consumer configured with empty/whitespace-only group.id for LegacyKafkaConsumer (#16933)
Per KIP-289, the use of an empty value for group.id configuration was deprecated back in 2.2.0.

In 3.7, the AsyncKafkaConsumer implementation will throw an error (see KAFKA-14438).

This task is to update the LegacyKafkaConsumer implementation to throw an error in 4.0.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-30 23:24:36 +08:00
PoAn Yang 4a3ab89f95
KAFKA-17386 Remove broker-list, threads and num-fetch-threads in ConsumerPerformance (#16983)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-30 22:09:37 +08:00
abhi-ksolves c23b6b0365
KAFKA-16327: Removed Deprecated variable StreamsConfig#TOPOLOGY_OPTIMIZATION (#16744)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-08-29 18:01:39 -07:00
PoAn Yang 2b495945a2
KAFKA-17377: Consider using defaultApiTimeoutMs in AsyncKafkaConsumer#unsubscribe (#17030)
Reviewers: Kirk True <ktrue@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
2024-08-29 20:24:25 -04:00
JohnHuang b154f58ce8
KAFKA-12829: Remove deprecated Topology#addGlobalStore of old Processor API (#16791)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-08-29 16:52:52 -07:00
Nancy 865cdfc1cd
KAFKA- 12834 : Removed Deprecated method under MockProcessorContext (#16778)
Reviewers: Josep Prat <josep.prat@aiven.io>, Matthias J. Sax <matthias@confluent.io>
2024-08-29 16:38:52 -07:00
PoAn Yang 8db80d1f07
KAFKA-17064: New consumer assign should update assignment in background thread (#16673)
Reviewers: Kirk True <ktrue@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
2024-08-29 23:01:36 +02:00
Colin Patrick McCabe 453cf9c987
KAFKA-17434: Do not test impossible scenarios in upgrade_test.py (#17024)
Because of KIP-902 (Upgrade Zookeeper version to 3.8.2), it is not possible to upgrade from a Kafka version
earlier than 2.4 to a version later than 2.4. Therefore, we should not test these upgrade scenarios
in upgrade_test.py. They do happen to work sometimes, but only in the trivial case where we don't
create topics or make changes during the upgrade (which would reveal the ZK incompatibility).
Instead, we should test only supported scenarios.

Reviewers: Reviewers: José Armando García Sancio <jsancio@gmail.com>
2024-08-29 12:51:42 -07:00
TaiJuWu 165076afc6
KAFKA-17390 Remove broker-list in GetOffsetShell (#16992)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-30 00:16:21 +08:00
David Arthur 237138e04b
MINOR: fix for opt-in flag for Github build (#17031)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-29 23:38:40 +08:00
Logan Zhu 464051929d
KAFKA-17388 Remove broker-list from VerifiableProducer (#16958) 2024-08-29 20:02:29 +08:00
Ken Huang 28e2e8631f
KAFKA-17170: Add test to ensure new consumer acks reconciled assignment even if first HB with ack lost (#16694)
Reviewers: Kirk True <ktrue@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
2024-08-29 13:26:40 +02:00
Krishna Agarwal 24d88c46c0
MINOR: Add experimental message for the native docker image (#17040)
The docker image for Native Apache Kafka was introduced with KIP-974 and was first release with 3.8 AK release.
The docker image for Native Apache Kafka is currently intended for local development and testing purposes.

This PR intends to add a logline indicating the same during docker image startup.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-08-29 16:33:32 +05:30
David Jacot c977bfdd3c
KAFKA-17413; Re-introduce `group.version` feature flag (#17013)
This patch re-introduces the `group.version` feature flag and gates the new consumer rebalance protocol with it. The `group.version` feature flag is attached to the metadata version `4.0-IV0` and it is marked as production ready. This allows system tests to pick it up directly by default without requiring to set `unstable.feature.versions.enable` in all of them. This is fine because we don't plan to do any incompatible changes before 4.0.

Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-29 01:22:54 -07:00
Luke Chen ad4405c8dd
KAFKA-17062: handle dangling "copy_segment_start" state when deleting remote logs (#16959)
The COPY_SEGMENT_STARTED state segments are counted when calculating remote retention size. This causes unexpected retention size breached segment deletion. This PR fixes it by
  1. only counting COPY_SEGMENT_FINISHED and DELETE_SEGMENT_STARTED state segments when calculating remote log size.
  2. During copy Segment, if we encounter errors, we will delete the segment immediately.
  3. Tests added.

Co-authored-by: Guillaume Mallet <>

Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Satish Duggana <satishd@apache.org>, Guillaume Mallet <>
2024-08-29 14:09:55 +08:00
xijiu 291523e3e4
KAFKA-12829: Remove the deprecated method `init(ProcessorContext, StateStore)` from the `StateStore` interface (#16906)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2024-08-28 17:49:16 -07:00
Colin Patrick McCabe ca0cc355f6
KAFKA-12670: Support configuring unclean leader election in KRaft (#16866)
Previously in KRaft mode, we could request an unclean leader election for a specific topic using
the electLeaders API. This PR adds an additional way to trigger unclean leader election when in
KRaft mode via the static controller configuration and various dynamic configurations.

In order to support all possible configuration methods, we have to do a multi-step configuration
lookup process:

1. check the dynamic topic configuration for the topic.
2. check the dynamic node configuration.
3. check the dynamic cluster configuration.
4. check the controller's static configuration.

Fortunately, we already have the logic to do this multi-step lookup in KafkaConfigSchema.java.
This PR reuses that logic. It also makes setting a configuration schema in
ConfigurationControlManager mandatory. Previously, it was optional for unit tests.

Of course, the dynamic configuration can change over time, or the active controller can change
to a different one with a different configuration. These changes can make unclean leader
elections possible for partitions that they were not previously possible for. In order to address
this, I added a periodic background task which scans leaderless partitions to check if they are
eligible for an unclean leader election.

Finally, this PR adds the UncleanLeaderElectionsPerSec metric.

Co-authored-by: Luke Chen showuon@gmail.com

Reviewers: Igor Soarez <soarez@apple.com>, Luke Chen <showuon@gmail.com>
2024-08-28 14:13:20 -07:00
José Armando García Sancio 25819cecdb
KAFKA-17426; Check node directory id for KRaft (#17017)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-08-28 11:31:58 -07:00
Matthias J. Sax f61719f962
MINOR: remove get prefix for internal PAPI methods (#17025)
Reviewers: Bill Bejeck <bill@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-28 09:41:32 -07:00
Kirk True dd7d7c3145
KAFKA-17335 Lack of default for URL encoding configuration for OAuth causes NPE (#16990)
AccessTokenRetrieverFactory uses the value of sasl.oauthbearer.header.urlencode provided by the user, or null if no value was provided for that configuration. When the HttpAccessTokenRetriever is created the JVM attempts to unbox the value into a boolean, a NullPointerException is thrown.

The fix is to explicitly check the Boolean, and if it's null, use Boolean.FALSE.

Reviewers: bachmanity1 <81428651+bachmanity1@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-28 23:11:41 +08:00
Nick Telford f9615ed275
KAFKA-17432: Fix threads alive after shutdown (#17018)
We currently use a `CountDownLatch` to signal when a thread has
completed shutdown to the blocking `shutdown` method. However, this
latch triggers _before_ the thread has fully exited.

Dependent on the OS thread scheduling, it's possible that this thread
will still be "alive" after the latch has unblocked the `shutdown`
method.

In practice, this is mostly a problem for `StreamThreadTest`, which now
checks that there are no `TaskExecutor` or `StateUpdater` threads
immediately after shutting them down.

Sometimes, after shutdown returns, we find that these threads are still
"alive", usually completing execution of the "thread shutdown" log
message, or even the `Thread#exit` JVM method that's invoked to clean up
threads just before they exit. This causes sporadic test failures, even
though these threads did indeed shutdown correctly.

Instead of using a `CountDownLatch`, let's just await the thread to exit
directly, using `Thread#join`. Just as before, we set a timeout, and if
the Thread is still alive after the timeout, we throw a
`StreamsException`, maintaining the contract of the `shutdown` method.

There should be no measurable impact on production code here. This will
mostly just improve the reliability of tests that require these threads
have fully exited after calling `shutdown`.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>, Bruno Cadonna <cadonna@apache.org>
2024-08-28 16:05:10 +02:00
Arpit Goyal ead9ed513e
KAFKA-17422: Adding copySegmentLatch countdown after expiration task is over (#17012)
The given test took 5 seconds as the logic was waiting completely for 5 seconds for the expiration task to be completed. Adding copySegmentLatch countdown after expiration task is over

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-28 10:44:57 +08:00
abhi-ksolves fb19b3f7e7
KAFKA-14262 Deletion of MirrorMaker v1 deprecated classes & tests (#16879)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-28 09:49:56 +08:00
Lianet Magrans f9e30289d9
KAFKA-17403 Generate HB to leave on pollOnClose if needed (#16974)
Fix to ensure that the HB request to leave the group is generated when closing the HBRequestManager if the state is LEAVING. This is needed because we could end up closing the network thread without giving a chance to the HBManager to generate the request. This flow on consumer.close with short timeout:

1. app thread triggers releaseAssignmentAndLeaveGroup
2. background thread transitions to LEAVING
    2.1 the next run of the background thread should poll the HB manager and generate a request
3. app thread releaseAssignmentAndLeaveGroup times out and moves on to close the network thread (stops polling managers. Calls pollOnClose to gather the final requests and send them along with the unsent)

If 3 happens in the app thread before 2.1 happens in the background, the HB manager won't have a chance to generate the request to leave. This PR implements the pollOnClose to generate the final request if needed.


Reviewers: Kirk True <kirk@kirktrue.pro>, TaiJuWu <tjwu1217@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-28 09:04:25 +08:00
Matthias J. Sax f69b465414
MINOR: Keep Kafka Streams configs ordered in code and docs (#16816)
Reviewers: Bill Bejeck <bill@confluent.io>
2024-08-27 18:03:24 -07:00
kevin-wu24 f5439864c6
KAFKA-15406: Add the ForwardingManager metrics from KIP-938 (#16904)
Implement the remaining ForwardingManager metrics from KIP-938: Add more metrics for measuring KRaft performance:

kafka.server:type=ForwardingManager,name=QueueTimeMs.p99
kafka.server:type=ForwardingManager,name=QueueTimeMs.p999
kafka.server:type=ForwardingManager,name=QueueLength
kafka.server:type=ForwardingManager,name=RemoteTimeMs.p99
kafka.server:type=ForwardingManager,name=RemoteTimeMs.p999

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-08-27 16:39:31 -07:00
Apoorv Mittal 62dc982ce9
KAFKA-17396: Fix releasing session on final epoch (#16955)
The topic partitions were not being released as the final epoch removes the share session for the client at the start.

The PR streamlines the release process by invoking an explicit releaseSession API which removes the session and releases acquired records. Also newContext and acknowledgeSession method previously used to evict the share session but streamlined methods to just do the needful without altering the share session eviction.

Reviewers:  Andrew Schofield <aschofield@confluent.io>,  Manikumar Reddy <manikumar.reddy@gmail.com>, Chirag Wadhwa <122860692+chirag-wadhwa5@users.noreply.github.com>
2024-08-27 16:14:57 +05:30
Mickael Maison b9fe9f532f
KAFKA-16972: Move BrokerTopicStats to storage module (#17003)
Reviewers: Luke Chen <showuon@gmail.com>
2024-08-27 11:39:37 +02:00
DL1231 006af8b939
KAFKA-17327; Add support of group in kafka-configs.sh (#16887)
The patch adds support of alter/describe configs for group in kafka-configs.sh.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
2024-08-27 02:16:46 -07:00
Kuan-Po Tseng 5557720246
KAFKA-17038 KIP-919 supports for `alterPartitionReassignments` and `listPartitionReassignments` (#16644)
This is a follow-up after KIP-919, extending support for BOOTSTRAP_CONTROLLERS_CONFIG to both Admin#alterPartitionReassignments and Admin#listPartitionReassignments.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-27 17:12:16 +08:00