kafka

Commit Graph

Author	SHA1	Message	Date
Sushant Mahajan	3fc103b48b	KAFKA-18629: ShareGroupDeleteState admin client impl. (#18928 ) * In this PR, we add various infra classes needed to support the `deleteShareGroups` functionality via the `kafka-share-groups.sh` script, as well as the implementation of `kafka-share-groups.sh --delete`. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-22 16:21:10 +00:00
Sushant Mahajan	4f28973bd1	KAFKA-18827: Initialize share state, share coordinator impl. [1/N] (#18968 ) In this PR, we have added the share coordinator and KafkaApis side impl of the initialize share group state RPC. ref: https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka#KIP932:QueuesforKafka-InitializeShareGroupStateAPI Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-22 16:12:08 +00:00
xijiu	118818a7ca	KAFKA-18795 Remove `Records#downConvert` (#18897 ) Since we no longer convert records to the old format for fetch requests, this code is no longer used in production. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-22 02:29:58 +08:00
Lianet Magrans	c580874fc2	KAFKA-18813: [3/N] Client support for TopicAuthException in DescribeConsumerGroup path (#18996 ) Reviewers: David Jacot <djacot@confluent.io>	2025-02-21 12:42:00 -05:00
Lianet Magrans	c56c9faee2	KAFKA-18813: [2/N] Client support for TopicAuthException in HB path (#18986 ) Reviewers: David Jacot <djacot@confluent.io>	2025-02-21 08:45:20 -05:00
TengYao Chi	709bfc506a	KAFKA-18641: AsyncKafkaConsumer could lose records with auto offset commit (#18737 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>, Jun Rao <jun@confluent.io>, Kirk True <ktrue@confluent.io>	2025-02-20 12:11:01 -05:00
Ken Huang	eda8fc84ae	KAFKA-16918 TestUtils#assertFutureThrows should use future.get with timeout (#18891 ) Reviewers: TengYao Chi <kitingiao@gmail.com>, Luke Chen <showuon@gmail.com>, Parker Chang <45290853+Parkerhiphop@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-20 07:22:31 +08:00
Matthias J. Sax	538a60e1b3	MINOR: disallow rawtypes and fail build (#18877 ) Cleanup code to avoid rawtype, and add suppressions where necessary. Change the build to fail on rawtype warning. Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-02-19 13:11:49 -08:00
Shivsundar R	3603c8fe35	KAFKA-18829: Added check before converting to IMPLICIT mode (#18964 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-19 17:34:28 +00:00
Ismael Juma	3dba3125e9	KAFKA-18601: Assume a baseline of 3.3 for server protocol versions (#18845 ) 3.3.0 was the first KRaft release that was deemed production-ready and also when KIP-778 (KRaft to KRaft upgrades) landed. Given that, it's reasonable for 4.x to only support upgrades from 3.3.0 or newer (the metadata version also needs to be set to "3.3" or newer before upgrading). Noteworthy changes: 1. `AlterPartition` no longer includes topic names, which makes it possible to simplify `AlterPartitionManager` logic. 2. Metadata versions older than `IBP_3_3_IV3` have been removed and `IBP_3_3_IV3` is now the minimum version. 3. `MINIMUM_BOOTSTRAP_VERSION` has been removed. 4. Removed `isLeaderRecoverySupported`, `isNoOpsRecordSupported`, `isKRaftSupported`, `isBrokerRegistrationChangeRecordSupported` and `isInControlledShutdownStateSupported` - these are always `true` now. Also removed related conditional code. 5. Removed default metadata version or metadata version fallbacks in multiple places - we now fail-fast instead of potentially using an incorrect metadata version. 6. Update `MetadataBatchLoader.resetToImage` to set `hasSeenRecord` based on whether image is empty - this was a previously existing issue that became more apparent after the changes in this PR. 7. Remove `ibp` parameter from `BootstrapDirectory` 8. A number of tests were not useful anymore and have been removed. I will update the upgrade notes via a separate PR as there are a few things that need changing and it would be easier to do so that way. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>, David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Justine Olshan <jolshan@confluen.io>, Ken Huang <s7133700@gmail.com>	2025-02-19 05:35:42 -08:00
ShivsundarR	a6a588fbed	KAFKA-18198: Added check to prevent acknowledgements on initial ShareFetchRequest. (#18944 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-19 10:49:58 +00:00
TaiJuWu	4c8d96c0f0	KAFKA-18767: Add client side config check for shareConsumer (#18850 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-18 15:57:56 +00:00
Parker Chang	ed366e6b89	MINOR: Align assertFutureThrows method signature with JUnit conventions (#18825 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-02-18 15:56:42 +00:00
Chirag Wadhwa	63229a768c	KAFKA-16718 [1/n]: Added DeleteShareGroupOffsets request and response schema (#18927 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-18 14:06:24 +00:00
Bruno Cadonna	d6b6952d48	KAFKA-18736: Add Streams group heartbeat request manager (1/N) (#18870 ) This commit adds the Streams group heartbeat request manager to the async consumer. The Streams group heartbeat request manager is responsible to send heartbeat requests and to process their responses. This commit implements: - sending of full heartbeat request (independent of any state) - processing successful response Reviewers: Bill Bejeck <bill@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>	2025-02-18 13:45:01 +01:00
Kaushik Raina	35420eb11b	KAFKA-18684: Add base exception classes (#18871 ) Introduced two new exception classes to the Kafka error handling framework: ApplicationRecoverableException: This exception signals that the error is recoverable, but the producer needs to be restarted. It helps in scenarios where recovery actions (like re-balancing or restoring from checkpoints) are needed. RefreshRetriableException: This exception occurs when metadata is outdated or invalid and needs to be refreshed before retrying the request. It helps handle retries that depend on updated metadata. Both classes are abstract and in upcoming PRs they will be extended by relevant classes as mentioned in KIP-1050:Exception Table. Reviewers: Justine Olshan <jolshan@confluent.io>, Sanskar Jhajharia <jhajharia.sanskar@gmail.com>	2025-02-17 12:11:51 -08:00
Ken Huang	d1db3d8e14	KAFKA-18805: add synchronized block for Consumer Heartbeat close (#18920 ) add synchronized block for Consumer Heartbeat close. Reviewers: Luke Chen <showuon@gmail.com>	2025-02-17 14:38:20 +08:00
Ming-Yen Chung	e828767062	KAFKA-18790 Fix testCustomQuotaCallback (#18906 ) Frequently updating the trust store can cause unexpected termination of the AsyncConsumer background thread. 1. To resolve this issue, reuse the same AdminClient instead of recreating it. 2. Add error logging when resource initialization fails for the consumer network thread. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-15 03:07:59 +08:00
Jimmy Wang	6a6b80215d	KAFKA-16717 [1/2]: Add AdminClient.alterShareGroupOffsets (#18819 ) KAFKA-16720 aims to add the support for the AlterShareGroupOffsets AdminClient. Key Changes in the PR: 1. Added handling of alterShareGroupOffsets() in KafkaAdminClient and introduced AlterShareGroupOffsetRequest/AlterShareGroupOffsetResponse/AlterShareGroupOffsetsOptions classes. 2. Corresponding test in KafkaAdminClientTest. 3. Added ALTER_SHARE_GROUP_OFFSETS API (will finish it in next PR and the share coordinator pieces) Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-15 02:35:46 +08:00
Calvin Liu	53c2b1604d	MINOR: TransactionManager logs the epoch bump less frequently. (#18895 ) Reviewers: Justine Olshan <jolshan@confluen.io>	2025-02-14 08:37:23 -08:00
Apoorv Mittal	e6b835f0b4	MINOR: Marking testVerifyFetchAndCloseImplicit flaky (#18893 ) Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-14 04:57:06 +08:00
Kirk True	057460e807	KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#18795 ) Reviewers: Jun Rao <jun@confluent.io>, Lianet Magrans <lmagrans@confluent.io>, Jeff Kim <jeff.kim@confluent.io>	2025-02-13 13:53:56 -05:00
Andrew Schofield	952113e8e0	KAFKA-16720: Support multiple groups in DescribeShareGroupOffsets RPC (#18834 ) Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>	2025-02-13 18:27:05 +00:00
Lianet Magrans	6eb6a5e578	KAFKA-18776: Fix flaky coordinator disconnect test & fix log level (#18866 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-13 12:11:45 -05:00
Lianet Magrans	c465cf6b4b	KAFKA-17298: Update upgrade notes for 4.0 KIP-848 (#18756 ) Reviewers: David Jacot <djacot@confluent.io>	2025-02-13 11:51:56 -05:00
ShivsundarR	0e40b80c86	KAFKA-18769: Improve leadership changes handling in ShareConsumeRequestManager. (#18851 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-12 15:54:01 +00:00
Sushant Mahajan	675a0889de	KAFKA-18764: Throttle on share state RPCs auth failure. (#18855 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-11 09:54:24 +00:00
Ismael Juma	da21b536c4	MINOR: Java version and TLS documentation improvements (#18822 ) Most of the changes are obvious clean-ups/fixes. A couple of noteworthy items: 1. Support for non LTS versions is clarified (we were incorrectly stating full support for Java 23). 2. TLS version negotiation details are clarified. Reviewers: Matthias J. Sax <matthias@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-10 12:24:28 -08:00
Ken Huang	70adf746c4	KAFKA-18225 ClientQuotaCallback#updateClusterMetadata is unsupported by kraft (#18196 ) This commit ensures that the ClientQuotaCallback#updateClusterMetadata method is executed in KRaft mode. This method is triggered whenever a topic or cluster metadata change occurs. However, in KRaft mode, the current implementation of the updateClusterMetadata API is inefficient due to the requirement of creating a full Cluster object. To address this, a follow-up issue (KAFKA-18239) has been created to explore more efficient mechanisms for providing cluster information to the ClientQuotaCallback without incurring the overhead of a full Cluster object creation. Reviewers: Mickael Maison <mickael.maison@gmail.com>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-11 01:03:02 +08:00
PoAn Yang	d0f4c2f844	KAFKA-18441: Remove flaky tag on KafkaAdminClientTest#testAdminClientApisAuthenticationFailure (#18847 ) Signed-off-by: PoAn Yang <payang@apache.org> Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-10 16:36:27 +00:00
Andrew Schofield	aa8c57665f	KAFKA-18618: Improve leader change handling of acknowledgements [1/N] (#18672 ) Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, ShivsundarR <shr@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>	2025-02-06 14:32:55 +00:00
Sushant Mahajan	0bd1ff936f	KAFKA-18629: Add persister impl and tests for DeleteShareGroupState RPC. [2/N] (#18748 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-05 14:51:19 +00:00
Sanskar Jhajharia	7dbed2f6e8	[KAFKA-16720] AdminClient Support for ListShareGroupOffsets (2/2) (#18671 ) Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>	2025-02-05 14:38:09 +00:00
TengYao Chi	66363160c5	KAFKA-18645: New consumer should align close timeout handling with classic consumer (#18702 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-05 09:08:51 -05:00
Ming-Yen Chung	d830179375	KAFKA-18675 Add tests for valid and invalid broker addresses (#18781 ) Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-05 17:01:51 +08:00
Sean Quah	42e7cbb67e	KAFKA-18690: Keep leader metadata for RE2J-assigned partitions (#18777 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>	2025-02-04 13:22:28 -05:00
Bruno Cadonna	b998189b00	KAFKA-18538: Add Streams membership manager (#18551 ) The Streams membership manager is used client-side in the background thread of the async consumer. For each member/consumer, it is responsible for: * keeping the member state, * keeping assignments for the member, * reconciling the assignments of the member -- for example when tasks need to be revoked before other tasks are assigned * requesting invocations of assignment and revocation callbacks by the stream thread. The Streams membership manager is called by the background thread of the async consumer, directly in its event loop and from the Streams group heartbeat request manager. The Streams membership manager uses the Streams rebalance events processor to request assignment/revocation callback in the stream thread. Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Bill Bejeck <bill@confluent.io>	2025-02-04 17:32:26 +01:00
Luke Chen	612e1299e4	KAFKA-18230: Handle not controller or not leader error in admin client (#18165 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-04 16:51:24 +01:00
Ismael Juma	78aff4fede	KAFKA-18659: librdkafka compressed produce fails unless api versions returns produce v0 (#18727 ) Return produce v0-v2 as supported versions in `ApiVersionsResponse`, but disable support for it everywhere else. Since clients pick the highest supported version by both client and broker during version negotiation, this solves the problem with minimal tech debt (even though it's not ideal that `ApiVersionsResponse` becomes inconsistent with the actual protocol support). Add one test for the socket server handling (in `ProcessorTest`) and one test for the client behavior (in `ProduceRequestTest`). Adjust a couple of api versions tests to verify the new behavior. Finally, include a few clean-ups in `ApiKeys`, `Protocol`, `ProduceRequest`, `ProduceRequestTest` and `BrokerApiVersionsCommandTest`. Reference to related librdkafka issue: https://github.com/confluentinc/librdkafka/issues/4956 Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>	2025-02-01 16:08:54 -08:00
Apoorv Mittal	484ba83f59	KAFKA-18683: Handle slicing of file records for updated start position (#18759 ) The PR corrects the check which was introduced in #5332 where position is checked to be within boundaries of file. The check position > currentSizeInBytes - start is incorrect, since the position is relative to start. Reviewers: Jun Rao <junrao@gmail.com>	2025-01-31 15:43:51 -08:00
Lianet Magrans	7920fadbb5	Revert "KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#17700 )" This reverts commit `6cf54c4dab`.	2025-01-31 17:18:35 -05:00
Mickael Maison	71314739f9	KAFKA-15995: Initial API + make Producer/Consumer plugins Monitorable (#17511 ) Reviewers: Greg Harris <gharris1727@gmail.com>, Luke Chen <showuon@gmail.com>	2025-01-31 10:40:10 +01:00
Luke Chen	15c5c075c1	MINOR: Clean up for sasl endpoints (#18519 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>	2025-01-31 09:27:04 +01:00
Kirk True	6cf54c4dab	KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#17700 ) This change reduces fetch session cache evictions on the broker for AsyncKafkaConsumer by altering its logic to determine which partitions it includes in fetch requests. Background Consumer implementations fetch data from the cluster and temporarily buffer it in memory until the user next calls Consumer.poll(). When a fetch request is being generated, partitions that already have buffered data are not included in the fetch request. The ClassicKafkaConsumer performs much of its fetch logic and network I/O in the application thread. On poll(), if there is any locally-buffered data, the ClassicKafkaConsumer does not fetch any new data and simply returns the buffered data to the user from poll(). On the other hand, the AsyncKafkaConsumer consumer splits its logic and network I/O between two threads, which results in a potential race condition during fetch. The AsyncKafkaConsumer also checks for buffered data on its application thread. If it finds there is none, it signals the background thread to create a fetch request. However, it's possible for the background thread to receive data from a previous fetch and buffer it before the fetch request logic starts. When that occurs, as the background thread creates a new fetch request, it skips any buffered data, which has the unintended result that those partitions get added to the fetch request's "to remove" set. This signals to the broker to remove those partitions from its internal cache. This issue is technically possible in the ClassicKafkaConsumer too, since the heartbeat thread performs network I/O in addition to the application thread. However, because of the frequency at which the AsyncKafkaConsumer's background thread runs, it is ~100x more likely to happen. 
Options The core decision is: what should the background thread do if it is asked to create a fetch request and it discovers there's buffered data. There were multiple proposals to address this issue in the AsyncKafkaConsumer. Among them are: (1) The background thread should omit buffered partitions from the fetch request as before (this is the existing behavior). (2) The background thread should skip the fetch request generation entirely if there are any buffered partitions. (3) The background thread should include buffered partitions in the fetch request, but use a small “max bytes” value. (4) The background thread should skip fetching from the nodes that have buffered partitions. Option 4 won out. The change is localized to AbstractFetch where the basic idea is to skip fetch requests to a given node if that node is the leader for buffered data. By preventing a fetch request from being sent to that node, it won't have any "holes" where the buffered partitions should be. Reviewers: Lianet Magrans <lmagrans@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Jun Rao <junrao@gmail.com>	2025-01-30 13:12:11 -08:00
Ken Huang	4b29fd6383	KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18548 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True <ktrue@confluent.io>	2025-01-30 11:22:54 -05:00
Pramithas Dhakal	aa27df9396	MINOR: KafkaProducerTest - Fix resource leakage and replace explicit invocation of close() method with try with resources (#18678 ) Reviewers: Divij Vaidya <diviv@amazon.com>, Greg Harris <greg.harris@aiven.io>, Christo Lolov <lolovc@amazon.com>	2025-01-30 12:34:57 +01:00
PoAn Yang	0dfc4017b8	KAFKA-18441: Fix flaky KafkaAdminClientTest#testAdminClientApisAuthenticationFailure (#18735 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-01-30 08:01:20 +00:00
TengYao Chi	9dd73d43b0	KAFKA-18569: New consumer close may wait on unneeded FindCoordinator (#18590 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-01-29 14:15:56 -05:00
Calvin Liu	a3b34c1315	KAFKA-18662: Return CONCURRENT_TRANSACTIONS on produce request in TV2 (#18733 ) While testing, it was found that the not_enough_replicas error was super common and could be easily confused. Since we are already bumping the request, we can signify that the produce request may return this error and new clients can handle it (Note, the java client should be able to handle this already as a retriable error, but other client libraries may need to implement this change) Reviewers: Justine Olshan <jolshan@confluent.io>	2025-01-29 10:15:48 -08:00
Ismael Juma	ca5d2cf76d	KAFKA-18646: Null records in fetch response breaks librdkafka (#18726 ) Ensure we always return empty records (including cases where an error is returned). We also remove `nullable` from `records` since it is effectively expected to be non-null by a large percentage of clients in the wild. This behavior regressed in `fe56fc9` (KAFKA-18269). Empty records were previously set via `FetchResponse.recordsOrFail(partitionData)` in the now-removed `maybeConvertFetchedData` method. Added an integration test that fails without this fix and also update many tests to set `records` to `empty` instead of leaving them as `null`. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>	2025-01-29 07:04:12 -08:00
TengYao Chi	97a228070e	KAFKA-18619: New consumer topic metadata events should set requireMetadata flag (#18668 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>	2025-01-29 08:36:05 -05:00
Andrew Schofield	f960e20647	KAFKA-18488: Improve KafkaShareConsumerTest (#18728 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>	2025-01-29 09:47:21 +00:00
Ismael Juma	e6d72c9e60	KAFKA-18648: Add back support for metadata version 0-3 (#18716 ) During testing, we identified that kafka-python (and aiokafka) relies on metadata request v0 and hence we need to add these back to comply with the premise of KIP-896 - i.e. it should not break the clients listed within it. I reverted the changes from #18218 related to the removal of metadata versions 0-3. I will submit a separate PR to undeprecate these API versions on the relevant 3.x branches. kafka-python (and aiokafka) work correctly (produce & consume) with this change on top of the 4.0 branch. Reviewers: David Arthur <mumrah@gmail.com>	2025-01-28 18:35:33 -08:00
David Arthur	f18457f2b8	MINOR Mark a StickyAssignorTest as flaky (#18719 ) Mark StickyAssignorTest#testLargeAssignmentAndGroupWithNonEqualSubscription as flaky. Used data from this report https://github.com/apache/kafka/actions/runs/12982945953 Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-01-28 10:34:05 -05:00
Sushant Mahajan	f32932cc25	KAFKA-18629: Delete share group state impl [1/N] (#18712 ) Reviewers: Christo Lolov <lolovc@amazon.com>, Andrew Schofield <aschofield@confluent.io>	2025-01-28 11:43:01 +00:00
Chung, Ming-Yen	43af241b50	KAFKA-18639 Enable the @Flaky annotation for some flaky tests (#18701 ) The following tests were previously reported as flaky but were only annotated with a comment in pull request #18558 due to module dependency limitations: testAdminClientApisAuthenticationFailure testOutdatedCoordinatorAssignment testThrottledProducerConsumer With the introduction of the new test infrastructure #18602 , which allows all modules to use the @Flaky annotation, these tests should now be updated to include the @Flaky annotation. Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-01-25 22:44:35 +08:00
David Arthur	8c0a0e07ce	KAFKA-17587 Refactor test infrastructure (#18602 ) This patch reorganizes our test infrastructure into three Gradle modules: ":test-common:test-common-internal-api" is now a minimal dependency which exposes interfaces and annotations only. It has one project dependency on server-common to expose commonly used data classes (MetadataVersion, Feature, etc). Since this pulls in server-common, this module is Java 17+. It cannot be used by ":clients" or other Java 11 modules. ":test-common:test-common-util" includes the auto-quarantined JUnit extension. The @Flaky annotation has been moved here. Since this module has no project dependencies, we can add it to the Java 11 list so that ":clients" and others can utilize the @Flaky annotation ":test-common:test-common-runtime" now includes all of the test infrastructure code (TestKitNodes, etc). This module carries heavy dependencies (core, etc) and so it should not normally be included as a compile-time dependency. In addition to this reorganization, this patch leverages JUnit SPI service discovery so that modules can utilize the integration test framework without depending on ":core". This will allow us to start moving integration tests out of core and into the appropriate sub-module. This is done by adding ":test-common:test-common-runtime" as a testRuntimeOnly dependency rather than as a testImplementation dependency. A trivial example was added to QuorumControllerTest to illustrate this. Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>	2025-01-24 09:03:43 -05:00
Ken Huang	0c9df75295	KAFKA-18474: Remove zkBroker listener (#18477 ) Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>, PoAn Yang <payang@apache.org>	2025-01-24 05:53:32 -08:00
Okada Haruki	17846fe743	KAFKA-16372 Fix producer doc discrepancy with the exception behavior (#15574 ) Currently, the Producer.send doc is inconsistent with the actual exception behavior - TimeoutException: This is not actually thrown from send on buffer-full or metadata-missing. Instead, it is returned as a failed future. - AuthenticationException/AuthorizationException: These exceptions also won't be thrown. They are returned as a failed future as well. Fixed Callback javadoc and ProducerConfig doc as well. Reviewers: Luke Chen <showuon@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-01-24 20:23:43 +08:00
Karsten Spang	400ecab518	KAFKA-13810: Document behavior of KafkaProducer.flush() w.r.t callbacks (#12042 ) Reviewers: Luke Chen <showuon@gmail.com>, Andrew Eugene Choi <andrew.choi@uwaterloo.ca>	2025-01-23 17:20:30 +01:00
Andrew Schofield	8000d04dcb	KAFKA-18488: Additional protocol tests for share consumption (#18601 ) Reviewers: ShivsundarR <shr@confluent.io>, Lianet Magrans <lmagrans@confluent.io>	2025-01-23 13:32:59 +00:00
Andrew Schofield	9da516b1a9	KAFKA-18392: Ensure client sets member ID for share group (#18649 ) Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Lianet Magrans <lmagrans@confluent.io>	2025-01-22 08:57:40 +00:00
Bruno Cadonna	239708f52e	KAFKA-18518: Add processor to handle rebalance events (#18527 ) This commit adds a processor named StreamsRebalanceEventsProcessor that handles the rebalance events sent from the background thread of the async consumer to the stream thread when a task assignment changes. It also adds the corresponding rebalance events. Additionally, this commit adds StreamsRebalanceData that maintains the data that is exchanged for the Streams rebalance protocol. All of these are used by the Streams heartbeat request manager and the Streams membership manager that will be added in a future commit. Reviewer: Lucas Brutschy <lbrutschy@confluent.io>	2025-01-22 08:30:56 +01:00
David Jacot	b368c38684	KAFKA-18302; Update CoordinatorRecord (#18512 ) This patch does a few things: 1) Replace ApiMessageAndVersion by ApiMessage in CoordinatorRecord for the key 2) Leverage the fact that ApiMessage exposes the apiKey. Hence we don't need to specify the key anymore. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-01-21 18:11:26 +01:00
Artem Livshits	247c0f0ba5	KAFKA-15370: Support Participation in 2PC (KIP-939) (2/N) (#18316 ) Update producer id request / response formats and transaction log value format. There is no functional change. Reviewers: Justine Olshan <jolshan@confluent.io>, Calvin Liu <caliu@confluent.io>	2025-01-21 08:40:46 -08:00
Matthias J. Sax	ba774a09f4	KAFKA-8862: Improve Producer error message for failed metadata update (#18587 ) We should provide the same informative error message for both timeout cases. Reviewers: Kirk True <ktrue@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Ismael Juma <ismael@juma.me.uk>	2025-01-21 08:37:45 -08:00
Andrew Schofield	7cbfd22bde	MINOR: Improve javadoc for ListShareGroupOffsetsResult (#18650 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>, PoAn Yang <payang@apache.org>, Apoorv Mittal <apoorvmittal10@gmail.com>	2025-01-21 13:56:40 +00:00
Ismael Juma	87b37a4065	KAFKA-14552: Assume a baseline of 3.0 for server protocol versions (#18497 ) Kafka 4.0 will remove support for zk mode and will require conversion to kraft before upgrading to 4.0. The minimum kraft version is 3.0 (aka 3.0-IV1). This provides an opportunity to remove exclusively server side protocols versions that only exist to allow direct upgrades from versions older than 3.0 or that are used only by zk mode. Since KRaft became production ready in 3.3, we should consider setting the baseline to 3.3. But that requires more discussion and it can be done via a separate change (KAFKA-18601). Protocol changes: * Remove RequestHeader v0 (only used by ControlledShutdown v0) * Remove WriteTxnMarkers v0 * Remove all versions of ControlledShutdown, LeaderAndIsr, StopReplica, UpdateMetadata In order to remove all versions safely, extend generator to support setting "versions" to "none". In this case, we no longer generate the `*Data` classes, but we still reserve the id for the relevant protocol api (so it doesn't get accidentally used for something else). The protocol documentation is correct after these changes. We kept a simplified version of `LeaderAndIsr{Request\|Response}` because it's used by many tests that are still relevant in kraft mode. Once KAFKA-18486 is done, it may be possible to remove it (I left a comment on the ticket). Similarly, KAFKA-18487 may make it possible to remove the introduced `StopReplicaPartitionState` (left a comment on that ticket too). There are a number of places that were adjusted to include an `ApiKeys.hasValidVersion` check. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-01-20 13:51:44 -08:00
PoAn Yang	7733323040	HOTFIX: ListShareGroupOffsetResult javadoc (#18642 ) Signed-off-by: PoAn Yang <payang@apache.org> Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-01-20 15:29:11 +00:00
Sanskar Jhajharia	bcbc72e29b	[KAFKA-16720] AdminClient Support for ListShareGroupOffsets (1/n) (#18571 ) Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-01-20 07:47:14 +00:00
Alyssa Huang	4583b033f0	KAFKA-17642: PreVote response handling and ProspectiveState (#18240 ) This PR implements the second part of KIP-996 and KAFKA-16164 (tasks KAFKA-16607, KAFKA-17642, KAFKA-17643, KAFKA-17675) which encompass the response handling of PreVotes, addition of new ProspectiveState, update to metrics, and addition of Raft simulation tests. Voters now transition to ProspectiveState first before CandidateState to prevent unnecessary epoch bumps. Voters in ProspectiveState send PreVote requests which are Vote requests with PreVote set to true. Follower grants PreVotes if it has not yet fetched successfully from leader. Leader denies all PreVotes. Unattached, Prospective, Candidate, and Resigned will grant PreVotes if the requesting replica's log is at least as long as theirs. Granted PreVotes are not persisted like standard votes. It is possible for a voter to grant several PreVotes in the same epoch. The only state which is allowed to transition directly to CandidateState is ProspectiveState. This happens on reception of majority of granted PreVotes or if at least one voter doesn't support PreVote requests. Prospective will transition to Follower after election loss/timeout if it was already aware of last known leader and the leader's endpoint, or at any point if it discovers the leader. Prospective will transition to Unattached after election loss/timeout if it does not know the leader endpoints. After electionTimeout, Resigned now always transitions to Unattached and increases the epoch. Prospective grants standard votes if it has not already granted a standard vote (no votedKey), has no leaderId, and the recipient's log is current enough. Candidate no longer backs off after election timeout. Candidate still backs off after election loss. Reviewers: José Armando García Sancio <jsancio@apache.org>	2025-01-17 09:38:03 -05:00
Bruno Cadonna	5c20aa187a	KAFKA-18546: Use mocks instead of a real DNS lookup to the outside (#18565 ) Since the example.com DNS lookup changed the second time within one year, we rewrote the unit tests for ClientUtils so that they do not make a real DNS lookup to the outside but use mocks. Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet Magrans <lmagrans@confluent.io>	2025-01-16 16:18:44 +01:00
ShivsundarR	bf760d4ebe	KAFKA-18558: Added check before adding previously subscribed partitions (#18562 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-01-16 13:17:48 +00:00
Mickael Maison	8262e2315d	MINOR: Cleanups in JaasUtils (#18522 ) Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-01-16 14:07:16 +01:00
Ken Huang	3c1f965c60	KAFKA-18521 Cleanup NodeApiVersions zkMigrationEnabled field (#18535 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-01-16 20:05:04 +08:00
Jason Taylor	11c10fe4da	KAFKA-16368: Update default linger.ms to 5ms for KIP-1030 (#18080 ) Reviewers: Ismael Juma <ismael@juma.me.uk>, Divij Vaidya <diviv@amazon.com>	2025-01-16 10:50:06 +01:00
Mickael Maison	833921ab9e	MINOR: Adjust logging in SerializedJwt (#18523 ) Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-01-16 09:58:13 +01:00
Sushant Mahajan	47f22faac3	MINOR: Added flaky references for a few tests. (#18558 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-01-15 19:24:52 +00:00
Kuan-Po Tseng	d3b4c1bdf4	KAFKA-18401: Transaction version 2 does not support commit transaction without records (#18448 ) Fix the issue where producer.commitTransaction under transaction version 2 throws an error if no partition or offset has been added to the transaction. The solution is to avoid sending the endTxnRequest unless producer.send or producer.sendOffsetsToTransaction is triggered. Reviewers: Justine Olshan <jolshan@confluent.io>	2025-01-15 10:21:11 -08:00
PoAn Yang	85d2e90074	HOTFIX: ClientUtilsTest#testParseAndValidateAddressesWithReverseLookup (#18549 ) Reviewers: Ismael Juma <ismael@juma.me.uk>, Gaurav Narula <gaurav_narula2@apple.com>, TengYao Chi <kitingiao@gmail.com>	2025-01-15 16:09:03 +01:00
Mickael Maison	66b1f00c0e	KAFKA-18520: Remove ZooKeeper logic from JaasUtils (#18530 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-01-15 13:17:06 +01:00
Mickael Maison	6b8cc5d558	MINOR: Remove ZooKeeper mentions in Admin javadoc (#18531 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-01-15 10:33:30 +01:00
Ismael Juma	f3a93551fa	Revert "KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18050 )" (#18544 ) This reverts commit `70d6312a3a`. Reviewers: Luke Chen <showuon@gmail.com>	2025-01-15 16:16:47 +08:00
Pramithas Dhakal	ea77352dfc	Rename the variable to reflect its purpose (#18525 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-01-14 18:00:27 +00:00
Sanskar Jhajharia	e3e4c17959	Add DescribeShareGroupOffsets API [KIP-932] (#18500 ) Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-01-14 14:33:39 +00:00
Istvan Toth	d7e5d0a59b	KAFKA-18064: SASL mechanisms should throw exception on wrap/unwrap (#17901 ) SASL mechanisms that support neither integrity nor confidentiality should throw an exception on wrap/unwrap. The current implementation does not implement wrap/unwrap correctly. This may cause security issues if the code using the mechanisms does not check for QOP correctly. Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Igor Soarez <i@soarez.me>	2025-01-14 11:30:01 +00:00
陳昱霖(Yu-Lin Chen)	4fcde4542b	KAFKA-18469;KAFKA-18036: AsyncConsumer should request metadata update if ListOffsetRequest encounters a retriable error (#18475 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>	2025-01-13 19:03:52 +01:00
Ken Huang	70d6312a3a	KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18050 ) Reviewers: Kirk True <ktrue@confluent.io>, Lianet Magrans <lmagrans@confluent.io>	2025-01-13 15:29:14 +01:00
Xuan-Zhang Gong	dbe27c9eb2	KAFKA-18467 enhance the docs of `NewTopic` - the first replica will be treated as the preferred leader (#18470 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-01-12 20:32:13 +08:00
Ismael Juma	d4aee71e36	KAFKA-18465: Remove MetadataVersions older than 3.0-IV1 (#18468 ) Apache Kafka 4.0 will only support KRaft and 3.0-IV1 is the minimum version supported by KRaft. So, we can assume that Apache Kafka 4.0 will only communicate with brokers that are 3.0-IV1 or newer. Note that KRaft was only marked as production-ready in 3.3, so we could go further and set the baseline to 3.3. I think we should have that discussion, but it made sense to start with the non controversial parts. Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <david.jacot@gmail.com>	2025-01-11 09:42:39 -08:00
Matthias J. Sax	f54cfff1dc	MINOR: simplify producer TX abort error handling (#18486 ) Reviewers: Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@responsive.dev>	2025-01-10 17:54:40 -08:00
Matthias J. Sax	3b38b016c8	KAFKA-17825: Update docs for ByteBufferDeserializer changes in 3.6 release (#18466 ) KIP-863 introduced a change to ByteBufferDeserializer which is not properly documented, but should be called out because it could surface bugs in application code that uses ByteBufferDeserializer. Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-01-10 15:32:51 -08:00
PoAn Yang	2b7c039971	KAFKA-18440: Convert AuthorizationException to fatal error in AdminClient (#18435 ) Reviewers: Divij Vaidya <diviv@amazon.com>	2025-01-10 11:12:28 +01:00
Colt McNealy	bb22eec478	KAFKA-17455: fix stuck producer when throttling or retrying (#17527 ) A producer might get stuck after it was throttled. This PR unblocks the producer by polling again after pollDelayMs in NetworkUtils#awaitReady(). Reviewers: Matthias J. Sax <matthias@confluent.io>, David Jacot <djacot@confluent.io>	2025-01-09 10:27:04 -08:00
Ismael Juma	cf7029c026	KAFKA-13093: Log compaction should write new segments with record version v2 (KIP-724) (#18321 ) Convert v0/v1 record batches to v2 during compaction even if said record batches would be written with no change otherwise. A few important details: 1. V0 compressed record batch with multiple records is converted into single V2 record batch. 2. V0 uncompressed records are converted into single record V2 record batches. 3. V0 records are converted to V2 records with timestampType set to `CreateTime` and the timestamp is `-1`. 4. The `KAFKA-4298` workaround is no longer needed since the conversion to V2 fixes the issue too. 5. Removed a log warning applicable to consumers older than 0.10.1 - they are no longer supported. 6. Added back the ability to append records with v0/v1 (for testing only). 7. The creation of the leader epoch cache is no longer optional since the record version config is effectively always V2. Add integration tests; these tests existed before #18267 and were restored, modified and extended. Reviewers: Jun Rao <jun@confluent.io>	2025-01-09 09:37:23 -08:00
xijiu	fcd98da9ae	KAFKA-18445 Remove LazyDownConversionRecords and LazyDownConversionRecordsSend (#18445 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-01-10 00:22:56 +08:00
Ken Huang	64b8b4a632	MINOR: Remove ZooKeeper mentions in Sanitizer (#18420 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>	2025-01-09 14:33:43 +01:00
Andrew Schofield	3f9d2c2db0	KAFKA-18433: Add BatchSize to ShareFetch request (1/N) (#18439 ) Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>	2025-01-08 15:29:43 +00:00
ShivsundarR	3c7ed3333d	KAFKA-18397: Added null check before sending background event from ShareConsumeRequestManager. (#18419 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-01-08 13:56:52 +00:00
Lianet Magrans	0721d21a57	KAFKA-18415: Fix for event queue metric and flaky test (#18416 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-01-08 14:31:10 +01:00

3681 Commits