Improves the documentation of the clusterId field in AddRaftVoterOptions
and RemoveRaftVoterOptions.
The changes include:
1. Adding Javadoc to both addRaftVoter and removeRaftVoter methods to
explain the behavior of the optional clusterId.
2. Adding integration tests to verify the correct behavior of add and
remove voter operations with and without clusterId, including scenarios
with inconsistent cluster ids.
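A minimal sketch of the two call shapes, assuming the KIP-853 Admin API
(addRaftVoter with and without options) and a setClusterId setter on the
options class; the voter id, directory id, endpoint, and cluster id
values are illustrative:
```java
import org.apache.kafka.clients.admin.AddRaftVoterOptions;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.RaftVoterEndpoint;
import org.apache.kafka.common.Uuid;

import java.util.Map;
import java.util.Set;

public class AddVoterExample {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.<String, Object>of("bootstrap.servers", "localhost:9092"))) {
            Set<RaftVoterEndpoint> endpoints =
                Set.of(new RaftVoterEndpoint("CONTROLLER", "controller-4", 9093));
            // Without a clusterId, the cluster id check is skipped.
            admin.addRaftVoter(4, Uuid.randomUuid(), endpoints).all().get();
            // With a clusterId, a request carrying an inconsistent id fails.
            AddRaftVoterOptions options = new AddRaftVoterOptions();
            options.setClusterId("my-cluster-id"); // setter shape assumed
            admin.addRaftVoter(4, Uuid.randomUuid(), endpoints, options).all().get();
        }
    }
}
```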
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Problem: When AsyncConsumer is closing, CoordinatorRequestManager stops
looking for the coordinator by returning EMPTY in its poll() method when
the closing flag is true. This prevents commitAsync() and other
coordinator-dependent operations from completing, causing close() to
hang until the timeout.
Solution:
Modified the closing flag check in the poll() method of
CommitRequestManager to be more targeted:
- Return EMPTY only when the coordinator is unknown and the consumer is
closing
- When this condition is met, proactively fail all pending commit
requests with CommitFailedException
- This allows the coordinator lookup to continue when the coordinator is
available during shutdown, while preventing indefinite hangs when the
coordinator is unreachable
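A self-contained sketch of the targeted check (all names are
illustrative stand-ins, not the actual CommitRequestManager internals):
```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

class CommitPollerSketch {
    volatile boolean closing;
    volatile boolean coordinatorKnown;
    final List<CompletableFuture<Void>> pendingCommits = new CopyOnWriteArrayList<>();

    List<Object> poll() {
        if (closing && !coordinatorKnown) {
            // Fail pending commits eagerly so commitAsync() callbacks fire
            // and close() does not block until its timeout.
            pendingCommits.forEach(future -> future.completeExceptionally(
                new IllegalStateException("consumer closing and coordinator unknown")));
            return List.of(); // the EMPTY result
        }
        // Otherwise keep building requests, so coordinator lookup and
        // in-flight commits can still complete during shutdown.
        return List.of();
    }
}
```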
Reviewers: PoAn Yang <payang@apache.org>, Andrew Schofield
<aschofield@confluent.io>, TengYao Chi <kitingiao@gmail.com>, Kirk True
<kirk@kirktrue.pro>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Lan Ding
<isDing_L@163.com>, TaiJuWu <tjwu1217@gmail.com>, Ken Huang
<s7133700@gmail.com>, KuoChe <kuoche1712003@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
The `logContext` attribute in `StreamsGroup` and `CoordinatorRuntime` is
not used anymore. This patch removes it.
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This PR mostly fixes the order of arguments in `assertEquals()` in the
Clients module, along with some minor cleanups.
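For reference, JUnit's `assertEquals` takes the expected value first;
swapped arguments pass and fail identically but produce misleading
failure messages:
```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class ArgumentOrderExampleTest {
    @Test
    void expectedValueComesFirst() {
        int actualCount = 2 + 1;
        assertEquals(3, actualCount); // correct: expected first, actual second
        // assertEquals(actualCount, 3) would also pass here, but on failure
        // its message would swap "expected" and "actual" and mislead readers.
    }
}
```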
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Jira: https://issues.apache.org/jira/browse/KAFKA-17554
In the previous workflow, the test passes under either of two conditions:
1. The `t1` thread is waiting for the main thread's `client.wakeup()`.
If successful, `t1` will wake up `t2`, allowing `t2` to complete the
future.
2. If `t1` fails to receive the `client.wakeup()` from the main thread,
`t2` will be woken up by the main thread.
In the previous implementation, we used a `CountDownLatch` to
coordinate the execution of three threads, but it often led to race
conditions. The test has now been modified to use two threads.
I ran `I=0; while ./gradlew :clients:test --tests
ConsumerNetworkClientTest.testFutureCompletionOutsidePoll --rerun
--fail-fast; do (( I=$I+1 )); echo "Completed run: $I"; sleep 1; done`
and it passed 3,000+ times.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
StreamsRebalanceListenerInvoker was implemented to match the behavior of
ConsumerRebalanceListenerInvoker; however, StreamsRebalanceListener has
a subtly different interface than ConsumerRebalanceListener - it does
not throw exceptions, but returns them as an Optional.
In the interest of consistency, this change fixes this mismatch by
changing the StreamsRebalanceListener interface to behave more like the
ConsumerRebalanceListener - throwing exceptions directly.
In another minor fix, the StreamsRebalanceListenerInvoker is changed to
simply skip callback execution instead of throwing an
IllegalStateException when no streamRebalanceListener is defined. This
can happen when the consumer is closed before Consumer.subscribe is
called.
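A schematic before/after of the interface change (the method shape is
illustrative, not the full StreamsRebalanceListener interface):
```java
import java.util.Optional;
import java.util.Set;

// Before: errors were returned, so the invoker had to unwrap them.
interface BeforeStyleListener {
    Optional<Exception> onTasksRevoked(Set<String> tasks);
}

// After: errors are thrown directly, matching ConsumerRebalanceListener,
// so both invokers can handle listener failures uniformly.
interface AfterStyleListener {
    void onTasksRevoked(Set<String> tasks);
}
```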
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Matthias J. Sax
<matthias@confluent.io>
This PR fixes a couple of race conditions in RemoteIndexCacheTest.
1. There was a race condition between the cache-cleanup thread and the
test thread, which checks that the cache entry is gone. This was fixed
with TestUtils#waitForCondition (see the sketch after this list).
2. After each test we check that there is no thread leak. This check
wasn't working properly: a thread's alive status is set at the JVM
level, and we can only set the interrupted status (using the private
native interrupt0() method under the hood), so we don't really know
when the JVM will change the thread's alive status. To fix this I've
refactored TestUtils#assertNoLeakedThreadsWithNameAndDaemonStatus to use
TestUtils#waitForCondition. This fix should also help a few other tests
that were flaky because of this check. See the Gradle run on
[develocity](https://develocity.apache.org/scans/tests?search.rootProjectNames=kafka&search.timeZoneId=Europe%2FLondon&tests.container=org.apache.kafka.storage.internals.log.RemoteIndexCacheTest&tests.sortField=FLAKY)
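A self-contained sketch of the waitForCondition fix; the map stands in
for the real cache, whose cleanup runs on a background thread:
```java
import org.apache.kafka.test.TestUtils;

import java.util.concurrent.ConcurrentHashMap;

public class WaitForConditionExample {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>();
        cache.put("entry", new Object());
        new Thread(() -> cache.remove("entry")).start(); // simulated cleaner thread
        // Poll until the cleaner has run instead of asserting immediately.
        TestUtils.waitForCondition(
            () -> !cache.containsKey("entry"),
            "Cache entry should be removed by the cleaner thread");
    }
}
```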
After the fix, the test was run 10,000 times with the repeated test
`./gradlew clean storage:test --tests
org.apache.kafka.storage.internals.log.RemoteIndexCacheTest.testCacheEntryIsDeletedOnRemoval`
... `Gradle Test Run :storage:test > Gradle Test Executor 20 >
RemoteIndexCacheTest > testCacheEntryIsDeletedOnRemoval() > repetition
9998 of 10000 PASSED` `Gradle Test Run :storage:test > Gradle Test
Executor 20 > RemoteIndexCacheTest > testCacheEntryIsDeletedOnRemoval()
> repetition 9999 of 10000 PASSED` `Gradle Test Run :storage:test >
Gradle Test Executor 20 > RemoteIndexCacheTest >
testCacheEntryIsDeletedOnRemoval() > repetition 10000 of 10000 PASSED`
`BUILD SUCCESSFUL in 20m 9s` `148 actionable tasks: 148 executed`
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>
All Kafka components register AppInfo metrics to track application
start time, commit id, etc. These metrics are useful for monitoring and
debugging. However, AppInfo doesn't provide the client-id, which is
important information for a custom metrics reporter.
The AppInfoParser class registers a JMX MBean with the provided
client-id, but when it adds metrics to the Metrics registry, the
client-id is not included. This KIP adds the client-id as a tag.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
- Improve the docs for Record Headers.
- Add integration tests to verify that the order of headers in a record
is preserved when producing and consuming.
- Add unit tests for RecordHeaders.java.
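A minimal sketch of the ordering guarantee being tested (topic, keys,
and values are illustrative):
```java
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;

public class HeaderOrderExample {
    public static void main(String[] args) {
        ProducerRecord<String, String> record =
            new ProducerRecord<>("demo-topic", "key", "value");
        // Headers are kept in the order they are added.
        record.headers()
              .add("trace-id", "abc".getBytes())
              .add("attempt", new byte[]{1});
        for (Header header : record.headers())
            System.out.println(header.key()); // prints "trace-id" then "attempt"
    }
}
```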
Reviewers: Ken Huang <s7133700@gmail.com>, Hong-Yi Chen
<apalan60@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Adding the missing metric to track the number of partitions assigned.
This metric should be registered whenever the consumer is using a
groupId, and should track the number of partitions from the subscription
state, regardless of the subscription type (manual or automatic).
This PR registers the missing metric as part of the
ConsumerRebalanceMetricsManager setup. This manager is created if there
is a group ID and is reused by the consumer membershipMgr and the
streamsMembershipMgr, so the metric is registered for both the new
consumer and streams.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, TengYao Chi
<frankvicky@apache.org>
Document the behavior of "-1" (HIGH_WATERMARK).
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Clarify the rebalance callbacks' behaviour (we got some questions about
onPartitionsAssigned; the docs were indeed confusing about the
partitions received in the params). Reviewed all rebalance callbacks
with this in mind.
Reviewers: Bill Bejeck <bbejeck@apache.org>
In the consumer, we invoke the onPartitionsRevoked or onPartitionsLost
rebalance callbacks when the consumer closes. The point is that the
application may want to commit, or wipe the state if we are closing
unsuccessfully.
In the StreamsRebalanceListener, we did not implement this behavior,
which means that when closing the consumer we may lose some progress
and, in the worst case, also miss that we have to wipe our local state
since we got fenced.
In this PR we implement StreamsRebalanceListenerInvoker, very similarly
to ConsumerRebalanceListenerInvoker and invoke it in Consumer.close.
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Matthias J. Sax
<matthias@confluent.io>, TengYao Chi <frankvicky@apache.org>,
Uladzislau Blok <123193120+UladzislauBlok@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Clarify the timeout errors received on send, distinguishing the case
where the topic is not in the metadata from the case where the
partition is not in the metadata. Add integration tests showcasing the
difference. Follow-up to the 4.1 fix for the misleading timeout error
message (https://issues.apache.org/jira/browse/KAFKA-8862).
Reviewers: TengYao Chi <frankvicky@apache.org>, Kuan-Po Tseng
<brandboat@gmail.com>
This pull request addresses KAFKA-19203 by replacing
`ApiError#exception` with `Errors#exception` in `KafkaAdminClient`. The
previous use of `ApiError#exception` was redundant, as we only need the
exception without the additional wrapping of `ApiError`.
## Changes
- Replaced some usages of `ApiError#exception` with `Errors#exception`
in `KafkaAdminClient`.
- Simplified exception handling logic to reduce unnecessary layers.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Similar to what was done for
AsyncKafkaConsumerTest::testFailConstructor,
[here](https://github.com/apache/kafka/pull/20491).
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>
Add the explanation of `null` for DeleteAclsRequest#ResourceNameFilter
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
If there's a failure in the Kafka consumer constructor, we attempt to
close it:
2329def2ff/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java (L540)
In that case, some components may not have been created yet, so we
should add null checks to avoid noisy logs about NPEs.
These noisy logs were reported with the console share consumer in a
similar scenario, so this task is to review and apply a similar fix for
the async consumer if needed.
The fix is to check whether handlers/invokers are null before trying to
close them, similar to what was done here:
https://github.com/apache/kafka/pull/20290
Reviewers: TengYao Chi <frankvicky@apache.org>, Lianet Magrans
<lmagrans@confluent.io>
We make three main changes in this PR:
- Disallowing null values for most LIST-type configurations makes sense,
since users cannot explicitly set a configuration to null in a
properties file. Therefore, only configurations with a default value of
null should be allowed to accept null.
- Disallowing duplicate values is reasonable, as there are currently no
known configurations in Kafka that require specifying the same value
multiple times. Allowing duplicates is both rare in practice and
potentially confusing to users.
- Disallowing empty lists, even though many configurations currently
accept them. In practice, setting an empty list for several of these
configurations can lead to server startup failures or unexpected
behavior. Therefore, enforcing non-empty lists helps prevent
misconfiguration and improves system robustness.
These changes may introduce some backward incompatibility, but this
trade-off is justified by the significant improvements in safety,
consistency, and overall user experience.
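A sketch of a validator enforcing these rules through the public
ConfigDef.Validator hook (illustrative, not the exact validator in the
PR):
```java
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;

import java.util.HashSet;
import java.util.List;

// Rejects null, empty, and duplicate-containing LIST values; configs
// whose default is null would simply not use this validator.
public class NonEmptyDistinctListValidator implements ConfigDef.Validator {
    @Override
    public void ensureValid(String name, Object value) {
        if (value == null)
            throw new ConfigException(name, null, "LIST config must not be null");
        List<?> values = (List<?>) value;
        if (values.isEmpty())
            throw new ConfigException(name, values, "LIST config must not be empty");
        if (new HashSet<>(values).size() != values.size())
            throw new ConfigException(name, values, "LIST config must not contain duplicates");
    }
}
```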
Additionally, we introduce two minor adjustments:
- Reclassify some STRING-type configurations as LIST-type, particularly
those using comma-separated values to represent multiple entries. This
change reflects the actual semantics used in Kafka.
- Update the default values for some configurations to better align with
other configs.
These changes will not introduce any compatibility issues.
Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Extend the consumer.close Javadoc to describe the error handling
behaviour.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>,
TengYao Chi <frankvicky@apache.org>
This reverts commit d86ba7f54a.
Reverting since we are planning to change how KIP-966 is implemented. We
should revert this RPC until we have more clarity on how this KIP will
be executed.
Reviewers: José Armando García Sancio <jsancio@apache.org>
There’s a difference in the two consumers’ `pollForFetches()` methods in
this case: `ClassicKafkaConsumer` doesn't block waiting for data in the
fetch buffer, but `AsyncKafkaConsumer` does.
In `ClassicKafkaConsumer.pollForFetches()`, after enqueuing the `FETCH`
request, the consumer makes a call to `ConsumerNetworkClient.poll()`. In
most cases `poll()` returns almost immediately because it successfully
sent the `FETCH` request. So even when the `pollTimeout` value is,
e.g., 3000 ms, the call to `ConsumerNetworkClient.poll()` doesn't block
that long waiting for a response.
After sending out a `FETCH` request, `AsyncKafkaConsumer` then calls
`FetchBuffer.awaitNotEmpty()` and proceeds to block there for the full
length of the timeout. In some cases, the response to the `FETCH` comes
back with no results, which doesn't unblock
`FetchBuffer.awaitNotEmpty()`. So because the application thread is
still waiting for data in the buffer, it remains blocked, preventing any
more `FETCH` requests from being sent, causing the long pauses in the
console consumer.
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Andrew Schofield
<aschofield@confluent.io>
1. Optimize the `equals()`, `hashCode()`, and `toString()` methods in
`OffsetAndMetadata`.
2. Add unit and integration tests for these modifications.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Sean Quah
<squah@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
*What*
- Currently in `ShareConsumerImpl`, we were not resetting the
`background-event-queue-size` metric to 0 after draining the events
from the queue.
- This PR fixes it by using `BackgroundEventHandler::drainEvents`
similar to `AsyncKafkaConsumer`.
- Added a unit test to verify the metric is reset to 0 after draining
the events.
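A self-contained sketch of the drain-and-record pattern (queue and
names are illustrative, not the BackgroundEventHandler API):
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DrainAndRecordExample {
    public static void main(String[] args) {
        BlockingQueue<String> backgroundEventQueue = new LinkedBlockingQueue<>();
        backgroundEventQueue.add("rebalance-event");

        // Drain everything in one step, then record the now-empty size so
        // the background-event-queue-size metric reads 0 after processing.
        List<String> drained = new ArrayList<>();
        backgroundEventQueue.drainTo(drained);
        System.out.println("metric value after drain: " + backgroundEventQueue.size());
    }
}
```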
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>
Add tests for producer state listing with a brokerId, without one, and
with an invalid one.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
*What*
https://issues.apache.org/jira/browse/KAFKA-18220
- Currently, `AsyncConsumerMetrics` extends `KafkaConsumerMetrics`, but
is used by both `AsyncKafkaConsumer` and `ShareConsumerImpl`.
- `ShareConsumerImpl` only needs the async consumer metrics (the
metrics associated with the new consumer threading model).
- This needs to be fixed: we unnecessarily have `KafkaConsumerMetrics`
as a parent class for `ShareConsumer` metrics.
Fix:
- In this PR, we have removed the dependency of `AsyncConsumerMetrics`
on `KafkaConsumerMetrics` and made it an independent class which both
`AsyncKafkaConsumer` and `ShareConsumerImpl` will use.
- The "`asyncConsumerMetrics`" field represents the metrics associated
with the new consumer threading model (like application event queue
size, background queue size, etc).
- The "`kafkaConsumerMetrics`" and "`kafkaShareConsumerMetrics`" fields
denote the actual consumer metrics for `KafkaConsumer` and
`KafkaShareConsumer` respectively.
Reviewers: Andrew Schofield <aschofield@confluent.io>
This is an attempt at improving the client configuration files. We now
have sections and comments similar to the other properties files.
Reviewers: Kirk True <ktrue@confluent.io>, Luke Chen <showuon@gmail.com>
---------
Signed-off-by: Federico Valeri <fedevaleri@gmail.com>
### Summary
This PR fixes two critical issues related to producer batch splitting
that can cause infinite retry loops and stack overflow errors when batch
sizes are significantly larger than broker-configured message size
limits.
### Issues Addressed
- **KAFKA-8350**: Producers endlessly retry batch splitting when
`batch.size` is much larger than topic-level `message.max.bytes`,
leading to infinite retry loops with "MESSAGE_TOO_LARGE" errors
- **KAFKA-8202**: Stack overflow errors in
`FutureRecordMetadata.chain()` due to excessive recursive splitting
attempts
### Root Cause
The existing batch splitting logic in
`RecordAccumulator.splitAndReenqueue()` always used the configured
`batchSize` parameter for splitting, regardless of whether the batch had
already been split before. This caused:
1. **Infinite loops**: When `batch.size` (e.g., 8MB) >>
`message.max.bytes` (e.g., 1MB), splits would never succeed since the
split size was still too large
2. **Stack overflow**: Repeated splitting attempts created deep call
chains in the metadata chaining logic
### Solution
Implemented progressive batch splitting logic:
```java
int maxBatchSize = this.batchSize;
if (bigBatch.isSplitBatch()) {
    maxBatchSize = Math.max(bigBatch.maxRecordSize,
        bigBatch.estimatedSizeInBytes() / 2);
}
```
__Key improvements:__
- __First split__: Uses original `batchSize` (maintains backward
compatibility)
- __Subsequent splits__: Uses the larger of:
  - `maxRecordSize`: Ensures we can always split down to individual records
  - `estimatedSizeInBytes() / 2`: Provides geometric reduction for faster convergence
### Testing
Added comprehensive test
`testSplitAndReenqueuePreventInfiniteRecursion()` that:
- Creates oversized batches with 100 records of 1KB each
- Verifies splitting can reduce batches to single-record size
- Ensures no infinite recursion (safety limit of 100 operations)
- Validates no data loss or duplication during splitting
- Confirms all original records are preserved with correct keys
### Backward Compatibility
- No breaking changes to public APIs
- First split attempt still uses original `batchSize` configuration
- Progressive splitting only engages for retry scenarios
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson
<jason@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
---------
Co-authored-by: Michael Knox <mrknox@amazon.com>
This field was used for replica_id, but after
51c833e795,
the OffsetsForLeaderEpochRequest directly relies on the internal structs
generated by the automated protocol. Therefore, we can safely remove it.
Reviewers: Lan Ding <isDing_L@163.com>, TengYao Chi
<frankvicky@apache.org>
This is the first part of the implementation of
[KIP-1023](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1023%3A+Follower+fetch+from+tiered+offset)
The purpose of this pull request is for the broker to start returning
the correct offset when it receives -6 as a timestamp in a ListOffsets
API request.
Added unit tests for the new timestamp.
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
This PR adds documentation to the `alterLogLevelConfigs` method to
remind users to use valid LogLevelConfig constants.
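For context, broker log levels are altered through the BROKER_LOGGER
config path; a sketch of using a LogLevelConfig constant instead of a
free-form string (broker id and logger name are illustrative):
```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.LogLevelConfig;

import java.util.Collection;
import java.util.List;
import java.util.Map;

public class LogLevelExample {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.<String, Object>of("bootstrap.servers", "localhost:9092"))) {
            ConfigResource broker0Loggers =
                new ConfigResource(ConfigResource.Type.BROKER_LOGGER, "0");
            // The constant guards against typos like "debug" or "Debug".
            AlterConfigOp setDebug = new AlterConfigOp(
                new ConfigEntry("kafka.server.ReplicaManager", LogLevelConfig.DEBUG_LOG_LEVEL),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(
                Map.<ConfigResource, Collection<AlterConfigOp>>of(broker0Loggers, List.of(setDebug)))
                .all().get();
        }
    }
}
```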
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This PR removes associated logging within NetworkClient to reduce noise
and streamline the client code.
Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur
<mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This PR ensures that describeTopics correctly propagates its timeoutMs
setting to the underlying describeCluster call. Integration tests were
added to verify that the API now fails with a TimeoutException when
brokers do not respond within the configured timeout.
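A sketch of the behaviour under test; the timeout set here now also
bounds the internal describeCluster call (broker address is
illustrative):
```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.DescribeTopicsOptions;

import java.util.List;
import java.util.Map;

public class DescribeTopicsTimeoutExample {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.<String, Object>of("bootstrap.servers", "localhost:9092"))) {
            admin.describeTopics(List.of("demo-topic"),
                      new DescribeTopicsOptions().timeoutMs(5_000))
                 .allTopicNames()
                 .get(); // fails with a (wrapped) TimeoutException if brokers don't respond in time
        }
    }
}
```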
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
`testAsyncConsumerClassicConsumerSubscribeInvalidTopicCanUnsubscribe` does not align with the test case. This patch renames it to describe the test case more precisely.
Reviewers: TengYao Chi <frankvicky@apache.org>
Fix typos and docs in the following files.
```
clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRebalanceListener.java
clients/src/main/resources/common/message/FetchRequest.json
raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java
```
Reviewers: Kuan-Po Tseng <brandboat@gmail.com>, Lan Ding
<isDing_L@163.com>, Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, PoAn Yang
<payang@apache.org>
This PR does the following:
- Rewrite to new test infra.
- Rewrite to java.
- Move to clients-integration-tests.
- Add an `ensureConsistentMetadata` method to `ClusterInstance`,
similar to `ensureConsistentKRaftMetadata` in the old infra, and
refactor related code.
Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
<s7133700@gmail.com>
Refactor metric gauge instantiation to use lambda expressions instead
of ImmutableValue.
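The shape of the refactor, sketched against the public Metrics API
(metric name and value are illustrative):
```java
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.metrics.Gauge;
import org.apache.kafka.common.metrics.Metrics;

public class GaugeLambdaExample {
    public static void main(String[] args) {
        try (Metrics metrics = new Metrics()) {
            MetricName name = metrics.metricName("queue-size", "example-group");
            int queueSize = 42;
            // A lambda replaces a small wrapper class that only held a value.
            metrics.addMetric(name, (Gauge<Integer>) (config, now) -> queueSize);
        }
    }
}
```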
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
*What*
https://issues.apache.org/jira/browse/KAFKA-19572
- If a `ShareConsumer` constructor failed due to any exception, we call
`close()` in the catch block.
- If uninitialized members were accessed during `close()`, it would
throw an NPE. Previously there were no null checks, so we were
attempting to use these fields during `close()` execution.
- To avoid this, the PR adds null checks in the `close()` function
before accessing fields that could be null.
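An illustrative pattern for the fix, guarding fields that may never
have been initialized if the constructor threw (field names are
hypothetical):
```java
import java.io.Closeable;
import java.io.IOException;

public class GuardedCloseExample implements Closeable {
    private Closeable fetchBuffer;            // may be null if construction failed
    private Closeable backgroundEventReaper;  // may be null if construction failed

    @Override
    public void close() throws IOException {
        // Null checks prevent the NPEs seen when close() runs from the
        // constructor's catch block.
        if (fetchBuffer != null)
            fetchBuffer.close();
        if (backgroundEventReaper != null)
            backgroundEventReaper.close();
    }
}
```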
Reviewers: Apoorv Mittal <amittal@confluent.io>, Lianet Magrans
<lmagrans@confluent.io>
Fixes a typo in ProducerConfig: Renames
`PARTITIONER_ADPATIVE_PARTITIONING_ENABLE_CONFIG` →
`PARTITIONER_ADAPTIVE_PARTITIONING_ENABLE_CONFIG`
The old key is retained for backward compatibility.
See: [KIP-1175: Fix the typo `PARTITIONER_ADPATIVE_PARTITIONING_ENABLE`
in ProducerConfig](https://cwiki.apache.org/confluence/x/KYogFQ)
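A sketch of the resulting constants (deprecation of the old key is an
assumption here; both point at the same config name):
```java
public final class PartitionerConfigKeysSketch {
    // Old, misspelled key retained so existing references keep compiling.
    @Deprecated
    public static final String PARTITIONER_ADPATIVE_PARTITIONING_ENABLE_CONFIG =
        "partitioner.adaptive.partitioning.enable";

    // New, correctly spelled key introduced by KIP-1175.
    public static final String PARTITIONER_ADAPTIVE_PARTITIONING_ENABLE_CONFIG =
        "partitioner.adaptive.partitioning.enable";
}
```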
Reviewers: Yung <yungyung7654321@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>, Nick Guo
<lansg0504@gmail.com>, Ranuga Disansa <go2ranuga@gmail.com>
Along with the change https://github.com/apache/kafka/pull/17952
([KIP-966](https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas)),
the semantics of the `min.insync.replicas` config changed slightly and
gained some constraints. We should document them clearly.
Reviewers: Jun Rao <junrao@gmail.com>, Calvin Liu <caliu@confluent.io>,
Mickael Maison <mickael.maison@gmail.com>, Paolo Patierno
<ppatierno@live.com>, Federico Valeri <fedevaleri@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
OffsetFetchResponses can have three different error structures depending
on the version. Version 2 adds a top level error code for group-level
errors. Version 8 adds support for querying multiple groups at a time
and nests the fields within a groups array. Add a test for the
errorCounts implementation since it varies depending on the version.
Reviewers: Dongnuo Lyu <dlyu@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>