kafka

Commit Graph

Author	SHA1	Message	Date
Chia-Ping Tsai	eab771fae3	MINOR: update GitHub collaborators (#20309 ) based on https://github.com/apache/kafka/graphs/contributors?from=2024%2F8%2F3 Reviewers: PoAn Yang <payang@apache.org>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Nick Guo <lansg0504@gmail.com>	2025-08-06 04:03:59 +08:00
Stig Døssing	25da705178	MINOR: Run CI with Java 24 (#20295 ) This commit updates CI to test against Java 24 instead of Java 23 which is EOL. Due to Spotbugs not having released version 4.9.4 yet, we can't run Spotbugs on Java 24. Instead, we are choosing to run Spotbugs, and the rest of the compile and validate build step, on Java 17 for now. Once 4.9.4 has released, we will switch to using Java 24 for this. Exclude spotbugs from the run-tests gradle action. Spotbugs is already being run once in the build by "compile and validate", there is no reason to run it again as part of executing tests. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-08-05 21:26:13 +08:00
Sanskar Jhajharia	b9413ea4d6	MINOR: Cleanup Tools Module (2/n) (#20096 ) Now that Kafka support Java 17, this PR makes some changes in tools module. The changes in this PR are limited to only some files. A future PR(s) shall follow. The changes mostly include: - Collections.emptyList(), Collections.singletonList() and Arrays.asList() are replaced with List.of() - Collections.emptyMap() and Collections.singletonMap() are replaced with Map.of() - Collections.singleton() is replaced with Set.of() Some minor changes to use the enhanced switch. Sub modules targeted: tools/src/test Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-08-05 21:16:01 +08:00
Ken Huang	a12d38f091	MINOR: Cleanup RaftClient#upgradeKRaftVersion java document (#20275 ) The compiler warning is due to a lack of import. This patch imports the ApiException to fix it. Reviewers: TengYao Chi <frankvicky@apache.org>, Yung <yungyung7654321@gmail.com>	2025-08-05 19:06:17 +08:00
Priyanka K U	eb9f5189f5	KAFKA-16913: Support external schemas in JSONConverter (#19449 ) When using a connector that requires a schema, such as JDBC connectors, with JSON messages, the current JSONConverter necessitates including the schema within every message. To address this, we are introducing a new parameter, schema.content, which allows you to provide the schema externally. This approach not only reduces the size of the messages but also facilitates the use of more complex schemas. KIP : https://cwiki.apache.org/confluence/display/KAFKA/KIP-1054%3A+Support+external+schemas+in+JSONConverter Reviewers: Mickael Maison <mickael.maison@gmail.com>, TengYao Chi <frankvicky@apache.org>, Edoardo Comar <ECOMAR@uk.ibm.com>	2025-08-05 10:00:14 +02:00
majialong	e78977e505	KAFKA-19579 Add missing metrics for doc tiered storage (#20304 ) Add missing metrics for document tiered storage - kafka.log.remote:type=RemoteLogManager,name=RemoteLogReaderFetchRateAndTimeMs：Introduced in [KIP-1018](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests) - kafka.server:type=DelayedRemoteListOffsetsMetrics,name=ExpiresPerSec,topic=([-.\w]),partition=([0-9])：Introduced in [KIP-1075](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1075%3A+Introduce+delayed+remote+list+offsets+purgatory+to+make+LIST_OFFSETS+async) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Lan Ding <isDing_L@163.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>	2025-08-05 13:25:24 +08:00
Jared Harley	66b3c07954	KAFKA-19576 Fix typo in state-change log filename after rotate (#20269 ) The `state-change.log` file is being incorrectly rotated to `stage-change.log.[date]`. This change fixes the typo to have the log file correctly rotated to `state-change.log.[date]` _No functional changes._ Reviewers: Mickael Maison <mickael.maison@gmail.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Ken Huang <s7133700@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-08-05 12:22:54 +08:00
Federico Valeri	cd9dde11de	MINOR: Improve skip-record-metadata description (#20291 ) This flag also skips control records, so the description needs to be updated. --------- Signed-off-by: Federico Valeri <fedevaleri@gmail.com> Reviewers: Luke Chen <showuon@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Vincent Potucek	2025-08-05 08:50:50 +08:00
Stig Døssing	cb9e7fe1c6	MINOR: Upgrade Spotbugs to 4.9.1 (#20294 ) Add exclusions for new warnings to allow this upgrade. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-08-05 02:50:47 +08:00
Luke Chen	657b496f3c	MINOR: improve the min.insync.replicas doc (#20237 ) Along with the change: https://github.com/apache/kafka/pull/17952 ([KIP-966](https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas)), the semantics of `min.insync.replicas` config has small change, and add some constraints. We should document them clearly. Reviewers: Jun Rao <junrao@gmail.com>, Calvin Liu <caliu@confluent.io>, Mickael Maison <mickael.maison@gmail.com>, Paolo Patierno <ppatierno@live.com>, Federico Valeri <fedevaleri@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-08-05 00:22:13 +08:00
Sean Quah	904ee87b85	MINOR: Add missing test coverage for OffsetFetchResponse.errorCounts() (#20263 ) OffsetFetchResponses can have three different error structures depending on the version. Version 2 adds a top level error code for group-level errors. Version 8 adds support for querying multiple groups at a time and nests the fields within a groups array. Add a test for the errorCounts implementation since it varies depending on the version. Reviewers: Dongnuo Lyu <dlyu@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-08-04 17:50:37 +08:00
Chang-Chi Hsu	888861d803	MINOR: Replace boundPort with brokerBoundPort (#20297 ) ## Changes: - Replaced all references to boundPort with brokerBoundPort. ## Reasons - boundPort and brokerBoundPort share the same definition and behavior. Reviewers: TaiJuWu <tjwu1217@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-08-04 16:44:12 +08:00
Federico Valeri	e67c042f7f	KAFKA-18607: Update jfreechart dependency (#20264 ) This patch updates the code and the dependency with the latest namespace and version. Signed-off-by: Federico Valeri <fedevaleri@gmail.com> Reviewers: Mickael Maison <mickael.maison@gmail.com>	2025-08-04 10:37:36 +02:00
PoAn Yang	ea771563e0	KAFKA-14604: SASL session expiration time will be overflowed when calculation (#18526 ) CI / build (push) Has been cancelled Details Fixup PR Labels / fixup-pr-labels (needs-attention) (push) Has been cancelled Details Fixup PR Labels / fixup-pr-labels (triage) (push) Has been cancelled Details Docker Image CVE Scanner / scan_jvm (3.7.2) (push) Has been cancelled Details Docker Image CVE Scanner / scan_jvm (3.8.1) (push) Has been cancelled Details Docker Image CVE Scanner / scan_jvm (3.9.1) (push) Has been cancelled Details Docker Image CVE Scanner / scan_jvm (4.0.0) (push) Has been cancelled Details Docker Image CVE Scanner / scan_jvm (latest) (push) Has been cancelled Details Flaky Test Report / Flaky Test Report (push) Has been cancelled Details Fixup PR Labels / needs-attention (push) Has been cancelled Details The timeout value may be overflowed if users set a large expiration time. ``` sessionExpirationTimeNanos = authenticationEndNanos + 1000 * 1000 * sessionLifetimeMs; ``` Fixed it by throwing exception if the value is overflowed. Reviewers: TaiJuWu <tjwu1217@gmail.com>, Luke Chen <showuon@gmail.com>, TengYao Chi <frankvicky@apache.org> Signed-off-by: PoAn Yang <payang@apache.org>	2025-08-03 19:12:04 +08:00
Tsung-Han Ho (Miles Ho)	3f1d830174	MINOR: Remove duplicate renewTimePeriodOpt in DelegationTokenCommand validation (#20177 ) The bug was a duplicate parameter validation in the `DelegationTokenCommand` class. The `checkInvalidArgs` method for the `describeOpt` was incorrectly including `renewTimePeriodOpt` twice in the set of invalid arguments. This bug caused unexpected command errors during E2E testing. ### Before the fix: The following command would fail due to the duplicate validation logic: ``` TC_PATHS="tests/kafkatest/tests/core/delegation_token_test.py::DelegationTokenTest" /bin/bash tests/docker/run_tests.sh ``` ### Error output: ``` ducktape.cluster.remoteaccount.RemoteCommandError: ducker@ducker03: Command 'KAFKA_OPTS="-Djava.security.auth.login.config=/mnt/security/jaas.conf -Djava.security.krb5.conf=/mnt/security/krb5.conf" /opt/kafka-dev/bin/kafka-delegation-tokens.sh --bootstrap-server ducker03:9094 --create --max-life-time-period -1 --command-config /mnt/kafka/client.properties > /mnt/kafka/delegation_token.out' returned non-zero exit status 1. Remote error message: b'duplicate element: [renew-time-period]\njava.lang.IllegalArgumentException: duplicate element: [renew-time-period]\n\tat java.base/java.util.ImmutableCollections$SetN.<init>(ImmutableCollections.java:918)\n\tat java.base/java.util.Set.of(Set.java:544)\n\tat org.apache.kafka.tools.DelegationTokenCommand$DelegationTokenCommandOptions.checkArgs(DelegationTokenCommand.java:304)\n\tat org.apache.kafka.tools.DelegationTokenCommand.execute(DelegationTokenCommand.java:79)\n\tat org.apache.kafka.tools.DelegationTokenCommand.mainNoExit(DelegationTokenCommand.java:57)\n\tat org.apache.kafka.tools.DelegationTokenCommand.main(DelegationTokenCommand.java:52)\n\n' [INFO:2025-07-31 11:27:25,531]: RunnerClient: kafkatest.tests.core.delegation_token_test.DelegationTokenTest.test_delegation_token_lifecycle.metadata_quorum=ISOLATED_KRAFT: Data: None ================================================================================ SESSION REPORT (ALL TESTS) ducktape version: 0.12.0 session_id: 2025-07-31--002 run time: 33.213 seconds tests run: 1 passed: 0 flaky: 0 failed: 1 ignored: 0 ================================================================================ test_id: kafkatest.tests.core.delegation_token_test.DelegationTokenTest.test_delegation_token_lifecycle.metadata_quorum=ISOLATED_KRAFT status: FAIL run time: 33.090 seconds ``` ### After the fix: The same command now executes successfully: ``` TC_PATHS="tests/kafkatest/tests/core/delegation_token_test.py::DelegationTokenTest" /bin/bash tests/docker/run_tests.sh ``` ### Success output: ``` ================================================================================ SESSION REPORT (ALL TESTS) ducktape version: 0.12.0 session_id: 2025-07-31--001 run time: 35.488 seconds tests run: 1 passed: 1 flaky: 0 failed: 0 ignored: 0 ================================================================================ test_id: kafkatest.tests.core.delegation_token_test.DelegationTokenTest.test_delegation_token_lifecycle.metadata_quorum=ISOLATED_KRAFT status: PASS run time: 35.363 seconds -------------------------------------------------------------------------------- ``` Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TengYao Chi <frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>, PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>	2025-08-03 16:40:18 +08:00
Apoorv Mittal	05d71ad1a8	KAFKA-19476: Concurrent execution fixes for lock timeout and lso movement (#20286 ) CI / build (push) Has been cancelled Details The PR fixes following: 1. In case share partition arrive at a state which should be treated as final state of that batch/offset (example - LSO movement which causes offset/batch to be ARCHIVED permanently), the result of pending write state RPCs for that offset/batch override the ARCHIVED state. Hence track such updates and apply when transition is completed. 2. If an acquisition lock timeout occurs while an offset/batch is undergoing transition followed by write state RPC failure, then respective batch/offset can land in a scenario where the offset stays in ACQUIRED state with no acquisition lock timeout task. 3. If a timer task is cancelled, but due to concurrent execution of timer task and acknowledgement, there can be a scenario when timer task has processed post cancellation. Hence it can mark the offset/batch re-avaialble despite already acknowledged. Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>	2025-08-01 23:20:25 +01:00
Andrew Schofield	b909544e99	MINOR: Improve consistency of acknowledge type terminology (#20282 ) The code had a mixture of "acknowledgement type" and "acknowledge type". The latter is preferred. Reviewers: TengYao Chi <frankvicky@apache.org>, Lan Ding <isDing_L@163.com>	2025-08-01 21:17:22 +01:00
Shivsundar R	e1f45218c9	KAFKA-19485 (II) : Complete any pending acknowledgements in ShareFetch on an error response. (#20247 ) CI / build (push) Waiting to run Details What Currently when we received a top level error response in ShareFetch, we would log the error, update the share session epoch and proceed to the next request. But these acknowledgements(if any) are not completed and the callback would not have been processed. PR aims to address this by completing these acknowledgements with the error code from the response in this case. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-07-31 22:07:44 +01:00
Andrew Schofield	f38359300b	MINOR: Fix javadoc mistake (#20281 ) Fixes a couple of tiny mistakes in the javadoc for KafkaShareConsumer. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-07-31 19:25:04 +01:00
majialong	6b96735872	MINOR: Fix typo in GetOffsetShell (#20277 ) Fix typo in GetOffsetShell : `visible for tseting` -> `Visible for testing` Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-08-01 00:48:31 +08:00
Manikumar Reddy	f73e3bcd6a	KAFKA-19561: Set OP_WRITE interest after SASL reauthentication to resume pending writes (#20258 ) https://issues.apache.org/jira/browse/KAFKA-19561 Addresses a race condition during SASL reauthentication where the server-side `KafkaChannel.send()` queues a response, but OP_WRITE is removed before the channel becomes writable — resulting in stuck responses and client timeouts. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	2025-07-31 21:59:21 +05:30
TaiJuWu	dfaf9f9cf7	MINOR: add test for `kafka-consumer-groups.sh` should not fail when partition offline (#20235 ) CI / build (push) Waiting to run Details See: https://github.com/apache/kafka/pull/20168#discussion_r2227310093 add follow test case: Given a topic with three partitions, where partition `t-2` is offline, if partitionsToReset contains only `t-1`, the method filterNoneLeaderPartitions incorrectly includes `t-2` in the result, leading to a failure in the tool. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang <s7133700@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-07-31 15:54:27 +01:00
Lan Ding	d0a9a04a02	MINOR: Cleanups in storage module (#20270 ) Cleanups including: - Rewrite `FetchCountAndOp` as a record class - Replace `Tuple` by `Map.Entry` Reviewers: TengYao Chi <frankvicky@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>	2025-07-31 20:55:49 +08:00
Sanskar Jhajharia	c7caf912aa	KAFKA-19524 Fix UnsupportedOperationException in connect-plugin-path (#20222 ) ### Problem The connect-plugin-path tool crashes with `UnsupportedOperationException` when processing plugins that have loadable classes but no ServiceLoader manifest files. ### Root Cause Line 326 attempts to remove from an immutable `Collections.emptySet()`: ```java nonLoadableManifests.getOrDefault(pluginDesc.className(), Collections.emptySet()).remove(pluginDesc.type()); ``` ### Solution Replace `Collections.emptySet()` with `new HashSet<>()` to provide a mutable set. Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-07-31 20:37:35 +08:00
Mickael Maison	fd9b5514ad	MINOR: Cleanups in coordinator-common/group-coordinator (#20097 ) - Use `record` where possible - Use enhanced `switch` - Tweak a bunch of assertions Reviewers: Yung <yungyung7654321@gmail.com>, TengYao Chi <frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>, Dongnuo Lyu <dlyu@confluent.io>, PoAn Yang <payang@apache.org>	2025-07-31 16:34:08 +08:00
Lucas Brutschy	44c6e956ed	KAFKA-19529: State updater sensor names should be unique (#20262 ) CI / build (push) Waiting to run Details All state updater threads use the same metrics instance, but do not use unique names for their sensors. This can have the following symptoms: 1) Data inserted into one sensor by one thread can affect the metrics of all state updater threads. 2) If one state updater thread is shutdown, the metrics associated to all state updater threads are removed. 3) If one state updater thread is started, while another one is removed, it can happen that a metric is registered with the `Metrics` instance, but not associated to any `Sensor` (because it is concurrently removed), which means that the metric will not be removed upon shutdown. If a thread with the same name later tries to register the same metric, we may run into a `java.lang.IllegalArgumentException: A metric named ... already exists`, as described in the ticket. This change fixes the bug giving unique names to the sensors. A test is added that there is no interference of the removal of sensors and metrics during shutdown. Reviewers: Matthias J. Sax <matthias@confluent.io>	2025-07-31 08:58:52 +02:00
Luke Chen	a058123cd8	KAFKA-19563: improve the add controller doc (#20261 ) JIRA: [KAFKA-19563](https://issues.apache.org/jira/browse/KAFKA-19563) Improve the add-controller doc to notify users they have to add the configs to be passed to admin client into the controller.properties before the improvement is done. ``` --command-config COMMAND_CONFIG Property file containing configs to be passed to Admin Client. For add-controller, the file is used to specify the controller properties as well. ``` Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-07-31 14:09:47 +08:00
Now	dda1b5a4e8	MINOR: Fix duplicate 'to' in ExactlyOnceMessageProcessor javadoc (#20228 ) CI / build (push) Waiting to run Details Fixed a simple typo in javadoc comment where "to to" appeared instead of "to". _No functional changes_ Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-07-30 23:59:49 +08:00
Kevin Wu	1bcaa19c46	KAFKA-19489; Extra validation when formatting a node (#20136 ) CI / build (push) Waiting to run Details This PR adds a check to the storage tool's format command which throws a TerseFailure when the controller.quorum.voters config is defined and the node is formatted with the --standalone flag or the --initial-controllers flag. Without this check, it is possible to have two voter sets. For example, in a three node setup, the two nodes that formatted with --no-initial-controllers could form quorum with each other since they have the static voter set, and the --standalone node would ignore the config and read the voter set of itself from its log, forming its own quorum of 1. Reviewers: José Armando García Sancio <jsancio@apache.org>, TaiJuWu <tjwu1217@gmail.com>, Alyssa Huang <ahuang@confluent.io>	2025-07-30 10:58:08 -04:00
Jhen-Yung Hsu	4c1b79851f	MINOR: fix e2e docs for test that cannot be run directly (#20145 ) `tests/kafkatest/services/console_consumer.py` cannot be run directly. The correct way is to run `tests/kafkatest/sanity_checks/test_console_consumer.py`. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-07-30 20:57:57 +08:00
Hong-Yi Chen	a5e3bfa232	KAFKA-19527 improve the docs of LogDirDescription for remote storage (#20211 ) Clarifies that the fields `LogDirDescription#totalBytes`, `LogDirDescription#usableBytes`, and `ReplicaInfo#size` do not include the size of remote storage by updating their corresponding docs. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-07-30 20:41:45 +08:00
Mickael Maison	6973deab03	MINOR: Cleanups in storage module (#20087 ) Cleanups including: - Java 17 syntax, record and switch - assertEquals() order - javadoc Reviewers: Andrew Schofield <aschofield@confluent.io>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-07-30 16:02:01 +08:00
Jinhe Zhang	81bdf0b889	MINOR: Remove duplicated code in PgeViewDemo (#20252 ) CI / build (push) Waiting to run Details Remove the default implementation Reviewers: Lucas Brutschy <lucasbru@apache.org>	2025-07-29 18:18:38 +02:00
Ken Huang	96c8e86cdf	KAFKA-19530 RemoteLogManager should record lag stats when remote storage is offline (#20218 ) CI / build (push) Waiting to run Details When remote storage is offline, then the segmentLag and bytesLag metrics are not recorded. These metrics are useful to know the pending data to upload when remote storage is down. Reviewers: TaiJuWu <tjwu1217@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>	2025-07-29 20:08:06 +05:30
George Wu	7dba91d025	KAFKA-19484: Fix bug with tiered storage throttle metrics (#20129 ) Fixes a bug with tiered storage quota metrics introduced in [KIP-956](https://cwiki.apache.org/confluence/display/KAFKA/KIP-956+Tiered+Storage+Quotas). The metrics tracking how much time have been spent in a throttled state can stop reporting if a cluster stops stops doing remote copy/fetch and the sensors go inactive. This change delegates the job of refreshing inactive sensors to SensorAccess. There's pretty similar logic in RLMQuotaManager which is actually responsible for tracking and enforcing quotas and also uses a Sensor object. ``` remote-fetch-throttle-time-avg remote-copy-throttle-time-avg remote-fetch-throttle-time-max remote-copy-throttle-time-max ``` Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>	2025-07-29 19:37:41 +05:30
lucliu1108	4ff851a562	MINOR: Delete the redundant feature upgrade instruction for running KIP-1071 EA (#20250 ) Follow up on https://github.com/apache/kafka/pull/20241. Delete the instruction that manually set `streams.version=1` for running Kafka 4.1 since it is already achieved in previous setup steps. Reviewers: Lucas Brutschy <lucasbru@apache.org>	2025-07-29 14:28:30 +02:00
jimmy	dd784e7d7a	KAFKA-16717 [3/N]: Add AdminClient.alterShareGroupOffsets (#19820 ) [KAFKA-16717](https://issues.apache.org/jira/browse/KAFKA-16717) aims to finish the AlterShareGroupOffsets for ShareGroupCommand part. Reviewers: Lan Ding <isDing_L@163.com>, Chia-Ping Tsai <chia7712@gmail.com>, TaiJuWu <tjwu1217@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-07-29 11:47:24 +01:00
Shivsundar R	40b4fdb0d8	KAFKA-19559: Remove ShareFetchMetricsManager sensors on consumer.close() (#20255 ) What https://issues.apache.org/jira/browse/KAFKA-19559 - There are a few sensors(created when a `ShareConsumer` initializes) which are not removed when the `ShareConsumer` closes. - To maintain consistency and prevent any leaks, we have to remove all the sensors when the consumer is closed. This is similar to this issue for regular consumers - https://issues.apache.org/jira/browse/KAFKA-19542 Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>	2025-07-29 11:10:48 +01:00
Apoorv Mittal	875537f54b	KAFKA-19555: Restrict records acquisition post max in-flight limit (#20253 ) The PR restricts the records being acquired post max-inflight limit. Previously the max in-flight limit was only enforced while considering the share partition for further fetches i.e. once the limit was reached the share partition was not considered for further fetches. However, when the records are actively released then there might be some records being acquired post max-inflight limit. This is evident with higher number of consumers reading from same share partition and releasing the records. Reviewers: Andrew Schofield <aschofield@confluent.io>, Lan Ding <isDing_L@163.com>	2025-07-29 10:40:06 +01:00
Uladzislau Blok	f9ccf83a7f	KAFKA-18066: Fix mismatched StreamThread ID in log messages (#19517 ) CI / build (push) Waiting to run Details This PR fixes an issue where the thread name shown in log messages did not match the actual execution context. Previously, log entries displayed the context of the newly created thread, while the logger reflected the current executing thread. This mismatch led to confusion and made log tracing more difficult. Changes: - Use logger without context to not have context - Updated log messages to explicitly describe the thread being created - Fixed instances where the log context reflected the current thread instead of the newly created one Reviewers: Matthias J. Sax <matthias@confluent.io>	2025-07-28 17:00:50 -07:00
Rajani K	de2adb69de	KAFKA-12281: Deprecate BrokerNotFoundException (#20192 ) CI / build (push) Waiting to run Details Implements KIP-1195. BrokerNotFoundException exception is unused since 2.8 Marking it deprecated so that it can be removed in next major release. Reviewers: Matthias J. Sax <matthias@confluent.io>	2025-07-28 15:18:34 -07:00
Alyssa Huang	7339d65b27	KAFKA-19354: KRaft observer should send fetch to best node (#19854 ) Observers may get stuck fetching from bootstrap servers even on discovery of a leader from a fetch response. Observers currently fetch from bootstrap servers once their fetch timeout expires, and even on rediscovery of the leader (e.g. observer receives fetch response indicating location of leader), observers will not attempt to resume fetching from the leader. This change simplifies observer polling logic to just rely on maybeSendFetchToBestNode which takes care of the logic of selecting either the leader or bootstrap servers as the destination based on timeouts and backoffs. Reviewers: José Armando García Sancio <jsancio@apache.org>	2025-07-28 13:03:14 -04:00
Lan Ding	dfced692d2	KAFKA-19551: Remove the handling of FatalExitError in RemoteStorageThreadPool (#20245 ) CI / build (push) Waiting to run Details FatalExitError is not thrown after [KAFKA-19425](https://issues.apache.org/jira/browse/KAFKA-19425). Clean up the handling of FatalExitError in `RemoteStorageThreadPool`.	2025-07-28 17:49:45 +05:30
stroller	e61c297b73	KAFKA-19425: Stop the server when fail to initialize to avoid local segment never got deleted. (#20007 ) We found that one broker's local segment on disk never get removed forever no matter how long it stored. The disk always keep increasing. ![image](https://github.com/user-attachments/assets/42129bb6-7d07-481b-923f-971da3ab12da) note: Partition 2's node is the exception node. After we trouble shooting. we find if one broker is very slow to startup it will cause the TopicBasedRemoteLogMetadataManager#initializeResources's fail sometime (it meet expectation due to the server is not ready as fast). Thus it won't stop the server so that the server still run just with some exception log but not shutdown. It won't upload to remote for the local so that the local segment never to deleted. So propose the change to shutdown the broker to avoid the silence critical error which caused the disk keep increasing forever. Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>	2025-07-28 17:47:09 +05:30
Lan Ding	abbb6b3c13	KAFKA-19471: Enable acknowledgement for a record which could not be deserialized (#20148 ) CI / build (push) Waiting to run Details This patch mainly includes two improvements: 1. Update currentFetch when `pollForFetches()` throws an exception. 2. Add an override `KafkaShareConsumer.acknowledge(String topic, int partition, long offset, AcknowledgeType type)` . Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-07-27 22:35:04 +01:00
Yeikel Santana	1a176beff1	MINOR: Reword warning message when internal deprecated properties are used (#20224 ) This is a small typo I noticed while working on Connect Original warning message > level=WARN connector_context=The worker has been configured with one or more internal converter properties ([internal.key.converter, schemas.enable, internal.value.converter, schemas.enable]). Support for these properties was deprecated in version 2.0 and removed in version 3.0, and specifying them will have no effect. Instead, an instance of the JsonConverter with schemas.enable set to false will be used. For more information, please visit https://kafka.apache.org/documentation/#upgrade and consult the upgrade notesfor the 3.0 release. class=org.apache.kafka.connect.runtime.WorkerConfig line=310 Reviewers: Ken Huang <s7133700@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-07-27 21:12:49 +01:00
Jinhe Zhang	73f195f062	MINOR: Re-add pageview demo to :streams:examples and remove dependency on :connect:json (#20239 ) CI / build (push) Has been cancelled Details With 4.0 release, we remove pageview demo because it depends on `:connect:json` which requires JDK 17. This PR removes the connect dependency and adds a customized serializer and deserializer, to make pageview demo works with JDK 11. Reviewers: Matthias J. Sax <matthias@confluent.io>	2025-07-25 11:06:12 -07:00
lucliu1108	46cb6cbcff	MINOR: Improve Kafka Streams Protocol Upgrade Doc (#20241 ) As a follow-up minor addition to PR for [KSTREAMS-7735](https://confluentinc.atlassian.net/browse/KSTREAMS-7735 ) (https://github.com/apache/kafka/pull/20029) , add the instructions for upgrading `streams.version` parameter for KIP-1071 EA. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-07-25 18:22:09 +01:00
majialong	a27d6e32b0	MINOR: Optimize RemoteLogManager#buildFilteredLeaderEpochMap (#20205 ) Optimize `RemoteLogManager#buildFilteredLeaderEpochMap` . Add a temporary unit test `testBuildFilteredLeaderEpochMapModify` in `RemoteLogManagerTest` to verify the output consistency of the method before and after optimization. Randomly generate leaderEpochs and iterate 100000 times for verification. ``` @Test public void testBuildFilteredLeaderEpochMapModify() { int testIterations = 100000; for (int i = 0; i < testIterations; i++) { TreeMap<Integer, Long> leaderEpochToStartOffset = generateRandomLeaderEpochAndStartOffset(); // before optimize NavigableMap<Integer, Long> optimizeBefore = RemoteLogManager.buildFilteredLeaderEpochMap(leaderEpochToStartOffset); // after optimize NavigableMap<Integer, Long> optimizeAfter = RemoteLogManager.buildFilteredLeaderEpochMap2(leaderEpochToStartOffset); assertEquals(optimizeBefore, optimizeAfter); } } private static TreeMap<Integer, Long> generateRandomLeaderEpochAndStartOffset() { TreeMap<Integer, Long> map = new TreeMap<>(); Random random = new Random(); int numEntries = random.nextInt(100000); long lastStartOffset = 0; for (int i = 0; i < numEntries; i++) { // generate a leader epoch int leaderEpoch = random.nextInt(100000); long startOffset; // generate a random start offset , or use the last start offset if (i > 0 && random.nextDouble() < 0.2) { startOffset = lastStartOffset; } else { startOffset = Math.abs(random.nextLong()) % 100000; } lastStartOffset = startOffset; map.put(leaderEpoch, startOffset); } return map; } ``` Command: ``` ./gradlew storage:test --tests RemoteLogManagerTest``` Result: All unit tests passed. <img width="1258" height="424" alt="image" src="https://github.com/user-attachments/assets/7d9fc3b5-3bbc-440f-b1cf-3a2a5f97557a" /> <img width="411" height="66" alt="image" src="https://github.com/user-attachments/assets/22a0b443-88e8-43d2-a3f2-51266935ed34" /> Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-07-24 23:16:27 +08:00
Kevin Wu	93447d5b88	KAFKA-19400: Update AddRaftVoterRequest RPC to version 1 (#19982 ) Add the ackWhenCommitted boolean field to the AddRaftVoter request, and bump the RPC's version to 1. The default value of the ackWhenCommitted field is true, and in this case the leader will return a response after committing the VotersRecord generated by the RPC. If ackWhenCommitted is false, the leader will return a response after generating the new VotersRecord and adding it to the batch accumulator. Reviewers: Alyssa Huang <ahuang@confluent.io>, José Armando García Sancio <jsancio@apache.org>	2025-07-24 10:38:31 -04:00

1 2 3 4 5 ...

16161 Commits All Branches Search

16161 Commits

All Branches