Commit Graph

495 Commits

Author SHA1 Message Date
Kevin Wu 012e4ca6d8
KAFKA-19719 --no-initial-controllers should not assume kraft.version=1 (#20604)
```
commit ec37eb538b (HEAD -> KAFKA-19719-cherry-pick-41, origin/KAFKA-19719-cherry-pick-41)
Author: Kevin Wu <kevin.wu2412@gmail.com>
Date:   Thu Sep 25 11:56:16 2025 -0500

    KAFKA-19719: --no-initial-controllers should not assume
    kraft.version=1 (#20551)

    Just because a controller node sets the --no-initial-controllers flag
    does not mean it is necessarily running kraft.version=1. The more
    precise meaning is that the controller node being formatted does not
    know what kraft version the cluster should be in, and therefore it is
    only safe to assume kraft.version=0. Only by setting --standalone,
    --initial-controllers, or --no-initial-controllers AND not specifying
    the controller.quorum.voters static config is it known that
    kraft.version > 0.

    For example, it is a valid configuration (although confusing) to run
    a static quorum defined by controller.quorum.voters but have all the
    controllers format with --no-initial-controllers. In this case,
    specifying --no-initial-controllers alongside a metadata version that
    does not support kraft.version=1 causes formatting to fail, which is
    a regression.

    Additionally, the formatter should not check the kraft.version
    against the release version, since kraft.version does not actually
    depend on any release version. It should only check the kraft.version
    against the static voters config/format arguments.

    This PR also cleans up the integration test framework to match the
    semantics of formatting an actual cluster.

    Reviewers: TengYao Chi <kitingiao@gmail.com>, Kuan-Po Tseng
    <brandboat@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>,
    José Armando García Sancio <jsancio@apache.org>

    Conflicts:
        core/src/main/scala/kafka/tools/StorageTool.scala
            Minor conflicts. Keep changes from cherry-pick.
        core/src/test/java/kafka/server/ReconfigurableQuorumIntegrationTest.java
            Remove auto-join tests, since 4.1 does not support it.
        docs/ops.html
            Keep docs section from cherry-pick.
        metadata/src/test/java/org/apache/kafka/metadata/storage/FormatterTest.java
            Minor conflicts. Keep cherry-picked changes.
        test-common/test-common-runtime/src/main/java/org/apache/kafka/common/test/KafkaClusterTestKit.java
            Conflicts due to integration test framework changes. Keep new changes.

commit 02d58b176c (upstream/4.1)
```
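
A hedged sketch of the version-inference rule spelled out above, with hypothetical names rather than the actual StorageTool API: the formatter may only assume kraft.version > 0 when the voter set is conveyed dynamically by a format flag and no static controller.quorum.voters config is present.

```
// Illustrative sketch only (hypothetical names, not the real
// StorageTool code): the version-inference rule from the message above.
final class KraftVersionInference {
    static short safeAssumedKraftVersion(boolean standalone,
                                         boolean initialControllers,
                                         boolean noInitialControllers,
                                         boolean staticVotersConfigured) {
        // Only a dynamically conveyed quorum implies kraft.version > 0;
        // otherwise it is only safe to assume kraft.version=0.
        boolean dynamicQuorum =
            (standalone || initialControllers || noInitialControllers)
                && !staticVotersConfigured;
        return dynamicQuorum ? (short) 1 : (short) 0;
    }
}
```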

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-09-30 22:30:23 +08:00
Luke Chen cdc7a4e2b7 MINOR: improve the min.insync.replicas doc (#20237)
Along with the change https://github.com/apache/kafka/pull/17952
([KIP-966](https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas)),
the semantics of the `min.insync.replicas` config changed slightly and
gained some constraints. We should document them clearly.
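
For reference, a hedged sketch of adjusting `min.insync.replicas` at the topic level through the Admin API (the bootstrap address and topic name are placeholders):

```
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;

import java.util.List;
import java.util.Map;

public class MinIsrExample {
    public static void main(String[] args) throws Exception {
        // "localhost:9092" and "orders" are placeholders.
        try (Admin admin = Admin.create(Map.<String, Object>of(
                AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            ConfigResource topic =
                new ConfigResource(ConfigResource.Type.TOPIC, "orders");
            // Set min.insync.replicas=2 on this one topic.
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry(TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, "2"),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(op))).all().get();
        }
    }
}
```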

Reviewers: Jun Rao <junrao@gmail.com>, Calvin Liu <caliu@confluent.io>,
 Mickael Maison <mickael.maison@gmail.com>, Paolo Patierno
 <ppatierno@live.com>, Federico Valeri <fedevaleri@gmail.com>, Chia-Ping
 Tsai <chia7712@gmail.com>
2025-08-05 00:27:04 +08:00
Calvin Liu e4e2dce2eb KAFKA-19522: avoid electing fenced lastKnownLeader (#20200)
This patch fixes the bug that allows the last known leader to be elected as a partition leader while still in a fenced state, before the next heartbeat removes the fence.
https://issues.apache.org/jira/browse/KAFKA-19522

Reviewers: Jun Rao <junrao@gmail.com>, TengYao Chi
<frankvicky@apache.org>
2025-07-20 16:55:45 +08:00
Calvin Liu 98cb8df7a5
MINOR: Bump LATEST_PRODUCTION to 4.1IV1 and Use MV to enable ELR (#20174)
Remove isEligibleLeaderReplicasV1Enabled so that ELR is enabled whenever
the MV is at least 4.1IV1. Also bump the latest production MV to 4.1IV1.

Reviewers: Jun Rao <junrao@gmail.com>
2025-07-15 20:23:53 -07:00
Calvin Liu b80aa15c17
KAFKA-19383: Handle the deleted topics when applying ClearElrRecord (#20033)
https://issues.apache.org/jira/browse/KAFKA-19383

When applying a ClearElrRecord, the controller may pick up the topicId
from the image without checking whether the topic has been deleted. This
can cause the creation of a new TopicRecord with an old topic ID.

Reviewers: Alyssa Huang <ahuang@confluent.io>, Artem Livshits <alivshits@confluent.io>, Colin P. McCabe <cmccabe@apache.org>

No conflicts.
2025-06-24 17:04:45 -07:00
Alyssa Huang 3f0ae7fd53 KAFKA-19411: Fix deleteAcls bug which allows more deletions than max records per user op (#19974)
If more deletion filters remain after we initially hit the
`MAX_RECORDS_PER_USER_OP` bound, we will add an additional deletion
record on top of that for each additional filter.

The current error message returned to the client is not useful either,
so this adds logic to ensure the client doesn't just get
`UNKNOWN_SERVER_EXCEPTION` with no details returned.
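
A self-contained sketch of the kind of bound check described above (all names and the cap value are hypothetical; the real fix lives in Kafka's ACL handling code):

```
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: check the cap *before* appending records for the
// next filter, and fail with a descriptive error instead of letting the
// client see a bare UNKNOWN_SERVER_EXCEPTION.
final class DeleteAclsBoundCheck {
    static final int MAX_RECORDS_PER_USER_OP = 10_000; // assumed value

    static <R> List<R> collectDeletionRecords(List<List<R>> recordsPerFilter) {
        List<R> records = new ArrayList<>();
        for (List<R> forFilter : recordsPerFilter) {
            if (records.size() + forFilter.size() > MAX_RECORDS_PER_USER_OP) {
                throw new IllegalArgumentException(
                    "Cannot delete more than " + MAX_RECORDS_PER_USER_OP +
                    " ACL records in a single user operation.");
            }
            records.addAll(forFilter);
        }
        return records;
    }
}
```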
2025-06-24 15:51:30 -07:00
José Armando García Sancio 88eced0c0f KAFKA-14145; Faster KRaft HWM replication (#19800)
This change compares the remote replica's HWM with the leader's HWM and
completes the FETCH request if the remote HWM is less than the leader's
HWM. When the leader's HWM is updated, any pending FETCH RPC is
completed.

Reviewers: Alyssa Huang <ahuang@confluent.io>, David Arthur
 <mumrah@gmail.com>, Andrew Schofield <aschofield@confluent.io>
(cherry picked from commit 742b327025)
2025-06-17 13:27:11 -04:00
Hong-Yi Chen aaed164be6
MINOR: Refactor brokerContactTimesMs and brokerRegistrationStates to use Long and Integer (#19888)
This PR simplifies two ConcurrentHashMap fields by removing their Atomic
wrappers:

- Change `brokerContactTimesMs` from `ConcurrentHashMap<Integer,
AtomicLong>` to `ConcurrentHashMap<Integer, Long>`.

- Change `brokerRegistrationStates` from `ConcurrentHashMap<Integer,
AtomicInteger>` to `ConcurrentHashMap<Integer, Integer>`.

This removes mutable holders without affecting thread safety (see
discussion in #19828).
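
A minimal sketch of the resulting pattern, assuming hypothetical surrounding code: atomicity comes from ConcurrentHashMap's per-key operations, so plain Long values suffice.

```
import java.util.concurrent.ConcurrentHashMap;

// Sketch: plain Long values instead of AtomicLong holders; the map's
// per-key operations keep individual updates atomic.
class BrokerContactTracker {
    private final ConcurrentHashMap<Integer, Long> brokerContactTimesMs =
        new ConcurrentHashMap<>();

    void updateBrokerContactTime(int brokerId, long nowMs) {
        // put() replaces any existing value atomically for this key.
        brokerContactTimesMs.put(brokerId, nowMs);
    }

    Long lastContactTimeMs(int brokerId) {
        return brokerContactTimesMs.get(brokerId);
    }
}
```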

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Kevin Wu <kevin.wu2412@gmail.com>, Ken Huang
<s7133700@gmail.com>
2025-06-06 16:03:23 +08:00
David Arthur 70b672b808
KAFKA-19347 Deduplicate ACLs when creating (#19898)
In #19840, we broke de-duplication during ACL creation. This patch fixes
that and adds a test to cover this case.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2025-06-04 12:38:54 -04:00
Hong-Yi Chen 8b49130b92
KAFKA-19355 Remove interBrokerListenerName from ClusterControlManager (#19866)
Following the removal of the ZK-to-KRaft migration code in commit
85bfdf4, controller-to-broker communication is now handled by the
control-plane listener (`controller.listener.names`). The
`interBrokerListenerName` parameter in `ClusterControlManager` is no
longer referenced on the controller side and can be safely removed as
dead code.

Reviewers: Lan Ding <isDing_L@163.com>, Ken Huang <s7133700@gmail.com>,
Chia-Ping Tsai <chia7712@gmail.com>
2025-06-02 01:18:15 +08:00
Kevin Wu 8731c96122
MINOR: fixing updateBrokerContactTime (#19828)
Fix `updateBrokerContactTime` so that existing brokers still have their
contact time updated when they are already tracked. Also, update the
unit test to test this case.

Reviewers: Kuan-Po Tseng <brandboat@gmail.com>, Yung
 <yungyung7654321@gmail.com>, TengYao Chi <frankvicky@apache.org>, Ken
 Huang <s7133700@gmail.com>
2025-05-29 11:58:09 +08:00
David Arthur 9dd4cff2d7
KAFKA-19347 Don't update timeline data structures in createAcls (#19840)
This patch fixes a problem in AclControlManager where we are updating
the timeline data structures prematurely.

Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>, Andrew Schofield <aschofield@confluent.io>
2025-05-28 09:40:19 -07:00
Ken Huang bcda92b5b9
KAFKA-19080 The constraint on segment.ms is not enforced at topic level (#19371)
The main issue was that we forgot to set
`TopicConfig.SEGMENT_BYTES_CONFIG` to at least `1024 * 1024`, which
caused problems in tests with small segment sizes.

To address this, we introduced a new internal config:
`LogConfig.INTERNAL_SEGMENT_BYTES_CONFIG`, allowing us to set smaller
segment bytes specifically for testing purposes.

We also updated the logic so that if a user configures the topic-level
segment bytes without explicitly setting the internal config, the
internal value will no longer be returned to the user.

In addition, we removed
`MetadataLogConfig#METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG` and added
three new internal configurations:
- `INTERNAL_MAX_BATCH_SIZE_IN_BYTES_CONFIG`
- `INTERNAL_MAX_FETCH_SIZE_IN_BYTES_CONFIG`
- `INTERNAL_DELETE_DELAY_MILLIS_CONFIG`

Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-05-25 20:57:22 +08:00
jimmy b44bfca408
KAFKA-16717 [2/N]: Add AdminClient.alterShareGroupOffsets (#18929)
[KAFKA-16720](https://issues.apache.org/jira/browse/KAFKA-16720) aims to
finish the AlterShareGroupOffsets RPC.

Reviewers: Andrew Schofield <aschofield@confluent.io>

---------

Co-authored-by: jimmy <wangzhiwang@qq.com>
2025-05-23 09:05:48 +01:00
Jing-Jia Hung bc797b077f
KAFKA-19314 Remove unnecessary code of closing snapshotWriter (#19763)
- Remove redundant close of `snapshotWriter`.
- `snapshotWriter` is already closed by `RaftSnapshotWriter#writer`.

Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang
 <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-23 03:57:14 +08:00
Kevin Wu 37963256d1
KAFKA-18666: Controller-side monitoring for broker shutdown and startup (#19586)
This PR introduces the following per-broker metrics:

- `kafka.controller:type=KafkaController,name=BrokerRegistrationState,broker=X`

- `kafka.controller:type=KafkaController,name=TimeSinceLastHeartbeatReceivedMs,broker=X`

and this metric:

`kafka.controller:type=KafkaController,name=ControlledShutdownBrokerCount`

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-05-14 10:59:47 -07:00
Bolin Lin 6eafe407bd
MINOR: Fix unchecked type warnings in several test classes (#19679)
* In ConsoleShareConsumerTest, add `@SuppressWarnings("unchecked")`
annotation in method shouldUpgradeDeliveryCount
* In ListConsumerGroupOffsetsHandlerTest, add generic parameters to
HashSet constructors
* In TopicsImageTest, add explicit generic type to Collections.EMPTY_MAP
to fix raw type usage

Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-13 14:59:22 +08:00
Nick Guo 707a44a6cb
KAFKA-19068 Eliminate the duplicate type check in creating ControlRecord (#19346)
JIRA: https://issues.apache.org/jira/browse/KAFKA-19068

`RecordsIterator#decodeControlRecord` does the type check, and then the
`ControlRecord` constructor does it again.

We should add a static method to `ControlRecord` that performs the type
check when creating a `ControlRecord`, and change the `ControlRecord`
constructor to private to ensure all instances are created by the static
method.
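
An illustrative sketch of the static-factory pattern being described (simplified types, not the actual Kafka classes):

```
// Illustrative only: validate once in a static factory and keep the
// constructor private so no caller can bypass the check.
final class ControlRecordExample {
    private final String type;
    private final Object message;

    private ControlRecordExample(String type, Object message) {
        this.type = type;
        this.message = message;
    }

    static ControlRecordExample of(String type, Object message) {
        // Single, centralized type check.
        if (!"LEADER_CHANGE".equals(type) && !"SNAPSHOT_HEADER".equals(type)) {
            throw new IllegalArgumentException(
                "Unknown control record type: " + type);
        }
        return new ControlRecordExample(type, message);
    }
}
```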

Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-05-11 00:07:00 +08:00
Calvin Liu 3094ce2c20
KAFKA-19212: Correct the unclean leader election metric calculation (#19590)
The current ElectionWasClean check tests whether the new leader is in
the previous ISR. However, there is a corner case in partition
reassignment. A partition reassignment can change the partition
replicas. If the new preferred leader (the first one in the new
replicas) is the last one to join the ISR, this preferred leader will be
elected in the same partition change.

For example, in the previous state the partition is Leader: 0, Replicas
(2,1,0), ISR (1,0), Adding (2), Removing (0). Then replica 2 joins the
ISR, and the new partition becomes Leader: 2, Replicas (2,1), ISR (1,2).
The new leader 2 is not in the previous ISR (1,0), but it is still a
clean election.

Reviewers: Jun Rao <junrao@gmail.com>
2025-05-07 13:26:53 -07:00
Kevin Wu 6cb6aa2030
MINOR; Add `--standalone --ignore-formatted` formatter test (#19643)
This PR adds an additional test case to `FormatterTest` that checks that
formatting with `--standalone` and then formatting again with
`--standalone --ignore-formatted` is indeed a no-op.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2025-05-07 10:41:18 -04:00
José Armando García Sancio 2df14b1190
MINOR; Log message for unexpected buffer allocation (#19596)
Log a message when reading a batch that is larger than the currently
allocated buffer.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>, PoAn Yang
 <payang@apache.org>
2025-05-06 12:01:49 -04:00
yunchi bff5ba4ad9
MINOR: replace .stream().forEach() with .forEach() (#19626)
Replace all applicable `.stream().forEach()` calls in the codebase with
just `.forEach()`.
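
The change in a nutshell, as a toy example:

```
import java.util.List;

class ForEachExample {
    static void print(List<String> names) {
        // Before: an unnecessary stream pipeline.
        names.stream().forEach(System.out::println);
        // After: Iterable.forEach does the same thing directly.
        names.forEach(System.out::println);
    }
}
```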

Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-04 20:39:55 +08:00
YuChia Ma 979f49f967
KAFKA-19146 Merge OffsetAndEpoch from raft to server-common (#19475)
1. Remove org.apache.kafka.raft.OffsetAndEpoch.
2. Rewrite org.apache.kafka.server.common.OffsetAndEpoch using the
record keyword.
3. Rename OffsetAndEpoch#leaderEpoch to OffsetAndEpoch#epoch.
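
A sketch of what the record-based rewrite plausibly looks like, given the rename in item 3:

```
// Sketch: a record provides equals/hashCode/toString and accessors for
// free; the epoch() accessor replaces the old leaderEpoch name.
public record OffsetAndEpoch(long offset, int epoch) { }
```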

Reviewers: PoAn Yang <payang@apache.org>, Xuan-Zhang Gong
 <gongxuanzhangmelt@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-05-02 03:12:52 +08:00
Andrew Schofield 2022b4c480
KAFKA-16894 Correct definition of ShareVersion (#19606)
The ShareVersion feature does not make any metadata version changes. As
a result, `SV_1` does not depend on any MV level, and no MV needs to be
defined for the preview of KIP-932.

Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-05-02 01:30:00 +08:00
Matthias J. Sax b0a26bc2f4
KAFKA-19173: Add `Feature` for "streams" group (#19509)
Add the new StreamsGroupFeature, disabled by default, and add "streams"
as a default value to `group.coordinator.rebalance.protocols`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot
<david.jacot@gmail.com>, Lucas Brutschy <lbrutschy@confluent.io>,
Justine Olshan <jolshan@confluent.io>, Andrew Schofield
<aschofield@confluent.io>, Jun Rao <jun@confluent.io>
2025-04-29 22:51:10 -07:00
PoAn Yang 81881dee83
KAFKA-18760: Deprecate Optional<String> and return String from public Endpoint#listener (#19191)
* Deprecate org.apache.kafka.common.Endpoint#listenerName.
* Add org.apache.kafka.common.Endpoint#listener to replace
org.apache.kafka.common.Endpoint#listenerName.
* Replace org.apache.kafka.network.EndPoint with
org.apache.kafka.common.Endpoint.
* Deprecate org.apache.kafka.clients.admin.RaftVoterEndpoint#name
* Add org.apache.kafka.clients.admin.RaftVoterEndpoint#listener to
replace org.apache.kafka.clients.admin.RaftVoterEndpoint#name

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TaiJuWu
 <tjwu1217@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TengYao
 Chi <frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>, Bagda
 Parth, Kuan-Po Tseng <brandboat@gmail.com>

---------

Signed-off-by: PoAn Yang <payang@apache.org>
2025-04-30 12:15:33 +08:00
Ken Huang 676e0f2ad6
KAFKA-19139 Plugin#wrapInstance should use LinkedHashMap instead of Map (#19519)
There will be an update to the PluginMetrics#metricName method: the type
of the tags parameter will be changed from Map to LinkedHashMap. This
change is necessary because the order of metric tags is important:
1. If the tag order is inconsistent, identical metrics may be treated as
distinct ones by the metrics backend.
2. KAFKA-18390 is updating metric naming to use LinkedHashMap. For
consistency, we should follow the same approach here.
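
A small illustration of why the parameter type matters (tag names are made up):

```
import java.util.LinkedHashMap;

class TagOrderExample {
    static LinkedHashMap<String, String> tags() {
        // LinkedHashMap preserves insertion order, so "topic" always
        // precedes "partition" in the resulting metric name; a plain
        // HashMap offers no such guarantee.
        LinkedHashMap<String, String> tags = new LinkedHashMap<>();
        tags.put("topic", "orders");
        tags.put("partition", "0");
        return tags;
    }
}
```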

Reviewers: TengYao Chi <frankvicky@apache.org>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>, lllilllilllilili
2025-04-30 10:43:01 +08:00
Colin Patrick McCabe 22b89b6413
KAFKA-19192; Old bootstrap checkpoint files cause problems for updated servers (#19545)
Old bootstrap.metadata files cause problems with servers that include
KAFKA-18601. When the server tries to read the bootstrap.checkpoint
file, it will fail if the metadata.version is older than 3.3-IV3
(feature level 7). This causes problems when these clusters are
upgraded.

This PR makes it possible to represent older MVs in BootstrapMetadata
objects without causing an exception. An exception is thrown only if we
attempt to access the BootstrapMetadata. This ensures that only the code
path in which we start with an empty metadata log checks that the
metadata version is 7 or newer.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Ismael Juma
 <ismael@juma.me.uk>, PoAn Yang <payang@apache.org>, Liu Zeyu
 <zeyu.luke@gmail.com>, Alyssa Huang <ahuang@confluent.io>
2025-04-24 15:43:35 -04:00
José Armando García Sancio b97a130c08
KAFKA-16538; Enable upgrading kraft version for existing clusters (#19416)
This change implements upgrading the kraft version from 0 to 1 in existing clusters.
Previously, clusters were formatted with either version 0 or version 1, and could not
be moved between them.

The kraft version for the cluster metadata partition is recorded using the
KRaftVersion control record. If there is no KRaftVersion control record
the default kraft version is 0.

The kraft version is upgraded using the UpdateFeatures RPC. These RPCs
are handled by the QuorumController and FeatureControlManager. This
change adds special handling in the FeatureControlManager so that
upgrades to the kraft.version are directed to
RaftClient#upgradeKRaftVersion.

To allow the FeatureControlManager to call
RaftClient#upgradeKRaftVersion in a non-blocking fashion, the kraft
version upgrade uses optimistic locking. The call to
RaftClient#upgradeKRaftVersion validates the version change. If the
validations succeed, it generates the necessary control records and
adds them to the BatchAccumulator.

Before the kraft version can be upgraded to version 1, all of the
brokers and controllers in the cluster need to support kraft version 1.
The check that all brokers support kraft version 1 is done by the
FeatureControlManager. The check that all of the controllers support
kraft version 1 is done by KafkaRaftClient and LeaderState.

When the kraft version is 0, the kraft leader starts by assuming that
none of the voters support kraft version 1. The leader discovers which
voters support kraft version 1 through the UpdateRaftVoter RPC. The
KRaft leader handles UpdateRaftVoter RPCs by storing the updated
information in-memory until the kraft version is upgraded to version 1.
This state is stored in LeaderState and contains the latest directory
id, endpoints and supported kraft version for each voter.

Only when the KRaft leader has received an UpdateRaftVoter RPC from all
of the voters will it allow the upgrade from kraft.version 0 to 1.
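
Since the upgrade goes through the UpdateFeatures RPC, it can be driven from the Admin client; a hedged sketch (connection details are placeholders):

```
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.FeatureUpdate;
import org.apache.kafka.clients.admin.UpdateFeaturesOptions;

import java.util.Map;
import java.util.Properties;

public class KRaftVersionUpgradeExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // Upgrade kraft.version from 0 to 1; the controller only
            // allows this once every broker and controller supports
            // version 1, as described above.
            FeatureUpdate update = new FeatureUpdate(
                (short) 1, FeatureUpdate.UpgradeType.UPGRADE);
            admin.updateFeatures(Map.of("kraft.version", update),
                new UpdateFeaturesOptions()).all().get();
        }
    }
}
```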

Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2025-04-22 16:02:51 -07:00
Colin Patrick McCabe c465abc458
KAFKA-19130: Do not add fenced brokers to BrokerRegistrationTracker on startup (#19454)
When the controller starts up (or becomes active after being inactive), we add all of the registered brokers to BrokerRegistrationTracker so that they will not be accidentally fenced the next time we are looking for a broker to fence. We do this because the state in BrokerRegistrationTracker is "soft state" (it doesn't appear in the metadata log), and the newly active controller starts off with no soft state. (Its soft state will be populated by the brokers sending heartbeat requests to it over time.)

In the case of fenced brokers, we are not worried about accidentally fencing the broker due to it being missing from
BrokerRegistrationTracker for a while (it's already fenced). Therefore, it should be reasonable to just not add fenced brokers to the tracker initially.

One case where this change will have a positive impact is for people running single-node demonstration clusters in combined KRaft mode. In that case, when the single-node cluster is taken down and restarted, it currently will have to wait about 9 seconds for the broker to come up and re-register. With this change, the broker should be able to re-register immediately (assuming the previous shutdown happened cleanly through controller shutdown.)

One possible negative impact is that if there is a controller failover, it will open a small window where a broker with the same ID as a fenced broker could re-register. However, our detection of duplicate broker IDs is best-effort (and duplicate broker IDs are an administrative mistake), so this downside seems acceptable.

Reviewers: Alyssa Huang <ahuang@confluent.io>, José Armando García Sancio <jsancio@apache.org>
2025-04-16 11:57:35 -07:00
Mickael Maison fb2ce76b49
KAFKA-18888: Add KIP-877 support to Authorizer (#19050)
This also adds metrics to StandardAuthorizer.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ken Huang
 <s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TaiJuWu
 <tjwu1217@gmail.com>
2025-04-15 19:40:24 +02:00
PoAn Yang 42771b6144
KAFKA-18845 Remove flaky tag on QuorumControllerTest#testUncleanShutdownBrokerElrEnabled (#19403)
It has been around two weeks since the PR fixing
QuorumControllerTest#testUncleanShutdownBrokerElrEnabled
(https://github.com/apache/kafka/pull/19240) was merged. There have been
no flaky results since 2025/03/21, which is enough evidence that the
flakiness is fixed, so it's safe to remove the flaky tag.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-15 23:53:08 +08:00
Xuan-Zhang Gong 17ab374bb5
KAFKA-15371 MetadataShell is stuck when bootstrapping (#19419)
Issue link: https://issues.apache.org/jira/browse/KAFKA-15371

## Conclusion

This issue isn’t caused by differences between the `log` file and the
`checkpoint` file, but rather by the order in which asynchronous events
occur.

## Reliably reproduce
In the current version, you can reliably reproduce this issue by adding
a small sleep in `SnapshotFileReader#handleNextBatch`, like this:
```
private void handleNextBatch() {
    if (!batchIterator.hasNext()) {
        try {
            // The added sleep that reliably reproduces the race.
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        beginShutdown("done");
        return;
    }
    FileChannelRecordBatch batch = batchIterator.next();
    if (batch.isControlBatch()) {
        handleControlBatch(batch);
    } else {
        handleMetadataBatch(batch);
    }
    scheduleHandleNextBatch();
    lastOffset = batch.lastOffset();
}
```

You can download a test file: [test checkpoint file](https://github.com/user-attachments/files/19659636/00000000000000007169-0000000001.checkpoint.log)

⚠️: Please remove the .log extension after downloading, since GitHub
doesn't allow uploading checkpoint files directly.

After changing the code and running a Gradle build, you can run
`bin/kafka-metadata-shell.sh --snapshot ${your file path}`.

You will only see a loading message in the console like this:

![image](https://github.com/user-attachments/assets/fe4b4eba-7a6a-4cee-9b56-c82a5fa02c89)

## Cause of the Bug
After the `SnapshotFileReader` starts up, it enqueues the iterator's
events onto its own kafkaQueue. The important method is
`SnapshotFileReader#scheduleHandleNextBatch`.

When processing each batch of the iterator, it adds metadata events for
the batch to the MetadataLoader's kafkaQueue (which is different from
the SnapshotFileReader's). The important methods are
`SnapshotFileReader#handleMetadataBatch` and
`MetadataLoader#handleCommit`.

When the MetadataLoader processes a MetadataDelta, it checks whether the
high watermark has been updated. If not, it skips processing. The
important methods are `MetadataLoader#maybePublishMetadata` and
`maybePublishMetadata#stillNeedToCatchUp`.

The crucial high watermark update happens after the SnapshotFileReader's
iterator finishes reading, using the cleanup task of its kafkaQueue.

So, if the MetadataLoader finishes processing all batches before the
high watermark is updated, the main thread will keep waiting.

![image](https://github.com/user-attachments/assets/03daa288-ff39-49a3-bbc7-e7b5831a858b)

![image](https://github.com/user-attachments/assets/fc0770dd-de54-4f69-b669-ab4e696bd2a7)

## Solution
If we've reached the last batch in the iteration, we update the high
watermark first, before adding events to the MetadataLoader, ensuring
that the MetadataLoader runs at least once after the watermark is
updated.

After modifying the code, you'll see the normal shell execution
behavior.

![image](https://github.com/user-attachments/assets/2791d03c-81ae-4762-a015-4d6d9e526455)

Reviewers: PoAn Yang <payang@apache.org>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-04-14 14:13:50 +08:00
Andrew Schofield 21a080f08c
KAFKA-16894: Define feature to enable share groups (#19293)
This PR proposes a switch to enable share groups for 4.1 (preview) and
4.2 (GA).

* `share.version=1` to indicate that share groups are enabled. This is
used as the switch for turning share groups on and off.

In 4.1, the default will be `share.version=0`. A user wanting to
evaluate the preview of KIP-932 would then use `bin/kafka-features.sh
--bootstrap-server xxxx upgrade --feature share.version=1`.

In 4.2, the default will be `share.version=1`.

Reviewers: Jun Rao <junrao@gmail.com>
2025-04-11 12:14:38 +01:00
Alyssa Huang 6e446f0b05
KAFKA-19047: Allow quickly re-registering brokers that are in controlled shutdown (#19296)
Allow re-registration of brokers with active sessions if the previous broker registration was in controlled shutdown.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@apache.org>, David Mao <dmao@confluent.io>
2025-04-08 13:39:04 -07:00
PoAn Yang dcf6f9d4c9
MINOR: remove unused function BrokerRegistration#isMigratingZkBroker (#19330)
`BrokerRegistration#isMigratingZkBroker` is not used by any production
code. Remove it.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-04-09 01:56:46 +08:00
Dmitry Werner 4144290335
MINOR: Cleanup metadata module (#18937)
Removed unused code and fixed IDEA warnings.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-31 17:46:21 +08:00
José Armando García Sancio 82de719fff
MINOR; Improve error message for the storage format command (#19210)
Minor improvement to the error output of the storage format command: it suggests the changes needed for a valid storage format command.

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-03-23 19:30:57 -04:00
David Arthur 8fa3856473
MINOR Mar 19 flaky tests (#19248)
CoordinatorRequestManagerTest#testMarkCoordinatorUnknownLoggingAccuracy
has become flaky again. The last-30-days report shows a sudden
re-occurrence:


https://develocity.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.tags=github,trunk,not:flaky,not:new&search.tasks=test&search.timeZoneId=America%2FNew_York&tests.container=org.apache.kafka.clients.consumer.internals.CoordinatorRequestManagerTest&tests.sortField=FLAKY#

Also mark QuorumControllerTest.testMinIsrUpdateWithElr as flaky.

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>
2025-03-21 09:26:13 -04:00
Calvin Liu 1c582a4a35
KAFKA-18954: Add ELR election rate metric (#19180)
Add a metric to track the number of elections done using ELR.
https://issues.apache.org/jira/browse/KAFKA-18954

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Justine Olshan
<jolshan@confluent.io>
2025-03-20 15:37:49 -07:00
PoAn Yang 71875ec58e
KAFKA-18845: Fix flaky QuorumControllerTest#testUncleanShutdownBrokerElrEnabled (#19240)
There are two root causes:
1. When we uncleanly shut down `brokerToBeTheLeader`, we didn't wait for
the result. That means that when we send a heartbeat to unfence the
broker, it may use a stale broker epoch in the request. [0]
2. We used a different replica directory when doing the unclean shutdown
of the broker. Even if the broker is unfenced, it cannot get an online
directory, so `brokerToBeTheLeader` cannot be elected as the new
leader. [1]


[0]
a5325e029e/metadata/src/test/java/org/apache/kafka/controller/QuorumControllerTest.java (L484-L497)

[1]
a5325e029e/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java (L2470-L2477)

Reviewers: David Arthur <mumrah@gmail.com>
2025-03-20 13:37:09 -04:00
PoAn Yang da46cf6e79
KAFKA-17565 Move MetadataCache interface to metadata module (#18801)
### Changes

* Move the MetadataCache interface to the metadata module and convert
the Scala functions to Java.
* Remove the functions `getTopicPartitions`, `getAliveBrokers`,
`topicNamesToIds`, `topicIdInfo`, and `getClusterMetadata` from the
MetadataCache interface, because these functions are only used in test
code.

### Performance

* ReplicaFetcherThreadBenchmark
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark
  ```
  * trunk
  ```
Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 2 4775.490 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 2 25730.790 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 2 55334.206 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 2 488427.547 ns/op
  ```
  * branch
  ```
Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 2 4825.219 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 2 25985.662 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 2 56056.005 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 2 497138.573 ns/op
  ```

* KRaftMetadataRequestBenchmark
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.metadata.KRaftMetadataRequestBenchmark
  ```
  * trunk
  ```
Benchmark (partitionCount) (topicCount) Mode Cnt Score Error Units
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 500 avgt 2 884933.558 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 1000 avgt 2 1910054.621 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 5000 avgt 2 21778869.337 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 500 avgt 2 1537550.670 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 1000 avgt 2 3168237.805 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 5000 avgt 2 29699652.466 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 500 avgt 2 3501483.852 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 1000 avgt 2 7405481.182 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 5000 avgt 2 55839670.124 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 500 avgt 2 333.667 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 1000 avgt 2 339.685 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 5000 avgt 2 334.293 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 500 avgt 2 329.899 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 1000 avgt 2 347.537 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 5000 avgt 2 332.781 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 500 avgt 2 327.085 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 1000 avgt 2 325.206 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 5000 avgt 2 316.758 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 500 avgt 2 7.569 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 1000 avgt 2 7.565 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 5000 avgt 2 7.574 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 500 avgt 2 7.568 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 1000 avgt 2 7.557 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 5000 avgt 2 7.585 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 500 avgt 2 7.560 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 1000 avgt 2 7.554 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 5000 avgt 2 7.574 ns/op
  ```
  * branch
  ```
Benchmark (partitionCount) (topicCount) Mode Cnt Score Error Units
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 500 avgt 2 910337.770 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 1000 avgt 2 1902351.360 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 10 5000 avgt 2 22215893.338 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 500 avgt 2 1572683.875 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 1000 avgt 2 3188560.081 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 20 5000 avgt 2 29984751.632 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 500 avgt 2 3413567.549 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 1000 avgt 2 7303174.254 ns/op
KRaftMetadataRequestBenchmark.testMetadataRequestForAllTopics 50 5000 avgt 2 54293721.640 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 500 avgt 2 318.335 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 1000 avgt 2 331.386 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 10 5000 avgt 2 332.944 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 500 avgt 2 340.322 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 1000 avgt 2 330.294 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 20 5000 avgt 2 342.154 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 500 avgt 2 341.053 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 1000 avgt 2 335.458 ns/op
KRaftMetadataRequestBenchmark.testRequestToJson 50 5000 avgt 2 322.050 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 500 avgt 2 7.538 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 1000 avgt 2 7.548 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 10 5000 avgt 2 7.545 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 500 avgt 2 7.597 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 1000 avgt 2 7.567 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 20 5000 avgt 2 7.558 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 500 avgt 2 7.559 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 1000 avgt 2 7.615 ns/op
KRaftMetadataRequestBenchmark.testTopicIdInfo 50 5000 avgt 2 7.562 ns/op
  ```

* PartitionMakeFollowerBenchmark
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.partition.PartitionMakeFollowerBenchmark
  ```
  * trunk
  ```
Benchmark Mode Cnt Score Error Units
PartitionMakeFollowerBenchmark.testMakeFollower avgt 2 158.816 ns/op
  ```
  * branch
  ```
Benchmark Mode Cnt Score Error Units
PartitionMakeFollowerBenchmark.testMakeFollower avgt 2 160.533 ns/op
  ```

* UpdateFollowerFetchStateBenchmark
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.partition.UpdateFollowerFetchStateBenchmark
  ```
  * trunk
  ```
Benchmark Mode Cnt Score Error Units
UpdateFollowerFetchStateBenchmark.updateFollowerFetchStateBench avgt 2 4975.261 ns/op
UpdateFollowerFetchStateBenchmark.updateFollowerFetchStateBenchNoChange avgt 2 4880.880 ns/op
  ```
  * branch
  ```
Benchmark Mode Cnt Score Error Units
UpdateFollowerFetchStateBenchmark.updateFollowerFetchStateBench avgt 2 5020.722 ns/op
UpdateFollowerFetchStateBenchmark.updateFollowerFetchStateBenchNoChange avgt 2 4878.855 ns/op
  ```


* CheckpointBench
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.server.CheckpointBench
  ```
  * trunk
  ```
Benchmark (numPartitions) (numTopics) Mode Cnt Score Error Units
CheckpointBench.measureCheckpointHighWatermarks 3 100 thrpt 2 0.997 ops/ms
CheckpointBench.measureCheckpointHighWatermarks 3 1000 thrpt 2 0.703 ops/ms
CheckpointBench.measureCheckpointHighWatermarks 3 2000 thrpt 2 0.486 ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 100 thrpt 2 1.038 ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 1000 thrpt 2 0.734 ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 2000 thrpt 2 0.637 ops/ms
  ```
  * branch
  ```
Benchmark (numPartitions) (numTopics) Mode Cnt Score Error Units
CheckpointBench.measureCheckpointHighWatermarks 3 100 thrpt 2 0.990 ops/ms
CheckpointBench.measureCheckpointHighWatermarks 3 1000 thrpt 2 0.659 ops/ms
CheckpointBench.measureCheckpointHighWatermarks 3 2000 thrpt 2 0.508 ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 100 thrpt 2 0.923 ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 1000 thrpt 2 0.736 ops/ms
CheckpointBench.measureCheckpointLogStartOffsets 3 2000 thrpt 2 0.637 ops/ms
  ```

* PartitionCreationBench
  ```
./jmh-benchmarks/jmh.sh -f 1 -i 2 -wi 2
org.apache.kafka.jmh.server.PartitionCreationBench
  ```
  * trunk
  ```
Benchmark (numPartitions) (useTopicIds) Mode Cnt Score Error Units
PartitionCreationBench.makeFollower 20 false avgt 2 5.997 ms/op
PartitionCreationBench.makeFollower 20 true avgt 2 6.961 ms/op
  ```
  * branch
  ```
Benchmark (numPartitions) (useTopicIds) Mode Cnt Score Error Units
PartitionCreationBench.makeFollower 20 false avgt 2 6.212 ms/op
PartitionCreationBench.makeFollower 20 true avgt 2 7.005 ms/op
  ```

Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-17 23:59:11 +08:00
Sanskar Jhajharia 766caaa551
MINOR: Clean up metadata module (#19069)
Given that we now support Java 17 on our brokers, this PR replaces the
use of the following in the metadata module:

* Collections.singletonList() and Collections.emptyList() with List.of()
* Collections.singletonMap() and Collections.emptyMap() with Map.of()
* Collections.singleton() and Collections.emptySet() with Set.of()

Reviewers: David Arthur <mumrah@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-16 03:09:40 +08:00
José Armando García Sancio d04efca493
KAFKA-18979; Report correct kraft.version in ApiVersions (#19205)
Skip kraft.version when applying FeatureLevelRecord records. The kraft.version is stored as control records, not as metadata records. This solution has the benefit of removing from snapshots any FeatureLevelRecord for kraft.version that was incorrectly written to the log, and of allowing ApiVersions to report the correct finalized kraft.version.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-03-13 18:39:24 -04:00
PoAn Yang d171ff08a7
KAFKA-18858 Refactor FeatureControlManager to avoid using uninitialized MV (#19040)
The `FeatureControlManager` used `MetadataVersion#LATEST_PRODUCTION` as the uninitialized MV. This means other components may get a stale MV. In production code, the `FeatureControlManager` sets the MV when replaying a `FeatureLevelRecord`, so we can use `Optional.empty()` as the uninitialized MV. If another component gets an empty result, an exception is thrown, as in `FeaturesImage`.

Unit test:
* FeatureControlManagerTest#testMetadataVersion: test getting the
MetadataVersion before and after replaying a FeatureLevelRecord.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-03-13 23:37:41 +08:00
David Arthur 0ebc3e83c5
MINOR Mar 12 Flaky tests (#19190)
Mark the following tests as flaky:

* StickyAssignorTest > testLargeAssignmentAndGroupWithUniformSubscription
* DeleteSegmentsByRetentionTimeTest
* QuorumControllerTest > testUncleanShutdownBrokerElrEnabled

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-03-12 13:47:35 -04:00
Lucas Brutschy fc2e3dfce9
MINOR: Disallow unused local variables (#18963)
Recently, we found a regression that could have been detected by static
analysis, since a local variable wasn't being passed to a method during
a refactoring, and was left unused. It was fixed in
[7a749b5](7a749b589f),
but almost slipped into 4.0. Unused variables are typically detected by
IDEs, but this is insufficient to prevent these kinds of bugs. This
change enables unused local variable detection in checkstyle for Kafka.

A few notes on the usage:
- There are two situations in which people actually want to have a local
variable but not use it. First, there are `for (Type ignored:
collection)` loops which have to loop `collection.length` number of
times, but that do not use `ignored` in the loop body. These are
typically still easier to read than a classical `for` loop. Second, some
IDEs detect it if a return value of a function such as `File.delete` is
not being used. In this case, people sometimes store the result in an
unused local variable to make ignoring the return value explicit and to
avoid the squiggly lines.
- In Java 22, unused local variables can be denoted by a single
underscore `_`. This is supported by checkstyle. In pre-22 versions,
IntelliJ allows such variables to be named `ignored` to suppress the
unused local variable warning. This pattern is often (but not
consistently) used in the Kafka codebase. It is, however, not
supported by checkstyle.

Since we cannot switch to Java 22 yet, and we want automated detection
using checkstyle, we have to resort to prefixing the unused local
variables with `@SuppressWarnings("UnusedLocalVariable")`. We have to
apply this in 11 cases across the Kafka codebase. While not being
pretty, I'd argue it's worth it to prevent bugs like the one fixed in
[7a749b5](7a749b589f).
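
A minimal sketch of the two suppression patterns described above (assuming checkstyle's SuppressWarnings filter is enabled, as the message implies):

```
import java.io.File;
import java.util.List;

class UnusedLocalExamples {
    int loopFixedNumberOfTimes(List<String> collection) {
        int count = 0;
        // Loop once per element without using the element itself.
        for (@SuppressWarnings("UnusedLocalVariable") String ignored : collection) {
            count++;
        }
        return count;
    }

    void deleteExplicitly(File file) {
        // Store the return value to make ignoring it explicit.
        @SuppressWarnings("UnusedLocalVariable")
        boolean ignored = file.delete();
    }
}
```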

Reviewers: Andrew Schofield <aschofield@confluent.io>, David Arthur
<mumrah@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Bruno
Cadonna <cadonna@apache.org>, Kirk True <ktrue@confluent.io>
2025-03-10 09:37:35 +01:00
PoAn Yang a5e5e2dcd5
KAFKA-18706 Move AclPublisher to metadata module (#18802)
Move AclPublisher to org.apache.kafka.metadata.publisher package.

Reviewers: Christo Lolov <lolovc@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-09 21:00:33 +08:00
Colin Patrick McCabe 343bc995f4
KAFKA-18920: The kcontrollers must set kraft.version in ApiVersionsResponse (#19127)
The Kafka controllers need to set kraft.version in their
ApiVersionsResponse messages according to the current kraft.version
reported by the Raft layer. Currently, however, they always set it to 0.

Also remove FeatureControlManager.latestFinalizedFeatures. It is not
needed, and it does a lot of copying.

Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-03-07 13:46:46 -08:00
Calvin Liu db38bef076
KAFKA-18940: fix electionWasClean (#19156)
The electionWasClean should also consider if the election is done
through ELR. Otherwise, the metric uncleanLeaderElection will wrongly
count the ELR election
https://issues.apache.org/jira/browse/KAFKA-18940

Reviewers: Jun Rao <junrao@gmail.com>
2025-03-07 11:04:06 -08:00