kafka

Commit Graph

Author	SHA1	Message	Date
Dongnuo Lyu	21d60eabab	KAFKA-16673; Simplify `GroupMetadataManager#toTopicPartitions` by using `ConsumerProtocolSubscription` instead of `ConsumerPartitionAssignor.Subscription` (#16309 ) In `GroupMetadataManager#toTopicPartitions`, we generate a list of `ConsumerGroupHeartbeatRequestData.TopicPartitions` from the input deserialized subscription. Currently the input subscription is `ConsumerPartitionAssignor.Subscription`, where the topic partitions are stored as (topic-partition) pairs, whereas in `ConsumerGroupHeartbeatRequestData.TopicPartitions`, we need the topic partitions to be stored as (topic-partition list) pairs. `ConsumerProtocolSubscription` is an intermediate data structure in the deserialization where the topic partitions are stored as (topic-partition list) pairs. This PR uses `ConsumerProtocolSubscription` as the input subscription instead, to make `toTopicPartitions` more efficient. Reviewers: David Jacot <djacot@confluent.io>	2024-06-17 02:47:52 -07:00
Omnia Ibrahim	e99da2446c	KAFKA-15853: Move KafkaConfig.configDef out of core (#16116 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2024-06-14 17:26:00 +02:00
gongxuanzhang	6d9ef0e12a	KAFKA-10787 Apply spotless to `group-coordinator` and `group-coordinator-api` (#16298 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2024-06-14 12:46:28 +08:00
Dongnuo Lyu	11c85a93c3	MINOR: Make online downgrade failure logs less noisy and update the timeouts scheduled in `convertToConsumerGroup` (#16290 ) This patch: - changes the order of the checks in `validateOnlineDowngrade`, so that `online downgrade is disabled` is logged only when the last member using the consumer protocol leaves, the group still has classic member(s), and the policy doesn't allow downgrade. - changes the session timeout in `convertToConsumerGroup` from `consumerGroupSessionTimeoutMs` to `member.classicProtocolSessionTimeout().get()`. Reviewers: David Jacot <djacot@confluent.io>	2024-06-13 02:11:01 -07:00
gongxuanzhang	596b945072	KAFKA-16643 Add ModifierOrder checkstyle rule (#15890 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2024-06-13 15:39:32 +08:00
David Jacot	638844f833	KAFKA-16770; [2/2] Coalesce records into bigger batches (#16215 ) This patch is the continuation of https://github.com/apache/kafka/pull/15964. It introduces record coalescing to the CoordinatorRuntime. It also introduces a new configuration `group.coordinator.append.linger.ms` which allows administrators to choose the linger time or disable it with zero. The new configuration defaults to 10ms. Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>	2024-06-11 23:29:50 -07:00
David Jacot	98f7da9172	KAFKA-16930; UniformHeterogeneousAssignmentBuilder throws NPE when one member has no subscriptions (#16283 ) Fix the following NPE: ``` java.lang.NullPointerException: Cannot invoke "org.apache.kafka.coordinator.group.assignor.MemberAssignment.targetPartitions()" because the return value of "java.util.Map.get(Object)" is null at org.apache.kafka.coordinator.group.assignor.GeneralUniformAssignmentBuilder.canMemberParticipateInReassignment(GeneralUniformAssignmentBuilder.java:248) at org.apache.kafka.coordinator.group.assignor.GeneralUniformAssignmentBuilder.balance(GeneralUniformAssignmentBuilder.java:336) at org.apache.kafka.coordinator.group.assignor.GeneralUniformAssignmentBuilder.buildAssignment(GeneralUniformAssignmentBuilder.java:157) at org.apache.kafka.coordinator.group.assignor.UniformAssignor.assign(UniformAssignor.java:84) at org.apache.kafka.coordinator.group.consumer.TargetAssignmentBuilder.build(TargetAssignmentBuilder.java:302) at org.apache.kafka.coordinator.group.GroupMetadataManager.updateTargetAssignment(GroupMetadataManager.java:1913) at org.apache.kafka.coordinator.group.GroupMetadataManager.consumerGroupHeartbeat(GroupMetadataManager.java:1518) at org.apache.kafka.coordinator.group.GroupMetadataManager.consumerGroupHeartbeat(GroupMetadataManager.java:2254) at org.apache.kafka.coordinator.group.GroupCoordinatorShard.consumerGroupHeartbeat(GroupCoordinatorShard.java:308) at org.apache.kafka.coordinator.group.GroupCoordinatorService.lambda$consumerGroupHeartbeat$0(GroupCoordinatorService.java:298) at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$CoordinatorWriteEvent.lambda$run$0(CoordinatorRuntime.java:769) at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime.withActiveContextOrThrow(CoordinatorRuntime.java:1582) at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime.access$1400(CoordinatorRuntime.java:96) at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$CoordinatorWriteEvent.run(CoordinatorRuntime.java:767) at org.apache.kafka.coordinator.group.runtime.MultiThreadedEventProcessor$EventProcessorThread.handleEvents(MultiThreadedEventProcessor.java:144) at org.apache.kafka.coordinator.group.runtime.MultiThreadedEventProcessor$EventProcessorThread.run(MultiThreadedEventProcessor.java:176) ``` Reviewers: Lianet Magrans <lianetmr@gmail.com>, Justine Olshan <jolshan@confluent.io>	2024-06-11 11:43:56 -07:00
David Jacot	049cfeac02	MINOR: Rename uniform assignor's internal builders (#16233 ) This patch renames the uniform assignor's builders to match the `SubscriptionType` which is used to determine which one is called. It removes the abstract class `AbstractUniformAssignmentBuilder` which is not necessary anymore. It also applies minor refactoring. Reviewers: Ritika Reddy <rreddy@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2024-06-10 05:26:56 -07:00
David Jacot	7d832cf74f	KAFKA-14701; Move `PartitionAssignor` to new `group-coordinator-api` module (#16198 ) This patch moves the `PartitionAssignor` interface and all the related classes to a newly created `group-coordinator/api` module, following the pattern used by the storage and tools modules. Reviewers: Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2024-06-06 12:19:20 -07:00
Dongnuo Lyu	7ddfa64759	MINOR: Adjust validateOffsetCommit/Fetch in ConsumerGroup to ensure compatibility with classic protocol members (#16145 ) During online migration, there could be a ConsumerGroup whose members use the classic protocol. In the current implementation, `STALE_MEMBER_EPOCH` could be thrown in ConsumerGroup offset fetch/commit validation but it's not supported by the classic protocol. Thus this patch changes `ConsumerGroup#validateOffsetCommit` and `ConsumerGroup#validateOffsetFetch` to ensure compatibility. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>	2024-06-04 23:08:38 -07:00
Ritika Reddy	078dd9a311	KAFKA-16821; Member Subscription Spec Interface (#16068 ) This patch reworks the `PartitionAssignor` interface to use interfaces instead of POJOs. It mainly introduces the `MemberSubscriptionSpec` interface that represents a member subscription and changes the `GroupSpec` interfaces to expose the subscriptions and the assignments via different methods. The patch does not change the performance. before: ``` Benchmark (memberCount) (partitionsToMemberRatio) (topicCount) Mode Cnt Score Error Units TargetAssignmentBuilderBenchmark.build 10000 10 100 avgt 5 3.462 ± 0.687 ms/op TargetAssignmentBuilderBenchmark.build 10000 10 1000 avgt 5 3.626 ± 0.412 ms/op JMH benchmarks done ``` after: ``` Benchmark (memberCount) (partitionsToMemberRatio) (topicCount) Mode Cnt Score Error Units TargetAssignmentBuilderBenchmark.build 10000 10 100 avgt 5 3.677 ± 0.683 ms/op TargetAssignmentBuilderBenchmark.build 10000 10 1000 avgt 5 3.991 ± 0.065 ms/op JMH benchmarks done ``` Reviewers: David Jacot <djacot@confluent.io>	2024-06-04 06:44:37 -07:00
David Jacot	7d82f7625e	MINOR: Log time taken to compute the target assignment (#16185 ) The time taken to compute a new assignment is critical. This patch extends the existing logging to log it too. This is very useful information to have. Reviewers: Luke Chen <showuon@gmail.com>	2024-06-04 06:38:56 -07:00
Jeff Kim	d7bc43ed06	KAFKA-16664; Re-add EventAccumulator.poll(long, TimeUnit) (#16144 ) We have revamped the thread idle ratio metric in https://github.com/apache/kafka/pull/15835. https://github.com/apache/kafka/pull/15835#discussion_r1588068337 describes a case where the metric loses accuracy and in order to set a lower bound to the accuracy, this patch re-adds a poll with a timeout that was removed as part of https://github.com/apache/kafka/pull/15430. Reviewers: David Jacot <djacot@confluent.io>	2024-06-03 23:27:35 -07:00
TingIāu "Ting" Kì	7973aa6a39	KAFKA-16861: Don't convert to group to classic if the size is larger than group max size. (#16163 ) Fix the bug where the group is downgraded to a classic one when a member leaves, even though the consumer group size is still larger than `classicGroupMaxSize`. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>	2024-06-03 11:36:07 -07:00
David Jacot	979f8d9aa3	MINOR: Small refactor in TargetAssignmentBuilder (#16174 ) This patch is a small refactoring which mainly aims to avoid constructing a copy of the new target assignment in the TargetAssignmentBuilder because the copy is not used by the caller. The change relies on the existing tests and it does not really have an impact on performance (e.g. validated with TargetAssignmentBuilderBenchmark). Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2024-06-03 11:32:39 -07:00
David Jacot	fb566e48bf	KAFKA-16864; Optimize uniform (homogenous) assignor (#16088 ) This patch optimizes uniform (homogenous) assignor by avoiding creating a copy of all the assignments. Instead, the assignor creates a copy only if the assignment is updated. It is a sort of copy-on-write. This change reduces the overhead of the TargetAssignmentBuilder when run with the uniform (homogenous) assignor. Trunk: ``` Benchmark (memberCount) (partitionsToMemberRatio) (topicCount) Mode Cnt Score Error Units TargetAssignmentBuilderBenchmark.build 10000 10 100 avgt 5 24.535 ± 1.583 ms/op TargetAssignmentBuilderBenchmark.build 10000 10 1000 avgt 5 24.094 ± 0.223 ms/op JMH benchmarks done ``` ``` Benchmark (assignmentType) (assignorType) (isRackAware) (memberCount) (partitionsToMemberRatio) (subscriptionType) (topicCount) Mode Cnt Score Error Units ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 10000 10 HOMOGENEOUS 100 avgt 5 14.697 ± 0.133 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 10000 10 HOMOGENEOUS 1000 avgt 5 15.073 ± 0.135 ms/op JMH benchmarks done ``` Patch: ``` Benchmark (memberCount) (partitionsToMemberRatio) (topicCount) Mode Cnt Score Error Units TargetAssignmentBuilderBenchmark.build 10000 10 100 avgt 5 3.376 ± 0.577 ms/op TargetAssignmentBuilderBenchmark.build 10000 10 1000 avgt 5 3.731 ± 0.359 ms/op JMH benchmarks done ``` ``` Benchmark (assignmentType) (assignorType) (isRackAware) (memberCount) (partitionsToMemberRatio) (subscriptionType) (topicCount) Mode Cnt Score Error Units ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 10000 10 HOMOGENEOUS 100 avgt 5 1.975 ± 0.086 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 10000 10 HOMOGENEOUS 1000 avgt 5 2.026 ± 0.190 ms/op JMH benchmarks done ``` Reviewers: Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>	2024-05-31 13:17:59 -07:00
Andrew Schofield	2d9994e0de	KAFKA-16722: Introduce ConsumerGroupPartitionAssignor interface (#15998 ) KIP-932 introduces share groups to go alongside consumer groups. Both kinds of group use server-side assignors but it is unlikely that a single assignor class would be suitable for both. As a result, the KIP introduces specific interfaces for consumer group and share group partition assignors. This PR introduces only the consumer group interface, `o.a.k.coordinator.group.assignor.ConsumerGroupPartitionAssignor`. The share group interface will come in a later release. The existing implementations of the general `PartitionAssignor` interface have been changed to implement `ConsumerGroupPartitionAssignor` instead and all other code changes are just propagating the change throughout the codebase. Note that the code in the group coordinator that actually calculates assignments uses the general `PartitionAssignor` interface so that it can be used with both kinds of group, even though the assignors themselves are specific. Reviewers: Apoorv Mittal <amittal@confluent.io>, David Jacot <djacot@confluent.io>	2024-05-29 08:31:52 -07:00
Dongnuo Lyu	eefd114c4a	KAFKA-16832; LeaveGroup API for upgrading ConsumerGroup (#16057 ) This patch implements the LeaveGroup API to the consumer groups that are in the mixed mode. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>	2024-05-28 23:21:30 -07:00
Ritika Reddy	a8d166c00e	KAFKA-16625; Reverse lookup map from topic partitions to members (#15974 ) This patch speeds up the computation of the unassigned partitions by exposing the inverted target assignment. It allows the assignor to check whether a partition is assigned or not. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>	2024-05-25 09:06:15 -07:00
Jeff Kim	d585a494a4	KAFKA-16831: CoordinatorRuntime should initialize MemoryRecordsBuilder with max batch size write limit (#16059 ) CoordinatorRuntime should initialize MemoryRecordsBuilder with max batch size write limit. Otherwise, the write limit defaults to the min buffer size of 16384. This causes the coordinator to throw RecordTooLargeException even when it's under the 1MB max batch size limit. Reviewers: David Jacot <djacot@confluent.io>	2024-05-24 13:33:57 -07:00
Jeff Kim	520aa8665c	KAFKA-16626; Lazily convert subscribed topic names to topic ids (#15970 ) This patch aims to remove the data structure that stores the conversion from topic names to topic ids which was taking time similar to the actual assignment computation. Instead, we reuse the already existing ConsumerGroupMember.subscribedTopicNames() and do the conversion to topic ids when the iterator is requested. Reviewers: David Jacot <djacot@confluent.io>	2024-05-24 00:51:50 -07:00
Dongnuo Lyu	14b5c4d1e8	KAFKA-16793; Heartbeat API for upgrading ConsumerGroup (#15988 ) This patch implements the heartbeat api to the members that use the classic protocol in a ConsumerGroup. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>	2024-05-22 23:27:00 -07:00
Jeff Kim	e692feed34	MINOR: fix flaky testRecordThreadIdleRatio (#15987 ) DelayEventAccumulator should return immediately if there are no events in the queue. Also removed some unused fields inside EventProcessorThread. Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>	2024-05-22 23:24:23 -07:00
Mickael Maison	affe8da54c	KAFKA-7632: Support Compression Levels (KIP-390) (#15516 ) Reviewers: Jun Rao <jun@confluent.io>, Luke Chen <showuon@gmail.com> Co-authored-by: Lee Dongjin <dongjin@apache.org>	2024-05-21 17:58:49 +02:00
David Jacot	b4c2d66801	KAFKA-16770; [1/N] Coalesce records into bigger batches (#15964 ) We have discovered during large scale performance tests that the current write path of the new coordinator does not scale well. The issue is that each write operation writes synchronously from the coordinator threads. Coalescing records into bigger batches helps drastically because it amortizes the cost of writes. Aligning the batches with the snapshots of the timelines data structures also reduces the number of in-flight snapshots. This patch is the first of a series of patches that will bring records coalescing into the coordinator runtime. As a first step, we had to rework the PartitionWriter interface and move the logic to build MemoryRecords from it to the CoordinatorRuntime. The main changes are in these two classes. The others are related mechanical changes. Reviewers: Justine Olshan <jolshan@confluent.io>	2024-05-20 23:47:09 -07:00
David Jacot	5b34574e86	MINOR: Refactor write timeout in CoordinatorRuntime (#15976 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2024-05-18 01:47:44 +08:00
Dongnuo Lyu	b8c96389b4	KAFKA-16762: SyncGroup API for upgrading ConsumerGroup (#15954 ) This patch implements the sync group api for the consumer groups that are in the mixed mode. In classicGroupSyncToConsumerGroup, the assignedPartitions calculated in the JoinGroup will be returned as the assignment in the sync response and the member session timeout will be rescheduled. Reviewers: David Jacot <djacot@confluent.io>	2024-05-17 07:12:40 -07:00
David Jacot	ffb31e172a	MINOR: Remove usage of Stream API in CoordinatorRecordHelpers (#15969 ) This patch removes the usage of the Stream API in CoordinatorRecordHelpers. I saw it in a couple of profiles so it is better to remove it. Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2024-05-15 23:13:55 -07:00
David Jacot	bf88013a28	MINOR: Rename `Record` to `CoordinatorRecord` (#15949 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2024-05-15 13:57:19 +08:00
Dongnuo Lyu	0e023e1f73	MINOR: Add classic member session timeout to ClassicMemberMetadata (#15921 ) The heartbeat api to the consumer group with classic protocol members schedules the session timeout. At present, there's no way to get the classic member session timeout in a heartbeat to a consumer group. This patch stores the session timeout in the ClassicMemberMetadata in ConsumerGroupMemberMetadataValue and updates it when it's provided in the join request. Reviewers: David Jacot <djacot@confluent.io>	2024-05-14 20:41:20 +02:00
Jeff Kim	df5735dda5	MINOR: fix flaky testRecordThreadIdleRatioTwoThreads test (#15937 ) Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2024-05-14 23:20:36 +08:00
Ritika Reddy	ccd83cafea	KAFKA-16694; Remove Rack Awareness Code from the Server Side Assignors (#15903 ) Reviewers: David Jacot <djacot@confluent.io>	2024-05-14 00:13:35 -07:00
David Jacot	f9169b7d3a	KAFKA-16735; Deprecate offsets.commit.required.acks (#15931 ) This patch deprecates `offsets.commit.required.acks` in Apache Kafka 3.8 as described in KIP-1041: https://cwiki.apache.org/confluence/x/9YobEg. Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2024-05-13 11:30:34 -07:00
Ritika Reddy	ee16eee5de	KAFKA-16587: Add subscription model information to group state (#15785 ) This patch introduces the SubscriptionType to the group state and passes it along to the partition assignor. A group is "homogeneous" when all the members are subscribed to the same topics; or it is "heterogeneous" otherwise. This mainly helps the uniform assignor because it does not have to re-compute this information to determine which algorithm to use. trunk: Benchmark (assignmentType) (assignorType) (isRackAware) (memberCount) (partitionsToMemberRatio) (subscriptionModel) (topicCount) Mode Cnt Score Error Units ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 100 10 HOMOGENEOUS 100 avgt 5 0.136 ± 0.001 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 100 10 HOMOGENEOUS 1000 avgt 5 0.198 ± 0.002 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 1000 10 HOMOGENEOUS 100 avgt 5 1.767 ± 0.138 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 1000 10 HOMOGENEOUS 1000 avgt 5 1.540 ± 0.020 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 10000 10 HOMOGENEOUS 100 avgt 5 32.419 ± 7.173 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 10000 10 HOMOGENEOUS 1000 avgt 5 26.731 ± 1.985 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 100 10 HOMOGENEOUS 100 avgt 5 0.242 ± 0.006 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 100 10 HOMOGENEOUS 1000 avgt 5 1.002 ± 0.006 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 1000 10 HOMOGENEOUS 100 avgt 5 2.544 ± 0.168 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 1000 10 HOMOGENEOUS 1000 avgt 5 10.749 ± 0.207 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 10000 10 HOMOGENEOUS 100 avgt 5 26.832 ± 0.154 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 10000 10 HOMOGENEOUS 1000 avgt 5 106.209 ± 0.301 ms/op JMH benchmarks done patch: Benchmark (assignmentType) (assignorType) (isRackAware) (memberCount) (partitionsToMemberRatio) (subscriptionType) (topicCount) Mode Cnt Score Error Units ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 100 10 HOMOGENEOUS 100 avgt 5 0.131 ± 0.001 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 100 10 HOMOGENEOUS 1000 avgt 5 0.185 ± 0.004 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 1000 10 HOMOGENEOUS 100 avgt 5 1.943 ± 0.091 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 1000 10 HOMOGENEOUS 1000 avgt 5 1.450 ± 0.139 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 10000 10 HOMOGENEOUS 100 avgt 5 30.803 ± 2.644 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 10000 10 HOMOGENEOUS 1000 avgt 5 24.251 ± 1.230 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 100 10 HOMOGENEOUS 100 avgt 5 0.155 ± 0.004 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 100 10 HOMOGENEOUS 1000 avgt 5 0.235 ± 0.010 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 1000 10 HOMOGENEOUS 100 avgt 5 1.602 ± 0.046 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 1000 10 HOMOGENEOUS 1000 avgt 5 1.901 ± 0.174 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 10000 10 HOMOGENEOUS 100 avgt 5 16.098 ± 1.905 ms/op ServerSideAssignorBenchmark.doAssignment INCREMENTAL UNIFORM false 10000 10 HOMOGENEOUS 1000 avgt 5 17.681 ± 0.174 ms/op JMH benchmarks done Reviewers: David Jacot <djacot@confluent.io>	2024-05-13 02:19:05 -07:00
Jeff Kim	8a9dd2beda	KAFKA-16663; Cancel write timeout TimerTask on successful event completion (#15902 ) Write events create and add a TimerTask to schedule the timeout operation. The issue is that we pile up the number of timer tasks which are essentially no-ops if replication was successful. They stay in memory for 15 seconds (default write timeout) and as the rate of write increases, the impact on memory usage increases. Instead, cancel the corresponding write timeout task when the write event is committed to the log. This also applies to complete transaction events. Reviewers: David Jacot <djacot@confluent.io>	2024-05-13 00:18:32 -07:00
Jeff Kim	21bf715622	KAFKA-16307; Fix coordinator thread idle ratio (#15835 ) This PR fixes the thread idle ratio. We take a similar approach to the kafka request handler idle ratio: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaRequestHandler.scala#L108-L117 Instead of calculating the actual ratio per thread, we record the time each thread stays idle while waiting for a new event, divided by the number of threads as an approximation. Reviewers: David Jacot <djacot@confluent.io>	2024-05-07 06:21:09 -07:00
Dongnuo Lyu	459eaec666	KAFKA-16615; JoinGroup API for upgrading ConsumerGroup (#15798 ) The patch implements JoinGroup API for the new consumer groups. It allows members using the classic rebalance protocol with the consumer embedded protocol to join a new consumer group. Reviewers: David Jacot <djacot@confluent.io>	2024-05-06 23:59:10 -07:00
David Jacot	42754336e1	MINOR: Remove `ConsumerGroupPartitionMetadataValue.Epoch` field (#15854 ) ConsumerGroupPartitionMetadataValue.Epoch is not used anywhere so we can remove it. Note that we already have non-backward compatible changes lined up for 3.8 so it is fine to do it. Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2024-05-06 05:02:39 -07:00
Chia Chuan Yu	55a00be4e9	MINOR: Replaced Utils.join() with JDK API. (#15823 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2024-05-06 15:13:01 +08:00
David Jacot	2c0b8b6920	MINOR: ConsumerGroup#getOrMaybeCreateMember should not add the member to the group (#15847 ) While reviewing https://github.com/apache/kafka/pull/15785, I noticed that the member is added to the group directly in `ConsumerGroup#getOrMaybeCreateMember`. This does not hurt but confuses people because the state must not be mutated at this point. It should only be mutated when records are replayed. I think that it is better to remove it in order to make it clear. Reviewers: Dongnuo Lyu <dlyu@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2024-05-03 06:24:26 -07:00
Dongnuo Lyu	1e8415160f	MINOR: Add replayRecords to CoordinatorResult (#15818 ) The patch adds a boolean attribute `replayRecords` that specifies whether the records should be replayed. Reviewers: David Jacot <djacot@confluent.io>	2024-04-30 09:14:02 -07:00
Dongnuo Lyu	994077e43e	MINOR: Fix the flaky testConsumerGroupHeartbeatWithStableClassicGroup by sorting the topic partition list (#15816 ) We are seeing flaky failures in `testConsumerGroupHeartbeatWithStableClassicGroup` where the error is caused by the different ordering in the expected and actual values. The patch sorts the topic partition list in the records to fix the issue. Reviewers: Jeff Kim <jeff.kim@confluent.io>, Igor Soarez <soarez@apple.com>, David Jacot <djacot@confluent.io>	2024-04-29 00:43:49 -07:00
Dongnuo Lyu	dc1d8fc330	KAFKA-16554: Online downgrade triggering and group type conversion (#15721 ) Online downgrade from a consumer group to a classic group is triggered when the last consumer that uses the consumer protocol leaves the group. A rebalance is manually triggered after the group conversion. This patch adds consumer group downgrade validation and conversion. Reviewers: David Jacot <djacot@confluent.io>	2024-04-25 07:44:25 -07:00
Omnia Ibrahim	363f4d2847	KAFKA-15853 Move consumer group and group coordinator configs out of core (#15684 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2024-04-17 20:41:22 +08:00
Dongnuo Lyu	a9f65a5d7f	KAFKA-16436; Online upgrade triggering and group type conversion (#15662 ) This patch introduces the conversion from a classic group to a consumer group when a member joins with the new consumer group protocol (epoch is 0) but only if the conversion is enabled. Reviewers: David Jacot <djacot@confluent.io>	2024-04-17 04:57:44 -07:00
Dongnuo Lyu	619f27015f	KAFKA-16294: Add group protocol migration enabling config (#15411 ) This patch adds the `group.consumer.migration.policy` config which controls how consumer groups can be converted from classic group to consumer group and vice versa. The config is kept as an internal one while we develop the feature. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>	2024-04-10 10:59:26 -07:00
Dongnuo Lyu	9bc48af1c1	MINOR: Add type check to classic group timeout operations (#15587 ) When implementing the group type conversion from a classic group to a consumer group, if the replay of conversion records fails, the group should be reverted back including its timeouts. A possible solution is to keep all the classic group timeouts and add a type check to the timeout operations. If the group is successfully upgraded, it won't be able to pass the type check and its operations will be executed without actually doing anything; if the group upgrade fails, the group map will be reverted and the timeout operations will be executed as is. We already have a group type check in consumer group timeout operations. This patch adds a similar type check to the classic group timeout operations. Reviewers: David Jacot <djacot@confluent.io>	2024-04-10 00:36:49 -07:00
Erik van Oosten	8e61f04228	MINOR: Fix usage of none in javadoc (#15674 ) - Use `Empty` instead of 'none' when referring to `Optional` values. - `Headers.lastHeader` returns `null` when no header is found. - Fix minor spelling mistakes. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2024-04-08 08:43:05 +08:00
Jeff Kim	b3116f4f76	KAFKA-16148: Implement GroupMetadataManager#onUnloaded (#15446 ) This patch completes all awaiting futures when a group is unloaded. Reviewers: David Jacot <djacot@confluent.io>	2024-04-02 03:16:02 -07:00
Sean Quah	ad960635a9	KAFKA-16386: Convert NETWORK_EXCEPTIONs from KIP-890 transaction verification (#15559 ) KIP-890 Part 1 introduced verification of transactions with the transaction coordinator on the `Produce` and `TxnOffsetCommit` paths. This introduced the possibility of new errors when responding to those requests. For backwards compatibility with older clients, a choice was made to convert some of the new retriable errors to existing errors that are expected and retried correctly by older clients. `NETWORK_EXCEPTION` was forgotten about and not converted, but can occur if, for example, the transaction coordinator is temporarily refusing connections. Now, we convert it to: * `NOT_ENOUGH_REPLICAS` on the `Produce` path, just like the other retriable errors that can arise from transaction verification. * `COORDINATOR_LOAD_IN_PROGRESS` on the `TxnOffsetCommit` path. This error does not force coordinator lookup on clients, unlike `COORDINATOR_NOT_AVAILABLE`. Note that this deviates from KIP-890, which says that retriable errors should be converted to `COORDINATOR_NOT_AVAILABLE`. Reviewers: Artem Livshits <alivshits@confluent.io>, David Jacot <djacot@confluent.io>, Justine Olshan <jolshan@confluent.io>	2024-03-25 16:08:23 -07:00
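
The regrouping that commit 21d60eabab (KAFKA-16673) describes, from flat (topic-partition) pairs to (topic-partition list) pairs, might look roughly like the sketch below. `TopicPartitionGrouping`, `Pair`, and `groupByTopic` are hypothetical names chosen for illustration; this is not the code in `GroupMetadataManager#toTopicPartitions`, which operates on Kafka's own types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the Kafka codebase.
final class TopicPartitionGrouping {

    /** A flat (topic, partition) pair, the shape the commit message calls "(topic-partition) pairs". */
    static final class Pair {
        final String topic;
        final int partition;

        Pair(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }
    }

    /**
     * Regroups flat pairs into topic -> partition list, the shape the commit message
     * calls "(topic-partition list) pairs". Topics keep first-seen order.
     */
    static Map<String, List<Integer>> groupByTopic(List<Pair> pairs) {
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (Pair pair : pairs) {
            // Create the per-topic list on first sight, then append the partition.
            grouped.computeIfAbsent(pair.topic, t -> new ArrayList<>()).add(pair.partition);
        }
        return grouped;
    }
}
```

An input that is already grouped per topic, as the commit message says `ConsumerProtocolSubscription` is, would not need this extra pass, which is presumably where the efficiency gain comes from.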

179 Commits