Commit Graph

55 Commits

Author SHA1 Message Date
David Jacot 1574b9f16d
MINOR: Code cleanups in group-coordinator module (#14117)
This patch does a few code cleanups in the group-coordinator module.

It renames Coordinator to CoordinatorShard;
It renames ReplicatedGroupCoordinator to GroupCoordinatorShard. I was never really happy with this name. The new name makes more sense to me;
It removes TopicPartition from the GroupMetadataManager. It was only used in log messages. The log context already includes it so we don't have to log it again.
It renames assignors to consumerGroupAssignors.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-28 11:28:54 -07:00
Ritika Reddy 3709901c9e
KAFKA-14702: Extend server side assignor to support rack aware replica placement (#14099)
This patch updates the `PartitionAssignor` interface to support rack-awareness. The change introduces the `SubscribedTopicDescriber` interface that can be used to retrieve topic metadata such as the number of partitions or the racks from within an assignor. We use an interface because it allows us to wrap internal data structures instead of having to copy them. It is more efficient.

Reviewers: David Jacot <djacot@confluent.io>
2023-07-28 19:30:04 +02:00
Jeff Kim 19f9e1e6d0
KAFKA-14501: Implement Heartbeat protocol in new GroupCoordinator (#14056)
This patch implements the existing Heartbeat API in the new Group Coordinator.

Reviewers: David Jacot <djacot@confluent.io>
2023-07-28 15:13:27 +02:00
David Jacot dcabc295ec
KAFKA-14048; CoordinatorContext should be protected by a lock (#14090)
Accessing the `CoordinatorContext` in the `CoordinatorRuntime` should be protected by a lock. The runtime guarantees that the context is never access concurrently however it is accessed by multiple threads. The lock is here to ensure that we have a proper memory barrier. The patch does the following:
1) Adds a lock to `CoordinatorContext`;
2) Adds helper methods to get the context and acquire/release the lock.
3) Allow transition from Failed to Loading. Previously, the context was recreated in this case.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-07-28 14:49:48 +02:00
David Jacot 29825ee24f
KAFKA-14499: [3/N] Implement OffsetCommit API (#14067)
This patch introduces the `OffsetMetadataManager` and implements the `OffsetCommit` API for both the old rebalance protocol and the new rebalance protocol. It introduces version 9 of the API but keeps it as unstable for now. The patch adds unit tests to test the API. Integration tests will be done separately.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-27 13:18:10 +02:00
Jeff Kim d2fc907623
KAFKA-14500; [6/6] Implement SyncGroup protocol in new GroupCoordinator (#14017)
This patch implements the SyncGroup API in the new group coordinator. All the new unit tests are based on the existing scala tests.

Reviewers: David Jacot <djacot@confluent.io>
2023-07-27 08:02:29 +02:00
David Jacot 2528dd4116
KAFKA-14499: [2/N] Add OffsetCommit record & related (#14047)
This patch does a few things:
1) It introduces the `OffsetAndMetadata` class which hold the committed offsets in the group coordinator.
2) It adds methods to deal with OffsetCommit records to `RecordHelpers`.
3) It adds `MetadataVersion#offsetCommitValueVersion` to get the version of the OffsetCommit value record that should be used.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Arthur <mumrah@gmail.com>, Justine Olshan <jolshan@confluent.io>
2023-07-21 20:09:06 +02:00
Jeff Kim a500c3ecf9
KAFKA-14500; [5/N] Implement JoinGroup protocol in new GroupCoordinator (#13870)
This patch implements the existing JoinGroup protocol within the new group coordinator. 

Some notable differences:
* Methods return a CoordinatorResult to the runtime framework, which includes records to append to the log as well as a future to complete after the append succeeds/fails.
* The coordinator runtime ensures that only a single thread will be processing a group at any given time, therefore there is no more locking on groups.
* Instead of using on purgatories, we rely on the Timer interface to schedule/cancel delayed operations.

Reviewers: David Jacot <djacot@confluent.io>
2023-07-19 09:15:13 +02:00
David Jacot 32ff347b2c
KAFKA-14462; [23/23] Wire GroupCoordinatorService in BrokerServer (#13991)
This patch wires the new group coordinator in BrokerServer (KRaft only). With this, it is now possible to run a cluster with the new group coordinator and to use the ConsumerGroupHeartbeat API by specifying the following two properties:
- group.coordinator.new.enable = true (to enable the new group coordinator)
- unstable.api.versions.enable = true (to enable unreleased APIs)

Note that the new group coordinator does not support all the existing APIs yet.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-14 17:41:06 +02:00
David Jacot aafbe34443
KAFKA-14462; [22/N] Implement session and revocation timeouts (#13963)
This patch adds the session timeout and the revocation timeout to the new consumer group protocol.

Reviewers: Calvin Liu <caliu@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-12 11:10:51 +02:00
David Jacot 9cde3a7910
KAFKA-14462; [21/N] Add CoordinatorTimer implementation in CoordinatorRuntime (#13961)
This patch adds EventBasedCoordinatorTimer and the CoordinatorTimer interface.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-07 22:21:30 +02:00
David Jacot 98fbd8afc7
KAFKA-14462; [20/N] Refresh subscription metadata on new metadata image (#13901)
This patch adds (1) the logic to propagate a new MetadataImage to the running coordinators; and (2) the logic to ensure that all the consumer groups subscribed to topics with changes will refresh their subscriptions metadata on the next heartbeat. In the mean time, it ensures that freshly loaded consumer groups also refresh their subscriptions metadata on the next heartbeat.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-05 18:28:38 +02:00
David Jacot 482299c4e2
KAFKA-14462; [19/N] Add CoordinatorLoader implementation (#13880)
This patch adds a coordinator loader implementation.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-06-29 08:12:53 +02:00
David Jacot a81486e4f8
KAFKA-14462; [18/N] Add GroupCoordinatorService (#13812)
This patch introduces the GroupCoordinatorService. This is the new (incomplete) implementation of the group coordinator based on the coordinator runtime introduced in https://github.com/apache/kafka/pull/13795.

Reviewers: Divij Vaidya <diviv@amazon.com>, Justine Olshan <jolshan@confluent.io>
2023-06-22 09:06:10 +02:00
minjian.cai 39a47c8999
MINOR: fix typos for group coordinator (#13886)
Reviewers: Manyanda Chitimbo <manyanda.chitimbo@gmail.com>, Divij Vaidya <diviv@amazon.com>
2023-06-20 22:57:26 +02:00
minjian.cai 3d97743c67
MINOR: Fix some typos for core (#13882)
Reviewers:  Divij Vaidya <diviv@amazon.com>
2023-06-20 22:52:39 +02:00
David Jacot 7556ce366a
KAFKA-14462; [17/N] Add CoordinatorRuntime (#13795)
This patch introduces the CoordinatorRuntime. The CoordinatorRuntime is a framework which encapsulate all the common features requires to build a coordinator such as the group coordinator. Please refer to the javadoc of that class for the details.

Reviewers: Divij Vaidya <diviv@amazon.com>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-06-13 09:46:38 +02:00
David Jacot 7d147cf241
KAFKA-14462; [14/N] Add PartitionWriter (#13675)
This patch introduces the `PartitionWriter` interface in the `group-coordinator` module. The `ReplicaManager` resides in the `core` module and it is thus not accessible from the `group-coordinator` one. The `CoordinatorPartitionWriter` is basically an implementation of the interface residing in `core` which interfaces with the `ReplicaManager`.

One notable difference from the usual produce path is that the `PartitionWriter` returns the offset following the written records. This is then used by the coordinator runtime to track when the request associated with the write can be completed.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-06-06 16:24:48 +02:00
David Jacot e5d67b81fd
KAFKA-14462; [16/N] Add CoordinatorLoader and CoordinatorPlayback interfaces (#13794)
This patch adds the CoordinatorLoader and the CoordinatorPlayback interfaces. The former is used to load (or rebuild) the coordinator state by replaying all the records stored in the partition. The later is the interface that must be implemented by coordinator to support replaying records.

Reviewers: Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-06-02 21:21:51 +02:00
David Jacot 7c3a2846d4
KAFKA-14462; [15/N] Make Result generic and rename it (#13793)
This patch makes the record type generic, moves the class to the runtime package and finally rename the class to follow the naming of the other classes in the runtime package.

Reviewers: Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-06-02 09:18:09 +02:00
David Jacot 47551ea369
KAFKA-14462; [13/N] CoordinatorEvent and CoordinatorEventProcessor (#13666)
Adds CoordinatorEvent, CoordinatorEventProcessor, and MultiThreadedEventProcessor.

Reviewers: Kirk True <ktrue@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-06-01 13:33:40 -07:00
David Jacot 49d9c6775d
KAFKA-14462; [12/N] Add GroupMetadataManager and ConsumerGroup (#13639)
This patch adds the GroupMetadataManager to the group-coordinator module. This manager is responsible for handling the groups management, the members management and the entire reconciliation process. At this point, only the new consumer group type/protocol is implemented.

The new manager is based on an architecture inspired from the quorum controller. A request can access/read the state but can't mutate it directly. Instead, a list of records is generated together with the response and those records are applied to the state by the runtime framework. We use timeline data structures. Note that the runtime framework is not part of this patch. It will come in a following one.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-05-31 08:29:41 +02:00
Jeff Kim 15f8705246
KAFKA-14500; [4/N] Add Timer interface (#13708)
Reviewers: David Jacot <djacot@confluent.io>
2023-05-23 10:43:59 +02:00
Jeff Kim c98c1ed41c
KAFKA-14500; [3/N] add GroupMetadataKey/Value record helpers (#13704)
This path enables the new group metadata manager to generate GroupMetadataKey/Value records.

Reviewers: David Jacot <djacot@confluent.io>
2023-05-23 10:42:13 +02:00
Jeff Kim cc011f77aa
KAFKA-14500; [2/N] Rewrite GroupMetadata in Java (#13663)
This patch introduces `GenericGroup` which rewrite the `GroupMetadata` in Java. The `GenericGroup` is basically a group using the current rebalance protocol in the new group coordinator.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Christo Lolov <lolovc@amazon.com>, David Jacot <djacot@confluent.io>
2023-05-12 11:22:29 +02:00
Ritika Reddy 4653507926
KAFKA-14514; Add Range Assignor on the Server (KIP-848) (#13443)
This patch adds the RangeAssignor on the server for KIP-848. This range assignor is very different from the old client side implementation. We added functionality to make these assignments sticky while also inheriting crucial properties of the range assignor such as facilitating joins and distributing partitions of a topic somewhat equally amongst its subscribers.

Reviewers: Philip Nee <philipnee@gmail.com>, Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>
2023-05-10 14:09:12 +02:00
Jeff Kim 228434d235
KAFKA-14500; [1/N] Rewrite MemberMetadata in Java (#13644)
This patch adds GenericGroupMember which is a rewrite of MemberMetadata in Java.

Reviewers: David Jacot <djacot@confluent.io>
2023-05-09 16:49:27 +02:00
David Jacot 7634eee262
KAFKA-14462; [11/N] Add CurrentAssignmentBuilder (#13638)
This patch adds the `CurrentAssignmentBuilder` class which encapsulates the reconciliation engine of the consumer group protocol. Given the current state of a member and a desired or target assignment state, the state machine takes the necessary steps to converge the member to its desired state.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Calvin Liu <caliu@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-05-08 20:46:07 +02:00
David Jacot 16fc8e1cff
KAFKA-14462; [10/N] Add TargetAssignmentBuilder (#13637)
This patch adds TargetAssignmentBuilder. It is responsible for computing a target assignment for a given group.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-05-02 18:04:50 +02:00
David Jacot 8bde4e79cd
KAFKA-14462; [9/N] Add RecordHelpers (#13544)
This patch adds RecordHelpers.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-04-27 14:05:41 +02:00
David Jacot 9a36da12b7
KAFKA-14462; [8/N] Add ConsumerGroupMember (#13538)
This patch adds ConsumerGroupMember.

Reviewers: Christo Lolov <lolovc@amazon.com>, Jeff Kim <jeff.kim@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-04-25 18:50:51 +02:00
David Jacot c39bf714bb
KAFKA-14462; [7/N] Add ClientAssignor, Assignment, TopicMetadata and VersionedMetadata (#13537)
This patch adds ClientAssignor, Assignment, TopicMetadata and VersionedMetadata classes.

Reviewers: Christo Lolov <lolovc@amazon.com>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-04-21 11:22:16 +02:00
Jeff Kim 61530d68ce
KAFKA-14869: Bump coordinator value records to flexible versions (KIP-915, Part-2) (#13526)
This patch implemented the second part of KIP-915. It bumps the versions of the value records used by the group coordinator and the transaction coordinator to make them flexible versions. The new versions are not used when writing to the partitions but only when reading from the partitions. This allows downgrades from future versions that will include tagged fields.

Reviewers: David Jacot <djacot@confluent.io>
2023-04-18 15:37:04 +02:00
David Jacot 741a27351e
KAFKA-14462; [6/N] Update Records (#13536)
This patch updates the KIP-848's records.

Reviewers: Christo Lolov <lolovc@amazon.com>, Jason Gustafson <jason@confluent.io>
2023-04-14 14:25:33 +02:00
Ritika Reddy f1e7a64bf6
MINOR: Refine `PartitionAssignor` server-side interface (#13524)
This patch updates the `PartitionAssignor` server-side interface used in the new group coordinator for the new consumer group protocol as follow:
- It switches subscription from topic names to topic ids in order to be closer to the server side implementation.
- It switches assignment from Set to Map<Integer, Set> to be closer to the server side implementation.
- It adds getters for all attributes.
- It makes all attributes final private.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Alexandre Dupriez <alexandre.dupriez@gmail.com>, David Jacot <djacot@confluent.io>
2023-04-14 14:22:51 +02:00
David Jacot 440a53099d
KAFKA-14462; [5/N] Add EventAccumulator (#13505)
This patch adds the `EventAccumulator` which will be used in the runtime of the new group coordinator. The aim of this accumulator is to basically have a queue per __consumer_group partitions and to ensure that events addressed to the same partitions are not processed concurrently. The accumulator is generic so we could reuse it in different context.

Reviewers: Alexandre Dupriez <alexandre.dupriez@gmail.com>, Justine Olshan <jolshan@confluent.io>
2023-04-13 08:33:40 +02:00
David Jacot e1e3900ba1
KAFKA-14462; [4/N] Add Group, Record and Result (#13520)
This patch adds Group, Record and Result.

Reviewers: Jason Gustafson <jason@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-04-12 13:16:49 +02:00
David Jacot 788cc11f45
KAFKA-14462; [3/N] Add `onNewMetadataImage` to `GroupCoordinator` interface (#13357)
The new group coordinator needs to access cluster metadata (e.g. topics, partitions, etc.) and it needs a mechanism to be notified when the metadata changes (e.g. to trigger a rebalance). In KRaft clusters, the easiest is to subscribe to metadata changes via the MetadataPublisher.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-03-08 08:52:01 +01:00
David Jacot 6d37b0f07f
KAFKA-14462; [2/N] Add ConsumerGroupHeartbeart to GroupCoordinator interface (#13329)
This patch adds ConsumerGroupHeartbeat to the GroupCoordinator interface and implements the API in KafkaApis.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-03-07 09:20:03 +01:00
David Jacot 39962eeeb3
KAFKA-14513; Add broker side PartitionAssignor interface (#13202)
This patch adds the broker side `PartitionAssignor` interface as detailed in KIP-848. The interfaces differs a bit from the KIP in the following ways:
* The POJOs are not defined within the interface because the interface is to heavy like this.
* The interface is kept in the `group-coordinator` module for now. We don't want to have it out there until KIP-848 is ready to be released. We will move it to its final destination later.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>, Christo Lolov <lolovc@amazon.com>, Guozhang Wang <wangguoz@gmail.com>
2023-02-10 08:26:00 +01:00
David Jacot 659dd2e49f
KAFKA-14048: Add new `__consumer_offsets` records from KIP-848 (#13203)
This patch adds the new (only the new ones) `__consumer_offsets` records as described in [KIP-848](https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol#KIP848:TheNextGenerationoftheConsumerRebalanceProtocol-Records).

Reviewers: Christo Lolov <lolovc@amazon.com>, Mickael Maison <mickael.maison@gmail.com>
2023-02-09 09:10:28 +01:00
David Jacot 094e343f18
KAFKA-14678; Move `__consumer_offsets` records from `core` to `group-coordinator` (#13200)
This patch moves the current `__consumer_offsets` records from the `core` module to the new `group-coordinator` module.

Reviewers: Christo Lolov <lolovc@amazon.com>, Mickael Maison <mickael.maison@gmail.com>
2023-02-07 09:06:56 +01:00
David Jacot 2e0a005dd4
KAFKA-14367; Add internal APIs to the new `GroupCoordinator` interface (#13112)
This patch migrates all the internal APIs of the current group coordinator to the new `GroupCoordinator` interface. It also makes the current implementation package private to ensure that it is not used anymore.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-01-20 08:38:21 +01:00
David Jacot 700947aa5a
KAFKA-14367; Add `OffsetDelete` to the new `GroupCoordinator` interface (#12902)
This patch adds `OffsetDelete` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-01-17 20:39:01 +01:00
David Jacot a2926edc2f
KAFKA-14367; Add `TxnOffsetCommit` to the new `GroupCoordinator` interface (#12901)
This patch adds `TxnOffsetCommit` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-01-13 09:54:54 +01:00
David Jacot e6669672ef
KAFKA-14367; Add `OffsetCommit` to the new `GroupCoordinator` interface (#12886)
This patch adds `OffsetCommit` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Omnia G H Ibrahim <o.g.h.ibrahim@gmail.com>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-01-12 18:05:49 +01:00
David Jacot 24a86423e9
KAFKA-14367; Add `OffsetFetch` to the new `GroupCoordinator` interface (#12870)
This patch adds OffsetFetch to the new GroupCoordinator interface and updates KafkaApis to use it. 

Reviewers: Philip Nee <pnee@confluent.i>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-01-10 11:38:31 -08:00
David Jacot f8556fe791
KAFKA-14367; Add `DeleteGroups` to the new `GroupCoordinator` interface (#12858)
This patch adds `deleteGroups` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Omnia G H Ibrahim <o.g.h.ibrahim@gmail.com>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>
2022-12-15 09:29:56 +01:00
David Jacot 4a9c0fa4a4
KAFKA-14367; Add `DescribeGroups` to the new `GroupCoordinator` interface (#12855)
This patch adds `describeGroups` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>
2022-12-13 09:19:21 +01:00
David Jacot 854dfb5ffc
KAFKA-14367; Add `ListGroups` to the new `GroupCoordinator` interface (#12853)
This patch adds `listGroups` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>
2022-12-07 20:42:42 +01:00