120 Commits

Author SHA1 Message Date
David Jacot d066b94c81
MINOR: Fix UpdatedImage and HighWatermarkUpdated events' logs (#15432)
I have noticed the following log when a __consumer_offsets partition migrates away from a broker. It happens because the event is queued up after the event that unloads the state machine. This patch fixes it and fixes another similar one.

```
[2024-02-06 17:14:51,359] ERROR [GroupCoordinator id=1] Execution of UpdateImage(tp=__consumer_offsets-28, offset=13251) failed due to This is not the correct coordinator.. (org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime)
org.apache.kafka.common.errors.NotCoordinatorException: This is not the correct coordinator.
```

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-02-29 07:01:21 -08:00
David Jacot 52289c92be
MINOR: Optimize EventAccumulator (#15430)
`poll(long timeout, TimeUnit unit)` is either used with `Long.MAX_VALUE` or `0`. This patch replaces it with `poll` and `take`. It removes the `awaitNanos` usage.
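
A minimal sketch of the `poll`/`take` split described above, assuming a single lock and condition. `SimpleAccumulator` and its members are hypothetical names used for illustration; this is not the actual `EventAccumulator`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified accumulator (illustrative only).
final class SimpleAccumulator<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Deque<T> queue = new ArrayDeque<>();

    void add(T event) {
        lock.lock();
        try {
            queue.addLast(event);
            // Wake up a single waiting thread, if any.
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    // Non-blocking variant: returns null when nothing is queued.
    T poll() {
        lock.lock();
        try {
            return queue.pollFirst();
        } finally {
            lock.unlock();
        }
    }

    // Blocking variant: waits without a deadline, so no awaitNanos is needed.
    T take() throws InterruptedException {
        lock.lock();
        try {
            while (queue.isEmpty()) {
                notEmpty.await();
            }
            return queue.pollFirst();
        } finally {
            lock.unlock();
        }
    }
}
```

Splitting the two cases keeps the timed wait out of both paths: `poll` never waits and `take` waits indefinitely.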

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-02-28 05:38:02 -08:00
Jeff Kim 0979327520
KAFKA-16306: fix GroupCoordinatorService logger (#15433)
This patch corrects the logger used for GroupCoordinatorService.

Reviewers: Anton Liauchuk <anton93lev@gmail.com>, David Jacot <djacot@confluent.io>
2024-02-27 05:45:55 -08:00
David Jacot 5edf52359a
MINOR: Fix group metadata loading log (#15368)
Spotted the following log: 
```
[2024-02-14 09:59:30,103] INFO [GroupCoordinator id=1] Finished loading of metadata from 39 in __consumer_offsets-4ms with epoch 2 where 39ms was spent in the scheduler. Loaded 0 records which total to 0 bytes. (org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime)
```
The partition and the time are incorrect. This patch fixes it.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Bruno Cadonna <bruno@confluent.io>
2024-02-15 00:19:43 -08:00
David Jacot d24abe0ede
MINOR: Refactor GroupMetadataManagerTest (#15348)
The `GroupMetadataManagerTest` class got a little out of control. We have too many things defined in it. As a first step, this patch extracts all the inner classes. It also extracts all the helper methods. However, the logic is not changed at all.

Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-02-13 23:29:29 -08:00
David Jacot 9865d54c42
MINOR: EventAccumulator should signal one thread when key becomes available (#15340)
`signalAll` was mistakenly used instead of `signal` when a key becomes available in the `EventAccumulator`. The fix relies on existing tests.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-02-09 04:29:04 -08:00
Ritika Reddy 68745ef21a
KAFKA-15460: Add group type filter to List Groups API (#15152)
This patch adds support for filtering groups by type (Classic or Consumer) to both the old and the new group coordinators.

Reviewers: David Jacot <djacot@confluent.io>
2024-02-05 00:56:39 -08:00
David Jacot af41fc3614
KAFKA-16168; Implement GroupCoordinator.onPartitionsDeleted (#15237)
This patch implements `GroupCoordinator.onPartitionsDeleted`, which is called whenever partitions are deleted and must delete all the offsets related to them. The patch uses a naive approach similar to the one used in the old coordinator. It basically iterates over all the regular and pending offsets and deletes the ones matching the deleted partition set.
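
The naive scan can be sketched as follows. This is an illustration under simplifying assumptions: `PartitionDeletionSketch`, the string keys, and the two-level maps are hypothetical and do not mirror the coordinator's real data structures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and types are illustrative, not Kafka's.
final class PartitionDeletionSketch {
    // Collects "groupId/topic-partition" keys whose offsets must be deleted.
    // Both maps are keyed by group id, then by "topic-partition".
    static List<String> offsetsToDelete(
        Map<String, Map<String, Long>> committedOffsets,
        Map<String, Map<String, Long>> pendingOffsets,
        Set<String> deletedPartitions
    ) {
        List<String> keys = new ArrayList<>();
        collect(committedOffsets, deletedPartitions, keys);
        collect(pendingOffsets, deletedPartitions, keys);
        return keys;
    }

    private static void collect(
        Map<String, Map<String, Long>> offsets,
        Set<String> deletedPartitions,
        List<String> keys
    ) {
        // Naive full scan: every group, every partition that has an offset.
        for (Map.Entry<String, Map<String, Long>> group : offsets.entrySet()) {
            for (String topicPartition : group.getValue().keySet()) {
                if (deletedPartitions.contains(topicPartition)) {
                    keys.add(group.getKey() + "/" + topicPartition);
                }
            }
        }
    }
}
```

The cost is proportional to the total number of stored offsets rather than to the number of deleted partitions, which is the trade-off the word "naive" refers to.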

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-02-01 00:28:32 -08:00
David Jacot 0472db2cd3
MINOR: Uniformize error handling/transformation in GroupCoordinatorService (#15196)
This patch uniformizes the error handling in the GroupCoordinatorService with the aim to reuse the same error translation for all operations. It also ensures that exceptions are unwrapped if needed.

Reviewers: Dongnuo Lyu <dlyu@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-01-30 23:23:58 -08:00
DL1231 82920ffad0
KAFKA-16095: Update list group state type filter to include the states for the new consumer group type (#15211)
When using `--list --state`, the currently accepted values correspond to the classic group type states. This patch adds the new states introduced by KIP-848. It also makes the matching on the server case insensitive.
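
A small sketch of case-insensitive state matching. `StateFilterSketch` is a hypothetical helper, and treating an empty filter as "match everything" is an assumption made for the example.

```java
import java.util.Set;

// Hypothetical helper, for illustration only.
final class StateFilterSketch {
    // Assumption: an empty filter means that no filtering is requested.
    static boolean matches(String groupState, Set<String> requestedStates) {
        if (requestedStates.isEmpty()) {
            return true;
        }
        for (String requested : requestedStates) {
            // Compare without regard to case.
            if (requested.equalsIgnoreCase(groupState)) {
                return true;
            }
        }
        return false;
    }
}
```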

Co-authored-by: d00791190 <dinglan6@huawei.com>

Reviewers: Ritika Reddy <rreddy@confluent.io>, David Jacot <djacot@confluent.io>
2024-01-29 07:19:05 -08:00
David Jacot e7fa0edd63
KAFKA-14505; [8/8] Update offset delete paths (#15221)
This is the last patch to complete the implementation of the transactional offsets. This patch updates the following paths:
* delete offsets - the patch ensures that a tombstone is written for pending transactional offsets too.
* delete all offsets - the patch ensures that all pending transactional offsets are deleted too.
* expire offsets - the patch ensures that an offset for a partition is not expired if there is a pending transaction.
* replay offset record - the patch ensures that all pending transactional offsets are removed when a tombstone is received.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Dongnuo Lyu <dlyu@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-01-26 05:16:22 -08:00
Dongnuo Lyu c6194bbb0a
MINOR: populate TopicName in ConsumerGroupDescribe (#15205)
The patch populates the topic name of `ConsumerGroupDescribeResponseData.TopicPartitions` from the corresponding topic id in `ConsumerGroupDescribe`.

Reviewers: David Jacot <djacot@confluent.io>
2024-01-25 05:16:33 -08:00
Apoorv Mittal 208f9e7765
KAFKA-15813: Evict client instances from cache (KIP-714) (#15234)
KIP-714 requires a client instance cache in the broker with a time-based eviction policy, so that client instances which are not actively sending metrics are evicted. The KIP mentions: "This client instance specific state is maintained in broker memory up to MAX(60*1000, PushIntervalMs * 3) milliseconds."
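
The retention bound quoted from the KIP can be written down directly. `ClientInstanceTtlSketch` and its method names are hypothetical; only the `MAX(60*1000, PushIntervalMs * 3)` formula comes from the text above.

```java
// Hypothetical helper, for illustration only.
final class ClientInstanceTtlSketch {
    // MAX(60*1000, PushIntervalMs * 3), in milliseconds.
    static long ttlMs(long pushIntervalMs) {
        return Math.max(60 * 1000L, pushIntervalMs * 3);
    }

    // An instance is a candidate for eviction once it has been silent
    // for longer than its time-to-live.
    static boolean isExpired(long lastSeenMs, long nowMs, long pushIntervalMs) {
        return nowMs - lastSeenMs > ttlMs(pushIntervalMs);
    }
}
```

With a 5 second push interval the one minute floor applies; with a 30 second push interval the state is kept for 90 seconds.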

Reviewers: Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>
2024-01-23 15:06:02 -08:00
David Jacot 4d6a422e86
KAFKA-14505; [7/N] Always materialize the most recent committed offset (#15183)
When transactional offset commits are eventually committed, we must always keep the most recent committed offset when we have a mix of transactional and regular offset commits. We achieve this by storing the offset of the offset commit record alongside the committed offset in memory. Without preserving the commit record offset, compaction of the __consumer_offsets topic itself may result in the wrong offset commit being materialized.
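
A sketch of the idea, assuming that the record offset is the only tie-breaker. `CommittedOffsetsSketch`, `OffsetAndRecordOffset`, and `maybeMaterialize` are hypothetical names and not the coordinator's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, for illustration only.
final class CommittedOffsetsSketch {
    static final class OffsetAndRecordOffset {
        final long committedOffset; // the consumer's committed position
        final long recordOffset;    // log offset of the commit record itself

        OffsetAndRecordOffset(long committedOffset, long recordOffset) {
            this.committedOffset = committedOffset;
            this.recordOffset = recordOffset;
        }
    }

    private final Map<String, OffsetAndRecordOffset> offsets = new HashMap<>();

    // Materializes the commit only if its record is newer in the log than
    // the one currently stored. Returns true if the entry was updated.
    boolean maybeMaterialize(String topicPartition, long committedOffset, long recordOffset) {
        OffsetAndRecordOffset current = offsets.get(topicPartition);
        if (current != null && current.recordOffset >= recordOffset) {
            return false;
        }
        offsets.put(topicPartition, new OffsetAndRecordOffset(committedOffset, recordOffset));
        return true;
    }

    long committedOffset(String topicPartition) {
        return offsets.get(topicPartition).committedOffset;
    }
}
```

For example, a transactional commit written at record offset 10 but completed after a regular commit at record offset 20 does not overwrite the regular commit.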

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-01-22 23:26:40 -08:00
Omnia Ibrahim 62ce551826
KAFKA-15853: Move KafkaConfig.Defaults to server module (#15158)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>, David Jacot <djacot@confluent.io>, Nikolay <NIzhikov@gmail.com>
2024-01-22 15:29:11 +01:00
David Jacot cf90382fb9
KAFKA-16147; Partition is assigned to two members at the same time (#15212)
We had a case where a partition got assigned to two members and we found a bug in the partition epochs bookkeeping. Basically, when a member has a partition pending revocation re-assigned to it before the revocation is completed, the partition epoch is lost. Here is an example of such a transition:

```
[2024-01-16 12:10:52,613] INFO [GroupCoordinator id=1 topic=__consumer_offsets partition=7] [GroupId rdkafkatest_rnd53b4eb0c2de343_0113u] Member M2 transitioned from CurrentAssignment(memberEpoch=11, previousMemberEpoch=9, targetMemberEpoch=14, state=revoking, assignedPartitions={}, partitionsPendingRevocation={EnZMikZURKiUoxZf0rozaA=[0, 1, 2, 3, 4, 5, 6, 7]}, partitionsPendingAssignment={IKXGrFR1Rv-Qes7Ummas6A=[0, 5]}) to CurrentAssignment(memberEpoch=15, previousMemberEpoch=11, targetMemberEpoch=15, state=stable, assignedPartitions={EnZMikZURKiUoxZf0rozaA=[0, 1, 2, 3, 4, 5, 6, 7]}, partitionsPendingRevocation={}, partitionsPendingAssignment={}). (org.apache.kafka.coordinator.group.GroupMetadataManager)
```

This patch fixes the bug and also strengthens the partition epochs bookkeeping to not accept such invalid transitions.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-01-22 01:16:20 -08:00
Jeff Kim 96f852f9e7
MINOR: log new coordinator partition load schedule time (#15017)
The current load summary exposes the time from when the partition load operation is scheduled to when the load completes. We are missing the information of how long the scheduled operation stays in the scheduler. Log that information.

Reviewers: David Jacot <djacot@confluent.io>
2024-01-18 02:28:17 -08:00
David Arthur 7bf7fd99a5
KAFKA-16078: Be more consistent about getting the latest MetadataVersion
This PR creates MetadataVersion.latestTesting to represent the highest metadata version (which may be unstable) and MetadataVersion.latestProduction to represent the latest version that should be used in production. It fixes a few cases where the broker was advertising that it supported the testing versions even when unstable metadata versions had not been configured.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
2024-01-17 14:59:22 -08:00
David Jacot 3a6e699f13
KAFKA-16118; Coordinator unloading fails when replica is deleted (#15182)
When a replica is deleted, the unloading procedure of the coordinator is called with an empty leader epoch. However, the current implementation of the new group coordinator throws an exception in this case. My bad. This patch updates the logic to handle it correctly.

We discovered the bug in our testing environment. We will add a system test or an integration test in a subsequent patch to better exercise this path.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-01-12 15:34:52 -08:00
David Jacot 6b9cb5ccbf
KAFKA-14505; [5/N] Add `UNSTABLE_OFFSET_COMMIT` error support (#15155)
This patch adds `UNSTABLE_OFFSET_COMMIT` error support in the new group coordinator. `UNSTABLE_OFFSET_COMMIT` errors are returned for partitions with unstable offset commits. Here unstable means that there are ongoing transactions.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-01-12 05:33:39 -08:00
Divij Vaidya 65424ab484
MINOR: New year code cleanup - include final keyword (#15072)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Sagar Rao <sagarmeansocean@gmail.com>
2024-01-11 17:53:35 +01:00
David Jacot 6ff21ee1e0
MINOR: Disallow using a group id with only whitespaces in the new consumer group protocol (#15173)
This patch strengthens the validation of the group id when the new consumer group protocol is used.

Reviewers: Divij Vaidya <diviv@amazon.com>
2024-01-11 07:04:18 -08:00
David Jacot a8203f9c7a
KAFKA-14505; [4/N] Wire transaction verification (#15142)
This patch wires the transaction verification in the new group coordinator. It basically calls the verification path before scheduling the write operation. If the verification fails, the error is returned to the caller.

Note that the patch uses `appendForGroup`. I suppose that we will move away from using it when https://github.com/apache/kafka/pull/15087 is merged.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-01-11 04:58:57 -08:00
Omnia Ibrahim dba789dc93
KAFKA-15853: Move OffsetConfig to group-coordinator module (#15161)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, David Jacot <djacot@confluent.io>, Nikolay <nizhikov@apache.org>
2024-01-11 10:19:42 +01:00
Jeff Kim cd3b3d9804
MINOR: fix custom retry backoff in new group coordinator (#15170)
When a retryable write operation fails, we retry with the default 500ms backoff. If a custom retry backoff was used to originally schedule the operation, we should retry with the same custom backoff instead of the default.
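
The backoff selection can be sketched as below. `RetryBackoffSketch` and the use of `OptionalLong` are assumptions made for the example; only the 500 ms default comes from the description above.

```java
import java.util.OptionalLong;

// Hypothetical helper, for illustration only.
final class RetryBackoffSketch {
    static final long DEFAULT_RETRY_BACKOFF_MS = 500L;

    // Reuse the backoff the operation was originally scheduled with,
    // falling back to the default when none was given.
    static long retryBackoffMs(OptionalLong customBackoffMs) {
        return customBackoffMs.orElse(DEFAULT_RETRY_BACKOFF_MS);
    }
}
```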

Reviewers: David Jacot <djacot@confluent.io>
2024-01-11 00:28:32 -08:00
Jeff Kim ac7ddc7d46
MINOR: Remove classic group preparing rebalance sensor (#15143)
Remove "group-rebalance-rate" and "group-rebalance-count" metrics from the new coordinator as these are not part of KIP-848.

Reviewers: David Jacot <djacot@confluent.io>
2024-01-09 01:10:21 -08:00
vamossagar12 d5aa341a18
MINOR: Fix flaky test GroupMetadataManagerTest.testStaticMemberGetsBackAssignmentUponRejoin (#15100)
Reviewers: Divij Vaidya <diviv@amazon.com>

---------

Co-authored-by: Sagar Rao <sagarrao@Sagars-MacBook-Pro.local>
2023-12-31 12:47:14 +01:00
David Jacot 98aca56ee5
KAFKA-16040; Rename `Generic` to `Classic` (#15059)
People have raised concerns about using `Generic` as a name to designate the old rebalance protocol. We considered using `Legacy` but discarded it because there are still applications, such as Connect, using the old protocol. We settled on using `Classic` for the `Classic Rebalance Protocol`.

The changes in this patch are extremely mechanical. It basically replaces the occurrences of `generic` by `classic`.

Reviewers: Divij Vaidya <diviv@amazon.com>, Lucas Brutschy <lbrutschy@confluent.io>
2023-12-21 13:39:17 -08:00
David Jacot 79757b3081
KAFKA-14505; [3/N] Wire WriteTxnMarkers API (#14985)
This patch wires the handling of markers written by the transaction coordinator via the WriteTxnMarkers API. In the old group coordinator, the markers are written to the logs and the group coordinator is informed to materialize the changes as a second step if the writes were successful. This approach does not really work with the new group coordinator for two main reasons: 1) The second step would actually fail while the coordinator is loading and there is no guarantee that the loading has picked up the write or not; 2) It does not fit well with the new memory model where the state is snapshotted by offset. In both cases, it seems that having a single writer to the `__consumer_offsets` partitions is more robust and preferable.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-12-21 10:59:41 -08:00
Jeff Kim 4613286076
KAFKA-16030: new group coordinator should check if partition goes offline during load (#15043)
The new coordinator stops loading if the partition goes offline during load. However, the partition is still considered active. Instead, we should return NOT_LEADER_OR_FOLLOWER exception during load.

Another change is that we only want to invoke CoordinatorPlayback#updateLastCommittedOffset if the current offset (last written offset) is greater than or equal to the current high watermark. This is to ensure that in the case the high watermark is ahead of the current offset, we don't clear snapshots prematurely.

Reviewers: David Jacot <djacot@confluent.io>
2023-12-21 06:17:35 -08:00
Jeff Kim f3038d5e73
KAFKA-15870: Move new group coordinator metrics from Yammer to Metrics (#14848)
This patch moves all the newly introduced metrics to the Kafka Metrics.

Reviewers: David Jacot <djacot@confluent.io>
2023-12-19 23:37:40 -08:00
Jeff Kim 3bd8ec16f6
MINOR: Transform new coordinator error before returning to client (#15001)
This was missing from https://issues.apache.org/jira/browse/KAFKA-14500. The existing coordinator transforms the log append error before returning to client. Apply the same transformation.

Reviewers: David Jacot <djacot@confluent.io>
2023-12-15 06:33:25 -08:00
vamossagar12 a1e985d22f
KAFKA-15237: Implement write operation timeout (#14981)
This patch ensures that `offset.commit.timeout.ms` is enforced. It does so by adding a timeout to the CoordinatorWriteEvent.

Reviewers: David Jacot <djacot@confluent.io>
2023-12-13 11:30:53 -08:00
Jeff Kim 0d9ee03742
KAFKA-15981: update Group size only when groups size changes (#14988)
Currently, we increment generic group metrics whenever we create a new Group object when we load a partition. This is incorrect as the partition may contain several records for the same group if they are in the active segment or if the segment has not yet been compacted.

The same applies to removing groups; we can possibly have multiple group tombstone records. Instead, only increment the metric if we created a new group and only decrement the metric if the group exists.
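
A sketch of the counting rule, assuming that a null value stands for a tombstone. `GroupCountSketch` and `replay` are hypothetical names and the string metadata is a placeholder for the real group state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, for illustration only.
final class GroupCountSketch {
    private final Map<String, String> groups = new HashMap<>();
    private long numGroups = 0;

    // Replays one record; a null metadata value represents a tombstone.
    void replay(String groupId, String metadata) {
        if (metadata == null) {
            // Decrement only if the group actually existed.
            if (groups.remove(groupId) != null) {
                numGroups--;
            }
        } else {
            // Increment only if the group was newly created.
            if (groups.put(groupId, metadata) == null) {
                numGroups++;
            }
        }
    }

    long numGroups() {
        return numGroups;
    }
}
```

Replaying two records and then two tombstones for the same group leaves the counter at 0 rather than drifting.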

Reviewers: David Jacot <djacot@confluent.io>
2023-12-13 00:01:56 -08:00
David Jacot 131581a2b4
MINOR: Remove `SubscribedTopicRegex` field from `ConsumerGroupHeartbeatRequest` (#14956)
The support for regular expressions has not been implemented yet in the new consumer group protocol. This patch removes the `SubscribedTopicRegex` from the `ConsumerGroupHeartbeatRequest` in preparation for 3.7. It seems better to bump the version and add it back when we implement the feature, as part of https://issues.apache.org/jira/browse/KAFKA-14517, instead of having an unused field in the request.

Reviewers: Sagar Rao <sagarmeansocean@gmail.com>, Justine Olshan <jolshan@confluent.io>
2023-12-10 23:53:08 -08:00
David Jacot 522c2864cd
KAFKA-14505; [2/N] Implement TxnOffsetCommit API (#14845)
This patch implements the TxnOffsetCommit API. When a transactional offset commit is received, it is stored in the pending transactional offsets structure and waits there until the transaction is committed or aborted. Note that the handling of the transaction completion is not implemented in this patch.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-12-07 02:51:22 -08:00
Jeff Kim b888fa1ec9
KAFKA-15910: New group coordinator needs to generate snapshots while loading (#14849)
After the new coordinator loads a __consumer_offsets partition, it logs the following exception when making a read operation (fetch/list groups, etc):

```
java.lang.RuntimeException: No in-memory snapshot for epoch 740745. Snapshot epochs are:
at org.apache.kafka.timeline.SnapshotRegistry.getSnapshot(SnapshotRegistry.java:178)
at org.apache.kafka.timeline.SnapshottableHashTable.snapshottableIterator(SnapshottableHashTable.java:407)
at org.apache.kafka.timeline.TimelineHashMap$ValueIterator.<init>(TimelineHashMap.java:283)
at org.apache.kafka.timeline.TimelineHashMap$Values.iterator(TimelineHashMap.java:271)
```
 
This happens because we don't have a snapshot at the last updated high watermark after loading. We cannot generate a snapshot at the high watermark after loading all batches because it may contain records that have not yet been committed. We also don't know where the high watermark will advance up to so we need to generate a snapshot for each offset the loader observes to be greater than the current high watermark. Then once we add the high watermark listener and update the high watermark we can delete all of the older snapshots. 
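
The snapshot bookkeeping described above can be sketched as follows. `LoadSnapshotSketch` and its methods are hypothetical, and a sorted set of offsets stands in for the real snapshot registry.

```java
import java.util.TreeSet;

// Hypothetical sketch, for illustration only.
final class LoadSnapshotSketch {
    // Offsets at which a snapshot exists, kept sorted.
    private final TreeSet<Long> snapshotOffsets = new TreeSet<>();

    // Called by the loader after each batch: snapshot every offset that is
    // beyond the high watermark known at that time.
    void onBatchLoaded(long offsetAfterBatch, long highWatermark) {
        if (offsetAfterBatch > highWatermark) {
            snapshotOffsets.add(offsetAfterBatch);
        }
    }

    // Once the high watermark is known, older snapshots can be dropped.
    void onHighWatermarkUpdated(long newHighWatermark) {
        snapshotOffsets.headSet(newHighWatermark, false).clear();
    }

    TreeSet<Long> snapshotOffsets() {
        return snapshotOffsets;
    }
}
```

Whichever offset the high watermark advances to, a snapshot at that offset is available for reads, and everything below it is discarded.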

Reviewers: David Jacot <djacot@confluent.io>
2023-12-06 08:38:05 -08:00
David Jacot 34e1dbbaba
MINOR: Add Uniform assignor to the default config (#14826)
This patch adds the `Uniform` assignor to the default list of supported assignors. It also makes small changes in the code.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-12-05 00:32:50 -08:00
Jeff Kim 8038bc9342
KAFKA-14987 [2/2]; customize retry backoff for group/offsets expiration (#14870)
The group expiration log becomes noisy when we encounter a retry-able error as the retry backoff is fixed to 500 ms. Allow customizable retry backoff so that even in the case of failure we have a longer delay. The current default for offsetsRetentionCheckIntervalMs is set to 10 minutes so even if the operation fails we will "retry" after 10 minutes. 

Reviewers: David Jacot <djacot@confluent.io>
2023-12-05 00:18:56 -08:00
Max Riedel b7c99e22a7
KAFKA-14509: [2/N] Implement server side logic for ConsumerGroupDescribe API (#14544)
This patch implements the ConsumerGroupDescribe API.

Reviewers: David Jacot <djacot@confluent.io>
2023-12-04 07:19:28 -08:00
David Jacot 5ae0b49839
KAFKA-14505; [1/N] Add support for transactional writes to CoordinatorRuntime (#14844)
This patch adds support for transactional writes to the CoordinatorRuntime framework. This mainly consists in adding CoordinatorRuntime#scheduleTransactionalWriteOperation and in adding the producerId and producerEpoch to various interfaces. The patch also extends the CoordinatorLoaderImpl and the CoordinatorPartitionWriter accordingly.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-11-29 08:54:23 -08:00
vamossagar12 bb1c4465c9
KAFKA-14516: [1/N] Static Member leave, join, re-join request using ConsumerGroupHeartbeats (#14432)
This patch adds support for static membership to the new consumer group protocol. A static member can join, re-join, temporarily leave, and leave. When a member leaves with the expectation to rejoin, it must rejoin within the session timeout. It is kicked out of the consumer group otherwise.

Reviewers: David Jacot <djacot@confluent.io>
2023-11-28 10:08:16 -08:00
Dongnuo Lyu 891dd2a58a
KAFKA-15756: [1/2] Migrate existing integration tests to run old protocol in new coordinator (#14781)
This patch updates the testing framework to support running tests with kraft and the new group coordinator introduced in the context of KIP-848. This can be done by using `kraft+kip-848` as a quorum. Note that this is temporary until we make it the default and only option in 4.0. To verify this, this patch also enables kraft and kraft+kip-848 in PlaintextConsumerTest and its parent classes.

Reviewers: David Jacot <djacot@confluent.io>
2023-11-23 02:05:54 -08:00
Ritika Reddy 55017a4f68
KAFKA-15484: General Rack Aware Assignor (#14481)
This patch adds the second part of the Uniform Assignor, used when the subscriptions of each member in a consumer group are different.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>
2023-11-23 01:18:50 -08:00
Jeff Kim 14065a7fdc
MINOR: fix MultiThreadedEventProcessorTest.testMetrics() (#14802)
Reviewers: David Jacot <djacot@confluent.io>
2023-11-21 00:16:32 -08:00
Jeff Kim 07fee62afe
KAFKA-14519; [2/N] New coordinator metrics (#14387)
This patch copies over existing metrics and adds new consumer group metrics to the new GroupCoordinatorService.

Now that each coordinator is responsible for a topic partition, this patch introduces a GroupCoordinatorMetrics that records gauges for global metrics such as the number of generic groups in PreparingRebalance state, etc. For GroupCoordinatorShard specific metrics, GroupCoordinatorMetrics will activate new GroupCoordinatorMetricsShards that will be responsible for incrementing/decrementing TimelineLong objects and then aggregate the total amount across all shards.

As the CoordinatorRuntime/CoordinatorShard does not care about group metadata, we have introduced a CoordinatorMetrics.java/CoordinatorMetricsShard.java so that in the future transaction coordinator metrics can also be onboarded in a similar fashion.

Main files to look at:
* GroupCoordinatorMetrics.java
* GroupCoordinatorMetricsShard.java
* CoordinatorMetrics.java
* CoordinatorMetricsShard.java
* CoordinatorRuntime.java

Metrics to add after #14408 is merged:
* offset deletions sensor (OffsetDeletions); Meter(offset-deletion-rate, offset-deletion-count)

Metrics to add after https://issues.apache.org/jira/browse/KAFKA-14987 is merged:
* offset expired sensor (OffsetExpired); Meter(offset-expiration-rate, offset-expiration-count)

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-11-20 21:38:50 -08:00
Dongnuo Lyu b1796ce6d2
KAFKA-15849: Fix ListGroups API when runtime partition size is zero (#14785)
When the group coordinator does not host any __consumer_offsets partitions, the existing ListGroup implementation won't schedule any operation, thus a `new CompletableFuture<>()` is returned directly and never gets completed. This patch fixes the issue.
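
A sketch of how such a future can end up never completing, and of one possible guard. `ListGroupsSketch` and the count-down aggregation are assumptions made for illustration and are not the actual `GroupCoordinatorService` code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, for illustration only.
final class ListGroupsSketch {
    static CompletableFuture<List<String>> listGroups(
        List<CompletableFuture<List<String>>> perPartitionResults
    ) {
        // Guard: with no partitions, no callback below would ever fire,
        // so the result must be completed right away.
        if (perPartitionResults.isEmpty()) {
            return CompletableFuture.completedFuture(Collections.emptyList());
        }

        CompletableFuture<List<String>> result = new CompletableFuture<>();
        List<String> groups = Collections.synchronizedList(new ArrayList<>());
        AtomicInteger remaining = new AtomicInteger(perPartitionResults.size());

        for (CompletableFuture<List<String>> partitionResult : perPartitionResults) {
            partitionResult.whenComplete((partitionGroups, error) -> {
                if (error != null) {
                    result.completeExceptionally(error);
                } else {
                    groups.addAll(partitionGroups);
                    // The last partition to answer completes the result.
                    if (remaining.decrementAndGet() == 0) {
                        result.complete(new ArrayList<>(groups));
                    }
                }
            });
        }
        return result;
    }
}
```

Without the early return, the loop body never runs for an empty input and the caller waits on `result` forever.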

Reviewers: David Jacot <djacot@confluent.io>
2023-11-17 04:48:02 -08:00
Jay Wang a64037cdef
MINOR: Fix GroupCoordinatorShardTest stubbing (#14637)
This patch fixes incorrect stubs in GroupCoordinatorShardTest.

Reviewers: David Jacot <djacot@confluent.io>
2023-11-14 23:45:11 -08:00
Ritika Reddy 1e7e9ce918
KAFKA-14515: Optimized Uniform Rack Aware Assignor (#14416)
This patch adds the Optimized Uniform Rack Aware Assignor.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>
2023-11-05 22:33:48 -08:00
Dongnuo Lyu 7bdd1a015e
KAFKA-15647: Fix the different behavior in error handling between the old and new group coordinator (#14589)
In `KafkaApis.scala`, we build the API response differently if exceptions are thrown during the API execution. Since the new group coordinator only populates the response with error code instead of throwing an exception when an error occurs, there may be different behavior between the existing group coordinator and the new one.

This patch:
- Fixes the response building in `KafkaApis.scala` for the two APIs affected by such difference -- OffsetFetch and OffsetDelete.
- In `GroupCoordinatorService.java`, returns a response with error code instead of a failed future when the coordinator is not active.

Reviewers: David Jacot <djacot@confluent.io>
2023-10-31 03:11:52 -07:00