Commit Graph

52 Commits

Author SHA1 Message Date
Sushant Mahajan 6ef675d08b
MINOR: Change log level for cold snapshot log. (#20209)
CI / build (push) Waiting to run Details
* We INFO log a message, if a share partition could be cold snapshotted.
* However, this may create noise if we have highly partitioned topic
backing the share partition. This will be further exacerbated by
multiple share groups using that topic.
* To reduce log pollution, this PR changes the level to DEBUG.

Reviewers: ShivsundarR <shr@confluent.io>, Andrew Schofield
 <aschofield@confluent.io>
2025-07-21 16:13:37 +01:00
Elizabeth Bennett f81853ca88
KAFKA-19441: encapsulate MetadataImage in GroupCoordinator/ShareCoordinator (#20061)
CI / build (push) Waiting to run Details
The MetadataImage has a lot of stuff in it and it gets passed around in
many places in the new GroupCoordinator. This makes it difficult to
understand what metadata the group coordinator actually relies on and
makes it too easy to use metadata in ways it wasn't meant to be used. 

This change encapsulate the MetadataImage in an interface
(`CoordinatorMetadataImage`) that indicates and controls what metadata
the group coordinator actually uses. Now it is much easier at a glance
to see what dependencies the GroupCoordinator has on the metadata. Also,
now we have a level of indirection that allows more flexibility in how
the GroupCoordinator is provided the metadata it needs.
2025-07-18 08:16:54 +08:00
Sushant Mahajan 05b2601dde
KAFKA-19456: State and leader epoch should not be updated on writes. (#20079)
* If a write request with higher state than seen so far for a
specific share partition arrives at the share coordinator, the code will
create a new share snapshot and also update the internal view of the
state epoch.
* For writes with higher leader epoch, the current records are updated
with that value as well.
* The above is not the expected behavior and only initialize RPCs should
set and alter the state epoch and read RPC should set the leader epoch.
* This PR rectifies the behavior.
* Few tests have been removed.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-07-01 19:57:57 +01:00
Sushant Mahajan ac583ad2c0
KAFKA-19455: Retry persister request on metadata image issues. (#20078)
* If we get an `UNKNOWN_TOPIC_OR_PARTITION` response from the
`ShareCoordinator` is could imply a transient issue where the metadata
image is not upto date.
* In this case it makes sense to retry the request to give time for data
to be available.
* In this PR, we are making the required change.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-07-01 19:47:59 +01:00
Sushant Mahajan cb809e2574
MINOR: Change snapshot epoch type to int32 in schema. (#20016)
CI / build (push) Waiting to run Details
* `SnapshotEpoch` type in `ShareSnapshotValue.json` and
`ShareUpdateValue.json` is currently
`uint16` which might overflow under heavy traffic.
* To be consistent with other epochs, this PR updates the type to
`int32`.

Reviewers: Andrew Schofield <aschofield@confluent.io>, ShivsundarR
 <shr@confluent.io>
2025-06-23 14:15:01 +01:00
Sushant Mahajan 56a6ba2d2e
MINOR: Add retention prop to share group state topic. (#20013)
CI / build (push) Waiting to run Details
*

https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka
states the `retention.ms` property for the `__share_group_state` to be
`-1`.
* This PR makes it explicit when defining the values of those configs.
* Existing test has been updated.

```
$ bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe
--topic __share_group_state

Topic: __share_group_state      TopicId: XCwzZjEGSjm5lUc_BeCrqA
PartitionCount: 50      ReplicationFactor: 1
Configs:
compression.type=producer,
min.insync.replicas=1,
cleanup.policy=delete,
segment.bytes=104857600,
retention.ms=-1
...
```

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-06-22 22:02:35 +01:00
Jhen-Yung Hsu 2e968560e0
MINOR: Cleanup simplify set initialization with Set.of (#19925)
Simplify Set initialization and reduce the overhead of creating extra
collections.

The changes mostly include:
- new HashSet<>(List.of(...))
- new HashSet<>(Arrays.asList(...)) / new HashSet<>(asList(...))
- new HashSet<>(Collections.singletonList()) / new
HashSet<>(singletonList())
- new HashSet<>(Collections.emptyList())
- new HashSet<>(Set.of())

This change takes the following into account, and we will not change to
Set.of in these scenarios:
- Require `mutability` (UnsupportedOperationException).
- Allow `duplicate` elements (IllegalArgumentException).
- Allow `null` elements (NullPointerException).
- Depend on `Ordering`. `Set.of` does not guarantee order, so it could
make tests flaky or break public interfaces.

Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
 <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-06-11 18:36:14 +08:00
Sushant Mahajan df93571f50
KAFKA-19338: Error on read/write of uninitialized share part. (#19861)
- Currently, read and write share state requests were allowed on
uninitialized share partitions (share partitions on which
initializeState has NOT been called). This should not be the case.
- This PR addresses the concern by adding error checks on read and
write. Other requests are allowed (initialize, readSummary, alter).
- Refactored `ShareCoordinatorShardTest` to reduce redundancy and added
some new tests.
- Some request/response classes have also been reformatted.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-06-03 11:26:38 +01:00
Sushant Mahajan 13b5627274
KAFKA-19337: Write state writes snapshot for higher state epoch. (#19843)
- Due to condition on number of updates/snapshot in
`generateShareStateRecord`, share updates gets written for write state
requests even if they have the highest state epoch seen so far.
- A share update cannot record state epoch. As a result, this update
gets missed.
- This PR remedies the issue and adds a test as proof of the fix.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-29 13:45:54 +01:00
Andrew Schofield 5a607db6ea
MINOR: Improve share coordinator record schemas (#19830)
CI / build (push) Waiting to run Details
This PR makes some very small improvements to the record schemas for the
share coordinator.

* It removes the health warnings about incompatible changes. All changes
are compatible now.
* It marks the fields in the values as version 0+, in common with all
other record schemas in Kafka.
Many were already 0+, so this just corrects the outliers.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Sushant Mahajan
<smahajan@confluent.io>
2025-05-28 10:24:17 +01:00
Sushant Mahajan c58de75712
KAFKA-19204: Add timestamp to share state metadata init maps [1/N] (#19781)
1.  Currently, the code allows for retrying any initializing topics in
subsequent heartbeats. This can result in duplicate calls to persister
if multiple share consumers join the same group concurrently.
Furthermore, only one of these will succeed as the others will have a
lower state epoch and will be fenced.
2. The existing change was made in
https://github.com/apache/kafka/pull/19603 to allow for retrying
initialization of initializing topics, in case the original caller was
not able to persist the information in the persister due to a dead
broker/timeout.
3. To prevent multiple calls as well as allow for retry we have
supplemented the timelinehashmap holding the
`ShareGroupStatePartitionMetadataInfo` to also hold the timestamp at
which this record gets replayed.
  a. Now when we get multiple consumers for the same group and topic,
only one of them is allowed to make the persister initialize request and
this information is added to the map when it is replayed. Thus solving
issue 1.
  b. To allow for retries, if an initializing topic is found with a
timestamp which is older than 2*offset_write_commit_ms, that topic will
be allowed to be retried. Here too only one consumer would be able to
retry thus resolving issue 2 as well.
4. Tests have been added wherever applicable and existing ones updated.
5. No record schema changes are involved.
6. The `ShareGroupStatePartitionMetadataInfo` and `InitMapValue` records
have been moved to the `ShareGroup` class for better encapsulation.
7. Some logs have been changed from error to info in
`ShareCoordinatorShard` and extra information is logged.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-23 08:56:05 +01:00
Sanskar Jhajharia 9f293866ab
MINOR: Cleanup Share Coordinator (#19770)
CI / build (push) Waiting to run Details
Now that Kafka Brokers support Java 17, this PR updates the share
coordinator module to get rid of older code. The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
- Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-20 12:33:20 +01:00
Sushant Mahajan 847968e530
KAFKA-19281: Add share enable flag to periodic jobs. (#19721)
* We have a few periodic timer tasks in `ShareCoordinatorService` which
run continuously.
* With the recent introduction of share group enabled config at feature
level, we would like these jobs to stop when the aforementioned feature
is disabled.
* In this PR, we have added the functionality to make that possible.
* Additionally the service has been supplemented with addition of a
static share group config supplier.
* New test has been added.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal
 <apoorvmittal10@gmail.com>
2025-05-15 14:05:06 +01:00
Sushant Mahajan bf53561d16
KAFKA-19201: Handle deletion of user topics part of share partitions. (#19559)
* Currently even if a user topic is deleted, its related records are not
deleted with respect to subscribed share groups from the GC and the SC.
* The event of topic delete is propagated from the
BrokerMetadataPublisher to the coordinators via the
`onPartitionsDeleted` method. This PR leverages this method to issue
cleanup calls to the GC and SC respectively.
* To prevent chaining of futures in the GC, we issue async calls to both
GC and SC independently and the methods take care of the respective
cleanups unaware of the other.
* This method is more efficient and transcends issues related to
timeouts/broker restarts resulting in chained future execution issues.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-13 14:22:17 +01:00
Andrew Schofield 7d027a4d83
KAFKA-19218: Add missing leader epoch to share group state summary response (#19602)
CI / build (push) Waiting to run Details
When the persister is responding to a read share-group state summary
request, it has no way of including the leader epoch in its response,
even though it has the information to hand. This means that the leader
epoch information is not initialised in the admin client operation to
list share group offsets, and this then means that the information
cannot be displayed in kafka-share-groups.sh.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Sushant Mahajan
 <smahajan@confluent.io>
2025-05-06 14:53:12 +01:00
Sushant Mahajan 4558d15856
MINOR: Change info log to debug for scheduled timer tasks. (#19624)
CI / build (push) Waiting to run Details
* We have a 2 perpetual timer tasks in ShareCoordinatorService to do
internal topic cleanup and snapshot cold partitions respectively.
* There are a few info level logs being printed as part of the
procedures. These are introducing noise and are not absolutely
necessary.
* We also move a debug log to error for the prune job.
* To remedy the situation, this PR changes the log level from info to
debug.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield
 <aschofield@confluent.io>
2025-05-03 07:18:17 +01:00
Sushant Mahajan 6fe1598e6b
KAFKA-18170: Add scheduled job to snapshot cold share partitions. (#19443)
* There could be scenarios where share partition records in
`__share_group_state` internal topic are not updated for a while
implying these partitions are basically cold.
* In this situation, the presence of these holds back the
pruner from keeping the topic clean and of manageable size.
* To remedy the situation, we have added a periodic
`setupSnapshotColdPartitions` in `ShareCoordinatorService` which does a
writeAll operation on the associated shards in the coordinator and
forces snapshot creation for any cold partitions. In this way the pruner
can continue.
This job has been added as a timer task.
* A new internal config
`share.coordinator.cold.partition.snapshot.interval.ms` has been
introduced to set the period of the job.
* Any failures are logged and ignored.
* New tests have been added to verify the feature.

Reviewers: PoAn Yang <payang@apache.org>, Andrew Schofield <aschofield@confluent.io>
2025-04-23 11:52:28 +01:00
Sushant Mahajan a6dfde7ce6
KAFKA-18629: Utilize share group partition metadata for delete group. (#19363)
* Currently, the delete share group code flow uses
`group.subscribedTopicNames()` to fetch information about all the share
partitions to which a share group is subscribed to.
* However, this is incorrect since once the group is EMPTY, a
precondition for delete, the aforementioned method will return an empty
list.
* In this PR we have leveraged the `ShareGroupStatePartitionMetadata`
record to grab the `initialized` and `initializing` partitions to build
the delete candidates, which remedies the situation.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-04-10 20:15:13 +01:00
Sushant Mahajan c3b7aa6e64
KAFKA-18170: Add create and write timestamp fields in share snapshot [1/N] (#19432)
* We wish to track the time of creation of the `ShareSnapshot` records
so that automated jobs could force their creation if a share partition
has gone cold (no updates for a specified time interval).
* To accomplish this, we have added 2 new fields `CreateTimestamp` and
`WriteTimestamp` in the `ShareSnapshot` record.
* The former tracks snapshot creation due to regular RPC calls while the
latter will track snapshots created by periodic jobs.
* In this PR we have made the requisite changes.
* This is a first of a series of PRs to create the automated jobs and
associated scaffolding.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-04-10 15:56:58 +01:00
Lucas Brutschy fc2e3dfce9
MINOR: Disallow unused local variables (#18963)
Recently, we found a regression that could have been detected by static
analysis, since a local variable wasn't being passed to a method during
a refactoring, and was left unused. It was fixed in
[7a749b5](7a749b589f),
but almost slipped into 4.0. Unused variables are typically detected by
IDEs, but this is insufficient to prevent these kinds of bugs. This
change enables unused local variable detection in checkstyle for Kafka.

A few notes on the usage:
- There are two situations in which people actually want to have a local
variable but not use it. First, there are `for (Type ignored:
collection)` loops which have to loop `collection.length` number of
times, but that do not use `ignored` in the loop body. These are
typically still easier to read than a classical `for` loop. Second, some
IDEs detect it if a return value of a function such as `File.delete` is
not being used. In this case, people sometimes store the result in an
unused local variable to make ignoring the return value explicit and to
avoid the squiggly lines.
- In Java 22, unsued local variables can be omitted by using a single
underscore `_`. This is supported by checkstyle. In pre-22 versions,
IntelliJ allows such variables to be named `ignored` to suppress the
unused local variable warning. This pattern is often (but not
consistently) used in the Kafka codebase. This is, however, not
supported by checkstyle.

Since we cannot switch to Java 22, yet, and we want to use automated
detection using checkstyle, we have to resort to prefixing the unused
local variables with `@SuppressWarnings("UnusedLocalVariable")`. We have
to apply this in 11 cases across the Kafka codebase. While not being
pretty, I'd argue it's worth it to prevent bugs like the one fixed in
[7a749b5](7a749b589f).

Reviewers: Andrew Schofield <aschofield@confluent.io>, David Arthur
<mumrah@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Bruno
Cadonna <cadonna@apache.org>, Kirk True <ktrue@confluent.io>
2025-03-10 09:37:35 +01:00
Sanskar Jhajharia a206feb4ba
MINOR: Clean up share-coordinator (#19007)
Given that now we support Java 17 on our brokers, this PR replace the use of `Collections.singletonList()` and `Collections.emptyList()` with `List.of()`

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-23 11:27:38 +08:00
Sushant Mahajan 4f28973bd1
KAFKA-18827: Initialize share state, share coordinator impl. [1/N] (#18968)
In this PR, we have added the share coordinator and KafkaApis side impl
of the intialize share group state RPC.
ref:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka#KIP932:QueuesforKafka-InitializeShareGroupStateAPI

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-22 16:12:08 +00:00
Matthias J. Sax 538a60e1b3
MINOR: disallow rawtypes and fail build (#18877)
Cleanup code to avoid rawtype, and add suppressions where necessary.
Change the build to fail on rawtype warning.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-02-19 13:11:49 -08:00
Sushant Mahajan 5235e11d4d
KAFKA-18809 Set min in sync replicas for __share_group_state. (#18922)
- The share.coordinator.state.topic.min.isr config defined in ShareCoordinatorConfig was not being used in the AutoTopicCreationManager.
- The AutoTopicCreationManager calls the ShareCoordinatorService.shareGroupStateTopicConfigs to configs for the topic to create.
- The method ShareCoordinatorService.shareGroupStateTopicConfigs was not setting the supplied config value for share.coordinator.state.topic.min.isr to min.insync.replicas.
- In this PR, we remedy the situation by setting the value
- A test has been added to ShareCoordinatorServiceTest so that this is not repeated for any configs.

Reviewers: poorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 04:22:48 +08:00
David Jacot bf05d2c914
KAFKA-18672; CoordinatorRecordSerde must validate value version (#18749)
CoordinatorRecordSerde does not validate the version of the value to check whether the version is supported by the current version of the software. This is problematic if a future and unsupported version of the record is read by an older version of the software because it would misinterpret the bytes. Hence CoordinatorRecordSerde must throw an error if the version is unknown. This is also consistent with the handling in the old coordinator.

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-02-03 02:19:27 -08:00
Sushant Mahajan f32932cc25
KAFKA-18629: Delete share group state impl [1/N] (#18712)
Reviewers: Christo Lolov <lolovc@amazon.com>, Andrew Schofield <aschofield@confluent.io>
2025-01-28 11:43:01 +00:00
David Jacot b368c38684
KAFKA-18302; Update CoordinatorRecord (#18512)
This patch does a few things:
1) Replace ApiMessageAndVersion by ApiMessage in CoordinatorRecord for the key
2) Leverage the fact that ApiMessage exposes the apiKey. Hence we don't need to specify the key anymore.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-21 18:11:26 +01:00
David Jacot 76bf38a4fd
KAFKA-18604; Update transaction coordinator (#18636)
This patch updates the transaction coordinator record to use the new coordinator record definition.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-21 08:36:23 +01:00
Sushant Mahajan 06a5e258e4
KAFKA-18232: Add share group state topic prune metrics. (#18174)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-01-20 15:17:15 +00:00
Sanskar Jhajharia bcbc72e29b
[KAFKA-16720] AdminClient Support for ListShareGroupOffsets (1/n) (#18571)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-01-20 07:47:14 +00:00
mingdaoy 042da16fd6
KAFKA-18557 streamline codebase with testConfig() (#18582)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-17 08:40:06 +00:00
Sushant Mahajan f1675436e4
MINOR: Change share coord linger ms in line with group coord. (#18572)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-16 13:16:17 +00:00
Apoorv Mittal e12db663f0
KAFKA-18514 Remove server dependency on share coordinator (#18536)
The PR removes dependency of server module on share-coordinator, rather it should be other way. Moved the ShareCoordinatorConfig class from server to share-coordinator.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
2025-01-16 00:47:01 +08:00
David Jacot 87334e6c2e
KAFKA-18308; Update CoordinatorSerde (#18455)
This patch updates the GroupCoordinatorSerde and the ShareGroupCoordinatorSerde to leverage the CoordinatorRecordType to deserialize records. With this, newly added record are automatically picked up. In other words, the serdes work with all defined records without doing anything.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-10 11:17:30 +01:00
David Jacot 7b6e94642a
KAFKA-18303; Update ShareCoordinator to use new record format (#18396)
Following https://github.com/apache/kafka/pull/18261, this patch updates the Share Coordinator to use the new record format.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-01-06 23:59:07 -08:00
Ismael Juma d6f24d3665
Use `instanceof` pattern to avoid explicit cast (#18373)
This feature was introduced in Java 16.

Reviewers: David Arthur <mumrah@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-01-02 09:32:51 -08:00
Sushant Mahajan 4c5ea05ec8
KAFKA-18058: Share group state record pruning impl. (#18014)
In this PR, we've added a class ShareCoordinatorOffsetsManager, which tracks the last redundant offset for each share group state topic partition. We have also added a periodic timer job in ShareCoordinatorService which queries for the redundant offset at regular intervals and if a valid value is found, issues the deleteRecords call to the ReplicaManager via the PartitionWriter. In this way the size of the partitions is kept manageable.

Reviewers: Jun Rao <junrao@gmail.com>, David Jacot <djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2024-12-12 07:38:03 +00:00
yx9o 38e727fe4d
KAFKA-17864: add descriptions to fields in the agreement (#17681)
Improve descriptive information in Kafka protocol documentation.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-12-07 18:47:11 +00:00
Sushant Mahajan 1f26b9607e
MINOR: Perf improvement in share state batch combiner. (#18090)
Change from ArrayList to LinkedList following performance analysis

Reviewers: Andrew Schofield <aschofield@confluent.io>
2024-12-07 18:36:17 +00:00
Sushant Mahajan 42f74a1c3a
KAFKA-17796: Persist higher leaderEpoch in read state call. (#17580)
This PR adds code into the ShareCoordinatorService.readState method to issue a runtime.scheduleWriteOperation call if the incoming read state request holds a valid leaderEpoch value (not -1).

Co-authored-by: TaiJu Wu <tjwu1217@gmail.com>

Reviewers: Andrew Schofield <aschofield@confluent.io>, David Jacot <djacot@confluent.io>
2024-12-07 09:02:03 +00:00
David Jacot a211ee99b5
KAFKA-17593; [7/N] Introduce CoordinatorExecutor (#17823)
This patch introduces the `CoordinatorExecutor` construct into the `CoordinatorRuntime`. It allows scheduling asynchronous tasks from within a `CoordinatorShard` while respecting the runtime semantic. It will be used to asynchronously resolve regular expressions.

The `GroupCoordinatorService` uses a default `ExecutorService` with a single thread to back it at the moment. It seems that it should be sufficient. In the future, we could consider making the number of threads configurable.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Lianet Magrans <lmagrans@confluent.io>
2024-11-19 07:19:22 -08:00
Chirag Wadhwa 9db5ed00a8
KAFKA-16726: Added share.auto.offset.reset dynamic config for share groups (#17573)
This PR adds another dynamic config share.auto.offset.reset fir share groups.


Reviewers:  Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>,  Abhinav Dixit <adixit@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-11 14:36:11 +05:30
Sushant Mahajan 2e2b0a58ed
KAFKA-17914: Update string ref with SharePartitionKey. (#17660)
Currently, we are using the String repr of the shareCoordinator/sharePartition key (groupId:topicId:parition) as defined in kip-932 in a few methods like ShareCoordinator.partitionFor and ShareCoordinatorMetadataCacheHelper.getShareCoordinator.
This has the potential to introduce subtle bugs when incorrect strings are used to invoke these methods. What is perturbing is that the failures might be intermittent.
This PR aims to remedy the situation by changing the type to the concrete SharePartitionKey. This way callers need not be worried about a specific encoding or format of the coordinator key as long as the SharePartitionKey has the correct fields set.
There is one issue - the FIND_COORDINATOR RPC does require the coordinator key to be set as a String in the request body. We can't get around this and have to set the value as String. However, on the KafkaApis handler side we parse this key into a SharePartitionKey again and gain compile time safety. One downside is that we need to split and format the incoming coordinator key in the request but that can be encapsulated at a single location in SharePartitionKey.
Added tests for partitionFor.

Reviewers: Andrew Schofield <aschofield@confluent.io>,  Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-08 15:05:39 +05:30
Andrew Schofield 346fdbafc5
KAFKA-17912 Align string representations of SharePartitionKey (#17656)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-01 15:43:32 +08:00
Sushant Mahajan 5f92f60bff
KAFKA-17329: DefaultStatePersister implementation (#17270)
Adds the DefaultStatePersister and other supporting classes for managing share state.

* Added DefaultStatePersister implementation. This is the entry point for callers who wish to invoke the share state RPC API.
* Added PersisterStateManager which is used by DefaultStatePersister to manage and send the RPCs over the network.
* Added code to BrokerServer and BrokerMetadataPublisher to instantiate the appropriate persister based on the config value for group.share.persister.class.name. If this is not specified, the DefaultStatePersister will be used. To force use of NoOpStatePersister, set the config to empty. This is an internal config, not to be exposed to the end user. This will be used to factory plug the appropriate persister.
* Using this persister, the internal __share_group_state topic will come to life and will be used for persistence of share group info.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>, David Arthur <mumrah@gmail.com>
2024-10-28 14:11:04 -04:00
Sushant Mahajan 5545d717c3
KAFKA-17633: Add share group record formatter and parser. (#17467)
As part of KIP-932, a new internal topic __share_group_state was introduced. There are 2 types of records which are currently being added in this topic - ShareSnapshotKey/Value and ShareUpdateKey/Value
In light of this, we must make the existing tooling like kafka-console-consumer and kafka-dump-log aware of these records for debugging and introspection purposes.
This PR introduces ShareGroupStateMessageFormatter to be used used with kafka-console-consumer and adds an internal class ShareGroupStateMessageParser in DumpLogSegments.scala.
Unit tests have been added to DumpLogSegmentsTest.scala


Reviewers:  Andrew Schofield <aschofield@confluent.io>,  Manikumar Reddy <manikumar.reddy@gmail.com>
2024-10-15 11:44:15 +05:30
Sushant Mahajan d173842d36
KAFKA-17469: Move persister related classes to persister pkg. (#17349)
Reviewers: Andrew Schofield <aschofield@confluent.io>, David Arthur <mumrah@gmail.com>
2024-10-03 11:00:22 -04:00
Sushant Mahajan 7b7eb6243f
KAFKA-17367: Share coordinator persistent batch merging algorithm. [3/N] (#17149)
This patch introduces a merging algorithm for persistent state batches in the share coordinator. 

The algorithm removes any expired batches (lastOffset before startOffset) and then places the rest in a sorted map. It then identifies batch pairs which overlap and combine them while preserving the relative priorities of any intersecting sub-ranges. The resultant batches are placed back into the map. The algorithm ends when no more overlapping pairs can be found.

Reviewers: Andrew Schofield <aschofield@confluent.io>, David Arthur <mumrah@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>, Jun Rao <junrao@gmail.com>
2024-10-02 11:30:51 -04:00
Sushant Mahajan 67f966f348
KAFKA-17469: Moved share external interfaces to share module. (#17262)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, David Arthur <mumrah@gmail.com>
2024-09-24 12:08:01 -04:00
Sushant Mahajan 821c10157d
KAFKA-17367: Introduce share coordinator [2/N] (#17011)
Introduces the share coordinator. This coordinator is built on the new coordinator runtime framework. It 
is responsible for persistence of share-group state in a new internal topic named "__share_group_state".
The responsibility for being a share coordinator is distributed across the brokers in a cluster. 

Reviewers: David Arthur <mumrah@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-09-09 20:01:24 -04:00