Commit Graph

45 Commits

Author SHA1 Message Date
Logan Zhu 24d03a18ef
KAFKA-19517: Include control records in LoadSummary#numRecords (#20206)
## Summary
jira: https://issues.apache.org/jira/browse/KAFKA-19517
Ensure `LoadSummary#numRecords` counts all records, including control
batches, to maintain consistency with numBytes.

## Test
`testLoading` now verifies `numRecords`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
 <frankvicky@apache.org>
2025-07-21 15:12:18 +08:00
Logan Zhu 1b351ad6e2
MINOR: Remove unnecessary dependencies from coordinator-common (follow up to pr#20089) (#20194)
CI / build (push) Waiting to run Details
This PR removes the dependencies on `core` and `scala-library` from the
`coordinator-common` module, as a follow-up to
https://github.com/apache/kafka/pull/20089.

These dependencies have been removed from tests, and the previously
added import-control relaxations have been reverted accordingly.

Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang
<s7133700@gmail.com>
2025-07-19 19:08:33 +08:00
Elizabeth Bennett f81853ca88
KAFKA-19441: encapsulate MetadataImage in GroupCoordinator/ShareCoordinator (#20061)
CI / build (push) Waiting to run Details
The MetadataImage has a lot of stuff in it and it gets passed around in
many places in the new GroupCoordinator. This makes it difficult to
understand what metadata the group coordinator actually relies on and
makes it too easy to use metadata in ways it wasn't meant to be used. 

This change encapsulate the MetadataImage in an interface
(`CoordinatorMetadataImage`) that indicates and controls what metadata
the group coordinator actually uses. Now it is much easier at a glance
to see what dependencies the GroupCoordinator has on the metadata. Also,
now we have a level of indirection that allows more flexibility in how
the GroupCoordinator is provided the metadata it needs.
2025-07-18 08:16:54 +08:00
Logan Zhu d03878c7fb
MINOR: Migrate CoordinatorLoaderImpl from Scala to Java (#20089)
CI / build (push) Waiting to run Details
### Summary of Changes

- Rewrote both `CoordinatorLoaderImpl` and `CoordinatorLoaderImplTest`
in Java, replacing their original Scala implementations.
- Removed the direct dependency on `ReplicaManager` and replaced it with
functional interfaces for `partitionLogSupplier` and
`partitionLogEndOffsetSupplier`
- Preserved original logic and test coverage during migration.

Reviewers: TaiJuWu <tjwu1217@gmail.com>, Ken Huang <s7133700@gmail.com>,
 TengYao Chi <frankvicky@apache.org>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-07-18 01:51:46 +08:00
Chang-Chi Hsu 4a2d4ee76a
MINOR: Replace Long with primitive long for CoordinatorPlayback (#20171)
Reviewers: Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping
 Tsai <chia7712@gmail.com>
2025-07-18 01:34:00 +08:00
Ming-Yen Chung e3276ae029
KAFKA-19427 Allow the coordinator to grow its buffer dynamically (#20040)
CI / build (push) Waiting to run Details
* Coordinator starts with a smaller buffer, which can grow as needed.

* In freeCurrentBatch, release the appropriate buffer:
  * The Coordinator recycles the expanded buffer
(`currentBatch.builder.buffer()`), not `currentBatch.buffer`, because
`MemoryBuilder` may allocate a new `ByteBuffer` if the existing one
isn't large enough.

  * There are two cases that buffer may exceeds `maxMessageSize`      1.
If there's a single record whose size exceeds `maxMessageSize` (which,
so far, is derived from `max.message.bytes`) and the write is in
`non-atomic` mode, it's still possible for the buffer to grow beyond
`maxMessageSize`. In this case, the Coordinator should revert to using a
smaller buffer afterward.      2. Coordinator do not recycles the buffer
that larger than `maxMessageSize`. If the user dynamically reduces
`maxMessageSize` to a value even smaller than `INITIAL_BUFFER_SIZE`, the
Coordinator should avoid recycling any buffer larger than
`maxMessageSize` so that Coordinator can allocate the smaller buffer in
the next round.

* Add tests to verify the above scenarios.

Reviewers: David Jacot <djacot@confluent.io>, Sean Quah
<squah@confluent.io>, Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, TaiJuWu <tjwu1217@gmail.com>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-07-16 22:06:33 +08:00
Sean Quah 08eda2ebed
KAFKA-19445: Fix coordinator runtime metrics sharing sensors (#20062)
When sensors are shared between different metric groups, data from all
groups is combined and added to all metrics under each sensor. This
means that different metric groups will report the same values for their
metrics.

Prefix sensor names with metric group names to isolate metric groups.

Reviewers: Yung <yungyung7654321@gmail.com>, Sushant Mahajan
<smahajan@confluent.io>, Dongnuo Lyu <dlyu@confluent.io>, TengYao Chi
<frankvicky@apache.org>
2025-06-30 15:14:39 +08:00
Mason Chen d442c31e92
KAFKA-19402: Typo in EventAccumulator.java (#19951)
CI / build (push) Waiting to run Details
Fix multiple typo and grammar issues in EventAccumulator.java

Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Lianet Magrans
 <lmagrans@confluent.io>

---------

Co-authored-by: Lianet Magrans <98415067+lianetm@users.noreply.github.com>
2025-06-20 15:34:02 -04:00
Jhen-Yung Hsu 2e968560e0
MINOR: Cleanup simplify set initialization with Set.of (#19925)
Simplify Set initialization and reduce the overhead of creating extra
collections.

The changes mostly include:
- new HashSet<>(List.of(...))
- new HashSet<>(Arrays.asList(...)) / new HashSet<>(asList(...))
- new HashSet<>(Collections.singletonList()) / new
HashSet<>(singletonList())
- new HashSet<>(Collections.emptyList())
- new HashSet<>(Set.of())

This change takes the following into account, and we will not change to
Set.of in these scenarios:
- Require `mutability` (UnsupportedOperationException).
- Allow `duplicate` elements (IllegalArgumentException).
- Allow `null` elements (NullPointerException).
- Depend on `Ordering`. `Set.of` does not guarantee order, so it could
make tests flaky or break public interfaces.

Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
 <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-06-11 18:36:14 +08:00
Abhinav Dixit caf4a6cc5f
KAFKA-19216: Eliminate flakiness in kafka.server.share.SharePartitionTest (#19639)
### About
11 of the test cases in `SharePartitionTest` have failed at least once
in the past 28 days.

https://develocity.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe%2FLondon&tests.container=kafka.server.share.SharePartitionTest
Observing the flakiness, they seem to be caused due to the usage of
`SystemTimer` for various acquisition lock timeout related tests. I have
replaced the usage of `SystemTimer` with `MockTimer` and also improved
the `MockTimer` API with regard to removing the timer task entries that
have already been cancelled.
Also, this has reduced the time taken to run `SharePartitionTest` from
~6 sec to ~1.5 sec

### Testing
The testing has been done with the help of already present unit tests in
Apache Kafka.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-05-05 20:04:22 +01:00
David Jacot 6e26ec06bb
MINOR: Update GroupCoordinator interface to use AuthorizableRequestContext instead of RequestContext (#19485)
This patch updates the `GroupCoordinator` interface to use
`AuthorizableRequestContext` instead of using `RequestContext`. It makes
the interface more generic. The only downside is that the request
version in `AuthorizableRequestContext` is an `int` instead of a `short`
so we had to adapt it in a few places. We opted for using `int` directly
wherever possible.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
2025-04-16 09:12:11 -07:00
Sean Quah d7fa406b32
KAFKA-19064: Handle exceptions from deferred events in coordinator (#19333)
Previously in KAFKA-18484, we added exception handling for exceptions
coming from batch events. Also handle exceptions from 0-record write
events and transaction completion events.

Reviewers: David Jacot <djacot@confluent.io>
2025-04-03 05:16:48 -07:00
Sean Quah 1eea7f0528
MINOR: Make coordinator runtime inner classes static (#19332)
Make DeferredEventCollection and CoordinatorBatch static classes.
DeferredEventCollection only needs to access the logger and
CoordinatorBatch is only non-static because it holds
DeferredEventCollections.

Reviewers: David Jacot <djacot@confluent.io>
2025-04-01 07:17:39 -07:00
Sean Quah 9992e4cfa7
MINOR: Move record helper methods out of CoordinatorRuntimeTest (#19279)
These methods are generally useful for constructing records in
coordinator tests.

Reviewers: David Jacot <djacot@confluent.io>
2025-03-25 03:56:15 -07:00
Sean Quah ec12d360a1
MINOR: Move inner test classes out of CoordinatorRuntimeTest (#19258)
Some of these classes are generally useful for testing.
MockCoordinatorShard is already shared by SnapshottableCoordinatorTest.

Also do some minor refactors.

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
2025-03-21 08:36:23 -07:00
co63oc e4ece37dbf
Fix typos in multiple files (#19086)
Fix typos in multiple files

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-03-04 16:05:51 +00:00
Sanskar Jhajharia a39fcac95c
MINOR: Clean up coordinator-common and server modules (#19009)
Given that now we support Java 17 on our brokers, this PR replace the
use of :

Collections.singletonList() and Collections.emptyList() with List.of()
Collections.singletonMap() and Collections.emptyMap() with Map.of()
Collections.singleton() and Collections.emptySet() with Set.of()

Affected modules: server and coordinator-common

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-27 17:09:21 +08:00
Parker Chang ed366e6b89
MINOR: Align assertFutureThrows method signature with JUnit conventions (#18825)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-02-18 15:56:42 +00:00
Sean Quah 1c9190d6b1
KAFKA-18807; Fix thread idle ratio metric (#18934)
When group.coordinator.threads is greater than 1, we lose track of thread idle time because of integer arithmetic. Use doubles instead.

Reviewers: David Jacot <djacot@confluent.io>
2025-02-18 08:11:38 +01:00
Nick Guo 22d4248fba
KAFKA-18694: Migrate suitable classes to records in coordinator-common module (#18782)
Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Christo Lolov <lolovc@amazon.com>
2025-02-05 10:50:55 +00:00
David Jacot bf05d2c914
KAFKA-18672; CoordinatorRecordSerde must validate value version (#18749)
CoordinatorRecordSerde does not validate the version of the value to check whether the version is supported by the current version of the software. This is problematic if a future and unsupported version of the record is read by an older version of the software because it would misinterpret the bytes. Hence CoordinatorRecordSerde must throw an error if the version is unknown. This is also consistent with the handling in the old coordinator.

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-02-03 02:19:27 -08:00
Jeff Kim 048dfeffd0
MINOR: prevent exception from HdrHistogram (#18674)
HdrHistogram can throw an exception if the recorded value is greater than a configured limit. Expand the ceiling from per-metric to all invocations.

Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-01-29 11:34:46 -05:00
Sean Quah 5946f27ac5
KAFKA-18484 [2/2]; Handle exceptions during coordinator unload (#18667)
Ensure that unloading a coordinator always succeeds. Previously, we have
guarded against exceptions from DeferredEvent completions. All that
remains is handling exceptions from the onUnloaded() method of the
coordinator state machine.

Reviewers: David Jacot <djacot@confluent.io>
2025-01-23 17:15:21 +01:00
Sean Quah 5a57473a52
KAFKA-18484 [1/N]; Handle exceptions from deferred events in coordinator (#18661)
Guard against the coordinator getting stuck due to deferred events
throwing exceptions.

Reviewers: David Jacot <djacot@confluent.io>
2025-01-22 14:46:19 +01:00
David Jacot b368c38684
KAFKA-18302; Update CoordinatorRecord (#18512)
This patch does a few things:
1) Replace ApiMessageAndVersion by ApiMessage in CoordinatorRecord for the key
2) Leverage the fact that ApiMessage exposes the apiKey. Hence we don't need to specify the key anymore.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-21 18:11:26 +01:00
David Jacot 76bf38a4fd
KAFKA-18604; Update transaction coordinator (#18636)
This patch updates the transaction coordinator record to use the new coordinator record definition.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-21 08:36:23 +01:00
David Jacot 87334e6c2e
KAFKA-18308; Update CoordinatorSerde (#18455)
This patch updates the GroupCoordinatorSerde and the ShareGroupCoordinatorSerde to leverage the CoordinatorRecordType to deserialize records. With this, newly added record are automatically picked up. In other words, the serdes work with all defined records without doing anything.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-10 11:17:30 +01:00
PoAn Yang b4be178599
KAFKA-17393: Remove log.message.format.version/message.format.version (KIP-724) (#18267)
Based on [KIP-724](https://cwiki.apache.org/confluence/display/KAFKA/KIP-724%3A+Drop+support+for+message+formats+v0+and+v1), the `log.message.format.version` and `message.format.version` can be removed in 4.0.

These configs effectively a no-op with inter-broker protocol version 3.0 or higher
since Apache Kafka 3.0, so the impact should be minimal.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2024-12-21 15:35:15 -08:00
Sean Quah 18f17ed4d3
KAFKA-18200; Handle empty batches in coordinator runtime (#18144)
* Avoid attaching empty writes to empty batches.
* Handle flushes of empty batches, which would return a 0 offset otherwise.

Reviewers: David Jacot <djacot@confluent.io>
2024-12-16 23:39:48 -08:00
Sushant Mahajan 4c5ea05ec8
KAFKA-18058: Share group state record pruning impl. (#18014)
In this PR, we've added a class ShareCoordinatorOffsetsManager, which tracks the last redundant offset for each share group state topic partition. We have also added a periodic timer job in ShareCoordinatorService which queries for the redundant offset at regular intervals and if a valid value is found, issues the deleteRecords call to the ReplicaManager via the PartitionWriter. In this way the size of the partitions is kept manageable.

Reviewers: Jun Rao <junrao@gmail.com>, David Jacot <djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2024-12-12 07:38:03 +00:00
Jeff Kim 05fd36a3b7
KAFKA-18174: Subsequent write event completions should be a noop (#18083) 2024-12-09 17:17:29 -05:00
Calvin Liu 755adf8a56
KAFKA-14563: RemoveClient-Side AddPartitionsToTxn Requests (#17698)
Removes the client side AddPartitionsToTxn/AddOffsetsToTxn calls so that the partition is implicitly added as part of KIP-890 part 2. 

This change also requires updating the valid state transitions. The client side can not know for certain if a partition has been added server side when the request times out (partial completion). Thus for TV2, the transition to PrepareAbort is now valid for Empty, CompleteCommit, and CompleteAbort. 

For readability, the V1 and V2 endTransaction methods have been separated. 

Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan <jolshan@confluent.io>, Ritika Reddy <rreddy@confluent.io>
2024-12-06 09:00:04 -08:00
David Jacot 24dd11d693
KAFKA-17593; [8/N] Resolve regular expressions (#17864)
This patch introduces the asynchronous resolution of regular expressions. Let me unpack a few details about the implementations:
1) I have decided to finally update all the regular expressions within a consumer group together. My assumption is that the number of regular expressions in a group will be generally small but the number of topics in a cluster is large. Hence grouping has two benefits. Firstly, it allows to go through the list of topics once for all the regular expressions. Secondly, it reduces the number of potential rebalances because all the regular expressions are updated at the same time.
2) An update is triggered when the group is subscribed to at least one regular expressions.
3) An update is triggered when there is no ongoing update.
4) An update is triggered only of the previous one is older than 10s.
5) An update is triggered when the group has unresolved regular expressions.
6) An update is triggered when the metadata image has new topics.

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2024-11-26 08:56:25 -08:00
David Jacot a211ee99b5
KAFKA-17593; [7/N] Introduce CoordinatorExecutor (#17823)
This patch introduces the `CoordinatorExecutor` construct into the `CoordinatorRuntime`. It allows scheduling asynchronous tasks from within a `CoordinatorShard` while respecting the runtime semantic. It will be used to asynchronously resolve regular expressions.

The `GroupCoordinatorService` uses a default `ExecutorService` with a single thread to back it at the moment. It seems that it should be sufficient. In the future, we could consider making the number of threads configurable.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Lianet Magrans <lmagrans@confluent.io>
2024-11-19 07:19:22 -08:00
Mickael Maison 389f96aabd
MINOR: Various cleanups in coordinator modules (#17828)
Reviewers: David Jacot <djacot@confluent.io>, Ken Huang <s7133700@gmail.com>
2024-11-19 10:01:05 +01:00
David Jacot bc68011b62
MINOR: Various cleanups in CoordinatorRuntimeTest (#17829)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-11-18 10:30:54 +01:00
Jeff Kim 65ae070e12
MINOR: log fix in SnapshottableCoordinator (#17726)
Reviewers: donaldzhu-cc, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-09 14:00:41 +08:00
Anshul Goyal 2353a7c508
KAFKA-17723 Fix "this-escape" compiler warnings (MultiThreadedEventProcessor and DistributedHerder) for JDK 23 (#17417)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-11 21:28:27 +08:00
Gaurav Narula b03fe66cfe
KAFKA-17759 Remove Utils.mkSet (#17460)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-11 21:20:43 +08:00
Sean Quah bb6ebd83f9
MINOR: Fix typo and refactor new group coordinator tests (#17072)
This patch fixes a few things:
* Typos.
* Merge the tests for fetchOffsets and fetchAllOffsets together into parameterized tests since they share the same structure.
* Use Topic.GROUP_METADATA_TOPIC_NAME instead of __consumer_offsets in new group coordinator tests.

Reviewers: Ken Huang <s7133700@gmail.com>, David Jacot <djacot@confluent.io>
2024-10-09 07:37:23 -07:00
Dimitar Dimitrov bc47ce1a53
MINOR: Fix a race and add JMH bench for HdrHistogram (#17221) 2024-09-27 23:49:10 +09:00
Sushant Mahajan 821c10157d
KAFKA-17367: Introduce share coordinator [2/N] (#17011)
Introduces the share coordinator. This coordinator is built on the new coordinator runtime framework. It 
is responsible for persistence of share-group state in a new internal topic named "__share_group_state".
The responsibility for being a share coordinator is distributed across the brokers in a cluster. 

Reviewers: David Arthur <mumrah@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-09-09 20:01:24 -04:00
Sushant Mahajan 1621f88f06
KAFKA-17367: Share coordinator infra classes [1/N] (#16921)
Introduce ShareCoordinator interface and related classes.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, David Arthur <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-26 10:09:47 -04:00
Jeff Kim ced34e3176
KAFKA-16379; Coordinator event queue, processing, flush, purgatory time histograms (#16949)
This patch introduces a wrapper around [HdrHistogram](https://github.com/HdrHistogram/HdrHistogram) to use for group coordinator histograms, event queue time, event processing time, flush time, and purgatory time.

Reviewers: David Jacot <djacot@confluent.io>
2024-08-23 04:53:22 -07:00
Sushant Mahajan c5e9154672
KAFKA-17342 Moved common coordinator code to separate module (#16883)
There is a lot of code in group-coordinator which is not share/consumer/classic group specific.

Since we are introducing a share-coordinator as part of KIP-932 (in a new module), it would make sense to get the common coordinator functionality into a separate common coordinator module so that share-coordinator need not depend on group-coordinator.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, David Jacot <djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-18 21:48:44 +08:00