LATEST_PRODUCTION version in MetadataVersion.java was updated in
both #16347 and #16400, but it was left unchanged in the system
tests.
Reviewers: Josep Prat <josep.prat@aiven.io>
The current release script has a couple of issues:
* It's a single long file with duplicated logic, which makes
it difficult to understand and make changes
* When a command fails, the user is forced to start from the
beginning, expanding feedback loops. e.g. publishing step
fails because the credentials were set incorrectly in ~/.gradle/gradle.properties
Reviewers: Mickael Maison <mickael.maison@gmail.com>
This change simply adds property methods to LogOffsetMetadata. It
changes all of the callers to use the new property methods instead of
using the fields directly.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
The server side range assignor was made to be sticky i.e. partitions from the existing assignment are retained as much as possible. During a rebalance, the expected behavior is to achieve co-partitioning for members that are subscribed to the same set of topics with equal number of partitions.
However, there are cases where this cannot be achieved efficiently with the current algorithm. There is no easy way to implement stickiness and co-partitioning and hence we have resorted to recomputing the target assignment every time.
In case of static membership, instanceIds are leveraged to ensure some form of stickiness.
```
Benchmark (assignmentType) (assignorType) (isRackAware) (memberCount) (partitionsToMemberRatio) (subscriptionType) (topicCount) Mode Cnt Score Error Units
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 100 10 HOMOGENEOUS 100 avgt 5 0.052 ± 0.001 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 100 10 HOMOGENEOUS 1000 avgt 5 0.454 ± 0.003 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 1000 10 HOMOGENEOUS 100 avgt 5 0.476 ± 0.046 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 1000 10 HOMOGENEOUS 1000 avgt 5 3.102 ± 0.055 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 10000 10 HOMOGENEOUS 100 avgt 5 5.640 ± 0.223 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 10000 10 HOMOGENEOUS 1000 avgt 5 37.947 ± 1.000 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 100 10 HETEROGENEOUS 100 avgt 5 0.172 ± 0.001 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 100 10 HETEROGENEOUS 1000 avgt 5 1.882 ± 0.006 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 1000 10 HETEROGENEOUS 100 avgt 5 1.730 ± 0.036 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 1000 10 HETEROGENEOUS 1000 avgt 5 17.654 ± 1.160 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 10000 10 HETEROGENEOUS 100 avgt 5 18.595 ± 0.316 ms/op
ServerSideAssignorBenchmark.doAssignment INCREMENTAL RANGE false 10000 10 HETEROGENEOUS 1000 avgt 5 172.398 ± 2.251 ms/op
JMH benchmarks done
Benchmark (memberCount) (partitionsToMemberRatio) (topicCount) Mode Cnt Score Error Units
TargetAssignmentBuilderBenchmark.build 100 10 100 avgt 5 0.071 ± 0.004 ms/op
TargetAssignmentBuilderBenchmark.build 100 10 1000 avgt 5 0.428 ± 0.026 ms/op
TargetAssignmentBuilderBenchmark.build 1000 10 100 avgt 5 0.659 ± 0.028 ms/op
TargetAssignmentBuilderBenchmark.build 1000 10 1000 avgt 5 3.346 ± 0.102 ms/op
TargetAssignmentBuilderBenchmark.build 10000 10 100 avgt 5 8.947 ± 0.386 ms/op
TargetAssignmentBuilderBenchmark.build 10000 10 1000 avgt 5 40.240 ± 3.113 ms/op
JMH benchmarks done
```
Reviewers: David Jacot <djacot@confluent.io>
Defined share group, member and sinmple assignor classes with API definition for Share Group Heartbeat and Describe API.
The ShareGroup and ShareGroupMember extends the common ModernGroup and ModernGroupMember respectively.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
The group coordinator has (internal) write operations that could generate a large number of records (e.g. expiring offsets and groups). At the moment, those operations are limited by the maximum message size. If they hit it, they are basically stuck forever. This patch extends the CoordinatorRuntime to support non-atomic writes and it changes those internal operations to be non-atomic.
Reviewers: Dongnuo Lyu <dlyu@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
As part of KIP-956, we have added quota for remote copies to remote storage. In this PR, we are adding the following metrics for remote copy throttling.
1. remote-copy-throttle-time-avg The average time in millis remote copies was throttled by a broker
2. remote-copy-throttle-time-max The max time in millis remote copies was throttled by a broker
Added unit test for the metrics.
Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
Update the documentation for the example properties files now that controllers are allowed to advertise their endpoints.
Reviewers: José Armando García Sancio <jsancio@apache.org>
This change implements response handling for the new version of Vote, Fetch, FetchSnapshot, BeginQuorumEpoch and EndQuorumEpoch. All of these responses were extended to include the leader's endpoint when the leader is known.
This change also includes sending the new version of the requests for Vote, Fetch, FetchSnapshot, BeginQuorumEpoch and EndQuorumEpoch. The two most notable changes are that:
1. The leader is going to include all of its endpoints in the BeginQuorumEpoch and EndQuorumEpoch requests.
2. The replica is going to include the destination replica key for the Vote and BeginQuorumEpoch request.
QuorumState was extended so that the replica transitions to UnattachedState instead of FollowerState during startup, if the leader is known but the leader's endpoint is not known. This can happen if the known leader is not part of the voter set replicated by the replica. The expectation is that the replica will rediscover the leader from Fetch responses from the bootstrap servers or from the BeginQuorumEpoch request from the leader.
To make sure that replicas never forget the leader of a given epoch the unattached state was extended to allow an optional leader id for when the leader is known but the leader's endpoint is not known.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
The release API exposed Partitions which should be an internal implementation detail for releaseAcquiredRecords API. Also lessen the scope for cached topic partitions method as it's not needed.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, Abhinav Dixit <adixit@confluent.io>
When the PurgeRepartitionTopicintegrationTest was written, the InitialTaskDelayMs was hard-coded on the broker requiring setting a timeout in the test to wait for the delay to expire. But I believe this creates a race condition where the test times out before the broker deletes the inactive segment. PR #15719 introduced an internal config to control the IntitialTaskDelayMs config for speeding up tests, and this PR leverages this internal config to reduce the task delay to 0 to eliminate this race condition.
Implemented the functionality which takes care of archiving the records when LSO moves past them. Implemented the following functions -
1. updateCacheAndOffsets - Updates the cached state, start and end offsets of the share partition as per the new log start offset. The method is called when the log start offset is moved for the share partition.
2. archiveAvailableRecordsOnLsoMovement - This function archives all the available records when they are behind the LSO.
3. archivePerOffsetBatchRecords - It archives all the available records in the per offset tracked batch passed to this function.
4. archiveCompleteBatch - It archives all the available records of the complete batch passed to this function.
Reviewers: Andrew Schofield <aschofield@confluent.io>,Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
xmlns strings need to match exactly, and these vocabularies are defined with `http` namespace strings, so we need to follow that.
Reviewers: Mickael Maison <mickael.maison@gmail.com>
As part of [KIP-956](https://cwiki.apache.org/confluence/display/KAFKA/KIP-956+Tiered+Storage+Quotas), we have added quota for remote fetches from remote storage. In this PR, we are adding the following metrics for remote fetch throttling.
remote-fetch-throttle-time-avg : The average time in millis remote fetches was throttled by a broker
remote-fetch-throttle-time-max : The max time in millis remote fetches was throttled by a broker
Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
Following the discussion and suggestion by @dajac, https://github.com/apache/kafka/pull/16054#discussion_r1613638293, the PR refactors the common classes to build TargetAssignment in `modern` package. `consumer` package has been moved inside `modern` package with classes exclusive to `consumer group`.
This PR completes the refactoring and base to introduce `share` package inside `modern`. The subsequent PRs will define the implementation specific to Share Groups while re-using the common functionality from `modern` package classes.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
About
This PR adds acknowledge code in SharePartitionManager. Internally, the record acknowledgements happen at the SharePartition level. SharePartitionManager identifies the SharePartitions and calls their acknowledge method to actually acknowledge the individual records
Testing
Added unit tests to cover the new functionality added in SharePartitionManagerTest
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
This patch partially reverts `group.version` in trunk. I kept the `GroupVersion` class but removed it from `Features` so it is not advertised. I also kept all the changes in the test framework. I removed the logic to require `group.version=1` to enable the new consumer rebalance protocol. The new protocol is enabled based on the static configuration.
For the context, I prefer to revert it in trunk now so we don't forget to revert it in the 3.9 release. I will bring it back for the 4.0 release.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
his PR adds KRaft support to the following tests in AlterUserScramCredentialsRequestNotAuthorizedTest
Co-authored-by: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
This patch is an attempt to simplifying GroupMetadataManager#consumerGroupHeartbeat and GroupMetadataManager#classicGroupJoinToConsumerGroup by sharing more of the common logic. It slightly change how static members are replaced too. Now, we generate the records to replace the member and then we update the member if needed.
Reviewers: Dongnuo Lyu <dlyu@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
For a non-existing output topic, Kafka Streams ends up in an infinite retry loop, because the returned TimeoutException extends RetriableException.
This PR updates the error handling pass for this case and instead of retrying calls the ProductionExceptionHandler to allow breaking the infinite retry loop.
Reviewers: Matthias J. Sax <matthias@confluent.io>
This version of the ShadowJarPlugin uses the incorrect classifier for the published archive. This is a temporary measure to fix publishing prior to the upcoming release.
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Finishing migration of MembershipManagerImplTest away from ConsumerTestBuilder and removed all spy objects.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Philip Nee <pnee@confluent.io>, Matthias J. Sax <matthias@confluent.io>
With KIP-853, the leader's endpoint is sent to the other voters using the BeginQuorumEpoch RPC. The remote replicas never store the leader's endpoint. That means that leaders need to resend the leader's endpoint if a voter restarts.
This change accomplishes this by sending the BeginQuorumEpoch as a heartbeat. The period is sent to the half the fetch timeout to prevent voters from transitioning to the candidate state when restarting.
Reviewers: José Armando García Sancio <jsancio@apache.org>
Prior to KIP-853, users were not allow to enumerate listeners specified in `controller.listener.names` in the `advertised.listeners`. This decision was made in 3.3 because the `controller.quorum.voters` property is in effect the list of advertised listeners for all of the controllers.
KIP-853 is moving away from `controller.quorum.voters` in favor of a dynamic set of voters. This means that the user needs to have a way of specifying the advertised listeners for controller.
This change allows the users to specify listener names in `controller.listener.names` in `advertised.listeners`. To make this change forwards compatible (use a valid configuration from 3.8 in 3.9), the controller's advertised listeners are going to get computed by looking up the endpoint in `advertised.listeners`. If it doesn't exist, the controller will look up the endpoint in the `listeners` configuration.
This change also includes a fix the to the BeginQuorumEpoch request where the default value for VoterId was 0 instead of -1.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Create 3 new metadata versions:
- 3.8-IV0, for the upcoming 3.8 release.
- 3.9-IV0, to add support for KIP-1005.
- 3.9-IV1, as the new release vehicle for KIP-966.
Create ListOffsetRequest v9, which will be used in 3.9-IV0 to support KIP-1005. v9 is currently an unstable API version.
Reviewers: Jun Rao <junrao@gmail.com>, Justine Olshan <jolshan@confluent.io>
Abstracted code for 2 classes `ConsumerGroup` and `ConsumerGroupMember` to `ModernGroup` and `ModernGroupMember` respectively. The new abstract classes are created to share common functionality with `ShareGroup` and `ShareGroupMember` which are being introduced with KIP-932.
The patch is majorly code refactoring from existing classes to abstract classes. Also created a new package called `modern` where `MemberState` class is moved, in upcoming patches, I will move common classes for `Share` and `Consumer` Group in `modern` package itself.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Andrew Schofield <aschofield@confluent.io>, David Jacot <djacot@confluent.io>
About
Implemented release acquired records functionality in SharePartition. This functionality is used when a share session gets closed, hence all the acquired records should either move to AVAILABLE or ARCHIVED state. Implemented the following functions -
1. releaseAcquiredRecords - This function is executed when the acquisition lock timeout is reached. The function releases the acquired records.
2. releaseAcquiredRecordsForCompleteBatch - Function which releases acquired records maintained at a batch level.
3. releaseAcquiredRecordsForPerOffsetBatch - Function which releases acquired records maintained at an offset level.
Testing
Added unit tests to cover the new functionality added.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>