All the work that we have done to automate and to simplify the coordinator records allows us to simplify the related MessageParsers in DumpLogSegments. They can all share the same based implementation. This is nice because it ensures that we handle all those records similarly.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Currently, Producer.send doc is inconsistent with actual exception behavior
- TimeoutException: This won't be thrown from send on buffer-full or metadata-missing actually. Instead, it will returned as failed future.
- AuthenticationException/AuthorizationException: These exceptions are also won't be thrown. Returned with failed future actually.
Fixed Callback javadoc and ProducerConfig doc as well.
Reviewers: Luke Chen <showuon@gmail.com>, Andrew Schofield <aschofield@confluent.io>
Implements the current assignment builder, analogous to the current assignment builder of consumer groups. The main difference is the underlying assigned resource, and slightly different logic around process IDs: We make sure to move a task only to a new client, once the task is not owned anymore by any client with the same process ID (sharing the same state directory) - in any role (active, standby or warm-up).
Compared to the feature branch, the main difference is that I refactored the separate treatment of active, standby and warm-up tasks into a compound datatype called TaskTuple (which is used in place of the more specific Assignment class). This also has effects on StreamsGroupMember.
Reviewers: Bruno Cadonna <cadonna@apache.org>, Bill Bejeck <bbejeck@apache.org>
Ensure that unloading a coordinator always succeeds. Previously, we have
guarded against exceptions from DeferredEvent completions. All that
remains is handling exceptions from the onUnloaded() method of the
coordinator state machine.
Reviewers: David Jacot <djacot@confluent.io>
This patch is a first step towards removing `ReplicaManager#becomeLeaderOrFollower`. It updates the `LocalLeaderEndPointTest` tests.
Reviewers: Christo Lolov <lolovc@amazon.com>, Ismael Juma <ismael@juma.me.uk>
There is a subtle race condition with transactions V2 if a transaction is still completing when checking if we need to add a partition, but it completes when the request reaches the coordinator.
One approach was to remove the verification for TV2 and just check the epoch on write, but a simpler one is to simply return concurrent transactions from the partition leader (before attempting to add the partition). I've done this and added a test for this behavior.
Locally, I reproduced the race but adding a 1 second sleep when handling the WriteTxnMarkersRequest and a 2 second delay before adding the partition to the AddPartitionsToTxnManager. Without this change, the race happened on every second transaction as the first one completed. With this change, the error went away.
As a followup, we may want to clean up some of the code and comments with respect to verification as the code is used by both TV0 + verification and TV2. But that doesn't need to complete for 4.0. This does :)
Reviewers: Jeff Kim <jeff.kim@confluent.io>, Artem Livshits <alivshits@confluent.io>, Calvin Liu <caliu@confluent.io>
While setting Defaults.LogMessageTimestampType to "LogAppendTime", `LogCleanerManagerTest.testLogsUnderCleanupIneligibleForCompaction()` fails with a InvalidTimestampException.
This PR fixes this by regenerating the records instead of previous approach of re-using same records in the test.
Reviewers: Divij Vaidya <diviv@amazon.com>, Kvicii <kvicii.yu@gmail.com>
---------
Co-authored-by: fangxiaobing <fangxiaobing@kuaishou.com>
This commit adds a processor named
StreamsRebalanceEventsProcessor that handles the rebalance
events sent from the background thread of the async
consumer to the stream thread when an task
assignment changes. It also adds the corresponding rebalance
events.
Additionally, this commit adds StreamsRebalanceData that
maintains the data that is exchanges for the Streams rebalance
protocol.
All of these are used by the Streams heartbeat request manager
and the Streams membership manager that will be added in a future
commit.
Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
These methods were previously invoked by ZK components, but we have just removed them.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
This patch includes some maintenance updates for Develocity.
* Publish build scans to develocity.apache.org
* Update Develocity Gradle plugin to to 3.19
* Use `DEVELOCITY_ACCESS_KEY` to authenticate to `develocity.apache.org`
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>
This patch does a few things:
1) Replace ApiMessageAndVersion by ApiMessage in CoordinatorRecord for the key
2) Leverage the fact that ApiMessage exposes the apiKey. Hence we don't need to specify the key anymore.
Reviewers: Andrew Schofield <aschofield@confluent.io>
Update producer id request / response formats and transaction log value format. There is no functional change.
Reviewers: Justine Olshan <jolshan@confluent.io>, Calvin Liu <caliu@confluent.io>
We should provide the same informative error message for both timeout
cases.
Reviewers: Kirk True <ktrue@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Ismael Juma <ismael@juma.me.uk>
This patch removes `ReplicaManager#stopReplicas`. I have ensured that removed unit tests are covered by other existing tests or are updated to use kraft.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
Although `MetadataCache`'s `getPartitionReplicaEndpoints` takes a single topic-partition, the `KRaftMetadataCache` implementation iterates over all partitions of the matching topic. This is not necessary and can cause significant performance degradation when the topic has a relatively high number of partitions.
Note that this is not a recent regression - it has been a part of `KRaftMetadataCache` since its creation.
Reviewers: Ismael Juma <ismael@juma.me.uk>, David Jacot <djacot@confluent.io>
The internal topic manager takes a requested topology and returns a configured topology, as well as any internal topics that need to be created.
Shares with the client-side "InternalTopicManager" the name only.
Reviewers: Bruno Cadonna <cadonna@apache.org>
This patch updates the transaction coordinator record to use the new coordinator record definition.
Reviewers: Andrew Schofield <aschofield@confluent.io>
Kafka 4.0 will remove support for zk mode and will require conversion to kraft
before upgrading to 4.0. The minimum kraft version is 3.0 (aka 3.0-IV1).
This provides an opportunity to remove exclusively server side protocols versions
that only exist to allow direct upgrades from versions older than 3.0 or that are
used only by zk mode.
Since KRaft became production ready in 3.3, we should consider setting the
baseline to 3.3. But that requires more discussion and it can be done via a
separate change (KAFKA-18601).
Protocol changes:
* Remove RequestHeader v0 (only used by ControlledShutdown v0)
* Remove WriteTxnMarkers v0
* Remove all versions of ControlledShutdown, LeaderAndIsr, StopReplica, UpdateMetadata
In order to remove all versions safely, extend generator to support setting
"versions" to "none". In this case, we no longer generate the `*Data` classes,
but we still reserve the id for the relevant protocol api (so it doesn't get
accidentally used for something else). The protocol documentation is correct
after these changes.
We kept a simplified version of `LeaderAndIsr{Request|Response}` because
it's used by many tests that are still relevant in kraft mode. Once
KAFKA-18486 is done, it may be possible to remove it (I left a comment on
the ticket). Similarly, KAFKA-18487 may make it possible to remove
the introduced `StopReplicaPartitionState` (left a comment on that ticket too).
There are a number of places that were adjusted to include an
`ApiKeys.hasValidVersion` check.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>