Commit Graph

4945 Commits

Author SHA1 Message Date
Kamal Chandraprakash f359908fcd
KAFKA-15776: Support added to update remote.fetch.max.wait.ms dynamically (#16203)
Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
2024-06-10 20:42:12 +05:30
Max Riedel 40de07dab5
KAFKA-14509; [4/4] Handle includeAuthorizedOperations in ConsumerGroupDescribe API (#16158)
This patch implements the handling of `includeAuthorizedOperations` flag in the ConsumerGroupDescribe API.

Reviewers: David Jacot <djacot@confluent.io>
2024-06-10 05:07:45 -07:00
Gantigmaa Selenge 53b048bf0b
KAFKA-15718: Refactor UncleanLeaderElectionTest to enable KRaft later (#16157)
Refactor UncleanLeaderElectionTest so that KRaft can be enabled later

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-10 19:15:34 +08:00
Chia Chuan Yu e5b8712993
KAFKA-16885 Renamed the enableRemoteStorageSystem to isRemoteStorageSystemEnabled (#16256)
Reviewers: Kamal Chandraprakash <kchandraprakash@uber.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-10 02:14:15 +08:00
Gaurav Narula b8b248b1e8
KAFKA-16920: close BrokerLifecycleManager in tests (#16252)
Tests in BrokerLifecycleManagerTest do not close BrokerLifecycleManager
if an assertion fails.

This change makes BrokerLifecycleManager an instance variable that is
closed in an `@AfterEach` method.

Reviewers: Igor Soarez <i@soarez.me>
2024-06-09 00:03:00 +01:00
PoAn Yang 3d5d1504f7
KAFKA-16878 Remove powermock and easymock from code base (#16236)
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-09 00:17:43 +08:00
Igor Soarez 5a5a292146
MINOR: Fix broken ReassignPartitionsCommandTest test (#16251)
KAFKA-16606 (#15834) introduced a change that broke
ReassignPartitionsCommandTest.testReassignmentCompletionDuringPartialUpgrade.

The point was to validate that the MetadataVersion supports JBOD
in KRaft when multiple log directories are configured.
We do that by checking the version used in
kafka-features.sh upgrade --metadata, and the version discovered
via a FeatureRecord for metadata.version in the cluster metadata.

There's no point in checking inter.broker.protocol.version in
KafkaConfig, since in KRaft, that configuration is deprecated
and ignored — always assuming the value of MINIMUM_KRAFT_VERSION.

The test that was broken sets inter.broker.protocol.version in
KRaft mode and configures 3 directories. So alternatively, we
could change the test to not configure this property.
Since the property isn't forbidden in KRaft mode, just ignored,
and operators may forget to remove it, it seems better to remove
the fail condition in KafkaConfig.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-08 14:06:46 +03:00
Igor Soarez 7c0fff7c36
KAFKA-16606 Gate JBOD configuration on 3.7-IV2 (#15834)
Support for multiple log directories in KRaft exists from
MetadataVersion 3.7-IV2.

When migrating a ZK broker to KRaft, we already check that
the IBP is high enough before allowing the broker to start up.

With KIP-584 and KIP-778, Brokers in KRaft mode do not require
the IBP configuration - the configuration is deprecated.
In KRaft mode inter.broker.protocol.version defaults to
MetadataVersion.MINIMUM_KRAFT_VERSION (IBP_3_0_IV1).

Instead KRaft brokers discover the MetadataVersion by reading
the "metadata.version" FeatureLevelRecord from the cluster metadata.

This change adds a new configuration validation step upon discovering
the "metadata.version" from the cluster metadata.

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-06-07 09:11:57 +01:00
Kirk True d6cd83e2fb
KAFKA-16200: Enforce that RequestManager implementations respect user-provided timeout (#16031)
Improve consistency and correctness for user-provided timeouts at the Consumer network request layer, per the Java client Consumer timeouts design (https://cwiki.apache.org/confluence/display/KAFKA/Java+client+Consumer+timeouts). While the changes introduced in KAFKA-15974 enforce timeouts at the Consumer's event layer, this change enforces timeouts at the network request layer.

The changes mostly fit into the following areas:

1. Create shared code and idioms so timeout handling logic is consistent across current and future RequestManager implementations
2. Use deadlineMs instead of expirationMs, expirationTimeoutMs, retryExpirationTimeMs, timeoutMs, etc.
3. Update "preemptive pruning" to remove expired requests that have had at least one attempt
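
Item 2 can be illustrated with a minimal sketch. It assumes a hypothetical `Deadline` helper (not a class from the Kafka client): the absolute deadline is computed once from the user-provided timeout, and every later check derives the remaining time from that single value instead of passing assorted timeout fields around.

```java
// Hypothetical helper for illustration only; not part of the Kafka client.
final class Deadline {
    private final long deadlineMs;

    // Compute the absolute deadline once. A very large timeout (for example
    // Long.MAX_VALUE) would overflow, so the sum is clamped in that case.
    // Assumes nowMs and timeoutMs are non-negative.
    Deadline(long nowMs, long timeoutMs) {
        long sum = nowMs + timeoutMs;
        this.deadlineMs = sum < 0 ? Long.MAX_VALUE : sum;
    }

    long deadlineMs() {
        return deadlineMs;
    }

    // A request is expired once the current time reaches the deadline.
    boolean isExpired(long nowMs) {
        return nowMs >= deadlineMs;
    }

    // Time left before the deadline, never negative.
    long remainingMs(long nowMs) {
        return Math.max(0L, deadlineMs - nowMs);
    }
}
```

With a single `deadlineMs`, a retry loop only needs the current time to decide whether another attempt is still allowed.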

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Bruno Cadonna <cadonna@apache.org>
2024-06-07 09:53:27 +02:00
Apoorv Mittal c01279b92a
KAFKA-16905: Fix blocking DescribeCluster call in AdminClient DescribeTopics (#16217)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, David Arthur <mumrah@gmail.com>
2024-06-06 18:11:43 -04:00
David Jacot 7d832cf74f
KAFKA-14701; Move `PartitionAssignor` to new `group-coordinator-api` module (#16198)
This patch moves the `PartitionAssignor` interface and all the related classes to a newly created `group-coordinator/api` module, following the pattern used by the storage and tools modules.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-06 12:19:20 -07:00
Murali Basani a41f7a4e13
KAFKA-16884 Refactor RemoteLogManagerConfig with AbstractConfig (#16199)
Reviewers: Greg Harris <gharris1727@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-07 00:06:25 +08:00
Cy f36a873642
MINOR: Added test for ClusterConfig#displayTags (#16110)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-06 23:46:49 +08:00
Okada Haruki 3835515fea
KAFKA-16541 Fix potential leader-epoch checkpoint file corruption (#15993)
A patch for KAFKA-15046 removed the fsync in LeaderEpochFileCache#truncateFromStart/End for performance reasons, but this turned out to risk a corrupted leader-epoch checkpoint file on an ungraceful OS shutdown, i.e. when the OS shuts down while the kernel is writing dirty pages back to the device.

To address this problem, this PR makes the following changes:

(1) Revert LeaderEpochCheckpoint#write to always fsync
(2) truncateFromStart/End now call LeaderEpochCheckpoint#write asynchronously on the scheduler thread
(3) UnifiedLog#maybeCreateLeaderEpochCache now loads epoch entries from the checkpoint file only when the current cache is absent
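
Change (1) relies on forcing the file contents to disk before the new checkpoint becomes visible. The following is a minimal sketch of that write pattern, not the code in `LeaderEpochCheckpoint`: the class name `CheckpointWriter`, the `.tmp` suffix, and the line-based format are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical class for illustration; not the Kafka implementation.
final class CheckpointWriter {
    // Write all lines to a temp file, fsync it, then move it over the target.
    static void write(Path target, List<String> lines) {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        byte[] bytes = (String.join("\n", lines) + "\n").getBytes(StandardCharsets.UTF_8);
        try {
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.TRUNCATE_EXISTING)) {
                ByteBuffer buf = ByteBuffer.wrap(bytes);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                // fsync: flush file data and metadata to the storage device.
                ch.force(true);
            }
            try {
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fall back to a plain replace when the file system cannot move atomically.
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the data is forced to disk before the move, a crash leaves either the old file or the complete new one. The sketch does not fsync the parent directory, which a stricter implementation might also do.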

Reviewers: Jun Rao <junrao@gmail.com>
2024-06-06 15:10:13 +09:00
Abhijeet Kumar bd9d68f24a
KAFKA-15265: Integrate RLMQuotaManager for throttling fetches from remote storage (#16071)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
2024-06-05 19:12:25 +05:30
Kamal Chandraprakash 02c794dfd3
KAFKA-15776: Introduce remote.fetch.max.timeout.ms to configure DelayedRemoteFetch timeout (#14778)
KIP-1018, part1, Introduce remote.fetch.max.timeout.ms to configure DelayedRemoteFetch timeout

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-05 14:42:23 +08:00
Dongnuo Lyu 7ddfa64759
MINOR: Adjust validateOffsetCommit/Fetch in ConsumerGroup to ensure compatibility with classic protocol members (#16145)
During online migration, there could be a ConsumerGroup with members that use the classic protocol. In the current implementation, `STALE_MEMBER_EPOCH` could be thrown in ConsumerGroup offset fetch/commit validation, but it is not supported by the classic protocol. This patch therefore changes `ConsumerGroup#validateOffsetCommit` and `ConsumerGroup#validateOffsetFetch` to ensure compatibility.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>
2024-06-04 23:08:38 -07:00
Apoorv Mittal 252c1acac3
KAFKA-16740: Adding skeleton code for Share Fetch and Acknowledge RPC (KIP-932) (#16184)
The PR adds skeleton code for Share Fetch and Acknowledge RPCs. The changes include:

1. Defining RPCs in KafkaApis.scala
2. Adding a new SharePartitionManager class, which handles the RPCs
3. Adding a SharePartition class, which manages in-memory record states for fetched data

Reviewers: David Jacot <djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-06-05 10:25:24 +05:30
PoAn Yang b89999b504
KAFKA-16483: Remove preAppendErrors from createPutCacheCallback (#16105)
The method createPutCacheCallback has an input argument, preAppendErrors, which is meant to hold errors that happen before appending. However, it is always empty. Also, pre-append errors are handled before createPutCacheCallback by calling responseCallback. Hence, we can remove preAppendErrors.

Signed-off-by: PoAn Yang <payang@apache.org>

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-05 08:02:52 +08:00
Kuan-Po (Cooper) Tseng 01e9918530
KAFKA-16814 KRaft broker cannot startup when `partition.metadata` is missing (#16165)
When starting up the Kafka LogManager, we check for stray replicas to avoid some corner cases. But this check can prevent the broker from starting up if partition.metadata is missing: at startup we load each log from file, and the topicId of the log comes from the partition.metadata file. So, if partition.metadata is missing, the topicId will be None, and LogManager#isStrayKraftReplica will fail with a no topicID error.

A missing partition.metadata file could result from a storage failure; another possible path is an unclean shutdown after the topic is created in the replica but before data is flushed into the partition.metadata file. This is possible because the flush is done asynchronously.

When finding a log without a topicID, we should treat it as a stray log and delete it.
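
The rule in the last sentence can be sketched as follows. This is an illustration under assumptions, not the body of `LogManager#isStrayKraftReplica`: the class `StrayLogCheck`, its signature, and the second condition (a topic ID unknown to the broker's assignments) are hypothetical.

```java
import java.util.Optional;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper for illustration; the real check lives in LogManager.
final class StrayLogCheck {
    // A log is treated as stray when its topic ID is unknown (for example
    // because partition.metadata is missing) or, by assumption here, when the
    // topic ID is not among the topics assigned to this broker.
    static boolean isStray(Optional<UUID> topicId, Set<UUID> assignedTopicIds) {
        if (!topicId.isPresent()) {
            // Instead of failing with a "no topic ID" error, report the log as stray.
            return true;
        }
        return !assignedTopicIds.contains(topicId.get());
    }
}
```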

Reviewers: Luke Chen <showuon@gmail.com>, Gaurav Narula <gaurav_narula2@apple.com>
2024-06-05 07:56:18 +08:00
Igor Soarez 7e0caad96e
MINOR: Cleanup unused references in core (#16192)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-05 05:12:33 +08:00
Colin P. McCabe 9ceed8f18f
KAFKA-16535: Implement AddVoter, RemoveVoter, UpdateVoter RPCs
Implement the add voter, remove voter, and update voter RPCs for
KIP-853. This is just adding the RPC handling; the current
implementation in RaftManager just throws UnsupportedVersionException.

Reviewers: Andrew Schofield <aschofield@confluent.io>, José Armando García Sancio <jsancio@apache.org>
2024-06-04 14:07:48 -07:00
Mickael Maison 55d38efcc5
KAFKA-15852: Move LinuxIoMetricsCollector to server module (#16178)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-04 16:42:35 +02:00
Igor Soarez 16359e70d3
KAFKA-16583: Handle PartitionChangeRecord without directory IDs (#16118)
When PartitionRegistration#merge() reads a PartitionChangeRecord
from an older MetadataVersion, with a replica assignment change
and without #directories() set, it produces a directory assignment
of DirectoryId.UNASSIGNED. This is problematic because the MetadataVersion
may not yet support directory assignments, leading to an
UnwritableMetadataException in PartitionRegistration#toRecord.

Since the Controller always sets directories on PartitionChangeRecord
if the MetadataVersion supports it, via PartitionChangeBuilder,
there's no need for PartitionRegistration#merge() to populate
directories upon a replica assignment change.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-04 15:37:20 +01:00
Kuan-Po (Cooper) Tseng a08db65670
KAFKA-16888 Fix failed StorageToolTest.testFormatSucceedsIfAllDirectoriesAreAvailable and StorageToolTest.testFormatEmptyDirectory (#16186)
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-04 21:03:25 +08:00
David Jacot 53d592e369
MINOR: Fix type in MetadataVersion.IBP_4_0_IV0 (#16181)
This patch fixes a typo in MetadataVersion.IBP_4_0_IV0. It should be 0 not O.

Reviewers: Justine Olshan <jolshan@confluent.io>, Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-03 20:48:04 -07:00
José Armando García Sancio 459da4795a
KAFKA-16525; Dynamic KRaft network manager and channel (#15986)
Allow KRaft replicas to send requests to any node (Node), not just the nodes configured in the
controller.quorum.voters property. This flexibility is needed so KRaft can implement the
controller.quorum.voters configuration, send requests to the dynamically changing set of voters and
send requests to the leader endpoint (Node) discovered through the KRaft RPCs (especially the
BeginQuorumEpoch request and Fetch response).

This was achieved by changing the RequestManager API to accept Node instead of just the replica ID.
Internally, the request manager tracks connection state using the Node.idString method to match the
connection management used by NetworkClient.

The API for RequestManager is also changed so that the ConnectState class is not exposed in the
API. This allows the request manager to reclaim heap memory for any connection that is ready.

The NetworkChannel was updated to receive the endpoint information (Node) through the outbound raft
request (RaftRequest.Outbound). This makes the network channel more flexible as it doesn't need to
be configured with the list of all possible endpoints. RaftRequest.Outbound and
RaftResponse.Inbound were updated to include the remote node instead of just the remote id.

The follower state tracked by KRaft replicas was updated to include both the leader id and the
leader's endpoint (Node). In this commit the node value is computed from the set of voters. In a
future commit this will be updated so that it is sent through KRaft RPCs, for example the
BeginQuorumEpoch request and Fetch response.

Support for configuring controller.quorum.bootstrap.servers was added. This includes changes to
KafkaConfig, QuorumConfig, etc. All of the tests using QuorumTestHarness were changed to use the
controller.quorum.bootstrap.servers instead of the controller.quorum.voters for the broker
configuration. Finally, the node id for the bootstrap server will be decreasing negative numbers
starting with -2.

Reviewers: Jason Gustafson <jason@confluent.io>, Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
2024-06-03 14:24:48 -07:00
Andrew Schofield 8f82f14a48
KAFKA-16713: Define initial set of RPCs for KIP-932 (#16022)
This PR defines the initial set of RPCs for KIP-932. The RPCs for the admin client and state management are not in this PR.

Reviewers: Apoorv Mittal <amittal@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-06-03 11:52:35 +05:30
Ken Huang 8507693229
KAFKA-16859 Cleanup check if tiered storage is enabled (#16153)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-03 11:04:58 +08:00
Ken Huang 2c82ecd67f
KAFKA-16807 DescribeLogDirsResponseData#results#topics have unexpected topics having empty partitions (#16042)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 17:33:02 +08:00
Colin Patrick McCabe 8ace33b47f
KAFKA-16757: Fix broker re-registration issues around MV 3.7-IV2 (#15945)
When upgrading from a MetadataVersion older than 3.7-IV2, we need to resend the broker registration, so that the controller can record the storage directories. The current code for doing this has several problems, however. One is that it tends to trigger even in cases where we don't actually need it. Another is that when re-registering the broker, the broker is marked as fenced.

This PR moves the handling of the re-registration case out of BrokerMetadataPublisher and into BrokerRegistrationTracker. The re-registration code there will only trigger in the case where the broker sees an existing registration for itself with no directories set. This is much more targeted than the original code.

Additionally, in ClusterControlManager, when re-registering the same broker, we now preserve its fencing and shutdown state, rather than clearing those. (There isn't any good reason re-registering the same broker should clear these things... this was purely an oversight.) Note that we can tell the broker is "the same" because it has the same IncarnationId.
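
A minimal sketch of the state-preserving rule described above. The class `BrokerState` and its `register` method are hypothetical, not the controller's actual types, and the default for a new incarnation (fenced, not in controlled shutdown) is an assumption for illustration.

```java
import java.util.UUID;

// Hypothetical value class for illustration; not the controller's registration type.
final class BrokerState {
    final UUID incarnationId;
    final boolean fenced;
    final boolean inControlledShutdown;

    BrokerState(UUID incarnationId, boolean fenced, boolean inControlledShutdown) {
        this.incarnationId = incarnationId;
        this.fenced = fenced;
        this.inControlledShutdown = inControlledShutdown;
    }

    // Returns the state to store when a registration request arrives.
    static BrokerState register(BrokerState existing, UUID incomingIncarnationId) {
        boolean sameBroker = existing != null
            && existing.incarnationId.equals(incomingIncarnationId);
        if (sameBroker) {
            // Same incarnation re-registering: keep fencing and shutdown state.
            return new BrokerState(incomingIncarnationId, existing.fenced, existing.inControlledShutdown);
        }
        // Assumption: a new incarnation starts fenced and not shutting down.
        return new BrokerState(incomingIncarnationId, true, false);
    }
}
```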

Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Igor Soarez <soarez@apple.com>
2024-06-01 23:51:39 +01:00
Chia Chuan Yu e33eb82fed
KAFKA-16574 The metrics of LogCleaner disappear after reconfiguration (#15863)
Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 01:02:03 +08:00
TaiJuWu db2a09fa90
KAFKA-16652 add unit test for ClusterTemplate offering zero ClusterConfig (#15862)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 00:56:54 +08:00
David Jacot ba61ff0cd9
KAFKA-16860; [1/2] Introduce group.version feature flag (#16120)
This patch introduces the `group.version` feature flag with one version:
1) Version 1 enables the new consumer group rebalance protocol (KIP-848).

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-31 12:48:55 -07:00
Kamal Chandraprakash cdd4455cb8
KAFKA-16866 Used the right constant in RemoteLogManagerTest#testFetchQuotaManagerConfig (#16152)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-01 01:14:31 +08:00
Mickael Maison b6d0fb055d
MINOR: Refactor DynamicConfig (#16133)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-01 01:09:46 +08:00
Ken Huang 21caf6b123
KAFKA-16629 Add broker-related tests to ConfigCommandIntegrationTest (#15840)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 20:24:33 +08:00
Kuan-Po (Cooper) Tseng 3d125a2322
MINOR: Add more unit tests to LogSegments (#16085)
Add more unit tests to LogSegments and do some small refactoring in LogSegments.java

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 16:07:38 +08:00
Chia-Ping Tsai b0fb2ac06d
KAFKA-16866 RemoteLogManagerTest.testCopyQuotaManagerConfig failing (#16146)
Reviewers: Justine Olshan <jolshan@confluent.io>, Satish Duggana <satishd@apache.org>
2024-05-31 06:32:50 +05:30
Justine Olshan 7c1bb1585f
KAFKA-16308 [2/N]: Allow unstable feature versions and rename unstable metadata config (#16130)
As per KIP-1022, we will rename the unstable metadata versions enabled config to support all feature versions.

Features is also updated to return latest production and latest testing versions of each feature.

A feature is production ready when the corresponding metadata version (bootstrapMetadataVersion) is production ready.

Adds tests for the feature usage of the unstableFeatureVersionsEnabled config

Reviewers: David Jacot <djacot@confluent.io>, Jun Rao <junrao@gmail.com>
2024-05-30 14:52:50 -07:00
David Jacot cd750582c0
MINOR: Enable transaction verification with new group coordinator in TransactionsTest (#16139)
While working on https://github.com/apache/kafka/pull/16120, I noticed that the transaction verification feature is disabled in `TransactionsTest` when the new group coordinator is enabled. We did this initially because the feature was not available in the new group coordinator but we fixed it a long time ago. We can enable it now.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-30 12:35:29 -07:00
Dongnuo Lyu a626e87303
MINOR: Make public the consumer group migration policy config
This patch exposes the group coordinator config `CONSUMER_GROUP_MIGRATION_POLICY_CONFIG`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
2024-05-30 11:36:11 -07:00
Krishna Agarwal bb6a042e99
KAFKA-16827: Integrate kafka native-image with system tests (#16046)
This PR does the following:

- System tests should bring up the Kafka broker in native mode
- System tests should run on the Kafka broker in native mode
- Extract out the native build command so that it can be reused
- Allow system tests to run on a native Kafka broker using the Docker mechanism

To run system tests by bringing up Kafka in native mode, pass kafka_mode as native in the ducktape globals:
--globals '{\"kafka_mode\":\"native\"}'

To run system tests by bringing up Kafka in native mode via the Docker mechanism:
_DUCKTAPE_OPTIONS="--globals '{\"kafka_mode\":\"native\"}'" TC_PATHS="tests/kafkatest/tests/" bash tests/docker/run_tests.sh

To bring up only the ducker nodes for native Kafka:
bash tests/docker/ducker-ak up -m native

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-05-30 22:24:23 +05:30
Abhijeet Kumar bb7db87f98
KAFKA-15265: Add Remote Log Manager quota manager (#15625)
Added the implementation of the quota manager that will be used to throttle copy and fetch requests from the remote storage. Reference KIP-956

Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kchandraprakash@uber.com>, Jun Rao <junrao@gmail.com>
2024-05-30 09:06:49 -07:00
Mickael Maison 8068a086a3
MINOR: Remove KafkaConfig dependency in KafkaRequestHandler (#16108)
Reviewers: Luke Chen <showuon@gmail.com>, Apoorv Mittal <amittal@confluent.io>
2024-05-30 11:51:24 +02:00
David Jacot 2a6078a4ce
MINOR: Prevent consumer protocol to be used in ZK mode (#16121)
This patch disallows enabling the new consumer rebalance protocol in ZK mode.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-05-29 23:02:21 -07:00
Murali Basani 3d14690cbf
KAFKA-16790: Update RemoteLogManager configuration in broker server (#16005)
In BrokerServer.scala, brokerMetadataPublishers are configured, and when there are metadata updates remoteLogManager is not yet configured.
Example: remoteLogManager.foreach(rlm => rlm.onLeadershipChange(partitionsBecomeLeader.asJava, partitionsBecomeFollower.asJava, topicIds)) in ReplicaManager is invoked after publishers are instantiated, and here rlm has relevant managers configured.

This change makes sure rlm is configured before the brokerMetadataPublishers initialization.

Reviewers: Luke Chen <showuon@gmail.com>, Nikhil Ramakrishnan <nikrmk@amazon.com>
2024-05-30 08:21:30 +08:00
Justine Olshan 5e3df22095
KAFKA-16308 [1/N]: Create FeatureVersion interface and add `--feature` flag and handling to StorageTool (#15685)
As part of KIP-1022, I have created an interface for all the new features to be used when parsing the command line arguments, doing validations, getting default versions, etc.

I've also added the --feature flag to the storage tool to show how it will be used.

Created a TestFeatureVersion to show an implementation of the interface (besides MetadataVersion which is unique) and added tests using this new test feature.

I will add the unstable config and tests in a followup.

Reviewers: David Mao <dmao@confluent.io>, David Jacot <djacot@confluent.io>, Artem Livshits <alivshits@confluent.io>, Jun Rao <junrao@apache.org>
2024-05-29 16:36:06 -07:00
Mickael Maison 3f3f3ac155
MINOR: Delete KafkaSecurityConfigs class (#16113)
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-30 05:55:24 +08:00
gongxuanzhang 0f0c9ecbf3
KAFKA-16771 First log directory printed twice when formatting storage (#16010)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-30 01:08:17 +08:00
Andrew Schofield 2d9994e0de
KAFKA-16722: Introduce ConsumerGroupPartitionAssignor interface (#15998)
KIP-932 introduces share groups to go alongside consumer groups. Both kinds of group use server-side assignors but it is unlikely that a single assignor class would be suitable for both. As a result, the KIP introduces specific interfaces for consumer group and share group partition assignors.

This PR introduces only the consumer group interface, `o.a.k.coordinator.group.assignor.ConsumerGroupPartitionAssignor`. The share group interface will come in a later release. The existing implementations of the general `PartitionAssignor` interface have been changed to implement `ConsumerGroupPartitionAssignor` instead and all other code changes are just propagating the change throughout the codebase.

Note that the code in the group coordinator that actually calculates assignments uses the general `PartitionAssignor` interface so that it can be used with both kinds of group, even though the assignors themselves are specific.

Reviewers: Apoorv Mittal <amittal@confluent.io>, David Jacot <djacot@confluent.io>
2024-05-29 08:31:52 -07:00
gongxuanzhang 0b75cf7c0b
KAFKA-16705 the flag "started" of RaftClusterInstance is false even though the cluster is started (#15946)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-29 22:38:00 +08:00
Luke Chen 897cab2a61
KAFKA-16399: Add JBOD support in tiered storage (#15690)
After JBOD is supported in KRaft, we should also enable JBOD support in tiered storage. Unit tests and Integration tests are also added.

Reviewers: Satish Duggana <satishd@apache.org>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Igor Soarez <soarez@apple.com>, Mickael Maison <mickael.maison@gmail.com>
2024-05-29 15:30:18 +08:00
PoAn Yang 4d04eb83ea
KAFKA-16796 Introduce new org.apache.kafka.tools.api.Decoder to replace kafka.serializer.Decoder (#16064)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-29 03:13:33 +08:00
Luke Chen a649bc457f
KAFKA-16711: Make sure to update highestOffsetInRemoteStorage after log dir change (#15947)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Satish Duggana <satishd@apache.org>
2024-05-28 21:35:49 +05:30
Omnia Ibrahim 64f699aeea
KAFKA-15853: Move general configs out of KafkaConfig (#16040)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-28 16:22:54 +02:00
Sanskar Jhajharia 699438b7f7
MINOR: Fix the config name in ProducerFailureHandlingTest (#16099)
When moving from KafkaConfig.ReplicaFetchMaxBytesProp we used ReplicationConfigs.REPLICA_LAG_TIME_MAX_MS_CONFIG instead of ReplicationConfigs.REPLICA_FETCH_MAX_BYTES_CONFIG. This PR patches the same.

Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-05-28 16:34:44 +05:30
Luke Chen 91284d8d7b
KAFKA-16709: abortAndPauseCleaning only when future log is not existed (#15951)
When altering replica logDirs, we create a future log and pause log cleaning for the partition. Log cleaning resumes after the alter replica logDirs operation completes. On resume, the LogCleaningPaused count is decremented by 1, and cleaning actually resumes only once the count reaches 0.

However, one more factor can increase the LogCleaningPaused count: a leadership change. When there is a leadership change, we check whether the partition has a future log; if so, we create the future log and call pauseCleaning (LogCleaningPaused count + 1). So, if the following happens during alter replica logDirs:

1. alter replica logDirs for tp0 triggered (LogCleaningPaused count = 1)
2. tp0 leadership changed (LogCleaningPaused count = 2)
3. alter replica logDirs completes, resuming logCleaning (LogCleaningPaused count = 1)
4. LogCleaning stays paused because the count is always > 0

This PR fixes the issue by calling abortAndPauseCleaning only when the future log does not exist. We did the same check in alterReplicaLogDirs. So this change makes sure there is only 1 abortAndPauseCleaning for either alterReplicaLogDirs or maybeAddLogDirFetchers. Tests are also added.
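
The numbered scenario above can be reproduced with a small counter model. This is a sketch under assumptions: `CleaningPauseCounter` and its method signatures are hypothetical stand-ins for the real cleaner state, and partitions are identified by plain strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the LogCleaningPaused count; not the LogCleanerManager code.
final class CleaningPauseCounter {
    private final Map<String, Integer> pausedCount = new HashMap<>();

    // Each call increments the pause count for the partition.
    void abortAndPauseCleaning(String partition) {
        pausedCount.merge(partition, 1, Integer::sum);
    }

    // Each call decrements the count; cleaning resumes only when it reaches 0.
    void resumeCleaning(String partition) {
        pausedCount.computeIfPresent(partition, (k, v) -> v > 1 ? v - 1 : null);
    }

    boolean isPaused(String partition) {
        return pausedCount.containsKey(partition);
    }

    // Guarded variant: pause only when no future log exists yet, so a
    // leadership change during an in-progress move does not pause twice.
    void pauseIfNoFutureLog(String partition, boolean futureLogExists) {
        if (!futureLogExists) {
            abortAndPauseCleaning(partition);
        }
    }
}
```

Without the guard, two pauses followed by one resume leave the count at 1; with the guard, the second pause is skipped and the single resume brings the count back to 0.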

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Igor Soarez <soarez@apple.com>
2024-05-28 12:23:34 +08:00
Colin P. McCabe bac8df56ff
MINOR: fix typo in KAFKA-16515
2024-05-27 08:53:53 -07:00
David Jacot da3304ecb6
KAFKA-16371; fix lingering pending commit when handling OFFSET_METADATA_TOO_LARGE (#16072)
This patch was initially created in #15536.

When there is a commit for multiple topic partitions and some, but not all, exceed the offset metadata limit, the pending commit is not properly cleaned up leading to UNSTABLE_OFFSET_COMMIT errors when trying to fetch the offsets with read_committed. This change makes it so the invalid commits are not added to the pendingOffsetCommits set.
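
A minimal sketch of the behaviour after the fix, assuming a hypothetical `OffsetCommitValidation.prepare` helper in which partitions are plain strings and the size limit is measured in characters. Only commits that pass validation are added to the pending set.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not the group coordinator code.
final class OffsetCommitValidation {
    // Validates each partition's commit metadata. Valid commits are added to
    // pendingOffsetCommits; invalid ones are reported and never become pending.
    static Map<String, String> prepare(Map<String, String> metadataByPartition,
                                       int maxMetadataSize,
                                       Set<String> pendingOffsetCommits) {
        Map<String, String> errors = new HashMap<>();
        for (Map.Entry<String, String> entry : metadataByPartition.entrySet()) {
            if (entry.getValue().length() > maxMetadataSize) {
                errors.put(entry.getKey(), "OFFSET_METADATA_TOO_LARGE");
            } else {
                pendingOffsetCommits.add(entry.getKey());
            }
        }
        return errors;
    }
}
```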

Co-authored-by: Kyle Phelps <kyle.phelps@datadoghq.com>

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-05-27 08:10:37 -07:00
Kamal Chandraprakash 524ad1e14b
KAFKA-16452: Don't throw OOORE when converting the offset to metadata (#15825)
Don't throw an OFFSET_OUT_OF_RANGE error when converting the offset to metadata; the leader should then increment the high watermark by itself after receiving fetch requests from followers. This can happen when checkpoint files are missing and the replica is elected as leader.

Reviewers: Luke Chen <showuon@gmail.com>, Jun Rao <junrao@apache.org>
2024-05-27 17:44:23 +08:00
Colin P. McCabe 4f55786a8a
KAFKA-16515: Fix the ZK Metadata cache confusion between brokers and controllers
ZkMetadataCache could theoretically return KRaft controller information from a call to
ZkMetadataCache.getAliveBrokerNode, which doesn't make sense. KRaft controllers are not part of the
set of brokers. The only use-case for this functionality was in MetadataCacheControllerNodeProvider
during ZK migration, where it allowed ZK brokers in migration mode to forward requests to
kcontrollers when appropriate. This PR changes MetadataCacheControllerNodeProvider to simply
delegate to quorumControllerNodeProvider in this case.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2024-05-24 10:16:59 -07:00
Colin P. McCabe 90892ae99f
KAFKA-16516: Fix the controller node provider for broker to control channel
Fix the code in the RaftControllerNodeProvider to query RaftManager to find Node information,
rather than consulting a static map. Add a RaftManager.voterNode function to supply this
information. In KRaftClusterTest, add testControllerFailover to get more coverage of controller
failovers.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2024-05-24 09:52:47 -07:00
Gantigmaa Selenge c5cd190818
MINOR: Refactor SSL/SASL admin integration tests to not use a custom authorizer (#15377)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-05-24 12:50:47 +02:00
Viktor Somogyi-Vass 5a4898450d
KAFKA-15649: Handle directory failure timeout (#15697)
A broker that is unable to communicate with the controller will shut down
after the configurable log.dir.failure.timeout.ms.

The implementation adds a new event to the Kafka EventQueue. This event
is deferred by the configured timeout and will execute the shutdown
if the heartbeat communication containing the failed log dir is still
pending with the controller.

Reviewers: Igor Soarez <soarez@apple.com>
2024-05-23 16:36:39 +01:00
Mickael Maison ab0cc72499
MINOR: Move parseCsvList to server-common (#16029)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-23 16:01:45 +02:00
Mickael Maison e4e1116156
MINOR: Move Throttler to storage module (#16023)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-22 18:47:31 +02:00
Krishna Agarwal 271c04bd17
KAFKA-15444: Native docker image for Apache Kafka (KIP-974) (#15927)
This PR aims to add Docker Image for GraalVM based Native Kafka Broker as per the following KIP - https://cwiki.apache.org/confluence/display/KAFKA/KIP-974%3A+Docker+Image+for+GraalVM+based+Native+Kafka+Broker

This PR adds the following functionalities:

- Ability to build the docker image for Native Apache Kafka
   - Dockerfile
   - Launch script
   - metadata configs required by graalVM native-image
- Add Kafka startup ability in the KafkaDockerWrapper.scala
- Ability to build and test the image - integrated with the existing JVM docker image framework.
2024-05-22 10:52:46 +05:30
Mickael Maison affe8da54c
KAFKA-7632: Support Compression Levels (KIP-390) (#15516)
Reviewers: Jun Rao <jun@confluent.io>, Luke Chen <showuon@gmail.com>
Co-authored-by: Lee Dongjin <dongjin@apache.org>
2024-05-21 17:58:49 +02:00
TaiJuWu 89083520ef
KAFKA-16654 Refactor kafka.test.annotation.Type and ClusterTestExtensions (#15916)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-21 22:29:06 +08:00
Nikhil Ramakrishnan b5a013e456
KAFKA-16513; Add test for WriteTxnMarkers with AlterCluster permission
In #15837, we introduced the change to allow calling the WriteTxnMarkers API with AlterCluster permissions. This PR proposes 2 enhancements:

- When a WriteTxnMarkers request is received, it is first authorized against the alter cluster permission. If the user does not have this permission, a 'deny' will be logged. However, if the user does have the cluster action permission, the request will be successfully authorized. Don't log the first deny, to avoid confusion.
- Add a `WriteTxnMarkersRequest` to be called from the test `testAuthorizationWithTopicExisting`, so that the request can be exercised and verified with both possible permissions.
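
A minimal Java sketch of the first enhancement. The `authorize` helper, the operation names as plain strings, and the in-memory audit log are assumptions made for illustration and are not Kafka's authorizer API. The alter-cluster check runs with deny logging turned off, so a denial is only logged if the cluster-action check also fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical two-step authorization; not Kafka's actual authorizer code. */
final class WriteTxnMarkersAuthSketch {
    /** Stand-in for the authorizer's audit log. */
    final List<String> auditLog = new ArrayList<>();

    /** Checks one operation and optionally records a denial. */
    boolean authorize(Set<String> granted, String operation, boolean logIfDenied) {
        boolean allowed = granted.contains(operation);
        if (!allowed && logIfDenied) {
            auditLog.add("deny " + operation);
        }
        return allowed;
    }

    /** Accepts either permission; only the last failing check is logged. */
    boolean authorizeWriteTxnMarkers(Set<String> granted) {
        // The first check is silent because CLUSTER_ACTION may still allow the request.
        return authorize(granted, "ALTER", false)
            || authorize(granted, "CLUSTER_ACTION", true);
    }
}
```

With only the cluster-action permission granted, the request is authorized and nothing is logged.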

Author: Nikhil Ramakrishnan <nikrmk@amazon.com>

Reviewers: Christo Lolov <lolovc@amazon.com>

Closes #15952 from nikramakrishnan/kip1037-addTest
2024-05-21 10:34:28 +01:00
Lianet Magrans 52b4596dae
KAFKA-16675: Refactored and new rebalance callbacks integration tests (#15965)
Move existing rebalance callback + consumer.position test to the PlaintextConsumerCallbackTest file (refactored to reuse the new helper funcs available)
Add new integration tests for callbacks interaction with seek and pause
Minor cleanup in the callbacks test file

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-21 10:40:57 +02:00
Nikolay c10bb58d1c
KAFKA-14588 [4/N] ConfigCommandTest rewritten in java (#15839)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-21 16:39:39 +08:00
David Jacot b4c2d66801
KAFKA-16770; [1/N] Coalesce records into bigger batches (#15964)
We have discovered during large scale performance tests that the current write path of the new coordinator does not scale well. The issue is that each write operation writes synchronously from the coordinator threads. Coalescing records into bigger batches helps drastically because it amortizes the cost of writes. Aligning the batches with the snapshots of the timelines data structures also reduces the number of in-flight snapshots.

This patch is the first of a series of patches that will bring records coalescing into the coordinator runtime. As a first step, we had to rework the PartitionWriter interface and move the logic to build MemoryRecords from it to the CoordinatorRuntime. The main changes are in these two classes. The others are related mechanical changes.
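
The idea of coalescing can be sketched roughly as follows. `BatchCoalescer`, its size-based flush rule, and the `writer` callback are hypothetical and only illustrate how buffering amortizes the cost of a write; the CoordinatorRuntime's real batching logic is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical accumulator: buffers records and hands them to the writer as one batch. */
final class BatchCoalescer<R> {
    private final int maxBatchSize;
    private final Consumer<List<R>> writer;  // stands in for the expensive log write
    private List<R> pending = new ArrayList<>();

    BatchCoalescer(int maxBatchSize, Consumer<List<R>> writer) {
        if (maxBatchSize <= 0) throw new IllegalArgumentException("maxBatchSize must be positive");
        this.maxBatchSize = maxBatchSize;
        this.writer = writer;
    }

    /** Buffers a record and writes only when the batch is full. */
    void append(R record) {
        pending.add(record);
        if (pending.size() >= maxBatchSize) {
            flush();
        }
    }

    /** Writes whatever is buffered as a single batch; does nothing if the buffer is empty. */
    void flush() {
        if (pending.isEmpty()) return;
        List<R> batch = pending;
        pending = new ArrayList<>();
        writer.accept(batch);
    }
}
```

Appending five records with a maximum batch size of two and then flushing results in three writes instead of five.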

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-20 23:47:09 -07:00
Gaurav Narula 95adb7bfbf
MINOR: ensure KafkaServerTestHarness::tearDown is always invoked (#15996)
An exception thrown while closing the client instances in `IntegrationTestHarness::tearDown` may result in `KafkaServerTestHarness::tearDown` not being invoked. This would result in thread leaks of the broker and controller threads spawned in the failing test.

An example of this is the [CI run](https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-15994/1/tests) for #15994 where `Build / JDK 8 and Scala 2.12 / testCoordinatorFailover(String, String).quorum=kraft+kip848.groupProtocol=consumer – kafka.api.PlaintextConsumerTest` failing results in `consumers.foreach(_.close(Duration.ZERO))` in `IntegrationTestHarness::tearDown` throwing an exception.

A side effect is that this poisons the Gradle test runner JVM and prevents tests in other, unrelated classes from executing, because the `@BeforeAll` check in QuorumTestHarness causes them to fail immediately.

This PR encloses the client closure in try-finally to ensure `KafkaServerTestHarness::tearDown` is always invoked.
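
The pattern looks roughly like this in Java (the harness itself is Scala). `TearDownSketch` and its parameters are hypothetical stand-ins for the client list and the parent tear-down, not the actual harness code.

```java
import java.util.List;

/** Hypothetical illustration of the try-finally pattern. */
final class TearDownSketch {
    /** Closes every client, then always runs the parent tear-down, even if a close throws. */
    static void tearDown(List<AutoCloseable> clients, Runnable parentTearDown) throws Exception {
        try {
            for (AutoCloseable client : clients) {
                client.close();
            }
        } finally {
            // Runs whether or not closing a client failed, so broker threads are not leaked.
            parentTearDown.run();
        }
    }
}
```

The exception from the failing close still propagates to the caller, so the test is reported as failed, but the parent clean-up has already run.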

Reviewers: Nikhil Ramakrishnan <nikrmk@amazon.com>, Igor Soarez <soarez@apple.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-21 03:10:08 +08:00
Gaurav Narula 412b05df00
KAFKA-16789 Fix thread leak detection for event handler threads (#15984)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-19 18:21:56 +08:00
Justine Olshan 3e15ab98ec
KAFKA-16992: InvalidRequestException: ADD_PARTITIONS_TO_TXN with version 4 which is not enabled when upgrading from kafka (#15971)
We weren't enabling discoverBrokerVersions to check the supported versions in the AddPartitionsToTxnManager. This means that any verification request (or any AddPartitionsToTxnRequest version) from a newer broker would fail when sending to an older broker.

The bulk of this change is adding additional transactions system tests for old versions.
One test upgrades the cluster completely. This didn't catch the issue but could be useful.

The other test forces a new broker to send a verification request to an older one. Without the discoverBrokerVersions change, all tests between mixed brokers failed. (We introduced a new request version in 3.8 -- which is a separate version from the one that caused the bug for 3.5 -> 3.6) With the addition, the tests all passed.

I also manually ran a test for 3.5 -> 3.6 since the issue there was slightly different and was caused by the unstableLatestVersion flag being enabled. This change should fix this as well. 👍

Reviewers:  David Jacot <djacot@confluent.io>
2024-05-17 21:35:28 -07:00
Mickael Maison 22f5113dba
KAFKA-15723 KRaft support in ListOffsetsRequestTest (#15980)
Co-authored-by: Zihao Lin <104664078@163.com>

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-18 03:54:18 +08:00
Ken Huang 7fea279ff9
KAFKA-16763 Upgrade to scala 2.12.19 and scala 2.13.14 (#15958)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-18 00:36:38 +08:00
José Armando García Sancio 056d232f4e
KAFKA-16526; Quorum state data version 1 (#15859)
Allow KRaft replicas to read and write version 0 and 1 of the quorum-state file. Which version is written is controlled by the kraft.version. With kraft.version 0, version 0 of the quorum-state file is written. With kraft.version 1, version 1 of the quorum-state file is written. Version 1 of the quorum-state file adds the VotedDirectoryId field and removes the CurrentVoters. The other fields removed in version 1 are not important as they were not overwritten or used by KRaft.

In kraft.version 1 the set of voters will be stored in the kraft partition log segments and snapshots.

To implement this feature the following changes were made to KRaft.

FileBasedStateStore was renamed to FileQuorumStateStore to better match the name of the implemented interface QuorumStateStore.

The QuorumStateStore::writeElectionState was extended to include the kraft.version. This version is used to determine which version of QuorumStateData to store. When writing version 0 the VotedDirectoryId is not persisted but the latest value is kept in-memory. This allows replicas to vote consistently while they stay online. If a replica restarts in the middle of an election it will forget the VotedDirectoryId if the kraft.version is 0. This should be rare in practice and should only happen if there is an election and failure while the system is upgrading to kraft.version 1.

The type ElectionState, the interface EpochState and all of the implementations of EpochState (VotedState, UnattachedState, FollowerState, ResignedState, CandidateState and LeaderState) are extended to support the new voted directory id.

The type QuorumState is changed so that the local directory id is used. The type is also changed so that the latest values for the set of voters and the kraft version are queried from the KRaftControlRecordStateMachine.

The replica directory id is read from the meta.properties and passed to the KafkaRaftClient. The replica directory id is guaranteed to be set in the local replica.

Adds a new metric for current-vote-directory-id which exposes the latest in-memory value of the voted directory id.

Renames VoterSet.VoterKey to ReplicaKey.

It is important to note that after this change, version 1 of the quorum-state file will not be written by kraft controllers and brokers. This change adds support for reading and writing version 1 of the file in preparation for future changes.
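
The persistence rule for the VotedDirectoryId described above can be modelled with a small sketch. `ElectionStateStoreSketch` is hypothetical: it uses two in-memory fields in place of the quorum-state file and is not the FileQuorumStateStore implementation.

```java
import java.util.Optional;

/** Hypothetical model of what survives a restart under each kraft.version. */
final class ElectionStateStoreSketch {
    private Optional<String> onDisk = Optional.empty();    // what a restart can recover
    private Optional<String> inMemory = Optional.empty();  // latest value while online

    /** Records a vote; only file version 1 (kraft.version 1) has a VotedDirectoryId field. */
    void writeElectionState(String votedDirectoryId, short kraftVersion) {
        inMemory = Optional.of(votedDirectoryId);
        onDisk = kraftVersion >= 1 ? Optional.of(votedDirectoryId) : Optional.empty();
    }

    /** The value used for voting while the replica stays online. */
    Optional<String> votedDirectoryId() {
        return inMemory;
    }

    /** Simulates a restart: in-memory state is rebuilt from what was persisted. */
    void restart() {
        inMemory = onDisk;
    }
}
```

With kraft.version 0 the value is available until `restart()` is called and is empty afterwards; with kraft.version 1 it survives the restart.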

Reviewers: Jun Rao <junrao@apache.org>
2024-05-16 09:53:36 -04:00
Chia-Ping Tsai 2c51594607
MINOR: rewrite TopicBasedRemoteLogMetadataManagerTest by ClusterTestExtensions (#15917)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>
2024-05-16 21:26:08 +08:00
Nikolay 7b1fe33d01
KAFKA-14588 [3/N] ConfigCommandTest rewritten in java (#15930)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-16 18:06:59 +08:00
Johnny Hsu dac569b967
KAFKA-16668 Add tags support in ClusterTestExtension (#15861)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-16 18:02:13 +08:00
Mickael Maison 5da4b238d6
MINOR: Remove unused method in ToolsUtils (#15967)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-16 15:40:50 +08:00
Chia-Ping Tsai aca5d249d6
MINOR: revisit LogValidatorTest#checkRecompression (#15948)
Reviewers: Jun Rao <junrao@apache.org>
2024-05-16 02:26:43 +08:00
Lucas Brutschy c218c4e1b5
KAFKA-16287: Implement example tests for common rebalance callback (#15408)
We need to add an example test to the PlaintextConsumerTest that tests a common
ConsumerRebalanceListener use case. For example, create an integration test that
invokes the Consumer API to commit offsets in the onPartitionsRevoked callback.

This test is implemented in a reasonably general way with a view to using it as a
template from which other tests can be created later. Eventually we will need to
have a comprehensive set of tests that cover all the basic use cases.

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Bruno Cadonna <cadonna@apache.org>
2024-05-15 11:12:21 +02:00
David Jacot bf88013a28
MINOR: Rename `Record` to `CoordinatorRecord` (#15949)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-15 13:57:19 +08:00
Kuan-Po (Cooper) Tseng d59336a615
MINOR: Use ClusterTemplate in ApiVersionsRequestTest (#15936)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-14 21:19:57 +08:00
Greg Harris de105a8c14
KAFKA-16703 Close serverChannel in SocketServer if unable to bind to a port (#15923)
Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-14 14:47:00 +08:00
Max Riedel d61b34f2a7
KAFKA-14509: [3/4] Add integration test for consumerGroupDescribe API (#15727)
This patch adds integration tests for consumerGroupDescribe API.

Reviewers: David Jacot <djacot@confluent.io>
2024-05-13 11:32:50 -07:00
David Jacot f9169b7d3a
KAFKA-16735; Deprecate offsets.commit.required.acks (#15931)
This patch deprecates `offsets.commit.required.acks` in Apache Kafka 3.8 as described in KIP-1041: https://cwiki.apache.org/confluence/x/9YobEg.

Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-13 11:30:34 -07:00
Nikolay 6161fd0db2
KAFKA-14588 [2/N] ConfigCommandTest rewritten in java (#15873)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-13 19:45:28 +08:00
PoAn Yang 334d5d58bb
KAFKA-16677 Replace ClusterType#ALL and ClusterType#DEFAULT by Array (#15897)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-13 14:24:59 +08:00
Gaurav Narula 47841e0bb9
KAFKA-9401 Reduce contention for Fetch requests (#15836)
KIP-227 introduced in-memory caching of FetchSessions. Brokers with a large number of Fetch requests suffer from contention on trying to acquire a lock on FetchSessionCache.

This change aims to reduce lock contention for FetchSessionCache by sharding the cache into multiple segments, each responsible for an equal range of sessionIds. Assuming Fetch requests have a uniform distribution of sessionIds, the probability of contention on a segment is reduced by a factor of the number of segments.

We ensure backwards compatibility by ensuring that the total number of cache entries remains the same as configured and that sessionIds are randomly allocated.
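
A minimal sketch of range-based sharding, assuming non-negative int session ids. `ShardedSessionCache` and its methods are hypothetical and are not the FetchSessionCache code; eviction and the per-shard entry limit that keeps the total size unchanged are omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cache split into shards, each with its own lock. */
final class ShardedSessionCache<V> {
    private final List<Map<Integer, V>> shards = new ArrayList<>();
    private final long idsPerShard;

    ShardedSessionCache(int numShards) {
        if (numShards <= 0) throw new IllegalArgumentException("numShards must be positive");
        // Ceiling of 2^31 / numShards, so Integer.MAX_VALUE still maps to the last shard.
        this.idsPerShard = ((1L << 31) + numShards - 1) / numShards;
        for (int i = 0; i < numShards; i++) {
            shards.add(new HashMap<>());
        }
    }

    /** Each shard owns an equal, contiguous range of session ids. */
    int shardIndex(int sessionId) {
        if (sessionId < 0) throw new IllegalArgumentException("sessionId must be non-negative");
        return (int) (sessionId / idsPerShard);
    }

    V get(int sessionId) {
        Map<Integer, V> shard = shards.get(shardIndex(sessionId));
        synchronized (shard) {  // contention is limited to requests that hit the same shard
            return shard.get(sessionId);
        }
    }

    void put(int sessionId, V value) {
        Map<Integer, V> shard = shards.get(shardIndex(sessionId));
        synchronized (shard) {
            shard.put(sessionId, value);
        }
    }
}
```

Because two requests only contend when their ids fall in the same range, uniformly distributed ids spread lock acquisitions evenly across the shards.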

Reviewers: Igor Soarez <soarez@apple.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-11 23:19:59 +08:00
Chia-Ping Tsai 4dff60df67
MINOR: fix LogValidatorTest#checkNonCompressed (#15904)
Reviewers: Jun Rao <junrao@apache.org>
2024-05-11 09:50:52 +08:00
Sid Yagnik ef7b48e66a
Allowing WriteTxnMarkers API to run with AlterCluster permissions (#15837)
https://issues.apache.org/jira/browse/KAFKA-16513

https://cwiki.apache.org/confluence/display/KAFKA/KIP-1037%3A+Allow+WriteTxnMarkers+API+with+Alter+Cluster+Permission

Reviewers: Christo Lolov <christo_lolov@yahoo.com>,  Luke Chen <showuon@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-05-10 15:30:57 -07:00
Greg Harris 4e4f7d3231
KAFKA-15804: Close SocketServer channels when calling shutdown before enableRequestProcessing (#14729)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Chris Egerton <chrise@aiven.io>, hudeqi <1217150961@qq.com>, Qichao Chu <qichao@uber.com>
2024-05-10 13:56:50 -07:00
Mario Pareja 147ea55dfe
MINOR: correct KAFKA_HEAP_OPTS server property in KafkaDockerWrapper (#15345)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Vedarth Sharma <142404391+VedarthConfluent@users.noreply.github.com>
2024-05-10 21:59:40 +05:30
Kuan-Po (Cooper) Tseng 58c7369be2
KAFKA-16660 reduce the check interval to speedup DelegationTokenRequestsTest (#15907)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-10 14:00:12 +08:00
Kuan-Po (Cooper) Tseng 7e9ab4b2c6
KAFKA-16484 Support to define per broker/controller property by ClusterConfigProperty (#15715)
Introduce a new field `id` in the annotation ClusterConfigProperty. The main purpose of the new field is to define a property for a specific broker/controller (KRaft). The default value is -1, which means the ClusterConfigProperty applies to all brokers/controllers.

Note that under Type.KRAFT mode, the controller id starts from 3000 and then increments by one each time. In other modes, the broker/controller id starts from 0 and then increments by one.
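
How the `id` field could select properties per node is sketched below. `PerNodePropertySketch`, its `Property` class, and the rule that a node-specific value overrides a cluster-wide one are assumptions of this sketch, not the test framework's actual resolution code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical resolution of per-node properties. */
final class PerNodePropertySketch {
    /** Hypothetical mirror of the annotation's fields. */
    static final class Property {
        final int id;
        final String key;
        final String value;

        Property(int id, String key, String value) {
            this.id = id;
            this.key = key;
            this.value = value;
        }
    }

    /** Returns the properties that apply to the given node; id -1 applies to every node. */
    static Map<String, String> resolve(List<Property> props, int nodeId) {
        Map<String, String> resolved = new LinkedHashMap<>();
        // First pass: cluster-wide properties.
        for (Property p : props) {
            if (p.id == -1) resolved.put(p.key, p.value);
        }
        // Second pass: node-specific values override cluster-wide ones (assumption).
        for (Property p : props) {
            if (p.id == nodeId) resolved.put(p.key, p.value);
        }
        return resolved;
    }
}
```

Under this sketch, a property declared with id 3000 would only reach the first controller in Type.KRAFT mode, while a property with id -1 reaches every node.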

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-10 10:31:45 +08:00
Sanskar Jhajharia c64a315fd5
MINOR: Made the supportedOperation variable name more verbose (#15892)
As part of 2e8d69b78c, we introduced the TransactionAbortableException in AK. On closer analysis, we found that the name of the enum SupportedOperation was a bit misleading. This change renames it to TransactionSupportedOperation to give a clearer, better-defined function signature.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-08 10:14:12 -07:00
Jorge Esteban Quilcate Otoya 2a5efe4a33
KAFKA-16685: Add parent exception to RLMTask warning logs (#15880)
Reviewers: Josep Prat <josep.prat@aiven.io>
2024-05-08 14:27:03 +02:00
TingIāu "Ting" Kì f74f596bc7
KAFKA-16640 Replace TestUtils#resource by scala.util.Using (#15881)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-08 15:56:27 +08:00
Kamal Chandraprakash 8655094e6c
KAFKA-16511: Fix the leaking tiered segments during segment deletion (#15817)
When there are overlapping segments in the remote storage, then the deletion may fail to remove the segments due to isRemoteSegmentWithinLeaderEpochs check. Once the deletion starts to fail for a partition, then segments won't be eligible for cleanup. The one workaround that we have is to move the log-start-offset using the kafka-delete-records script.

Reviewers: Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
2024-05-08 15:21:23 +08:00
TingIāu "Ting" Kì a0f1658bb1
KAFKA-16678 Remove variable "unimplementedquorum" (#15879)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-08 12:30:34 +08:00
Lianet Magrans ea485a7061
KAFKA-16665: Allow to initialize newly assigned partition's positions without allowing fetching while callback runs (#15856)
This fix allows positions to be initialized for newly assigned partitions while the onPartitionsAssigned callback is running, even though the partitions remain non-fetchable until the callback completes.

Before this PR, neither initialization nor fetching was allowed while the callback was running. The fix only allows the position of a newly assigned partition to be initialized, and keeps the existing logic that makes sure the partition remains non-fetchable until the callback completes.

The need for this fix surfaced in one of the Connect system tests, which attempts to retrieve the position of a newly assigned partition by calling consumer.position from within the onPartitionsAssigned callback (WorkerSinkTask). With this PR, such calls are allowed (test added), which matches the behaviour of the legacy consumer.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-07 10:40:00 +02:00
Dongnuo Lyu 459eaec666
KAFKA-16615; JoinGroup API for upgrading ConsumerGroup (#15798)
The patch implements the JoinGroup API for the new consumer groups. It allows members using the classic rebalance protocol with the consumer embedded protocol to join a new consumer group.

Reviewers: David Jacot <djacot@confluent.io>
2024-05-06 23:59:10 -07:00
TingIāu "Ting" Kì 0de3b7c40b
KAFKA-16593 Rewrite DeleteConsumerGroupsTest by ClusterTestExtensions (#15766)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-07 14:04:32 +08:00
David Jacot 0df340d64d
KAFKA-16470 kafka-dump-log --offsets-decoder should support new records (#15652)
When the consumer group protocol is used in a cluster, it is, at the moment, impossible to see all records stored in the __consumer_offsets topic with kafka-dump-log --offsets-decoder. It does not know how to handle all the new records.

This patch refactors the OffsetsMessageParser used internally by kafka-dump-log to use the RecordSerde used by the new group coordinator. It ensures that the tool is always in sync with the coordinator implementation. The patch also changes the format to use the toString'ed representations of the records instead of having custom logic to dump them. It ensures that all the information is always dumped. The downside of the latter is that inner byte arrays (e.g. assignment in the classic protocol) are no longer deserialized. Personally, I feel that it is acceptable and that it is actually better to stay as close as possible to the actual records in this tool. It also avoids issues like https://issues.apache.org/jira/browse/KAFKA-15603.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-07 08:49:31 +08:00
David Arthur fe8ccbc92c
KAFKA-16539 Fix IncrementalAlterConfigs during ZK migration (#15744)
This patch fixes two issues with IncrementalAlterConfigs and the ZK migration. First, it changes the handling of IncrementalAlterConfigs to check if the controller is ZK vs KRaft and only forward for KRaft. Second, it adds a check in KafkaZkClient#setOrCreateEntityConfigs to ensure a ZK broker is not directly modifying configs in ZK if there is a KRaft controller. This closes the race condition between KRaft taking over as the active controller and the ZK brokers learning about this.

*Forwarding*

During the ZK migration, there is a time when the ZK brokers are running with migrations enabled, but KRaft has yet to take over as the controller. Prior to KRaft taking over as the controller, the ZK brokers in migration mode were unconditionally forwarding IncrementalAlterConfigs (IAC) to the ZK controller. This works for some config types, but breaks when setting BROKER and BROKER_LOGGER configs for a specific broker. The behavior in KafkaApis for IAC was to always forward if the forwarding manager was defined. Since ZK brokers in migration mode have forwarding enabled, the forwarding would happen, and the special logic for BROKER and BROKER_LOGGER would be missed, causing the request to fail.

With this fix, the IAC handler will check if the controller is KRaft or ZK and only forward for KRaft.

*Protected ZK Writes*

As part of KIP-500, we moved most (but not all) ZK mutations to the ZK controller. One of the things we did not move fully to the controller was entity configs. This is because there was some special logic that needed to run on the broker for certain config updates. If a broker-specific config was set, AdminClient would route the request to the proper broker. In KRaft, we have a different mechanism for handling broker-specific config updates.

Leaving this ZK update on the broker side would be okay if we were guarding writes on the controller epoch, but it turns out KafkaZkClient#setOrCreateEntityConfigs does unprotected "last writer wins" updates to ZK. This means a ZK broker could update the contents of ZK after the metadata had been migrated to KRaft. No good! To fix this, this patch adds a check on the controller epoch to KafkaZkClient#setOrCreateEntityConfigs but also adds logic to fail the update if the controller is a KRaft controller.

The new logic in setOrCreateEntityConfigs adds STALE_CONTROLLER_EPOCH as a new exception that can be thrown while updating configs.
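
The guard described above amounts to something like the following sketch. The class, the exception type, and all three parameters are hypothetical; the real check lives in KafkaZkClient#setOrCreateEntityConfigs and reads the controller state from ZK.

```java
/** Hypothetical guard for broker-side config writes during the ZK migration. */
final class EntityConfigWriteGuardSketch {
    /** Hypothetical exception standing in for the STALE_CONTROLLER_EPOCH error. */
    static final class StaleControllerEpochException extends RuntimeException {
        StaleControllerEpochException(String message) {
            super(message);
        }
    }

    /** Rejects the write if a KRaft controller is active or the controller epoch has moved on. */
    static void checkWriteAllowed(int epochSeenByBroker, int epochInZk, boolean kraftControllerActive) {
        if (kraftControllerActive) {
            throw new StaleControllerEpochException(
                "KRaft controller is active; a ZK broker must not write configs to ZK");
        }
        if (epochSeenByBroker != epochInZk) {
            throw new StaleControllerEpochException(
                "controller epoch changed from " + epochSeenByBroker + " to " + epochInZk);
        }
    }
}
```

Both conditions replace the previous "last writer wins" behaviour with a failure that the caller can surface.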

Reviewers:  Luke Chen <showuon@gmail.com>, Akhilesh Chaganti <akhileshchg@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-07 08:29:57 +08:00
Nikolay 6a8977e212
KAFKA-14588 [3/N] ConfigCommandTest rewritten in java (#15850)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-06 18:44:34 +08:00
Chia Chuan Yu 55a00be4e9
MINOR: Replaced Utils.join() with JDK API. (#15823)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-06 15:13:01 +08:00
PoAn Yang 970ac07881
KAFKA-16659 KafkaConsumer#position() does not respect wakeup when group protocol is CONSUMER (#15853)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-06 08:45:11 +08:00
Johnny Hsu 25118cec14
MINOR: remove redundant check in KafkaClusterTestKit (#15858)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-05 11:47:40 +08:00
José Armando García Sancio bfe81d6229
KAFKA-16207; KRaft's internal log listener to update voter set (#15671)
Adds support for the KafkaRaftClient to read the control records KRaftVersionRecord and VotersRecord in the snapshot and log. As the control records in the KRaft partition are read, the replica's known set of voters are updated. This change also contains the necessary changes to include the control records when a snapshot is generated by the KRaft state machine.

It is important to note that this commit changes the code and the in-memory state to track the sets of voters but it doesn't change any data that is externally exposed. It doesn't change the RPCs, data stored on disk or configuration.

When the KRaft replica starts, the PartitionListener reads the latest snapshot and then the log segments up to the LEO, updating the in-memory state as it reads KRaftVersionRecord and VotersRecord. When the replica (leader and follower) appends to the log, the PartitionListener catches up to the new LEO. When the replica truncates the log because of a diverging epoch, the PartitionListener also truncates the in-memory state to the new LEO. When the state machine generates a new snapshot, the PartitionListener trims any prefix entries that are not needed. This is all done to minimize the amount of data tracked in-memory and to make sure that it matches the state on disk.

To implement the functionality described above this commit also makes the following changes:

Adds control records for KRaftVersionRecord and VotersRecord. KRaftVersionRecord describes the finalized kraft.version supported by all of the replicas. VotersRecord describes the set of voters at a specific offset.

Changes Kafka's feature version to support 0 as the smallest valid value. This is needed because the default value for kraft.version is 0.

Refactors FileRawSnapshotWriter so that it doesn't directly call the onSnapshotFrozen callback. It adds NotifyingRawSnapshotWriter for calling such callbacks. This reorganization is needed because in this change both the KafkaMetadataLog and the KafkaRaftClient need to react to snapshots getting frozen.

Cleans up KafkaRaftClient's initialization. Removes initialize from RaftClient - this is an implementation detail that doesn't need to be exposed in the interface. Removes RaftConfig.AddressSpec and simplifies the bootstrapping of the static voter's address. The bootstrapping of the address is delayed because of tests. We should be able to simplify this further in future commits.

Update the DumpLogSegment CLI to support the new control records KRaftVersionRecord and VotersRecord.

Fix the RecordsSnapshotReader implementations so that the iterator includes control records. RecordsIterator is extended to support reading the new control records.

Improve the BatchAccumulator implementation to allow multiple control records in one control batch. This is needed so that KRaft can make sure that VotersRecord is included in the same batch as the control record (KRaftVersionRecord) that upgrades the kraft.version to 1.

Add a History interface and default implementation TreeMapHistory. This is used to track all of the sets of voters between the latest snapshot and the LEO. This is needed so that KafkaRaftClient can query for the latest set of voters and so that KafkaRaftClient can include the correct set of voters when the state machine generates a new snapshot at a given offset.

Add a builder pattern for RecordsSnapshotWriter. The new builder pattern also implements including the KRaftVersionRecord and VotersRecord control records in the snapshot as necessary. A KRaftVersionRecord should be appended if the kraft.version is greater than 0 at the snapshot's offset. Similarly, a VotersRecord should be appended to the snapshot with the latest value up to the snapshot's offset.
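
The offset-indexed history described above (the History interface and TreeMapHistory) can be approximated with a `java.util.TreeMap`. The class and method names below are hypothetical and the semantics are a guess at the behaviour the message describes: look up the latest value at or before an offset, drop the suffix on log truncation, and trim the prefix when a snapshot is generated.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical offset-indexed history of values such as voter sets. */
final class OffsetHistorySketch<T> {
    private final TreeMap<Long, T> history = new TreeMap<>();

    /** Records the value that takes effect at the given offset. */
    void addAt(long offset, T value) {
        history.put(offset, value);
    }

    /** Latest value recorded at or before the given offset. */
    Optional<T> valueAtOrBefore(long offset) {
        return Optional.ofNullable(history.floorEntry(offset)).map(Map.Entry::getValue);
    }

    /** Latest value overall. */
    Optional<T> lastValue() {
        return Optional.ofNullable(history.lastEntry()).map(Map.Entry::getValue);
    }

    /** Log truncation: drop everything recorded at or after the new end offset. */
    void truncateNewEntries(long endOffset) {
        history.tailMap(endOffset, true).clear();
    }

    /** New snapshot: trim the prefix but keep the entry still in effect at startOffset. */
    void truncateOldEntries(long startOffset) {
        Long keep = history.floorKey(startOffset);
        if (keep != null) {
            history.headMap(keep, false).clear();
        }
    }
}
```

Keeping the floor entry in `truncateOldEntries` means a query for the voter set at the snapshot's offset still succeeds after older entries are trimmed.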

Reviewers: Jason Gustafson <jason@confluent.io>
2024-05-04 12:43:16 -07:00
Kirk True 9b8aac22ec
KAFKA-16427 KafkaConsumer#position() does not respect timeout when group protocol is CONSUMER (#15843)
The AsyncKafkaConsumer implementation of position(TopicPartition, Duration) was not updating its internal Timer, causing it to execute the loop forever. Adding a call to update the Timer at the bottom of the loop fixes the issue.

An integration test was added to catch this case; it fails without the newly added call to Timer.update(long).
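
The failure mode can be illustrated with a minimal sketch. `DeadlineTimer` is a hypothetical stand-in that, like the timer described above, only observes elapsed time when `update()` is called; it is not Kafka's Timer class, and `awaitUntil` is not the AsyncKafkaConsumer loop.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/** Hypothetical deadline timer that caches the current time. */
final class DeadlineTimer {
    private final LongSupplier clock;  // source of "now" in milliseconds
    private final long deadlineMs;
    private long cachedNowMs;          // only refreshed by update()

    DeadlineTimer(LongSupplier clock, long timeoutMs) {
        this.clock = clock;
        this.cachedNowMs = clock.getAsLong();
        this.deadlineMs = cachedNowMs + timeoutMs;
    }

    /** Re-reads the clock; without this call the timer never expires. */
    void update() {
        cachedNowMs = clock.getAsLong();
    }

    boolean notExpired() {
        return cachedNowMs < deadlineMs;
    }
}

final class PositionLoopSketch {
    /** Polls until the condition holds or the timer expires; returns whether it held. */
    static boolean awaitUntil(BooleanSupplier ready, DeadlineTimer timer) {
        do {
            if (ready.getAsBoolean()) {
                return true;
            }
            timer.update();  // the kind of call the fix adds at the bottom of the loop
        } while (timer.notExpired());
        return false;
    }
}
```

Removing the `timer.update()` line makes `notExpired()` return true forever, which is the infinite loop the patch describes.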

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-04 10:29:27 +08:00
Alyssa Huang 1fd39150aa
KAFKA-16655: Deflake ZKMigrationIntegrationTest.testDualWrite (#15845)
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Johnny Hsu <44309740+johnnychhsu@users.noreply.github.com>
2024-05-03 10:44:37 -07:00
PoAn Yang 87390f961f
KAFKA-16572 allow defining number of disks per broker in ClusterTest (#15745)
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-03 14:24:59 +08:00
Nikolay cdc4caa578
KAFKA-14588 UserScramCredentialsCommandTest rewritten in Java (#15832)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Igor Soarez <soarez@apple.com>
2024-05-02 10:35:10 +01:00
Kuan-Po (Cooper) Tseng 89d8045a15
KAFKA-16647 Remove setMetadataDirectory from BrokerNode/ControllerNode (#15833)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-02 09:04:15 +08:00
TaiJuWu d9c36299db
KAFKA-16614 Disallow @ClusterTemplate("") (#15800)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-02 07:15:22 +08:00
PoAn Yang 4825c89d14
KAFKA-16588 broker shutdown hangs when log.segment.delete.delay.ms is zero (#15773)
Instead of staying pending forever, this PR invokes the next schedule after 1 ms. However, the side effect is busy-waiting. Hence, this PR also updates the docs to remind users about the issue with small values of log.segment.delete.delay.ms.
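
One way to picture the change is a delay computation with a 1 ms floor. `nextDelayMs` and its parameters are hypothetical; the actual scheduling code in the log manager is not reproduced here.

```java
/** Hypothetical computation of the delay before the next delete pass. */
final class DeleteDelaySketch {
    /** Never returns 0 or less, so the delete task is always rescheduled. */
    static long nextDelayMs(long configuredDelayMs, long elapsedSinceScheduledMs) {
        long remaining = configuredDelayMs - elapsedSinceScheduledMs;
        // With a configured delay of 0 this yields 1 ms, which is the busy-waiting side effect.
        return Math.max(remaining, 1L);
    }
}
```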

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-01 17:11:20 +08:00
Ken Huang da5f4424dc
MINOR: Clean up TestUtils.scala (#15808)
This PR does the following cleanup for TestUtils.scala:

1) remove unused methods
2) move methods used by single test class out of

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-01 04:13:29 +08:00
Kuan-Po (Cooper) Tseng 6d436a8f98
KAFKA-16627 Remove ClusterConfig parameter in BeforeEach and AfterEach (#15824)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-30 08:40:28 +08:00
Johnny Hsu 78c7f08e20
MINOR: Reuse KafkaConfig to create MetadataLogConfig (#15788)
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Kuan-Po (Cooper) Tseng <brandboat@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-04-30 08:04:16 +08:00
Johnny Hsu 150a78ab90
KAFKA-15897 fix kafka.server.ControllerRegistrationManagerTest#testWrongIncarnationId (#15828)
ControllerRegistrationManagerTest is flaky due to the poll in L221. The potential root cause is a race condition between the first poll (L221) and the second poll (L229). Before the second poll, we mock a response (L226), which should be processed by the second poll. However, if the first poll takes this response away, the second poll gets nothing, which could lead to an error.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-30 07:55:12 +08:00
Nikolay 81c24d6bf8
KAFKA-15588 ConfigCommandIntegrationTest rewritten in java (#15645)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-30 01:46:08 +08:00
Omnia Ibrahim e1bfaec49d
KAFKA-15853 Move metrics configs out of KafkaConfig (#15822)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-30 01:19:05 +08:00
Kuan-Po (Cooper) Tseng 5de5d967ad
KAFKA-16560 Refactor/cleanup BrokerNode/ControllerNode/ClusterConfig (#15761)
* Make ClusterConfig immutable
* Make BrokerNode immutable
* Refactor out build argument in ControllerNode
* Add setPrefix and replace put property with set map in ClusterConfig
* Remove rollingBrokerRestart from ClusterInstance interface
* Refactor KRaftClusterTest#doOnStartedKafkaCluster

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-28 02:00:56 +08:00
TaiJuWu 4060d4370e
KAFKA-6527 Enable DynamicBrokerReconfigurationTest.testDefaultTopicConfig (#15796)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-27 08:00:29 +08:00
Omnia Ibrahim d88c15fc3e
KAFKA-15853 Move KRAFT configs out of KafkaConfig (#15775)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-27 07:02:31 +08:00
Gaurav Narula 025f9816f1
MINOR: fix javadoc warnings (#15527)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-26 08:31:52 +08:00
Omnia Ibrahim 6feae817d2
MINOR: Rename RaftConfig to QuorumConfig (#15797)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-26 03:08:31 +08:00
TaiJuWu ce9026f597
MINOR: Modified System.getProperty("line.separator") to java.lang.System.lineSeparator() (#15782)
Reviewers: Igor Soarez  <soarez@apple.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-04-26 02:32:11 +08:00
Mickael Maison 0a6d5ff23c
MINOR: Various cleanups in core (#15786)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
2024-04-25 16:45:00 +02:00
TingIāu "Ting" Kì 864744ffd4
KAFKA-16610 Replace "Map#entrySet#forEach" by "Map#forEach" (#15795)
Reviewers: Apoorv Mittal <amittal@confluent.io>, Igor Soarez <soarez@apple.com>
2024-04-25 01:52:24 +01:00
PoAn Yang 81c222e977
KAFKA-16613 remove TestUtils#subscribeAndWaitForRecords (#15794)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-25 02:15:21 +08:00
Kamal Chandraprakash a8c0f2b98f
KAFKA-16605: Fix the flaky LogCleanerParameterizedIntegrationTest (#15787)
Even if the log start offset is updated, the log deletion might not have completed yet. This change makes the test more robust.

Reviewers: Luke Chen <showuon@gmail.com>
2024-04-24 17:52:52 +08:00
PoAn Yang a38185280c
KAFKA-16424: remove truncated logs after alter dir (#15616)
If some logs need to be deleted during the log dir movement, we hand the deletion to a scheduler to run later.
However, once the log dir movement completes and the future log is renamed, the async log deletion fails because the file no longer exists.

Signed-off-by: PoAn Yang <payang@apache.org>

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, SoontaekLim <soontaek.lim@neya.kr>, Johnny Hsu <johnnyhsu@fb.com>
2024-04-24 17:51:29 +08:00
Omnia Ibrahim cfe5ab5cf2
KAFKA-15853 Move quota configs into server-common package (#15774)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-04-24 13:05:18 +08:00
Omnia Ibrahim 1b301b3020
KAFKA-15853 Move socket configs into org.apache.kafka.network.SocketServerConfigs (#15772)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-23 17:39:36 +08:00
Ken Huang fb529d8966
KAFKA-16548 Avoid decompressing/collecting all records when all we want to do is to find a single matched record from remote storage (#15765)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-22 21:23:11 +08:00
charliecheng630 1763fe19dd
KAFKA-16549 suppress the warnings from RemoteLogManager (#15767)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-22 20:42:17 +08:00
Lucas Brutschy ed47e37b28
KAFKA-16103: AsyncConsumer should await pending async commits on commitSync and close (#15613)
The javadoc for KafkaConsumer.commitSync says:

Note that asynchronous offset commits sent previously with the {@link #commitAsync(OffsetCommitCallback)}
(or similar) are guaranteed to have their callbacks invoked prior to completion of this method.

This is not always true in the async consumer, where there is no code at all to make sure that the callback is executed before commitSync returns.

Similarly, the async consumer is also missing logic to await callback execution in close. While the javadoc doesn't explicitly promise callback execution, it promises "completing commits", which one would reasonably expect to include callback execution. Also, the legacy consumer contains some code to execute callbacks before closing.

This change proposes a number of fixes to clean up the callback execution guarantees in the async consumer:

- We keep track of the incomplete async commit futures and wait for them to complete before returning from commitSync or close (if there is time).
- Since we need to block to make sure that our previous commits are completed, we allow the consumer to wake up.
- Some similar gaps are addressed in the legacy consumer, see #15693

Testing
Two new integration tests and a couple of unit tests.
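
The tracking of incomplete async commit futures described in this commit can be sketched roughly as follows. This is a minimal illustration only, not the code from the patch: the class `PendingCommitTracker` and its methods `track`, `pendingCount`, and `awaitPending` are hypothetical names, and plain `CompletableFuture<Void>` objects stand in for whatever the async consumer uses internally. A synchronous commit or a close would call `awaitPending` with its remaining time budget before returning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not part of the Kafka code base.
final class PendingCommitTracker {
    private final List<CompletableFuture<Void>> pending = new ArrayList<>();

    // Remember an in-flight async commit. Already completed futures are pruned first.
    synchronized CompletableFuture<Void> track(CompletableFuture<Void> commit) {
        pending.removeIf(CompletableFuture::isDone);
        pending.add(commit);
        return commit;
    }

    // Number of tracked commits that have not completed yet.
    synchronized int pendingCount() {
        pending.removeIf(CompletableFuture::isDone);
        return pending.size();
    }

    // Wait until every tracked commit has completed (successfully or not),
    // or until the timeout elapses. Returns true if all of them completed in time.
    boolean awaitPending(long timeoutMs) throws InterruptedException {
        List<CompletableFuture<Void>> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(pending);
        }
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        for (CompletableFuture<Void> future : snapshot) {
            long remaining = Math.max(deadline - System.nanoTime(), 0L);
            try {
                future.get(remaining, TimeUnit.NANOSECONDS);
            } catch (ExecutionException | CancellationException e) {
                // A failed or cancelled commit is still a completed commit.
            } catch (TimeoutException e) {
                return false; // ran out of time, as in "if there is time"
            }
        }
        return true;
    }
}
```

In this sketch the wait is bounded by a single deadline shared across all futures, so a slow commit cannot extend the total blocking time beyond the caller's timeout.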

Reviewers: Bruno Cadonna <cadonna@apache.org>, Kirk True <ktrue@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
2024-04-22 13:26:15 +02:00
Omnia Ibrahim 5e96e5c898
KAFKA-15853 Refactor KafkaConfig to use PasswordEncoderConfigs (#15770)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-22 00:47:57 +08:00
TingIāu "Ting" Kì 98548c517d
KAFKA-16591 Clear all admins after closing them all. (#15763)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-21 06:26:44 +08:00
Kuan-Po (Cooper) Tseng ced79ee12f
KAFKA-16552 Create an internal config to control InitialTaskDelayMs in LogManager to speed up tests (#15719)
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-04-20 20:34:02 +08:00
hudeqi 613d4c8578
MINOR: fix hint in fetchOffsetForTimestamp (#15757)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-20 20:14:21 +08:00
Gaurav Narula 0308543d59
KAFKA-16082 Avoid resuming future replica if current replica is in the same directory (#15136)
In scenario (3), a broker crashes while it waits for the future
replica to catch up for the second time, and `dir1` is unavailable
when the broker is restarted. In that case the broker tries to
create the partition in `dir2` according to the metadata in the
controller. However, ReplicaManager also tries to resume the stale
future replica which was abandoned when the broker crashed. Renaming
the future replica eventually fails because the directory for the
topic partition already exists in `dir2`, and the broker then marks
`dir2` as offline.

This PR attempts to fix this behaviour by ignoring any future replicas
which are in the same directory as where the log exists. It further
marks the stale future replica for deletion.
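
The check described above can be sketched as follows. This is an illustrative simplification under stated assumptions, not the ReplicaManager change itself: `FutureReplicaFilter`, `Result`, and `classify` are hypothetical names, partitions are plain strings, and each map holds the log directory of a partition's current log or future replica.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the Kafka code base.
final class FutureReplicaFilter {
    // Outcome of the classification, keyed by partition name.
    static final class Result {
        final List<String> resume = new ArrayList<>();
        final List<String> markForDeletion = new ArrayList<>();
    }

    // A future replica that lives in the same directory as the current log is
    // treated as stale: it is not resumed and is marked for deletion instead.
    static Result classify(Map<String, String> currentDirs, Map<String, String> futureDirs) {
        Result result = new Result();
        // TreeMap gives a deterministic iteration order for the output lists.
        for (Map.Entry<String, String> entry : new TreeMap<>(futureDirs).entrySet()) {
            String partition = entry.getKey();
            String futureDir = entry.getValue();
            if (futureDir.equals(currentDirs.get(partition))) {
                result.markForDeletion.add(partition);
            } else {
                result.resume.add(partition);
            }
        }
        return result;
    }
}
```

With a current log for `t-0` in `dir2` and a leftover future replica for `t-0` also in `dir2`, this sketch would place `t-0` in `markForDeletion` rather than `resume`.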

Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Igor Soarez <soarez@apple.com>, Proven Provenzano <pprovenzano@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-04-20 04:18:51 +08:00
Omnia Ibrahim ecb2dd4cdc
KAFKA-15853 Move KafkaConfig log properties and docs out of core (#15569)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Nikolay <nizhikov@apache.org>, Federico Valeri <fvaleri@redhat.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-04-20 04:14:23 +08:00