12952 Commits

Author SHA1 Message Date
Chia Chuan Yu 9eb05fc729
KAFKA-16223 Replace EasyMock/PowerMock with Mockito for KafkaConfigBackingStoreTest (#16164)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 20:32:54 +08:00
Ken Huang 2c82ecd67f
KAFKA-16807 DescribeLogDirsResponseData#results#topics have unexpected topics having empty partitions (#16042)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 17:33:02 +08:00
Colin Patrick McCabe 8ace33b47f
KAFKA-16757: Fix broker re-registration issues around MV 3.7-IV2 (#15945)
When upgrading from a MetadataVersion older than 3.7-IV2, we need to resend the broker registration, so that the controller can record the storage directories. The current code for doing this has several problems, however. One is that it tends to trigger even in cases where we don't actually need it. Another is that when re-registering the broker, the broker is marked as fenced.

This PR moves the handling of the re-registration case out of BrokerMetadataPublisher and into BrokerRegistrationTracker. The re-registration code there will only trigger in the case where the broker sees an existing registration for itself with no directories set. This is much more targeted than the original code.

Additionally, in ClusterControlManager, when re-registering the same broker, we now preserve its fencing and shutdown state, rather than clearing those. (There isn't any good reason re-registering the same broker should clear these things... this was purely an oversight.) Note that we can tell the broker is "the same" because it has the same IncarnationId.
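
The state-preservation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ClusterControlManager code: the class `ReRegistrationSketch`, the `Registration` record, and the methods `register` and `unfence` are hypothetical names, and the choice that a new incarnation starts fenced is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch; none of these names come from the Kafka code base.
class ReRegistrationSketch {
    // Simplified stand-in for a broker registration record.
    record Registration(int brokerId, UUID incarnationId, boolean fenced, boolean inControlledShutdown) {}

    private final Map<Integer, Registration> registrations = new HashMap<>();

    // Registers a broker. A repeat registration with the same incarnation ID
    // keeps the existing fencing and shutdown state instead of resetting it.
    Registration register(int brokerId, UUID incarnationId) {
        Registration previous = registrations.get(brokerId);
        boolean sameIncarnation = previous != null && previous.incarnationId().equals(incarnationId);
        Registration next = sameIncarnation
            ? new Registration(brokerId, incarnationId, previous.fenced(), previous.inControlledShutdown())
            // Assumption of this sketch: a new incarnation starts fenced.
            : new Registration(brokerId, incarnationId, true, false);
        registrations.put(brokerId, next);
        return next;
    }

    // Marks an already registered broker as unfenced.
    void unfence(int brokerId) {
        registrations.computeIfPresent(brokerId, (id, r) ->
            new Registration(id, r.incarnationId(), false, r.inControlledShutdown()));
    }

    public static void main(String[] args) {
        ReRegistrationSketch table = new ReRegistrationSketch();
        UUID incarnation = UUID.randomUUID();
        table.register(1, incarnation);
        table.unfence(1);
        // Re-registering with the same incarnation ID keeps the unfenced state.
        System.out.println(table.register(1, incarnation).fenced());
        // A different incarnation ID is treated as a fresh registration.
        System.out.println(table.register(1, UUID.randomUUID()).fenced());
    }
}
```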

Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Igor Soarez <soarez@apple.com>
2024-06-01 23:51:39 +01:00
Farbod Ahmadian 966f2eb3ef
Minor: Add URL to log for Connect RestClient (#16166)
Reviewers: Chris Egerton <chrise@aiven.io>
2024-06-01 16:25:16 -04:00
Ken Huang 355d5da79a
MINOR: reduce the test suites of consumer group tools (#16155)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 01:20:59 +08:00
Chia Chuan Yu e33eb82fed
KAFKA-16574 The metrics of LogCleaner disappear after reconfiguration (#15863)
Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 01:02:03 +08:00
TaiJuWu db2a09fa90
KAFKA-16652 add unit test for ClusterTemplate offering zero ClusterConfig (#15862)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 00:56:54 +08:00
David Jacot fb566e48bf
KAFKA-16864; Optimize uniform (homogeneous) assignor (#16088)
This patch optimizes the uniform (homogeneous) assignor by avoiding creating a copy of all the assignments. Instead, the assignor creates a copy only if the assignment is updated. It is a sort of copy-on-write. This change reduces the overhead of the TargetAssignmentBuilder when run with the uniform (homogeneous) assignor.
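
The copy-on-write idea can be sketched as follows. This is a simplified illustration and not the assignor code from the patch: the class `CopyOnWriteAssignment`, its methods, and the member-to-partitions map layout are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of copy-on-write; not the actual assignor code.
class CopyOnWriteAssignment {
    private final Map<String, Set<Integer>> original; // shared, never mutated here
    private Map<String, Set<Integer>> updated;        // created lazily on first write

    CopyOnWriteAssignment(Map<String, Set<Integer>> original) {
        this.original = original;
    }

    // Adds a partition to a member, copying the member map only on the first change.
    void assign(String memberId, int partition) {
        Set<Integer> current = view().getOrDefault(memberId, Set.of());
        if (current.contains(partition)) {
            return; // nothing changes, so nothing is copied
        }
        if (updated == null) {
            updated = new HashMap<>(original); // shallow copy of the member map
        }
        Set<Integer> copy = new HashSet<>(current); // copy only the touched member's set
        copy.add(partition);
        updated.put(memberId, copy);
    }

    // Returns the original map when nothing changed, otherwise the modified copy.
    Map<String, Set<Integer>> view() {
        return updated == null ? original : updated;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> existing = Map.of("member-a", Set.of(0, 1));
        CopyOnWriteAssignment assignment = new CopyOnWriteAssignment(existing);
        assignment.assign("member-a", 1); // already assigned: no copy is made
        System.out.println(assignment.view() == existing);
        assignment.assign("member-a", 2); // first real change: a copy is created
        System.out.println(assignment.view() == existing);
    }
}
```

When most members keep their previous assignment, as in an incremental rebalance, the unchanged path returns the shared map and allocates nothing.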

Trunk:

```
Benchmark                                     (memberCount)  (partitionsToMemberRatio)  (topicCount)  Mode  Cnt   Score   Error  Units
TargetAssignmentBuilderBenchmark.build                10000                         10           100  avgt    5  24.535 ± 1.583  ms/op
TargetAssignmentBuilderBenchmark.build                10000                         10          1000  avgt    5  24.094 ± 0.223  ms/op
JMH benchmarks done
```

```
Benchmark                                       (assignmentType)  (assignorType)  (isRackAware)  (memberCount)  (partitionsToMemberRatio)  (subscriptionType)  (topicCount)  Mode  Cnt   Score   Error  Units
ServerSideAssignorBenchmark.doAssignment             INCREMENTAL         UNIFORM          false          10000                         10         HOMOGENEOUS           100  avgt    5  14.697 ± 0.133  ms/op
ServerSideAssignorBenchmark.doAssignment             INCREMENTAL         UNIFORM          false          10000                         10         HOMOGENEOUS          1000  avgt    5  15.073 ± 0.135  ms/op
JMH benchmarks done
```

Patch:

```
Benchmark                                     (memberCount)  (partitionsToMemberRatio)  (topicCount)  Mode  Cnt  Score   Error  Units
TargetAssignmentBuilderBenchmark.build                10000                         10           100  avgt    5  3.376 ± 0.577  ms/op
TargetAssignmentBuilderBenchmark.build                10000                         10          1000  avgt    5  3.731 ± 0.359  ms/op
JMH benchmarks done
```

```
Benchmark                                       (assignmentType)  (assignorType)  (isRackAware)  (memberCount)  (partitionsToMemberRatio)  (subscriptionType)  (topicCount)  Mode  Cnt  Score   Error  Units
ServerSideAssignorBenchmark.doAssignment             INCREMENTAL         UNIFORM          false          10000                         10         HOMOGENEOUS           100  avgt    5  1.975 ± 0.086  ms/op
ServerSideAssignorBenchmark.doAssignment             INCREMENTAL         UNIFORM          false          10000                         10         HOMOGENEOUS          1000  avgt    5  2.026 ± 0.190  ms/op
JMH benchmarks done
```

Reviewers: Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-05-31 13:17:59 -07:00
TingIāu "Ting" Kì ca9f4aeda7
KAFKA-16639 Ensure HeartbeatRequestManager generates leave request regardless of in-flight heartbeats. (#16017)
Fix the bug where the heartbeat is not sent when a newly created consumer is immediately closed.

When a heartbeat request is in flight and the consumer is then closed, the HeartbeatRequestManager in the current code does not send the closing heartbeat, because the previous heartbeat request is still in flight. However, the closing heartbeat is only attempted once, so in this situation the broker will not know that the consumer has left the consumer group until the consumer's heartbeat times out.
This situation causes the broker to wait until the consumer's heartbeat times out before triggering a consumer group rebalance, which in turn affects message consumption.

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-01 04:14:15 +08:00
David Jacot 190dd79457
KAFKA-16860; [2/2] Introduce group.version feature flag (#16149)
This patch updates the system tests to correctly enable the new consumer protocol/coordinator in the tests requiring them.

I went with the simplest approach for now. Long term, I think that we should refactor the tests to better handle features and non-production features.

I got a successful run of the consumer system tests with this patch combined with https://github.com/apache/kafka/pull/16120: https://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1717155071--dajac--KAFKA-16860-2--29028ae0dd/2024-05-31--001./2024-05-31--001./report.html.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-31 12:49:26 -07:00
David Jacot ba61ff0cd9
KAFKA-16860; [1/2] Introduce group.version feature flag (#16120)
This patch introduces the `group.version` feature flag with one version:
1) Version 1 enables the new consumer group rebalance protocol (KIP-848).

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-31 12:48:55 -07:00
Sebastien Viale f8ad9ee892
MINOR: update all-latency-avg documentation (#16148)
Change description: from iterator create to close time.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-05-31 12:24:01 -07:00
Greg Harris 9c2b1b8d0b
KAFKA-16809: Run Javadoc in CI (#16025)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-05-31 10:31:08 -07:00
Kamal Chandraprakash cdd4455cb8
KAFKA-16866 Used the right constant in RemoteLogManagerTest#testFetchQuotaManagerConfig (#16152)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-01 01:14:31 +08:00
Mickael Maison b6d0fb055d
MINOR: Refactor DynamicConfig (#16133)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-01 01:09:46 +08:00
Josep Prat 7e81cc5e68
MINOR: Bump trunk to 3.9.0-SNAPSHOT (#16150)
Signed-off-by: Josep Prat <josep.prat@aiven.io>

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 16:41:44 +02:00
Ken Huang 21caf6b123
KAFKA-16629 Add broker-related tests to ConfigCommandIntegrationTest (#15840)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 20:24:33 +08:00
TingIāu "Ting" Kì 0971924ebc
KAFKA-16824: Utils.getHost and Utils.getPort do not catch a lot of invalid host and ports. (#16048)
Modify regex of HOST_PORT_PATTERN to prevent malformed hosts and ports.
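
A stricter host and port check of the kind described above might look like the sketch below. The pattern is illustrative only and is not Kafka's HOST_PORT_PATTERN; the class `HostPortSketch` and its `host` and `port` methods are hypothetical, and the accepted character set and the 65535 port limit are assumptions of this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the pattern below is illustrative and is not Kafka's HOST_PORT_PATTERN.
class HostPortSketch {
    // Either a bracketed IPv6 literal or a host made of letters, digits, dots and hyphens,
    // followed by a colon and a port of one to five digits.
    private static final Pattern HOST_PORT =
        Pattern.compile("^(\\[[0-9a-fA-F:.]+\\]|[A-Za-z0-9.-]+):([0-9]{1,5})$");

    // Returns the host part (brackets included for IPv6), or null if the address is malformed.
    static String host(String address) {
        Matcher m = HOST_PORT.matcher(address);
        return m.matches() && validPort(m.group(2)) ? m.group(1) : null;
    }

    // Returns the port, or null if the address is malformed.
    static Integer port(String address) {
        Matcher m = HOST_PORT.matcher(address);
        return m.matches() && validPort(m.group(2)) ? Integer.valueOf(m.group(2)) : null;
    }

    // The regex caps the port at five digits, so parsing cannot overflow an int.
    private static boolean validPort(String digits) {
        return Integer.parseInt(digits) <= 65535;
    }

    public static void main(String[] args) {
        System.out.println(host("localhost:9092"));   // well-formed
        System.out.println(port("localhost:99999"));  // port out of range
        System.out.println(host("ftp://host:21"));    // scheme prefix is rejected
    }
}
```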

Reviewers: Luke Chen <showuon@gmail.com>
2024-05-31 16:50:27 +08:00
Lianet Magrans eb39031cd0
KAFKA-16766: offset fetch timeout exception in new consumer consistent with legacy (#16125)
* Timeout exception fetching offsets

* Tests
2024-05-31 10:33:20 +02:00
Kuan-Po (Cooper) Tseng 3d125a2322
MINOR: Add more unit tests to LogSegments (#16085)
Add more unit tests to LogSegments and do some small refactoring in LogSegments.java.

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 16:07:38 +08:00
Bruno Cadonna 76d1f18e42
Revert "KAFKA-16448: Add ProcessingExceptionHandler interface and implementations (#16090)" (#16142)
This reverts commit 8d11d95795.

We decided not to release KIP-1033 with AK 3.8.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-31 09:56:36 +02:00
Chia-Ping Tsai b0fb2ac06d
KAFKA-16866 RemoteLogManagerTest.testCopyQuotaManagerConfig failing (#16146)
Reviewers: Justine Olshan <jolshan@confluent.io>, Satish Duggana <satishd@apache.org>
2024-05-31 06:32:50 +05:30
Antoine Pourchet 370e5ea1f8
KAFKA-15045: (KIP-924 pt. 15) Implement #defaultStandbyTaskAssignment and finish rack-aware standby optimization (#16129)
This fills in the implementation details of the standby task assignment utility functions within TaskAssignmentUtils.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-30 15:11:33 -07:00
Justine Olshan 7c1bb1585f
KAFKA-16308 [2/N]: Allow unstable feature versions and rename unstable metadata config (#16130)
As per KIP-1022, we will rename the unstable metadata versions enabled config to support all feature versions.

Features is also updated to return the latest production and latest testing versions of each feature.

A feature is production ready when the corresponding metadata version (bootstrapMetadataVersion) is production ready.

Adds tests for the feature usage of the unstableFeatureVersionsEnabled config.

Reviewers: David Jacot <djacot@confluent.io>, Jun Rao <junrao@gmail.com>
2024-05-30 14:52:50 -07:00
Alyssa Huang a8e99eb969
KAFKA-16833 Fixing PartitionInfo and Cluster equals and hashCode (#16062)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 05:00:42 +08:00
Sanskar Jhajharia e974914ca5
MINOR: Code Cleanup - Connect Module (#16066)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 04:55:00 +08:00
Ahmed Najiub 33a292e4dd
MINOR: Adds a test case to test that an exception is thrown in invalid ports (#16112)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 04:38:23 +08:00
David Jacot cd750582c0
MINOR: Enable transaction verification with new group coordinator in TransactionsTest (#16139)
While working on https://github.com/apache/kafka/pull/16120, I noticed that the transaction verification feature is disabled in `TransactionsTest` when the new group coordinator is enabled. We did this initially because the feature was not available in the new group coordinator but we fixed it a long time ago. We can enable it now.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-30 12:35:29 -07:00
Murali Basani 701f8e7ad4
KAFKA-16802: Move java versions inside java block to resolve deprecation (#16135)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-05-30 12:11:35 -07:00
Dongnuo Lyu a626e87303
MINOR: Make public the consumer group migration policy config
This patch exposes the group coordinator config `CONSUMER_GROUP_MIGRATION_POLICY_CONFIG`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
2024-05-30 11:36:11 -07:00
Krishna Agarwal bb6a042e99
KAFKA-16827: Integrate kafka native-image with system tests (#16046)
This PR does the following things:

* System tests should bring up the Kafka broker in native mode
* System tests should run on the Kafka broker in native mode
* Extract out the native build command so that it can be reused
* Allow system tests to run on the native Kafka broker using the Docker mechanism

To run system tests by bringing up Kafka in native mode, pass kafka_mode as native in the ducktape globals:
--globals '{\"kafka_mode\":\"native\"}'

To run system tests by bringing up Kafka in native mode via the Docker mechanism:
_DUCKTAPE_OPTIONS="--globals '{\"kafka_mode\":\"native\"}'" TC_PATHS="tests/kafkatest/tests/"  bash tests/docker/run_tests.sh

To only bring up ducker nodes for native Kafka:
bash tests/docker/ducker-ak up -m native

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-05-30 22:24:23 +05:30
Abhijeet Kumar bb7db87f98
KAFKA-15265: Add Remote Log Manager quota manager (#15625)
Added the implementation of the quota manager that will be used to throttle copy and fetch requests from the remote storage. Reference: KIP-956.

Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kchandraprakash@uber.com>, Jun Rao <junrao@gmail.com>
2024-05-30 09:06:49 -07:00
Bruno Cadonna fea3eeb7f7
Revert "KAFKA-16448: Add ProcessingExceptionHandler in Streams configuration (#16092)" (#16141)
This reverts commit 3f70c46874.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-30 17:52:07 +02:00
Fan Yang 32b2b73f67
KAFKA-16844: Add ByteBuffer support for Connect ByteArrayConverter (#16101)
Reviewers: Chris Egerton <chrise@aiven.io>
2024-05-30 11:26:25 -04:00
Ken Huang 3327435c8d
KAFKA-16598 Migrate `ResetConsumerGroupOffsetTest` to new test infra (#15779)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-30 21:51:16 +08:00
PoAn Yang 3b92046c08
MINOR: migrate ListConsumerGroupTest to use ClusterTestExtensions (#15821)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-30 21:30:19 +08:00
Mickael Maison 8068a086a3
MINOR: Remove KafkaConfig dependency in KafkaRequestHandler (#16108)
Reviewers: Luke Chen <showuon@gmail.com>, Apoorv Mittal <amittal@confluent.io>
2024-05-30 11:51:24 +02:00
Loïc GREFFIER 3f70c46874
KAFKA-16448: Add ProcessingExceptionHandler in Streams configuration (#16092)
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR brings ProcessingExceptionHandler in Streams configuration.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-05-30 10:39:38 +02:00
David Jacot 2a6078a4ce
MINOR: Prevent consumer protocol to be used in ZK mode (#16121)
This patch disallows enabling the new consumer rebalance protocol in ZK mode.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-05-29 23:02:21 -07:00
dengziming 131ce0ba59
Minor: Fix VoterSetHistoryTest.testAddAt (#16104)
Reviewers: Luke Chen <showuon@gmail.com>
2024-05-30 10:28:07 +08:00
Murali Basani 3d14690cbf
KAFKA-16790: Update RemoteLogManager configuration in broker server (#16005)
In BrokerServer.scala, brokerMetadataPublishers are configured, and when there are metadata updates, remoteLogManager is not configured by then.
Ex: remoteLogManager.foreach(rlm => rlm.onLeadershipChange(partitionsBecomeLeader.asJava, partitionsBecomeFollower.asJava, topicIds)) in ReplicaManager is invoked after publishers are instantiated, and here rlm has relevant managers configured.

This change makes sure rlm is configured before the brokerMetadataPublishers initialization.

Reviewers: Luke Chen <showuon@gmail.com>, Nikhil Ramakrishnan <nikrmk@amazon.com>
2024-05-30 08:21:30 +08:00
Calvin Liu c8af740bd4
Improve producer ID expiration performance (#16075)
Skip using a stream when expiring producer IDs. This can improve the performance significantly when the count is high.

Before:

Benchmark                                        (numProducerIds)  Mode  Cnt      Score       Error  Units
ProducerStateManagerBench.testDeleteExpiringIds             10000  avgt    3    101.253 ±    28.031  us/op
ProducerStateManagerBench.testDeleteExpiringIds            100000  avgt    3   2297.219 ±  1690.486  us/op
ProducerStateManagerBench.testDeleteExpiringIds           1000000  avgt    3  30688.865 ± 16348.768  us/op

After:

Benchmark                                        (numProducerIds)  Mode  Cnt     Score     Error  Units
ProducerStateManagerBench.testDeleteExpiringIds             10000  avgt    3    39.122 ±   1.151  us/op
ProducerStateManagerBench.testDeleteExpiringIds            100000  avgt    3   464.363 ±  98.857  us/op
ProducerStateManagerBench.testDeleteExpiringIds           1000000  avgt    3  5731.169 ± 674.380  us/op

Also changed the JMH benchmark so that populating the producer IDs is excluded from the measured code.
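
The difference between a stream-based and a stream-free expiration pass can be sketched as follows. The commit message does not say what replaced the stream, so `expireInPlace` is only one plausible form; the class `ProducerExpirationSketch`, both method names, and the producer-ID-to-timestamp map layout are hypothetical and are not taken from ProducerStateManager.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; the map layout and method names are not taken from ProducerStateManager.
class ProducerExpirationSketch {
    // Stream-based variant: collects the expired IDs into a list, then removes them one by one.
    static void expireWithStream(Map<Long, Long> lastTimestampByProducerId, long nowMs, long maxAgeMs) {
        List<Long> expired = lastTimestampByProducerId.entrySet().stream()
            .filter(e -> nowMs - e.getValue() >= maxAgeMs)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
        expired.forEach(lastTimestampByProducerId::remove);
    }

    // Stream-free variant: removes expired entries in a single pass, with no intermediate list.
    static void expireInPlace(Map<Long, Long> lastTimestampByProducerId, long nowMs, long maxAgeMs) {
        lastTimestampByProducerId.values().removeIf(ts -> nowMs - ts >= maxAgeMs);
    }

    public static void main(String[] args) {
        Map<Long, Long> producers = new HashMap<>();
        producers.put(1L, 1_000L); // last seen long ago
        producers.put(2L, 9_500L); // recently active
        expireInPlace(producers, 10_000L, 5_000L);
        System.out.println(producers.keySet());
    }
}
```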

Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-05-29 16:49:55 -07:00
Justine Olshan 5e3df22095
KAFKA-16308 [1/N]: Create FeatureVersion interface and add `--feature` flag and handling to StorageTool (#15685)
As part of KIP-1022, I have created an interface for all the new features to be used when parsing the command line arguments, doing validations, getting default versions, etc.

I've also added the --feature flag to the storage tool to show how it will be used.

Created a TestFeatureVersion to show an implementation of the interface (besides MetadataVersion which is unique) and added tests using this new test feature.

I will add the unstable config and tests in a followup.

Reviewers: David Mao <dmao@confluent.io>, David Jacot <djacot@confluent.io>, Artem Livshits <alivshits@confluent.io>, Jun Rao <junrao@apache.org>
2024-05-29 16:36:06 -07:00
Antoine Pourchet 5c08ee0062
KAFKA-15045: (KIP-924 pt. 9) TaskAssignmentUtils implementation of optimizeRackAwareActiveTasks (#16033)
This PR implements the rack aware optimization of active tasks that can be used by the assignors themselves. It takes in the full output of the assignment and tries to reorganize it so as to minimize cross-rack traffic.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-29 16:11:37 -07:00
Mickael Maison 3f3f3ac155
MINOR: Delete KafkaSecurityConfigs class (#16113)
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-30 05:55:24 +08:00
Antoine Pourchet cc269b0d43
KAFKA-15045: (KIP-924 pt. 14) Callback to TaskAssignor::onAssignmentComputed (#16123)
This PR adds the logic and wiring necessary to make the callback to
TaskAssignor::onAssignmentComputed with the necessary parameters.

We also fixed some log statements in the actual assignment error
computation, as well as modified the ApplicationState::allTasks method
to return a Map instead of a Set of TaskInfos.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-29 13:15:02 -07:00
Eugene Mitskevich 862ea12cd7
MINOR: Fix rate metric spikes (#15889)
Rate reports value in the form of sumOrCount/monitoredWindowSize. It has a bug in monitoredWindowSize calculation, which leads to spikes in result values.
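
The sensitivity of this formula to the window size can be shown with a small sketch. It is not the Rate implementation from Kafka's metrics package, and the commit message does not say how the window was miscalculated; the class `RateSketch`, the `rate` method, and the sample numbers are hypothetical.

```java
// Hypothetical sketch; it is not the Rate implementation from Kafka's metrics package.
class RateSketch {
    // rate = sumOrCount / monitoredWindowSize, with the window converted to seconds.
    static double rate(double sumOrCount, long monitoredWindowMs) {
        return sumOrCount / (monitoredWindowMs / 1000.0);
    }

    public static void main(String[] args) {
        double events = 3000.0;
        // Dividing by the full observed window gives the steady rate.
        System.out.println(rate(events, 30_000L));
        // Dividing the same count by an underestimated window inflates the result,
        // which is the kind of spike the commit message refers to.
        System.out.println(rate(events, 1_000L));
    }
}
```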

Reviewers: Jun Rao <junrao@gmail.com>
2024-05-29 13:14:37 -07:00
gongxuanzhang 0f0c9ecbf3
KAFKA-16771 First log directory printed twice when formatting storage (#16010)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-30 01:08:17 +08:00
Andrew Schofield 2d9994e0de
KAFKA-16722: Introduce ConsumerGroupPartitionAssignor interface (#15998)
KIP-932 introduces share groups to go alongside consumer groups. Both kinds of group use server-side assignors but it is unlikely that a single assignor class would be suitable for both. As a result, the KIP introduces specific interfaces for consumer group and share group partition assignors.

This PR introduces only the consumer group interface, `o.a.k.coordinator.group.assignor.ConsumerGroupPartitionAssignor`. The share group interface will come in a later release. The existing implementations of the general `PartitionAssignor` interface have been changed to implement `ConsumerGroupPartitionAssignor` instead and all other code changes are just propagating the change throughout the codebase.

Note that the code in the group coordinator that actually calculates assignments uses the general `PartitionAssignor` interface so that it can be used with both kinds of group, even though the assignors themselves are specific.

Reviewers: Apoorv Mittal <amittal@confluent.io>, David Jacot <djacot@confluent.io>
2024-05-29 08:31:52 -07:00
gongxuanzhang 0b75cf7c0b
KAFKA-16705 the flag "started" of RaftClusterInstance is false even though the cluster is started (#15946)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-29 22:38:00 +08:00