Commit Graph

13676 Commits

Author SHA1 Message Date
Chris Egerton 50e7022a1b
MINOR: Improve error message when Connect's EmbeddedKafkaCluster::verifyClusterReadiness method fails (#16918)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-09-07 10:23:59 -04:00
David Arthur e4d108d4df
KAFKA-15793 Fix ZkMigrationIntegrationTest#testMigrateTopicDeletions (#17004)
Reviewers: Igor Soarez <soarez@apple.com>, Ajit Singh <>
2024-09-06 21:57:13 +01:00
Alyssa Huang a9a4a52c9d
KAFKA-16963: Ducktape test for KIP-853 (#17081)
Add a ducktape system test for KIP-853 quorum reconfiguration, including adding and removing voters.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-09-06 13:44:09 -07:00
David Arthur 62379d7d53
KAFKA-17479 Fix ignoreFailures logic in CI workflow [3/n] (#17106)
The ignoreFailures property was removed in #17066 to prevent test failures from being cached. However, this breaks the JUnit report and makes the github workflow less user friendly.

The problem is that we are copying the junit test report files into a new directory (added in #17098) in a Gradle doLast closure. If we don't run with ignoreFailures=true, then this closure will not run and the test failures won't be processed by junit.py.

This patch adds logic to ensure the doLast closure of :test is always run. The user provided -PignoreFailures is still honored for the test tasks so local developer workflows should not be disturbed.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-06 14:47:42 -04:00
David Arthur 1fd1646eb9
KAFKA-15648 Update leader volatile before handleLeaderChange in LocalLogManager (#17118)
Update the leader before calling handleLeaderChange and use the given epoch in LocalLogManager#prepareAppend. This should hopefully fix several flaky QuorumControllerTest tests.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2024-09-06 13:54:03 -04:00
Apoorv Mittal eec9eccacb
KAFKA-17483: Complete pending share fetch requests on broker close (#17096)
The PR adds capability to complete pending fetch requests on broker shutdown.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-09-06 14:03:51 +05:30
Scott Hendricks 50ca2c8c73
MINOR: Fix Trogdor Off-By-One Errors. (#17095)
Upon further inspection of the ConfigurableProducerWorker, I noticed that there is an off-by-one error that can cause us to greatly exceed the target messages per second.

I created a test harness so I could quickly evaluate with and without this change.

With this change, the test harness outputs sane values:

GaussianThroughputGenerator throttle = new GaussianThroughputGenerator(350, 35, 100, 100);
... Output:
Changing Throttle: 323 ==> 318
Rate: 3513 Messages / Second
Changing Throttle: 318 ==> 318
Rate: 3510 Messages / Second
Changing Throttle: 318 ==> 333
Rate: 3506 Messages / Second
Changing Throttle: 333 ==> 352
Rate: 3505 Messages / Second
Changing Throttle: 352 ==> 356
Rate: 3505 Messages / Second
Changing Throttle: 356 ==> 302
Rate: 3505 Messages / Second
Changing Throttle: 302 ==> 347
Rate: 3501 Messages / Second
Changing Throttle: 347 ==> 397
Rate: 3501 Messages / Second
Without this change, the throttle thrashes, the values can skyrocket, and unintentional code paths can be called.

GaussianThroughputGenerator throttle = new GaussianThroughputGenerator(350, 35, 100, 100);
... Output:
Changing Throttle: 374 ==> 314
Changing Throttle: 314 ==> 346
Changing Throttle: 346 ==> 340
Changing Throttle: 340 ==> 382
Changing Throttle: 382 ==> 377
Changing Throttle: 377 ==> 352
Changing Throttle: 352 ==> 397
Changing Throttle: 397 ==> 335
Rate: 4468 Messages / Second
Changing Throttle: 335 ==> 398
Changing Throttle: 398 ==> 345
Changing Throttle: 345 ==> 381
Changing Throttle: 381 ==> 334
Changing Throttle: 334 ==> 303
Changing Throttle: 303 ==> 359
Changing Throttle: 359 ==> 353
Changing Throttle: 353 ==> 422
Changing Throttle: 422 ==> 274
Changing Throttle: 274 ==> 317
Rate: 4733 Messages / Second
Changing Throttle: 317 ==> 316
Changing Throttle: 316 ==> 392
Changing Throttle: 392 ==> 342
Changing Throttle: 342 ==> 429
Changing Throttle: 429 ==> 305
Changing Throttle: 305 ==> 389

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-09-05 13:39:26 -07:00
David Arthur 84aa5d7a63
KAFKA-17479 Relocate junit XML files [2/n] (#17098)
Recently, we fixed caching for ":jar" and ":test" tasks. A side effect of this is that the test results will be restored as part of the Gradle cache resolution. This means test tasks which are skipped (as a result of FROM-CACHE) will still have test results in their build directory. To avoid incorrectly reporting these results in the job summary, this patch uses a doLast task handler to relocate JUnit XML files into a new directory.

This patch also removes the "continue-on-error" from the JUnit test step which caused timed-out builds to appear successful.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-05 13:50:33 -04:00
Sebastien Viale b4f47aeff5
KAFKA-16448: Add timestamp to error handler context (#17054)
Part of KIP-1033.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-09-05 08:42:52 -07:00
Kuan-Po Tseng 190df07ace
KAFKA-17265 Fix flaky MemoryRecordsBuilderTest#testBuffersDereferencedOnClose (#17092)
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-05 19:47:16 +08:00
Luke Chen 887b947d8a
MINOR: Update doc for tiered storage GA (#17088)
Reviewers: Satish Duggana <satishd@apache.org>
2024-09-05 15:56:23 +05:30
Mickael Maison af0604e6ef
KAFKA-16188: Delete kafka.common.MessageReader (#17090)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-05 09:41:59 +02:00
TaiJuWu 235a5b6cf9
MINOR: `ClusterInstance#waitForTopic` gets hanging when the broker is shutdown (#17085)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-05 14:07:41 +08:00
David Jacot 9abb8d3b3c
MINOR: Set `group.coordinator.rebalance.protocols` to `classic,consumer` by default (#17057)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-05 13:50:20 +08:00
Sasaki Toru 748d20200f
MINOR: Fix broken output layout of kafka-consumer-groups.sh (#17058)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lmagrans@confluent.io>
2024-09-04 16:31:59 -04:00
David Arthur 0294b1402d
KAFKA-17479 Allow ":jar" tasks to be cached [1/n] (#17066)
For several modules, we include a kafka-version.properties in the Jar file. This file includes the Git SHA of the project at the time of the build. This means that even if no source files change, the :jar task will never be UP-TO-DATE between two git commits. Ultimately, this breaks Gradle caching.

This patch marks all of the createVersionFile tasks as cacheable and also changes our Gradle invocation to override the commit ID to a dummy static value. This will allow the :jar task to be cacheable and reusable between builds.

This patch also configures the trunk build to only write to the build cache and not read from it. This will prevent any cache pollution/corruption from propagating from build to build.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 15:06:11 -04:00
David Arthur 59f5d91d8f
MINOR increase develocity expiry to 4 hours (#17077)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 09:17:00 -04:00
Bill Bejeck 661825ad28
KAFKA-16993: Flaky test shouldInvokeUserDefinedGlobalStateRestoreListener (#16970)
Increased the number of records while decreasing the restore batch size to ensure the restoration does not complete before the second Kafka Streams instance starts up.

Reviewers: Matthias J. Sax <mjsax@apache.org>
2024-09-04 08:50:27 -04:00
Dmitry Werner 005155ba5c
KAFKA-17406: Move ClientIdAndBroker to server-common module (#16967)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, David Arthur <mumrah@gmail.com>
2024-09-04 11:04:40 +02:00
Dmitry Werner 5fd7ce2ace
KAFKA-17414 Move RequestLocal to server-common module (#16986)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 16:12:20 +08:00
TengYao Chi f0f37409be
KAFKA-17454 Fix failed transactions_mixed_versions_test.py when running with 3.2 (#17067)
why df04887ba5 does not fix it?

The fix of df04887ba5 is to NOT collect the log from path `/mnt/kafka/kafka-operational-logs/debug/xxxx.log`if the task is successful. It does not change the log level. see ducktape b2ad7693f2/ducktape/tests/test.py (L181)

why df04887ba5 does not see the error of "sort"

df04887ba5 does NOT show the error since the number of features is only "one" (only metadata.version). Hence, the bug is not triggered as it does not need to "sort". Now, we have two features - metadata.version and krafe.version - so the sort is executed and then we see the "hello bug"

why we should change the kafka.log_level to INFO?

the template of log4j.properties is controlled by `log_level` (https://github.com/apache/kafka/blob/trunk/tests/kafkatest/services/kafka/templates/log4j.properties#L16), and the bug happens in writing debug message (e4ca066680/core/src/main/scala/kafka/server/metadata/BrokerMetadataListener.scala (L274)). Hence, changing the log level to DEBUG can avoid triggering the bug.

Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 15:09:03 +08:00
Matthias J. Sax 32f03a06be
KAFKA-17474 fix state transition in GlobalStreamThread (#17078)
KAFKA-17100 changed the behavior of GlobalStreamThread introducing a race condition for state changes, that was exposed by failing (flaky) tests in GlobalStreamThreadTest.

This PR moves the state transition to fix the race condition.

Reviewers: Bill Bejeck <bbejeck@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 14:48:51 +08:00
Kuan-Po Tseng 19e8c447a6
KAFKA-17137 Ensure Admin APIs are properly tested (consumer group and quota) (#16717)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 12:08:38 +08:00
Luke Chen eb9cfb06c0
KAFKA-17412: add doc for `unclean.leader.election.enable` in KRaft (#17051)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-09-03 16:11:46 -07:00
Ritika Reddy edac19ba50
KAFKA-17277: [1/2] Add version mapping command to the storage tool and feature command tool (#16973)
As a part of KIP-1022 the following has been implemented in this patch:

A version-mapping command to to look up the corresponding features for a given metadata version. Using the command with no --release-version argument will return the mapping for the latest stable metadata version.
This command has been added to the FeatureCommand Tool and the Storage Tool.
The storage tools parsing method has been made more modular similar to the feature command tool

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-09-03 15:48:36 -07:00
Ken Huang 4d23fe92f1
KAFKA-16928: Test all of the request and response methods in RaftUtil (#16517)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-09-03 13:13:33 -07:00
Andrew Schofield b0d0956b20
KAFKA-17425: Improve coexistence of consumer and share groups (#17039)
This PR ensures that using the various group RPCs work properly when issued against the wrong type of group, such as DescribeConsumerGroups for a share group, or ConsumerGroupHeartbeat for a share group. There are no changes to the RPC error codes required.

The significant code changes are:

Making sure that the group coordinator does not assume that only classic and consumer groups exist. This was the cause of a ClassCastException when ConsumerGroupHeartbeat was being used against a share group.
Making sure that committing offsets to a share group fails with GroupIdNotFoundException rather than java.lang.UnsupportedOperation. This was the cause of a name collision between a share group and a consumer group when using kafka-consumer-groups.sh --reset-offsets which inadvertently created a consumer group of the same name.

 Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-09-04 00:16:15 +05:30
Sean Quah b8ea409132
MINOR: Log when a consumer group is created by the admin client (#17073)
Reviewers: David Jacot <djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 02:37:54 +08:00
Mickael Maison 839431e591
KAFKA-17468 Move kafka/log/remote/quota classes to storage module (#17074)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 02:18:47 +08:00
Kamal Chandraprakash 88b9ff30ad
KAFKA-15859 Introduce remote.list.offsets.request.timeout.ms dynamic config (#17045)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-03 23:06:01 +08:00
TengYao Chi 2f9b236259
KAFKA-17294 Handle retriable errors when fetching offsets in new consumer (#16833)
The original behavior was implemented to maintain the behavior of the Classic consumer, where the ConsumerCoordinator would do the same when handling the OffsetFetchResponse. This behavior is being updated for the legacy coordinator as part of KAFKA-17279, to retry on all retriable errors.

We should review and update the CommitRequestManager to align with this, and retry on all retriable errors, which seems sensible when fetching offsets.

The corresponding PR for classic consumer is #16826

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-03 19:05:29 +08:00
Omnia Ibrahim f59d829381
KAFKA-15853 Move TransactionLogConfig and TransactionStateManagerConfig getters out of KafkaConfig (#16665)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-03 18:24:12 +08:00
Mickael Maison c30615e6d7
KAFKA-17430: Move RequestChannel.Metrics/RequestMetrics to server module (#17015)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-03 10:11:47 +02:00
ShivsundarR 743e185c8b
KAFKA-17450 Reduced the handlers for handling ShareAcknowledgeResponse (#17061)
Currently there are 4 handler functions present for handling ShareAcknowledge responses. ShareConsumeRequestManager had an interface and the respective handlers would implement it. Instead of having 4 different handlers for this, now using AcknowledgeRequestType, we could merge the code and have only 2 handler functions, one for ShareAcknowledge success and one for ShareAcknowledge failure, eliminating the need for the interface.

This PR also fixes a bug - We were not using the time at which the response was received while handling the ShareAcknowledge response, we were using an outdated time. Now the bug is fixed.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-03 11:31:49 +08:00
TaiJuWu 359ddce3b2
KAFKA-17137 Ensure Admin APIs are properly tested (token-related) (#16905)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-03 00:22:55 +08:00
Apoorv Mittal 89418b66ae
KAFKA-17442: Handled persister errors with write async calls (KIP-932) (#16956)
The PR makes the persister write RPC async. Also handles the errors from persister as per the review comment here:
Addressing review comment for PR: #16397 (comment)

Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>, Jun Rao <junrao@gmail.com>
2024-09-01 16:36:26 -07:00
Matthias J. Sax b527691e0a MINOR: add missing @Override annotations 2024-09-01 11:37:59 -07:00
Dmitry Werner 6836fa259e
KAFKA-17320 Move SensorAccess to server-common module (#16864)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 23:41:28 +08:00
Kuan-Po Tseng cb835d0d6d
KAFKA-17461 move the .github/README.md to .github/workflows/README.md (#17069)
According to GitHub rule:

If a repository contains more than one README file, then the file shown is chosen from locations in the following order: the .github directory, then the repository's root directory, and finally the docs directory.

the file .github/readme will override root/readme. Hence, we should move .github/readme to ./github/workflows/readme

Reviewers: David Arthur <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 23:24:25 +08:00
TengYao Chi 6a2789cf70
KAFKA-17293: New consumer HeartbeatRequestManager should rediscover disconnected coordinator (#16844)
Reviewers: Lianet Magrans <lianetmr@gmail.com>, TaiJuWu <tjwu1217@gmail.com>
2024-09-01 09:59:21 -04:00
PoAn Yang 0fde28049a
KAFKA-17331 Throw unsupported version exception if the server does NOT support EarliestLocalSpec and LatestTieredSpec (#16873)
Add the version check to server side for the specific timestamp:
- the version must be >=8 if timestamp=-4L
- the version must be >=9 if timestamp=-5L

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 21:13:40 +08:00
Matthias J. Sax 1f5aea2a86
MINOR: remove get prefix for internal DSL methods (#17050)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 17:14:51 +08:00
Viktor Somogyi-Vass 59d3d7021a
KAFKA-17437 Upgrade commons-validator from 1.7 to 1.9.0 (#17028)
Reviewers: Josep Prat <josep.prat@aiven.io>, Bertalan Kondrat <kb.pcre@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 13:15:18 +08:00
Matthias J. Sax 4189a36b41
MINOR: fix JavaDocs of Kafka Streams context classes (#17049)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 12:46:39 +08:00
Arnav Dadarya 4b5df1f8e9
KAFKA-12826: Remove Deprecated Class Serdes (#17023)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-08-31 19:42:46 -07:00
Chia Chuan Yu d44627da99
MINOR: remove get prefix for internal KeyValueStoreWrapper (#17065)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 10:35:08 +08:00
TaiJuWu dc650fbd73
MINOR: add UT for consumer.poll (#17047)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-01 10:09:18 +08:00
PoAn Yang 4a2577b801
KAFKA-17395 Flaky test testMissingOffsetNoResetPolicy for new consumer (#17056)
In AsyncKafkaConsumer, FindCoordinatorRequest is sent by background thread. In MockClient#prepareResponseFrom, it only matches the response to a future request. If there is some race condition, FindCoordinatorResponse may not match to a FindCoordinatorRequest. It's better to put MockClient#prepareResponseFrom before the request to avoid flaky test.

Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 23:57:19 +08:00
David Arthur 3efa785a65
MINOR: Handle test re-runs in junit.py (#17034)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 23:34:29 +08:00
Matthias J. Sax fc720d33a0
MINOR: remove get prefix for internal state methods (#17053)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 20:02:06 +08:00