The ignoreFailures property was removed in #17066 to prevent test failures from being cached. However, this breaks the JUnit report and makes the GitHub workflow less user-friendly.
The problem is that we copy the JUnit test report files into a new directory (added in #17098) in a Gradle doLast closure. If we don't run with ignoreFailures=true, this closure will not run and the test failures won't be processed by junit.py.
This patch adds logic to ensure that the doLast closure of :test always runs. A user-provided -PignoreFailures is still honored for the test tasks, so local developer workflows should not be disturbed.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Update the leader before calling handleLeaderChange and use the given epoch in LocalLogManager#prepareAppend. This should hopefully fix several flaky QuorumControllerTest tests.
Reviewers: José Armando García Sancio <jsancio@apache.org>
This PR adds the capability to complete pending fetch requests on broker shutdown.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
Recently, we fixed caching for ":jar" and ":test" tasks. A side effect of this is that the test results will be restored as part of the Gradle cache resolution. This means test tasks which are skipped (as a result of FROM-CACHE) will still have test results in their build directory. To avoid incorrectly reporting these results in the job summary, this patch uses a doLast task handler to relocate JUnit XML files into a new directory.
This patch also removes "continue-on-error" from the JUnit test step, which caused timed-out builds to appear successful.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Part of KIP-1033.
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>
For several modules, we include a kafka-version.properties in the Jar file. This file includes the Git SHA of the project at the time of the build. This means that even if no source files change, the :jar task will never be UP-TO-DATE between two git commits. Ultimately, this breaks Gradle caching.
This patch marks all of the createVersionFile tasks as cacheable and also changes our Gradle invocation to override the commit ID to a dummy static value. This will allow the :jar task to be cacheable and reusable between builds.
This patch also configures the trunk build to only write to the build cache and not read from it. This will prevent any cache pollution/corruption from propagating from build to build.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Increased the number of records while decreasing the restore batch size to ensure the restoration does not complete before the second Kafka Streams instance starts up.
Reviewers: Matthias J. Sax <mjsax@apache.org>
Why does df04887ba5 not fix it?
The fix in df04887ba5 is to NOT collect the log from the path `/mnt/kafka/kafka-operational-logs/debug/xxxx.log` if the task is successful. It does not change the log level. See ducktape b2ad7693f2/ducktape/tests/test.py (L181).
Why does df04887ba5 not hit the "sort" error?
df04887ba5 does NOT show the error because there was only one feature (metadata.version), so the bug was not triggered, as no "sort" was needed. Now that we have two features, metadata.version and kraft.version, the sort is executed and the bug appears.
Why should we change kafka.log_level to INFO?
The template of log4j.properties is controlled by `log_level` (https://github.com/apache/kafka/blob/trunk/tests/kafkatest/services/kafka/templates/log4j.properties#L16), and the bug happens when writing a debug message (e4ca066680/core/src/main/scala/kafka/server/metadata/BrokerMetadataListener.scala (L274)). Hence, changing the log level to INFO avoids triggering the bug.
Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
KAFKA-17100 changed the behavior of GlobalStreamThread, introducing a race condition for state changes that was exposed by flaky tests in GlobalStreamThreadTest.
This PR moves the state transition to fix the race condition.
Reviewers: Bill Bejeck <bbejeck@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
As part of KIP-1022, the following has been implemented in this patch:
A version-mapping command to look up the corresponding features for a given metadata version. Using the command with no --release-version argument returns the mapping for the latest stable metadata version.
This command has been added to the FeatureCommand tool and the Storage tool.
The storage tool's parsing method has been made more modular, similar to the feature command tool.
Reviewers: Justine Olshan <jolshan@confluent.io>
This PR ensures that the various group RPCs work properly when issued against the wrong type of group, such as DescribeConsumerGroups or ConsumerGroupHeartbeat for a share group. No changes to the RPC error codes are required.
The significant code changes are:
- Making sure that the group coordinator does not assume that only classic and consumer groups exist. This was the cause of a ClassCastException when ConsumerGroupHeartbeat was used against a share group.
- Making sure that committing offsets to a share group fails with GroupIdNotFoundException rather than java.lang.UnsupportedOperationException. This surfaced as a name collision between a share group and a consumer group when kafka-consumer-groups.sh --reset-offsets inadvertently created a consumer group of the same name; a sketch of this kind of type check follows the list.
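A minimal sketch of the type check involved (all names here are hypothetical; the real logic lives in the group coordinator):

```java
import java.util.Map;

// Hypothetical illustration only: fail with GroupIdNotFoundException (rather
// than a ClassCastException or UnsupportedOperationException) when a group
// exists but has the wrong type for the requested operation.
enum GroupType { CLASSIC, CONSUMER, SHARE }

final class GroupTypeCheck {
    // Simplified stand-in for Kafka's GroupIdNotFoundException.
    static final class GroupIdNotFoundException extends RuntimeException {
        GroupIdNotFoundException(String message) { super(message); }
    }

    static void requireType(Map<String, GroupType> groups, String groupId, GroupType expected) {
        GroupType actual = groups.get(groupId);
        if (actual != expected) {
            throw new GroupIdNotFoundException(
                "Group " + groupId + " is not a " + expected + " group.");
        }
    }
}
```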
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
The original behavior mirrored the classic consumer, whose ConsumerCoordinator did the same when handling the OffsetFetchResponse. That behavior is being updated for the legacy coordinator as part of KAFKA-17279 to retry on all retriable errors.
We should review and update the CommitRequestManager to align with this and retry on all retriable errors, which seems sensible when fetching offsets.
The corresponding PR for classic consumer is #16826
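A minimal sketch of the retry-on-retriable pattern this describes (the helper names are invented; the real logic lives in CommitRequestManager):

```java
import org.apache.kafka.common.errors.RetriableException;

// Hypothetical illustration: retry an offset fetch on ANY RetriableException
// instead of a fixed allow-list of error codes.
final class OffsetFetchRetrySketch {
    interface OffsetFetchCall { void run(); }

    static void fetchWithRetry(OffsetFetchCall call, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                call.run();
                return;
            } catch (RetriableException e) {
                if (attempt >= maxAttempts)
                    throw e;
                // Otherwise fall through and retry, mirroring the classic
                // consumer's updated behavior from KAFKA-17279.
            }
        }
    }
}
```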
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Currently there are four handler functions for handling ShareAcknowledge responses: ShareConsumeRequestManager defined an interface that the respective handlers implemented. By passing an AcknowledgeRequestType instead, the code can be merged into just two handler functions, one for ShareAcknowledge success and one for ShareAcknowledge failure, eliminating the need for the interface.
This PR also fixes a bug: while handling the ShareAcknowledge response we were using an outdated time rather than the time at which the response was received.
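A minimal sketch of the consolidation (the enum constants and handler bodies are illustrative, not the actual implementation):

```java
// Hypothetical illustration: pass the request type as a parameter so one
// success handler and one failure handler replace four type-specific ones.
enum AcknowledgeRequestType { COMMIT_SYNC, COMMIT_ASYNC, CLOSE }

final class AcknowledgeHandlersSketch {
    void handleShareAcknowledgeSuccess(AcknowledgeRequestType type, long responseReceivedMs) {
        // Branch on `type` only where behavior actually differs, and use the
        // time the response was received rather than a stale timestamp.
        System.out.printf("success: %s at %d%n", type, responseReceivedMs);
    }

    void handleShareAcknowledgeFailure(AcknowledgeRequestType type, long responseReceivedMs) {
        System.out.printf("failure: %s at %d%n", type, responseReceivedMs);
    }
}
```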
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
This PR makes the persister write RPC asynchronous. It also handles errors from the persister as per the review comment here:
Addressing review comment for PR: #16397 (comment)
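A minimal sketch of the async pattern, assuming a blocking persister write to wrap (names are hypothetical, not the share-coordinator API):

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration: run the blocking write off the caller's thread
// and handle persister errors when the future completes.
final class AsyncPersisterSketch {
    CompletableFuture<Void> writeAsync(Runnable blockingWrite) {
        return CompletableFuture
            .runAsync(blockingWrite)
            .whenComplete((ignored, error) -> {
                if (error != null) {
                    // Map/handle persister errors here instead of failing the caller.
                    System.err.println("persister write failed: " + error);
                }
            });
    }
}
```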
Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>, Jun Rao <junrao@gmail.com>
According to GitHub rule:
If a repository contains more than one README file, then the file shown is chosen from locations in the following order: the .github directory, then the repository's root directory, and finally the docs directory.
the file .github/readme will override the root readme. Hence, we should move .github/readme to .github/workflows/readme.
Reviewers: David Arthur <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Add the version check to the server side for the specific timestamp values (see the sketch after the list):
- the version must be >=8 if timestamp=-4L
- the version must be >=9 if timestamp=-5L
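A minimal sketch of the gate, assuming -4L and -5L are the earliest-local and latest-tiered sentinel timestamps of the ListOffsets API (the class and method names are invented for illustration):

```java
import org.apache.kafka.common.errors.UnsupportedVersionException;

// Hypothetical illustration of the server-side version gate.
final class ListOffsetsVersionCheckSketch {
    static void validate(short requestVersion, long timestamp) {
        if (timestamp == -4L && requestVersion < 8)
            throw new UnsupportedVersionException("timestamp -4 requires ListOffsets version >= 8");
        if (timestamp == -5L && requestVersion < 9)
            throw new UnsupportedVersionException("timestamp -5 requires ListOffsets version >= 9");
    }
}
```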
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
In AsyncKafkaConsumer, the FindCoordinatorRequest is sent by the background thread. MockClient#prepareResponseFrom only matches a response to a request sent after the response is prepared, so under a race the FindCoordinatorResponse may never match a FindCoordinatorRequest. It's better to call MockClient#prepareResponseFrom before the request is sent to avoid a flaky test. A sketch of the ordering follows.
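A minimal sketch of the safe ordering, assuming the clients test-jar classes (the consumer wiring is elided):

```java
import org.apache.kafka.clients.MockClient;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.message.FindCoordinatorResponseData;
import org.apache.kafka.common.protocol.Errors;
import org.apache.kafka.common.requests.FindCoordinatorResponse;
import org.apache.kafka.common.utils.MockTime;

final class PrepareBeforeRequestSketch {
    void example() {
        MockClient client = new MockClient(new MockTime());
        Node coordinator = new Node(0, "localhost", 9092);

        // Queue the response BEFORE anything can send the request, so the
        // background thread's FindCoordinatorRequest always finds a match.
        client.prepareResponseFrom(findCoordinatorResponse(coordinator), coordinator);

        // ... only now run the consumer code path that sends the FindCoordinatorRequest.
    }

    static FindCoordinatorResponse findCoordinatorResponse(Node node) {
        return new FindCoordinatorResponse(new FindCoordinatorResponseData()
                .setErrorCode(Errors.NONE.code())
                .setNodeId(node.id())
                .setHost(node.host())
                .setPort(node.port()));
    }
}
```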
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>