18 Commits

Author SHA1 Message Date
Xuan-Zhang Gong 4410d35cdc
KAFKA-19179: remove the dot from thread_dump_url (#19525)
As the title.
Ticket: https://issues.apache.org/jira/browse/KAFKA-19179

Reviewers: PoAn Yang <payang@apache.org>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>, TengYao Chi <frankvicky@apache.org>, Nick Guo
 <lansg0504@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
2025-04-22 10:16:54 +08:00
David Arthur cb33e98dfc
KAFKA-18748 Run new tests separately in PRs (#18770)
Split the JUnit tests into "new", "flaky", and the remainder. 

On PR builds, "new" tests are any tests that do not exist on trunk. They are run with zero tolerance for flakiness.

On trunk builds, "new" tests are anything added in the last 7 days. They are run with some tolerance for flakiness.

Another change included here is that we will not update the test catalog if any test job fails on a trunk build. We have had difficulty determining whether all the tests had run or not (due to timeouts or failures in upstream Gradle tasks). By requiring green ":test" jobs, we can be sure that the resulting catalog will be valid.

---

The purpose of this change is to discourage contributors from adding flaky tests, but give some leeway for trunk so we have successful builds.

The "quarantinedTest" Gradle target has been consolidated into the regular "test" target. Several runtime properties now control which tests are run:

* kafka.test.catalog.file: path to test catalog
* kafka.test.run.new: include new tests. This selection depends on the age of the loaded test catalog
* kafka.test.run.flaky: include tests marked as `@Flaky` (replaces the `excludeTags 'flaky'` directive)
* kafka.test.verbose: include additional logging from new JUnit classes (enabled by default if re-running GitHub workflow with debug logging enabled)
* maxTestRetries: how many retries to allow via Develocity retry plugin (default 0)
* maxTestRetryFailures: how many failures to allow before stopping retries (default 0)
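
As a rough illustration of the three-way split described above, the sketch below sorts test identifiers into "new", "flaky", and the remainder. It is not the JUnit extension itself: the function name, the `Class#method` identifier format, the use of plain sets for the catalog and the flaky tags, and the rule that a flaky tag takes precedence over "new" are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set


def classify_tests(
    tests: Iterable[str],
    catalog: Set[str],
    flaky: Set[str],
) -> Dict[str, List[str]]:
    """Split tests into 'new', 'flaky', and 'main' (the remainder).

    catalog: identifiers already known from the loaded test catalog (hypothetical shape).
    flaky:   identifiers carrying the @Flaky marker (hypothetical shape).
    """
    buckets: Dict[str, List[str]] = {"new": [], "flaky": [], "main": []}
    for test in tests:
        if test in flaky:
            # Assumption: an explicit flaky marker wins over "new".
            buckets["flaky"].append(test)
        elif test not in catalog:
            # Not present in the catalog, so treated as a new test.
            buckets["new"].append(test)
        else:
            buckets["main"].append(test)
    return buckets
```

A build could then pick which buckets to execute based on switches such as kafka.test.run.new and kafka.test.run.flaky.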


Thanks to Jun Rao for inspiring the idea.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
2025-02-24 17:08:15 -05:00
David Arthur 617196c68e
KAFKA-18636 Fix how we handle Gradle exits in CI (#18681)
This patch removes the explicit failure of test tasks in Gradle when there is a flaky test. This also fixes a fall-through case in junit.py where we did not recognize an error prior to running the tests (such as the javadoc task).

Additionally, this patch removes usages of ignoreFailures in our CI and changes the XML copy task to a finalizer task instead of a doLast closure.

Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-01-29 18:42:39 -05:00
David Arthur 8c0a0e07ce
KAFKA-17587 Refactor test infrastructure (#18602)
This patch reorganizes our test infrastructure into three Gradle modules:

":test-common:test-common-internal-api" is now a minimal dependency which exposes interfaces and annotations only. It has one project dependency on server-common to expose commonly used data classes (MetadataVersion, Feature, etc). Since this pulls in server-common, this module is Java 17+. It cannot be used by ":clients" or other Java 11 modules.

":test-common:test-common-util" includes the auto-quarantined JUnit extension. The @Flaky annotation has been moved here. Since this module has no project dependencies, we can add it to the Java 11 list so that ":clients" and others can utilize the @Flaky annotation

":test-common:test-common-runtime" now includes all of the test infrastructure code (TestKitNodes, etc). This module carries heavy dependencies (core, etc) and so it should not normally be included as a compile-time dependency.

In addition to this reorganization, this patch leverages JUnit SPI service discovery so that modules can utilize the integration test framework without depending on ":core". This will allow us to start moving integration tests out of core and into the appropriate sub-module. This is done by adding ":test-common:test-common-runtime" as a testRuntimeOnly dependency rather than as a testImplementation dependency. A trivial example was added to QuorumControllerTest to illustrate this.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
2025-01-24 09:03:43 -05:00
David Arthur af5d6c2578
MINOR Fix some test-catalog issues (#18272)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-20 08:42:57 -05:00
David Arthur 441a6d0b79
MINOR fix test-catalog generation (#17866)
Fixes another issue introduced in #17725 where the streaming XML parser would skip over tests that followed a SKIPPED test. This caused a large number of tests to be removed from the test catalog e4a5eb8
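
For context, the sketch below shows one way to stream JUnit XML so that a skipped test does not hide the tests after it. It is not the code from junit.py. The function name and the status strings are assumptions, and it relies only on the common JUnit XML layout of `<testcase>` elements with optional `<skipped>`, `<failure>`, or `<error>` children.

```python
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator, Tuple


def iter_testcases(source: BinaryIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (classname, name, status) for every <testcase> in a JUnit XML stream."""
    # Listening only for "end" events means each <testcase> is complete,
    # including its children, by the time it is inspected.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "testcase":
            # Child elements such as <skipped/> also emit "end" events; ignore
            # them here so they never interrupt iteration over the test cases.
            continue
        if elem.find("skipped") is not None:
            status = "SKIPPED"
        elif elem.find("failure") is not None or elem.find("error") is not None:
            status = "FAILED"
        else:
            status = "PASSED"
        yield elem.get("classname", ""), elem.get("name", ""), status
        # Release the children of the processed element to keep memory bounded.
        elem.clear()
```

Deciding the status only when the enclosing `<testcase>` closes keeps the handling of one test from affecting the next.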

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-19 15:37:41 -05:00
David Arthur a334b1b6fd
MINOR Fix build scan artifact name in ci-complete (#17863)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-19 09:48:37 -05:00
David Arthur 5f4cbd4aa4
KAFKA-17767 Automatically quarantine new tests [5/n] (#17725)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-19 09:56:36 +08:00
David Arthur cbf440dfd0
KAFKA-17767 Parse quarantined tests and display them [4/n] (#17661)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-04 23:33:55 +08:00
David Arthur ef6c950b88
KAFKA-17767 Extract test catalog from JUnit output [1/n] (#17397)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-17 14:09:22 +08:00
David Arthur ef567bcc3f
MINOR: Group the junit parser console logs (#17229)
Use the ::group:: feature of GitHub Actions to hide some of the verbose output from the Parse JUnit Tests step.
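
A minimal sketch of how a script can emit the `::group::` and `::endgroup::` workflow commands is shown below. The `log_group` helper is a hypothetical name for illustration and is not taken from junit.py.

```python
import contextlib
from typing import Iterator


@contextlib.contextmanager
def log_group(title: str) -> Iterator[None]:
    """Wrap printed output in a collapsible GitHub Actions log group."""
    print(f"::group::{title}")
    try:
        yield
    finally:
        # Always close the group, even if the wrapped code raises.
        print("::endgroup::")
```

Anything printed inside `with log_group("Parse JUnit Tests"):` is then folded under that title in the Actions log.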

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-20 00:50:07 +08:00
David Arthur 420f69abbd
MINOR Add a thread dump on build timeout (#17181)
In the case of a CI timeout, this patch uses jstack to capture thread dumps from the Gradle test workers.
These thread dumps are stored in files which are later archived by the CI workflow.

This patch also increases the compression level to 9 for our "actions/upload-artifact" steps to save a bit of storage space. 

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-13 11:16:50 -04:00
xijiu ac4784ec0c
KAFKA-17418: Improve markdown formatting in junit.py (#17071)
Newline characters in the failure message of tests were causing the Markdown tables to be malformed.
This patch fixes that by replacing newlines with "<br>" tags and escaping other HTML that may appear in the message.
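
The approach can be sketched as follows, assuming the standard library's `html.escape` is acceptable for the escaping step. The function name `markdown_cell` is hypothetical, and this is not the exact code from junit.py.

```python
import html


def markdown_cell(message: str) -> str:
    """Make a failure message safe to place inside a Markdown table cell."""
    # Escape &, <, and > first so the <br> tags added below are not escaped.
    escaped = html.escape(message, quote=False)
    # A literal newline would end the table row, so use <br> instead.
    return escaped.replace("\r\n", "<br>").replace("\n", "<br>")
```

Escaping before inserting the `<br>` tags matters: doing it in the opposite order would turn the tags themselves into visible text.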

Reviewers: David Arthur <mumrah@gmail.com>
2024-09-10 09:25:03 -04:00
David Arthur 040ae26472
KAFKA-17479 Fail the whole pipeline if junit step times out [4/n] (#17121)
Fixes an issue where the CI workflow could appear to be successful in the event of a timeout and no failing tests. Instead of using the GitHub Actions timeout, this patch makes use of the Linux `timeout` command. This lets us capture the exit code and handle timeouts separately from a failed execution.
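
To illustrate why the exit code is useful: GNU `timeout` exits with status 124 when the command it wraps runs out of time (unless `--preserve-status` is given), so a caller can tell a timeout apart from an ordinary failure. The sketch below is an assumption-laden illustration, not the workflow's actual logic. The function names and result labels are hypothetical.

```python
import subprocess
from typing import Sequence

# Exit status GNU timeout uses when the wrapped command times out.
TIMEOUT_EXIT_CODE = 124


def classify_exit(code: int) -> str:
    """Map a process exit code to 'success', 'timeout', or 'failure'."""
    if code == 0:
        return "success"
    if code == TIMEOUT_EXIT_CODE:
        return "timeout"
    return "failure"


def run_with_timeout(limit: str, command: Sequence[str]) -> str:
    """Run `command` under `timeout <limit>` and classify the outcome.

    `limit` uses the duration syntax accepted by timeout, e.g. "180m".
    """
    completed = subprocess.run(["timeout", limit, *command])
    return classify_exit(completed.returncode)
```

Keeping the classification separate from the process launch makes the exit-code handling easy to check without running a real build.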

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-07 15:13:20 -04:00
David Arthur 84aa5d7a63
KAFKA-17479 Relocate junit XML files [2/n] (#17098)
Recently, we fixed caching for ":jar" and ":test" tasks. A side effect of this is that the test results will be restored as part of the Gradle cache resolution. This means test tasks which are skipped (as a result of FROM-CACHE) will still have test results in their build directory. To avoid incorrectly reporting these results in the job summary, this patch uses a doLast task handler to relocate JUnit XML files into a new directory.

This patch also removes the "continue-on-error" from the JUnit test step which caused timed-out builds to appear successful.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-05 13:50:33 -04:00
David Arthur 0294b1402d
KAFKA-17479 Allow ":jar" tasks to be cached [1/n] (#17066)
For several modules, we include a kafka-version.properties in the Jar file. This file includes the Git SHA of the project at the time of the build. This means that even if no source files change, the :jar task will never be UP-TO-DATE between two git commits. Ultimately, this breaks Gradle caching.

This patch marks all of the createVersionFile tasks as cacheable and also changes our Gradle invocation to override the commit ID to a dummy static value. This will allow the :jar task to be cacheable and reusable between builds.

This patch also configures the trunk build to only write to the build cache and not read from it. This will prevent any cache pollution/corruption from propagating from build to build.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 15:06:11 -04:00
David Arthur 3efa785a65
MINOR: Handle test re-runs in junit.py (#17034)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-31 23:34:29 +08:00
David Arthur be3e674e78
MINOR: Add JUnit parser for GH Actions (#16966)
A few improvements for JUnit in the Actions workflow:

* Generate a human-readable job summary of the tests
* Fail the workflow if JUnit tests fail
* Archive the HTML JUnit reports

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-23 10:09:59 -04:00