Use Java 24 for the spotbugs checks, now that Spotbugs works on Java
24.
Added some more warning exclusions for warnings that are new to 4.9.4.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This commit updates CI to test against Java 24 instead of Java 23, which
is EOL.
Because Spotbugs 4.9.4 has not been released yet, we can't run
Spotbugs on Java 24. Instead, we are choosing to run Spotbugs, and the
rest of the compile and validate build step, on Java 17 for now.
Once 4.9.4 is released, we will switch to using Java 24 for this.
Exclude spotbugs from the run-tests gradle action. Spotbugs is already
being run once in the build by "compile and validate", so there is no
reason to run it again as part of executing tests.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Add the verify_license.py script to our build to detect missing licenses.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ken Huang <s7133700@gmail.com>, David Arthur <mumrah@gmail.com>
Add a single job that runs after the whole CI pipeline and make it a
required check before merging a PR. This will prevent us from merging
PRs which have not run through the CI.
Reviewers: Justine Olshan <jolshan@confluent.io>
The default checkout behavior for GitHub Actions is to use a special
merge ref which is equivalent to the base branch with the PR merged into
it. While this is crucial for checking compilation issues against trunk,
it significantly diminishes our ability to use any build caching.
This patch changes the JUnit test jobs to check out the HEAD commit of the PR
when building. The "Compile and Check" step still checks out the merge commit
so we can keep that level of validation.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
Split the JUnit tests into "new", "flaky", and the remainder.
On PR builds, "new" tests are any tests that do not exist on trunk. They are run with zero tolerance for flakiness.
On trunk builds, "new" tests are anything added in the last 7 days. They are run with some tolerance for flakiness.
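A minimal Python sketch of the three-way split, assuming tests are identified by plain strings and that the set of "known" tests comes from trunk (PR builds) or from a catalog that is 7 days old (trunk builds). The function name, the identifiers, and the choice to count a test that is both new and flaky as "new" are illustrative assumptions, not the build's actual logic:

```python
def partition_tests(discovered, known, flaky):
    """Split discovered test names into (new, flaky, remainder).

    discovered: all tests found in this build
    known:      tests already present on trunk / in the older catalog
    flaky:      tests marked as flaky
    Assumption: a test that is both new and flaky is treated as new.
    """
    new = sorted(t for t in discovered if t not in known)
    flaky_tests = sorted(t for t in discovered if t in known and t in flaky)
    remainder = sorted(t for t in discovered if t in known and t not in flaky)
    return new, flaky_tests, remainder


# Hypothetical test names, for illustration only.
discovered = {"FooTest#a", "FooTest#b", "BarTest#c"}
known = {"FooTest#a", "BarTest#c"}
flaky = {"BarTest#c"}
new, flaky_tests, remainder = partition_tests(discovered, known, flaky)
```

Here `FooTest#b` lands in the "new" group because it is absent from the known set, so it would be the one run with the stricter tolerance.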
Another change included here is that we will not update the test catalog if any test job fails on a trunk build. We have had difficulty determining whether all the tests had run (due to timeouts or failures in upstream Gradle tasks). By requiring green ":test" jobs, we can be sure that the resulting catalog will be valid.
---
The purpose of this change is to discourage contributors from adding flaky tests, but give some leeway for trunk so we have successful builds.
The "quarantinedTest" Gradle target has been consolidated into the regular "test" target. There are now some
runtime properties to control what tests are run.
* kafka.test.catalog.file: path to test catalog
* kafka.test.run.new: include new tests. this selection depends on the age of the loaded test catalog
* kafka.test.run.flaky: include tests marked as `@Flaky` (replaces the `excludeTags 'flaky'` directive)
* kafka.test.verbose: include additional logging from new JUnit classes (enabled by default if re-running GitHub workflow with debug logging enabled)
* maxTestRetries: how many retries to allow via Develocity retry plugin (default 0)
* maxTestRetryFailures: how many failures to allow before stopping retries (default 0)
Thanks to Jun Rao for inspiring the idea.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
This patch includes some maintenance updates for Develocity.
* Publish build scans to develocity.apache.org
* Update Develocity Gradle plugin to 3.19
* Use `DEVELOCITY_ACCESS_KEY` to authenticate to `develocity.apache.org`
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>
Add a new "load-catalog" job to the workflow. This job will check out the test-catalog branch as it was 7 days earlier and generate a text file of all the tests that were known at that time. This file is then passed down to the two parallel "test" jobs to be used as a source of data for the quarantined test behavior.
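As a rough Python sketch of the "generate a text file of all the tests" step: the in-memory catalog shape (module, then class, then method names) and the `Class#method` line format are assumptions made for illustration; the real catalog layout on the test-catalog branch may differ.

```python
def flatten_catalog(catalog):
    """Return one 'Class#method' line per test, sorted for stable output.

    catalog: hypothetical mapping of module -> {class name -> [method names]}.
    """
    lines = []
    for classes in catalog.values():
        for class_name, methods in classes.items():
            for method in methods:
                lines.append(f"{class_name}#{method}")
    return sorted(lines)


# One way to find the catalog commit from about a week ago (run before loading):
#   git rev-list -n 1 --before="7 days ago" origin/test-catalog
example = {
    "core": {"kafka.FooTest": ["testB", "testA"]},
    "clients": {"org.BarTest": ["testC"]},
}
known_tests = flatten_catalog(example)
```

Writing `"\n".join(known_tests)` to a file would give the downstream "test" jobs a simple list to compare against.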
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This patch adds a CI job to store our test catalog in an orphaned branch named "test-catalog" within this repo.
This data will be used to help determine which tests should be quarantined.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Our "validate" job was running on JDK 21 while the "test" job was running on JDK 11 and 23. This patch updates the validate job to JDK 23 and fixes the test catalog step to only run on JDK 23 (instead of 21).
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Introduce a new "quarantinedTest" Gradle target that excludes tests tagged with "flaky". Also introduce two new build parameters, "maxQuarantineTestRetries" and "maxQuarantineTestRetryFailures".
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This updates the versions of Java we test on from 8 and 21 to 11 and 21. This also removes unnecessary Check and Compile Java variations.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Fix the CI workflow to treat the `is-public-fork` input as a string.
Also add some docs on composite actions.
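A small Python analogy for why the string treatment matters, assuming the input arrives as text such as "true" or "false": a non-empty string like "false" is truthy, so it has to be compared explicitly. The helper name `parse_bool_input` is hypothetical.

```python
def parse_bool_input(value: str) -> bool:
    """Interpret a workflow-style input string; only 'true' (any case) is true."""
    return value.strip().lower() == "true"


naive = bool("false")               # truthy, because the string is non-empty
parsed = parse_bool_input("false")  # explicit comparison gives the intended value
```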
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This patch brings the PR and trunk builds closer in line. Rather than switching between `--scan` and `--no-scan`,
both scenarios now use `--no-scan` and rely on the CI Complete workflow to publish the scans.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
In the case of a CI timeout, this patch uses jstack to capture thread dumps from the Gradle test workers.
These thread dumps are stored in files which are later archived by the CI workflow.
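A hedged Python sketch of the capture step, assuming the PIDs of the Gradle test workers are already known (how they are discovered is left out) and using a hypothetical `thread-dump-<pid>.txt` naming scheme. It relies only on the standard `jstack <pid>` invocation from the JDK.

```python
import subprocess
from pathlib import Path


def plan_thread_dumps(pids, out_dir):
    """Return (command, output_path) pairs, one jstack invocation per PID."""
    out = Path(out_dir)
    return [(["jstack", str(pid)], out / f"thread-dump-{pid}.txt") for pid in pids]


def capture_thread_dumps(pids, out_dir):
    """Run jstack for each PID and store its stdout in a file for archiving."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd, path in plan_thread_dumps(pids, out_dir):
        # check=False: a worker may exit between discovery and the dump.
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        path.write_text(result.stdout)


plan = plan_thread_dumps([101, 202], "thread-dumps")
```

Separating the planning from the execution keeps the file layout easy to inspect without needing a running JVM.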
This patch also increases the compression level to 9 for our "actions/upload-artifact" steps to save a bit of storage space.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Trunk builds are run off of "push" events rather than "pull_request". We were missing some logic in the is-public-fork condition that mistakenly caused some trunk builds to skip the build scan.
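To illustrate the kind of condition involved, here is a Python sketch that assumes the fork flag is read from the pull request payload (`pull_request.head.repo.fork`), which is absent on "push" events. The function and the exact rule are assumptions for illustration, not the workflow's actual expression.

```python
def is_public_fork(event_name: str, event: dict) -> bool:
    """Treat only pull_request events from forked repos as public forks."""
    if event_name != "pull_request":
        # Trunk builds come from "push" events, which carry no PR payload.
        return False
    repo = event.get("pull_request", {}).get("head", {}).get("repo") or {}
    return bool(repo.get("fork", False))


trunk_build = is_public_fork("push", {})
fork_pr = is_public_fork(
    "pull_request", {"pull_request": {"head": {"repo": {"fork": True}}}}
)
```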
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Publish Gradle build scans produced by PRs. This is done with a workflow triggered by the `workflow_run` event when the "CI" workflow completes. It downloads the build scan files from the PR workflow and publishes them to ge.apache.org.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Fixes an issue where the CI workflow could appear to be successful in the event of a timeout and no failing tests. Instead of using the GitHub Actions timeout, this patch makes use of the Linux `timeout` command. This lets us capture the exit code and handle timeouts separately from a failed execution.
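A minimal Python sketch of the exit-code handling, relying on the GNU coreutils behavior that `timeout` exits with status 124 when the time limit is reached. The function names and the three labels are hypothetical; the workflow itself does this in shell.

```python
import subprocess

TIMEOUT_EXIT_CODE = 124  # status returned by GNU `timeout` when the limit is hit


def classify_exit(code: int) -> str:
    """Map a process exit code to 'success', 'timeout', or 'failure'."""
    if code == 0:
        return "success"
    if code == TIMEOUT_EXIT_CODE:
        return "timeout"
    return "failure"


def run_with_timeout(limit: str, command: list) -> str:
    """Run `command` under `timeout <limit>` and classify the result."""
    result = subprocess.run(["timeout", limit, *command], check=False)
    return classify_exit(result.returncode)
```

For example, `run_with_timeout("180m", ["./gradlew", "test"])` would let a caller report a timed-out build differently from one with failing tests.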
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The ignoreFailures property was removed in #17066 to prevent test failures from being cached. However, this breaks the JUnit report and makes the GitHub workflow less user-friendly.
The problem is that we are copying the junit test report files into a new directory (added in #17098) in a Gradle doLast closure. If we don't run with ignoreFailures=true, then this closure will not run and the test failures won't be processed by junit.py.
This patch adds logic to ensure the doLast closure of :test is always run. The user provided -PignoreFailures is still honored for the test tasks so local developer workflows should not be disturbed.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Recently, we fixed caching for ":jar" and ":test" tasks. A side effect of this is that the test results will be restored as part of the Gradle cache resolution. This means test tasks which are skipped (as a result of FROM-CACHE) will still have test results in their build directory. To avoid incorrectly reporting these results in the job summary, this patch uses a doLast task handler to relocate JUnit XML files into a new directory.
This patch also removes the "continue-on-error" from the JUnit test step which caused timed-out builds to appear successful.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
For several modules, we include a kafka-version.properties in the Jar file. This file includes the Git SHA of the project at the time of the build. This means that even if no source files change, the :jar task will never be UP-TO-DATE between two git commits. Ultimately, this breaks Gradle caching.
This patch marks all of the createVersionFile tasks as cacheable and also changes our Gradle invocation to override the commit ID to a dummy static value. This will allow the :jar task to be cacheable and reusable between builds.
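The effect of the static commit ID can be pictured with a deliberately simplified model of a cache key, sketched in Python. This is not Gradle's actual key computation; `cache_key`, the digest strings, and the dummy value are all illustrative assumptions.

```python
import hashlib


def cache_key(source_digest: str, commit_id: str) -> str:
    """Toy cache key: a hash over the task inputs, including the commit ID."""
    payload = f"{source_digest}\n{commit_id}".encode()
    return hashlib.sha256(payload).hexdigest()


# Same sources, different real commits: the keys differ, so nothing is reused.
key_a = cache_key("same-sources", "abc123")
key_b = cache_key("same-sources", "def456")

# Same sources, static dummy commit ID: the keys match across builds.
key_c = cache_key("same-sources", "dummy")
key_d = cache_key("same-sources", "dummy")
```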
This patch also configures the trunk build to only write to the build cache and not read from it. This will prevent any cache pollution/corruption from propagating from build to build.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
In order for Gradle to restore the cache, it needs to use the same workflow ID. This PR consolidates trunk and PR builds so that both use the CI build.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>