
Stig Døssing 25da705178 MINOR: Run CI with Java 24 (#20295 ) This commit updates CI to test against Java 24 instead of Java 23 which is EOL. Due to Spotbugs not having released version 4.9.4 yet, we can't run Spotbugs on Java 24. Instead, we are choosing to run Spotbugs, and the rest of the compile and validate build step, on Java 17 for now. Once 4.9.4 has released, we will switch to using Java 24 for this. Exclude spotbugs from the run-tests gradle action. Spotbugs is already being run once in the build by "compile and validate", there is no reason to run it again as part of executing tests. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>		2025-08-05 21:26:13 +08:00
README.md	MINOR Improve PR linter output (#19159 )	2025-03-10 18:10:22 -04:00
build.yml	MINOR: Run CI with Java 24 (#20295 )	2025-08-05 21:26:13 +08:00
ci-complete.yml	MINOR: Run CI with Java 24 (#20295 )	2025-08-05 21:26:13 +08:00
ci.yml	MINOR Fix some test-catalog issues (#18272 )	2024-12-20 08:42:57 -05:00
deflake.yml	KAFKA-18748 Run new tests separately in PRs (#18770 )	2025-02-24 17:08:15 -05:00
docker_build_and_test.yml	MINOR: Fix error in installing docker-compose on docker-builds workflows (#18042 )	2024-12-04 23:59:55 +05:30
docker_official_image_build_and_test.yml	MINOR: Fix error in installing docker-compose on docker-builds workflows (#18042 )	2024-12-04 23:59:55 +05:30
docker_promote.yml	MINOR: fix some GHA run syntax (#17471 )	2024-10-12 08:52:55 +08:00
docker_rc_release.yml	MINOR: fix some GHA run syntax (#17471 )	2024-10-12 08:52:55 +08:00
docker_scan.yml	MINOR: Update the supported tags in docker_scan.yml (#19766 )	2025-05-20 11:21:27 +08:00
generate-reports.yml	MINOR Fix condition in flaky test report workflow (#18599 )	2025-01-21 13:38:14 -05:00
pr-labeled.yml	KAFKA-18244: Fix empty SHA on "Pull Request Labeled" workflow (#18190 )	2024-12-16 11:14:37 -05:00
pr-labels-cron.yml	MINOR Remove needs-attention label after review (#18366 )	2025-01-08 19:35:45 -05:00
pr-linter.yml	MINOR Improve PR linter output (#19159 )	2025-03-10 18:10:22 -04:00
pr-reviewed.yml	MINOR Improve PR linter output (#19159 )	2025-03-10 18:10:22 -04:00
pr-update.yml	MINOR Add PR triage workflow (#17881 )	2024-12-10 12:34:09 -05:00
prepare_docker_official_image_source.yml	MINOR: fix some GHA run syntax (#17471 )	2024-10-12 08:52:55 +08:00
stale.yml	MINOR Remove needs-attention label after review (#18366 )	2025-01-08 19:35:45 -05:00
workflow-requested.yml	MINOR Auto-approve Pull Request Reviewed with ci-approved label (#19098 )	2025-03-04 21:02:00 -05:00

README.md

GitHub Actions

Overview

The entry point for our build is the "CI" workflow which is defined in ci.yml. This is used for both PR and trunk builds. The jobs and steps of the workflow are defined in build.yml.

For Pull Requests, the "CI" workflow runs in an unprivileged context. This means it does not have access to repository secrets. After the "CI" workflow is complete, the "CI Complete" workflow is automatically run. This workflow consumes artifacts from the "CI" workflow and does run in a privileged context. This is how we are able to upload Gradle Build Scans to Develocity without exposing our access token to the Pull Requests.

Disabling Email Notifications

By default, GitHub sends an email for each failed action run. To change this, visit https://github.com/settings/notifications and find System -> Actions. Here you can change your notification preferences.

Security

Please read the following GitHub articles before authoring new workflows.

Variable Injection

Any workflow that uses the run directive should avoid using the ${{ ... }} syntax inside the script itself. Instead, declare all injectable values as environment variables and reference those from the script. For example:

    - name: Copy RC Image to promoted image
      env:
        PROMOTED_DOCKER_IMAGE: ${{ github.event.inputs.promoted_docker_image }}
        RC_DOCKER_IMAGE: ${{ github.event.inputs.rc_docker_image }}
      run: |
        docker buildx imagetools create --tag "$PROMOTED_DOCKER_IMAGE" "$RC_DOCKER_IMAGE"

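GitHub substitutes ${{ ... }} expressions into the script text before the shell runs it, so passing values through environment variables prevents untrusted inputs from injecting commands into the run steps.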
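To see why this matters, the sketch below imitates the two cases with plain bash. It is an illustration only: the `untrusted` value and both helper functions are hypothetical, and `bash -c` merely stands in for the runner executing a run step.

```shell
#!/usr/bin/env bash
# Hypothetical attacker-controlled value, e.g. a PR title or a workflow input.
untrusted='x"; echo INJECTED; echo "'

# Unsafe: the value is spliced into the script text before the shell parses it,
# which mirrors how an expression is expanded inside a run step.
run_unsafe() {
  bash -c "echo \"title: $1\""
}

# Safe: the script text is fixed; the value only arrives via the environment.
run_safe() {
  TITLE="$1" bash -c 'echo "title: $TITLE"'
}

run_unsafe "$untrusted"
run_safe "$untrusted"
```

The first call executes the embedded `echo INJECTED` as a separate command, while the second prints the value verbatim as data.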

`pull_request_target` events

In addition to the above security articles, please review the official documentation on pull_request_target. This event type allows PRs to trigger actions that run with elevated permissions and access to repository secrets. We should only use it for very simple tasks such as applying labels or adding comments to PRs.

We must never run untrusted PR code in the elevated pull_request_target context.

Our Workflows

Trunk Build

The ci.yml workflow runs when commits are pushed to trunk. It calls into build.yml to run our main build. In the trunk build, we do not read from the Gradle cache, but we do write to it. Also, the test catalog is only updated from trunk builds.

PR Build

Similar to trunk, this workflow starts in ci.yml and calls into build.yml. Unlike trunk, the PR builds will utilize the Gradle cache.

PR Triage

In order to get the attention of committers, we have a triage workflow for Pull Requests opened by non-committers. This workflow consists of two files:

pr-update.yml: When a PR is created, adds the triage label if the PR was opened by a non-committer.
pr-labels-cron.yml: Cron job that adds the needs-attention label to community PRs that have not been reviewed after 7 days. Also includes a cron job that removes the triage and needs-attention labels from PRs which have been reviewed.

The pr-update.yml workflow includes pull_request_target!

For committers to avoid having this label added, their membership in the ASF GitHub organization must be public. Here are the steps to take:

Navigate to the ASF organization's "People" page https://github.com/orgs/apache/people
Find yourself
Change "Organization Visibility" to Public

Full documentation for this process can be found in GitHub's docs: https://docs.github.com/en/account-and-profile/setting-up-and-managing-your-personal-account-on-github/managing-your-membership-in-organizations/publicizing-or-hiding-organization-membership
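To confirm that the change took effect, you can ask the GitHub REST API whether an account is a public member of the organization. The sketch below uses the public-membership endpoint, which responds with 204 for a public member and 404 otherwise. The function names are hypothetical, and whether the triage workflow relies on this same endpoint is an assumption.

```shell
#!/usr/bin/env bash
# Map the HTTP status of GET /orgs/{org}/public_members/{username} to a message.
describe_membership() {
  case "$1" in
    204) echo "public" ;;
    404) echo "not public (or not a member)" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Query GitHub for the given username; requires network access.
# Unauthenticated requests are rate limited, which lands in the catch-all case.
check_public_membership() {
  local user="$1" status
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    "https://api.github.com/orgs/apache/public_members/${user}")
  describe_membership "$status"
}

# Example (not run here):
#   check_public_membership your-github-username
```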

If you are a committer and do not want your membership in the ASF org listed as public, you will need to remove the triage label manually.

CI Approved

Due to a combination of GitHub security and ASF's policy, we require explicit approval of workflows on PRs submitted by non-committers (and non-contributors). To simplify this process, we have a ci-approved label which automatically approves these workflows.

There are two files related to this workflow:

pr-labeled.yml: approves pending workflow runs for PRs that have been labeled with ci-approved.
workflow-requested.yml: approves future workflow requests automatically if the PR has the ci-approved label.

The pr-labeled.yml workflow includes pull_request_target!

PR Linter

To help ensure good commit messages, we have added a "Pull Request Linter" job that checks the title and body of the PR.

There are two files related to this workflow:

pr-reviewed.yml: runs when a PR is reviewed or has its title or body edited. This workflow simply captures the PR number into a text file.
pr-linter.yml: runs after pr-reviewed.yml and loads the PR using the saved text file. This workflow runs the linter script that checks the structure of the PR.

Note that the pr-reviewed.yml workflow uses the ci-approved mechanism described above.

The following checks are performed on our PRs:

Title is not too short or too long
Title starts with "KAFKA-", "MINOR", or "HOTFIX"
Body is not empty
Body includes "Reviewers:" if the PR is approved

With the merge queue, our PR title and body become the commit subject and message. This linting step helps ensure that we have well-formed commits.
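The checks listed above can be sketched in a few lines of bash. This is not the project's linter script, only an approximation for trying a title and body locally. The `lint_pr` function and the two length thresholds are hypothetical; the actual limits are not stated in this document.

```shell
#!/usr/bin/env bash
# Hypothetical thresholds; the real linter's limits are not stated here.
MIN_TITLE_LEN=15
MAX_TITLE_LEN=120

# lint_pr TITLE BODY APPROVED
# Prints one line per failed check and returns non-zero if any check failed.
lint_pr() {
  local title="$1" body="$2" approved="$3"
  local failures=0

  if (( ${#title} < MIN_TITLE_LEN )); then
    echo "FAIL: title is too short"
    failures=$((failures + 1))
  elif (( ${#title} > MAX_TITLE_LEN )); then
    echo "FAIL: title is too long"
    failures=$((failures + 1))
  fi

  if [[ ! "$title" =~ ^(KAFKA-|MINOR|HOTFIX) ]]; then
    echo 'FAIL: title must start with "KAFKA-", "MINOR", or "HOTFIX"'
    failures=$((failures + 1))
  fi

  # Treat a whitespace-only body as empty.
  if [[ -z "${body//[[:space:]]/}" ]]; then
    echo "FAIL: body is empty"
    failures=$((failures + 1))
  fi

  if [[ "$approved" == "true" && "$body" != *"Reviewers:"* ]]; then
    echo 'FAIL: approved PR must include "Reviewers:" in the body'
    failures=$((failures + 1))
  fi

  (( failures == 0 ))
}
```

For example, `lint_pr "MINOR: Update the CI documentation" "Some text" true` reports only the missing "Reviewers:" line, since the PR is marked as approved.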

Stale PRs

This one is straightforward. Using the "actions/stale" GitHub Action, we automatically label and eventually close PRs which have not had activity for some time. See the stale.yml workflow file for specifics.

GitHub Actions Quirks

Composite Actions

Composite actions are a convenient way to reuse build logic, but they have some limitations.

A composite action can only contribute steps to the calling job; it cannot define jobs of its own (see workflow_call instead)
Inputs can only be strings; there is no support for typed parameters. See: https://github.com/actions/runner/issues/2238

Troubleshooting

Gradle Cache Misses

If your PR is running for longer than you would expect due to cache misses, there are a few things to check.

First, find the cache that was loaded into your PR build. This is found in the Setup Gradle output. Look for a line starting with "Restored Gradle User Home from cache key". For example,

Restored Gradle User Home from cache key: gradle-home-v1|Linux-X64|test[188616818c9a3165053ef8704c27b28e]-5c20aa187aa8f51af4270d7d1b0db4963b0cd10b

The last part of the cache key is the SHA of the commit on trunk where the cache was created. If that commit is not on your branch, your build loaded a cache that includes changes your PR does not yet have. This is a common cause of cache misses. To resolve this, update your PR with the latest cached trunk commit:

git fetch origin
./committer-tools/update-cache.sh
git merge trunk-cached

then push your branch.
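Before merging, you can check whether the cached commit is already part of your branch. The sketch below pulls the trailing SHA out of a cache key and asks git whether it is an ancestor of HEAD. Both function names are hypothetical helpers, not part of the repository's tooling.

```shell
#!/usr/bin/env bash
# Extract the trailing commit SHA from a Gradle cache key (or the full log line)
# by dropping everything up to and including the last '-'.
cache_key_sha() {
  printf '%s\n' "${1##*-}"
}

# Succeeds if the given commit is an ancestor of the current HEAD.
# Run this inside the repository, after fetching, so the commit is known locally.
cache_commit_on_branch() {
  git merge-base --is-ancestor "$1" HEAD
}

# Example (not run here):
#   sha=$(cache_key_sha "$key")
#   cache_commit_on_branch "$sha" || echo "cached commit is not on this branch"
```

If the check fails, the cache was built from a trunk commit your branch does not contain, which is the situation the merge steps above address.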

If your build seems to be using the correct cache, the next thing to check is whether any task inputs changed. To find out, locate the trunk Build Scan for the cache commit and compare it with the Build Scan of your PR build. In the Develocity UI, comparison is started with the icon that looks like two overlapping circles, like (A()B). The comparison shows the differences in task inputs between the two builds.

Finally, you can run your PR with extra cache debugging. Add this to the gradle invocation in run-gradle/action.yml.

-Dorg.gradle.caching.debug=true

This produces a lot of output, so you may also want to reduce the test target to a single module.