In #12148, we removed the log4jAppender dependency and added a testImplementation dependency for the slf4jlog4j library. However, the tools module needs this dependency at runtime to output logs (ref). This adds the dependency back.
Note: The slf4jlog4j library used to be pulled in via the log4j-appender dependency. Since that has been removed, we need to declare it explicitly.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Edoardo Comar <ecomar@uk.ibm.com>
This patch moves the `PartitionAssignor` interface and all the related classes to a newly created `group-coordinator/api` module, following the pattern used by the storage and tools modules.
Reviewers: Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
This PR adds a JVM-based Docker Official Image for Apache Kafka as per the following KIP - https://cwiki.apache.org/confluence/display/KAFKA/KIP-1028%3A+Docker+Official+Image+for+Apache+Kafka
It introduces support for Apache Kafka Docker Official Images via:
GitHub Workflows:
- Workflow to prepare static source files for Docker images
- Workflow to build and test Docker official images
- Scripts to prepare source files and perform Docker image builds and tests
A new directory, docker/docker_official_images, to house all Docker Official Image assets.
Co-authored-by: Vedarth Sharma <vesharma@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Vedarth Sharma <vesharma@confluent.io>
When the consumer group protocol is used in a cluster, it is currently impossible to see all records stored in the __consumer_offsets topic with `kafka-dump-log --offsets-decoder`, because the tool does not know how to handle all the new records.
This patch refactors the OffsetsMessageParser used internally by kafka-dump-log to use the RecordSerde used by the new group coordinator. This ensures that the tool is always in sync with the coordinator implementation. The patch also changes the format to use the toString representations of the records instead of custom logic to dump them, which ensures that all the information is always dumped. The downside of the latter is that inner byte arrays (e.g. assignment in the classic protocol) are no longer deserialized. Personally, I feel that this is acceptable and that it is actually better to stay as close as possible to the actual records in this tool. It also avoids issues like https://issues.apache.org/jira/browse/KAFKA-15603.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Small cleanup: removed the version when excluding shaded dependencies from the clients library, as it's not needed.
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This prevents `gradle dependencyCheckAggregate` from reporting
advisories in build-time dependencies (e.g. CVE-2023-46122), which
typically should not affect us.
I checked that this does not prevent advisories in 'regular'
dependencies from being reported (but there currently are none).
Reviewers: Josep Prat <josep.prat@aiven.io>
The following test cases don't really run the KRaft case. The reason is that the test info doesn't contain the parameter names, so TestInfoUtils#isKRaft always returns false.
1) TopicCommandIntegrationTest
2) DeleteConsumerGroupsTest
3) AuthorizerIntegrationTest
4) DeleteOffsetsConsumerGroupCommandIntegrationTest
We can fix it by adding `options.compilerArgs << '-parameters'` after
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The issue KAFKA-16359 reported inclusion of kafka-clients runtime dependencies in MANIFEST.MF Class-Path.
The root cause is the issue defined here with the usage of shadow plugin.
Looking into the specifics of the plugin and its documentation, any dependency marked as shadow is treated as follows by the shadow plugin:
1. Adds the dependency as a runtime dependency in the resultant pom.xml - code here
2. Adds the dependency to the Class-Path in MANIFEST.MF as well - code here
Resolution
We do need the runtime dependencies available in the pom (1 above) but not in the manifest (2 above). There is also no clear way to separate the two behaviours, as both rely on the shadow configuration.
To fix this, I have defined a custom configuration named shadowed, which is later used to populate the correct pom (the code is similar to what the shadow plugin does to populate the pom with runtime dependencies).
Though this might seem like a workaround, I think it's the only way to fix the issue. I have checked other SDKs that suffered from the same issue, and they went with a similar route to populate the pom.
Reviewers: Luke Chen <showuon@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Gaurav Narula <gaurav_narula2@apple.com>
1) This PR moves kafka.security classes from core to server module.
2) AclAuthorizer is not moved because it has heavy dependencies on core classes that have not yet been rewritten from Scala.
3) AclAuthorizer will be deleted as part of ZooKeeper removal.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The reportScoverage task (which was previously used as the dependency of the registered coverage task)
creates a task for each Test task and executes them. There are unitTest, integrationTest and
test tasks (the last one just executes both unit and integration tests), so reportScoverage
executes all three with their corresponding scoverage tasks, hence running all tests twice.
The solution is simply to use the reportTestScoverage task as the dependency.
Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>
1) Rename WorkerSinkTaskMockitoTest back to WorkerSinkTaskTest
2) Tidy up the code a bit
3) rewrite "fail" by "assertThrow"
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Jacoco and scoverage reporting hasn't been working for a while. This commit fixes report generation. After this PR, only subproject-level reports are generated, as Jenkins and Sonar only care about those.
This PR doesn't change Kafka's Jenkinsfile.
Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>
Currently, a few documentation files are generated automatically, for example by the genConnectMetricsDocs task.
However, unwanted log output is also added to them,
and the format is not aligned with the other sections, which have the MBean located in the third column.
I modified the code logic so the format follows the other sections in ops.html.
I also turned off the logging, since everything written to stdout is taken as documentation.
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
In order to move ConfigCommand to tools, we must move all of its dependencies, which include KafkaConfig and other core classes, to Java. This PR moves the log cleaner configuration to the CleanerConfig class of the storage module.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This PR is part of #14471
It contains some of the ConsoleGroupCommand tests rewritten in Java.
The intention of a separate PR is to reduce the change size and simplify review.
Reviewers: Luke Chen <showuon@gmail.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk> , David Jacot <djacot@confluent.io>, Nikolay <NIzhikov@gmail.com>
This patch wires the transaction verification in the new group coordinator. It basically calls the verification path before scheduling the write operation. If the verification fails, the error is returned to the caller.
Note that the patch uses `appendForGroup`. I suppose that we will move away from using it when https://github.com/apache/kafka/pull/15087 is merged.
Reviewers: Justine Olshan <jolshan@confluent.io>
Migrates functionality provided by the utility to Kafka core. This wrapper will be used to generate property files and format storage when invoked from a docker container.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
The PR fixes the publishing of kafka-clients artifact to remote maven. The kafka-clients jar was recently shadowed which would publish the artifacts to the local maven repo successfully but would throw an error when publishing to remote maven. (as part of the release process)
The issue triggers only with publishMavenJavaPublicationToMavenRepository due to signing. Generating signed asc files errors out for shadowed release artifacts because the module name (clients) differs from the artifact name (kafka-clients).
The fix is basically to explicitly define the shadowJar artifact for the signing and publish plugins. project.shadow.component(mavenJava) previously output the name as client-<version>-all.jar even though the classifier and archivesBaseName are already defined correctly in :clients and in the shadowJar construction.
This patch moves the `RaftIOThread` implementation into Java. I changed the name to `KafkaRaftClientDriver` since the main thing it does is drive the calls to `poll()`. There shouldn't be any changes to the logic.
Reviewers: José Armando García Sancio <jsancio@apache.org>
Improve JsonConverter performance by using the Afterburner module of the Jackson library.
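For context, this is roughly what registering Jackson's Afterburner module looks like (a generic sketch, not the actual JsonConverter code):
```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.module.afterburner.AfterburnerModule;

import java.util.Map;

public class AfterburnerExample {
    public static void main(String[] args) throws Exception {
        // Afterburner generates bytecode for (de)serialization instead of relying on
        // reflection, which speeds up repeated conversions of the same types.
        ObjectMapper mapper = new ObjectMapper();
        mapper.registerModule(new AfterburnerModule());

        String json = mapper.writeValueAsString(Map.of("field", "value"));
        System.out.println(json);
    }
}
```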
Reviewers: Divij Vaidya <diviv@amazon.com>, Mickael Maison <mickael.maison@gmail.com>
MetadataShell should take an advisory lock on the .lock file of the directory it is reading from.
Add an integration test of this functionality in MetadataShellIntegrationTest.java.
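For context, a minimal sketch of taking an advisory lock on a directory's `.lock` file via `FileChannel`; the path and error handling below are hypothetical, not the actual MetadataShell code:
```java
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AdvisoryLockExample {
    public static void main(String[] args) throws Exception {
        Path lockFile = Path.of("/tmp/kraft-metadata-dir/.lock"); // hypothetical directory
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // tryLock returns null if another process already holds the lock.
            FileLock lock = channel.tryLock();
            if (lock == null) {
                throw new IllegalStateException("Directory is in use by another process");
            }
            // ... read the metadata log while holding the lock ...
            lock.release();
        }
    }
}
```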
Note: in build.gradle, I had to add some dependencies on server-common's test files in order to use
MockFaultHandler, etc.
MetadataBatchLoader.java: fix a case where a log message was incorrect. The intention was to print
the number equivalent to (offset + index). Instead it was printing the offset, followed by the
index. So if the offset was 100 and the index was 1, 1001 would be printed rather than 101.
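The mistake boils down to string concatenation versus arithmetic; a tiny standalone illustration (not the actual MetadataBatchLoader code):
```java
public class OffsetLogExample {
    public static void main(String[] args) {
        long offset = 100;
        int index = 1;
        // Buggy form: concatenation appends the two numbers, printing "1001".
        System.out.println("last applied offset: " + offset + index);
        // Fixed form: parentheses force the addition, printing "101".
        System.out.println("last applied offset: " + (offset + index));
    }
}
```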
Co-authored-by: Igor Soarez <i@soarez.me>
Reviewers: David Arthur <mumrah@gmail.com>, José Armando García Sancio <jsancio@apache.org>
We need to be careful when aborting a long poll with wakeup() since the
consumer might never return records if the poll is interrupted after the
consumer position has been updated but the records have not been returned
to the caller of poll().
This PR avoids wake-ups during this critical period.
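For context, a minimal sketch of the caller-side pattern that depends on this guarantee: a second thread calls `wakeup()` to abort a blocking `poll()`, and the consumer must not drop records it has already positioned past. The topic and configs below are hypothetical:
```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class WakeupExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("example-topic"));

        // Another thread may call consumer.wakeup() to abort a long poll. The fix described
        // above ensures the wakeup cannot land between the point where the consumer advances
        // its position and the point where records are returned to the caller, which could
        // otherwise cause those records to be silently skipped.
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                records.forEach(r -> System.out.println(r.value()));
            }
        } catch (WakeupException e) {
            // Expected on shutdown: wakeup() was called from another thread.
        } finally {
            consumer.close();
        }
    }
}
```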
Reviewers: Philip Nee <pnee@confluent.io>, Kirk True <ktrue@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
We only move Java classes that have minimal or no dependencies on Scala classes in this PR.
Details:
* Configured `server` module in build files.
* Changed `ControllerRequestCompletionHandler` to be an interface since it has no implementations.
* Cleaned up various import control files.
* Minor build clean-ups for `server-common`.
* Disabled `testAssignmentAggregation` when executed with Java 8; this is an existing issue (see #14794).
For broader context on this change, please check:
* KAFKA-15852: Move server code from `core` to `server` module
Reviewers: Divij Vaidya <diviv@amazon.com>
Reviewers: Christo Lolov <lolovc@amazon.com>, Colin P. McCabe <cmccabe@apache.org>, Proven Provenzano <pprovenzano@confluent.io>, Ron Dagostino <rdagostino@confluent.io>
Part of KIP-714.
The PR comprises changes to include the OpenTelemetry library as defined in KIP-714. The libraries are shadowed to prevent conflicts.
Co-authored-by: Xavier Léauté <xl+github@xvrl.net>
Reviewers: Andrew Schofield <aschofield@confluent.io>, Xavier Léauté <xavier@confluent.io>, Walker Carlson <wcarlson@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This PR for KIP-714 - KAFKA-1564 lays out interfaces and classes for capturing client telemetry metrics.
The image below shows the interaction of the different classes; among them, the interfaces have been included in the PR.
Reviewers: Walker Carlson <wcarlson@apache.org>, Matthias J. Sax <matthias@confluent.io>, Andrew Schofield <andrew_schofield@uk.ibm.com>, Kirk True <ktrue@confluent.io>, Philip Nee <pnee@confluent.io>, Jun Rao <junrao@gmail.com>,
git gc moves commit hashes from individual .git/refs/heads/ to .git/packed-refs which is not read
by the determineCommitId function.
Replace the existing lookup within the .git directory with a GrGit lookup that handles packed and
unpacked refs transparently.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Implements the following metrics:
kafka.server:type=group-coordinator-metrics,name=num-partitions,state=loading
kafka.server:type=group-coordinator-metrics,name=num-partitions,state=active
kafka.server:type=group-coordinator-metrics,name=num-partitions,state=failed
kafka.server:type=group-coordinator-metrics,name=event-queue-size
kafka.server:type=group-coordinator-metrics,name=partition-load-time-max
kafka.server:type=group-coordinator-metrics,name=partition-load-time-avg
kafka.server:type=group-coordinator-metrics,name=thread-idle-ratio-min
kafka.server:type=group-coordinator-metrics,name=thread-idle-ratio-avg
The PR makes these metrics generic so that in the future the transaction coordinator runtime can implement the same metrics in a similar fashion.
Also, CoordinatorLoaderImpl#load will now return a LoadSummary, which encapsulates the start time, end time, and number of records/bytes.
Co-authored-by: David Jacot <djacot@confluent.io>
Reviewers: Ritika Reddy <rreddy@confluent.io>, Calvin Liu <caliu@confluent.io>, David Jacot <djacot@confluent.io>, Justine Olshan <jolshan@confluent.io>
Spotbugs was temporarily disabled as part of KAFKA-15485 to support building Kafka with JDK 21. This PR upgrades the spotbugs version to 4.8.0, which adds support for JDK 21, and enables its usage in the build again.
Reviewers: Divij Vaidya <diviv@amazon.com>
This PR is part of #13247
It contains the ReassignPartitionsIntegrationTest rewritten in Java.
The goal of this PR is to reduce the size of the changes in the main PR.
Reviewers: Taras Ledkov <tledkov@apache.org>, Justine Olshan <jolshan@confluent.io>
* Update CI to build with Java 21 instead of Java 20
* Disable spotbugs when building with Java 21 as it doesn't support it yet (filed KAFKA-15492 for
addressing this)
* Disable SslTransportLayerTest.testValidEndpointIdentificationCN with Java 21 (same as Java 20)
Reviewers: Divij Vaidya <diviv@amazon.com>
Resolves cache misses in checkstyle tasks due to absolute paths in configProperties.
Sets configDirectory extension property, which is made available by the checkstyle plugin as ${config_loc} in the checkstyle xml files, as shown in the Checkstyle Gradle docs. The absolute paths set in configProperties are then replaced by relative paths from configDirectory. Because the header and suppression config file names are static and only referenced once, these were removed from configProperties and the file names are given directly in checkstyle.xml
Reviewers: Divij Vaidya <diviv@amazon.com>
This change does the following:
1. Make RemoteLogManagerConfigs that are implemented public
2. Add tasks to generate html docs for the configs
3. Include config docs in the main site
Reviewers: Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
`TieredStorageTestHarness` is a base class for integration tests exercising the tiered storage functionality. This uses `LocalTieredStorage` instance as the second-tier storage system and `TopicBasedRemoteLogMetadataManager` as the remote log metadata manager.
Co-authored-by: Alexandre Dupriez <alexandre.dupriez@gmail.com>
Co-authored-by: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
Only initialize remote topic metrics when system-wide remote storage is enabled to avoid impacting performance for existing brokers. Also add tests.
Reviewers: Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
* KAFKA-14953: Adding RemoteLogManager metrics
In this PR, I have added the following metrics that are related to tiered storage mentioned in [KIP-405](https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage).
|Metric|Description|
|-----------------------------------------|--------------------------------------------------------------|
| RemoteReadRequestsPerSec | Number of remote storage read requests per second |
| RemoteWriteRequestsPerSec | Number of remote storage write requests per second |
| RemoteBytesInPerSec | Number of bytes read from remote storage per second |
| RemoteReadErrorsPerSec | Number of remote storage read errors per second |
| RemoteBytesOutPerSec | Number of bytes copied to remote storage per second |
| RemoteWriteErrorsPerSec | Number of remote storage write errors per second |
| RemoteLogReaderTaskQueueSize | Number of remote storage read tasks pending for execution. |
| RemoteLogReaderAvgIdlePercent | Average idle percent of the remote storage reader thread pool|
| RemoteLogManagerTasksAvgIdlePercent | Average idle percent of RemoteLogManager thread pool |
Added unit tests for all the rate metrics.
Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>, Staniel Yao<yaolixinylx@gmail.com>, hudeqi<1217150961@qq.com>, Satish Duggana <satishd@apache.org>
KAFKA-14522: Rewrite and move RemoteIndexCache to the storage module.
Cleaned up index file suffix usages and other minor cleanups.
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>
It's good for us to add support for Java 20 in preparation for Java 21 - the next LTS.
Given that Scala 2.12 support has been deprecated, a Scala 2.12 variant is not included.
Also remove some branch builds that add load to the CI, but have
low value: JDK 8 & Scala 2.13 (JDK 8 support has been deprecated),
JDK 11 & Scala 2.12 (Scala 2.12 support has been deprecated) and
JDK 17 & Scala 2.12 (Scala 2.12 support has been deprecated).
A newer version of Mockito (4.9.0 -> 4.11.0) is required for Java 20 support, but we
only use it with Scala 2.13+ since it causes compilation errors with Scala 2.12. Similarly,
we upgrade easymock when the Java version is 16 or newer as it's incompatible
with powermock (which doesn't support Java 16 or newer).
Filed KAFKA-15117 for a test that fails with Java 20 (SslTransportLayerTest.testValidEndpointIdentificationCN).
Finally, fixed some lossy conversions that were added after #13582 was submitted.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Use the thread-safe Caffeine library to cache indexes fetched from the remote tier locally. This PR removes lock contention that led to higher fetch latencies, as the IO threads spent time unnecessarily waiting on the global cache lock while a single thread fetched the index from the remote tier. See PR #13850 for details and rejected alternatives.
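A minimal sketch of this kind of Caffeine usage (the key type and size bound are hypothetical, not the actual RemoteIndexCache configuration):
```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class CaffeineCacheExample {
    public static void main(String[] args) {
        // Caffeine is thread safe and computes values per key, so concurrent fetch threads
        // no longer serialize behind a single global cache lock.
        Cache<String, byte[]> indexCache = Caffeine.newBuilder()
                .maximumSize(1024) // hypothetical bound on cached index entries
                .build();

        byte[] index = indexCache.get("segment-0001.index", CaffeineCacheExample::fetchFromRemoteTier);
        System.out.println("fetched " + index.length + " bytes");
    }

    // Hypothetical stand-in for fetching an index file from the remote tier;
    // Caffeine invokes it at most once per key under concurrent access.
    private static byte[] fetchFromRemoteTier(String key) {
        return new byte[] {1, 2, 3};
    }
}
```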
Reviewers: Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
This fixes the following issue that we occasionally see in [builds](https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-13848/4/pipeline/13/).
```
[2023-06-14T11:41:50.769Z] * What went wrong:
[2023-06-14T11:41:50.769Z] A problem was found with the configuration of task ':rat' (type 'RatTask').
[2023-06-14T11:41:50.769Z] - Gradle detected a problem with the following location: '/home/jenkins/jenkins-agent/workspace/Kafka_kafka-pr_PR-13848'.
[2023-06-14T11:41:50.769Z]
[2023-06-14T11:41:50.769Z] Reason: Task ':rat' uses this output of task ':clients:processTestMessages' without declaring an explicit or implicit dependency. This can lead to incorrect results being produced, depending on what order the tasks are executed.
[2023-06-14T11:41:50.769Z]
[2023-06-14T11:41:50.769Z] Possible solutions:
[2023-06-14T11:41:50.769Z] 1. Declare task ':clients:processTestMessages' as an input of ':rat'.
[2023-06-14T11:41:50.769Z] 2. Declare an explicit dependency on ':clients:processTestMessages' from ':rat' using Task#dependsOn.
[2023-06-14T11:41:50.769Z] 3. Declare an explicit dependency on ':clients:processTestMessages' from ':rat' using Task#mustRunAfter.
[2023-06-14T11:41:50.769Z]
[2023-06-14T11:41:50.769Z] Please refer to https://docs.gradle.org/8.1.1/userguide/validation_problems.html#implicit_dependency for more details about this problem.
```
Validated manually as well:
```
% ./gradlew rat
> Configure project :
Starting build with version 3.6.0-SNAPSHOT (commit id 874081ca) using Gradle 8.1.1, Java 17 and Scala 2.13.10
Build properties: maxParallelForks=10, maxScalacThreads=8, maxTestRetries=0
> Task :storage:processMessages
MessageGenerator: processed 4 Kafka message JSON files(s).
> Task :raft:processMessages
MessageGenerator: processed 1 Kafka message JSON files(s).
> Task :core:processMessages
MessageGenerator: processed 2 Kafka message JSON files(s).
> Task :group-coordinator:processMessages
MessageGenerator: processed 16 Kafka message JSON files(s).
> Task :streams:processMessages
MessageGenerator: processed 1 Kafka message JSON files(s).
> Task :metadata:processMessages
MessageGenerator: processed 20 Kafka message JSON files(s).
> Task :clients:processMessages
MessageGenerator: processed 146 Kafka message JSON files(s).
> Task :clients:processTestMessages
MessageGenerator: processed 4 Kafka message JSON files(s).
BUILD SUCCESSFUL in 8s
```
Reviewers: Divij Vaidya <diviv@amazon.com>
This patch rewrites `MockTime` in Java and moves it to the `server-common` module. This is a prerequisite to moving `MockTimer` to `server-common` later on as well.
Reviewers: David Arthur <mumrah@gmail.com>
Also upgrade gradle plugins:
- `org.owasp.dependencycheck` gradle plugin to version `8.2.1`
- `com.github.johnrengelman.shadow` gradle plugin to version `8.1.1`
Gradle release notes:
* https://docs.gradle.org/8.1.1/release-notes.html
Reviewers: Ismael Juma <ismael@juma.me.uk>
Loosens the validation so that Kafka can accept duplicate listeners on the same port, but if and only if the listeners are valid IP addresses, with one address being an IPv4 address and the other being an IPv6 address.
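A rough sketch of the relaxed rule (simplified; the real validation parses listener endpoints and does not resolve hostnames, which `InetAddress.getByName` would do for non-literal hosts):
```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;

public class DuplicatePortCheck {
    // Two listeners may share a port only if both are literal IP addresses and
    // one is IPv4 while the other is IPv6.
    static boolean allowedOnSamePort(String host1, String host2) throws Exception {
        InetAddress a = InetAddress.getByName(host1);
        InetAddress b = InetAddress.getByName(host2);
        return (a instanceof Inet4Address && b instanceof Inet6Address)
            || (a instanceof Inet6Address && b instanceof Inet4Address);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(allowedOnSamePort("127.0.0.1", "::1"));      // true: IPv4 + IPv6
        System.out.println(allowedOnSamePort("127.0.0.1", "10.0.0.1")); // false: both IPv4
    }
}
```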
Reviewers: Josep Prat <jlprat@apache.org>, Luke Chen <showuon@apache.org>
Introduces the use of persistent data structures in the KRaft metadata image to avoid copying the entire TopicsImage upon every change, which matters for clusters with large topic counts. Performance that was O(<number of topics in the cluster>) is now O(<number of topics changing>), which yields dramatic time and GC improvements for the most common topic-related metadata events. We abstract away the chosen underlying persistent collection library via ImmutableMap<> and ImmutableSet<> interfaces and static factory methods.
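To illustrate the structural-sharing idea, here is a small example using the pcollections persistent map; whether that is the exact library behind the ImmutableMap/ImmutableSet abstraction is an assumption, the point is only the behaviour:
```java
import org.pcollections.HashTreePMap;
import org.pcollections.PMap;

public class PersistentMapExample {
    public static void main(String[] args) {
        PMap<String, Integer> topics = HashTreePMap.<String, Integer>empty()
                .plus("foo", 3)
                .plus("bar", 6);

        // "Updating" returns a new map that shares unchanged structure with the old one,
        // so producing a new image no longer costs O(total topics).
        PMap<String, Integer> updated = topics.plus("baz", 12);

        System.out.println(topics.containsKey("baz")); // false: old image untouched
        System.out.println(updated.size());            // 3
    }
}
```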
Reviewers: Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>, Purshotam Chauhan <pchauhan@confluent.io>
A previous change disabled strict stubbing for the `RocksDBMetricsRecorderTest`. To re-enable the behavior in JUnit 5, we need to pull in a new dependency in the `streams` gradle project.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
The new group coordinator needs to access cluster metadata (e.g. topics, partitions, etc.) and it needs a mechanism to be notified when the metadata changes (e.g. to trigger a rebalance). In KRaft clusters, the easiest way is to subscribe to metadata changes via the MetadataPublisher.
Reviewers: Justine Olshan <jolshan@confluent.io>
This fixes the following `./gradlew install` issue:
```text
* What went wrong:
A problem was found with the configuration of task ':storage:srcJar' (type 'Jar').
- Gradle detected a problem with the following location: '/Users/ijuma/src/kafka/storage/src/generated/java'.
Reason: Task ':storage:srcJar' uses this output of task ':storage:processMessages' without declaring an explicit or implicit dependency. This can lead to incorrect results being produced, depending on what order the tasks are executed.
Possible solutions:
1. Declare task ':storage:processMessages' as an input of ':storage:srcJar'.
2. Declare an explicit dependency on ':storage:processMessages' from ':storage:srcJar' using Task#dependsOn.
3. Declare an explicit dependency on ':storage:processMessages' from ':storage:srcJar' using Task#mustRunAfter.
Please refer to https://docs.gradle.org/8.0.1/userguide/validation_problems.html#implicit_dependency for more details about this problem.
```
Reviewers: David Jacot <david.jacot@gmail.com>
Also re-enable it in CI. We do this by adjusting the `Jenkinsfile`
to use a more general task (`./gradlew check -x test`).
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Dejan Stojadinović <dejan2609@users.noreply.github.com>
Details:
* gradle upgrade: 7.6 -> 8.0.1
* spotbugs plugin upgrade: 5.0.9 -> 5.0.13
* tweaked the mechanics for `-release`/`-source`/`-target` to work around idiosyncrasies in Gradle 8.0.1 and newer Scala 2.13 versions.
* streams-scala `test` task no longer triggers the `spotless` task since a newer version is required for Gradle 8 support, but the newer version requires Java 11.
Note: relates to #5479
Gradle upgrade highlights:
* "Scala Incremental Compilation for Multi-Module projects broken in 7.x": https://github.com/gradle/gradle/issues/20101
* "Incremental compilation of java modules is broken with Gradle 7.6": https://github.com/gradle/gradle/issues/23067
Full release notes: https://docs.gradle.org/8.0/release-notes.html
Reviewers: Ismael Juma <ismael@juma.me.uk>
Reviewers: Daniel Urban <durban@cloudera.com>, Greg Harris <greg.harris@aiven.io>, Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
This patch moves the current `__consumer_offsets` records from the `core` module to the new `group-coordinator` module.
Reviewers: Christo Lolov <lolovc@amazon.com>, Mickael Maison <mickael.maison@gmail.com>
Make sure no scaladoc warnings are emitted from the streams-scala project build.
We cannot fully fix all scaladoc warnings due to limitations of the scaladoc tool,
so this is a best-effort attempt at fixing as many warnings as possible. We also
disable one problematic class of scaladoc warnings (link errors) in the gradle build.
The causes of existing warnings are that we link to java members from scaladoc, which
is not possible, or we fail to disambiguate some members.
The broad rule applied in the changes is
- For links to Java members such as [[StateStore]], we use the fully qualified name in a code tag
to make manual link resolution via a search engine easy.
- For some common terms that are also linked to Java members, like [[Serde]], we omit the link.
- We disambiguate where possible.
- In the special case of @throws declarations with Java exceptions, we do not seem to be able
to avoid the warning altogether.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
Most were converted not to use PowerMock, but some no
longer exist.
Reviewers: Chris Egerton <chrise@aiven.io>, Christo Lolov <christo_lolov@yahoo.com>
This patch migrates all the internal APIs of the current group coordinator to the new `GroupCoordinator` interface. It also makes the current implementation package private to ensure that it is not used anymore.
Reviewers: Justine Olshan <jolshan@confluent.io>
There were some concurrency inconsistencies in `KafkaScheduler` flagged by spotBugs
that had to be fixed, summary of changes below:
* Executor is `volatile`
* We always synchronize and check `isStarted` as the first thing within the critical
section when a mutating operation is performed.
* We don't synchronize in read-only operations that operate on the executor, but we ensure the
executor is not null in a safe way (see the sketch below).
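A condensed sketch of that locking pattern (illustrative only, not the actual KafkaScheduler code):
```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SchedulerSketch {
    private volatile ScheduledThreadPoolExecutor executor;

    public void startup() {
        synchronized (this) {
            if (isStarted())
                throw new IllegalStateException("This scheduler has already been started.");
            executor = new ScheduledThreadPoolExecutor(1);
        }
    }

    public void scheduleOnce(Runnable task, long delayMs) {
        synchronized (this) {
            // Mutating operations check isStarted first, inside the critical section.
            if (isStarted())
                executor.schedule(task, delayMs, TimeUnit.MILLISECONDS);
        }
    }

    public boolean isStarted() {
        // Read-only check: no synchronization needed because executor is volatile.
        return executor != null;
    }
}
```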
With regards to `MockScheduler/MockTask`:
* Set the type of `nextExecution` to `AtomicLong` and replaced inconsistent synchronization
* Extracted logic into `MockTask.rescheduleIfPeriodic`
Tweaked the `Scheduler` interface a bit:
* Removed `unit` parameter since we always used `ms` except one invocation
* Introduced a couple of `scheduleOnce` overloads to replace the usage of default
arguments in Scala
* Pulled up `resizeThreadPool` to the interface and removed `isStarted` from the
interface.
Other cleanups:
* Removed spotBugs exclusion affecting `kafka.log.LogConfig`, which no longer exists.
For broader context, see:
* KAFKA-14470: Move log layer to storage module
Reviewers: Jun Rao <junrao@gmail.com>
Also improved `LogValidatorTest` to cover a bug that was originally
only caught by `LogAppendTimeTest`.
For broader context on this change, please check:
* KAFKA-14470: Move log layer to storage module
Reviewers: Jun Rao <junrao@gmail.com>
This PR implements the follower fetch protocol as mentioned in KIP-405.
Added a new version of the ListOffsets protocol to receive the local log start offset on the leader replica. This is used by follower replicas to find the local log start offset on the leader.
Added a new version for FetchRequest protocol to receive OffsetMovedToTieredStorageException error. This is part of the enhanced fetch protocol as described in KIP-405.
We introduced a new field, localLogStartOffset, to maintain the log start offset in the local logs. The existing logStartOffset will continue to be the log start offset of the effective log that includes the segments in remote storage.
When a follower receives OffsetMovedToTieredStorage, then it tries to build the required state from the leader and remote storage so that it can be ready to move to fetch state.
Introduced RemoteLogManager, which is responsible for:
- initializing RemoteStorageManager and RemoteLogMetadataManager instances,
- receiving any leader and follower replica events and partition stop events and acting on them,
- providing APIs to fetch indexes and metadata about remote log segments.
Followup PRs will add more functionality like copying segments to tiered storage, retention checks to clean local and remote log segments. This will change the local log start offset and make sure the follower fetch protocol works fine for several cases.
You can look at the detailed protocol changes in KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP405:KafkaTieredStorage-FollowerReplication
Co-authors: satishd@apache.org, kamal.chandraprakash@gmail.com, yingz@uber.com
Reviewers: Kowshik Prakasam <kprakasam@confluent.io>, Cong Ding <cong@ccding.com>, Tirtha Chatterjee <tirtha.p.chatterjee@gmail.com>, Yaodong Yang <yangyaodong88@gmail.com>, Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com>, Jun Rao <junrao@gmail.com>
`core` should only be used for legacy cli tools and tools that require
access to `core` classes instead of communicating via the kafka protocol
(typically by using the client classes).
Summary of changes:
1. Convert the command implementation and tests to Java and move it to
the `tools` module.
2. Introduce mechanism to capture stdout and stderr from tests.
3. Change `kafka-metadata-quorum.sh` to point to the new command class.
4. Adjusted the test classpath of the `tools` module so that it supports tests
that rely on the `@ClusterTests` annotation.
5. Improved error handling when an exception different from `TerseFailure` is
thrown.
6. Changed `ToolsUtils` to avoid usage of arrays in favor of `List`.
Reviewers: dengziming <dengziming1993@gmail.com>
This patch moves the timeline data structures from the metadata module to the server-common module, as those will be used in the new group coordinator.
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Colin Patrick McCabe <cmccabe@apache.org>
This PR builds on top of #11017. I have included the previous author's commits in this PR for attribution. I have also addressed the pending comments from @chia7712 in this PR.
Notes to help the reviewer:
Mockito has a mockStatic method which is equivalent to PowerMock's.
When we run the tests using @RunWith(MockitoJUnitRunner.StrictStubs.class) Mockito performs a verify() for all stubs that are mentioned, hence, there is no need to explicitly verify the stubs (unless you want to verify the number of times etc.). Note that this does not work for static mocks.
Reviewers: Bruno Cadonna <cadonna@apache.org>, Walker Carlson <wcarlson@confluent.io>, Bill Bejeck <bbejeck@apache.org>
Also moves the Streams LogCaptureAppender class into the clients module so that it can be used by both Streams and Connect.
Reviewers: Nigel Liang <nigel@nigelliang.com>, Kalpesh Patel <kpatel@confluent.io>, John Roesler <vvcephei@apache.org>, Tom Bentley <tbentley@redhat.com>
In https://github.com/apache/kafka/pull/12695, we discovered a gap in our testing of `StandardAuthorizer`. We addressed the specific case that was failing, but I think we need to establish a better methodology for testing which incorporates randomized inputs. This patch is a start in that direction. We implement a few basic property tests using jqwik which focus on prefix searching. It catches the case from https://github.com/apache/kafka/pull/12695 prior to the fix. In the future, we can extend this to cover additional operation types, principal matching, etc.
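A minimal jqwik property test, just to show the shape such tests take; the property below is a trivial hypothetical one about prefixes, not one of the actual StandardAuthorizer properties:
```java
import net.jqwik.api.ForAll;
import net.jqwik.api.Property;

class PrefixMatchingProperties {
    // Hypothetical property: a name always starts with every one of its own prefixes.
    // jqwik generates randomized inputs for the @ForAll parameter and shrinks failures.
    @Property
    boolean everyPrefixOfANameMatchesTheName(@ForAll String name) {
        for (int i = 0; i <= name.length(); i++) {
            String prefix = name.substring(0, i);
            if (!name.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```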
Reviewers: David Arthur <mumrah@gmail.com>
Kafka Streams expects a version resource at /kafka/kafka-streams-version.properties, which is read by ClientMetrics. Unfortunately, the name of the generated file was wrong, and thus it was not copied as a resource.
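For reference, a small sketch of how such a classpath resource is read; the `version` property key is an assumption:
```java
import java.io.InputStream;
import java.util.Properties;

public class VersionResourceExample {
    public static void main(String[] args) throws Exception {
        // The resource name Streams expects on the classpath.
        try (InputStream in = VersionResourceExample.class
                .getResourceAsStream("/kafka/kafka-streams-version.properties")) {
            Properties props = new Properties();
            if (in != null) {
                props.load(in);
            }
            // "version" is a hypothetical key for illustration.
            System.out.println(props.getProperty("version", "unknown"));
        }
    }
}
```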
Reviewer: Bruno Cadonna <cadonna@apache.org>
We sometimes see build failures where the code encounters an exit condition and fails abruptly. For example:
```
[2022-09-18T10:01:25.947Z] * What went wrong:
[2022-09-18T10:01:25.947Z] Execution failed for task ':core:unitTest'.
[2022-09-18T10:01:25.947Z] > Process 'Gradle Test Executor 116' finished with non-zero exit value 1
```
When this happens, it can be difficult to track the failure back to a specific test from the build output because we don't know which test was executing on 'Gradle Test Executor 116.'
There is a test logging property in gradle called `displayGranularity`, which lets us see the executor for each test run: https://docs.gradle.org/current/dsl/org.gradle.api.tasks.testing.logging.TestLogging.html#org.gradle.api.tasks.testing.logging.TestLogging:displayGranularity. When `displayGranularity` is set to 2 (the default), we get the following:
```
AdminZkClientTest > testGetBrokerMetadatas() PASSED
```
When set to 0, it looks like this instead:
```
Gradle Test Run :core:test > Gradle Test Executor 76 > AdminZkClientTest > testGetBrokerMetadatas() PASSED
```
Having this extra information should make it easier to debug failures.
Reviewers: Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>
Verified that the artifact generated by `releaseTarGz` no longer includes
swagger-jaxrs2 or its dependencies (like snakeyaml).
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Chris Egerton <fearthecellos@gmail.com>
Add `MetadataQuorumCommand` to describe quorum status. I'm trying to use the argparse4j-style command format. Currently, we only support one sub-command, "describe", and we can specify two arguments: --status and --replication.
```
# describe quorum status
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication
ReplicaId LogEndOffset Lag LastFetchTimeMs LastCaughtUpTimeMs Status
0 10 0 -1 -1 Leader
1 10 0 -1 -1 Follower
2 10 0 -1 -1 Follower
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
ClusterId: fMCL8kv1SWm87L_Md-I2hg
LeaderId: 3002
LeaderEpoch: 2
HighWatermark: 10
MaxFollowerLag: 0
MaxFollowerLagTimeMs: -1
CurrentVoters: [3000,3001,3002]
CurrentObservers: [0,1,2]
# specify AdminClient properties
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 --command-config config.properties describe --status
```
Reviewers: Jason Gustafson <jason@confluent.io>
When consuming both `kafka-client:3.0.1` and `kafka-client:3.0.1:test`
through Maven, a hygiene tool was detecting multiple instances of the same
class loaded into the classpath.
Verified this change by building locally with a before and after build with
`./gradlew clients:publishToMavenLocal`, then used beyond compare to
verify the contents.
Reviewers: Ismael Juma <ismael@juma.me.uk>
This PR is created on top of #10904 and includes commits from original author for attribution.
## Testing
1. `./gradlew connect:runtime:unitTest --tests WorkerGroupMemberTest` is successful.
2. Verified that test is run as part of `./gradlew connect:runtime:unitTest` (see report in the PR)
Reviewers: Ismael Juma <ismael@juma.me.uk>
Co-authored-by: Chun-Hao Tang <tang7526@gmail.com>
Fix a bug in the KAFKA-14124 PR where a gradle test dependency was missing.
This causes missing test class exceptions.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Before trying to commit a batch of records to the __cluster_metadata log, the active controller
should try to apply them to its current in-memory state. If this application process fails, the
active controller process should exit, allowing another node to take leadership. This will prevent
most bad metadata records from ending up in the log and help to surface errors during testing.
Similarly, if the active controller attempts to renounce leadership, and the renunciation process
itself fails, the process should exit. This will help avoid bugs where the active controller
continues in an undefined state.
In contrast, standby controllers that experience metadata application errors should continue on, in
order to avoid a scenario where a bad record brings down the whole controller cluster. The
intended effect of these changes is to make it harder to commit a bad record to the metadata log,
but to continue to ride out the bad record as well as possible if such a record does get committed.
This PR introduces the FaultHandler interface to implement these concepts. In junit tests, we use a
FaultHandler implementation which does not exit the process. This allows us to avoid terminating
the gradle test runner, which would be very disruptive. It also allows us to ensure that the test
surfaces these exceptions, which we previously were not doing (the mock fault handler stores the
exception).
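A simplified sketch of the two fault-handler behaviours described above (the interface shape is illustrative, not the exact Kafka API):
```java
// Illustrative sketch, not the exact Kafka API.
interface FaultHandler {
    void handleFault(String failureMessage, Throwable cause);
}

// On the active controller, a metadata application error is treated as fatal
// so that another node can take over leadership.
class ProcessExitingFaultHandler implements FaultHandler {
    @Override
    public void handleFault(String failureMessage, Throwable cause) {
        System.err.println("Fatal fault: " + failureMessage);
        Runtime.getRuntime().exit(1);
    }
}

// In junit tests, the fault is recorded instead of exiting, so the gradle test
// runner survives and the test can assert on the stored exception.
class MockFaultHandler implements FaultHandler {
    private Throwable firstFault;

    @Override
    public synchronized void handleFault(String failureMessage, Throwable cause) {
        if (firstFault == null) {
            firstFault = new RuntimeException(failureMessage, cause);
        }
    }

    public synchronized Throwable firstFault() {
        return firstFault;
    }
}
```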
In addition to the above, this PR fixes a bug where RaftClient#resign was not being called from the
renounce() function. This bug could have resulted in the raft layer not being informed of an active
controller resigning.
Reviewers: David Arthur <mumrah@gmail.com>
When the migration of the Streams project to JUnit 5 started with PR #12285, we discovered that the migrated tests were not run by the PR builds. This PR ensures that Streams' tests that are written in JUnit 4 and JUnit 5 are run in the PR builds.
Co-authored-by: Divij Vaidya <diviv@amazon.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Bruno Cadonna <cadonna@apache.org>
Highlights:
* The default Scala Zinc version was updated from 1.3.5 to 1.6.1
* Multiple Checkstyle tasks may now run in parallel within a project
* Support for Java 18
* Much more responsive continuous builds on Windows and macOS
* Improved diagnostics for dependency resolution
Some of our tests require java.util and java.lang modules to be open,
so do it explicitly given the following Gradle bug fix:
> When running on Java 9+, Gradle no longer opens the java.base/java.util
> and java.base/java.lang JDK modules for all Test tasks. In some cases,
> this would cause code to pass during testing but fail at runtime.
Release notes: https://docs.gradle.org/7.5/release-notes.html
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Luke Chen <showuon@gmail.com>
* KAFKA-13930: Add 3.2.0 Streams upgrade system tests
Apache Kafka 3.2.0 was recently released. Now we need
to test upgrades from 3.2 to trunk in our system tests.
Reviewer: Bill Bejeck <bbejeck@apache.org>
Clients remain connected and able to produce or consume despite an expired OAUTHBEARER token.
Root cause seems to be SaslServerAuthenticator#calcCompletionTimesAndReturnSessionLifetimeMs failing to set ReauthInfo#sessionExpirationTimeNanos when tokens have already expired (when session life time goes negative), in turn causing KafkaChannel#serverAuthenticationSessionExpired returning false and finally SocketServer not closing the channel.
The issue is observed with OAUTHBEARER but seems to have a wider impact on SASL re-authentication.
Reviewers: Luke Chen <showuon@gmail.com>, Tom Bentley <tbentley@redhat.com>, Sam Barker <sbarker@redhat.com>
New gradle task `connect:runtime:genConnectOpenAPIDocs` that generates `connect_rest.yaml` under `docs/generated`.
This task is executed when `siteDocsTar` runs.
This patch builds on #12072 and adds controller support for metadata.version. The kafka-storage tool now allows a
user to specify a specific metadata.version to bootstrap into the cluster, otherwise the latest version is used.
Upon the first leader election of the KRaft quorum, this initial metadata.version is written into the metadata log. When
writing snapshots, a FeatureLevelRecord for metadata.version will be written out ahead of other records so we can
decode things at the correct version level.
This also includes additional validation in the controller when setting feature levels. It will now check that a given
metadata.version is supportable by the quorum, not just the brokers.
Reviewers: José Armando García Sancio <jsancio@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, dengziming <dengziming1993@gmail.com>, Alyssa Huang <ahuang@confluent.io>
* Replace `log4j` with `reload4j` in `copyDependantLibs`. Since we have
some projects that have an explicit `reload4j` dependency, it
was included in the final release tar - i.e. it was effectively
a workaround for this bug.
* Exclude `log4j` and `slf4j-log4j12` transitive dependencies for
`streams:upgrade-system-tests`. Versions 0100 and 0101
had a transitive dependency to `log4j` and `slf4j-log4j12` via
`zkclient` and `zookeeper`. This avoids classpath conflicts that lead
to [NoSuchFieldError](https://github.com/qos-ch/reload4j/issues/41) in
system tests.
Reviewers: Jason Gustafson <jason@confluent.io>
Refactoring ApiVersion to MetadataVersion to support both old IBP versioning and new KRaft versioning (feature flags)
for KIP-778.
IBP versions are now encoded as enum constants and explicitly prefixed with IBP_ instead of KAFKA_, and the
LegacyApiVersion vs DefaultApiVersion split was not necessary and has been replaced with appropriate parsing rules for extracting
the correct shortVersions/versions.
Co-authored-by: David Arthur <mumrah@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
* Fix UP-TO-DATE check in `create*VersionFile` tasks
`create*VersionFile` tasks explicitly declared output UP-TO-DATE status
as being false. This change properly sets the inputs to
`create*VersionFile` tasks to the `commitId` and `version` values and
sets `receiptFile` locally rather than in an extra property.
* Enable output caching for `process*Messages` tasks
`process*Messages` tasks did not have output caching enabled. This
change enables that caching, as well as setting a property name and
RELATIVE path sensitivity.
* Fix existing Gradle deprecations
Replaces `JavaExec#main` with `JavaExec#mainClass`
Replaces `Report#destination` with `Report#outputLocation`
Adds a `generator` configuration to projects that need to resolve
the `generator` project (rather than referencing the runtimeClasspath
of the `generator` project from other project contexts).
Reviewers: Mickael Maison <mickael.maison@gmail.com>
This patch adds audit support through the kafka.authorizer.logger logger to StandardAuthorizer. It
follows the same conventions as AclAuthorizer with a similarly formatted log message. When
logIfAllowed is set in the Action, then the log message is at DEBUG level; otherwise, we log at
TRACE. When logIfDenied is set, then the log message is at INFO level; otherwise, we again log at
TRACE.
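The level selection can be summarised with a small slf4j sketch (illustrative only, not the actual StandardAuthorizer code):
```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class AuditLogSketch {
    // Same logger name convention as AclAuthorizer's audit logging.
    private static final Logger AUDIT_LOG = LoggerFactory.getLogger("kafka.authorizer.logger");

    void logAuthorization(String message, boolean allowed, boolean logIfAllowed, boolean logIfDenied) {
        if (allowed) {
            // Allowed actions: DEBUG when the action asks to be logged, TRACE otherwise.
            if (logIfAllowed) {
                AUDIT_LOG.debug(message);
            } else {
                AUDIT_LOG.trace(message);
            }
        } else {
            // Denied actions: INFO when the action asks to be logged, TRACE otherwise.
            if (logIfDenied) {
                AUDIT_LOG.info(message);
            } else {
                AUDIT_LOG.trace(message);
            }
        }
    }
}
```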
Reviewers: Colin P. McCabe <cmccabe@apache.org>