JDK 15 no longer receives updates, so we want to switch from JDK 15 to JDK 16.
However, we have a number of tests that don't yet pass with JDK 16.
Instead of replacing JDK 15 with JDK 16, we have both for now and we either
disable (via annotations) or exclude (via gradle) the tests that don't pass with
JDK 16 yet. The annotations approach is better, but it doesn't work for tests
that rely on the PowerMock JUnit 4 runner.
Also add `--illegal-access=permit` when building with JDK 16 to make MiniKdc
work for now. This has been removed in JDK 17, so we'll have to figure out
another solution when we migrate to that.
Relevant JIRAs for the disabled tests: KAFKA-12790, KAFKA-12941, KAFKA-12942.
Moved some assertions from `testTlsDefaults` to `testUnsupportedTlsVersion`
since the former claims to test the success case while the former tests the failure case.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
* Development of EasyMock and PowerMock has stagnated while Mockito continues to be actively developed. With the new Java cadence, it's a problem to depend on libraries that do bytecode generation and are not actively maintained. In addition, Mockito is also easier to use.KAFKA-7438
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>, Bruno Cadonna <cadonna@apache.org>
The command used by our private CI is ./gradlew cleanTest xxx:test. It does not re-run test when we use unitTest and integrationTest to replace test. The root cause is that we don't offer test output (unitTest and integrationTest) to cleanTest task and so it does not delete related test output.
Reviewers: Ismael Juma <ismael@juma.me.uk>
This patch removes the temporary shim layer we added to bridge the interface
differences between MetaLogManager and RaftClient. Instead, we now use the
RaftClient directly from the metadata module. This also means that the
metadata gradle module now depends on raft, rather than the other way around.
Finally, this PR also consolidates the handleResign and handleNewLeader APIs
into a single handleLeaderChange API.
Co-authored-by: Jason Gustafson <jason@confluent.io>
Added server-common module to have server side common classes. Moved ApiMessageAndVersion, RecordSerde, AbstractApiMessageSerde, and BytesApiMessageSerde to server-common module.
Reivewers: Kowshik Prakasam <kprakasam@confluent.io>, Jun Rao <junrao@gmail.com>
KAFKA-12429: Added serdes for the default implementation of RLMM based on an internal topic as storage. This topic will receive events of RemoteLogSegmentMetadata, RemoteLogSegmentUpdate, and RemotePartitionDeleteMetadata. These events are serialized into Kafka protocol message format.
Added tests for all the event types for that topic.
This is part of the tiered storaqe implementation KIP-405.
Reivewers: Kowshik Prakasam <kprakasam@confluent.io>, Jun Rao <junrao@gmail.com>
Move Trogdor out of tools and into its own gradle module. This allows us to minimize
the dependencies of the tools module. We still keep Trogdor in the CLASSPATH
created by kafka-run-class.sh.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Verified that `./bin/kafka-metadata-shell.sh --help` on the release tarball
works as expected. It failed with a `ClassNotFoundException` before this
change.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Gwen (Chen) Shapira <cshapi@gmail.com>
The Gradle RAT plugin properly declares inputs and outputs and is also
cachable. This also relieves the Kafka developers from maintaining the build
integration with RAT.
The generated RAT report is identical to the one generated previously. The only
difference is the RAT report name: the RAT plugin sets the HTML report name to
`index.html` (still under `build/rat`).
Verified that the rat task fails if unlicensed files are present (and not excluded). Also
`./gradlew rat` succeeds when there is no .git folder.
`testRuntime` is deprecated in Gradle 6.x and has been removed in Gradle 7.0.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Satish Duggana <satishd@apache.org>
Moved tiered storage API classes from clients module to a new storage-api module.
Created storage and storage-api modules. All the remote storage API classes are moved to storage-api module. All the remote storage implementation classes will be added to storage module.
Reviewers: Jun Rao <junrao@gmail.com>
Fixes the LICENSE files that we ship with our releases:
* the source-distribution license included wrong and unnecessary dependencies
* the binary-distribution license was missing most of our actual dependencies
Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>, Ewen Cheslack-Postava <ewencp@apache.org>, Justin Mclean <jmclean@apache.org>
* Standardize license headers in scala, python, and gradle files.
* Relocate copyright attribution to the NOTICE.
* Add a license header check to `spotless` for scala files.
Reviewers: Ewen Cheslack-Postava <ewencp@apache.org>, Matthias J. Sax <mjsax@apache.org>, A. Sophie Blee-Goldman <ableegoldman@apache.org
`Self-managed` is also used in the context of Cloud vs on-prem and it can
be confusing.
`KRaft` is a cute combination of `Kafka Raft` and it's pronounced like `craft`
(as in `craftsmanship`).
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jose Sancio <jsancio@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Ron Dagostino <rdagostino@confluent.io>
As @chia7712 found, we currently apply the `java` plugin indirectly via `rat.gradle`
if the `.git` folder exists (7071ded2a6/gradle/rat.gradle (L101)).
This led to inconsistent behavior if the `.git` directory was not present. With
this change, the `java-library` plugin (which has slightly more features than
the `java` plugin) is always applied.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, John Roesler <vvcephei@apache.org>
There were errors while generating javadoc for the streams:test-utils module
because the included TopologyTestDriver imported some excluded classes.
This fixes the errors by inlining the previously excluded packages.
Reviewers: Chia-Ping Tsai <chia7712@apache.org>, Ismael Juma <ijuma@apache.org>
1. replace org.junit.Assert by org.junit.jupiter.api.Assertions
2. replace org.junit by org.junit.jupiter.api
3. replace Before by BeforeEach
4. replace After by AfterEach
5. remove ExternalResource from all scala modules
6. add explicit AfterClass/BeforeClass to stop/start EmbeddedKafkaCluster
Noted that this PR does not migrate stream module to junit 5 so it does not introduce callback of junit 5 to deal with beforeAll/afterAll. The next PR of migrating stream module can replace explicit beforeAll/afterAll by junit 5 extension. Or we can keep the beforeAll/afterAll if it make code more readable.
Reviewers: John Roesler <vvcephei@apache.org>
KIP-500 is not particularly descriptive. I also tweaked the readme text a bit.
Tested that the readme for self-managed still works after these changes.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Ron Dagostino <rdagostino@confluent.io>, Jason Gustafson <jason@confluent.io>
Builds are failing since this file does not have a license. Similar .md files do not seem to have licenses,
so I've added this file to the exclude list.
It was incorrectly set within `dependencyUpdates` and it still worked.
That is, this is a no-op in terms of behavior, but makes it easier to
read and understand.
I tested that `releaseTarGz` produced a tar with two versions of
`javassist` _without this block_ and a single version with it (in either
position).
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
`org.rocksdb.Options` is part of Kafka Streams public api via `RocksDBConfigSetter`.
Reviewers: Anna Sophie Blee-Goldman <sophie@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
This patch changes the raft simulation tests to use jqwik, which is a property testing library. This provides two main benefits:
- It simplifies the randomization of test parameters. Currently the tests use a fixed set of `Random` seeds, which means that most builds are doing redundant work. We get a bigger benefit from allowing each build to test different parameterizations.
- It makes it easier to reproduce failures. Whenever a test fails, jqwik will report the random seed that failed. A developer can then modify the `@Property` annotation to use that specific seed in order to reproduce the failure.
This patch also includes an optimization for `MockLog.earliestSnapshotId` which reduces the time to run the simulation tests dramatically.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>, José Armando García Sancio <jsancio@gmail.com>, David Jacot <djacot@confluent.io>
This is useful when debugging build issues. I also removed two printlns that are now redundant, so
this makes the build more informative and less noisy at the same time. Example output:
> Starting build with version 3.0.0-SNAPSHOT using Gradle 6.8.3, Java 15 and Scala 2.13.5
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>
- Use consistent options for `javadoc` and `aggregatedJavadoc`
- `aggregatedJavadoc` depends on `compileJava`
- `connect-api` inherits `options.links`
- `streams` and `streams-test-utils` javadoc exclusions should be more
specific to avoid unexpected behavior in `aggregatedJavadoc` when the
javadoc for multiple modules is generated together
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Gradle 7.0 is required for Java 16 compatibility and it removes a number of
deprecated APIs. Fix most issues preventing the upgrade to Gradle 7.0.
The remaining ones are more complicated and should be handled
in a separate PR. Details of the changes:
* Release tarball no longer includes includes test, sources, javadoc and test sources jars (these
are still published to the Maven Central repository).
* Replace `compile` with `api` or `implementation` - note that `implementation`
dependencies appear with `runtime` scope in the pom file so this is a (positive)
change in behavior
* Add missing dependencies that were uncovered by the usage of `implementation`
* Replace `testCompile` with `testImplementation`
* Replace `runtime` with `runtimeOnly` and `testRuntime` with `testRuntimeOnly`
* Replace `configurations.runtime` with `configurations.runtimeClasspath`
* Replace `configurations.testRuntime` with `configurations.testRuntimeClasspath` (except for
the usage in the `streams` project as that causes a cyclic dependency error)
* Use `java-library` plugin instead of `java`
* Use `maven-publish` plugin instead of deprecated `maven` plugin - this changes the
commands used to publish and to install locally, but task aliases for `install` and
`uploadArchives` were added for backwards compatibility
* Removed `-x signArchives` line from the readme since it was wrong (it was a
no-op before and it fails now, however)
* Replaces `artifacts` block with an approach that works with the `maven-publish` plugin
* Don't publish `jmh-benchmark` module - the shadow jar is pretty large and not
particularly useful (before this PR, we would publish the non shadow jars)
* Replace `version` with `archiveVersion`, `baseName` with `archiveBaseName` and
`classifier` with `archiveClassifier`
* Update Gradle and plugins to the latest stable version (7.0 is not stable yet)
* Use `plugin` DSL to configure plugins
* Updated notable changes for 3.0
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Randall Hauch <rhauch@gmail.com>
KIP-405 introduces tiered storage feature in Kafka. With this feature, Kafka cluster is configured with two tiers of storage - local and remote. The local tier is the same as the current Kafka that uses the local disks on the Kafka brokers to store the log segments. The new remote tier uses systems, such as HDFS or S3 or other cloud storages to store the completed log segments. Consumers fetch the records stored in remote storage through the brokers with the existing protocol.
We introduced a few SPIs for plugging in log/index store and remote log metadata store.
This involves two parts
1. Storing the actual data in remote storage like HDFS, S3, or other cloud storages.
2. Storing the metadata about where the remote segments are stored. The default implementation uses an internal Kafka topic.
You can read KIP for more details at https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage
Reviewers: Jun Rao <junrao@gmail.com>
Implements KIP-695
Reverts a previous behavior change to Consumer.poll and replaces
it with a new Consumer.currentLag API, which returns the client's
currently cached lag.
Uses this new API to implement the desired task idling semantics
improvement from KIP-695.
Reverts fdcf8fbf72 / KAFKA-10866: Add metadata to ConsumerRecords (#9836)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Guozhang Wang <guozhang@apache.org>
As mentioned in #9548, users currently use the kafka jar (`core` module)
for integration testing and the current inlining behavior causes
problems when the user's classpath contains a different Scala version
than the one that was used for compilation (e.g. 2.13.4 versus 2.13.3).
An example error:
`java.lang.NoClassDefFoundError: scala/math/Ordering$$anon$7`
We now disable inlining of the `scala` package by default, but make it
easy to enable it for those who so desire (a good option if you can
ensure the scala library version matches the one used for compilation).
While at it, we make it possible to disable scala compiler optimizations
(`none`) or to use only method local optimizations (`method`). This can
be useful if optimizing for compilation time during development.
Verified behavior by running gradlew with `--debug` and checking the
output after `[zinc] The Scala compiler is invoked with:`
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The quorum controller stores metadata in the KIP-500 metadata log, not in Apache
ZooKeeper. Each controller node is a voter in the metadata quorum. The leader of the
quorum is the active controller, which processes write requests. The followers are standby
controllers, which replay the operations written to the log. If the active controller goes away,
a standby controller can take its place.
Like the ZooKeeper-based controller, the quorum controller is based on an event queue
backed by a single-threaded executor. However, unlike the ZK-based controller, the quorum
controller can have multiple operations in flight-- it does not need to wait for one operation
to be finished before starting another. Therefore, calls into the QuorumController return
CompleteableFuture objects which are completed with either a result or an error when the
operation is done. The QuorumController will also time out operations that have been
sitting on the queue too long without being processed. In this case, the future is completed
with a TimeoutException.
The controller uses timeline data structures to store multiple "versions" of its in-memory
state simultaneously. "Read operations" read only committed state, which is slightly older
than the most up-to-date in-memory state. "Write operations" read and write the latest
in-memory state. However, we can not return a successful result for a write operation until
its state has been committed to the log. Therefore, if a client receives an RPC response, it
knows that the requested operation has been performed, and can not be undone by a
controller failover.
Reviewers: Jun Rao <junrao@gmail.com>, Ron Dagostino <rdagostino@confluent.io>
The Kafka Metadata shell is a new command which allows users to
interactively examine the metadata stored in a KIP-500 cluster.
It can examine snapshot files that are specified via --snapshot.
The metadata tool works by replaying the log and storing the state into
in-memory nodes. These nodes are presented in a fashion similar to
filesystem directories.
Reviewers: Jason Gustafson <jason@confluent.io>, David Arthur <mumrah@gmail.com>, Igor Soarez <soarez@apple.com>
Add ClusterTool as specified in KIP-631. It can report the current cluster ID, and also send the new RPC for removing broker registrations.
Reviewers: David Arthur <mumrah@gmail.com>
This patch adds a `RecordSerde` implementation for the metadata record format expected by KIP-631.
Reviewers: Colin McCabe <cmccabe@apache.org>, Ismael Juma <mlists@juma.me.uk>
Replace BrokerStates.scala with BrokerState.java, to make it easier to use from Java code if needed. This also makes it easier to go from a numeric type to an enum.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This was causing IntelliJ to choose the Vintage runner when running `core` tests
which would then fail during the discovery phase.
Verified that `core` tests work in IntelliJ again after this change.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Ensure that runtime-only dependencies don't cause a different version to be used.
More specifically:
> The relationship is directed, which means that if the runtimeClasspath configuration
has to be resolved, Gradle will first resolve the compileClasspath and then "inject" the
result of resolution as strict constraints into the runtimeClasspath.
For more details, see:
https://docs.gradle.org/6.8/userguide/resolution_strategy_tuning.html#sec:configuration_consistency
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Add a runtime dependency on the jupiter engine to avoid failures
during `./gradlew unitTest/integrationTest/test` for `upgrade-system-tests-*`.
Also remove `connect` and `examples` from the JUnit 5 blocklist since they
contains no tests.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>