The updated version includes a few optimizations that benefit us:
* Classes with no finalizers (opt-in) that have better GC behavior
* `InputStream.skip()` implementation that uses cached buffers
* Minor buffer recycler optimizations (used for OutputStream only)
Full diff:
https://github.com/luben/zstd-jni/compare/v1.4.8-2...v1.4.8-4
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Add KafkaEventQueue, which is used by the KIP-500 controller to manage its event queue.
Compared to using an Executor, KafkaEventQueue has the following advantages:
* Events can be given "deadlines." If an event lingers in the queue beyond the deadline, it
will be completed with a timeout exception. This is useful for implementing timeouts for
controller RPCs.
* Events can be prepended to the queue as well as appended.
* Events can be given tags to make them easier to manage. This is especially useful for
rescheduling or cancelling events which were previously scheduled to execute in the future.
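The three properties above can be illustrated with a small, single-threaded sketch. This is not the KafkaEventQueue API: the class name `SketchEventQueue`, its methods, and the choice to check deadlines only when an event is dequeued are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

/** Toy single-threaded queue for illustration; this is NOT the KafkaEventQueue API. */
final class SketchEventQueue {
    /** One queued event; a negative deadlineMs means "no deadline". */
    private static final class Entry {
        final String tag;
        final long deadlineMs;
        final Runnable action;
        final Consumer<Exception> onError;

        Entry(String tag, long deadlineMs, Runnable action, Consumer<Exception> onError) {
            this.tag = tag;
            this.deadlineMs = deadlineMs;
            this.action = action;
            this.onError = onError;
        }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();

    /** Adds an event at the tail of the queue. */
    void append(String tag, long deadlineMs, Runnable action, Consumer<Exception> onError) {
        queue.addLast(new Entry(tag, deadlineMs, action, onError));
    }

    /** Adds an event at the head of the queue so it runs before everything pending. */
    void prepend(String tag, long deadlineMs, Runnable action, Consumer<Exception> onError) {
        queue.addFirst(new Entry(tag, deadlineMs, action, onError));
    }

    /** Removes every pending event with the given tag; returns how many were removed. */
    int cancel(String tag) {
        int before = queue.size();
        queue.removeIf(entry -> entry.tag.equals(tag));
        return before - queue.size();
    }

    /** Runs the head event, or completes it with a TimeoutException if its deadline has passed. */
    boolean runNext(long nowMs) {
        Entry entry = queue.pollFirst();
        if (entry == null) {
            return false;
        }
        if (entry.deadlineMs >= 0 && nowMs > entry.deadlineMs) {
            entry.onError.accept(new TimeoutException("event '" + entry.tag + "' missed its deadline"));
        } else {
            entry.action.run();
        }
        return true;
    }

    /** Exercises prepend, cancel-by-tag and a missed deadline; returns what happened, in order. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        SketchEventQueue q = new SketchEventQueue();
        q.append("a", -1, () -> log.add("ran a"), ex -> log.add("failed a"));
        q.append("rpc", 100, () -> log.add("ran rpc"), ex -> log.add("timed out rpc"));
        q.append("later", -1, () -> log.add("ran later"), ex -> log.add("failed later"));
        q.prepend("urgent", -1, () -> log.add("ran urgent"), ex -> log.add("failed urgent"));
        q.cancel("later");
        q.runNext(50);  // head is "urgent" because it was prepended
        q.runNext(50);  // then "a"
        q.runNext(150); // "rpc" is dequeued after its deadline of 100
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [ran urgent, ran a, timed out rpc]
    }
}
```

A real queue would run on its own thread and time events out while they are still waiting; the sketch only shows why deadlines, prepending and tags are awkward to express with a plain Executor.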
Reviewers: Jun Rao <junrao@gmail.com>, José Armando García Sancio <jsancio@gmail.com>
A few important fixes:
* ZOOKEEPER-3829: Zookeeper refuses request after node expansion
* ZOOKEEPER-3842: Rolling scale up of zookeeper cluster does not work with reconfigEnabled=false
* ZOOKEEPER-3830: After add a new node, zookeeper cluster won't commit any proposal if this new node is leader
Full release notes: https://zookeeper.apache.org/doc/r3.5.9/releasenotes.html
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Also updated the jmh readme to make it easier for newcomers to learn
what's possible and which best practices to follow.
There were some changes in the generated benchmarking code that
required adjusting `spotbugs-exclude.xml` and for a `javac` warning
to be suppressed for the benchmarking module. I took the chance
to make the spotbugs exclusion more maintainable via a regex
pattern.
Tested the commands on Linux and macOS with zsh.
JMH highlights:
* async-profiler integration. Can be used with -prof async;
pass -prof async:help to list the accepted options.
* perf c2c integration. Can be used with -prof perfc2c,
if available.
* JFR profiler integration. Can be used with -prof jfr; pass
-prof jfr:help to list the accepted options.
Full details:
* 1.24: https://mail.openjdk.java.net/pipermail/jmh-dev/2020-August/002982.html
* 1.25: https://mail.openjdk.java.net/pipermail/jmh-dev/2020-August/002987.html
* 1.26: https://mail.openjdk.java.net/pipermail/jmh-dev/2020-October/003024.html
* 1.27: https://mail.openjdk.java.net/pipermail/jmh-dev/2020-December/003096.html
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Bill Bejeck <bbejeck@gmail.com>, Lucas Bradstreet <lucasbradstreet@gmail.com>
Scala 2.13.4 restores default global `ExecutionContext` to 2.12 behavior
(to fix a perf regression in some use cases) and improves pattern matching
(especially exhaustiveness checking). Most of the changes are related
to the latter as I have enabled the newly introduced `-Xlint:strict-unsealed-patmat`.
More details on the code changes:
* Don't swallow exception in `ReassignPartitionsCommand.topicDescriptionFutureToState`.
* `RequestChannel.Response` should be `sealed`.
* Introduce sealed ClientQuotaManager.BaseUserEntity to avoid false positive
exhaustiveness warning.
* Handle a number of cases where pattern matches were not exhaustive:
either by marking them with @unchecked or by adding a catch-all clause.
* Work around a scalac bug related to exhaustiveness warnings in ZooKeeperClient.
* Remove warning suppression annotations related to the optimizer that are no
longer needed in ConsumerGroupCommand and AclAuthorizer.
* Use `forKeyValue` in `AclAuthorizer.acls` as the Scala bug preventing us from
using it seems to be fixed.
* Also update scalaCollectionCompat to 2.3.0, which includes minor improvements.
Full release notes:
* https://github.com/scala/scala/releases/tag/v2.13.4
* https://github.com/scala/scala-collection-compat/releases/tag/v2.3.0
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Zstd-jni 1.4.5-6 allocates large internal buffers inside of ZstdInputStream and ZstdOutputStream, which caused a lot of allocation and GC activity when creating and closing the streams. It also does not buffer reads or writes. This is inefficient when DefaultRecord.writeTo() does a series of small single-byte writes using various ByteUtils methods: because of the expense of switching between the Java code and the native code, the JNI is more efficient when uncompressed data is flushed in large pieces rather than byte by byte. The same applies when reading. Per luben/zstd-jni#141, the maintainer of zstd-jni and I agreed not to buffer reads and writes in the library in favor of having the caller do that, so here we are updating the caller.
In this patch, I upgraded to the most recent zstd-jni version, which has buffer reuse built in. This was done in luben/zstd-jni#143 and luben/zstd-jni#146. Since we decided not to add additional buffering of input/output within zstd-jni, I added BufferedInputStream and BufferedOutputStream to CompressionType.ZSTD, just as we currently do for CompressionType.GZIP, which is also inefficient for single-byte reads and writes. I used the same buffer sizes as that existing implementation.
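To make the effect of caller-side buffering concrete, here is a minimal sketch. `CountingSink` is a hypothetical stand-in for an expensive underlying stream (such as one backed by JNI), and the 16 KiB buffer size is an arbitrary value chosen for the example, not the size used in CompressionType.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class BufferingSketch {
    /** Hypothetical expensive sink: counts how many write calls actually reach it. */
    static final class CountingSink extends OutputStream {
        int calls = 0;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            calls++;
        }
    }

    /** Writes byteCount single bytes and returns the number of calls that hit the sink. */
    static int callsReachingSink(boolean buffered, int byteCount, int bufferSize) {
        CountingSink sink = new CountingSink();
        OutputStream out = buffered ? new BufferedOutputStream(sink, bufferSize) : sink;
        try {
            for (int i = 0; i < byteCount; i++) {
                out.write(i & 0xFF); // one byte at a time, like a record serializer
            }
            out.flush(); // push any bytes still held in the buffer
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }

    public static void main(String[] args) {
        int n = 64 * 1024;
        System.out.println("unbuffered calls: " + callsReachingSink(false, n, 16 * 1024));
        System.out.println("buffered calls:   " + callsReachingSink(true, n, 16 * 1024));
    }
}
```

Without the wrapper every byte results in one call to the sink; with it, the sink only sees a handful of bulk writes, which is the behaviour the patch relies on for the JNI-backed streams.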
NOTE: if so desired, we could pass a wrapped BufferSupplier into the Zstd*Stream classes to have Kafka decide how the buffer recycling occurs. This functionality was added in the latter PR linked above. I am holding off on this since, based on jmh benchmarking, the performance gains were not clear, and personally I don't know if it is worth the complexity of trying to hack around the reflection at this point in time. zstd-jni uses a default recycler very similar to the one snappy currently uses, which seems to provide decent efficiency. While this PR fixes the defect, I feel that using BufferSupplier in both zstd-jni and snappy is outside the scope of this bugfix and should be considered a separate improvement. I would prefer that this change be merged on its own, since the performance gains here are very significant relative to the more incremental and minor optimizations which could be achieved by doing that separate work.
There are some noticeable improvements in the JMH benchmarks (excerpt):
BEFORE:
Benchmark (bufferSupplierStr) (bytes) (compressionType) (maxBatchSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed CREATE RANDOM ZSTD 200 1000 2 thrpt 15 27743.260 ± 673.869 ops/s
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 3399.966 ± 82.608 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 134968.010 ± 0.012 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 3850.985 ± 84.476 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 152881.128 ± 942.189 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Survivor_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 174.241 ± 3.486 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Survivor_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 6917.758 ± 82.522 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 1689.000 counts
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 82621.000 ms
JMH benchmarks done
Benchmark (bufferSupplierStr) (bytes) (compressionType) (maxBatchSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage CREATE RANDOM ZSTD 200 1000 2 thrpt 15 24095.711 ± 895.866 ops/s
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2932.289 ± 109.465 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 134032.012 ± 0.013 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 3282.912 ± 115.042 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 150073.914 ± 1342.235 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Survivor_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 149.697 ± 5.786 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Survivor_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 6842.462 ± 64.515 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 1449.000 counts
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 82518.000 ms
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize CREATE RANDOM ZSTD 200 1000 2 thrpt 15 1449.060 ± 230.498 ops/s
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 198.051 ± 31.532 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 150502.519 ± 0.186 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 200.064 ± 31.879 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 152569.341 ± 13826.686 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 91.000 counts
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 75869.000 ms
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2609.660 ± 1145.160 ops/s
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 815.441 ± 357.818 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 344309.097 ± 0.238 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 808.952 ± 354.975 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 345712.061 ± 51434.034 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.019 ± 0.042 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 18.615 ± 42.045 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Survivor_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 24.132 ± 12.254 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Survivor_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 13540.960 ± 14649.192 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 148.000 counts
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 23848.000 ms
JMH benchmarks done
AFTER:
Benchmark (bufferSupplierStr) (bytes) (compressionType) (maxBatchSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed CREATE RANDOM ZSTD 200 1000 2 thrpt 15 147792.454 ± 2721.318 ops/s
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2708.481 ± 50.012 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 20184.002 ± 0.002 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2732.667 ± 59.258 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 20363.460 ± 120.585 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.042 ± 0.033 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.316 ± 0.249 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 833.000 counts
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 8390.000 ms
JMH benchmarks done
Benchmark (bufferSupplierStr) (bytes) (compressionType) (maxBatchSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage CREATE RANDOM ZSTD 200 1000 2 thrpt 15 166786.092 ± 3285.702 ops/s
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2926.914 ± 57.464 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 19328.002 ± 0.002 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2938.541 ± 66.850 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 19404.357 ± 177.485 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.516 ± 0.100 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 3.409 ± 0.657 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Survivor_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.032 ± 0.131 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Survivor_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.207 ± 0.858 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 834.000 counts
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 9370.000 ms
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize CREATE RANDOM ZSTD 200 1000 2 thrpt 15 15988.116 ± 137.427 ops/s
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 448.636 ± 3.851 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 30907.698 ± 0.020 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 450.905 ± 5.587 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 31064.113 ± 291.190 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.043 ± 0.007 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2.931 ± 0.493 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 790.000 counts
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 999.000 ms
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize CREATE RANDOM ZSTD 200 1000 2 thrpt 15 11345.169 ± 206.528 ops/s
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2314.800 ± 42.094 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 224714.266 ± 0.028 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2320.213 ± 45.521 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 225235.965 ± 803.309 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.026 ± 0.005 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2.551 ± 0.455 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 994.000 counts
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 1189.000 ms
JMH benchmarks done
Reviewers: Ismael Juma <ismael@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
Jetty 9.4.32 and earlier are affected by CVE-2020-27216. This vulnerability is fixed in Jetty 9.4.33; please see the Jetty project security advisory for details: https://github.com/eclipse/jetty.project/security/advisories/GHSA-g3wg-6mcf-8jj6#advisory-comment-63053
Unit tests and integration tests pass locally after the upgrade.
Author: Nitesh Mor <nmor@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes #9556 from niteshmor/trunk
`forKeyValue` invokes `foreachEntry` in Scala 2.13 and falls back to
`foreach` in Scala 2.12.
This change requires a newer version of scala-collection-compat, so
update it to the latest version (2.2.0).
Finally, included a minor clean-up in `GetOffsetShell` to use `toArray`
before `sortBy` since it's more efficient.
Reviewers: Jason Gustafson <jason@confluent.io>, David Jacot <djacot@confluent.io>, José Armando García Sancio <jsancio@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>
This change sets the groundwork for migrating other modules incrementally.
Main changes:
- Replace `junit` 4.13 with `junit-jupiter` and `junit-vintage` 5.7.0-RC1.
- All modules except for `tools` depend on `junit-vintage`.
- `tools` depends on `junit-jupiter`.
- Convert `tools` tests to JUnit 5.
- Update `PushHttpMetricsReporterTest` to use `mockito` instead of `powermock` and `easymock`
(powermock doesn't seem to work well with JUnit 5 and we don't need it since mockito can mock
static methods).
- Update `mockito` to 3.5.7.
- Update `TestUtils` to use JUnit 5 assertions since `tools` depends on it.
Unrelated clean-ups:
- Remove `unit` from package names in a few `core` tests.
- Replace `try/catch/fail` with `assertThrows` in a number of places.
- Tag `CoordinatorTest` as integration test.
- Remove unnecessary type parameters when invoking methods and constructors.
Tested with IntelliJ and gradle. Verified that the following commands work as expected:
* ./gradlew tools:unitTest
* ./gradlew tools:integrationTest
* ./gradlew tools:test
* ./gradlew core:unitTest
* ./gradlew core:integrationTest
* ./gradlew clients:test
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Only check if positions need validation if there is new metadata.
Also fix some inefficient java.util.stream code in the hot path of SubscriptionState.
I left out updates that could be risky. Preliminary testing indicates
we can build (including spotBugs) and run tests with Java 15 with
these changes. I will do more thorough testing once Java 15 reaches
release candidate stage in a few weeks.
Minor updates with mostly bug fixes:
- Scala: 2.12.11 -> 2.12.12 (compiler and collection performance improvements)
- Bouncy castle: 1.64 -> 1.66 (several bug fixes)
- HttpClient: 4.5.11 -> 4.5.12 (small number of bug fixes)
- Mockito: 3.3.3 -> 3.4.4 (several bug fixes and Java 15 support)
- Netty: 4.1.50 -> 4.1.51 (several bug fixes)
- Snappy: 1.1.7.3 -> 1.1.7.6 (small number of bug fixes)
- Zstd: 1.4.5-2 -> 1.4.5-6 (small number of bug fixes)
Gradle plugin and library upgrades:
- Gradle version plugins: 0.28.0 -> 0.29.0 (small number of bug fixes)
- Git: 4.0.1 -> 4.0.2 (small number of bug fixes)
- Scoverage plugin: 4.0.1 -> 4.0.2 (small number of bug fixes)
- Shadow plugin: 5.2.0 -> 6.0.0 (Java 15 support and require Gradle 6.0)
- Test Retry plugin: 1.1.5 -> 1.1.6 (small number of bug fixes)
- Spotless plugin: 4.4.4 -> 5.1.0 (several internal changes that should not matter to us)
- Spotbugs: 4.0.3 -> 4.0.6 (small number of bug fixes)
- Spotbugs plugin: 4.2.4 -> 4.4.4 (small number of bug fixes)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
This PR includes 3 MessageFormatters for MirrorMaker2 internal topics:
- HeartbeatFormatter
- CheckpointFormatter
- OffsetSyncFormatter
This also introduces a new public interface org.apache.kafka.common.MessageFormatter that users can implement to build custom formatters.
Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>, Ryanne Dolan <ryannedolan@gmail.com>, David Jacot <djacot@confluent.io>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>
This includes important fixes. Netty is required by ZooKeeper if TLS is
enabled.
I verified that the netty jars were changed from 4.1.48 to 4.1.50 with
this PR, `find . -name '*netty*'`:
```text
./core/build/dependant-libs-2.13.3/netty-handler-4.1.50.Final.jar
./core/build/dependant-libs-2.13.3/netty-transport-native-epoll-4.1.50.Final.jar
./core/build/dependant-libs-2.13.3/netty-codec-4.1.50.Final.jar
./core/build/dependant-libs-2.13.3/netty-transport-native-unix-common-4.1.50.Final.jar
./core/build/dependant-libs-2.13.3/netty-transport-4.1.50.Final.jar
./core/build/dependant-libs-2.13.3/netty-resolver-4.1.50.Final.jar
./core/build/dependant-libs-2.13.3/netty-buffer-4.1.50.Final.jar
./core/build/dependant-libs-2.13.3/netty-common-4.1.50.Final.jar
```
Note that the previous netty exclude no longer worked since we upgraded
to ZooKeeper 3.5.x as it switched to Netty 4 which has different module names.
Also, the Netty dependency is needed by ZooKeeper for TLS support so we
cannot exclude it.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Fix findbugs multithreaded correctness warnings for streams by updating variables to be thread-safe.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, John Roesler <vvcephei@apache.org>
Recently, commit 492306a updated jetty to version 9.4.27.v20200227 and jersey to version 2.31.
However, in the latest versions of jetty, the renaming of the method `Response#closeOutput` to `Response#completeOutput` has been reverted, and the latest version again uses `Response#closeOutput`.
Jersey has not yet released a version in which `Response#closeOutput` is called directly. In its current latest version (2.31), `Response#closeOutput` is called if `Response#completeOutput` throws a `NoSuchMethodError`. Given that, this version combination is compatible. Jersey should be upgraded once a new version that uses `Response#closeOutput` directly is out.
Reviewers: Ismael Juma <ismael@juma.me.uk>
I had to fix several compiler errors due to deprecation of auto application of `()`. A related
Xlint config (`-Xlint:nullary-override`) is no longer valid in 2.13, so we now only enable it
for 2.12. The compiler flagged two new inliner warnings that required suppression and
the semantics of `&` in `@nowarn` annotations changed, requiring a small change in
one of the warning suppressions.
I also removed the deprecation of a number of methods in `KafkaZkClient` as
they should not have been deprecated in the first place since `KafkaZkClient` is an
internal class and we still use these methods in the Controller and so on. This
became visible because the Scala compiler now respects Java's `@Deprecated`
annotation.
Finally, I included a few minor clean-ups (e.g. using `toBuffer` instead of `toList`) when fixing
the compilation warnings.
Noteworthy bug fixes in Scala 2.13.3:
* Fix 2.13-only bug in Java collection converters that caused some operations to perform an extra pass
* Fix 2.13.2 performance regression in Vector: restore special cases for small operands in appendedAll and prependedAll
* Increase laziness of #:: for LazyList
* Fixes related to annotation parsing of @Deprecated from Java sources in mixed compilation
Full release notes:
https://github.com/scala/scala/releases/tag/v2.13.3
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Upgrade jetty to 9.4.27.v20200227 and jersey to 2.31.
Also remove the workaround used on previous versions from Connect's SSLUtils.
(Reverts KAFKA-9771 - commit ee832d7d)
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chris Egerton <chrise@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>
Gradle 6.5 includes a fix for https://github.com/gradle/gradle/pull/12866, which
affects the performance of Scala compilation.
I profiled the scalac build with async profiler and 54% of the time was on GC
even after the Gradle upgrade (it was more than 60% before), so I switched to
the throughput GC (GC latency is less important for batch builds) and it
was reduced to 38%.
I also centralized the jvm configuration in `build.gradle` and simplified it a bit
by removing the minHeapSize configuration from the test tasks.
On my desktop, the time to execute clean builds with no cached Gradle daemon
was reduced from 127 seconds to 97 seconds. With a cached daemon, it was
reduced from 120 seconds to 88 seconds. The performance regression when
we upgraded to Gradle 6.x was 27 seconds with a cached daemon
(https://github.com/apache/kafka/pull/7677#issuecomment-616271179), so it
should be fixed now.
Gradle 6.4 with no cached daemon:
```
BUILD SUCCESSFUL in 2m 7s
115 actionable tasks: 112 executed, 3 up-to-date
./gradlew clean compileScala compileJava compileTestScala compileTestJava 1.15s user 0.12s system 0% cpu 2:08.06 total
```
Gradle 6.4 with cached daemon:
```
BUILD SUCCESSFUL in 2m 0s
115 actionable tasks: 111 executed, 4 up-to-date
./gradlew clean compileScala compileJava compileTestScala compileTestJava 0.95s user 0.10s system 0% cpu 2:01.42 total
```
Gradle 6.5 with no cached daemon:
```
BUILD SUCCESSFUL in 1m 46s
115 actionable tasks: 111 executed, 4 up-to-date
./gradlew clean compileScala compileJava compileTestScala compileTestJava 1.27s user 0.12s system 1% cpu 1:47.71 total
```
Gradle 6.5 with cached daemon:
```
BUILD SUCCESSFUL in 1m 37s
115 actionable tasks: 111 executed, 4 up-to-date
./gradlew clean compileScala compileJava compileTestScala compileTestJava 1.02s user 0.10s system 1% cpu 1:38.31 total
```
This PR with no cached Gradle daemon:
```
BUILD SUCCESSFUL in 1m 37s
115 actionable tasks: 81 executed, 34 up-to-date
./gradlew clean compileScala compileJava compileTestScala compileTestJava 1.27s user 0.10s system 1% cpu 1:38.70 total
```
This PR with cached Gradle daemon:
```
BUILD SUCCESSFUL in 1m 28s
115 actionable tasks: 111 executed, 4 up-to-date
./gradlew clean compileScala compileJava compileTestScala compileTestJava 1.02s user 0.10s system 1% cpu 1:29.35 total
```
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
It improves decompression speed:
>For x64 cpus, expect a speed bump of at least +5%, and up to +10% in favorable cases.
>ARM cpus receive more benefit, with speed improvements ranging from +15% vicinity,
>and up to +50% for certain SoCs and scenarios (ARM's situation is more complex due
>to larger differences in SoC designs).
See https://github.com/facebook/zstd/releases/tag/v1.4.5 for more details.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
This fixes critical bugs in Gradle 6.4:
* Regression: Different daemons are used between IDE and CLI builds for the same project
* Regression: Main-Class attribute always added to jar manifest when using application plugin
* Fix potential NPE if code is executed concurrently
More details: https://github.com/gradle/gradle/releases/tag/v6.4.1
Reviewers: Manikumar Reddy <manikumar@confluent.io>
It fixes 30 issues, including third party CVE fixes, several leader-election
related fixes and a compatibility issue with applications built against earlier
3.5 client libraries (by restoring a few non public APIs).
See ZooKeeper 3.5.8 Release Notes for details: https://zookeeper.apache.org/doc/r3.5.8/releasenotes.html
Reviewers: Manikumar Reddy <manikumar@confluent.io>
The version of Zinc included with Gradle 6.4 includes a fix for the blocker
that was preventing us from passing `-release 8` to scalac.
Release notes for Gradle 6.4:
https://docs.gradle.org/6.4/release-notes.html
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
In the case described in the JIRA, there was a 50%+ increase in the total fetch request rate in
2.4.0 due to this change.
I included a few additional clean-ups:
* Simplify `findPreferredReadReplica` and avoid unnecessary collection copies.
* Use `LongSupplier` instead of `Supplier<Long>` in `SubscriptionState` to avoid unnecessary boxing.
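The boxing point can be seen in a small sketch. The class and method names below are hypothetical and are not taken from `SubscriptionState`; only `Supplier` and `LongSupplier` come from the JDK.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class BoxingSketch {
    /** Each get() returns a Long object that must be unboxed before it can be added. */
    static long sumBoxed(Supplier<Long> supplier, int calls) {
        long total = 0;
        for (int i = 0; i < calls; i++) {
            total += supplier.get(); // implicit Long.longValue() on every call
        }
        return total;
    }

    /** getAsLong() returns a primitive long, so no wrapper object is involved. */
    static long sumPrimitive(LongSupplier supplier, int calls) {
        long total = 0;
        for (int i = 0; i < calls; i++) {
            total += supplier.getAsLong();
        }
        return total;
    }

    public static void main(String[] args) {
        // 5_000L is outside the small-value Long cache, so the boxed variant
        // may allocate a new Long per call unless the JIT removes the allocation.
        System.out.println(sumBoxed(() -> 5_000L, 3));     // 15000
        System.out.println(sumPrimitive(() -> 5_000L, 3)); // 15000
    }
}
```

Both variants compute the same result; the difference is only in the wrapper objects created along the way, which matters on a hot path.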
Added a unit test to ReplicaManagerTest and cleaned up the test class a bit including
consistent usage of Time in MockTimer and other components.
Reviewers: Gwen Shapira <gwen@confluent.io>, David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>
* Upgrade to Scala 2.13.2 which introduces the ability to suppress warnings.
* Upgrade to scala-collection-compat 2.1.6 as it introduces the
@nowarn annotation for Scala 2.12.
* While at it, also update scala-java8-compat to 0.9.1.
* Fix compiler warnings and add @nowarn for the unfixed ones.
Scala 2.13.2 highlights (besides @nowarn):
* Rewrite Vector (using "radix-balanced finger tree vectors"),
for performance. Small vectors are now more compactly
represented. Some operations are now drastically faster on
large vectors. A few operations may be a little slower.
* Matching strings makes switches in bytecode.
https://github.com/scala/scala/releases/tag/v2.13.2
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Also:
* Remove deprecated `=` in resolutionStrategy.
* Replace `AES/GCM/PKCS5Padding` with `AES/GCM/NoPadding`
in `PasswordEncoderTest`. The former is invalid and JDK 14 rejects it,
see https://bugs.openjdk.java.net/browse/JDK-8229043.
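For reference, a minimal round trip with the valid transformation string might look like the sketch below. The class name, key size, IV length and tag length are assumptions chosen for the example and are not taken from `PasswordEncoderTest`.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class GcmSketch {
    // GCM is a stream-like mode, so no block padding scheme applies.
    private static final String TRANSFORMATION = "AES/GCM/NoPadding";

    /** Encrypts and then decrypts the input with a fresh key and IV; returns the decrypted text. */
    static String roundTrip(String plaintext) {
        try {
            KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
            keyGenerator.init(128); // example key size
            SecretKey key = keyGenerator.generateKey();

            byte[] iv = new byte[12]; // 96-bit IV, a common choice for GCM
            new SecureRandom().nextBytes(iv);
            GCMParameterSpec spec = new GCMParameterSpec(128, iv); // 128-bit auth tag

            Cipher encryptor = Cipher.getInstance(TRANSFORMATION);
            encryptor.init(Cipher.ENCRYPT_MODE, key, spec);
            byte[] ciphertext = encryptor.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));

            Cipher decryptor = Cipher.getInstance(TRANSFORMATION);
            decryptor.init(Cipher.DECRYPT_MODE, key, spec);
            return new String(decryptor.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES/GCM round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("secret")); // secret
    }
}
```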
With these changes, the build works with Java 14 and Scala 2.12. The
same will apply to Scala 2.13 when Scala 2.13.2 is released (should
happen within 1-2 weeks).
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Matthias J. Sax <matthias@confluent.io>
* Introduce `gradlewAll` script to replace `*All` tasks, since the approach
used by the latter has not worked since Gradle 6.0 and it's unclear when,
if ever, it will work again (see https://github.com/gradle/gradle/issues/11301).
* Update release script and README given the above.
* Update zinc to 1.3.5.
* Update gradle-versions-plugin to 0.28.0.
The major improvements in Gradle 6.0 to 6.3 are:
- Improved incremental compilation for Scala
- Support for Java 14 (although some Gradle plugins
like spotBugs may need to be updated or disabled,
will do that separately)
- Improved scalac reporting, warnings are clearly
marked as such, which is very helpful.
Tested `gradlewAll` manually for the commands listed in the README
and release script. For `uploadArchive`, I tested it with a local Maven
repository.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Once Scala 2.13.2 is officially released, I will submit a follow up PR
that enables `-Xfatal-warnings` with the necessary warning
exclusions. Compiler warning exclusions were only introduced in 2.13.2,
which is why we have to wait for that. I used a snapshot build to
test it in the meantime.
Changes:
* Remove Deprecated annotation from internal request classes
* Class.newInstance is deprecated in favor of
Class.getConstructor().newInstance
* Replace deprecated JavaConversions with CollectionConverters
* Remove unused kafka.cluster.Cluster
* Don't use Map and Set methods deprecated in 2.13:
- collection.Map +, ++, -, --, mapValues, filterKeys, retain
- collection.Set +, ++, -, --
* Add scala-collection-compat dependency to streams-scala and
update version to 2.1.4.
* Replace usages of deprecated Either.get and Either.right
* Replace usage of deprecated Integer(String) constructor
* `import scala.language.implicitConversions` is not needed in Scala 2.13
* Replace usage of deprecated `toIterator`, `Traversable`, `seq`,
`reverseMap`, `hasDefiniteSize`
* Replace usage of deprecated alterConfigs with incrementalAlterConfigs
where possible
* Fix implicit widening conversions from Long/Int to Double/Float
* Avoid implicit conversions to String
* Eliminate usage of deprecated procedure syntax
* Remove `println` in `LogValidatorTest` instead of fixing the compiler
warning since tests should not `println`.
* Eliminate implicit conversion from Array to Seq
* Remove unnecessary usage of 3 argument assertEquals
* Replace `toStream` with `iterator`
* Do not use deprecated SaslConfigs.DEFAULT_SASL_ENABLED_MECHANISMS
* Replace StringBuilder.newBuilder with new StringBuilder
* Rename AclBuffers to AclSeqs and remove usage of `filterKeys`
* More consistent usage of Set/Map in Controller classes: this also fixes
deprecated warnings with Scala 2.13
* Add spotBugs exclusion for inliner artifact in KafkaApis with Scala 2.12.
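Two of the Java-side replacements listed above can be sketched as follows. The helper class and method names are hypothetical, and the sketch assumes `Integer.valueOf` as the replacement for the deprecated `Integer(String)` constructor; the actual call sites in the code base may differ.

```java
// Hypothetical sketch of two deprecation replacements.
final class DeprecationSketch {
    // Replaces the deprecated Class.newInstance(): look up the public no-arg
    // constructor explicitly and invoke it.
    static <T> T instantiate(Class<T> clazz) {
        try {
            return clazz.getConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + clazz.getName(), e);
        }
    }

    // Replaces the deprecated `new Integer(String)` constructor.
    static Integer parse(String value) {
        return Integer.valueOf(value);
    }

    public static void main(String[] args) {
        StringBuilder builder = instantiate(StringBuilder.class);
        System.out.println(builder.append(parse("42")));
    }
}
```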
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This PR works to improve high watermark checkpointing performance.
`ReplicaManager.checkpointHighWatermarks()` was found to be a major contributor to GC pressure, especially on Kafka clusters with high partition counts and low throughput.
Added a JMH benchmark for `checkpointHighWatermarks` which establishes a
performance baseline. The parameterized benchmark was run with 100, 1000 and
2000 topics.
Modified `ReplicaManager.checkpointHighWatermarks()` to avoid extra copies and cached
the Log parent directory String to avoid frequent allocations when calculating
`File.getParent()`.
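The caching idea can be sketched roughly as below. This is a simplified illustration with a hypothetical class; it assumes the directory never changes after construction, which the real `Log` class does not necessarily guarantee.

```java
import java.io.File;

// Hypothetical sketch: compute the parent directory string once and reuse it.
final class LogDirSketch {
    private final File dir;
    private final String parentDir;   // cached result of dir.getParent()

    LogDirSketch(File dir) {
        this.dir = dir;
        this.parentDir = dir.getParent();
    }

    File dir() {
        return dir;
    }

    // Returns the cached string instead of calling File.getParent() each time.
    String parentDir() {
        return parentDir;
    }

    public static void main(String[] args) {
        LogDirSketch log = new LogDirSketch(new File("kafka-logs", "topic-0"));
        System.out.println(log.parentDir());
    }
}
```

Callers that group logs by parent directory on every checkpoint can then use `parentDir()` without allocating a new string per log.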
A few clean-ups:
* Changed all usages of Log.dir.getParent to Log.parentDir and Log.dir.getParentFile to
Log.parentDirFile.
* Only expose public accessor for `Log.dir` (consistent with `Log.parentDir`)
* Removed unused parameters in `Partition.makeLeader`, `Partition.makeFollower` and `Partition.createLogIfNotExists`.
Benchmark results:
| Topic Count | Ops/ms | MB/sec allocated |
|-------------|---------|------------------|
| 100 | + 51% | - 91% |
| 1000 | + 143% | - 49% |
| 2000 | + 149% | - 50% |
Reviewers: Lucas Bradstreet <lucas@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Co-authored-by: Gardner Vickers <gardner@vickers.me>
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
Highlights:
* Performance improvements in the collections
library: algorithmic improvements and
changes to avoid unnecessary allocations.
* Performance improvements in the compiler.
* ASM was upgraded to 7.3.1, allowing the
optimizer to run on JDK 13+.
Full release notes: https://github.com/scala/scala/releases/tag/v2.12.11
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This PR removes intermediate conversions between `MetadataResponse.TopicMetadata` => `MetadataResponseTopic` and `MetadataResponse.PartitionMetadata` => `MetadataResponsePartition` objects.
There is a 15-20% reduction in object allocations and a 5-10% improvement in metadata request performance.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
9.4.25 renamed closeOutput to completeOutput
(c5acf96506),
which is a method used by recent Jersey versions including the
latest (2.30.1). An example of the error:
> java.lang.NoSuchMethodError: org.eclipse.jetty.server.Response.closeOutput()V
> at org.glassfish.jersey.jetty.JettyHttpContainer$ResponseWriter.commit(JettyHttpContainer.java:326)
The request still completes, which is why no test fails. We should think about how
to improve the testing for this kind of problem, but I want to get the fix in before
2.5 RC0.
Credit to @rigelbm for finding this.
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Andrew Choi <a24choi@edu.uwaterloo.ca>
Disabled by default, but enabled for Jenkins PR builds (maximum of 1 retry per
test with up to 5 retries for the test run).
Reviewers: Ismael Juma <ismael@juma.me.uk>
This PR implements the KIP-559: https://cwiki.apache.org/confluence/display/KAFKA/KIP-559%3A+Make+the+Kafka+Protocol+Friendlier+with+L7+Proxies
- it adds the Protocol Type and the Protocol Name fields in JoinGroup and SyncGroup API;
- it validates that the fields are provided by the client when the new version of the API is used and ensures that they are consistent. It errors out otherwise;
- it validates that the fields are consistent in the client and errors out otherwise;
- it adds many tests related to the API changes but also extends the testing coverage of the requests/responses themselves.
- it standardises the naming in the coordinator. now, `ProtocolType` and `ProtocolName` are used across the board in the coordinator instead of having a mix of protocol type, protocol name, subprotocol, protocol, etc.
Reviewers: Jason Gustafson <jason@confluent.io>
* lz4: fixes identified by oss-fuzz
* jetty: fixes a few recent regressions
* powermock: better support for Java 12+
* zstd-jni: minor fixes
* httpclient: minor fixes
* spotless-plugin: minor fixes
* jmh: minor fixes
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
* Adjust build and documentation.
* Use lambda syntax for SAM types in `core`, `streams-scala` and
`connect-runtime` modules.
* Remove `runnable` and `newThread` from `CoreUtils` as lambda
syntax for SAM types makes them unnecessary.
* Remove stale comment in `FunctionsCompatConversions`,
`KGroupedStream`, `KGroupedTable` and `KStream` about Scala 2.11,
the conversions are needed for Scala 2.12 too.
* Deprecate `org.apache.kafka.streams.scala.kstream.Suppressed`
and use `org.apache.kafka.streams.kstream.Suppressed` instead.
* Use `Admin.create` instead of `AdminClient.create`. Static methods
in Java interfaces can be invoked since Scala 2.12. I noticed that
MirrorMaker 2 uses `AdminClient.create`, but I did not change them
as Connectors have restrictions on newer client APIs.
* Improve efficiency in a few `Gauge` implementations by avoiding
unnecessary intermediate collections.
* Remove pointless `Option.apply` in `ZookeeperClient`
`SessionState` metric.
* Fix unused import/variable and other compiler warnings.
* Reduce visibility of some vals/defs.
Reviewers: Manikumar Reddy <manikumar@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Gwen Shapira <gwen@confluent.io>
Newer versions of Gradle handle this automatically. Tested with Gradle 5.6.
Credit to @granthenke for the tip.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Given we need to follow the Apache rule of not checking
any binaries into the source code, Kafka has always had
a bit of a tricky Gradle bootstrap.
Using ./gradlew as users expect doesn’t work and a
local and compatible version of Gradle was required to
generate the wrapper first.
This patch changes the behavior of the wrapper task to
instead generate a gradlew script that can bootstrap the
jar itself. Additionally it adds a license, removes the bat
script, and handles retries.
The documentation in the readme was also updated.
Going forward patches that upgrade gradle should run
`gradle wrapper` before checking in the change.
With this change users using ./gradlew can be sure they
are always building with the correct version of Gradle.
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Rather than maintain hand coded protocol serialization code, Streams could use the same code-generation framework as Clients/Core.
There isn't a perfect match, since the code generation framework includes an assumption that you're generating "protocol messages", rather than just arbitrary blobs, but I think it's close enough to justify using it, and improving it over time.
Using the code generation allows us to drop a lot of detail-oriented, brittle, and hard-to-maintain serialization logic in favor of a schema spec.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
It includes an important fix for people running on k8s:
* ZOOKEEPER-3320: Leader election port stop listen when
hostname unresolvable for some time
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
The scalac optimizer is able to inline methods to avoid lambda allocations, eliminating
the runtime cost of higher order functions in many cases. The compilation parameters
we are using here were introduced in 2.12.x, so we don't enable them for Scala 2.11.
Also, we enable a more aggressive inlining policy for the `core` project since it's
not meant to be used as a library.
See https://www.lightbend.com/blog/scala-inliner-optimizer for more information about
the optimizer.
I verified that the lambda allocation in the code below (from LogCleaner.scala) went away
after this change with Scala 2.12 and 2.13.
```scala
private def consumeAbortedTxnsUpTo(offset: Long): Unit = {
while (abortedTransactions.headOption.exists(_.firstOffset <= offset)) {
val abortedTxn = abortedTransactions.dequeue()
ongoingAbortedTxns.getOrElseUpdate(abortedTxn.producerId, new AbortedTransactionMetadata(abortedTxn))
}
}
```
The relevant part of the bytecode when compiled with Scala 2.13 looks like:
```text
private void consumeAbortedTxnsUpTo(long);
Code:
0: aload_0
1: invokespecial #54 // Method abortedTransactions:()Lscala/collection/mutable/PriorityQueue;
4: invokevirtual #175 // Method scala/collection/mutable/PriorityQueue.headOption:()Lscala/Option;
7: dup
8: ifnonnull 13
11: aconst_null
12: athrow
13: astore 4
15: aload 4
17: invokevirtual #145 // Method scala/Option.isEmpty:()Z
20: ifne 48
23: aload 4
25: invokevirtual #148 // Method scala/Option.get:()Ljava/lang/Object;
28: checkcast #177 // class kafka/log/AbortedTxn
```
The increased inlining causes some spurious spotBugs warnings, I added a few suppressions
and fixed one warning by avoiding unnecessary boxing.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Part of supporting KIP-213 ( https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable ). Murmur3 hash is used as a hashing mechanism in KIP-213 for the large range of uniqueness. The Murmur3 class and tests are ported directly from Apache Hive, with no alterations to the code or dependencies.
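For readers unfamiliar with the algorithm, here is a standalone sketch of the public MurmurHash3 x86 32-bit variant. It is not the ported Hive class; the class name, method name and the seed used in `main` are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of MurmurHash3 (x86, 32-bit); not the ported Hive code.
final class Murmur3Sketch {
    static int hash32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h1 = seed;
        int nblocks = data.length / 4;

        // Body: mix each 4-byte little-endian block into the state.
        for (int i = 0; i < nblocks; i++) {
            int k1 = (data[4 * i] & 0xff)
                | ((data[4 * i + 1] & 0xff) << 8)
                | ((data[4 * i + 2] & 0xff) << 16)
                | ((data[4 * i + 3] & 0xff) << 24);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;
            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // Tail: mix the remaining 1 to 3 bytes (cases fall through on purpose).
        int tail = nblocks * 4;
        int k1 = 0;
        switch (data.length & 3) {
            case 3: k1 ^= (data[tail + 2] & 0xff) << 16;
            case 2: k1 ^= (data[tail + 1] & 0xff) << 8;
            case 1: k1 ^= (data[tail] & 0xff);
                    k1 *= c1;
                    k1 = Integer.rotateLeft(k1, 15);
                    k1 *= c2;
                    h1 ^= k1;
        }

        // Finalization: fold in the length and avalanche the bits.
        h1 ^= data.length;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    public static void main(String[] args) {
        byte[] key = "foreign-key".getBytes(StandardCharsets.UTF_8);
        System.out.println(Integer.toHexString(hash32(key, 0)));
    }
}
```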
Author: Adam Bellemare <adam.bellemare@wishabi.com>
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes #7271 from bellemare/murmur3hash
This PR makes two changes to code in the ReplicaManager.updateFollowerFetchState path, which is in the hot path for follower fetches. Although calling ReplicaManager.updateFollowerFetchState is inexpensive on its own, it is called once for each partition every time a follower fetch occurs.
1. updateFollowerFetchState no longer calls maybeExpandIsr when the follower is already in the ISR. This avoids repeated expansion checks.
2. Partition.maybeIncrementLeaderHW is also in the hot path for ReplicaManager.updateFollowerFetchState. Partition.maybeIncrementLeaderHW calls Partition.remoteReplicas four times each iteration, and it performs a toSet conversion. maybeIncrementLeaderHW now avoids generating any intermediate collections when updating the HWM.
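The second change can be pictured with the simplified sketch below. It is not the actual `Partition` code: the class, method and parameter names are hypothetical, and the real high watermark calculation considers more state than ISR membership alone.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: derive a high watermark in one pass over the followers,
// without building filtered or converted intermediate collections.
final class HighWatermarkSketch {
    static long newHighWatermark(long leaderLogEndOffset,
                                 Map<Integer, Long> followerLogEndOffsets,
                                 Set<Integer> isr) {
        long highWatermark = leaderLogEndOffset;
        for (Map.Entry<Integer, Long> entry : followerLogEndOffsets.entrySet()) {
            // Only in-sync followers hold the high watermark back.
            if (isr.contains(entry.getKey()) && entry.getValue() < highWatermark) {
                highWatermark = entry.getValue();
            }
        }
        return highWatermark;
    }

    public static void main(String[] args) {
        Map<Integer, Long> followers = new HashMap<>();
        followers.put(1, 90L);
        followers.put(2, 80L);
        followers.put(3, 50L);   // not in the ISR, so ignored below
        Set<Integer> isr = new HashSet<>();
        isr.add(1);
        isr.add(2);
        System.out.println(newHighWatermark(100L, followers, isr));
    }
}
```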
**Benchmark results for Partition.updateFollowerFetchState on a r5.xlarge:**
Old:
```
1288.633 ±(99.9%) 1.170 ns/op [Average]
(min, avg, max) = (1287.343, 1288.633, 1290.398), stdev = 1.037
CI (99.9%): [1287.463, 1289.802] (assumes normal distribution)
```
New (when follower fetch offset is updated):
```
261.727 ±(99.9%) 0.122 ns/op [Average]
(min, avg, max) = (261.565, 261.727, 261.937), stdev = 0.114
CI (99.9%): [261.605, 261.848] (assumes normal distribution)
```
New (when follower fetch offset is the same):
```
68.484 ±(99.9%) 0.025 ns/op [Average]
(min, avg, max) = (68.446, 68.484, 68.520), stdev = 0.023
CI (99.9%): [68.460, 68.509] (assumes normal distribution)
```
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2.9.9.1 and 2.9.9.2 include security fixes while 2.9.9.3 fixes a regression
introduced in 2.9.9.2.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
ZooKeeper 3.5.5 is the first stable release in the 3.5.x series. The key new feature
is TLS support, but there are a few more noteworthy features:
* Dynamic reconfiguration
* Local sessions
* New node types: Container, TTL
* Ability to remove watchers
* Multi-threaded commit processor
* Upgraded to Netty 4.1
See the release notes for more detail:
https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html
In addition to the version bump, we:
* Add `commons-cli` dependency as it's required by `ZooKeeperMain`, but specified as
`provided` in their pom.
* Remove unnecessary `ZooKeeperMainWrapper`, the bug it worked around was fixed
upstream a long time ago.
* Ignore non zero exit in one system test invocation of `ZooKeeperMain`.
`ZooKeeperMainWrapper` always returned `0` and `ZooKeeperService.query` relies
on that for correct behavior.
Reviewers: Jason Gustafson <jason@confluent.io>
ZkUtils was removed so we don't need this anymore.
Also:
* Fix ZkSecurityMigrator and ReplicaManagerTest not to
reference ZkClient classes.
* Remove references to zkclient in various `log4j.properties`
and `import-control.xml`.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
Scala 2.13 support was added to build via #5454. This PR adjusts the code so that
it compiles with 2.11, 2.12 and 2.13.
Changes:
* Add `scala-collection-compat` dependency.
* Import `scala.collection.Seq` in a number of places for consistent behavior between
Scala 2.11, 2.12 and 2.13.
* Remove wildcard imports that were causing the Java classes to have priority over the
Scala ones, related Scala issue: https://github.com/scala/scala/pull/6589.
* Replace parallel collection usage with `Future`. The former is no longer included by
default in the standard library.
* Replace val _: Unit workaround with one that is more concise and works with Scala 2.13
* Replace `filterKeys` with `filter` when we expect a `Map`. `filterKeys` returns a view
that doesn't implement the `Map` trait in Scala 2.13.
* Replace `mapValues` with `map` or add a `toMap` as an additional transformation
when we expect a `Map`. `mapValues` returns a view that doesn't implement the
`Map` trait in Scala 2.13.
* Replace `breakOut` with `iterator` and `to`, `breakOut` was removed in Scala
2.13.
* Replace to() with toMap, toIndexedSeq and toSet
* Replace `mutable.Buffer.--` with `filterNot`.
* ControlException is an abstract class in Scala 2.13.
* Variable arguments can only receive arrays or immutable.Seq in Scala 2.13.
* Use `Factory` instead of `CanBuildFrom` in DecodeJson. `CanBuildFrom` behaves
a bit differently in Scala 2.13 and it's been deprecated. `Factory` has the behavior
we need and it's available via the compat library.
* Fix failing tests due to behavior change in Scala 2.13,
"Map.values.map is not strict in Scala 2.13" (https://github.com/scala/bug/issues/11589).
* Use Java collections instead of Scala ones in StreamResetter (a Java class).
* Adjust CheckpointFile.write to take an `Iterable` instead of `Seq` to avoid
unnecessary collection copies.
* Fix DelayedElectLeader to use a Map instead of Set and avoid `to` call that
doesn't work in Scala 2.13.
* Use unordered map for mapping in SimpleAclAuthorizer, mapping of ordered
maps require an `Ordering` in Scala 2.13 for safety reasons.
* Adapt `ConsumerGroupCommand` to compile with Scala 2.13.
* CoreUtils.min takes an `Iterable` instead of `TraversableOnce`, the latter does
not exist in Scala 2.13.
* Replace `Unit` with `()` in a couple places. Scala 2.13 is stricter when it expects
a value instead of a type.
* Fix bug in CustomQuotaCallbackTest where we did not necessarily set `partitionRatio`
correctly, `forall` can terminate early.
* Add a couple of spotbugs exclusions that are needed by code generated by Scala 2.13
* Remove unused variables, simplify some code and remove procedure syntax in a few
places.
* Remove unused `CoreUtils.JSONEscapeString`.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, José Armando García Sancio <jsancio@users.noreply.github.com>
- include Scala 2.13 in gradle build
- handle future milestone and RC versions of Scala in a better way
- if no Scala version is specified, default to scala 2.12 (bump from 2.11)
- include certain Xlint options (removed by Scala 2.13) for Scala 2.11/2.12 build only
- upgrade versions for dependencies:
- scalaLogging: 3.9.0 -->> 3.9.2
- scalatest: 3.0.7 -->> 3.0.8
- scoverage: 1.3.1 -->> 1.4.0
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Ismael Juma <ismael@juma.me.uk>
This commit makes four changes:
- Adds a constructor for NewTopic(String, Optional<Integer>, Optional<Short>)
which allows users to specify Optional.empty() for numPartitions or
replicationFactor in order to use the broker default.
- Changes AdminManager to accept -1 as a valid value for replication
factor and numPartitions (resolving to broker defaults).
- Makes --partitions and --replication-factor optional arguments when creating
topics using kafka-topics.sh.
- Adds a dependency on scalaJava8Compat library to make it simpler to
convert Scala Option to Java Optional
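The relationship between `Optional.empty()` and the `-1` sentinel can be sketched as follows. The class, constant and method names here are hypothetical; the sketch only illustrates the mapping described above, not the actual `NewTopic` or `AdminManager` code.

```java
import java.util.Optional;

// Hypothetical sketch: map an absent value to the -1 "use broker default" sentinel.
final class TopicDefaultsSketch {
    static final int USE_BROKER_DEFAULT_PARTITIONS = -1;
    static final short USE_BROKER_DEFAULT_REPLICATION = -1;

    static int numPartitionsOrSentinel(Optional<Integer> numPartitions) {
        return numPartitions.orElse(USE_BROKER_DEFAULT_PARTITIONS);
    }

    static short replicationFactorOrSentinel(Optional<Short> replicationFactor) {
        return replicationFactor.orElse(USE_BROKER_DEFAULT_REPLICATION);
    }

    public static void main(String[] args) {
        System.out.println(numPartitionsOrSentinel(Optional.empty()));
        System.out.println(replicationFactorOrSentinel(Optional.of((short) 3)));
    }
}
```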
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ryanne Dolan <ryannedolan@gmail.com>, Jason Gustafson <jason@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Boyang Chen <boyang@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Guozhang Wang <guozhang@confuent.io>
Since the originals map passed to AbstractConfig constructor may be immutable, avoid updating this map while resolving indirect config variables. Instead a new ResolvingMap instance is now used to store resolved configs.
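A minimal sketch of the approach, assuming string-valued configs and a caller-supplied resolver function. The class and method names are hypothetical and this is not the actual `ResolvingMap` implementation; it only shows resolving into a copy so that an immutable input is never modified.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: resolve config values into a new map, leaving the input untouched.
final class ResolveSketch {
    static Map<String, String> resolve(Map<String, String> originals,
                                       UnaryOperator<String> resolver) {
        Map<String, String> resolved = new HashMap<>(originals);   // mutable copy
        resolved.replaceAll((key, value) -> resolver.apply(value));
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> originals =
            Collections.unmodifiableMap(Collections.singletonMap("password", "${indirect}"));
        // Stand-in resolver: a real one would look the variable up in a config provider.
        Map<String, String> resolved =
            resolve(originals, v -> v.startsWith("${") ? "resolved-value" : v);
        System.out.println(resolved.get("password"));
    }
}
```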
Reviewers: Randall Hauch <rhauch@gmail.com>, Boyang Chen <bchen11@outlook.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
This upgrade exposes a number of new options, including the WriteBufferManager which -- along with existing TableConfig options -- allows users to limit the total memory used by RocksDB across instances. This can alleviate some cascading OOM potential when, for example, a large number of stateful tasks are suddenly migrated to the same host.
The RocksDB docs guarantee backwards format compatibility across versions.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>,
Verified that the https links work.
I didn't update the license header in this PR since that touches
so many files. Will file a separate one for that.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
* Describe/Delete/Reset offsets on multiple consumer groups at a time (including each group by repeating `--group` parameter)
* Describe/Delete/Reset offsets on ALL consumer groups at a time (add new `--all-groups` option similar to `--all-topics`)
* Reset plan CSV file generation reworked: structure updated to support multiple consumer groups and make sure that CSV file generation is done properly since there are no restrictions on consumer group names and symbols like commas and quotes are allowed.
* Extending data output table format by adding `GROUP` column for all `--describe` queries
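To show why quoting matters for group names that contain commas or quotes, here is a small sketch of RFC 4180-style field escaping. It is an illustration with hypothetical names and an assumed column order, not the tool's actual CSV generation code.

```java
// Hypothetical sketch: quote CSV fields so commas and quotes in group names survive.
final class CsvSketch {
    static String escape(String field) {
        boolean needsQuoting = field.indexOf(',') >= 0
            || field.indexOf('"') >= 0
            || field.indexOf('\n') >= 0
            || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        // Double any embedded quotes, then wrap the whole field in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Assumed column order for illustration: group, topic, partition, offset.
    static String row(String group, String topic, int partition, long offset) {
        return String.join(",",
            escape(group), escape(topic), Integer.toString(partition), Long.toString(offset));
    }

    public static void main(String[] args) {
        System.out.println(row("my,group", "orders", 0, 5L));
    }
}
```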
The test `org.apache.kafka.connect.runtime.rest.RestServerTest#testCORSEnabled` assumes the Jersey client can send restricted HTTP headers (`Origin`).
Jersey client uses `sun.net.www.protocol.http.HttpURLConnection`.
`sun.net.www.protocol.http.HttpURLConnection` drops restricted headers (`Host`, `Keep-Alive`, `Origin`, etc.) based on the static property `allowRestrictedHeaders`.
This property is initialized in a static block by reading the Java system property `sun.net.http.allowRestrictedHeaders`.
So, if the classloader loads `HttpURLConnection` before we set `sun.net.http.allowRestrictedHeaders=true`, all subsequent changes to this system property have no effect (which happens if `org.apache.kafka.connect.integration.ExampleConnectIntegrationTest` is executed before `RestServerTest`).
To prevent this, we have to either make sure we set `sun.net.http.allowRestrictedHeaders=true` as early as possible or do not rely on this system property at all.
This PR adds test dependency on `httpcomponents-client` which doesn't depend on `sun.net.http.allowRestrictedHeaders` system property. Thus none of existing tests should interfere with `RestServerTest`.
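The ordering problem can be reproduced in miniature with the sketch below. It uses a hypothetical holder class and a made-up property name rather than `HttpURLConnection` itself; it only demonstrates that a value read in a static initializer is not affected by later changes to the system property.

```java
// Hypothetical sketch: a system property read once during class initialization.
final class StaticPropertySketch {
    // Made-up property name, used only for this illustration.
    static final String PROP = "sketch.allowRestrictedHeaders";

    // Mimics a class that captures a system property in its static initializer.
    static final class Holder {
        static final boolean ALLOW = Boolean.getBoolean(PROP);
    }

    // Reads the flag, sets the property afterwards, then reads the flag again.
    static boolean[] beforeAndAfterLateSet() {
        boolean before = Holder.ALLOW;        // first access initializes Holder
        System.setProperty(PROP, "true");     // too late: Holder is already initialized
        boolean after = Holder.ALLOW;
        return new boolean[] {before, after};
    }

    public static void main(String[] args) {
        boolean[] flags = beforeAndAfterLateSet();
        System.out.println("before=" + flags[0] + ", after=" + flags[1]);
    }
}
```

Both reads return the same value, even though the system property itself now reads `true`.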
Author: Alex Diachenko <sansanichfb@gmail.com>
Reviewers: Randall Hauch, Konstantine Karantasis, Gwen Shapira
Closes #6236 from avocader/KAFKA-7799
JUnit 4.13 fixes the issue where `Category` and `Parameterized` annotations
could not be used together. It also deprecates `ExpectedException` and
`assertThat`. Given this, we:
- Replace `ExpectedException` with the newly introduced `assertThrows`.
- Replace `Assert.assertThat` with `MatcherAssert.assertThat`.
- Annotate `AbstractLogCleanerIntegrationTest` with `IntegrationTest` category.
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, David Arthur <mumrah@gmail.com>
This patch adds a framework to automatically generate the request/response classes for Kafka's protocol. The code will be updated to use the generated classes in follow-up patches. Below is a brief summary of the included components:
**buildSrc/src**
The message generator code is here. This code is automatically re-run by gradle when one of the schema files changes. The entire directory is processed at once to minimize the number of times we have to start a new JVM. We use Jackson to translate the JSON files into Java objects.
**clients/src/main/java/org/apache/kafka/common/protocol/Message.java**
This is the interface implemented by all automatically generated messages.
**clients/src/main/java/org/apache/kafka/common/protocol/MessageUtil.java**
Some utility functions used by the generated message code.
**clients/src/main/java/org/apache/kafka/common/protocol/Readable.java, Writable.java, ByteBufferAccessor.java**
The generated message code uses these classes for writing to a buffer.
**clients/src/main/message/README.md**
This README file explains how the JSON schemas work.
**clients/src/main/message/\*.json**
The JSON files in this directory implement every supported version of every Kafka API. The unit tests automatically validate that the generated schemas match the hand-written schemas in our code. Additionally, there are some things like request and response headers that have schemas here.
**clients/src/main/java/org/apache/kafka/common/utils/ImplicitLinkedHashSet.java**
I added an optimization here for empty sets. This is useful here because I want all messages to start with empty sets by default prior to being loaded with data. This is similar to the "empty list" optimizations in the `java.util.ArrayList` class.
Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Ismael Juma <ismael@juma.me.uk>, Bob Barrett <bob.barrett@outlook.com>, Jason Gustafson <jason@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Ryanne Dolan <ryannedolan@gmail.com>, Guozhang Wang <guozhang@confluent.io>
The StreamsUpgradeTest::test_upgrade_downgrade_brokers used sleep calls in the test which led to flaky test performance and as a result, we placed an @ignore annotation on the test. This PR uses log events instead of the sleep calls hence we can now remove the @ignore setting.
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
See https://github.com/spotbugs/spotbugs/issues/756 for details on
the false positives affecting try with resources. An example is:
> RCN | Nullcheck of fc at line 629 of value previously dereferenced in
> org.apache.kafka.common.utils.Utils.readFileAsString(String, Charset)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
KAFKA-7597: Add configurable transaction support to ProduceBenchWorker. In order to get support for serializing Optional<> types to JSON, add a new library: jackson-datatype-jdk8. Once Jackson 3 comes out, this library will not be needed.
Reviewers: Colin McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
EasyMock 4.0.x includes a change that relies on the caller for inferring
the return type of mock creator methods. Updated a number of Scala
tests for compilation and execution to succeed.
The versions of EasyMock and PowerMock in this PR include full support
for Java 11.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Development of EasyMock and PowerMock has stagnated while Mockito
continues to be actively developed. With the new Java release cadence,
it's a problem to depend on libraries that do bytecode manipulation
and are not actively maintained. In addition, Mockito is also
easier to use.
While updating the tests, I attempted to go from failing test to
passing test. In cases where the updated test passed on the first
attempt, I artificially broke it to ensure the test was still doing its
job.
I included a few improvements that were helpful while making these
changes:
1. Better exception if there are no nodes in `leastLoadedNodes`
2. Always close the producer in `KafkaProducerTest`
3. requestsInFlight producer metric should not hold a reference to
`Sender`
Finally, `Metadata` is no longer final so that we don't need
`PowerMock` to mock it. It's an internal class, so it's OK.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Dong Lin <lindong28@gmail.com>
Closes #5691 from ijuma/kafka-7438-mockito
Removed ignore annotations from the upgrade tests. This PR includes the following changes for updating the upgrade tests:
* Uploaded new versions 0.10.2.2, 0.11.0.3, 1.0.2, 1.1.1, and 2.0.0 (in the associated scala versions) to kafka-packages
* Update versions in version.py, Dockerfile, base.sh
* Added new versions to StreamsUpgradeTest.test_upgrade_downgrade_brokers including version 2.0.0
* Added new versions StreamsUpgradeTest.test_simple_upgrade_downgrade test excluding version 2.0.0
* Version 2.0.0 is excluded from the streams upgrade/downgrade test as StreamsConfig needs an update for the new version, requiring a KIP. Once the community votes the KIP in, a minor follow-up PR can be pushed to add the 2.0.0 version to the upgrade test.
* Fixed minor bug in kafka-run-class.sh for classpath in upgrade/downgrade tests across versions.
* Follow on PRs for 0.10.2.x, 0.11.0.x, 1.0.x, 1.1.x, and 2.0.x will be pushed soon with the same updates required for the specific version.
Reviewers: Eno Thereska <eno.thereska@gmail.com>, John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
findBugs is abandoned, it doesn't work with Java 9 and the Gradle plugin will be deprecated in
Gradle 5.0: https://github.com/gradle/gradle/pull/6664
spotBugs is actively maintained and it supports Java 8, 9 and 10. Java 11 is not supported yet,
but it's likely to happen soon.
Also fixed a file leak in Connect identified by spotbugs.
Manually tested spotBugsMain, jarAll and importing kafka in IntelliJ and running
a build in the IDE.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Dong Lin <lindong28@gmail.com>
Closes #5625 from ijuma/kafka-5887-spotbugs
"Jetty 9.4.12 includes compatibility for JDK 11. Additionally, TLS 1.3 support has been implemented. While full functionality for new JDK features is not yet supported, this release has been built and tested for compatibility with the latest releases from Oracle."
http://dev.eclipse.org/mhonarc/lists/jetty-announce/msg00124.html
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Previously, we depicted creating a Jackson serde for every pojo class, which becomes a burden in practice. There are many ways to avoid this and just have a single serde, so we've decided to model this design choice instead.
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Relative paths in Gradle break when the Gradle daemon is used
unless user.dir can be changed while the process is running.
Java 11 disallows this, so we use project paths instead.
Verified that rat and checkstyle work with Java 11 after these
changes.
Reviewers: Dong Lin <lindong28@gmail.com>
This includes a fix for ZOOKEEPER-2184 (Zookeeper Client
should re-resolve hosts when connection attempts fail), which
fixes KAFKA-4041.
Updated a couple of tests as unresolvable addresses are now
retried until the connection timeout. Cleaned up tests a little.
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
- Removed Scala consumers (`SimpleConsumer` and `ZooKeeperConsumerConnector`)
and their tests.
- Removed Scala request/response/message classes.
- Removed any mention of new consumer or new producer in the code
with the exception of MirrorMaker where the new.consumer option was
never deprecated so we have to keep it for now. The non-code
documentation has not been updated either, that will be done
separately.
- Removed a number of tools that only made sense in the context
of the Scala consumers (see upgrade notes).
- Updated some tools that worked with both Scala and Java consumers
so that they only support the latter (see upgrade notes).
- Removed `BaseConsumer` and related classes apart from `BaseRecord`
which is used in `MirrorMakerMessageHandler`. The latter is a pluggable
interface so effectively public API.
- Removed `ZkUtils` methods that were only used by the old consumers.
- Removed `ZkUtils.registerBroker` and `ZKCheckedEphemeral` since
the broker now uses the methods in `KafkaZkClient` and no-one else
should be using that method.
- Updated system tests so that they don't use the Scala consumers except
for multi-version tests.
- Updated LogDirFailureTest so that the consumer offsets topic would
continue to be available after all the failures. This was necessary for it
to work with the Java consumer.
- Some multi-version system tests had not been updated to include
recently released Kafka versions, fixed it.
- Updated findBugs and checkstyle configs not to refer to deleted
classes and packages.
Reviewers: Dong Lin <lindong28@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
Upgrade strongly recommended due to security fixes for
jackson-databind (same as ones in 2.7.9.4 and 2.8.11.2).
Reviewers: Matthias J. Sax <matthias@confluent.io>
Connect API currently depends on Jersey API as a side-effect of KIP-285. It should only depend on the JAX-RS API.
Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #5190 from mageshn/KAFKA-7031
(cherry picked from commit 51ac53d903)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
In addition to Gradle, updated snappy, owasp-dependency-check,
apache directory service api.
Gradle 4.8 fixes a fatal issue when building with Java 11, but
full support is coming in 4.9 or later.
KAFKA-6921 removed deprecated scala producer. This pull request removes the now unnecessary findbugs exclusion that matched one of the affected classes.
* Set --source, --target and --release to 1.8.
* Build Scala 2.12 by default.
* Remove some conditionals in the build file now that Java 8
is the minimum version.
* Bump the version of Jetty, Jersey and Checkstyle (the newer
versions require Java 8).
* Fixed issues uncovered by the new version of Checkstyle.
* A couple of minor updates to handle an incompatible source
change in the new version of Jetty.
* Add dependency to jersey-hk2 to fix failing tests caused
by Jersey upgrade.
* Update release script to use Java 8 and to take into account
that Scala 2.12 is now built by default.
* While we're at it, bump the version of Gradle, Gradle plugins,
ScalaLogging, JMH and apache directory api.
* Minor documentation updates including the readme and upgrade
notes. A number of Streams Java 7 examples can be removed
subsequently.
This PR implements a Scala wrapper library for Kafka Streams. The library is implemented as a project under streams, namely `:streams:streams-scala`. The PR contains the following:
* the library implementation of the wrapper abstractions
* the test suite
* the changes in `build.gradle` to build the library jar
The library has been tested running the tests as follows:
```
$ ./gradlew -Dtest.single=StreamToTableJoinScalaIntegrationTestImplicitSerdes streams:streams-scala:test
$ ./gradlew -Dtest.single=StreamToTableJoinScalaIntegrationTestImplicitSerdesWithAvro streams:streams-scala:test
$ ./gradlew -Dtest.single=WordCountTest streams:streams-scala:test
```
Author: Debasish Ghosh <ghosh.debasish@gmail.com>
Author: Sean Glover <seglo@randonom.com>
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>, John Roesler <john@confluent.io>, Damian Guy <damian@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #4756 from debasishg/scala-streams
- adds Streams upgrade tests for 1.1 release
- introduces metadata version 3
Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
* Upgrade EasyMock to 3.6 which adds support for Java 10
by upgrading to ASM 6.1.1.
* Ensure that Jacoco is truly disabled for the `core` project.
This was the original intent, since it's in Scala, but it had not
been achieved. This is important because the Jacoco agent
fails when it tries to instrument the classes compiled by
scalac with Java 10.
* Added dependencies so that Trogdor and Connect work with Java 9 and 10
* Updated Jacoco to 0.8.1 so that it works with Java 10
* Updated Gradle to 4.6
* A few minor version bumps (not related to Java 9/10 fixes)
I tested manually that we can run ./gradlew test with Java 10
after these changes. There are test failures as EasyMock
and PowerMock will have to be updated to use a newer
ASM version. But compiling successfully and most tests
passing is progress. :)
I also tested manually that Trogdor can be started with Java 10.
It previously failed with a ClassNotFoundError.
Reviewers: Jason Gustafson <jason@confluent.io>
It's a critical bug that only affects the server, but we
don't have an easy way to use 3.4.11 for the client
only.
Reviewers: Jun Rao <junrao@gmail.com>, Damian Guy <damian.guy@gmail.com>
* MINOR: Update gradle, jackson and jacoco
- Gradle update adds support for Java 10
- Jacoco update adds support for Java 9
- Jackson bug fix update adds more serialization
robustness checks
* Update Jetty
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
This PR implements the JIRA issue [KAFKA-4029: SSL support for Connect REST API](https://issues.apache.org/jira/browse/KAFKA-4029) / [KIP-208](https://cwiki.apache.org/confluence/display/KAFKA/KIP-208%3A+Add+SSL+support+to+Kafka+Connect+REST+interface).
Summary of the main changes:
- Jetty `HttpClient` is used as the HTTP client instead of the one shipped with Java. This keeps the SSL configuration for the server and the client in a single place (both use the Jetty `SslContextFactory`). It also has much richer configuration than the JDK client (for example, it is easier to configure the supported cipher suites).
- The `RestServer` class has been broken into 3 parts. `RestServer` contains the server itself, `RestClient` contains the HTTP client used for forwarding requests etc., and `SSLUtils` contains some helper classes for configuring SSL. One of the reasons for this was Findbugs complaining about the class complexity.
- A new method `valuesWithPrefixAllOrNothing` has been added to `AbstractConfig` to make it easier to handle the case where we want to use either only the prefixed SSL options or only the non-prefixed ones, but not a mix of both.
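The all-or-nothing lookup described in the last item can be sketched roughly as follows. This is a minimal illustration on a plain `Map`, not the `AbstractConfig` code; the class name `PrefixedConfigSketch`, the method `withPrefixAllOrNothing` and the `listeners.https.` prefix are hypothetical, and the real method may differ in details.

```java
import java.util.HashMap;
import java.util.Map;

final class PrefixedConfigSketch {
    /**
     * If any key starts with the prefix, return only those entries with the
     * prefix stripped. Otherwise return a copy of all entries unchanged.
     * The two sets are never mixed.
     */
    static Map<String, Object> withPrefixAllOrNothing(Map<String, Object> values, String prefix) {
        Map<String, Object> prefixed = new HashMap<>();
        for (Map.Entry<String, Object> e : values.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                prefixed.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return prefixed.isEmpty() ? new HashMap<>(values) : prefixed;
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        config.put("ssl.keystore.location", "/path/global.jks");
        config.put("ssl.protocol", "TLS");
        // Hypothetical prefix, used only for this illustration.
        config.put("listeners.https.ssl.keystore.location", "/path/rest.jks");
        System.out.println(withPrefixAllOrNothing(config, "listeners.https."));
    }
}
```

In this example only the prefixed keystore location is returned; the non-prefixed `ssl.protocol` is ignored because at least one prefixed option is present.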
Author: Jakub Scholz <www@scholzj.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #4429 from scholzj/kip-208
- Use newly added pause method in LogCleaner and ControllerChannelManager classes
- Remove LogCleaner, Cleaner exclusions from findbugs-exclude.xml
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Updates:
- Gradle, gradle plugins and maven artifact updated
- Bug fix updates for ZooKeeper, Jackson, EasyMock and Snappy
Not updated:
- RocksDB as it often causes issues, so better done separately
- args4j as our test coverage is weak and the update was a
feature release
Also fixed scala-reflect version to match scala-library.
Release notes for ZooKeeper 3.4.11:
https://zookeeper.apache.org/doc/r3.4.11/releasenotes.html
A notable fix is improved handling of UnknownHostException:
https://issues.apache.org/jira/browse/ZOOKEEPER-2614
Manually tested that IntelliJ import and build still works.
Relying on existing test suite otherwise.
Reviewers: Jun Rao <junrao@gmail.com>
- Rename `encode` to `legacyEncodeAsString`, we
can remove this when we remove `ZkUtils`.
- Introduce `encodeAsString` that uses Jackson.
- Change `encodeAsBytes` to use Jackson.
- Avoid intermediate string when converting
Broker to json bytes.
The methods that use Jackson only support
Java collections unlike `legacyEncodeAsString`.
Tests were added for `encodeAsString` and
`encodeAsBytes`.
Author: umesh chaudhary <umesh9794@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #4259 from umesh9794/KAFKA-5631
Use slf4j (via scala-logging) instead. Also:
- Log4jController is only initialised if log4j is in the classpath
- Use FATAL marker to support log4j's FATAL level (as the log4j-slf4j bridge does)
- Removed `Logging.swallow` in favour of CoreUtils.swallow, which logs to the
correct logger
Author: Viktor Somogyi <viktor.somogyi@cloudera.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #3477 from viktorsomogyi/KAFKA-1044
The main change is Java 9 support.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #4185 from ijuma/scala-2.11.12
Previously, Trogdor only handled "Faults." Now, Trogdor can handle
"Tasks" which may be either faults, or workloads to execute in the
background.
The Agent and Coordinator have been refactored from a
mutexes-and-condition-variables paradigm into a message passing
paradigm. No locks are necessary, because only one thread can access
the task state or worker state. This makes them a lot easier to reason
about.
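A minimal sketch of the message passing idea, assuming a single-threaded executor that owns the state. The class `TaskStateSketch` and its methods are hypothetical and are not the Trogdor classes; they only show why no locks are needed when one thread touches the state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class TaskStateSketch {
    // Touched only by the single executor thread, so no lock is needed.
    private final Map<String, String> taskStates = new HashMap<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Callers send a "message" by submitting work to the owning thread. */
    Future<?> setState(String taskId, String state) {
        Runnable update = () -> taskStates.put(taskId, state);
        return executor.submit(update);
    }

    /** Reads also go through the owning thread, in submission order. */
    Future<String> getState(String taskId) {
        Callable<String> read = () -> taskStates.get(taskId);
        return executor.submit(read);
    }

    void shutdown() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        TaskStateSketch sketch = new TaskStateSketch();
        sketch.setState("task-1", "RUNNING");
        System.out.println(sketch.getState("task-1").get()); // prints "RUNNING"
        sketch.shutdown();
    }
}
```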
The MockTime class can now handle mocking deferred message passing
(adding a message to an ExecutorService with a delay). I added a
MockTimeTest.
MiniTrogdorCluster now starts up Agent and Coordinator classes in
parallel in order to minimize junit test time.
RPC messages now inherit from a common Message.java class. This class
handles implementing serialization, equals, hashCode, etc.
Remove FaultSet, since it is no longer necessary.
Previously, if CoordinatorClient or AgentClient hit a networking
problem, they would throw an exception. They now retry several times
before giving up. Additionally, the REST RPCs to the Coordinator and
Agent have been changed to be idempotent. If a response is lost, and
the request is resent, no harm will be done.
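The retry behaviour can be sketched as below. This is an illustration only: `RetrySketch` and `callWithRetries` are hypothetical names, the retry count is arbitrary, and the sketch assumes the request is idempotent so that resending it is harmless.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class RetrySketch {
    /** Calls the request up to maxTries times and rethrows the last failure. */
    static <T> T callWithRetries(Callable<T> request, int maxTries) throws Exception {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            try {
                return request.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = callWithRetries(() -> {
            // Fail twice to simulate a lost response, then succeed.
            if (++calls[0] < 3) {
                throw new IOException("simulated network error");
            }
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " calls"); // ok after 3 calls
    }
}
```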
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #4073 from cmccabe/KAFKA-6060
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>
Closes #4136 from guozhangwang/K6100-rocksdb-580-regression
Mainly for Java 9 fixes and improved compilation times (5-10% reduction):
http://www.scala-lang.org/news/2.12.4
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #4102 from ijuma/update-scala-version
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>
Closes #3819 from guozhangwang/KMinor-rocksDB-573
Also:
1. Fix WorkerTest to use the correct `Mock` annotations. `org.easymock.Mock`
is not supported by PowerMock 2.x.
2. Rename `powermock` to `powermockJunit4` in `dependencies.gradle` for
clarity.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #3881 from ijuma/kafka-5884-powermock-java
- EasyMock 3.5 supports Java 9.
- Fixed issues in `testFailedSendRetryLogic` and
`testCreateConnectorAlreadyExists` exposed by new EasyMock
version. The former was passing `anyObject` to
`andReturn`, which doesn't make sense. This was leaving
behind a global `any` matcher, which caused a few issues in
the new version. Fixing this meant that the correlation ids had
to be updated to actually match. The latter was missing a
couple of expectations that the previous version of EasyMock
didn't catch.
- Removed unnecessary PowerMock dependency from 3 tests.
- Disabled remaining PowerMock tests when running with Java 9
until https://github.com/powermock/powermock/issues/783 is
in a release.
- Once we merge this PR, we can enable tests in the Java 9 builds
in Jenkins.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #3845 from ijuma/kafka-4501-easymock-powermock-java-9
There have been a few bug fix releases since
the previous update.
Author: Andras Beni <andrasbeni@cloudera.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #3791 from andrasbeni/dependency-upgrade
Notable updates:
1. Gradle 4.1 includes a number of performance and
CLI improvements as well as initial Java 9 support.
2. Scala 2.12.3 has substantial compilation time
improvements.
3. lz4-java 1.4 allows us to remove a workaround in
KafkaLZ4BlockInputStream (not done in this PR).
4. snappy-java 1.1.4 improved performance of compression (5%)
and decompression (20%). There was a slight increase in the
compressed size in one of our tests.
Not updated:
1. PowerMock due to a couple of regressions. I investigated one of them
and filed https://github.com/powermock/powermock/issues/828.
2. Jackson, which will be done via #3631.
3. Rocksdb, which will be done via #3519.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #3619 from ijuma/update-deps-for-1.0.0
In a test by onurkaraman involving 3066 topics and 95895 partitions,
Controller initialisation time spent on JSON parsing would be reduced from
37.1 seconds to 0.7 seconds by switching from the current JSON parser to
Jackson. See the following JIRA comment for more details:
https://issues.apache.org/jira/browse/KAFKA-5328?focusedCommentId=16027086
I tested that we only use Jackson methods introduced in 2.0 in the main
codebase by compiling it with the older version locally. We use a
constructor introduced in 2.4 in one test, but I didn't remove it as it
seemed harmless. The reasoning for this is explained in the mailing list
thread:
http://search-hadoop.com/m/uyzND1FWbWw1qUbWe
Finally, this PR only handles the parsing side. It would be good to use Jackson
for serialising to JSON as well. I filed KAFKA-5631 for that.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Onur Karaman <okaraman@linkedin.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #83 from ijuma/kafka-1595-remove-deprecated-json-parser-jackson
I included a JMH benchmark and the results follow. The
implementation in this PR takes no more than 1/10th
of the time when compared to trunk. I also included
results for an alternative implementation that is a little
slower than the one in the PR.
Trunk:
```text
TopicBenchmark.testValidate topic avgt 15 134.107 ± 3.956 ns/op
TopicBenchmark.testValidate longer-topic-name avgt 15 316.241 ± 13.379 ns/op
TopicBenchmark.testValidate very-long-topic-name_with_more_text avgt 15 636.026 ± 30.272 ns/op
```
Implementation in the PR:
```text
TopicBenchmark.testValidate topic avgt 15 13.153 ± 0.383 ns/op
TopicBenchmark.testValidate longer-topic-name avgt 15 26.139 ± 0.896 ns/op
TopicBenchmark.testValidate very-long-topic-name.with_more_text avgt 15 44.829 ± 1.390 ns/op
```
Alternative implementation where `boolean validChar = Character.isLetterOrDigit(c) || c == '.' || c == '_' || c == '-';`:
```text
TopicBenchmark.testValidate topic avgt 15 18.883 ± 1.044 ns/op
TopicBenchmark.testValidate longer-topic-name avgt 15 36.696 ± 1.220 ns/op
TopicBenchmark.testValidate very-long-topic-name_with_more_text avgt 15 65.956 ± 0.669 ns/op
```
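The PR's exact code is not reproduced in this message, so the following is only a sketch of a per-character check in the same spirit. It assumes explicit ASCII range comparisons; the class `TopicNameSketch` and the method `containsValidChars` are hypothetical names. Note that `Character.isLetterOrDigit` in the alternative above also accepts non-ASCII letters, which the range checks below reject.

```java
final class TopicNameSketch {
    /** Returns true if every character is an ASCII letter, digit, '.', '_' or '-'. */
    static boolean containsValidChars(String topic) {
        for (int i = 0; i < topic.length(); i++) {
            char c = topic.charAt(i);
            boolean validChar = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
                    || (c >= 'A' && c <= 'Z') || c == '.' || c == '_' || c == '-';
            if (!validChar) {
                return false; // stop at the first illegal character
            }
        }
        return true; // an empty string has no illegal characters
    }

    public static void main(String[] args) {
        System.out.println(containsValidChars("very-long-topic-name.with_more_text")); // true
        System.out.println(containsValidChars("bad topic")); // false
    }
}
```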
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #3234 from ijuma/optimise-topic-is-valid
The JMH benchmark included shows that the redundant
volatile write causes the constructor of `ProducerRecord`
to take more than 50% longer:
ProducerRecordBenchmark.constructorBenchmark avgt 15 24.136 ± 1.458 ns/op (before)
ProducerRecordBenchmark.constructorBenchmark avgt 15 14.904 ± 0.231 ns/op (after)
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #3233 from ijuma/remove-volatile-write-in-records-header-constructor
- reuse decompression buffers in consumer Fetcher
- switch lz4 input stream to operate directly on ByteBuffers
- avoids performance impact of catching exceptions when reaching the end of legacy record batches
- more tests with both compressible / incompressible data, multiple
blocks, and various other combinations to increase code coverage
- fixes bug that would cause exception instead of invalid block size
for invalid incompressible blocks
- fixes bug if incompressible flag is set on end frame block size
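The buffer reuse mentioned in the first item can be illustrated with a small pool. This is a sketch under assumptions, not the Fetcher code: `BufferPoolSketch` is a hypothetical class, it caches buffers of a single fixed size, and it is meant for use from one thread only.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

final class BufferPoolSketch {
    private final int bufferSize;
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    BufferPoolSketch(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Hands out a cached buffer if one is available, otherwise allocates. */
    ByteBuffer get() {
        ByteBuffer cached = free.pollFirst();
        return cached != null ? cached : ByteBuffer.allocate(bufferSize);
    }

    /** Returns a buffer to the pool so that the next get() can reuse it. */
    void release(ByteBuffer buffer) {
        buffer.clear(); // reset position and limit before reuse
        free.addFirst(buffer);
    }

    public static void main(String[] args) {
        BufferPoolSketch pool = new BufferPoolSketch(64 * 1024);
        ByteBuffer first = pool.get();
        pool.release(first);
        ByteBuffer second = pool.get();
        System.out.println(first == second); // true: the same buffer was reused
    }
}
```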
Overall this improves LZ4 decompression performance by up to 40x for small batches.
Most improvements are seen for batches of size 1 with messages on the order of ~100B.
We see at least 2x improvements for batch sizes of < 10 messages, containing messages < 10kB.
This patch also yields 2-4x improvements on v1 small single message batches for other compression types.
Full benchmark results can be found here:
https://gist.github.com/xvrl/05132e0643513df4adf842288be86efd
Author: Xavier Léauté <xavier@confluent.io>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2967 from xvrl/kafka-5150
The code was correct since the method is only called from
one thread, but the change is worthwhile anyway.
Author: Amit Daga <adaga@adobe.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2966 from amitdaga/findbugs-streams-multithread
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes #2840 from apurvam/exactly-once-transactional-clients
Worth special mention:
1. Update Scala to 2.11.11 and 2.12.2
2. Update Gradle to 3.5
3. Update ZooKeeper to 3.4.10
4. Update reflections to 0.9.11, which:
* Switches to jsr305 annotations with a provided scope
* Updates Guava from 18 to 20
* Updates javassist from 3.18 to 3.21
There’s a separate PR for updating RocksDb, so
I didn’t include that here.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #2872 from ijuma/update-deps-for-0.11
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jozef Koval <jozef.koval@protonmail.ch>, Ismael Juma <ismael@juma.me.uk>
Closes #2687 from cmccabe/KAFKA-4899
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Eno Thereska <eno@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2780 from cmccabe/KAFKA-4995
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2779 from cmccabe/KAFKA-4993
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2763 from cmccabe/KAFKA-4977
There were a couple of important issues fixed in Gradle 3.2.1:
* [GRADLE-3582] - Gradle wrapper fails to escape arguments with nested quotes
* [GRADLE-3583] - Newlines in JAVA_OPTS breaks application plugin shell script in Gradle 3.2
And a lot of important issues fixed in Scala 2.12.1:
* http://www.scala-lang.org/news/2.12.1
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>
Closes #2216 from ijuma/gradle-3.2.1-and-scala-2.12.1
This reverts commit e035fc0395 for the
following reasons:
1. License files are missing causing local builds to fail during the
rat task (rat is not being run in Jenkins for some reason, filed
KAFKA-4459 for that)
2. It renames a number of system test files when there's a better
way to achieve the goal of running a subset of system tests to stay
under the Travis limit.
3. It adds the gradle wrapper binary even though this was removed
intentionally a while back.
A new PR will be submitted for KAFKA-4345 without the undesired
changes.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2187 from ijuma/kafka-4345-revert
As of now, the ducktape tests that we have for Kafka are not run for pull requests. We can run these tests using travis-ci. Here is a sample run:
https://travis-ci.org/raghavgautam/kafka/builds/170574293
Author: Raghav Kumar Gautam <raghav@apache.org>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>
Closes #2064 from raghavgautam/trunk
There are 32 failing tests on both trunk and my branch.
Author: jozi-k <jozef.koval@protonmail.ch>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #2036 from jozi-k/update-rocksdb-4.11.2
https://issues.apache.org/jira/browse/KAFKA-4025
This patch sets the `file.encoding` system property to UTF-8 before invoking rat during the build process and resets it to the original value afterwards.
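The save, set and restore pattern can be sketched in Java as follows. The actual change lives in the Gradle build, so this is only an illustration: `FileEncodingSketch` and `runWithUtf8` are hypothetical names, and the `Runnable` stands in for the rat invocation. The sketch only affects code that reads the property while the task runs.

```java
final class FileEncodingSketch {
    /** Runs the task with file.encoding set to UTF-8, then restores the old value. */
    static void runWithUtf8(Runnable task) {
        String original = System.getProperty("file.encoding");
        System.setProperty("file.encoding", "UTF-8");
        try {
            task.run();
        } finally {
            // Restore even if the task throws.
            if (original == null) {
                System.clearProperty("file.encoding");
            } else {
                System.setProperty("file.encoding", original);
            }
        }
    }

    public static void main(String[] args) {
        runWithUtf8(() -> System.out.println(System.getProperty("file.encoding"))); // UTF-8
    }
}
```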
Author: radai-rosenblatt <radai.rosenblatt@gmail.com>
Reviewers: Joel Koshy <jjkoshy.w@gmail.com>
Closes #1710 from radai-rosenblatt/fix-build-on-windows
rocksdbjni version 4.9.0 now includes support for running on Windows; this PR updates Kafka Streams' dependency to that version. Tests pass locally, except for a timeout in testReprocessingFromScratchAfterReset that doesn't seem related; it happens with and without this change.
This contribution is my original work and I license the work to the project under the project's open source license.
Author: Mathieu Fenniak <mathieu.fenniak@replicon.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #1783 from mfenniak/update-rocksdb-4.9
* The hope is that RocksDb 4.4.1 is more stable than 4.1.0 (occasional segfaults) and 4.2.0 (very frequent segfaults), release notes for 4.4.1: https://www.facebook.com/groups/rocksdb.dev/permalink/925995520832296/
* slf4j 1.7.21 includes thread-safety fixes: http://www.slf4j.org/news.html
* snappy 1.1.2.4 includes performance improvements requested by Spark, which apply to our usage: https://github.com/xerial/snappy-java/blob/master/Milestone.md
I ran the stream tests several times and they passed every time while 4.2.0 segfaulted every time.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes #1219 from ijuma/kafka-3557-update-rocks-db-4.4.1-snappy-slf4j
All dependencies on hadoop were removed with MiniKDC. This removes the left over version entry.
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma
Closes #1214 from granthenke/remove-hadoop
This also fixes KAFKA-3453 and KAFKA-2866.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes #1155 from ijuma/kafka-3475-introduce-our-minikdc
This ZkClient version adds authentication validation and a conditional delete method needed for other patches
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma, Gwen Shapira
Closes #1084 from granthenke/zkclient-08
Adds a gradle task to generate a report of outdated release dependencies:
`gradle dependencyUpdates`
Updates a few minor versions.
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma, Gwen Shapira
Closes #973 from granthenke/outdated-deps
This is the latest version in Maven even though HISTORY.md includes releases all the way to 4.5.0.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke <ghenke@cloudera.com>, Guozhang Wang <wangguoz@gmail.com>
Closes #937 from ijuma/update-rocks-db-for-streams
Patch version bumps for bouncy castle, minikdc, snappy, slf4j, scalatest and powermock. Notable fixes:
* Snappy: fixes a resource leak
* Bouncy castle: security fixes
Also update Gradle to 2.11 (where the notable change is improved IDE integration) and the grgit build dependency.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke <granthenke@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #903 from ijuma/kafka-3227-conservative-update-of-kafka-deps