mirror of https://github.com/apache/kafka.git
This PR works to improve high watermark checkpointing performance. `ReplicaManager.checkpointHighWatermarks()` was found to be a major contributor to GC pressure, especially on Kafka clusters with high partition counts and low throughput. Added a JMH benchmark for `checkpointHighWatermarks` which establishes a performance baseline. The parameterized benchmark was run with 100, 1000 and 2000 topics. Modified `ReplicaManager.checkpointHighWatermarks()` to avoid extra copies and cached the Log parent directory Sting to avoid frequent allocations when calculating `File.getParent()`. A few clean-ups: * Changed all usages of Log.dir.getParent to Log.parentDir and Log.dir.getParentFile to Log.parentDirFile. * Only expose public accessor for `Log.dir` (consistent with `Log.parentDir`) * Removed unused parameters in `Partition.makeLeader`, `Partition.makeFollower` and `Partition.createLogIfNotExists`. Benchmark results: | Topic Count | Ops/ms | MB/sec allocated | |-------------|---------|------------------| | 100 | + 51% | - 91% | | 1000 | + 143% | - 49% | | 2000 | + 149% | - 50% | Reviewers: Lucas Bradstreet <lucas@confluent.io>. Ismael Juma <ismael@juma.me.uk> Co-authored-by: Gardner Vickers <gardner@vickers.me> Co-authored-by: Ismael Juma <ismael@juma.me.uk> |
||
---|---|---|
.. | ||
.scalafmt.conf | ||
checkstyle.xml | ||
import-control-core.xml | ||
import-control-jmh-benchmarks.xml | ||
import-control.xml | ||
java.header | ||
suppressions.xml |