Commit Graph

21 Commits

Author SHA1 Message Date
TengYao Chi 0e4d8b3e86
KAFKA-17569 Rewrite TestLinearWriteSpeed by Java (#17736)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-26 23:43:01 +08:00
Ken Huang 2ff13976ab
KAFKA-17568 Rewrite TestPurgatoryPerformance by Java (#17246)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 02:44:37 +08:00
Dimitar Dimitrov bc47ce1a53
MINOR: Fix a race and add JMH bench for HdrHistogram (#17221) 2024-09-27 23:49:10 +09:00
Mickael Maison c30615e6d7
KAFKA-17430: Move RequestChannel.Metrics/RequestMetrics to server module (#17015)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-03 10:11:47 +02:00
Omnia Ibrahim d88c15fc3e
KAFKA-15853 Move KRAFT configs out of KafkaConfig (#15775)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-27 07:02:31 +08:00
Nikolay d8673b26bf
KAFKA-15899 [1/2] Move kafka.security package from core to server module (#15572)
1) This PR moves kafka.security classes from core to server module.
2) AclAuthorizer not moved, because it has heavy dependencies on core classes that not rewrited from scala at the moment.
3) AclAuthorizer will be deleted as part of ZK removal

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-30 11:54:22 +08:00
Yash Mayya 9bb2f78d53
KAFKA-15034: Improve performance of the ReplaceField SMT; add JMH benchmark (#13776)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-01 15:14:31 -04:00
Ron Dagostino e27926f92b
KAFKA-14735: Improve KRaft metadata image change performance at high … (#13280)
topic counts.

Introduces the use of persistent data structures in the KRaft metadata image to avoid copying the entire TopicsImage upon every change.  Performance that was O(<number of topics in the cluster>) is now O(<number of topics changing>), which has dramatic time and GC improvements for the most common topic-related metadata events.  We abstract away the chosen underlying persistent collection library via ImmutableMap<> and ImmutableSet<> interfaces and static factory methods.

Reviewers: Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>, Purshotam Chauhan <pchauhan@confluent.io>
2023-04-17 17:52:28 -04:00
Satish Duggana 1d3fb76092
KAFKA-14688 Move package org.apache.kafka.server.log.internals to org.apache.kafka.storage.internals.log (#13213)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2023-02-08 09:22:42 +05:30
David Jacot 2e0a005dd4
KAFKA-14367; Add internal APIs to the new `GroupCoordinator` interface (#13112)
This patch migrates all the internal APIs of the current group coordinator to the new `GroupCoordinator` interface. It also makes the current implementation package private to ensure that it is not used anymore.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-01-20 08:38:21 +01:00
Wenjun Ruan 760e6f3741
Add license header in suppressions.xml (#11753)
Add license header in suppressions.xml 

Reviewers: Luke Chen <showuon@gmail.com>
2022-02-17 14:35:36 +08:00
Colin Patrick McCabe 772f2cfc82
MINOR: Replace BrokerStates.scala with BrokerState.java (#10028)
Replace BrokerStates.scala with BrokerState.java, to make it easier to use from Java code if needed.  This also makes it easier to go from a numeric type to an enum.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2021-02-03 13:41:38 -08:00
Colin Patrick McCabe 1711cfa4eb
KAFKA-12209: Add the timeline data structures for the KIP-631 controller (#9901)
Reviewers: Jun Rao <junrao@gmail.com>
2021-02-02 11:33:55 -08:00
David Arthur 1a9697430a
KAFKA-8806 Reduce calls to validateOffsetsIfNeeded (#7222)
Only check if positions need validation if there is new metadata. 

Also fix some inefficient java.util.stream code in the hot path of SubscriptionState.
2020-08-21 10:25:52 -04:00
Lucas Bradstreet 46540eb5e0
KAFKA-9820: validateMessagesAndAssignOffsetsCompressed allocates unused iterator (#8422)
3e9d1c1411 introduced skipKeyValueIterator(s) which were intended to be used, but in this case were created but were not used in offset validation.

A subset of the benchmark results follow. Looks like a 20% improvement in validation performance and a 40% reduction in garbage allocation for 1-2 batch sizes.

**# Parameters: (bufferSupplierStr = NO_CACHING, bytes = RANDOM, compressionType = LZ4, maxBatchSize = 1, messageSize = 1000, messageVersion = 2)**

Before:
Result "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation":
  64851.837 ±(99.9%) 944.248 ops/s [Average]              
  (min, avg, max) = (64505.317, 64851.837, 65114.359), stdev = 245.218
  CI (99.9%): [63907.589, 65796.084] (assumes normal distribution)                                       
                                                             
"org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm":
  164088.003 ±(99.9%) 0.004 B/op [Average]                                                                                 
  (min, avg, max) = (164088.001, 164088.003, 164088.004), stdev = 0.001
  CI (99.9%): [164087.998, 164088.007] (assumes normal distribution)

After:

Result "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation":                                      
  78910.273 ±(99.9%) 707.024 ops/s [Average]                                                                               
  (min, avg, max) = (78785.486, 78910.273, 79234.007), stdev = 183.612                                                     
  CI (99.9%): [78203.249, 79617.297] (assumes normal distribution)                                       

"org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm":                                                                                                                                   
  96440.002 ±(99.9%) 0.001 B/op [Average]                                                                                  
  (min, avg, max) = (96440.002, 96440.002, 96440.002), stdev = 0.001                                                       
  CI (99.9%): [96440.002, 96440.003] (assumes normal distribution)   

 **# Parameters: (bufferSupplierStr = NO_CACHING, bytes = RANDOM, compressionType = LZ4, maxBatchSize = 2, messageSize = 1000, messageVersion = 2)**

Before:
Result "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation":                                      
  64815.364 ±(99.9%) 639.309 ops/s [Average]                                                                               
  (min, avg, max) = (64594.545, 64815.364, 64983.305), stdev = 166.026                                                                                                                                                                                
  CI (99.9%): [64176.056, 65454.673] (assumes normal distribution)                                                         
                                                                                                                                                                                        "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm":        
  163944.003 ±(99.9%) 0.001 B/op [Average]                                                                                 
  (min, avg, max) = (163944.002, 163944.003, 163944.003), stdev = 0.001                                                    
  CI (99.9%): [163944.002, 163944.004] (assumes normal distribution)                                     

After:
Result "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation":
  77075.096 ±(99.9%) 201.092 ops/s [Average]              
  (min, avg, max) = (77021.537, 77075.096, 77129.693), stdev = 52.223
  CI (99.9%): [76874.003, 77276.188] (assumes normal distribution)                                       
                                                             
"org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm":
  96504.002 ±(99.9%) 0.003 B/op [Average]                                                                                  
  (min, avg, max) = (96504.001, 96504.002, 96504.003), stdev = 0.001
  CI (99.9%): [96503.999, 96504.005] (assumes normal distribution)

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2020-04-04 10:05:51 -07:00
Gardner Vickers 8cf781ef01
MINOR: Improve performance of checkpointHighWatermarks, patch 1/2 (#6741)
This PR works to improve high watermark checkpointing performance.

`ReplicaManager.checkpointHighWatermarks()` was found to be a major contributor to GC pressure, especially on Kafka clusters with high partition counts and low throughput.

Added a JMH benchmark for `checkpointHighWatermarks` which establishes a
performance baseline. The parameterized benchmark was run with 100, 1000 and
2000 topics. 

Modified `ReplicaManager.checkpointHighWatermarks()` to avoid extra copies and cached
the Log parent directory Sting to avoid frequent allocations when calculating
`File.getParent()`.

A few clean-ups:
* Changed all usages of Log.dir.getParent to Log.parentDir and Log.dir.getParentFile to
Log.parentDirFile.
* Only expose public accessor for `Log.dir` (consistent with `Log.parentDir`)
* Removed unused parameters in `Partition.makeLeader`, `Partition.makeFollower` and `Partition.createLogIfNotExists`.

Benchmark results:

| Topic Count | Ops/ms | MB/sec allocated |
|-------------|---------|------------------|
| 100               | + 51%    |  - 91% |
| 1000             | + 143% |  - 49% |
| 2000            | + 149% |   - 50% |

Reviewers: Lucas Bradstreet <lucas@confluent.io>. Ismael Juma <ismael@juma.me.uk>

Co-authored-by: Gardner Vickers <gardner@vickers.me>
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
2020-03-25 20:53:42 -07:00
Manikumar Reddy a0e1407820
KAFKA-9670; Reduce allocations in Metadata Response preparation (#8236)
This PR removes  intermediate  conversions between `MetadataResponse.TopicMetadata` => `MetadataResponseTopic` and `MetadataResponse.PartitionMetadata` => `MetadataResponsePartition` objects.

There is 15-20% reduction in object allocations and 5-10% improvement in metadata request performance.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson<jason@confluent.io>
2020-03-16 09:30:48 -07:00
jiao e3ccf20794 KAFKA-9685: Solve Set concatenation perf issue in AclAuthorizer
To dismiss the usage of operation ++ against Set which is slow when Set has many entries. This pr introduces a new class 'AclSets' which takes multiple Sets as parameters and do 'find' against them one by one. For more details about perf and benchmark, refer to [KAFKA-9685](https://issues.apache.org/jira/browse/KAFKA-9685)

Author: jiao <jiao.zhang@linecorp.com>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #8261 from jiao-zhangS/jira-9685
2020-03-13 20:53:29 +05:30
Manikumar Reddy 8dff0b168a Kafka 9626: Improve ACLAuthorizer.acls() performance
This PR avoids creation of unnecessary sets in AclAuthorizer.acls() method implementation.

Perf results:
**Old**
```
Benchmark                                (aclCount)  (resourceCount)  Mode  Cnt    Score   Error  Units
AclAuthorizerBenchmark.testAclsIterator           5             5000  avgt   15    5.821 ? 0.309  ms/op
AclAuthorizerBenchmark.testAclsIterator           5            10000  avgt   15   15.303 ? 0.107  ms/op
AclAuthorizerBenchmark.testAclsIterator           5            50000  avgt   15   74.976 ? 0.543  ms/op
AclAuthorizerBenchmark.testAclsIterator          10             5000  avgt   15   15.366 ? 0.184  ms/op
AclAuthorizerBenchmark.testAclsIterator          10            10000  avgt   15   29.899 ? 0.129  ms/op
AclAuthorizerBenchmark.testAclsIterator          10            50000  avgt   15  167.301 ? 1.723  ms/op
AclAuthorizerBenchmark.testAclsIterator          15             5000  avgt   15   21.980 ? 0.114  ms/op
AclAuthorizerBenchmark.testAclsIterator          15            10000  avgt   15   44.385 ? 0.255  ms/op
AclAuthorizerBenchmark.testAclsIterator          15            50000  avgt   15  241.919 ? 3.955  ms/op
```
**New**

```
Benchmark                                (aclCount)  (resourceCount)  Mode  Cnt   Score   Error  Units
AclAuthorizerBenchmark.testAclsIterator           5             5000  avgt   15   0.666 ? 0.004  ms/op
AclAuthorizerBenchmark.testAclsIterator           5            10000  avgt   15   1.427 ? 0.015  ms/op
AclAuthorizerBenchmark.testAclsIterator           5            50000  avgt   15  21.410 ? 0.225  ms/op
AclAuthorizerBenchmark.testAclsIterator          10             5000  avgt   15   1.230 ? 0.018  ms/op
AclAuthorizerBenchmark.testAclsIterator          10            10000  avgt   15   4.303 ? 0.744  ms/op
AclAuthorizerBenchmark.testAclsIterator          10            50000  avgt   15  36.724 ? 0.409  ms/op
AclAuthorizerBenchmark.testAclsIterator          15             5000  avgt   15   2.433 ? 0.379  ms/op
AclAuthorizerBenchmark.testAclsIterator          15            10000  avgt   15   9.818 ? 0.214  ms/op
AclAuthorizerBenchmark.testAclsIterator          15            50000  avgt   15  52.886 ? 0.525  ms/op
```

Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Author: Lucas Bradstreet <lucas@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>, Lucas Bradstreet <lucas@confluent.io>

Closes #8199 from omkreddy/KAFKA-9626
2020-03-03 01:51:09 +05:30
Lucas Bradstreet 8966d066bd KAFKA-9039: Optimize ReplicaFetcher fetch path (#7443)
Improves the performance of the replica fetcher for high partition count fetch requests, where a majority of the partitions did not update between fetch requests. All benchmarks were run on an r5x.large.

Vanilla
Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 15 26491.825 ± 438.463 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 15 153941.952 ± 4337.073 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 15 339868.602 ± 4201.462 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 15 2588878.448 ± 22172.482 ns/op

From 100 to 5000 partitions the latency increase is 2588878.448 / 26491.825 = 97.

Avoid gettimeofdaycalls in steady state fetch states
8545888

Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 15 22685.381 ± 267.727 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 15 113622.521 ± 1854.254 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 15 273698.740 ± 9269.554 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 15 2189223.207 ± 1706.945 ns/op

From 100 to 5000 partitions the latency increase is 2189223.207 / 22685.381 = 97X

Avoid copying partition states to maintain fetch offsets
29fdd60

Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 15 17039.989 ± 609.355 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 15 99371.086 ± 1833.256 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 15 216071.333 ± 3714.147 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 15 2035678.223 ± 5195.232 ns/op

From 100 to 5000 partitions the latency increase is 2035678.223 / 17039.989 = 119X

Keep lag alongside PartitionFetchState to avoid expensive isReplicaInSync check
0e57e3e

Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 15 15131.684 ± 382.088 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 15 86813.843 ± 3346.385 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 15 193050.381 ± 3281.833 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 15 1801488.513 ± 2756.355 ns/op

From 100 to 5000 partitions the latency increase is 1801488.513 / 15131.684 = 119X

Fetch session optimizations (mostly presizing the next hashmap, and avoiding making a copy of sessionPartitions, as a deep copy is not required for the ReplicaFetcher)
2614b24

Benchmark (partitionCount) Mode Cnt Score Error Units
ReplicaFetcherThreadBenchmark.testFetcher 100 avgt 15 11386.203 ± 416.701 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 500 avgt 15 60820.292 ± 3163.001 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 1000 avgt 15 146242.158 ± 1937.254 ns/op
ReplicaFetcherThreadBenchmark.testFetcher 5000 avgt 15 1366768.926 ± 3305.712 ns/op

From 100 to 5000 partitions the latency increase is 1366768.926 / 11386.203 = 120

Reviewers: Jun Rao <junrao@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
2019-10-16 09:49:53 -07:00
Lucas Bradstreet f3ded39a05 KAFKA-8841; Reduce overhead of ReplicaManager.updateFollowerFetchState (#7324)
This PR makes two changes to code in the ReplicaManager.updateFollowerFetchState path, which is in the hot path for follower fetches. Although calling ReplicaManager.updateFollowerFetch state is inexpensive on its own, it is called once for each partition every time a follower fetch occurs.

1. updateFollowerFetchState no longer calls maybeExpandIsr when the follower is already in the ISR. This avoid repeated expansion checks. 
2. Partition.maybeIncrementLeaderHW is also in the hot path for ReplicaManager.updateFollowerFetchState. Partition.maybeIncrementLeaderHW calls Partition.remoteReplicas four times each iteration, and it performs a toSet conversion. maybeIncrementLeaderHW now avoids generating any intermediate collections when updating the HWM.

**Benchmark results for Partition.updateFollowerFetchState on a r5.xlarge:**
Old:
```
  1288.633 ±(99.9%) 1.170 ns/op [Average]
  (min, avg, max) = (1287.343, 1288.633, 1290.398), stdev = 1.037
  CI (99.9%): [1287.463, 1289.802] (assumes normal distribution)
```

New (when follower fetch offset is updated):
```
  261.727 ±(99.9%) 0.122 ns/op [Average]
  (min, avg, max) = (261.565, 261.727, 261.937), stdev = 0.114
  CI (99.9%): [261.605, 261.848] (assumes normal distribution)
```

New (when follower fetch offset is the same):
```
  68.484 ±(99.9%) 0.025 ns/op [Average]
  (min, avg, max) = (68.446, 68.484, 68.520), stdev = 0.023
  CI (99.9%): [68.460, 68.509] (assumes normal distribution)
```

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2019-09-18 09:11:39 -07:00