Commit Graph

7371 Commits

Author SHA1 Message Date
John Roesler 872694be4e Merge branch 'trunk' into temp-8436 2020-04-09 17:32:32 -05:00
A. Sophie Blee-Goldman 0470e2bc95
KAFKA-6145: KIP-441: fix flaky shouldEnforceRebalance test in StreamThreadTest (#8452)
Reviewers: Boyang Chen <boyang@confluent.io>, John Roesler <vvcephei@apache.org>
2020-04-09 17:28:44 -05:00
Colin Patrick McCabe bf6dffe93b
KAFKA-9309: Add the ability to translate Message classes to and from JSON (#7844)
Reviewers: David Arthur <mumrah@gmail.com>, Ron Dagostino <rdagostino@confluent.io>
2020-04-09 13:11:36 -07:00
Matthias J. Sax 73ec7304b9
KAFKA-9748: Extend Streams integration tests for EOS beta (#8441)
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2020-04-09 10:57:10 -07:00
SoontaekLim 179be72e30
KAFKA-9642: Change "BigDecimal(double)" constructor to "BigDecimal.valueOf(double)" (#8212)
Co-authored-by: Soontaek Lim <soontaek.lim@ultratendency.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-04-09 10:22:23 -07:00
Tom Bentley 371ad143a6
KAFKA-9691: Fix NPE by waiting for reassignment request (#8317)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Chia-Ping Tsai <chia7712@gmail.com>
2020-04-09 15:24:44 +01:00
Tom Bentley c84e6ab491
KAFKA-9433: Use automated protocol for AlterConfigs request and response (#8315)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Boyang Chen <boyang@confluent.io>
2020-04-09 14:59:25 +01:00
Dezhi “Andy” Fang 0fad944ca7
KAFKA-9583; Use topic-partitions grouped by node to send OffsetsForLeaderEpoch requests (#8077)
In `validateOffsetsAsync` in t he consumer, we group the requests by leader node for efficiency. The list of topic-partitions are grouped from `partitionsToValidate` (all partitions) to `node` => `fetchPostitions` (partitions by node). However, when actually sending the request with `OffsetsForLeaderEpochClient`, we use `partitionsToValidate`, which is the list of all topic-partitions passed into `validateOffsetsAsync`. This results in extra partitions being included in the request sent to brokers that are potentially not the leader for those partitions.

This PR fixes the issue by using `fetchPositions`, which is the proper list of partitions that we should send in the request. Additionally, a small typo of API name in `OffsetsForLeaderEpochClient` is corrected (it originally referenced `LisfOffsets` as the API name).

Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>
2020-04-08 23:46:45 -07:00
ableegoldman 1f0a36ba38 add guard 2020-04-08 18:37:59 -07:00
ableegoldman 4dd64a4422 remove import that snuck in again >:( 2020-04-08 18:34:09 -07:00
ableegoldman bf809a7ab0 fix test broken by rebase 2020-04-08 18:32:59 -07:00
ableegoldman 24983a84b6 rebased 2020-04-08 18:27:49 -07:00
A. Sophie Blee-Goldman ed3a7157e0
KAFKA-6145: KIP-441 Move tasks with caught-up destination clients right away (#8425)
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
2020-04-08 20:08:40 -05:00
Jason Gustafson 778b1e3f54
KAFKA-9835; Protect `FileRecords.slice` from concurrent write (#8451)
A read from the end of the log interleaved with a concurrent write can result in reading data above the expected read limit. In particular, this would allow a read above the high watermark. The root of the problem is consecutive calls to `sizeInBytes` in `FileRecords.slice` which do not account for an increase in size due to a concurrent write. This patch fixes the problem by using a single call to `sizeInBytes` and caching the result.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2020-04-08 11:31:27 -07:00
A. Sophie Blee-Goldman 98ea773a22
KAFKA-6145: KIP-441 Pt. 6 Trigger probing rebalances until group is stable (#8409)
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
2020-04-08 13:02:30 -05:00
maulin-vasavada 9ba49b806a
KAFKA-8890: Make SSL context/engine configuration extensible (KIP-519) (#8338) 2020-04-08 15:20:32 +01:00
Chia-Ping Tsai 833dc7725c
HOTFIX: exclude ConsumerCoordinator from NPathComplexity check (#8447)
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2020-04-08 13:00:03 +01:00
Guozhang Wang 94ef25ab91 KAFKA-9801: Still trigger rebalance when static member joins in CompletingRebalance phase (#8405)
Fix the direct cause of the observed issue on the client side: when heartbeat getting errors and resetting generation, we only need to set it to UNJOINED when it was not already in REBALANCING; otherwise, the join-group handler would throw the retriable UnjoinedGroupException to force the consumer to re-send join group unnecessarily.

Fix the root cause of the issue on the broker side: we should still trigger rebalance when static member joins in CompletingRebalance phase; otherwise the member.ids would be changed when the assignment is received from the leader, hence causing the new member.id's assignment to be empty.

Reviewers: Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>
2020-04-07 20:46:46 -07:00
David Jacot 4c9f5cf7c3
KAFKA-8107; Flaky Test kafka.api.ClientIdQuotaTest.testQuotaOverrideDelete (#8394)
Invoke `waitForQuotaUpdate` after the quotas are removed. It also changes
the default request quota to `Long.MaxValue`.

Reviewers: Anna Povzner <anna@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2020-04-07 13:32:16 -07:00
John Roesler 29e08fd2c2
KAFKA-8410: Part 1: processor context bounds (#8414)
Add type bounds to the ProcessorContext, which bounds the types that can be forwarded to child nodes.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2020-04-07 13:11:02 -05:00
David Jacot cd1e46c8bb
MINOR: Pass one action per unique resource name in KafkaApis.filterAuthorized (#8432)
90bbeedf52 introduced a regression resulting in passing an action per resource
name to the `Authorizer` instead of passing one per unique resource name. Refactor
the signatures of both `filterAuthorized` and `authorize` to make them easier to test
and add a test for each.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2020-04-07 06:26:18 -07:00
Matthias J. Sax 731630e866
KAFKA-9818: improve error message to debug test (#8423)
Reviewers: A. Sophie Blee Goldman <sophie@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2020-04-06 21:22:34 -07:00
Boyang Chen f1850162de
MINOR: Should cleanup the tasks after dirty close (#8433)
Some tasks get closed inside HandleAssignment and did not remove from the task manager bookkeep list. The next time they would be re-closed which is illegal state.

Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2020-04-06 20:44:14 -07:00
Jason Gustafson 9d965236a4
MINOR: Fix log cleaner offset range log message (#8435)
The upper limit offset is displayed incorrectly in the log cleaner summary message. For example:
```
Log cleaner thread 0 cleaned log __consumer_offsets-47 (dirty section = [358800359, 358800359])
```
We should be using the next dirty offset as the upper limit.

Reviewers: David Arthur <mumrah@gmail.com>
2020-04-06 17:11:12 -07:00
Rajini Sivaram 588e8a5be8
KAFKA-9815; Ensure consumer always re-joins if JoinGroup fails (#8420)
On metadata change for assigned topics, we trigger rebalance, revoke partitions and send JoinGroup. If metadata reverts to the original value and JoinGroup fails, we don't resend JoinGroup because we don't set `rejoinNeeded`. This PR sets `rejoinNeeded=true` when rebalance is triggered due to metadata change to ensure that we retry on failure.

Reviewers: Boyang Chen <boyang@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
2020-04-06 17:00:11 -07:00
Guozhang Wang 82dff1db54
KAFKA-9753: A few more metrics to add (#8371)
Instance-level:
* number of alive stream threads

Thread-level:
* avg / max number of records polled from the consumer per runOnce, INFO
* avg / max number of records processed by the task manager (i.e. across all tasks) per runOnce, INFO

Task-level:
* number of current buffered records at the moment (i.e. it is just a dynamic gauge), DEBUG.

Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <john@confluent.io>
2020-04-06 15:30:29 -07:00
Boyang Chen 712ac5203e
KAFKA-9793: Expand the try-catch for task commit in HandleAssignment (#8402)
As title suggests, we would like to broaden this check so that we don't fail to close a doom-to-cleanup task.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-04-05 22:50:26 -07:00
Boyang Chen 3b8ed0a194
KAFKA-9784: Add OffsetFetch to group concurrency test (#8383)
As title suggested, consumers would first do an OffsetFetch before starting the normal processing. It makes sense to add it to the concurrent test suite to verify whether there would be a blocking behavior.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-04-05 22:47:38 -07:00
Lucas Bradstreet 37990f9099
MINOR: fix inaccurate RecordBatchIterationBenchmark.measureValidation benchmark (#8428)
KAFKA-9820 (https://github.com/apache/kafka/pull/8422) added a benchmark of LogValidator.validateMessagesAndAssignOffsetsCompressed. Unfortunately it instantiated BrokerTopicStats within the benchmark itself, and it is expensive. The fixed benchmark does not change the outcome of the improvement in KAFKA-9820, and actually increases the magnitude of the improvement in percentage terms.

```
Updated benchmark before KAFKA-9820:
Benchmark                                                                     (bufferSupplierStr)  (bytes)  (compressionType)  (maxBatchSize)  (messageSize)  (messageVersion)   Mode  Cnt       Score      Error   Units
RecordBatchIterationBenchmark.measureValidation                                        NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15  164173.236 ± 2927.701   ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate                         NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15   20440.980 ±  364.411  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm                    NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15  137120.002 ±    0.002    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space                NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15   20708.378 ±  372.041  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space.norm           NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15  138913.935 ±  398.960    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen                   NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15       0.547 ±    0.107  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen.norm              NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15       3.664 ±    0.689    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.count                              NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15    2713.000             counts
RecordBatchIterationBenchmark.measureValidation:·gc.time                               NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15    1398.000                 ms
RecordBatchIterationBenchmark.measureValidation                                        NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15  164305.533 ± 5143.457   ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate                         NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15   20490.828 ±  641.408  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm                    NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15  137328.002 ±    0.002    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space                NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15   20767.922 ±  648.843  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space.norm           NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15  139185.616 ±  325.790    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen                   NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15       0.681 ±    0.053  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen.norm              NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15       4.560 ±    0.292    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.count                              NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15    3101.000             counts
RecordBatchIterationBenchmark.measureValidation:·gc.time                               NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15    1538.000                 ms
RecordBatchIterationBenchmark.measureValidation                                        NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15  169572.635 ±  595.613   ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate                         NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15   21129.934 ±   74.618  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm                    NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15  137216.002 ±    0.002    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space                NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15   21410.416 ±   70.458  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space.norm           NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15  139037.806 ±  309.278    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen                   NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15       0.312 ±    0.420  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen.norm              NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15       2.026 ±    2.725    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.count                              NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15    3398.000             counts
RecordBatchIterationBenchmark.measureValidation:·gc.time                               NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15    1701.000                 ms
JMH benchmarks done


Updated benchmark after KAFKA-9820:
Benchmark                                                                     (bufferSupplierStr)  (bytes)  (compressionType)  (maxBatchSize)  (messageSize)  (messageVersion)   Mode  Cnt       Score     Error   Units
RecordBatchIterationBenchmark.measureValidation                                        NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15  322678.586 ± 254.126   ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate                         NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15   20376.474 ±  15.326  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm                    NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15   69544.001 ±   0.001    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space                NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15   20485.394 ±  44.087  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space.norm           NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15   69915.744 ± 143.372    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen                   NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15       0.027 ±   0.002  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen.norm              NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15       0.091 ±   0.008    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.count                              NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15    3652.000            counts
RecordBatchIterationBenchmark.measureValidation:·gc.time                               NO_CACHING   RANDOM                LZ4               1           1000                 2  thrpt   15    1773.000                ms
RecordBatchIterationBenchmark.measureValidation                                        NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15  321332.070 ± 869.841   ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate                         NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15   20303.259 ±  55.609  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm                    NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15   69600.001 ±   0.001    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space                NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15   20394.052 ±  72.842  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space.norm           NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15   69911.238 ± 160.177    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen                   NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15       0.028 ±   0.003  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen.norm              NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15       0.096 ±   0.010    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.count                              NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15    3637.000            counts
RecordBatchIterationBenchmark.measureValidation:·gc.time                               NO_CACHING   RANDOM                LZ4               2           1000                 2  thrpt   15    1790.000                ms
RecordBatchIterationBenchmark.measureValidation                                        NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15  315490.355 ± 271.921   ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate                         NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15   19943.166 ±  21.235  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm                    NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15   69640.001 ±   0.001    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space                NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15   20020.263 ±  43.144  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Eden_Space.norm           NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15   69909.228 ± 136.413    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen                   NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15       0.026 ±   0.002  MB/sec
RecordBatchIterationBenchmark.measureValidation:·gc.churn.G1_Old_Gen.norm              NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15       0.090 ±   0.008    B/op
RecordBatchIterationBenchmark.measureValidation:·gc.count                              NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15    3571.000            counts
RecordBatchIterationBenchmark.measureValidation:·gc.time                               NO_CACHING   RANDOM                LZ4              10           1000                 2  thrpt   15    1764.000                ms
```

Reviewers: Ismael Juma <ismael@juma.me.uk>
2020-04-05 14:24:54 -07:00
Boyang Chen 0eab92012b
HOTFIX: fix compilation error (#8424)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Guozhang Wang <guozhang@confluent.io
2020-04-04 11:12:05 -07:00
Konstantine Karantasis 23b4d875d6
MINOR: Remove redundant braces from log message in FetchSessionHandler (#8410)
https://issues.apache.org/jira/browse/KAFKA-8889 attempted to fill in the missing stacktrace in the log message when handling errors in FetchSessionHandler#handleError

But the fix is not effective without KAFKA-7016

The current fix removes the redundant pair of braces {} at the end of the log message. If and when the Throwable that is passed as argument to this method has a stacktrace, the log message will include it. Currently it doesn't because the Throwable argument does not have a stacktrace.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
2020-04-04 10:22:41 -07:00
Lucas Bradstreet 46540eb5e0
KAFKA-9820: validateMessagesAndAssignOffsetsCompressed allocates unused iterator (#8422)
3e9d1c1411 introduced skipKeyValueIterator(s) which were intended to be used, but in this case were created but were not used in offset validation.

A subset of the benchmark results follow. Looks like a 20% improvement in validation performance and a 40% reduction in garbage allocation for 1-2 batch sizes.

**# Parameters: (bufferSupplierStr = NO_CACHING, bytes = RANDOM, compressionType = LZ4, maxBatchSize = 1, messageSize = 1000, messageVersion = 2)**

Before:
Result "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation":
  64851.837 ±(99.9%) 944.248 ops/s [Average]              
  (min, avg, max) = (64505.317, 64851.837, 65114.359), stdev = 245.218
  CI (99.9%): [63907.589, 65796.084] (assumes normal distribution)                                       
                                                             
"org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm":
  164088.003 ±(99.9%) 0.004 B/op [Average]                                                                                 
  (min, avg, max) = (164088.001, 164088.003, 164088.004), stdev = 0.001
  CI (99.9%): [164087.998, 164088.007] (assumes normal distribution)

After:

Result "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation":                                      
  78910.273 ±(99.9%) 707.024 ops/s [Average]                                                                               
  (min, avg, max) = (78785.486, 78910.273, 79234.007), stdev = 183.612                                                     
  CI (99.9%): [78203.249, 79617.297] (assumes normal distribution)                                       

"org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm":                                                                                                                                   
  96440.002 ±(99.9%) 0.001 B/op [Average]                                                                                  
  (min, avg, max) = (96440.002, 96440.002, 96440.002), stdev = 0.001                                                       
  CI (99.9%): [96440.002, 96440.003] (assumes normal distribution)   

 **# Parameters: (bufferSupplierStr = NO_CACHING, bytes = RANDOM, compressionType = LZ4, maxBatchSize = 2, messageSize = 1000, messageVersion = 2)**

Before:
Result "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation":                                      
  64815.364 ±(99.9%) 639.309 ops/s [Average]                                                                               
  (min, avg, max) = (64594.545, 64815.364, 64983.305), stdev = 166.026                                                                                                                                                                                
  CI (99.9%): [64176.056, 65454.673] (assumes normal distribution)                                                         
                                                                                                                                                                                        "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm":        
  163944.003 ±(99.9%) 0.001 B/op [Average]                                                                                 
  (min, avg, max) = (163944.002, 163944.003, 163944.003), stdev = 0.001                                                    
  CI (99.9%): [163944.002, 163944.004] (assumes normal distribution)                                     

After:
Result "org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation":
  77075.096 ±(99.9%) 201.092 ops/s [Average]              
  (min, avg, max) = (77021.537, 77075.096, 77129.693), stdev = 52.223
  CI (99.9%): [76874.003, 77276.188] (assumes normal distribution)                                       
                                                             
"org.apache.kafka.jmh.record.RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm":
  96504.002 ±(99.9%) 0.003 B/op [Average]                                                                                  
  (min, avg, max) = (96504.001, 96504.002, 96504.003), stdev = 0.001
  CI (99.9%): [96503.999, 96504.005] (assumes normal distribution)

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2020-04-04 10:05:51 -07:00
Matthias J. Sax ab5e4f52ec
MINOR: Refactor StreamsProducer (#8380)
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Andrew Choi <a24choi@edu.uwaterloo.ca>
2020-04-03 19:17:57 -07:00
Jason Gustafson d4eb406f01
KAFKA-9807; Protect LSO reads from concurrent high-watermark updates (#8418)
If the high-watermark is updated in the middle of a read with the `read_committed` isolation level, it is possible to return data above the LSO. In the worst case, this can lead to the read of an aborted transaction. The root cause is that the logic depends on reading the high-watermark twice. We fix the problem by reading it once and caching the value.

Reviewers: David Arthur <mumrah@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2020-04-03 13:56:42 -07:00
Konstantine Karantasis 8595267260
KAFKA-9810: Document Connect Root REST API on / (#8408)
Document the supported endpoint at the top-level (root) REST API resource and the information that it returns when a request is made to a Connect worker.

Fixes an omission in documentation after KAFKA-2369 and KAFKA-6311 (KIP-238)

Reviewers: Toby Drake <tobydrake7@gmail.com>, Soenke Liebau <soenke.liebau@opencore.com>
2020-04-03 13:27:27 -07:00
A. Sophie Blee-Goldman 6e0d553350
MINOR: clean up Streams assignment classes and tests (#8406)
First set of cleanup pushed to followup PR after KIP-441 Pt. 5. Main changes are:

1. Moved `RankedClient` and the static `buildClientRankingsByTask` to a new file
2. Moved `Movement` and the static `getMovements` to a new file (also renamed to `TaskMovement`)
3. Consolidated the many common variables throughout the assignment tests to the new `AssignmentTestUtils` 
4. New utility to generate comparable/predictable UUIDs for tests, and removed the generic from `TaskAssignor` and all related classes

Reviewers: John Roesler <vvcephei@apache.org>, Andrew Choi <a24choi@edu.uwaterloo.ca>
2020-04-03 13:53:51 -05:00
Jason Gustafson 62dcfa196e
KAFKA-9750; Fix race condition with log dir reassign completion (#8412)
There is a race on receiving a LeaderAndIsr request for a replica with an active log dir reassignment. If the reassignment completes just before the LeaderAndIsr handler updates epoch information, it can lead to an illegal state error since no future log dir exists. This patch fixes the problem by ensuring that the future log dir exists when the fetcher is started. Removal cannot happen concurrently because it requires access the same partition state lock.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>

Co-authored-by: Chia-Ping Tsai <chia7712@gmail.com>
2020-04-03 11:51:04 -07:00
Tom Bentley 9a154c6505 KAFKA-9775: Fix IllegalFormatConversionException in ToolsUtils
The runtime type of Metric.metricValue() needn't always be a Double,
for example, if it's a gauge from IntGaugeSuite.
Since it's impossible to format non-double values with 3 point precision
IllegalFormatConversionException resulted.

Author: Tom Bentley <tbentley@redhat.com>
Author: Tom Bentley <tombentley@users.noreply.github.com>

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #8373 from tombentley/KAFKA-9775-IllegalFormatConversionException
2020-04-03 22:34:44 +05:30
John Roesler 726a7d5de2
KAFKA-9812: fix infinite loop in test code (#8411)
Reviewers: Boyang Chen <boyang@confluent.io>
2020-04-03 11:29:34 -05:00
Bill Bejeck 9783b85fdd
KAFKA-9739: Fixes null key changing child node (#8400)
For some context, when building a streams application, the optimizer keeps track of the key-changing operations and any repartition nodes that are descendants of the key-changer. During the optimization phase (if enabled), any repartition nodes are logically collapsed into one. The optimizer updates the graph by inserting the single repartition node between the key-changing node and its first child node. This graph update process is done by searching for a node that has the key-changing node as one of its direct parents, and the search starts from the repartition node, going up in the parent hierarchy.

The one exception to this rule is if there is a merge node that is a descendant of the key-changing node, then during the optimization phase, the map tracking key-changers to repartition nodes is updated to have the merge node as the key. Then the optimization process updates the graph to place the single repartition node between the merge node and its first child node.

The error in KAFKA-9739 occurred because there was an assumption that the repartition nodes are children of the merge node. But in the topology from KAFKA-9739, the repartition node was a parent of the merge node. So when attempting to find the first child of the merge node, nothing was found (obviously) resulting in StreamException(Found a null keyChangingChild node for..)

This PR fixes this bug by first checking that all repartition nodes for optimization are children of the merge node.

This PR includes a test with the topology from KAFKA-9739.

Reviewers: John Roesler <john@confluent.io>
2020-04-03 12:04:44 -04:00
Boyang Chen 7f640f13b4
KAFKA-9776: Downgrade TxnCommit API v3 when broker doesn't support (#8375)
Revert the decision for the sendOffsetsToTransaction(groupMetadata) API to fail with old version of brokers for the sake of making the application easier to adapt between versions. This PR silently downgrade the TxnOffsetCommit API when the build version is small than 3.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2020-04-02 21:48:37 -07:00
Boyang Chen 6ddbf4d800
KAFKA-9809: Shrink transaction timeout for streams (#8407)
As documented in the KIP:

We shall set `transaction.timout.ms` default to 10000 ms (10 seconds) on Kafka Streams. 

Reviewer: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2020-04-02 19:53:14 -07:00
Daniel dfdb49f26a
KAFKA-9778: Add methods to validate and assert connector configurations in integration tests with EmbeddedConnectCluster (#8359)
* Add validateConnector functionality to the EmbeddedConnectCluster
* PR Revision - added ConnectException conversion, validateConnectorConfig calls to ExampleConnectIntegrationTest
* PR revision - Added method to EmbeddedConnectClusterAssertions

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2020-04-02 16:52:56 -07:00
A. Sophie Blee-Goldman 2322bc0a6f
KAFKA-6145: Pt. 5 Implement high availability assignment (#8337)
Adds a new TaskAssignor implementation, currently hidden behind an internal feature flag, that implements the high availability algorithm of KIP-441.

Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
2020-04-02 13:36:03 -05:00
Ron Dagostino f0ad03069a
MINOR: System test ZooKeeper upgrades (#8384)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2020-04-02 23:23:48 +05:30
Michal T 86a3ebe537
MINOR: Fix typo in version 2.4.1 of kafka folder in Dockerfile (#8393) 2020-04-01 17:56:47 -07:00
Guozhang Wang a2092ecd7a HOTFIX: remove redundant check for QueryableStateIntegrationTest 2020-04-01 16:16:13 -07:00
Matthias J. Sax cc59150f40
KAFKA-9748: extend EosIntegrationTest for EOS-beta (#8331)
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2020-04-01 13:20:27 -07:00
José Armando García Sancio fee5f0e50f
MINOR: Rename TopicCommandTest (#8398)
Rename the test suite to later add unit tests that don't depend on
ZK or the AdminClient TopiCommand types.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2020-04-01 12:35:41 -07:00
Bruno Cadonna e6a6b69acf
MINOR: Improve close tests of caching state store (#8386)
Reviewers: John Roesler <vvcephei@apache.org>, Matthias J. Sax <matthias@confluent.io>, Andrew Choi <a24choi@edu.uwaterloo.ca>
2020-04-01 12:42:16 -05:00