Compare commits


254 Commits
main ... 1.2

Author SHA1 Message Date
Yu Ning d7a3768e7f
refactor(s3stream/object-wal): complete appends sequentially (#2551)
refactor(s3stream/object-wal): complete appends sequentially (#2426)

* chore: add todos

* refactor(s3stream/object-wal): sequentially succeed

* refactor(s3stream/object-wal): drop discontinuous objects during recovery

* test: introduce `MockObjectStorage`

* test: test sequentially succeed

* refactor: record endOffset in the object path

* feat: different version of wal object header

* refactor: adapt to the new object header format

* feat: recover from the trim offset

* test: test recover continuous objects from trim offset

* test: test marshal and unmarshal wal object header

* test: fix tests

* test: test recover from discontinuous objects

* test: test recover from v0 and v1 objects

* style: fix lint

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
2025-05-15 10:29:15 +08:00
Yu Ning 37edae9802
fix(s3stream/wal): fix incorrect offset return value during recovery (#2546)
Signed-off-by: Ning Yu <ningyu@automq.com>
2025-05-14 17:09:15 +08:00
Shichao Nie a8861f6a81
fix(s3stream): add delayed deletion for S3 WAL (#2532)
Signed-off-by: Shichao Nie <niesc@automq.com>
2025-05-13 16:58:53 +08:00
lifepuzzlefun 235ad5da42
fix(s3stream): skip waiting for pending part on release (#2316) (#2403)
Signed-off-by: Shichao Nie <niesc@automq.com>
(cherry picked from commit c330e59073)

Co-authored-by: Shichao Nie <niesc@automq.com>
2025-04-03 11:57:29 +08:00
Xu Han@AutoMQ 811546efbe
fix(action): fix release action (#2325)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2025-02-19 21:02:17 +08:00
Xu Han@AutoMQ f0266fc7e6
fix(action): try fix docker build (#2321)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2025-02-19 20:14:44 +08:00
Xu Han@AutoMQ f7a1addb36
fix(action): downgrade qemu (#2320)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2025-02-19 19:43:42 +08:00
Xu Han@AutoMQ 60bd7b8e84
fix(action): try fix docker build with update platform (#2318)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2025-02-19 18:22:44 +08:00
Xu Han@AutoMQ a76659d195
fix(action): add upload endpoint (#2317)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2025-02-19 17:37:27 +08:00
Xu Han@AutoMQ 2c8ef90da0
fix(action): change upload bucket (#2314)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2025-02-19 14:49:37 +08:00
Xu Han@AutoMQ 720ebf8461
feat(release): bump version to 1.2.2 (#2293)
release(project): bump version to 1.2.2

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2025-02-12 15:31:20 +08:00
Shichao Nie a68236a726
fix(s3stream): fix compaction block on upload exception (#2265)
Signed-off-by: Shichao Nie <niesc@automq.com>
2025-01-10 10:21:18 +08:00
Xu Han@AutoMQ 2e50f79bf5 chore(config): change s3.stream.object.compaction.max.size.bytes default value from 1GiB to 10GiB (#2249)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2025-01-02 17:38:36 +08:00
Robin Han 2e229a4865 feat(gradle): bump version to 1.2.2-rc0
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-12-30 10:57:50 +08:00
Shichao Nie 0c5a2dbfa6
fix(core): fix potential infinite recursion on reading empty segment (#2231)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-12-17 14:02:58 +08:00
Shichao Nie 370f947fb6
feat(telemetry): support gzip compression on uploading metrics & logs… (#2224)
feat(telemetry): support gzip compression on uploading metrics & logs to s3

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-12-13 16:39:49 +08:00
Xu Han@AutoMQ 432c0dcfc7
fix(docker): use bucket url (#2214)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-12-05 18:48:38 +08:00
Xu Han@AutoMQ 78678acd93
fix(issues2193): retry 2 times to cover most of BlockNotContinuousException (#2196)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-11-29 19:59:04 +08:00
Yu Ning 4ddc5c245f
fix: use the "adjusted" `maxSize` in `ElasticLogSegment#readAsync` (#2191)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-11-28 10:55:06 +08:00
Yu Ning 63769c9431
fix: release `PooledMemoryRecords` if it's dropped in the fetch session (#2188)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-11-28 10:53:46 +08:00
Yu Ning 2c50200635
fix(stream): release `FetchResult`s if the subsequent fetch fails (#2175)
fix(stream): release `FetchResult`s if the subsequent fetch fails (#2172)

* fix(stream): release `FetchResult`s if the subsequent fetch fails

* revert: "fix(stream): release `FetchResult`s if the subsequent fetch fails"

This reverts commit 5836a6afa0.

* refactor: add the `FetchResult` into the list in order rather than in reverse order

* fix: release `FetchResult`s if failed to fetch

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
2024-11-22 17:29:00 +08:00
Yu Ning 7d5a108a1a
chore(github): update code owners (#2157)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-11-13 17:33:05 +08:00
Xu Han@AutoMQ 94d71646f3
feat(version): bump version to 1.2.1 (#2154)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-11-11 21:48:23 +08:00
Shichao Nie 8638fb083f
fix(issue2140): remove override equals and hashCode method for Object… (#2149)
fix(issue2140): remove override equals and hashCode method for ObjectReader
close #2140

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-11-08 15:33:15 +08:00
Shichao Nie ed6bddc6b0
fix(issue2139): add computeIfAbsent atomic operation to AsyncLRUCache (#2146)
close #2139

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-11-08 15:05:42 +08:00
Shichao Nie 6fffb154da
fix(issue2139): prevent read object info from closed ObjectReader (#2142)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-11-08 11:51:21 +08:00
Shichao Nie c098ee34f7
fix(compaction): prevent double release on compaction shutdown (#2117)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-11-05 11:22:17 +08:00
Xu Han@AutoMQ 40da95bbc9
fix(s3stream): wait force upload complete before return (#2114)
Signed-off-by: Shichao Nie <niesc@automq.com>
Co-authored-by: Shichao Nie <niesc@automq.com>
2024-11-03 22:31:15 +08:00
Shichao Nie b9238c8f0a
fix(issue2108): avoid blocking at the end of a compaction iteration when there are un-uploaded data (#2110)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-11-03 16:14:46 +08:00
Shichao Nie 6c6aa9669d
fix(auto_balancer): fix mistakenly reused reference (#2103)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-10-30 15:31:34 +08:00
Yu Ning 2fcc3b2825
chore: bump version to 1.2.1-rc2 (#2099)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-10-29 13:29:31 +08:00
Yu Ning 047560c7cf
fix(sasl): fix sasl configs (#2096)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-10-29 12:29:40 +08:00
Shichao Nie 3eedc14642
fix(core): fix time unit (#2091)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-10-22 16:58:59 +08:00
Xu Han@AutoMQ 52601d42f9
feat(version): bump version to 1.2.1-rc1 (#2090)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-10-21 18:05:46 +08:00
Gezi-lzq 3a88d60a37
chore: remove log and add topic check method (#2086) (#2088) 2024-10-21 11:45:42 +08:00
Xu Han@AutoMQ 35f5687c3f
feat(version): bump version (#2083)
feat: bump version

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-10-18 11:13:30 +08:00
Yu Ning a4b641bc5f
perf(kraft): decrease the index interval bytes of KRaft Log from 1MB to 4KB (#2079)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-10-17 10:14:12 +08:00
Yu Ning 84d6a95b6d
fix(config): update example config "listeners" for brokers (#2073)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-10-15 20:43:25 +08:00
Yu Ning 94e1efbfd1
fix(scripts): add single quotes around the env values (#2071)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-10-15 20:23:33 +08:00
Yu Ning 04d33f9f6b
fix(scripts): fix "advertised.listeners" when deploying cluster (#2069)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-10-15 20:05:34 +08:00
Yu Ning 57ca9dd923
fix(scripts): fix typo (#2067)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-10-15 19:48:20 +08:00
Yu Ning bff5144040
fix(scripts): start controller node in daemon mode (#2064)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-10-15 16:29:23 +08:00
Xu Han@AutoMQ 34d3e18145
feat(s3stream): add read timeout (#2059)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-10-11 14:11:38 +08:00
Xu Han@AutoMQ ecb4fc2b17
chore(version): bump version to 1.2.0 (#2045)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-10-08 10:45:07 +08:00
Xu Han@AutoMQ c8a0b17f01
fix(issues2038): fix timestamp to offset not found (#2039) (#2041)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-09-26 20:24:27 +08:00
Xu Han@AutoMQ f22830a554
fix(metadata): fix stream endoffset set back (#2042)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-09-26 20:24:10 +08:00
Shichao Nie 777cac913e
fix(metrics): fix potential NPE when exporting metrics (#2030)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-19 15:23:46 +08:00
Xu Han@AutoMQ daa4fdc7f0
feat(version): bump version to 1.2.0-rc1 (#2025)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-09-14 10:44:23 +08:00
Shichao Nie d26f7b35d7
fix(core): remove offset metrics when expired (#2022)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-12 17:30:46 +08:00
Shichao Nie 162aac0cd3
fix(core): rsp immediately for catch-up read even if rst is not enough (#2019)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-12 12:29:05 +08:00
Shichao Nie d58a74b0bb
feat(core): revert rack aware assignment on broker fence (#2017)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-11 17:12:42 +08:00
Yu Ning 26a4e1d151
perf(DelayedFetch): only try to fast read on complete a delayed fetch (#2013)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-09-11 11:19:15 +08:00
SSpirits d5956cfa00
fix(core): release all records before delayed fetch (#2009)
* fix(core): release all records before delayed fetch

Signed-off-by: SSpirits <admin@lv5.moe>

* fix(core): release all records before delayed fetch

Signed-off-by: SSpirits <admin@lv5.moe>

---------

Signed-off-by: SSpirits <admin@lv5.moe>
2024-09-10 21:23:42 +08:00
Shichao Nie 2046e120b4
fix(issue2004): fix AutoBalancerMetricsReporter cannot process with t… (#2006)
fix(issue2004): fix AutoBalancerMetricsReporter cannot process with topic name contains period

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-10 15:50:06 +08:00
Xu Han@AutoMQ 29db5e2273
feat(version): bump to 1.2.0-rc0 (#2003)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-09-10 14:11:41 +08:00
Xu Han@AutoMQ 1b2bb89f24
fix(issues1999): fix Processor.channelContexts memory leak (#2002)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-09-09 16:30:25 +08:00
Yu Ning 9304c2e92a
chore(util): implement `IdURI#toString` (#1997)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-09-09 09:55:43 +08:00
Yu Ning fb369586f1
chore: beautify `ElasticLogMeta` in logs (#1996)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-09-06 16:16:19 +08:00
Shichao Nie 4bf41ff720
fix(auto_balancer): fix broker status change (#1992)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-05 11:58:57 +08:00
Shichao Nie 4b475757bb
feat(auto_balancer): limit the maximum time to execute actions (#1990)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-04 15:42:37 +08:00
Shichao Nie a61a45ae11
fix(auto_balancer): fix string format (#1988)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-04 10:41:56 +08:00
Shichao Nie ef6c7f278f
feat(core): default configuration optimization (#1980)
1. decrease the default max stream number per sso config to 20k, which corresponds to 15k streams per node as the recommended upper limit
2. increase the single batch record size limit to 50k for the worst case of force-splitting a sso with 20k streams

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-09-02 15:27:09 +08:00
SSpirits 4dfcedeb9c
fix(s3stream): fix namespace (#1978)
Signed-off-by: SSpirits <admin@lv5.moe>
2024-09-02 12:16:57 +08:00
Yu Ning 29dac86972
fix(log): use the same view to calculate trim offsets (#1972)
* fix(log): use the same view to calculate trim offsets

Signed-off-by: Ning Yu <ningyu@automq.com>

* fix: calculate trim offsets in the lock

Signed-off-by: Ning Yu <ningyu@automq.com>

* perf: use `Iterator` to reduce the overhead caused by passing intermediate data

Signed-off-by: Ning Yu <ningyu@automq.com>

* perf: merge `calTrimOffset` and `calStreamsMinOffset`

Signed-off-by: Ning Yu <ningyu@automq.com>

* test: fix tests

Signed-off-by: Ning Yu <ningyu@automq.com>

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-31 12:56:22 +08:00
Xu Han@AutoMQ 7c9d3c205d
fix(issues1960): allow UNSET attributes for version 1.1 (#1961)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-29 11:03:47 +08:00
Shichao Nie 625de6dab0
fix(s3stream): change reject handler to prevent task lost (#1957)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-28 15:31:13 +08:00
SSpirits f655c7ab95
fix(s3stream): fix buffer leak (#1949)
Signed-off-by: SSpirits <admin@lv5.moe>
2024-08-26 18:15:24 +08:00
SSpirits 74be37e047
fix(s3stream): resolve unintended buffer reuse (#1946)
Signed-off-by: SSpirits <admin@lv5.moe>
2024-08-26 14:53:31 +08:00
Yu Ning 14e22afe9c
docs(s3stream/version): statement of features supported by S3Stream V1 (#1942)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-26 12:13:14 +08:00
SSpirits 35950e93bd
feat(s3stream): add checksum for s3 wal (#1940)
Signed-off-by: SSpirits <admin@lv5.moe>
2024-08-25 11:32:25 +08:00
Xu Han@AutoMQ 4da7a9c536
feat(s3stream): ensure compaction could be executed (#1935)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-24 16:45:14 +08:00
Shichao Nie 531c04592c
feat(metadata): blocking wait for uploading local index cache on stre… (#1932)
feat(metadata): blocking wait for uploading local index cache on stream close

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-23 18:43:17 +08:00
Shichao Nie 07a0c88bdd
feat(auto_balancer): limit the max number of nodes to reassign partit… (#1934)
* feat(auto_balancer): limit the max number of nodes to reassign partitions to same broker

Signed-off-by: Shichao Nie <niesc@automq.com>

* feat(auto_balancer): rename constant variable

Signed-off-by: Shichao Nie <niesc@automq.com>

---------

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-23 18:37:44 +08:00
Shichao Nie 0f0c06cf39
feat(metadata): expire node sparse index cache after write (#1926)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-23 15:09:55 +08:00
Robin Han 04af131b1e
fix(test): remove e2e_6 dependency
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-23 15:04:05 +08:00
Xu Han@AutoMQ 034f04b7ec
chore(test): set runner to e2e (#1928)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-23 15:00:03 +08:00
SSpirits 44b8e7296b
feat(s3stream): limit the inflight fast retry request count (#1924)
Signed-off-by: SSpirits <admin@lv5.moe>
2024-08-23 11:47:50 +08:00
Xiang Chen 76bea9167b
fix(e2e): fix bug (#1923) 2024-08-23 11:23:17 +08:00
Xu Han@AutoMQ f0263f163b
feat(s3stream): unify object not exist exception (#1921)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-23 11:20:10 +08:00
Yu Ning 77b525e007
perf(config): increase the default index interval from 4KiB to 1MiB (#1914)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-22 21:05:30 +08:00
Shichao Nie 98a4651cb4
fix(metadata): fix interrupted batch upload (#1912)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-22 20:17:39 +08:00
Xu Han@AutoMQ e9e9ad76eb
fix(test): set dev version to 3.8.0 (#1911)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-22 18:51:08 +08:00
SSpirits 49c85a8add
feat(s3stream): add docs for checksumAlgorithm configuration (#1910)
Signed-off-by: SSpirits <admin@lv5.moe>
2024-08-22 18:43:04 +08:00
SSpirits 265159305e
feat(s3stream): support change checksum algorithm (#1909)
feat(s3stream): change the checksum algorithm to crc32c  (#1860)

* feat(s3stream): change the checksum algorithm to crc32c

Signed-off-by: SSpirits <admin@lv5.moe>

* feat(s3stream): introduce aws crt

Signed-off-by: SSpirits <admin@lv5.moe>

* feat(s3stream): completely disable the MD5 checksum

Signed-off-by: SSpirits <admin@lv5.moe>

* feat(s3stream): support change checksum algorithm

Signed-off-by: SSpirits <admin@lv5.moe>

* fix(s3stream): make spotbugs happy

Signed-off-by: SSpirits <admin@lv5.moe>

---------

Signed-off-by: SSpirits <admin@lv5.moe>
(cherry picked from commit ae752b60e2)
2024-08-22 18:23:02 +08:00
Shichao Nie a3e744e112
fix(metadata): fix wrong stream number in serialization (#1907)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-22 17:59:33 +08:00
Shichao Nie 2d4e6ccc8f
fix(metadata): prevent concurrent uploading of local sparse index cache (#1905)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-22 17:52:34 +08:00
Shichao Nie 2532875b9e
feat(metadata): add metrics for sparse index cache monitor (#1903)
* feat(metadata): add metrics for sparse index cache monitor

Signed-off-by: Shichao Nie <niesc@automq.com>

* feat(metadata): record objects to search metrics from sparse index only

Signed-off-by: Shichao Nie <niesc@automq.com>

---------

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-22 17:21:28 +08:00
Shichao Nie 50b053f587
fix(metadata): fix evict sparse index cache potential endless loop (#1901)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-22 15:16:39 +08:00
Shichao Nie f1972d5f3f
fix(metadata): fix sparse index leak on compaction split (#1895)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-22 15:13:03 +08:00
Curtis Wan f6e69bdc7b
fix(e2e): fix data map format (#1899)
Signed-off-by: Curtis Wan <wcy9988@163.com>
2024-08-22 13:59:23 +08:00
Yu Ning 1c7116013e
fix(s3stream/wal): fix memory leak (#1897)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-22 12:46:36 +08:00
SSpirits 469cfcb259
perf(s3stream): optimize the critical section in s3 wal (#1880) (#1893)
* perf(s3stream): optimize the critical section in s3 wal

Signed-off-by: SSpirits <admin@lv5.moe>

* fix(s3stream): fix check style

Signed-off-by: SSpirits <admin@lv5.moe>

* feat(s3stream): optimize

Signed-off-by: SSpirits <admin@lv5.moe>

* fix(s3stream): fix resource leak in test

Signed-off-by: SSpirits <admin@lv5.moe>

---------

Signed-off-by: SSpirits <admin@lv5.moe>
(cherry picked from commit 3bbaf81092)
2024-08-22 11:48:02 +08:00
Xu Han@AutoMQ 11a5262b08
chore(logger): change e2e streams log level to INFO (#1890)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-22 11:39:18 +08:00
Xiang Chen 6be6a1d834
fix(e2e): fix bug (#1887)
* fix(e2e): fix bug

* fix(e2e): fix bug
2024-08-22 11:08:21 +08:00
Xu Han@AutoMQ 4831c539bc
fix(test): fix consumer mode e2e by increase timeout (#1886)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-21 18:47:21 +08:00
Yu Ning dcfa29f540
perf(s3stream/wal): reuse the `ByteBuf` for record headers (#1877) (#1878)
* perf(s3stream/wal): reuse the `ByteBuf` for record headers (#1877)

* refactor: manage the record headers' lifecycle in `Block`

Signed-off-by: Ning Yu <ningyu@automq.com>

* perf(s3stream/wal): reuse the `ByteBuf` for record headers

Signed-off-by: Ning Yu <ningyu@automq.com>

* perf: remove the max size limit

Signed-off-by: Ning Yu <ningyu@automq.com>

* test: test `FixedSizeByteBufPool`

Signed-off-by: Ning Yu <ningyu@automq.com>

* revert: "perf: remove the max size limit"

This reverts commit ed6311210a.

* feat: use a separate `poolSize` to limit the size of the pool

Signed-off-by: Ning Yu <ningyu@automq.com>

---------

Signed-off-by: Ning Yu <ningyu@automq.com>

* perf: increase the size of the buffer pool as CPU cores

Signed-off-by: Ning Yu <ningyu@automq.com>

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-21 11:21:52 +08:00
Yu Ning 24c8c921cb
chore(tools/perf): increase the message sending rate during the warmup to accelerate JVM warmup (#1881)
* perf: increase the message sending rate during the warmup to accelerate JVM warmup

Signed-off-by: Ning Yu <ningyu@automq.com>

* chore: rename "catchup-rate" to "send-rate-during-catchup"

Signed-off-by: Ning Yu <ningyu@automq.com>

* fix: fix compile error

Signed-off-by: Ning Yu <ningyu@automq.com>

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-21 11:21:40 +08:00
Yu Ning 80e9d3113b
perf(s3stream): check the logger level before `trace` and `debug` (#1876)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-20 15:57:16 +08:00
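The entry above refers to the standard SLF4J level-guard pattern. A minimal generic sketch of that pattern (illustrative only, not the actual AutoMQ diff; class and method names are made up for the example):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogGuardExample {
    private static final Logger LOGGER = LoggerFactory.getLogger(LogGuardExample.class);

    void onAppend(long offset, byte[] payload) {
        // Without a guard, the argument expressions (e.g. building a dump string)
        // are evaluated even when TRACE/DEBUG is disabled, which costs CPU on hot paths.
        if (LOGGER.isTraceEnabled()) {
            LOGGER.trace("appended record at offset {} size {}", offset, payload.length);
        }
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("append state: {}", expensiveDump());
        }
    }

    private String expensiveDump() {
        // stands in for any costly string construction that should be skipped
        // when the corresponding log level is disabled
        return "...";
    }
}
```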
Shichao Nie ff927d237c
fix(metadata): catch exception to prevent unnecessary error log (#1873)
fix(metadata): prevent unnecessary error log

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-20 15:24:39 +08:00
Yu Ning b0c16eec8d
revert: enable `LocalStreamRangeIndexCacheTest#testEvict` (#1870)
* revert: enable `LocalStreamRangeIndexCacheTest#testEvict`

Signed-off-by: Ning Yu <ningyu@automq.com>

* test: increase timeout

Signed-off-by: Ning Yu <ningyu@automq.com>

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-20 14:30:52 +08:00
Yu Ning d3e00b3339
fix(storage): regard all produce requests as duplicate whose sequence… (#1868)
fix(storage): regard all produce requests as duplicate whose sequence… (#1865)

* fix(storage): regard all produce requests as duplicate whose sequence number is less than the 5 retained batches

* test: disable an unstable test

* test: update kafka tests

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-20 11:24:57 +08:00
Curtis Wan d38797351a
fix(e2e): always archive result artifacts (#1867)
* fix(e2e): always archive result artifacts

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix(e2e): remove other setting

Signed-off-by: Curtis Wan <wcy9988@163.com>

---------

Signed-off-by: Curtis Wan <wcy9988@163.com>
2024-08-20 10:18:48 +08:00
Curtis Wan 5eb9b319a6
feat(e2e): use cloud vms for e2e tests; move wal path (#1864)
feat(e2e): use cloud vms for e2e tests; move wal path (#1862)

* refactor(e2e): test running on cloud provider

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: write /etc/docker/daemon.json if needed

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: write /etc/docker/daemon.json if needed

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: remove docker setup

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: change test py

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: test artifacts

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: fix path

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: fix path

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: report results

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: copy only valid files

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: sum in e2e-run

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: fix space

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* wip: fix NULL

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* fix json

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* remove unneeded inputs

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* move wal location; enhance upload-artifact; roll back input options

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* add java-setup

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* remove test yml

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* only retain reports

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* filter archive files

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* filter archive files

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

* remove test yml

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>

---------

Signed-off-by: Curtis Wan <wanchaoyi@automq.com>
(cherry picked from commit 96155c1a05)
2024-08-19 20:50:50 +08:00
SSpirits ea1c7b28dc
feat(s3stream): add network rate limiter for s3 wal to export metrics (#1861)
Signed-off-by: SSpirits <admin@lv5.moe>
2024-08-19 19:54:23 +08:00
Shichao Nie da926d8f84
feat(metadata): prune invalid sparse index with image on startup (#1857)
* feat(metadata): prune invalid sparse index with image on startup

Signed-off-by: Shichao Nie <niesc@automq.com>

* feat(metadata): address comments

Signed-off-by: Shichao Nie <niesc@automq.com>

---------

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-19 16:57:43 +08:00
Xu Han@AutoMQ 06f7485a00
feat(s3stream): support call #close(force=true) after #close(force=false) (#1855)
feat(s3stream): support call #close(force=true) after #close(force=false) (#1854)

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-19 15:46:08 +08:00
Shichao Nie 47bda7ff49
feat(metadata): limit sparse index cache update interval (#1850)
* feat(metadata): limit sparse index cache update interval

Signed-off-by: Shichao Nie <niesc@automq.com>

* fix(metadata): clear lru cache with pop

Signed-off-by: Shichao Nie <niesc@automq.com>

---------

Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-19 15:21:00 +08:00
Shichao Nie 91c3b737ef
feat(metadata): avoid serialize empty index (#1853)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-19 15:12:14 +08:00
Xu Han@AutoMQ 369171622f
fix(revert): error callback (#1849)
Revert "fix(s3stream): error callback (#1846)"

This reverts commit a8503396b0.
2024-08-19 15:03:28 +08:00
Xu Han@AutoMQ a8503396b0
fix(s3stream): error callback (#1846)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-19 12:40:05 +08:00
Xu Han@AutoMQ 283b1bcb06
feat(issues1842): cleanup metastream kv after deleting topic (#1844)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-19 11:59:21 +08:00
Shichao Nie 2214e8692b
feat(metadata): limit the size of sparse index cache (#1833)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-19 09:39:38 +08:00
Yu Ning 34ba11ecda
fix(s3stream/wal): fail all IO operations once the WAL is failed (#1841)
fix(s3stream/wal): fail all IO operations once the WAL is failed (#1840)

* refactor: remove `WALChannel#writeAndFlush`

* refactor: introduce `WALChannel#retry`

* refactor: introduce `AbstractWALChannel`

* fix(s3stream/wal): fail all IO operations once the WAL is failed

* refactor: check failed before each IO operation

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-18 22:51:46 +08:00
Xu Han@AutoMQ a3f3c9f2c8
fix(s3stream): fix storage failure handle deadlock (#1839)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-16 18:54:14 +08:00
Shichao Nie edcc8385bb
feat(s3stream): add AsyncLRUCache metric (#1829) (#1837)
Co-authored-by: lifepuzzlefun <wjl_is_213@163.com>
2024-08-16 18:50:07 +08:00
Xu Han@AutoMQ a09399e7a1
fix(test): cherry-pick #1830 #1835 (#1836)
* chore(workflows): add e2e base image release (#1830)

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>

* fix(test): fix rolling_update_test e2e (#1835)

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>

---------

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-16 18:18:39 +08:00
Yu Ning 0c2a2a3d75
fix(s3stream/wal): handle `IOException` during flushing WAL header (#1828)
Signed-off-by: Ning Yu <ningyu@automq.com>
2024-08-16 15:46:32 +08:00
Xu Han@AutoMQ 860d253b92
chore(workflows): e2e support release runner (#1826)
chore(workflows): e2e support release runner (#1825)

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-16 14:36:19 +08:00
Xu Han@AutoMQ daea64558a
chore(config): change quorum default timeout to 5s (#1824)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-16 14:20:58 +08:00
Xu Han@AutoMQ 385b5b1c03
chore(logger): move trim log to broker (#1820)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-15 19:26:05 +08:00
Xu Han@AutoMQ fd1f9c1043
fix(cli): add cli utils back (#1816)
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-15 18:14:02 +08:00
Xiang Chen 2c1cdd5feb
fix(e2e): adaptation to pr: https://github.com/AutoMQ/automq/pull/1794 (#1811)
* fix(e2e): adaptation to pr: https://github.com/AutoMQ/automq/pull/1794

* fix(e2e): fix bug
2024-08-15 17:49:17 +08:00
Shichao Nie ec7a61bbf8
fix(metadata): fix sparse index serialization capacity (#1815)
Signed-off-by: Shichao Nie <niesc@automq.com>
2024-08-15 17:48:37 +08:00
Xu Han@AutoMQ 64bad21a71
Merge pull request #1813 from AutoMQ/merge_3.8.0
feat(all): merge apache kafka 3.8.0 771b957
2024-08-15 17:45:51 +08:00
Robin Han cf8ef53947
fix(merge): fix storage auto format
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-15 17:18:30 +08:00
Robin Han baafb4aab9
fix(all): fix merge compile error
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-15 16:32:40 +08:00
SSpirits 9e4eafb6a5
fix(s3stream): fix dead lock issue (#1810)
Signed-off-by: SSpirits <admin@lv5.moe>
2024-08-15 16:14:36 +08:00
Robin Han d22931adc8
feat(all): merge apache kafka 3.8.0 771b957
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
2024-08-15 15:34:11 +08:00
Josep Prat 771b9576b0
Bump version to 3.8.0 2024-07-23 10:04:44 +02:00
PoAn Yang 7495337c76 KAFKA-17166 Use NoOpScheduler to rewrite LogManagerTest#testLogRecoveryMetrics (#16641)
Reviewers: Okada Haruki <ocadaruma@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-23 01:22:15 +08:00
PoAn Yang 29e7796747 KAFKA-17142 Fix deadlock caused by LogManagerTest#testLogRecoveryMetrics (#16614)
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-23 01:12:59 +08:00
Greg Harris 0172280427
KAFKA-17148: Remove print MetaPropertiesEnsemble from kafka-storage tool (#16607) (3.8) (#16617)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Greg Harris <greg.harris@aiven.io>, Chris Egerton <chrise@aiven.io>

Co-authored-by: Dmitry Werner <grimekillah@gmail.com>
2024-07-17 17:07:11 -07:00
Greg Harris bd29da9c43 KAFKA-17150: Use Utils.loadClass instead of Class.forName to resolve aliases correctly (#16608)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Chris Egerton <chrise@aiven.io>, Chia-Ping Tsai <chia7712@gmail.com>, Josep Prat <josep.prat@aiven.io>
2024-07-17 17:02:01 -07:00
Apoorv Mittal f05e1678f7
KAFKA-16916: Fixing error in completing future (#16249)
Fix to complete the Future which was stuck because exception.getCause() throws an error.

The fix in #16217 unblocked the blocking thread, but the exception in the catch block from the blocking get call was wrapped in an ExecutionException, which is not the case after moving to the async workflow; hence getCause is not needed.

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-15 08:52:59 +02:00
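As a generic illustration of the reasoning above (a sketch, not the Kafka code itself): a blocking `Future.get()` wraps failures in `ExecutionException`, so callers unwrap with `getCause()`; a callback registered directly on the future receives the original exception, where `getCause()` may be null and must not be relied on.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class GetCauseExample {
    public static void main(String[] args) throws InterruptedException {
        CompletableFuture<String> future = new CompletableFuture<>();
        future.completeExceptionally(new IllegalStateException("boom"));

        // Blocking style: the failure is wrapped in ExecutionException, so getCause() is required.
        try {
            future.get();
        } catch (ExecutionException e) {
            System.out.println("blocking cause: " + e.getCause()); // the IllegalStateException
        }

        // Async style: the callback sees the original exception directly,
        // so ex.getCause() is null here and unwrapping would be wrong.
        future.whenComplete((value, ex) ->
            System.out.println("async exception: " + ex + ", cause: " + ex.getCause()));
    }
}
```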
Apoorv Mittal 21464d5c5f
KAFKA-16905: Fix blocking DescribeCluster call in AdminClient DescribeTopics (#16217)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, David Arthur <mumrah@gmail.com>
2024-07-15 08:52:32 +02:00
Bruno Cadonna 7d9b8e4604
KAFKA-17098: Re-add task to state updater if transit to RUNNING fails (#16570)
When Streams tries to transition a restored active task to RUNNING, the
first thing it does is get the committed offsets for this task.
If getting the offsets times out, Streams does not re-throw
the error initially, but tries to get the committed offsets later
until a Streams-specific timeout is hit.

Restored active tasks from the state updater are removed from the
output queue of the restored tasks in the state updater. If a
timeout occurs, the restored task is neither added to the
task registry nor re-added to the state updater. The task is lost
since it is not maintained anywhere. This means the task is also
not closed. When the same task is created again on the same
stream thread since the stream thread does not know about this
lost task, the state stores are opened again and RocksDB will
throw the "No locks available" error.

This commit re-adds the task to the state updater if the
committed-offsets request times out.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-07-11 17:34:57 +02:00
Vikas Balani 1d803c4d9f
KAFKA-17111: explicitly register Afterburner module in JsonSerializer and JsonDeserializer (#16565)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Josep Prat <josep.prat@aiven.io>, Greg Harris <greg.harris@aiven.io>
2024-07-11 14:22:54 +02:00
Bruno Cadonna 4ecbb75c1f
KAFKA-17085: Handle tasks in state updater before tasks in task registry (#16555)
When active tasks are revoked, they land as suspended tasks in the
task registry. If they are then reassigned, the tasks are resumed
and put into restoration. On assignment, we first handle the tasks
in the task registry and then the tasks in the state updater. That
means that if a task is re-assigned after a revocation, we remove
the suspended task from the task registry, resume it, add it
to the state updater, and then remove it from the list of tasks
to create. After that we iterate over the tasks in the state
updater and remove from there the tasks that are not in the list
of tasks to create. However, now the state updater contains the
resumed tasks that we removed from the task registry before but
are no more in the list of tasks to create. In other words, we
remove the resumed tasks from the state updater and close them
although we just got them assigned.

This commit ensures that we first handle the tasks in the
state updater and then the tasks in the task registry.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-07-10 08:45:46 +02:00
Vincent Rose 1dd16c4f2e MINOR: Generate javadocs on all source files for streams:test-utils (#16556)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-07-09 11:33:29 -07:00
Bruno Cadonna 1dd9385794
KAFKA-10199: Close pending active tasks to init on partitions lost (#16545)
With the state updater enabled, tasks that are created but not initialized
are stored in a set. In each poll iteration the stream thread drains
that set, initializes the tasks, and adds them to the state updater.

On partition lost, all active tasks are closed.

This commit ensures that active tasks pending initialization in
the set mentioned above are closed cleanly on partition lost.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-07-08 15:23:24 +02:00
Igor Soarez c8f88ed3ab
KAFKA-17083: Update LATEST_STABLE_METADATA_VERSION in system tests (#16533)
LATEST_PRODUCTION version in MetadataVersion.java was updated in
both #16347 and #16400, but it was left unchanged in the system
tests.

Reviewers: Josep Prat <josep.prat@aiven.io>
2024-07-05 21:31:56 +01:00
Josep Prat 2fbe32ecb9
MINOR: Update dependency numbers in LICENSE file (#16514)
Reviewers: Burno Cadonna <cadonna@apache.org>
2024-07-03 11:01:43 +02:00
Josep Prat b20a7356e2
Revert "KAFKA-16154" and mark MV 3.8-IV0 as the latest production (#16400)
* Revert "KAFKA-16154: Broker returns offset for LATEST_TIERED_TIMESTAMP (#15213)"
* Set 3.8_IV0 as latest production version in 3.8
* Bring changes committed in KAFKA-16968
* Make ClusterTest annotation metadata default to 3.9

---------

Signed-off-by: Josep Prat <josep.prat@aiven.io>

Reviewers: Justine Olshan <jolshan@confluent.io>, Jun Rao <junrao@gmail.com>
2024-07-02 19:22:11 +02:00
Justine Olshan 31a9c702a4
KAFKA-17050: Revert group.version from 3.8 (#16478)
Reverting due to complications when trying to fix KAFKA-17011 in 3.8. Now there will be no production features, so we won't send any over the wire in ApiVersions or BrokerRegistration and cause issues when the receiver is on an old version.

I reverted the typo PR to make the reverts cleaner and minimize chances for errors. The only conflicts were due to imports and a modified test testConsumerGroupDescribe. The fix was to keep the modified parameters but remove the metadataCache code.

Some other minor changes for items we wanted to keep (not revert)

Reviewers: Colin P. McCabe <cmccabe@apache.org>, David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@apache.org>
2024-06-28 15:46:03 -07:00
Lucas Brutschy 5edab42ae3 MINOR: Update 3.8 documentation for Kafka Streams (#16265)
All public interface changes should be briefly mentioned in the
upgrade guide.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Anna Sophie Blee-Goldman <sophie@responsive.dev>, Nick Telford <nick.telford@gmail.com>
2024-06-27 11:57:31 +02:00
Justine Olshan 35b34a85a7
Revert "KAFKA-16275: Update kraft_upgrade_test.py to support KIP-848’s group protocol config (#16409)
This reverts commit e95e91a.

With the change to include the group.version flag, these tests fail due to trying to set the feature for the old version.

It is unclear if these tests originally worked as intended, and given that the upgrade is not expected for 3.8, we will just revert from 3.8.

Reviewers: David Jacot <djacot@confluent.io>
2024-06-21 09:04:13 -07:00
Gaurav Narula 26763c5268 MINOR: use 2 logdirs in ZK migration system tests (#15394)
Zookeeper migration system tests currently override the config to
use only one log directory.

This PR removes the override so that the system tests run with 2 log
directories following the work done as part of KIP-858.

Reviewers: Igor Soarez <soarez@apple.com>, Proven Provenzano <pprovenzano@confluent.io>
2024-06-19 17:17:12 +08:00
Rohan 45027f3d33 KAFKA-15774: use the default dsl store supplier for fkj subscriptions (#16380)
Foreign key joins have an additional "internal" state store used for subscriptions, which is not exposed for configuration via Materialized or StoreBuilder, so there is no way to plug in a different store implementation via the DSL operator. However, we should respect the configured default dsl store supplier if one is configured, to allow these stores to be customized and to conform to the store type selection logic used for other DSL operator stores

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-19 00:33:46 -07:00
Jinyong Choi 0225c49f8f KAFKA-15302: Stale value returned when using store.all() with key deletion [docs] (#15495)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-18 17:46:21 -07:00
Luke Chen 3f605fd197
KAFKA-16988: add 1 more node for test_exactly_once_source system test (#16379)
Reviewers: Igor Soarez <soarez@apple.com>
2024-06-18 13:12:06 +01:00
Mickael Maison 6669c3050e MINOR: Fix doc for zookeeper.ssl.client.enable (#16374)
Reviewers: Luke Chen <showuon@gmail.com>
2024-06-18 13:21:42 +02:00
Josep Prat b682e4dc9d
MINOR: update documentation link to 3.8 (#16382)
Reviewers:  Luke Chen <showuon@gmail.com>
2024-06-18 10:27:31 +02:00
Lianet Magrans 86abaf2fbe
KAFKA-16954: fix consumer close to release assignment in background (#16376)
* MINOR: Improving log for outstanding requests on close and cleanup (#16304)

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>

* KAFKA-16954: fix consumer close to release assignment in background (#16343)

This PR fixes consumer close to avoid updating the subscription state object in the app thread. Now the close simply triggers an UnsubscribeEvent that is handled in the background to trigger callbacks, clear assignment, and send leave heartbeat. Note that after triggering the event, the unsubscribe will continuously process background events until the event completes, to ensure that it allows for callbacks to run in the app thread.
The logic around what happens if the unsubscribe fails remains unchanged: close will log, keep the first exception and carry on.

It also removes the redundant LeaveOnClose event (it used to do the exact same thing as the UnsubscribeEvent, both calling membershipMgr.leaveGroup).

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-06-18 08:50:57 +02:00
Kamal Chandraprakash 4ff8e16c91 KAFKA-15265: Reapply dynamic remote configs after broker restart (#16353)
The below remote log configs can be configured dynamically:
1. remote.log.manager.copy.max.bytes.per.second
2. remote.log.manager.fetch.max.bytes.per.second and
3. remote.log.index.file.cache.total.size.bytes

If those values are configured dynamically, then during a broker restart the dynamic values are loaded instead of the static values from the config.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
2024-06-18 10:41:49 +05:30
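The three settings listed above are dynamic broker configs. A hedged sketch of altering one of them at runtime with the Admin client (the bootstrap address, broker id "1", and the 10 MiB/s value are placeholders, not recommendations):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class DynamicRemoteQuotaExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // Per-broker dynamic override; "1" is a placeholder broker id.
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "1");
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry("remote.log.manager.copy.max.bytes.per.second", "10485760"),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(
                Collections.singletonMap(broker, Collections.singletonList(op)))
                .all().get();
        }
    }
}
```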
Igor Soarez f2e99f362a KAFKA-16969: Log error if config conficts with MV (#16366)
When broker configuration is incompatible with the current Metadata Version the Broker should log an error-level message but avoid shutting down.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-18 11:12:53 +08:00
Luke Chen 99b33e205c [MINOR] Add a note for JBOD support for tiered storage (#16369)
Reviewers: Satish Duggana <satishd@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
2024-06-18 05:11:51 +05:30
Krishna Agarwal f2e471fb81 KAFKA-16932: Add documentation for the native docker image (#16338)
This PR contains the following documentation changes for the native docker image:

in the docker/README.md: How to build, release and promote the native docker image.
in the tests/README.md: How to run system tests by bringing up kafka in the native mode.
added docker/native/README.md
added html changes for the kafka-site
added native docker image support in the docker compose files examples.

Testing:
Tested all the docker compose files with both the docker images - jvm and native
Tested the html changes locally with the kafka-site

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Vedarth Sharma <vesharma@confluent.io>
2024-06-16 17:38:59 +05:30
Colin P. McCabe 2c27977620 Revert new configurations from KAFKA-16525; Dynamic KRaft network manager and channel (#15986)
Since KIP-853 has been removed from the 3.8 release, do not allow quorum.bootstrap.servers to be
set outside of JUnit tests. Hide the configuration by making it internal.
2024-06-14 22:22:36 -07:00
Colin P. McCabe c29260f381 Revert "KAFKA-16535: Implement AddVoter, RemoveVoter, UpdateVoter RPCs"
This reverts commit 7879f1c013.
2024-06-14 18:17:29 -07:00
Kirk True 1e83351be5 KAFKA-16637 AsyncKafkaConsumer removes offset fetch responses from cache too aggressively (#16310)
Allow the committed offsets fetch to run for as long as needed. This handles the case where a user invokes Consumer.poll() with a very small timeout (including zero).

Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-15 08:49:50 +08:00
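The zero-timeout usage the fix above refers to looks roughly like the sketch below (topic name, group id, and loop count are placeholders): with `poll(Duration.ZERO)` each call returns quickly, so the committed-offsets fetch must be allowed to make progress across successive polls rather than being discarded from the cache.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ZeroTimeoutPollExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("example-topic"));
            // Very small (here zero) timeout: the consumer returns immediately,
            // so longer-running work such as the committed-offsets fetch spans
            // several poll() calls instead of completing within one.
            for (int i = 0; i < 100; i++) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ZERO);
                records.forEach(r -> System.out.println(r.offset() + ": " + r.value()));
            }
        }
    }
}
```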
Matthias J. Sax 7435dfaa97 MINOR: update Kafka Streams docs with 3.4 KIP information (#16336)
Reviewers: Jim Galasyn <jim.galasyn@confluent.io>, Bill Bejeck <bill@confluent.io>
2024-06-14 15:02:33 -07:00
TingIāu "Ting" Kì dd24c9267e KAFKA-16946: Utils.getHost/getPort cannot parse SASL_PLAINTEXT://host:port (#16319)
In a previous PR (#16048), I mistakenly excluded the underscore (_) from the set of valid characters for the protocol,
resulting in the inability to correctly parse the connection string for SASL_PLAINTEXT. This bug fix addresses the
issue and includes corresponding tests.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Luke Chen <showuon@gmail.com>
2024-06-14 13:07:51 -07:00
Edoardo Comar ee9c1c14e3 MINOR: Add integration tag to AdminFenceProducersIntegrationTest (#16326)
Add @tag("integration") to AdminFenceProducersIntegrationTest

Reviewers: Chris Egerton <chrise@aiven.io>
2024-06-14 16:45:05 +01:00
Lianet Magrans 34cf02c2c1 KAFKA-16933: New consumer unsubscribe close commit fixes (#16272)
Fixes for the leave group flow (unsubscribe/close):

Fix to send Heartbeat to leave group on close even if the callbacks fail
fix to ensure that if a member gets fenced while blocked on callbacks (ex. on unsubscribe), it will clear its epoch to not include it in commit requests
fix to avoid race on the subscription state object on unsubscribe, updating it only on the background thread when the callbacks to leave complete (success or failure).
Also improving logging in this area.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Philip Nee <pnee@confluent.io>
2024-06-14 13:06:12 +02:00
Kamal Chandraprakash af4ccc50aa KAFKA-16948: Reset tier lag metrics on becoming follower (#16321)
When the node transitions from a leader to a follower for a partition, the tier-lag metrics should be reset to zero; otherwise, it would lead to false positives in the metrics. Also addressed a concurrency issue while emitting the metrics.

Reviewers: Satish Duggana <satishd@apache.org>, Francois Visconte <f.visconte@gmail.com>,
2024-06-14 16:16:09 +05:30
Antoine Pourchet 89405075ec KAFKA-15045: (KIP-924 pt. 26) default standby task assignment nit (#16331)
The new default standby task assignment in TaskAssignment should only assign standby tasks for changelogged tasks, not all stateful tasks.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-14 00:11:35 -07:00
Matthias J. Sax 3e9fe3a679 MINOR: update Kafka Streams docs with 3.3 KIP information (#16316)
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Jim Galasyn <jim.galasyn@confluent.io>
2024-06-13 15:17:39 -07:00
Matthias J. Sax 6330447dad MINOR: update Kafka Streams docs with 3.2 KIP information (#16313)
Reviewers: Bruno Cadonna <bruno@confluent.io>, Jim Galasyn <jim.galasyn@confluent.io>
2024-06-13 14:58:33 -07:00
A. Sophie Blee-Goldman 562b8006f9 KAFKA-15045: (KIP-924 pt. 25) Rename old internal StickyTaskAssignor to LegacyStickyTaskAssignor (#16322)
To avoid confusion in 3.8/until we fully remove all the old task assignors and internal config, we should rename the old internal assignor classes like the StickyTaskAssignor so that they won't be mixed up with the new version of the assignor (which is also named StickyTaskAssignor)

Reviewers: Bruno Cadonna <cadonna@apache.org>, Josep Prat <josep.prat@aiven.io>
2024-06-13 11:32:33 -07:00
Dongnuo Lyu c05c5a9fdf MINOR: Make online downgrade failure logs less noisy and update the timeouts scheduled in `convertToConsumerGroup` (#16290)
This patch: 
- changes the order of the checks in `validateOnlineDowngrade`, so that `online downgrade is disabled` is logged only when the last member using the consumer protocol leaves, the group still has classic member(s), and the policy doesn't allow downgrade.
- changes the session timeout in `convertToConsumerGroup` from `consumerGroupSessionTimeoutMs` to `member.classicProtocolSessionTimeout().get()`.

Reviewers: David Jacot <djacot@confluent.io>
2024-06-13 11:11:48 +02:00
Antoine Pourchet 4b4a59174f KAFKA-15045: (KIP-924 pt. 24) internal TaskAssignor rename to LegacyTaskAssignor (#16318)
Since the new public API for TaskAssignor shared a name, this rename will prevent users from confusing the internal definition with the public one.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-13 00:33:24 -07:00
Chris Egerton 15b62351a1 MINOR: Add readiness check for connector and separate Kafka cluster in ExactlyOnceSourceIntegrationTest::testSeparateOffsetsTopic (#16306)
Reviewers: Greg Harris <gharris1727@gmail.com>
2024-06-12 23:44:22 -04:00
Kamal Chandraprakash 91bd1baff0 KAFKA-16890: Compute valid log-start-offset when deleting overlapping remote segments (#16237)
The listRemoteLogSegments returns the metadata list sorted by the start-offset. However, the returned metadata list contains all the uploaded segment information including the duplicate and overlapping remote-log-segments. The reason for duplicate/overlapping remote-log-segments cases is explained [here](https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/server/log/remote/metadata/storage/RemoteLogLeaderEpochState.java#L103).

The list returned by the RLMM#listRemoteLogSegments can contain the duplicate segment metadata at the end of the list. So, while computing the next log-start-offset we should take the maximum of segments (end-offset + 1).

Reviewers: Satish Duggana <satishd@apache.org>
2024-06-13 06:30:51 +05:30
Chris Egerton 1b1821dbff KAFKA-16935: Automatically wait for cluster startup in embedded Connect integration tests (#16288)
Reviewers: Greg Harris <gharris1727@gmail.com>
2024-06-12 20:18:38 -04:00
Jim Galasyn d7fe53fd9d MINOR: Remove Java 7 example code (#16308)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-12 16:51:17 -07:00
Antoine Pourchet 87264e6714 KAFKA-15045: (KIP-924 pt. 23) More TaskAssignmentUtils tests (#16292)
Also moved the assignment validation test from StreamsPartitionAssignorTest to TaskAssignmentUtilsTest.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-12 14:26:42 -07:00
Ivan Yurchenko 62fb6a3ef1 KAFKA-8206: KIP-899: Allow client to rebootstrap (#13277)
This commit implements KIP-899: Allow producer and consumer clients to rebootstrap. It introduces the new setting `metadata.recovery.strategy`, applicable to all the types of clients.

Reviewers: Greg Harris <gharris1727@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
2024-06-12 20:50:46 +01:00
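A minimal sketch of opting a client into the setting introduced above; the `rebootstrap` value follows KIP-899 as described in the commit, and the broker addresses, group id, and topic are placeholders:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class RebootstrapExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        // KIP-899: if all known brokers become unreachable, fall back to the
        // original bootstrap servers instead of retrying stale metadata forever.
        props.put("metadata.recovery.strategy", "rebootstrap");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("example-topic"));
            consumer.poll(Duration.ofMillis(100));
        }
    }
}
```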
Edoardo Comar 0e7134c105 KAFKA-16570 FenceProducers API returns "unexpected error" when succes… (#16229)
KAFKA-16570 FenceProducers API returns "unexpected error" when successful

* Client handling of ConcurrentTransactionsException as retriable
* Unit test
* Integration test

Reviewers: Chris Egerton <chrise@aiven.io>, Justine Olshan <jolshan@confluent.io>
2024-06-12 17:18:07 +01:00
Gantigmaa Selenge 72e72e3537 KAFKA-16865: Add IncludeTopicAuthorizedOperations option for DescribeTopicPartitionsRequest (#16136)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Calvin Liu <caliu@confluent.io>, Andrew Schofield <andrew_schofield@live.com>, Apoorv Mittal <amittal@confluent.io>
2024-06-12 17:07:01 +02:00
Abhijeet Kumar 5fd9bd10ab KAFKA-15265: Dynamic broker configs for remote fetch/copy quotas (#16078)
Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Satish Duggana <satishd@apache.org>
2024-06-12 19:52:10 +05:30
David Jacot 6016b15bea KAFKA-16770; [2/2] Coalesce records into bigger batches (#16215)
This patch is the continuation of https://github.com/apache/kafka/pull/15964. It introduces record coalescing to the CoordinatorRuntime. It also introduces a new configuration `group.coordinator.append.linger.ms` which allows administrators to choose the linger time or disable it with zero. The new configuration defaults to 10ms.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-06-12 08:30:37 +02:00
Bruno Cadonna 8de153ebd6 KAFKA-10199: Enable state updater by default (#16107)
We have already enabled the state updater by default once.
However, we ran into issues that forced us to disable it again.
We think that we fixed those issues. So we want to enable the
state updater again by default.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-06-12 07:54:35 +02:00
Antoine Pourchet 77a6fe9c2a KAFKA-15045: (KIP-924 pt. 22) Add RackAwareOptimizationParams and other minor TaskAssignmentUtils changes (#16294)
We now provide a way to more easily customize the rack aware
optimizations that we provide by way of a configuration class called
RackAwareOptimizationParams.

We also simplified the APIs for the optimizeXYZ utility functions since
they were mutating the inputs anyway.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-11 21:35:01 -07:00
Abhijeet Kumar 0b4fcbb16d KAFKA-15265: Integrate RLMQuotaManager for throttling copies to remote storage (#15820)
- Added the integration of the quota manager to throttle copy requests to the remote storage. Reference KIP-956
- Added unit-tests for the copy throttling logic.

Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
2024-06-12 06:44:47 +05:30
Okada Haruki 7c30eed66c KAFKA-16541 Fix potential leader-epoch checkpoint file corruption (#15993)
A patch for KAFKA-15046 got rid of fsync on LeaderEpochFileCache#truncateFromStart/End for performance reasons, but it turned out this could cause a corrupted leader-epoch checkpoint file on ungraceful OS shutdown, i.e. the OS shuts down while the kernel is writing dirty pages back to the device.

To address this problem, this PR makes below changes: (1) Revert LeaderEpochCheckpoint#write to always fsync
(2) truncateFromStart/End now call LeaderEpochCheckpoint#write asynchronously on scheduler thread
(3) UnifiedLog#maybeCreateLeaderEpochCache now loads epoch entries from checkpoint file only when current cache is absent

Reviewers: Jun Rao <junrao@gmail.com>
2024-06-12 06:33:09 +05:30
Chris Egerton 520fbb4116 MINOR: Wait for embedded clusters to start before using them in Connect OffsetsApiIntegrationTest (#16286)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-06-11 17:16:42 -04:00
Antoine Pourchet 444e5de083 KAFKA-15045: (KIP-924 pt. 21) UUID to ProcessId migration (#16269)
This PR changes the assignment process to use ProcessId instead of UUID.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-11 12:26:37 -07:00
Chris Egerton 5c13a6cf2f
MINOR: Fix flaky test ConnectWorkerIntegrationTest::testReconfigureConnectorWithFailingTaskConfigs (#16273)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-06-11 15:13:54 -04:00
David Jacot 46d7e44d1b KAFKA-16930; UniformHeterogeneousAssignmentBuilder throws NPE when one member has no subscriptions (#16283)
Fix the following NPE:

```
java.lang.NullPointerException: Cannot invoke "org.apache.kafka.coordinator.group.assignor.MemberAssignment.targetPartitions()" because the return value of "java.util.Map.get(Object)" is null
	at org.apache.kafka.coordinator.group.assignor.GeneralUniformAssignmentBuilder.canMemberParticipateInReassignment(GeneralUniformAssignmentBuilder.java:248)
	at org.apache.kafka.coordinator.group.assignor.GeneralUniformAssignmentBuilder.balance(GeneralUniformAssignmentBuilder.java:336)
	at org.apache.kafka.coordinator.group.assignor.GeneralUniformAssignmentBuilder.buildAssignment(GeneralUniformAssignmentBuilder.java:157)
	at org.apache.kafka.coordinator.group.assignor.UniformAssignor.assign(UniformAssignor.java:84)
	at org.apache.kafka.coordinator.group.consumer.TargetAssignmentBuilder.build(TargetAssignmentBuilder.java:302)
	at org.apache.kafka.coordinator.group.GroupMetadataManager.updateTargetAssignment(GroupMetadataManager.java:1913)
	at org.apache.kafka.coordinator.group.GroupMetadataManager.consumerGroupHeartbeat(GroupMetadataManager.java:1518)
	at org.apache.kafka.coordinator.group.GroupMetadataManager.consumerGroupHeartbeat(GroupMetadataManager.java:2254)
	at org.apache.kafka.coordinator.group.GroupCoordinatorShard.consumerGroupHeartbeat(GroupCoordinatorShard.java:308)
	at org.apache.kafka.coordinator.group.GroupCoordinatorService.lambda$consumerGroupHeartbeat$0(GroupCoordinatorService.java:298)
	at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$CoordinatorWriteEvent.lambda$run$0(CoordinatorRuntime.java:769)
	at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime.withActiveContextOrThrow(CoordinatorRuntime.java:1582)
	at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime.access$1400(CoordinatorRuntime.java:96)
	at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$CoordinatorWriteEvent.run(CoordinatorRuntime.java:767)
	at org.apache.kafka.coordinator.group.runtime.MultiThreadedEventProcessor$EventProcessorThread.handleEvents(MultiThreadedEventProcessor.java:144)
	at org.apache.kafka.coordinator.group.runtime.MultiThreadedEventProcessor$EventProcessorThread.run(MultiThreadedEventProcessor.java:176) 
```

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-06-11 20:44:46 +02:00
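The stack trace above boils down to dereferencing a `Map.get` result that can be absent. A generic guard sketch of the kind of check such a fix implies (not the actual assignor code; the record and method names below are stand-ins):

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Map;

public class NullSafeLookupExample {
    // Stand-in for MemberAssignment.targetPartitions() from the stack trace above.
    record MemberAssignment(Collection<Integer> targetPartitions) { }

    static boolean canMemberParticipate(String memberId,
                                        Map<String, MemberAssignment> assignments) {
        MemberAssignment assignment = assignments.get(memberId);
        // A member with no subscriptions may have no entry at all; treating the
        // missing entry as "no partitions" avoids the NullPointerException.
        Collection<Integer> partitions =
            assignment == null ? Collections.emptyList() : assignment.targetPartitions();
        return !partitions.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(canMemberParticipate("member-1", Collections.emptyMap())); // false, no NPE
    }
}
```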
Kamal Chandraprakash bcd95f6485 KAFKA-16904: Metric to measure the latency of remote read requests (#16209)
Reviewers: Satish Duggana <satishd@apache.org>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>
2024-06-11 21:08:39 +05:30
KrishVora01 15db823317 KAFKA-16373: KIP-1028: Modfiying download url for kafka dockerfile (#16281)
This PR modifies the download url from https://downloads.apache.org/kafka/ to https://archive.apache.org/dist/kafka/ as the former is not permanent.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Vedarth Sharma <vesharma@confluent.io>
2024-06-11 21:03:55 +05:30
KrishVora01 e75cc45bdf KAFKA-16373: KIP-1028: Adding 3.7.0 docker official images static assets (#16267)
This PR aims to add the static Dockerfile and scripts for the AK 3.7.0 version. As mentioned in KIP-1028, this PR aims to start the release of the kafka:3.7.0 Docker Official image. This will also help us validate the process and allow us to address any changes suggested by Dockerhub before the 3.8.0 release.

The static Dockerfile and scripts have been generated via the github actions workflows and scripts added as part of https://github.com/apache/kafka/pull/16027. The reports of building and testing the 3.7.0 Docker official image are below.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Vedarth Sharma <vesharma@confluent.io>
2024-06-11 21:03:48 +05:30
Kamal Chandraprakash d94a28b4a4 KAFKA-15776: Support added to update remote.fetch.max.wait.ms dynamically (#16203)
Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
2024-06-11 12:35:26 +05:30
Chia Chuan Yu 781b93b00d KAFKA-16885 Renamed the enableRemoteStorageSystem to isRemoteStorageSystemEnabled (#16256)
Reviewers: Kamal Chandraprakash <kchandraprakash@uber.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:25 +05:30
Murali Basani 9460e6b266 KAFKA-16884 Refactor RemoteLogManagerConfig with AbstractConfig (#16199)
Reviewers: Greg Harris <gharris1727@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:25 +05:30
Kamal Chandraprakash 025e791d0c MINOR: Cleanup the storage module unit tests (#16202)
- Use SystemTime instead of MockTime when time is not mocked
- Use static assertions to reduce the line length
- Fold the lines if it exceeds the limit
- Rename tp0 to tpId0 when it refers to TopicIdPartition

Reviewers: Kuan-Po (Cooper) Tseng <brandboat@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:25 +05:30
Kamal Chandraprakash b6848d699d KAFKA-15776: Introduce remote.fetch.max.timeout.ms to configure DelayedRemoteFetch timeout (#14778)
KIP-1018, part1, Introduce remote.fetch.max.timeout.ms to configure DelayedRemoteFetch timeout

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-11 12:35:25 +05:30
Kamal Chandraprakash 69158f67f8 KAFKA-16882 Migrate RemoteLogSegmentLifecycleTest to ClusterInstance infra (#16180)
- Removed the RemoteLogSegmentLifecycleManager
- Removed the TopicBasedRemoteLogMetadataManagerWrapper, RemoteLogMetadataCacheWrapper, TopicBasedRemoteLogMetadataManagerHarness and TopicBasedRemoteLogMetadataManagerWrapperWithHarness

Reviewers: Kuan-Po (Cooper) Tseng <brandboat@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:25 +05:30
Murali Basani 944f4699a7 KAFKA-16880 Update equals and hashcode methods for two attributes (#16173)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:24 +05:30
Murali Basani e4a3da6b09 KAFKA-16852 Adding two thread pools kafka-16852 (#16154)
Reviewers: Christo Lolov <lolovc@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:24 +05:30
Ken Huang f9c37032ff KAFKA-16859 Cleanup check if tiered storage is enabled (#16153)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:24 +05:30
Kuan-Po (Cooper) Tseng 2273e06138 MINOR: Fix missing wait topic finished in TopicBasedRemoteLogMetadataManagerRestartTest (#16171)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:24 +05:30
Kuan-Po (Cooper) Tseng c6f0db3c60 KAFKA-16785 Migrate TopicBasedRemoteLogMetadataManagerRestartTest to new test infra (#16170)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:24 +05:30
Kamal Chandraprakash 68d92a5b43 KAFKA-16866 Used the right constant in RemoteLogManagerTest#testFetchQuotaManagerConfig (#16152)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-11 12:35:23 +05:30
Chris Egerton facbab272f
KAFKA-9228: Restart tasks on runtime-only connector config changes (#16053)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-06-10 17:02:33 -04:00
KrishVora01 113baae977 [MINOR] KAFKA-16373: KIP-1028: Modifying dockerfile comments (#16261)
This PR replaces the Dockerfile comments under the jvm/dockerfile Dockerfile with an updated URL.
The original comment read
# Get kafka from https://archive.apache.org/dist/kafka and pass the url through build arguments
For DOI Dockerfiles, we replace this with
# Get Kafka from https://downloads.apache.org/kafka, url passed as env var, for version {kafka_version}

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Vedarth Sharma <vesharma@confluent.io>
2024-06-10 18:04:53 +05:30
KrishVora01 deeb847e83 modifying Readme for Docker official images (#16226)
This PR aims to modify the README file under /docker, to include the steps to release the Docker Official Images.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Vedarth Sharma <vesharma@confluent.io>
2024-06-10 18:04:48 +05:30
David Jacot a2760e0131 MINOR: Rename uniform assignor's internal builders (#16233)
This patch renames the uniform assignor's builders to match the `SubscriptionType` which is used to determine which one is called. It removes the abstract class `AbstractUniformAssignmentBuilder` which is not necessary anymore. It also applies minor refactoring.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-10 14:27:44 +02:00
Max Riedel db3bf4ae3d KAFKA-14509; [4/4] Handle includeAuthorizedOperations in ConsumerGroupDescribe API (#16158)
This patch implements the handling of `includeAuthorizedOperations` flag in the ConsumerGroupDescribe API.

Reviewers: David Jacot <djacot@confluent.io>
2024-06-10 14:18:53 +02:00
Antoine Pourchet b0333f2ad5 KAFKA-15045: (KIP-924 pt. 20) Custom task assignment configuration fix (#16245)
The StreamsConfig class was not parsing the new task assignment
configuration flag properly, which made it impossible to properly
configure a custom task assignor.

This PR fixes this and adds a bit of INFO logging to help users diagnose
assignor misconfiguration issues.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-08 19:55:44 -07:00
Antoine Pourchet 81196bc6e2 KAFKA-15045: (KIP-924 pt. 18) Better assignment testing (#16201)
Added more testing for the StickyTaskAssignor, which includes a large-scale test with rack aware enabled.

Also added a no-op change to StreamsAssignmentScaleTest.java to allow for rack aware optimization testing.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-08 19:49:23 -07:00
Igor Soarez d5c9381e06
KAFKA-16886: Detect replica demotion in AssignmentsManager (#16232)
JBOD Brokers keep the Controller up to date with replica-to-directory
placement via AssignReplicasToDirsRequest. These requests are queued,
compacted and sent by AssignmentsManager.

The Controller returns the error NOT_LEADER_OR_FOLLOWER when handling
an AssignReplicasToDirsRequest from a broker that is not a replica.

A partition reassignment can take place, removing the Broker
as a replica before the AssignReplicasToDirsRequest successfully
reaches the Controller. AssignmentsManager retries failed
requests, and will continuously try to propagate this assignment,
until the Broker either shuts down, or is added back as a replica.

When encountering a NOT_LEADER_OR_FOLLOWER error, AssignmentsManager
should assume that the broker is no longer a replica, and stop
trying to propagate the directory assignment for that partition.
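
A minimal sketch of that rule, with illustrative names rather than the actual AssignmentsManager API:
```
// Hypothetical helper: decide whether a failed AssignReplicasToDirsRequest
// should be retried for a given partition.
class AssignmentRetryPolicy {
    static boolean shouldRetry(String errorCode) {
        if ("NOT_LEADER_OR_FOLLOWER".equals(errorCode)) {
            // The broker is no longer a replica: drop the queued directory
            // assignment instead of propagating it forever.
            return false;
        }
        return true; // other errors keep the existing retry behaviour
    }
}
```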

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-08 16:11:04 +03:00
Igor Soarez a8316f442a
MINOR: Fix broken ReassignPartitionsCommandTest test (#16251)
KAFKA-16606 (#15834) introduced a change that broke
ReassignPartitionsCommandTest.testReassignmentCompletionDuringPartialUpgrade.

The point was to validate that the MetadataVersion supports JBOD
in KRaft when multiple log directories are configured.
We do that by checking the version used in
kafka-features.sh upgrade --metadata, and the version discovered
via a FeatureRecord for metadata.version in the cluster metadata.

There's no point in checking inter.broker.protocol.version in
KafkaConfig, since in KRaft, that configuration is deprecated
and ignored — always assuming the value of MINIMUM_KRAFT_VERSION.

The test that was broken sets inter.broker.protocol.version in
KRaft mode and configures 3 directories. So alternatively, we
could change the test to not configure this property.
Since the property isn't forbidden in KRaft mode, just ignored,
and operators may forget to remove it, it seems better to remove
the fail condition in KafkaConfig.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-08 14:07:12 +03:00
Colin P. McCabe 7879f1c013 KAFKA-16535: Implement AddVoter, RemoveVoter, UpdateVoter RPCs
Implement the add voter, remove voter, and update voter RPCs for
KIP-853. This is just adding the RPC handling; the current
implementation in RaftManager just throws UnsupportedVersionException.

Reviewers: Andrew Schofield <aschofield@confluent.io>, José Armando García Sancio <jsancio@apache.org>

Conflicts: Fix some conflicts caused by the lack of KIP-932 RPCs in 3.8.
2024-06-07 15:19:37 -07:00
Jim Galasyn 3874560401 MINOR: Update docs for for KIP-671 (#16247)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-07 14:26:32 -07:00
Sebastien Viale b65a79ebad MINOR: update all-latency-max, range-latency-avg|max and add prefix-scan documentation (#16182)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-07 14:24:15 -07:00
Jim Galasyn 85fc07ff06 KAFKA-16911: Update docs for KIP-862 (#16246)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-07 10:53:43 -07:00
Phuc-Hong-Tran 0b2829bfc2 KAFKA-16493: Avoid unneeded subscription regex check if metadata version unchanged (#15869)
This PR includes changes for AsyncKafkaConsumer to avoid evaluating the subscription regex on every poll if metadata hasn't changed. The metadataVersionSnapshot was introduced to identify whether metadata has changed or not; if it has, the current subscription regex will be evaluated.

This is the same mechanism used by the LegacyConsumer.
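
A minimal sketch of the snapshot check, using illustrative names rather than the actual AsyncKafkaConsumer fields:
```
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Re-evaluate the subscription regex only when the metadata version has
// changed since the last evaluation.
class RegexSubscriptionTracker {
    private int metadataVersionSnapshot = -1;      // last metadata version evaluated
    private Set<String> subscribedTopics = Set.of();

    Set<String> maybeReevaluate(Pattern pattern, Set<String> clusterTopics, int metadataVersion) {
        if (metadataVersion == metadataVersionSnapshot) {
            return subscribedTopics;               // metadata unchanged: skip the regex scan
        }
        metadataVersionSnapshot = metadataVersion;
        subscribedTopics = clusterTopics.stream()
            .filter(topic -> pattern.matcher(topic).matches())
            .collect(Collectors.toCollection(TreeSet::new));
        return subscribedTopics;
    }
}
```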

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2024-06-07 10:30:57 -07:00
Igor Soarez 2ab6a3608e
KAFKA-16606 Gate JBOD configuration on 3.7-IV2 (#15834)
Support for multiple log directories in KRaft exists from
MetadataVersion 3.7-IV2.

When migrating a ZK broker to KRaft, we already check that
the IBP is high enough before allowing the broker to startup.

With KIP-584 and KIP-778, Brokers in KRaft mode do not require
the IBP configuration - the configuration is deprecated.
In KRaft mode inter.broker.protocol.version defaults to
MetadataVersion.MINIMUM_KRAFT_VERSION (IBP_3_0_IV1).

Instead KRaft brokers discover the MetadataVersion by reading
the "metadata.version" FeatureLevelRecord from the cluster metadata.

This change adds a new configuration validation step upon discovering
the "metadata.version" from the cluster metadata.

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-06-07 11:13:02 +03:00
Kirk True fc267f4eb8 KAFKA-16200: Enforce that RequestManager implementations respect user-provided timeout (#16031)
Improve consistency and correctness for user-provided timeouts at the Consumer network request layer, per the Java client Consumer timeouts design (https://cwiki.apache.org/confluence/display/KAFKA/Java+client+Consumer+timeouts). While the changes introduced in KAFKA-15974 enforce timeouts at the Consumer's event layer, this change enforces timeouts at the network request layer.

The changes mostly fit into the following areas:

1. Create shared code and idioms so timeout handling logic is consistent across current and future RequestManager implementations
2. Use deadlineMs instead of expirationMs, expirationTimeoutMs, retryExpirationTimeMs, timeoutMs, etc. (see the sketch after this list)
3. Update "preemptive pruning" to remove expired requests that have had at least one attempt

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Bruno Cadonna <cadonna@apache.org>
2024-06-07 09:55:45 +02:00
Lianet Magrans 491a079cfa KAFKA-16786: Remove old assignment strategy usage in new consumer (#16214)
Remove usage of the partition.assignment.strategy config in the new consumer. This config is deprecated with the new consumer protocol, so the AsyncKafkaConsumer should not use or validate the property.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-06-07 09:30:53 +02:00
Antoine Pourchet c60b61886d KAFKA-15045: (KIP-924 pt. 19) Update to new AssignmentConfigs (#16219)
This PR updates all of the streams task assignment code to use the new AssignmentConfigs public class.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-06 14:51:27 -07:00
Bruno Cadonna 5d4e426f09 KAFKA-16903: Consider produce error of different task (#16222)
A task does not know anything about a produce error thrown
by a different task. That might lead to an InvalidTxnStateException
when a task attempts to do a transactional operation on a producer
that failed due to a different task.

This commit stores the produce exception in the streams producer
on completion of a send instead of the record collector since the
record collector is on task level whereas the stream producer
is on stream thread level. Since all tasks use the same streams
producer the error should be correctly propagated across tasks
of the same stream thread.

For EOS alpha, this commit does not change anything because
each task uses its own producer. The send error is still
on task level but so is also the transaction.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-06 12:21:01 -07:00
David Jacot 1b0edf4f8c KAFKA-14701; Move `PartitionAssignor` to new `group-coordinator-api` module (#16198)
This patch moves the `PartitionAssignor` interface and all the related classes to a newly created `group-coordinator/api` module, following the pattern used by the storage and tools modules.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-06 21:20:02 +02:00
Alyssa Huang 25ca963980 KAFKA-16530: Fix high-watermark calculation to not assume the leader is in the voter set (#16079)
1. Changing log message from error to info - We may expect the HW calculation to give us a smaller result than the current HW in the case of quorum reconfiguration. We will continue to not allow the HW to actually decrease.
2. Logic for finding the updated LeaderEndOffset for updateReplicaState is changed as well. We do not assume the leader is in the voter set and check the observer states as well.
3. updateLocalState now accepts an additional "lastVoterSet" param which allows us to update the leader state with the last known voters. Any nodes in this set but not in voterStates will be added to voterStates and removed from observerStates; any nodes not in this set but in voterStates will be removed from voterStates and added to observerStates.
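
A minimal sketch of the reconciliation in point 3, using illustrative names and plain maps instead of the actual LeaderState structures:
```
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ReplicaStateTracker {
    final Map<Integer, Long> voterStates = new HashMap<>();     // nodeId -> last known end offset
    final Map<Integer, Long> observerStates = new HashMap<>();

    void updateLocalState(Set<Integer> lastVoterSet) {
        // Nodes in the last known voter set but currently tracked as observers become voters.
        for (Integer nodeId : lastVoterSet) {
            if (!voterStates.containsKey(nodeId)) {
                voterStates.put(nodeId, observerStates.getOrDefault(nodeId, -1L));
                observerStates.remove(nodeId);
            }
        }
        // Nodes tracked as voters but no longer in the voter set become observers.
        voterStates.keySet().removeIf(nodeId -> {
            if (!lastVoterSet.contains(nodeId)) {
                observerStates.put(nodeId, voterStates.get(nodeId));
                return true;
            }
            return false;
        });
    }
}
```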

Reviewers: Luke Chen <showuon@gmail.com>, José Armando García Sancio <jsancio@apache.org>
2024-06-06 14:31:42 +08:00
Kuan-Po (Cooper) Tseng 04f7ed4c10 KAFKA-16814 KRaft broker cannot startup when `partition.metadata` is missing (#16165)
When starting up the Kafka LogManager, we check for stray replicas to avoid some corner cases. But this check might leave the broker unable to start up if partition.metadata is missing, because on startup we load the log from file, and the topicId of the log comes from the partition.metadata file. So, if partition.metadata is missing, the topicId will be None, and LogManager#isStrayKraftReplica will fail with a "no topicID" error.

A missing partition.metadata could come from a storage failure; another possible path is an unclean shutdown after the topic is created in the replica but before the data is flushed into the partition.metadata file. This is possible because the flush is done asynchronously here.

When finding a log without a topicId, we should treat it as a stray log and delete it.
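
A minimal sketch of that rule, with illustrative names (not the actual LogManager#isStrayKraftReplica signature):
```
import java.util.Optional;
import java.util.Set;

class StrayReplicaCheck {
    static boolean isStrayReplica(Optional<String> topicIdFromPartitionMetadata,
                                  Set<String> topicIdsInClusterMetadata) {
        if (topicIdFromPartitionMetadata.isEmpty()) {
            // partition.metadata is missing (storage failure or unclean shutdown
            // before the async flush): treat the log as stray so it gets deleted
            // instead of failing broker startup.
            return true;
        }
        return !topicIdsInClusterMetadata.contains(topicIdFromPartitionMetadata.get());
    }
}
```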

Reviewers: Luke Chen <showuon@gmail.com>, Gaurav Narula <gaurav_narula2@apple.com>
2024-06-06 08:17:43 +08:00
Antoine Pourchet e82731f0c6 KAFKA-15045: (KIP-924 pt. 17) State store computation fixed (#16194)
Fixed the calculation of the store name list based on the subtopology being accessed.

Also added a new test to make sure this new functionality works as intended.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-05 15:59:26 -07:00
Antoine Pourchet 041f6516cd KAFKA-15045: (KIP-924 pt. 16) TaskAssignor.onAssignmentComputed handling (#16147)
This PR takes care of making the callback to TaskAssignor.onAssignmentComputed.

It also contains a change to the public AssignmentConfigs API, as well as some simplifications of the StickyTaskAssignor.

This PR also changes the rack information fetching to happen lazily in the case where the TaskAssignor makes its decisions without said rack information.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-05 15:59:26 -07:00
Florin Akermann 1f6c5dc5df KAFKA-12317: Update FK-left-join documentation (#15689)
FK left-join was changed via KIP-962. This PR updates the docs accordingly.

Reviewers: Ayoub Omari <ayoubomari1@outlook.fr>, Matthias J. Sax <matthias@confluent.io>
2024-06-05 15:23:23 -07:00
Ayoub Omari dd6d6f4a5a KAFKA-16573: Specify node and store where serdes are needed (#15790)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-05 15:05:47 -07:00
Greg Harris c1accdbab9 KAFKA-16858: Throw DataException from validateValue on array and map schemas without inner schemas (#16161)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Chris Egerton <chrise@aiven.io>
2024-06-05 11:38:15 -07:00
Abhijeet Kumar fe7ebf085d KAFKA-15265: Integrate RLMQuotaManager for throttling fetches from remote storage (#16071)
Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
2024-06-05 19:14:05 +05:30
Kuan-Po (Cooper) Tseng eabb07bebe KAFKA-16888 Fix failed StorageToolTest.testFormatSucceedsIfAllDirectoriesAreAvailable and StorageToolTest.testFormatEmptyDirectory (#16186)
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-05 21:20:00 +08:00
Dongnuo Lyu bf0ca8498a MINOR: Adjust validateOffsetCommit/Fetch in ConsumerGroup to ensure compatibility with classic protocol members (#16145)
During online migration, there could be a ConsumerGroup that has members that use the classic protocol. In the current implementation, `STALE_MEMBER_EPOCH` could be thrown in ConsumerGroup offset fetch/commit validation, but it's not supported by the classic protocol. Thus this patch changes `ConsumerGroup#validateOffsetCommit` and `ConsumerGroup#validateOffsetFetch` to ensure compatibility.
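
A minimal sketch of the compatibility rule, with illustrative names rather than the actual ConsumerGroup validation logic:
```
// Only apply the member-epoch check to members on the new consumer protocol;
// classic-protocol members cannot handle STALE_MEMBER_EPOCH.
class OffsetCommitValidation {
    static void validateMemberEpoch(boolean usesClassicProtocol, int memberEpoch, int expectedEpoch) {
        if (usesClassicProtocol) {
            return; // compatibility path for classic members
        }
        if (memberEpoch != expectedEpoch) {
            throw new IllegalStateException("STALE_MEMBER_EPOCH: expected epoch " + expectedEpoch
                + " but got " + memberEpoch);
        }
    }
}
```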

Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>
2024-06-05 08:09:16 +02:00
TingIāu "Ting" Kì 0e945093a3 KAFKA-15305 The background thread should try to process the remaining task until the shutdown timer is expired. (#16156)
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-05 04:22:16 +08:00
Chris Egerton ec278a6864
MINOR: Fix return tag on Javadocs for consumer group-related Admin methods (#16197)
Reviewers: Greg Harris <greg.harris@aiven.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-04 15:04:51 -04:00
José Armando García Sancio b4f17a01e4 KAFKA-16525; Dynamic KRaft network manager and channel (#15986)
Allow KRaft replicas to send requests to any node (Node) not just the nodes configured in the
controller.quorum.voters property. This flexibility is needed so KRaft can implement the
controller.quorum.voters configuration, send requests to the dynamically changing set of voters, and
send requests to the leader endpoint (Node) discovered through the KRaft RPCs (specifically
BeginQuorumEpoch request and Fetch response).

This was achieved by changing the RequestManager API to accept Node instead of just the replica ID.
Internally, the request manager tracks connection state using the Node.idString method to match the
connection management used by NetworkClient.
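
A minimal sketch of that bookkeeping, assuming kafka-clients on the classpath for Node; the names are illustrative, not the actual RequestManager fields:
```
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.Node;

class ConnectionStateTracker {
    enum State { CONNECTING, BACKING_OFF }

    // Keyed by Node.idString() to line up with NetworkClient's own connection management.
    private final Map<String, State> states = new HashMap<>();

    void onConnectionAttempt(Node node) {
        states.put(node.idString(), State.CONNECTING);
    }

    void onConnectionReady(Node node) {
        states.remove(node.idString()); // ready connections need no tracked state, reclaiming memory
    }

    boolean isBackingOff(Node node) {
        return states.get(node.idString()) == State.BACKING_OFF;
    }
}
```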

The API for RequestManager is also changed so that the ConnectState class is not exposed in the
API. This allows the request manager to reclaim heap memory for any connection that is ready.

The NetworkChannel was updated to receive the endpoint information (Node) through the outbound raft
request (RaftRequest.Outbound). This makes the network channel more flexible as it doesn't need to
be configured with the list of all possible endpoints. RaftRequest.Outbound and
RaftResponse.Inbound were updated to include the remote node instead of just the remote id.

The follower state tracked by KRaft replicas was updated to include both the leader id and the
leader's endpoint (Node). In this commit the node value is computed from the set of voters. In a
future commit this will be updated so that it is sent through KRaft RPCs. For example
BeginQuorumEpoch request and Fetch response.

Support for configuring controller.quorum.bootstrap.servers was added. This includes changes to
KafkaConfig, QuorumConfig, etc. All of the tests using QuorumTestHarness were changed to use the
controller.quorum.bootstrap.servers instead of the controller.quorum.voters for the broker
configuration. Finally, the node id for the bootstrap server will be decreasing negative numbers
starting with -2.

Reviewers: Jason Gustafson <jason@confluent.io>, Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
2024-06-04 12:03:32 -04:00
Igor Soarez be15aa4dc2
KAFKA-16583: Handle PartitionChangeRecord without directory IDs (#16118)
When PartitionRegistration#merge() reads a PartitionChangeRecord
from an older MetadataVersion, with a replica assignment change
and without #directories() set, it produces a directory assignment
of DirectoryId.UNASSIGNED. This is problematic because the MetadataVersion
may not yet support directory assignments, leading to an
UnwritableMetadataException in PartitionRegistration#toRecord.

Since the Controller always sets directories on PartitionChangeRecord
if the MetadataVersion supports it, via PartitionChangeBuilder,
there's no need for PartitionRegistration#merge() to populate
directories upon a replica assignment change.
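
A minimal sketch of the resulting merge rule, with illustrative names (not the actual PartitionRegistration#merge signature):
```
import java.util.List;
import java.util.Optional;

class MergeDirectories {
    static List<String> merge(List<String> currentDirectories, Optional<List<String>> recordDirectories) {
        // Older MetadataVersions never set directories on the record, so keep the
        // existing assignment instead of fabricating DirectoryId.UNASSIGNED entries.
        return recordDirectories.orElse(currentDirectories);
    }
}
```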

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-04 15:37:45 +01:00
Ritika Reddy ebc68f00e4 KAFKA-16821; Member Subscription Spec Interface (#16068)
This patch reworks the `PartitionAssignor` interface to use interfaces instead of POJOs. It mainly introduces the `MemberSubscriptionSpec` interface that represents a member subscription and changes the `GroupSpec` interfaces to expose the subscriptions and the assignments via different methods.

The patch does not change the performance.

before:
```
Benchmark                                     (memberCount)  (partitionsToMemberRatio)  (topicCount)  Mode  Cnt  Score   Error  Units
TargetAssignmentBuilderBenchmark.build                10000                         10           100  avgt    5  3.462 ± 0.687  ms/op
TargetAssignmentBuilderBenchmark.build                10000                         10          1000  avgt    5  3.626 ± 0.412  ms/op
JMH benchmarks done
```

after:
```
Benchmark                                     (memberCount)  (partitionsToMemberRatio)  (topicCount)  Mode  Cnt  Score   Error  Units
TargetAssignmentBuilderBenchmark.build                10000                         10           100  avgt    5  3.677 ± 0.683  ms/op
TargetAssignmentBuilderBenchmark.build                10000                         10          1000  avgt    5  3.991 ± 0.065  ms/op
JMH benchmarks done
```

Reviewers: David Jacot <djacot@confluent.io>
2024-06-04 15:50:48 +02:00
David Jacot 15ab07a822 MINOR: Log time taken to compute the target assignment (#16185)
The time taken to compute a new assignment is critical. This patch extends the existing logging to log it too. This is very useful information to have.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-04 15:50:35 +02:00
Chris Egerton 7404fdffa6
KAFKA-16837, KAFKA-16838: Ignore task configs for deleted connectors, and compare raw task configs before publishing them (#16122)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-06-04 09:37:57 -04:00
Edoardo Comar c295feff3c KAFKA-16047: Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers (#16151)
Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers, 
or options.timeoutMs if specified, as transaction timeout.

No transaction will be started with this timeout, but
ReplicaManager.appendRecords uses this value as its timeout.
Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append
to allow for replication to take place.
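
A minimal sketch of the timeout selection, with illustrative names (not the actual KafkaAdminClient internals):
```
import java.util.OptionalInt;

class FenceProducersTimeout {
    static int resolveTransactionTimeoutMs(OptionalInt optionsTimeoutMs, int requestTimeoutMsConfig) {
        // Prefer the per-call options timeout, otherwise fall back to request.timeout.ms.
        return optionsTimeoutMs.orElse(requestTimeoutMsConfig);
    }
}
```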

Co-Authored-By: Adrian Preston <prestona@uk.ibm.com>
2024-06-04 11:55:18 +01:00
Jeff Kim 0aa0a01d9c KAFKA-16664; Re-add EventAccumulator.poll(long, TimeUnit) (#16144)
We have revamped the thread idle ratio metric in https://github.com/apache/kafka/pull/15835. https://github.com/apache/kafka/pull/15835#discussion_r1588068337 describes a case where the metric loses accuracy; in order to set a lower bound on the accuracy, this patch re-adds a poll with a timeout that was removed as part of https://github.com/apache/kafka/pull/15430.

Reviewers: David Jacot <djacot@confluent.io>
2024-06-04 08:28:10 +02:00
David Jacot 961c28ae71 MINOR: Fix typo in MetadataVersion.IBP_4_0_IV0 (#16181)
This patch fixes a typo in MetadataVersion.IBP_4_0_IV0. It should be 0 not O.

Reviewers: Justine Olshan <jolshan@confluent.io>, Jun Rao <junrao@gmail.com>,  Chia-Ping Tsai <chia7712@gmail.com>
2024-06-03 20:50:57 -07:00
Anatoly Popov cd52f33746 KAFKA-16105: Reset read offsets when seeking to beginning in TBRLMM (#15165)
Reviewers: Greg Harris <greg.harris@aiven.io>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
2024-06-03 13:48:50 -07:00
TingIāu "Ting" Kì 9c72048a88 KAFKA-16861: Don't convert the group to classic if the size is larger than group max size. (#16163)
Fix the bug where the group downgrades to a classic one when a member leaves, even though the consumer group size is still larger than `classicGroupMaxSize`.
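
A minimal sketch of the guard, with illustrative names (not the actual GroupMetadataManager logic):
```
// Only convert the consumer group back to a classic group when the remaining
// members all use the classic protocol and the group still fits under the
// classic max size.
class DowngradeCheck {
    static boolean shouldDowngrade(int remainingMembers,
                                   boolean allRemainingUseClassicProtocol,
                                   int classicGroupMaxSize) {
        return allRemainingUseClassicProtocol && remainingMembers <= classicGroupMaxSize;
    }
}
```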

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
2024-06-03 20:38:01 +02:00
David Jacot 1b11cf0fe3 MINOR: Small refactor in TargetAssignmentBuilder (#16174)
This patch is a small refactoring which mainly aims to avoid constructing a copy of the new target assignment in the TargetAssignmentBuilder, because the copy is not used by the caller. The change relies on the existing tests and does not really have an impact on performance (validated with TargetAssignmentBuilderBenchmark).

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-03 20:37:52 +02:00
Ken Huang 495ec16fb2
KAFKA-16881: InitialState type leaks into the Connect REST API OpenAPI spec (#16175)
Reviewers: Chris Egerton <chrise@aiven.io>
2024-06-03 13:36:08 -04:00
Ken Huang dc5a22bf83 KAFKA-16807 DescribeLogDirsResponseData#results#topics have unexpected topics having empty partitions (#16042)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-02 17:33:43 +08:00
Colin Patrick McCabe 34f5d5bab2
KAFKA-16757: Fix broker re-registration issues around MV 3.7-IV2 (#15945)
When upgrading from a MetadataVersion older than 3.7-IV2, we need to resend the broker registration, so that the controller can record the storage directories. The current code for doing this has several problems, however. One is that it tends to trigger even in cases where we don't actually need it. Another is that when re-registering the broker, the broker is marked as fenced.

This PR moves the handling of the re-registration case out of BrokerMetadataPublisher and into BrokerRegistrationTracker. The re-registration code there will only trigger in the case where the broker sees an existing registration for itself with no directories set. This is much more targeted than the original code.
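
A minimal sketch of that trigger condition, with illustrative names (not the actual BrokerRegistrationTracker API):
```
import java.util.List;

class ReRegistrationCheck {
    // Re-send the broker registration only when an existing registration is seen
    // with no storage directories, i.e. it was written before MV 3.7-IV2.
    static boolean shouldResendRegistration(boolean registrationExists, List<String> registeredDirectories) {
        return registrationExists && registeredDirectories.isEmpty();
    }
}
```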

Additionally, in ClusterControlManager, when re-registering the same broker, we now preserve its fencing and shutdown state, rather than clearing those. (There isn't any good reason re-registering the same broker should clear these things... this was purely an oversight.) Note that we can tell the broker is "the same" because it has the same IncarnationId.

Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Igor Soarez <soarez@apple.com>
2024-06-01 23:54:03 +01:00
TingIāu "Ting" Kì a39f3ec815 KAFKA-16639 Ensure HeartbeatRequestManager generates leave request regardless of in-flight heartbeats. (#16017)
Fix the bug where the heartbeat is not sent when a newly created consumer is immediately closed.

Consider the case where there is a heartbeat request in flight and the consumer is then closed. In the current code, the HeartbeatRequestManager does not correctly send the closing heartbeat because a previous heartbeat request is still in flight. However, the closing heartbeat is only sent once, so in this situation the broker will not know that the consumer has left the consumer group until the consumer's heartbeat times out.
This situation causes the broker to wait until the consumer's heartbeat times out before triggering a consumer group rebalance, which in turn affects message consumption.
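
A minimal sketch of the intended behaviour, with illustrative names (not the actual HeartbeatRequestManager logic):
```
class HeartbeatRequestPolicy {
    static boolean shouldSendHeartbeat(boolean closing, boolean heartbeatInFlight) {
        if (closing) {
            return true;            // the leave heartbeat is sent even with a heartbeat in flight
        }
        return !heartbeatInFlight;  // regular heartbeats still wait for the in-flight one
    }
}
```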

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-01 04:22:45 +08:00
David Jacot 92ed1ed586 KAFKA-16864; Optimize uniform (homogenous) assignor (#16088)
This patch optimizes the uniform (homogenous) assignor by avoiding creating a copy of all the assignments. Instead, the assignor creates a copy only if the assignment is updated. It is a sort of copy-on-write. This change reduces the overhead of the TargetAssignmentBuilder when run with the uniform (homogenous) assignor.
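
A minimal copy-on-write sketch of the idea, with illustrative names (not the actual assignor code):
```
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class CopyOnWriteAssignment {
    static Map<String, Set<Integer>> update(Map<String, Set<Integer>> current,
                                            String member, Set<Integer> newPartitions) {
        Set<Integer> existing = current.get(member);
        if (newPartitions.equals(existing)) {
            return current;                              // unchanged: no copy at all
        }
        Map<String, Set<Integer>> copy = new HashMap<>(current); // copy only when an update is needed
        copy.put(member, newPartitions);
        return copy;
    }
}
```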

Trunk:

```
Benchmark                                     (memberCount)  (partitionsToMemberRatio)  (topicCount)  Mode  Cnt   Score   Error  Units
TargetAssignmentBuilderBenchmark.build                10000                         10           100  avgt    5  24.535 ± 1.583  ms/op
TargetAssignmentBuilderBenchmark.build                10000                         10          1000  avgt    5  24.094 ± 0.223  ms/op
JMH benchmarks done
```

```
Benchmark                                       (assignmentType)  (assignorType)  (isRackAware)  (memberCount)  (partitionsToMemberRatio)  (subscriptionType)  (topicCount)  Mode  Cnt   Score   Error  Units
ServerSideAssignorBenchmark.doAssignment             INCREMENTAL         UNIFORM          false          10000                         10         HOMOGENEOUS           100  avgt    5  14.697 ± 0.133  ms/op
ServerSideAssignorBenchmark.doAssignment             INCREMENTAL         UNIFORM          false          10000                         10         HOMOGENEOUS          1000  avgt    5  15.073 ± 0.135  ms/op
JMH benchmarks done
```

Patch:

```
Benchmark                                     (memberCount)  (partitionsToMemberRatio)  (topicCount)  Mode  Cnt  Score   Error  Units
TargetAssignmentBuilderBenchmark.build                10000                         10           100  avgt    5  3.376 ± 0.577  ms/op
TargetAssignmentBuilderBenchmark.build                10000                         10          1000  avgt    5  3.731 ± 0.359  ms/op
JMH benchmarks done
```

```
Benchmark                                       (assignmentType)  (assignorType)  (isRackAware)  (memberCount)  (partitionsToMemberRatio)  (subscriptionType)  (topicCount)  Mode  Cnt  Score   Error  Units
ServerSideAssignorBenchmark.doAssignment             INCREMENTAL         UNIFORM          false          10000                         10         HOMOGENEOUS           100  avgt    5  1.975 ± 0.086  ms/op
ServerSideAssignorBenchmark.doAssignment             INCREMENTAL         UNIFORM          false          10000                         10         HOMOGENEOUS          1000  avgt    5  2.026 ± 0.190  ms/op
JMH benchmarks done
```

Reviewers: Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-05-31 22:18:44 +02:00
David Jacot 5257451646 KAFKA-16860; [2/2] Introduce group.version feature flag (#16149)
This patch updates the system tests to correctly enable the new consumer protocol/coordinator in the tests requiring them.

I went with the simplest approach for now. Long term, I think that we should refactor the tests to better handle features and non-production features.

I got a successful run of the consumer system tests with this patch combined with https://github.com/apache/kafka/pull/16120: https://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1717155071--dajac--KAFKA-16860-2--29028ae0dd/2024-05-31--001./2024-05-31--001./report.html.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-31 21:50:54 +02:00
David Jacot 59ba28f2e7 KAFKA-16860; [1/2] Introduce group.version feature flag (#16120)
This patch introduces the `group.version` feature flag with one version:
1) Version 1 enables the new consumer group rebalance protocol (KIP-848).

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-31 21:50:46 +02:00
624 changed files with 21369 additions and 9515 deletions

.github/CODEOWNERS vendored
View File

@ -13,4 +13,4 @@
# See the License for the specific language governing permissions and
# limitations under the License.
* @superhx @SCNieh @ShadowySpirits @Chillax-0v0
* @superhx @SCNieh @Chillax-0v0 @Gezi-lzq

View File

@ -1,6 +1,7 @@
name: Docker Release
on:
workflow_dispatch:
push:
tags:
- '[0-9]+.[0-9]+.[0-9]+'
@ -12,7 +13,7 @@ jobs:
name: Docker Image Release
strategy:
matrix:
platform: [ "ubuntu-22.04" ]
platform: [ "ubuntu-24.04" ]
jdk: ["17"]
runs-on: ${{ matrix.platform }}
permissions:
@ -69,4 +70,4 @@ jobs:
context: ./docker
push: true
tags: ${{ steps.image_tags.outputs.tags }}
platforms: linux/amd64,linux/arm64
platforms: linux/amd64,linux/arm64

View File

@ -0,0 +1,41 @@
name: E2E Docker Release
on:
workflow_dispatch:
jobs:
docker-release:
name: Docker Image Release
strategy:
matrix:
platform: [ "ubuntu-22.04" ]
jdk: ["17"]
runs-on: ${{ matrix.platform }}
permissions:
contents: write
steps:
- name: Checkout Code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Determine Image Tags
id: image_tags
run: |
TAG=$(grep default_jdk tests/docker/ducker-ak | grep kos_e2e_base | awk -F ':|"' '{print $3}')
echo "tags=${{ secrets.DOCKERHUB_USERNAME }}/kos_e2e_base:$TAG" >> $GITHUB_OUTPUT
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_READ_WRITE_TOKEN }}
- name: Build and push
uses: docker/build-push-action@v5
with:
file: ./tests/docker/base-Dockerfile
push: true
tags: ${{ steps.image_tags.outputs.tags }}
platforms: linux/amd64,linux/arm64

View File

@ -14,25 +14,44 @@ on:
test-path:
required: false
type: string
storage-path:
required: true
type: string
outputs:
artifact-id:
description: "Artifact ID of the test results"
value: ${{ jobs.run_e2e.outputs.artifact-id }}
success-num:
description: "Number of successful tests"
value: ${{ jobs.run_e2e.outputs.success-num }}
failure-num:
description: "Number of failed tests"
value: ${{ jobs.run_e2e.outputs.failure-num }}
run-time-secs:
description: "Total run time in seconds"
value: ${{ jobs.run_e2e.outputs.run-time-secs }}
jobs:
run_e2e:
name: "Run E2E tests"
runs-on: ${{ inputs.runner }}
env:
TC_GENERAL_MIRROR_URL: "mirrors.ustc.edu.cn"
# Map the job outputs to step outputs
outputs:
artifact-id: ${{ steps.archive-artifacts.outputs.artifact-id }}
success-num: ${{ steps.extract-results.outputs.success-num }}
failure-num: ${{ steps.extract-results.outputs.failure-num }}
run-time-secs: ${{ steps.extract-results.outputs.run-time-secs }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Clean last running results
run: |
rm -rf results
rm -rf "${{ inputs.storage-path }}/${{ inputs.suite-id }}"
- uses: actions/setup-java@v4
with:
distribution: 'temurin'
java-version: '17'
cache: 'gradle'
- name: Setup Gradle
uses: gradle/gradle-build-action@v2.12.0
uses: gradle/actions/setup-gradle@v3
- name: Run E2E tests with yaml
if: ${{ inputs.test-path == '' }}
run: ./tests/docker/run_tests.sh
@ -45,13 +64,33 @@ jobs:
env:
TC_PATHS: ${{ inputs.test-path }}
shell: bash
- name: Move results
- name: Extract results
id: extract-results
run: |
results_path="$(pwd)/results/$(readlink results/latest | cut -d'/' -f5)"
mv "${results_path}" "${{ inputs.storage-path }}/${{ inputs.suite-id }}"
echo "success-num=$(jq .num_passed $results_path/report.json)" >> $GITHUB_OUTPUT
echo "failure-num=$(jq .num_failed $results_path/report.json)" >> $GITHUB_OUTPUT
echo "run-time-secs=$(jq .run_time_seconds $results_path/report.json)" >> $GITHUB_OUTPUT
if: ${{ always() }}
shell: bash
- name: Archive result artifacts
id: archive-artifacts
uses: actions/upload-artifact@v4
if: ${{ always() }}
with:
name: ${{ inputs.suite-id }}
retention-days: 3
compression-level: 1
path: |
results/*/report*
- name: show results
run: |
echo "success-num=${{ steps.extract-results.outputs.success-num }}"
echo "failure-num=${{ steps.extract-results.outputs.failure-num }}"
echo "run-time-secs=${{ steps.extract-results.outputs.run-time-secs }}"
echo "artifact-id=${{ steps.archive-artifacts.outputs.artifact-id }}"
if: ${{ always() }}
- name: Bring down docker containers
run: ./tests/docker/ducker-ak down
shell: bash
if: ${{ always() }}
if: ${{ always() }}

View File

@ -33,29 +33,33 @@ jobs:
run: |
./gradlew -Pprefix=automq-${{ github.ref_name }}_ --build-cache --refresh-dependencies clean releaseTarGz
mkdir -p core/build/distributions/latest
LATEST_TAG=$(git tag --sort=-v:refname | grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' | head -n 1)
if [ "$LATEST_TAG" == "${{ github.ref_name }}" ]; then
IS_LATEST=true
fi
echo "IS_LATEST=$IS_LATEST" >> $GITHUB_OUTPUT
for file in core/build/distributions/automq-*.tgz; do
if [[ ! "$file" =~ site-docs ]]; then
echo "Find latest tgz file: $file"
cp "$file" core/build/distributions/latest/automq-kafka-latest.tgz
break
if [ "$IS_LATEST" = "true" ]; then
echo "Find latest tgz file: $file"
cp "$file" core/build/distributions/latest/automq-kafka-latest.tgz
fi
else
echo "Skip and remove site-docs file: $file"
rm "$file"
fi
done
- uses: jakejarvis/s3-sync-action@master
name: s3-upload-latest
if: ${{ github.repository_owner == 'AutoMQ' }}
- uses: tvrcgo/oss-action@master
name: upload-latest
if: ${{ github.repository_owner == 'AutoMQ' && env.IS_LATEST == 'true' }}
with:
args: --follow-symlinks --delete
env:
AWS_S3_BUCKET: ${{ secrets.AWS_CN_PROD_BUCKET }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_CN_PROD_AK }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_CN_PROD_SK }}
AWS_REGION: 'cn-northwest-1'
SOURCE_DIR: 'core/build/distributions/latest'
DEST_DIR: 'community_edition/artifacts'
bucket: ${{ secrets.UPLOAD_BUCKET }}
key-id: ${{ secrets.UPLOAD_BUCKET_AK }}
key-secret: ${{ secrets.UPLOAD_BUCKET_SK }}
region: 'oss-cn-hangzhou'
assets: |
core/build/distributions/latest/automq-kafka-latest.tgz:community_edition/artifacts/automq-kafka-latest.tgz
- name: GitHub Release
uses: softprops/action-gh-release@v1

View File

@ -12,8 +12,7 @@ jobs:
with:
suite-id: "benchmarks"
test-path: "tests/kafkatest/benchmarks"
storage-path: "/data/github-actions/reports"
runner: "extra"
runner: "e2e"
connect_e2e_1:
name: "Run connect E2E Tests 1"
uses: ./.github/workflows/e2e-run.yml
@ -21,8 +20,7 @@ jobs:
with:
suite-id: "connect1"
test-yaml: "tests/suites/connect_test_suite1.yml"
storage-path: "/data/github-actions/reports"
runner: "extra"
runner: "e2e"
connect_e2e_2:
name: "Run connect E2E Tests 2"
uses: ./.github/workflows/e2e-run.yml
@ -30,8 +28,7 @@ jobs:
with:
suite-id: "connect2"
test-yaml: "tests/suites/connect_test_suite2.yml"
storage-path: "/data/github-actions/reports"
runner: "extra"
runner: "e2e"
connect_e2e_3:
name: "Run connect E2E Tests 3"
uses: ./.github/workflows/e2e-run.yml
@ -39,8 +36,7 @@ jobs:
with:
suite-id: "connect3"
test-yaml: "tests/suites/connect_test_suite3.yml"
storage-path: "/data/github-actions/reports"
runner: "extra"
runner: "e2e"
streams_e2e:
name: "Run streams E2E Tests"
uses: ./.github/workflows/e2e-run.yml
@ -48,18 +44,18 @@ jobs:
with:
suite-id: "streams"
test-path: "tests/kafkatest/tests/streams"
storage-path: "/data/github-actions/reports"
runner: "extra"
runner: "e2e"
e2e_summary:
name: "E2E Tests Summary"
runs-on: [ self-hosted, extra ]
runs-on: "e2e"
if: ${{ always() && github.repository_owner == 'AutoMQ' }}
needs: [ benchmarks_e2e, connect_e2e_1, connect_e2e_2, connect_e2e_3, streams_e2e ]
steps:
- name: Report results
run: python3 tests/report_e2e_results.py
env:
CURRENT_REPO: ${{ github.repository }}
RUN_ID: ${{ github.run_id }}
WEB_HOOK_URL: ${{ secrets.E2E_REPORT_WEB_HOOK_URL }}
SHOW_RESULTS_URL: ${{ secrets.E2E_REPORT_SHOW_RESULTS_URL2 }}
STORAGE_PATH: "/data/github-actions/reports"
REPORT_TITLE_PREFIX: "Extra"
DATA_MAP: "{\"benchmarks_e2e\": ${{ toJSON(needs.benchmarks_e2e.outputs) }}, \"connect_e2e_1\": ${{ toJSON(needs.connect_e2e_1.outputs) }}, \"connect_e2e_2\": ${{ toJSON(needs.connect_e2e_2.outputs) }}, \"connect_e2e_3\": ${{ toJSON(needs.connect_e2e_3.outputs) }}, \"streams_e2e\": ${{ toJSON(needs.streams_e2e.outputs) }}}"
REPORT_TITLE_PREFIX: "Extra"

View File

@ -12,8 +12,7 @@ jobs:
with:
suite-id: "main1"
test-yaml: "tests/suites/main_kos_test_suite1.yml"
storage-path: "/data/github-actions/reports"
runner: "main"
runner: "e2e"
main_e2e_2:
name: "Run Main E2E Tests 2"
uses: ./.github/workflows/e2e-run.yml
@ -21,8 +20,7 @@ jobs:
with:
suite-id: "main2"
test-yaml: "tests/suites/main_kos_test_suite2.yml"
storage-path: "/data/github-actions/reports"
runner: "main"
runner: "e2e"
main_e2e_3:
name: "Run Main E2E Tests 3"
uses: ./.github/workflows/e2e-run.yml
@ -30,8 +28,7 @@ jobs:
with:
suite-id: "main3"
test-yaml: "tests/suites/main_kos_test_suite3.yml"
storage-path: "/data/github-actions/reports"
runner: "main"
runner: "e2e"
main_e2e_4:
name: "Run Main E2E Tests 4"
uses: ./.github/workflows/e2e-run.yml
@ -39,36 +36,26 @@ jobs:
with:
suite-id: "main4"
test-yaml: "tests/suites/main_kos_test_suite4.yml"
storage-path: "/data/github-actions/reports"
runner: "main"
runner: "e2e"
main_e2e_5:
name: "Run Main E2E Tests 5"
uses: ./.github/workflows/e2e-run.yml
if: ${{ github.repository_owner == 'AutoMQ' }}
with:
suite-id: "main5"
test-yaml: "tests/suites/main_kos_test_suite5.yml"
storage-path: "/data/github-actions/reports"
runner: "main"
main_e2e_6:
name: "Run Main E2E Tests 6"
uses: ./.github/workflows/e2e-run.yml
if: ${{ github.repository_owner == 'AutoMQ' }}
with:
suite-id: "main6"
test-yaml: "tests/suites/main_kos_test_suite6.yml"
storage-path: "/data/github-actions/reports"
runner: "main"
test-path: "tests/kafkatest/automq"
runner: "e2e"
e2e_summary:
runs-on: "e2e"
name: "E2E Tests Summary"
runs-on: [ self-hosted, main ]
if: ${{ always() && github.repository_owner == 'AutoMQ' }}
needs: [ main_e2e_1, main_e2e_2, main_e2e_3, main_e2e_4, main_e2e_5, main_e2e_6 ]
needs: [ main_e2e_1, main_e2e_2, main_e2e_3, main_e2e_4, main_e2e_5 ]
steps:
- name: Report results
run: python3 tests/report_e2e_results.py
env:
CURRENT_REPO: ${{ github.repository }}
RUN_ID: ${{ github.run_id }}
WEB_HOOK_URL: ${{ secrets.E2E_REPORT_WEB_HOOK_URL }}
SHOW_RESULTS_URL: ${{ secrets.E2E_REPORT_SHOW_RESULTS_URL }}
STORAGE_PATH: "/data/github-actions/reports"
DATA_MAP: "{\"main_e2e_1\": ${{ toJSON(needs.main_e2e_1.outputs) }}, \"main_e2e_2\": ${{ toJSON(needs.main_e2e_2.outputs) }}, \"main_e2e_3\": ${{ toJSON(needs.main_e2e_3.outputs) }}, \"main_e2e_4\": ${{ toJSON(needs.main_e2e_4.outputs) }}, \"main_e2e_5\": ${{ toJSON(needs.main_e2e_5.outputs) }}}"
REPORT_TITLE_PREFIX: "Main"

View File

@ -129,7 +129,7 @@ public class Deploy implements Callable<Integer> {
private static String genServerStartupCmd(ClusterTopology topo, Node node) {
StringBuilder sb = new StringBuilder();
appendEnvs(sb, topo);
sb.append("./bin/kafka-server-start.sh config/kraft/server.properties ");
sb.append("./bin/kafka-server-start.sh -daemon config/kraft/server.properties ");
appendCommonConfigsOverride(sb, topo, node);
appendExtConfigsOverride(sb, topo.getGlobal().getConfig());
return sb.toString();
@ -145,7 +145,7 @@ public class Deploy implements Callable<Integer> {
}
private static void appendEnvs(StringBuilder sb, ClusterTopology topo) {
topo.getGlobal().getEnvs().forEach(env -> sb.append(env.getName()).append("=").append(env.getValue()).append(" "));
topo.getGlobal().getEnvs().forEach(env -> sb.append(env.getName()).append("='").append(env.getValue()).append("' "));
}
private static void appendCommonConfigsOverride(StringBuilder sb, ClusterTopology topo, Node node) {
@ -155,7 +155,7 @@ public class Deploy implements Callable<Integer> {
sb.append("--override cluster.id=").append(topo.getGlobal().getClusterId()).append(" ");
sb.append("--override node.id=").append(node.getNodeId()).append(" ");
sb.append("--override controller.quorum.voters=").append(getQuorumVoters(topo)).append(" ");
sb.append("--override advertised.listener=").append(node.getHost()).append(":9092").append(" ");
sb.append("--override advertised.listeners=").append("PLAINTEXT://").append(node.getHost()).append(":9092").append(" ");
}
private static void appendExtConfigsOverride(StringBuilder sb, String rawConfigs) {

View File

@ -12,6 +12,7 @@
package com.automq.shell.log;
import com.automq.shell.AutoMQApplication;
import com.automq.shell.util.Utils;
import com.automq.stream.s3.operator.ObjectStorage;
import com.automq.stream.s3.operator.ObjectStorage.ObjectInfo;
import com.automq.stream.s3.operator.ObjectStorage.ObjectPath;
@ -201,7 +202,7 @@ public class LogUploader implements LogRecorder {
try {
String objectKey = getObjectKey();
objectStorage.write(WriteOptions.DEFAULT, objectKey, uploadBuffer.retainedSlice().asReadOnly()).get();
objectStorage.write(WriteOptions.DEFAULT, objectKey, Utils.compress(uploadBuffer.slice().asReadOnly())).get();
break;
} catch (Exception e) {
e.printStackTrace(System.err);

View File

@ -11,6 +11,7 @@
package com.automq.shell.metrics;
import com.automq.shell.util.Utils;
import com.automq.stream.s3.operator.ObjectStorage;
import com.automq.stream.s3.operator.ObjectStorage.ObjectInfo;
import com.automq.stream.s3.operator.ObjectStorage.ObjectPath;
@ -239,7 +240,7 @@ public class S3MetricsExporter implements MetricExporter {
synchronized (uploadBuffer) {
if (uploadBuffer.readableBytes() > 0) {
try {
objectStorage.write(WriteOptions.DEFAULT, getObjectKey(), uploadBuffer.retainedSlice().asReadOnly()).get();
objectStorage.write(WriteOptions.DEFAULT, getObjectKey(), Utils.compress(uploadBuffer.slice().asReadOnly())).get();
} catch (Exception e) {
LOGGER.error("Failed to upload metrics to s3", e);
return CompletableResultCode.ofFailure();

View File

@ -16,6 +16,7 @@ import org.apache.kafka.clients.ApiVersions;
import org.apache.kafka.clients.ClientUtils;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.ManualMetadataUpdater;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.clients.NetworkClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.metrics.Metrics;
@ -64,7 +65,8 @@ public class CLIUtils {
time,
false,
new ApiVersions(),
logContext
logContext,
MetadataRecoveryStrategy.NONE
);
}
}
}

View File

@ -1,52 +0,0 @@
/*
* Copyright 2024, AutoMQ HK Limited.
*
* The use of this file is governed by the Business Source License,
* as detailed in the file "/LICENSE.S3Stream" included in this repository.
*
* As of the Change Date specified in that file, in accordance with
* the Business Source License, use of this software will be governed
* by the Apache License, Version 2.0
*/
package com.automq.shell.util;
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.util.Enumeration;
import java.util.Properties;
public class S3PropUtil {
public static final String BROKER_PROPS_PATH = "template/broker.properties";
public static final String CONTROLLER_PROPS_PATH = "template/controller.properties";
public static final String SERVER_PROPS_PATH = "template/server.properties";
public static void persist(Properties props, String fileName) throws IOException {
File directory = new File("generated");
if (!directory.exists() && !directory.mkdirs()) {
throw new IOException("Can't create directory " + directory.getAbsolutePath());
}
String targetPath = "generated/" + fileName;
File file = new File(targetPath);
try (PrintWriter pw = new PrintWriter(file, Charset.forName("utf-8"))) {
for (Enumeration e = props.propertyNames(); e.hasMoreElements(); ) {
String key = (String) e.nextElement();
pw.println(key + "=" + props.getProperty(key));
}
}
}
public static Properties loadTemplateProps(String propsPath) throws IOException {
try (var in = S3PropUtil.class.getClassLoader().getResourceAsStream(propsPath)) {
if (in != null) {
Properties props = new Properties();
props.load(in);
return props;
} else {
throw new IOException(String.format("Can not find resource file under path: %s", propsPath));
}
}
}
}

View File

@ -0,0 +1,61 @@
/*
* Copyright 2024, AutoMQ HK Limited.
*
* The use of this file is governed by the Business Source License,
* as detailed in the file "/LICENSE.S3Stream" included in this repository.
*
* As of the Change Date specified in that file, in accordance with
* the Business Source License, use of this software will be governed
* by the Apache License, Version 2.0
*/
package com.automq.shell.util;
import com.automq.stream.s3.ByteBufAlloc;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import io.netty.buffer.ByteBuf;
public class Utils {
public static ByteBuf compress(ByteBuf input) throws IOException {
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
GZIPOutputStream gzipOutputStream = new GZIPOutputStream(byteArrayOutputStream);
byte[] buffer = new byte[input.readableBytes()];
input.readBytes(buffer);
gzipOutputStream.write(buffer);
gzipOutputStream.close();
ByteBuf compressed = ByteBufAlloc.byteBuffer(byteArrayOutputStream.size());
compressed.writeBytes(byteArrayOutputStream.toByteArray());
return compressed;
}
public static ByteBuf decompress(ByteBuf input) throws IOException {
byte[] compressedData = new byte[input.readableBytes()];
input.readBytes(compressedData);
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(compressedData);
GZIPInputStream gzipInputStream = new GZIPInputStream(byteArrayInputStream);
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
int bytesRead;
while ((bytesRead = gzipInputStream.read(buffer)) != -1) {
byteArrayOutputStream.write(buffer, 0, bytesRead);
}
gzipInputStream.close();
byteArrayOutputStream.close();
byte[] uncompressedData = byteArrayOutputStream.toByteArray();
ByteBuf output = ByteBufAlloc.byteBuffer(uncompressedData.length);
output.writeBytes(uncompressedData);
return output;
}
}

View File

@ -0,0 +1,40 @@
/*
* Copyright 2024, AutoMQ HK Limited.
*
* The use of this file is governed by the Business Source License,
* as detailed in the file "/LICENSE.S3Stream" included in this repository.
*
* As of the Change Date specified in that file, in accordance with
* the Business Source License, use of this software will be governed
* by the Apache License, Version 2.0
*/
package com.automq.shell.util;
import com.automq.stream.s3.ByteBufAlloc;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import io.netty.buffer.ByteBuf;
@Tag("S3Unit")
public class UtilsTest {
@Test
public void testCompression() {
String testStr = "This is a test string";
ByteBuf input = ByteBufAlloc.byteBuffer(testStr.length());
input.writeBytes(testStr.getBytes());
try {
ByteBuf compressed = Utils.compress(input);
ByteBuf decompressed = Utils.decompress(compressed);
String decompressedStr = decompressed.toString(io.netty.util.CharsetUtil.UTF_8);
System.out.printf("Original: %s, Decompressed: %s\n", testStr, decompressedStr);
Assertions.assertEquals(testStr, decompressedStr);
} catch (Exception e) {
Assertions.fail("Exception occurred during compression/decompression: " + e.getMessage());
}
}
}

View File

@ -937,6 +937,7 @@ project(':core') {
api libs.scalaLibrary
implementation project(':server-common')
implementation project(':group-coordinator:group-coordinator-api')
implementation project(':group-coordinator')
implementation project(':transaction-coordinator')
implementation project(':metadata')
@ -1367,6 +1368,66 @@ project(':metadata') {
}
}
project(':group-coordinator:group-coordinator-api') {
base {
archivesName = "kafka-group-coordinator-api"
}
dependencies {
implementation project(':clients')
}
task createVersionFile() {
def receiptFile = file("$buildDir/kafka/$buildVersionFileName")
inputs.property "commitId", commitId
inputs.property "version", version
outputs.file receiptFile
doLast {
def data = [
commitId: commitId,
version: version,
]
receiptFile.parentFile.mkdirs()
def content = data.entrySet().collect { "$it.key=$it.value" }.sort().join("\n")
receiptFile.setText(content, "ISO-8859-1")
}
}
sourceSets {
main {
java {
srcDirs = ["src/main/java"]
}
}
test {
java {
srcDirs = ["src/test/java"]
}
}
}
jar {
dependsOn createVersionFile
from("$buildDir") {
include "kafka/$buildVersionFileName"
}
}
clean.doFirst {
delete "$buildDir/kafka/"
}
javadoc {
include "**/org/apache/kafka/coordinator/group/api/**"
}
checkstyle {
configProperties = checkstyleConfigProperties("import-control-group-coordinator.xml")
}
}
project(':group-coordinator') {
base {
archivesName = "kafka-group-coordinator"
@ -1380,6 +1441,7 @@ project(':group-coordinator') {
implementation project(':server-common')
implementation project(':clients')
implementation project(':metadata')
implementation project(':group-coordinator:group-coordinator-api')
implementation project(':storage')
implementation libs.jacksonDatabind
implementation libs.jacksonJDK8Datatypes
@ -2073,6 +2135,7 @@ project(':s3stream') {
implementation 'com.yammer.metrics:metrics-core:2.2.0'
implementation 'commons-codec:commons-codec:1.17.0'
implementation 'org.hdrhistogram:HdrHistogram:2.2.2'
implementation 'software.amazon.awssdk.crt:aws-crt:0.30.8'
testImplementation 'org.slf4j:slf4j-simple:2.0.9'
testImplementation 'org.junit.jupiter:junit-jupiter:5.10.0'
@ -2597,10 +2660,6 @@ project(':streams:test-utils') {
testRuntimeOnly libs.slf4jlog4j
}
javadoc {
include "**/org/apache/kafka/streams/test/**"
}
tasks.create(name: "copyDependantLibs", type: Copy) {
from (configurations.runtimeClasspath) {
exclude('kafka-streams*')
@ -3023,6 +3082,7 @@ project(':jmh-benchmarks') {
implementation project(':raft')
implementation project(':clients')
implementation project(':group-coordinator')
implementation project(':group-coordinator:group-coordinator-api')
implementation project(':metadata')
implementation project(':storage')
implementation project(':streams')

View File

@ -38,6 +38,7 @@
<allow pkg="org.apache.kafka.common" />
<allow pkg="org.mockito" class="AssignmentsManagerTest"/>
<allow pkg="org.apache.kafka.server"/>
<allow pkg="org.opentest4j" class="RemoteLogManagerTest"/>
<!-- see KIP-544 for why KafkaYammerMetrics should be used instead of the global default yammer metrics registry
https://cwiki.apache.org/confluence/display/KAFKA/KIP-544%3A+Make+metrics+exposed+via+JMX+configurable -->
<disallow class="com.yammer.metrics.Metrics" />

View File

@ -443,6 +443,7 @@
<allow pkg="org.apache.kafka.common.message" />
<allow pkg="org.apache.kafka.common.metadata" />
<allow pkg="org.apache.kafka.common.metrics" />
<allow pkg="org.apache.kafka.common.network" />
<allow pkg="org.apache.kafka.common.protocol" />
<allow pkg="org.apache.kafka.common.record" />
<allow pkg="org.apache.kafka.common.requests" />

View File

@ -202,7 +202,7 @@
files="StreamThread.java"/>
<suppress checks="ClassDataAbstractionCoupling"
files="(KafkaStreams|KStreamImpl|KTableImpl).java"/>
files="(InternalTopologyBuilder|KafkaStreams|KStreamImpl|KTableImpl|StreamsPartitionAssignor).java"/>
<suppress checks="CyclomaticComplexity"
files="(KafkaStreams|StreamsPartitionAssignor|StreamThread|TaskManager|PartitionGroup|SubscriptionWrapperSerde|AssignorConfiguration).java"/>
@ -211,7 +211,7 @@
files="StreamsMetricsImpl.java"/>
<suppress checks="NPathComplexity"
files="(KafkaStreams|StreamsPartitionAssignor|StreamThread|TaskManager|GlobalStateManagerImpl|KStreamImplJoin|TopologyConfig|KTableKTableOuterJoin).java"/>
files="(KafkaStreams|StreamsPartitionAssignor|StreamThread|TaskManager|TaskAssignmentUtils|GlobalStateManagerImpl|KStreamImplJoin|TopologyConfig|KTableKTableOuterJoin).java"/>
<suppress checks="(FinalLocalVariable|UnnecessaryParentheses|BooleanExpressionComplexity|CyclomaticComplexity|WhitespaceAfter|LocalVariableName)"
files="Murmur3.java"/>

View File

@ -245,7 +245,9 @@ public final class ClientUtils {
throttleTimeSensor,
logContext,
hostResolver,
clientTelemetrySender);
clientTelemetrySender,
MetadataRecoveryStrategy.forName(config.getString(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG))
);
} catch (Throwable t) {
closeQuietly(selector, "Selector");
closeQuietly(channelBuilder, "ChannelBuilder");

View File

@ -220,6 +220,19 @@ public class CommonClientConfigs {
public static final String DEFAULT_API_TIMEOUT_MS_DOC = "Specifies the timeout (in milliseconds) for client APIs. " +
"This configuration is used as the default timeout for all client operations that do not specify a <code>timeout</code> parameter.";
public static final String METADATA_RECOVERY_STRATEGY_CONFIG = "metadata.recovery.strategy";
public static final String METADATA_RECOVERY_STRATEGY_DOC = "Controls how the client recovers when none of the brokers known to it is available. " +
"If set to <code>none</code>, the client fails. If set to <code>rebootstrap</code>, " +
"the client repeats the bootstrap process using <code>bootstrap.servers</code>. " +
"Rebootstrapping is useful when a client communicates with brokers so infrequently " +
"that the set of brokers may change entirely before the client refreshes metadata. " +
"Metadata recovery is triggered when all last-known brokers appear unavailable simultaneously. " +
"Brokers appear unavailable when disconnected and no current retry attempt is in-progress. " +
"Consider increasing <code>reconnect.backoff.ms</code> and <code>reconnect.backoff.max.ms</code> and " +
"decreasing <code>socket.connection.setup.timeout.ms</code> and <code>socket.connection.setup.timeout.max.ms</code> " +
"for the client.";
public static final String DEFAULT_METADATA_RECOVERY_STRATEGY = MetadataRecoveryStrategy.NONE.name;
/**
* Postprocess the configuration so that exponential backoff is disabled when reconnect backoff
* is explicitly configured but the maximum reconnect backoff is not explicitly configured.

View File

@ -130,7 +130,7 @@ public interface KafkaClient extends Closeable {
* @param now The current time in ms
* @return The node with the fewest in-flight requests.
*/
Node leastLoadedNode(long now);
LeastLoadedNode leastLoadedNode(long now);
/**
* The number of currently in-flight requests for which we have not yet returned a response

View File

@ -0,0 +1,43 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.kafka.clients;
import org.apache.kafka.common.Node;
public class LeastLoadedNode {
private final Node node;
private final boolean atLeastOneConnectionReady;
public LeastLoadedNode(Node node, boolean atLeastOneConnectionReady) {
this.node = node;
this.atLeastOneConnectionReady = atLeastOneConnectionReady;
}
public Node node() {
return node;
}
/**
* Indicates whether the least loaded node is available or at least one ready connection exists.
*
* <p>There may be no node available even though ready connections to live nodes exist, for example when
* those connections are saturated with in-flight requests. This method takes that case into account.
*/
public boolean hasNodeAvailableOrConnectionReady() {
return node != null || atLeastOneConnectionReady;
}
}
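As an illustrative sketch (not part of the patch) of how the new wrapper distinguishes "every broker looks down" from "connections exist but are saturated", which is exactly what the rebootstrap check later in this diff relies on:

    import org.apache.kafka.clients.LeastLoadedNode;
    import org.apache.kafka.common.Node;

    public class LeastLoadedNodeExample {
        public static void main(String[] args) {
            Node broker = new Node(0, "broker-1", 9092); // placeholder node
            // A sendable node was found: clearly available.
            System.out.println(new LeastLoadedNode(broker, true).hasNodeAvailableOrConnectionReady());  // true
            // No sendable node, but at least one ready (just saturated) connection: no rebootstrap needed.
            System.out.println(new LeastLoadedNode(null, true).hasNodeAvailableOrConnectionReady());    // true
            // No node and no ready connection: the rebootstrap candidate case.
            System.out.println(new LeastLoadedNode(null, false).hasNodeAvailableOrConnectionReady());   // false
        }
    }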

View File

@ -82,6 +82,8 @@ public class Metadata implements Closeable {
private final ClusterResourceListeners clusterResourceListeners;
private boolean isClosed;
private final Map<TopicPartition, Integer> lastSeenLeaderEpochs;
/** Addresses with which the metadata was originally bootstrapped. */
private List<InetSocketAddress> bootstrapAddresses;
/**
* Create a new Metadata instance
@ -304,6 +306,12 @@ public class Metadata implements Closeable {
this.needFullUpdate = true;
this.updateVersion += 1;
this.metadataSnapshot = MetadataSnapshot.bootstrap(addresses);
this.bootstrapAddresses = addresses;
}
public synchronized void rebootstrap() {
log.info("Rebootstrapping with {}", this.bootstrapAddresses);
this.bootstrap(this.bootstrapAddresses);
}
/**

View File

@ -0,0 +1,44 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.kafka.clients;
import java.util.Locale;
/**
* Defines the strategies which clients can follow to deal with the situation when none of the known nodes is available.
*/
public enum MetadataRecoveryStrategy {
NONE("none"),
REBOOTSTRAP("rebootstrap");
public final String name;
MetadataRecoveryStrategy(String name) {
this.name = name;
}
public static MetadataRecoveryStrategy forName(String name) {
if (name == null) {
throw new IllegalArgumentException("Illegal MetadataRecoveryStrategy: null");
}
try {
return MetadataRecoveryStrategy.valueOf(name.toUpperCase(Locale.ROOT));
} catch (IllegalArgumentException e) {
throw new IllegalArgumentException("Illegal MetadataRecoveryStrategy: " + name);
}
}
}
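A quick, illustrative sketch of how the parser above behaves: lookup is case-insensitive, and null or unknown values are rejected with an IllegalArgumentException:

    import org.apache.kafka.clients.MetadataRecoveryStrategy;

    public class MetadataRecoveryStrategyExample {
        public static void main(String[] args) {
            System.out.println(MetadataRecoveryStrategy.forName("rebootstrap")); // REBOOTSTRAP
            System.out.println(MetadataRecoveryStrategy.forName("NONE"));        // NONE (case-insensitive)
            try {
                MetadataRecoveryStrategy.forName("bogus");
            } catch (IllegalArgumentException e) {
                System.out.println(e.getMessage()); // Illegal MetadataRecoveryStrategy: bogus
            }
        }
    }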

View File

@ -114,6 +114,8 @@ public class NetworkClient implements KafkaClient {
/* time in ms to wait before retrying to create connection to a server */
private final long reconnectBackoffMs;
private final MetadataRecoveryStrategy metadataRecoveryStrategy;
private final Time time;
/**
@ -147,7 +149,8 @@ public class NetworkClient implements KafkaClient {
Time time,
boolean discoverBrokerVersions,
ApiVersions apiVersions,
LogContext logContext) {
LogContext logContext,
MetadataRecoveryStrategy metadataRecoveryStrategy) {
this(selector,
metadata,
clientId,
@ -163,7 +166,8 @@ public class NetworkClient implements KafkaClient {
discoverBrokerVersions,
apiVersions,
null,
logContext);
logContext,
metadataRecoveryStrategy);
}
public NetworkClient(Selectable selector,
@ -181,7 +185,8 @@ public class NetworkClient implements KafkaClient {
boolean discoverBrokerVersions,
ApiVersions apiVersions,
Sensor throttleTimeSensor,
LogContext logContext) {
LogContext logContext,
MetadataRecoveryStrategy metadataRecoveryStrategy) {
this(null,
metadata,
selector,
@ -200,7 +205,8 @@ public class NetworkClient implements KafkaClient {
throttleTimeSensor,
logContext,
new DefaultHostResolver(),
null);
null,
metadataRecoveryStrategy);
}
public NetworkClient(Selectable selector,
@ -217,7 +223,8 @@ public class NetworkClient implements KafkaClient {
Time time,
boolean discoverBrokerVersions,
ApiVersions apiVersions,
LogContext logContext) {
LogContext logContext,
MetadataRecoveryStrategy metadataRecoveryStrategy) {
this(metadataUpdater,
null,
selector,
@ -236,7 +243,8 @@ public class NetworkClient implements KafkaClient {
null,
logContext,
new DefaultHostResolver(),
null);
null,
metadataRecoveryStrategy);
}
public NetworkClient(MetadataUpdater metadataUpdater,
@ -257,7 +265,8 @@ public class NetworkClient implements KafkaClient {
Sensor throttleTimeSensor,
LogContext logContext,
HostResolver hostResolver,
ClientTelemetrySender clientTelemetrySender) {
ClientTelemetrySender clientTelemetrySender,
MetadataRecoveryStrategy metadataRecoveryStrategy) {
/* It would be better if we could pass `DefaultMetadataUpdater` from the public constructor, but it's not
* possible because `DefaultMetadataUpdater` is an inner class and it can only be instantiated after the
* super constructor is invoked.
@ -288,6 +297,7 @@ public class NetworkClient implements KafkaClient {
this.log = logContext.logger(NetworkClient.class);
this.state = new AtomicReference<>(State.ACTIVE);
this.telemetrySender = (clientTelemetrySender != null) ? new TelemetrySender(clientTelemetrySender) : null;
this.metadataRecoveryStrategy = metadataRecoveryStrategy;
}
/**
@ -695,7 +705,7 @@ public class NetworkClient implements KafkaClient {
* @return The node with the fewest in-flight requests.
*/
@Override
public Node leastLoadedNode(long now) {
public LeastLoadedNode leastLoadedNode(long now) {
List<Node> nodes = this.metadataUpdater.fetchNodes();
if (nodes.isEmpty())
throw new IllegalStateException("There are no nodes in the Kafka cluster");
@ -705,16 +715,25 @@ public class NetworkClient implements KafkaClient {
Node foundCanConnect = null;
Node foundReady = null;
boolean atLeastOneConnectionReady = false;
int offset = this.randOffset.nextInt(nodes.size());
for (int i = 0; i < nodes.size(); i++) {
int idx = (offset + i) % nodes.size();
Node node = nodes.get(idx);
if (!atLeastOneConnectionReady
&& connectionStates.isReady(node.idString(), now)
&& selector.isChannelReady(node.idString())) {
atLeastOneConnectionReady = true;
}
if (canSendRequest(node.idString(), now)) {
int currInflight = this.inFlightRequests.count(node.idString());
if (currInflight == 0) {
// if we find an established connection with no in-flight requests we can stop right away
log.trace("Found least loaded node {} connected with no in-flight requests", node);
return node;
return new LeastLoadedNode(node, true);
} else if (currInflight < inflight) {
// otherwise if this is the best we have found so far, record that
inflight = currInflight;
@ -738,16 +757,16 @@ public class NetworkClient implements KafkaClient {
// which are being established before connecting to new nodes.
if (foundReady != null) {
log.trace("Found least loaded node {} with {} inflight requests", foundReady, inflight);
return foundReady;
return new LeastLoadedNode(foundReady, atLeastOneConnectionReady);
} else if (foundConnecting != null) {
log.trace("Found least loaded connecting node {}", foundConnecting);
return foundConnecting;
return new LeastLoadedNode(foundConnecting, atLeastOneConnectionReady);
} else if (foundCanConnect != null) {
log.trace("Found least loaded node {} with no active connection", foundCanConnect);
return foundCanConnect;
return new LeastLoadedNode(foundCanConnect, atLeastOneConnectionReady);
} else {
log.trace("Least loaded node selection failed to find an available node");
return null;
return new LeastLoadedNode(null, atLeastOneConnectionReady);
}
}
@ -1122,13 +1141,22 @@ public class NetworkClient implements KafkaClient {
// Beware that the behavior of this method and the computation of timeouts for poll() are
// highly dependent on the behavior of leastLoadedNode.
Node node = leastLoadedNode(now);
if (node == null) {
LeastLoadedNode leastLoadedNode = leastLoadedNode(now);
// Rebootstrap if needed and configured.
if (metadataRecoveryStrategy == MetadataRecoveryStrategy.REBOOTSTRAP
&& !leastLoadedNode.hasNodeAvailableOrConnectionReady()) {
metadata.rebootstrap();
leastLoadedNode = leastLoadedNode(now);
}
if (leastLoadedNode.node() == null) {
log.debug("Give up sending metadata request since no node is available");
return reconnectBackoffMs;
}
return maybeUpdate(now, node);
return maybeUpdate(now, leastLoadedNode.node());
}
@Override
@ -1266,7 +1294,7 @@ public class NetworkClient implements KafkaClient {
// Per KIP-714, let's continue to re-use the same broker for as long as possible.
if (stickyNode == null) {
stickyNode = leastLoadedNode(now);
stickyNode = leastLoadedNode(now).node();
if (stickyNode == null) {
log.debug("Give up sending telemetry request since no node is available");
return reconnectBackoffMs;

View File

@ -911,7 +911,7 @@ public interface Admin extends AutoCloseable {
* List the consumer groups available in the cluster.
*
* @param options The options to use when listing the consumer groups.
* @return The ListGroupsResult.
* @return The ListConsumerGroupsResult.
*/
ListConsumerGroupsResult listConsumerGroups(ListConsumerGroupsOptions options);
@ -921,7 +921,7 @@ public interface Admin extends AutoCloseable {
* This is a convenience method for {@link #listConsumerGroups(ListConsumerGroupsOptions)} with default options.
* See the overload for more details.
*
* @return The ListGroupsResult.
* @return The ListConsumerGroupsResult.
*/
default ListConsumerGroupsResult listConsumerGroups() {
return listConsumerGroups(new ListConsumerGroupsOptions());
@ -931,7 +931,7 @@ public interface Admin extends AutoCloseable {
* List the consumer group offsets available in the cluster.
*
* @param options The options to use when listing the consumer group offsets.
* @return The ListGroupOffsetsResult
* @return The ListConsumerGroupOffsetsResult
*/
default ListConsumerGroupOffsetsResult listConsumerGroupOffsets(String groupId, ListConsumerGroupOffsetsOptions options) {
@SuppressWarnings("deprecation")
@ -949,7 +949,7 @@ public interface Admin extends AutoCloseable {
* This is a convenience method for {@link #listConsumerGroupOffsets(Map, ListConsumerGroupOffsetsOptions)}
* to list offsets of all partitions of one group with default options.
*
* @return The ListGroupOffsetsResult.
* @return The ListConsumerGroupOffsetsResult.
*/
default ListConsumerGroupOffsetsResult listConsumerGroupOffsets(String groupId) {
return listConsumerGroupOffsets(groupId, new ListConsumerGroupOffsetsOptions());
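For reference, a short caller-side sketch of the convenience overloads whose Javadoc is corrected above; the bootstrap address and group id are placeholders, and a reachable cluster is assumed:

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.ConsumerGroupListing;
    import org.apache.kafka.clients.admin.ListConsumerGroupOffsetsResult;
    import org.apache.kafka.clients.admin.ListConsumerGroupsResult;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    import java.util.Collection;
    import java.util.Map;
    import java.util.Properties;

    public class ListGroupsExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092"); // placeholder
            try (Admin admin = Admin.create(props)) {
                ListConsumerGroupsResult groups = admin.listConsumerGroups(); // default options
                Collection<ConsumerGroupListing> listings = groups.all().get();
                listings.forEach(g -> System.out.println(g.groupId()));

                ListConsumerGroupOffsetsResult offsets = admin.listConsumerGroupOffsets("my-group"); // placeholder group
                Map<TopicPartition, OffsetAndMetadata> byPartition = offsets.partitionsToOffsetAndMetadata().get();
                byPartition.forEach((tp, om) -> System.out.println(tp + " -> " + om.offset()));
            }
        }
    }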

View File

@ -19,6 +19,7 @@ package org.apache.kafka.clients.admin;
import org.apache.kafka.clients.ClientDnsLookup;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.common.config.AbstractConfig;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
@ -139,6 +140,10 @@ public class AdminClientConfig extends AbstractConfig {
public static final String RETRIES_CONFIG = CommonClientConfigs.RETRIES_CONFIG;
public static final String DEFAULT_API_TIMEOUT_MS_CONFIG = CommonClientConfigs.DEFAULT_API_TIMEOUT_MS_CONFIG;
public static final String METADATA_RECOVERY_STRATEGY_CONFIG = CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG;
public static final String METADATA_RECOVERY_STRATEGY_DOC = CommonClientConfigs.METADATA_RECOVERY_STRATEGY_DOC;
public static final String DEFAULT_METADATA_RECOVERY_STRATEGY = CommonClientConfigs.DEFAULT_METADATA_RECOVERY_STRATEGY;
/**
* <code>security.providers</code>
*/
@ -262,7 +267,14 @@ public class AdminClientConfig extends AbstractConfig {
Importance.MEDIUM,
SECURITY_PROTOCOL_DOC)
.withClientSslSupport()
.withClientSaslSupport();
.withClientSaslSupport()
.define(METADATA_RECOVERY_STRATEGY_CONFIG,
Type.STRING,
DEFAULT_METADATA_RECOVERY_STRATEGY,
ConfigDef.CaseInsensitiveValidString
.in(Utils.enumOptions(MetadataRecoveryStrategy.class)),
Importance.LOW,
METADATA_RECOVERY_STRATEGY_DOC);
}
@Override

View File

@ -25,6 +25,8 @@ import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.DefaultHostResolver;
import org.apache.kafka.clients.HostResolver;
import org.apache.kafka.clients.KafkaClient;
import org.apache.kafka.clients.LeastLoadedNode;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.clients.NetworkClient;
import org.apache.kafka.clients.StaleMetadataException;
import org.apache.kafka.clients.admin.CreateTopicsResult.TopicMetadataAndConfig;
@ -277,7 +279,6 @@ import java.util.Optional;
import java.util.OptionalLong;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
@ -400,6 +401,7 @@ public class KafkaAdminClient extends AdminClient {
private final long retryBackoffMaxMs;
private final ExponentialBackoff retryBackoff;
private final boolean clientTelemetryEnabled;
private final MetadataRecoveryStrategy metadataRecoveryStrategy;
/**
* The telemetry requests client instance id.
@ -613,6 +615,7 @@ public class KafkaAdminClient extends AdminClient {
retryBackoffMaxMs,
CommonClientConfigs.RETRY_BACKOFF_JITTER);
this.clientTelemetryEnabled = config.getBoolean(AdminClientConfig.ENABLE_METRICS_PUSH_CONFIG);
this.metadataRecoveryStrategy = MetadataRecoveryStrategy.forName(config.getString(AdminClientConfig.METADATA_RECOVERY_STRATEGY_CONFIG));
config.logUnused();
AppInfoParser.registerAppInfo(JMX_PREFIX, clientId, metrics, time.milliseconds());
log.debug("Kafka admin client initialized");
@ -699,7 +702,13 @@ public class KafkaAdminClient extends AdminClient {
private class MetadataUpdateNodeIdProvider implements NodeProvider {
@Override
public Node provide() {
return client.leastLoadedNode(time.milliseconds());
LeastLoadedNode leastLoadedNode = client.leastLoadedNode(time.milliseconds());
if (metadataRecoveryStrategy == MetadataRecoveryStrategy.REBOOTSTRAP
&& !leastLoadedNode.hasNodeAvailableOrConnectionReady()) {
metadataManager.rebootstrap(time.milliseconds());
}
return leastLoadedNode.node();
}
@Override
@ -781,7 +790,7 @@ public class KafkaAdminClient extends AdminClient {
if (metadataManager.isReady()) {
// This may return null if all nodes are busy.
// In that case, we will postpone node assignment.
return client.leastLoadedNode(time.milliseconds());
return client.leastLoadedNode(time.milliseconds()).node();
}
metadataManager.requestUpdate();
return null;
@ -836,7 +845,7 @@ public class KafkaAdminClient extends AdminClient {
} else {
// This may return null if all nodes are busy.
// In that case, we will postpone node assignment.
return client.leastLoadedNode(time.milliseconds());
return client.leastLoadedNode(time.milliseconds()).node();
}
}
metadataManager.requestUpdate();
@ -2130,7 +2139,7 @@ public class KafkaAdminClient extends AdminClient {
throw new IllegalArgumentException("The TopicCollection: " + topics + " provided did not match any supported classes for describeTopics.");
}
Call generateDescribeTopicsCallWithMetadataApi(
private Call generateDescribeTopicsCallWithMetadataApi(
List<String> topicNamesList,
Map<String, KafkaFutureImpl<TopicDescription>> topicFutures,
DescribeTopicsOptions options,
@ -2193,7 +2202,7 @@ public class KafkaAdminClient extends AdminClient {
};
}
Call generateDescribeTopicsCallWithDescribeTopicPartitionsApi(
private Call generateDescribeTopicsCallWithDescribeTopicPartitionsApi(
List<String> topicNamesList,
Map<String, KafkaFutureImpl<TopicDescription>> topicFutures,
Map<Integer, Node> nodes,
@ -2247,7 +2256,7 @@ public class KafkaAdminClient extends AdminClient {
continue;
}
TopicDescription currentTopicDescription = getTopicDescriptionFromDescribeTopicsResponseTopic(topic, nodes);
TopicDescription currentTopicDescription = getTopicDescriptionFromDescribeTopicsResponseTopic(topic, nodes, options.includeAuthorizedOperations());
if (partiallyFinishedTopicDescription != null && partiallyFinishedTopicDescription.name().equals(topicName)) {
// Add the partitions for the cursor topic of the previous batch.
@ -2320,27 +2329,27 @@ public class KafkaAdminClient extends AdminClient {
}
if (topicNamesList.isEmpty()) {
return new HashMap<>(topicFutures);
return Collections.unmodifiableMap(topicFutures);
}
// First, we need to retrieve the node info.
DescribeClusterResult clusterResult = describeCluster();
Map<Integer, Node> nodes;
try {
nodes = clusterResult.nodes().get().stream().collect(Collectors.toMap(Node::id, node -> node));
} catch (InterruptedException | ExecutionException e) {
completeAllExceptionally(topicFutures.values(), e.getCause());
return new HashMap<>(topicFutures);
}
clusterResult.nodes().whenComplete(
(nodes, exception) -> {
if (exception != null) {
completeAllExceptionally(topicFutures.values(), exception);
return;
}
final long now = time.milliseconds();
final long now = time.milliseconds();
Map<Integer, Node> nodeIdMap = nodes.stream().collect(Collectors.toMap(Node::id, node -> node));
runnable.call(
generateDescribeTopicsCallWithDescribeTopicPartitionsApi(topicNamesList, topicFutures, nodeIdMap, options, now),
now
);
});
runnable.call(
generateDescribeTopicsCallWithDescribeTopicPartitionsApi(topicNamesList, topicFutures, nodes, options, now),
now
);
return new HashMap<>(topicFutures);
return Collections.unmodifiableMap(topicFutures);
}
private Map<Uuid, KafkaFuture<TopicDescription>> handleDescribeTopicsByIds(Collection<Uuid> topicIds, DescribeTopicsOptions options) {
@ -2410,14 +2419,16 @@ public class KafkaAdminClient extends AdminClient {
private TopicDescription getTopicDescriptionFromDescribeTopicsResponseTopic(
DescribeTopicPartitionsResponseTopic topic,
Map<Integer, Node> nodes
Map<Integer, Node> nodes,
boolean includeAuthorizedOperations
) {
List<DescribeTopicPartitionsResponsePartition> partitionInfos = topic.partitions();
List<TopicPartitionInfo> partitions = new ArrayList<>(partitionInfos.size());
for (DescribeTopicPartitionsResponsePartition partitionInfo : partitionInfos) {
partitions.add(DescribeTopicPartitionsResponse.partitionToTopicPartitionInfo(partitionInfo, nodes));
}
return new TopicDescription(topic.name(), topic.isInternal(), partitions, validAclOperations(topic.topicAuthorizedOperations()), topic.topicId());
Set<AclOperation> authorisedOperations = includeAuthorizedOperations ? validAclOperations(topic.topicAuthorizedOperations()) : null;
return new TopicDescription(topic.name(), topic.isInternal(), partitions, authorisedOperations, topic.topicId());
}
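From the caller's side, the change above means authorized operations are only populated when explicitly requested; a hedged sketch (bootstrap address and topic name are placeholders):

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.DescribeTopicsOptions;
    import org.apache.kafka.clients.admin.TopicDescription;

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;

    public class DescribeTopicsExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092"); // placeholder
            try (Admin admin = Admin.create(props)) {
                DescribeTopicsOptions options = new DescribeTopicsOptions()
                        .includeAuthorizedOperations(true); // when false, authorizedOperations() is null
                Map<String, TopicDescription> topics = admin
                        .describeTopics(Collections.singletonList("my-topic"), options) // placeholder topic
                        .allTopicNames()
                        .get();
                topics.forEach((name, desc) ->
                        System.out.println(name + " acl ops: " + desc.authorizedOperations()));
            }
        }
    }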
// AutoMQ for Kafka inject start
@ -4604,7 +4615,7 @@ public class KafkaAdminClient extends AdminClient {
public FenceProducersResult fenceProducers(Collection<String> transactionalIds, FenceProducersOptions options) {
AdminApiFuture.SimpleAdminApiFuture<CoordinatorKey, ProducerIdAndEpoch> future =
FenceProducersHandler.newFuture(transactionalIds);
FenceProducersHandler handler = new FenceProducersHandler(logContext);
FenceProducersHandler handler = new FenceProducersHandler(options, logContext, requestTimeoutMs);
invokeDriver(handler, future, options.timeoutMs);
return new FenceProducersResult(future.all());
}
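Paired with the handler change further down, the fencing call can now carry a meaningful timeout; a sketch with placeholder values (if the option is left unset, the handler falls back to the admin client's request timeout rather than the old hard-coded 1 ms):

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.FenceProducersOptions;

    import java.util.Collections;
    import java.util.Properties;

    public class FenceProducersExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092"); // placeholder
            try (Admin admin = Admin.create(props)) {
                FenceProducersOptions options = new FenceProducersOptions();
                options.timeoutMs(30_000); // also reused as the InitProducerId transaction timeout
                admin.fenceProducers(Collections.singleton("my-transactional-id"), options) // placeholder id
                     .all()
                     .get();
            }
        }
    }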

View File

@ -92,6 +92,11 @@ public class AdminMetadataManager {
*/
private ApiException fatalException = null;
/**
* The cluster with which the metadata was bootstrapped.
*/
private Cluster bootstrapCluster;
public class AdminMetadataUpdater implements MetadataUpdater {
@Override
public List<Node> fetchNodes() {
@ -275,6 +280,7 @@ public class AdminMetadataManager {
public void update(Cluster cluster, long now) {
if (cluster.isBootstrapConfigured()) {
log.debug("Setting bootstrap cluster metadata {}.", cluster);
bootstrapCluster = cluster;
} else {
log.debug("Updating cluster metadata to {}", cluster);
this.lastMetadataUpdateMs = now;
@ -287,4 +293,12 @@ public class AdminMetadataManager {
this.cluster = cluster;
}
}
/**
* Rebootstrap metadata with the cluster previously used for bootstrapping.
*/
public void rebootstrap(long now) {
log.info("Rebootstrapping with {}", this.bootstrapCluster);
update(bootstrapCluster, now);
}
}

View File

@ -16,6 +16,7 @@
*/
package org.apache.kafka.clients.admin.internals;
import org.apache.kafka.clients.admin.FenceProducersOptions;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.errors.ClusterAuthorizationException;
import org.apache.kafka.common.errors.TransactionalIdAuthorizationException;
@ -38,12 +39,16 @@ import java.util.stream.Collectors;
public class FenceProducersHandler extends AdminApiHandler.Unbatched<CoordinatorKey, ProducerIdAndEpoch> {
private final Logger log;
private final AdminApiLookupStrategy<CoordinatorKey> lookupStrategy;
private final int txnTimeoutMs;
public FenceProducersHandler(
LogContext logContext
FenceProducersOptions options,
LogContext logContext,
int requestTimeoutMs
) {
this.log = logContext.logger(FenceProducersHandler.class);
this.lookupStrategy = new CoordinatorStrategy(FindCoordinatorRequest.CoordinatorType.TRANSACTION, logContext);
this.txnTimeoutMs = options.timeoutMs() != null ? options.timeoutMs() : requestTimeoutMs;
}
public static AdminApiFuture.SimpleAdminApiFuture<CoordinatorKey, ProducerIdAndEpoch> newFuture(
@ -82,9 +87,8 @@ public class FenceProducersHandler extends AdminApiHandler.Unbatched<Coordinator
.setProducerEpoch(ProducerIdAndEpoch.NONE.epoch)
.setProducerId(ProducerIdAndEpoch.NONE.producerId)
.setTransactionalId(key.idValue)
// Set transaction timeout to 1 since it's only being initialized to fence out older producers with the same transactional ID,
// and shouldn't be used for any actual record writes
.setTransactionTimeoutMs(1);
// This timeout is used by the coordinator to append the record with the new producer epoch to the transaction log.
.setTransactionTimeoutMs(txnTimeoutMs);
return new InitProducerIdRequest.Builder(data);
}
@ -130,6 +134,10 @@ public class FenceProducersHandler extends AdminApiHandler.Unbatched<Coordinator
"coordinator is still in the process of loading state. Will retry",
transactionalIdKey.idValue);
return ApiResult.empty();
case CONCURRENT_TRANSACTIONS:
log.debug("InitProducerId request for transactionalId `{}` failed because of " +
"a concurrent transaction. Will retry", transactionalIdKey.idValue);
return ApiResult.empty();
case NOT_COORDINATOR:
case COORDINATOR_NOT_AVAILABLE:

View File

@ -18,6 +18,7 @@ package org.apache.kafka.clients.consumer;
import org.apache.kafka.clients.ClientDnsLookup;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.common.IsolationLevel;
import org.apache.kafka.common.config.AbstractConfig;
import org.apache.kafka.common.config.ConfigDef;
@ -198,7 +199,10 @@ public class ConsumerConfig extends AbstractConfig {
* <code>fetch.max.wait.ms</code>
*/
public static final String FETCH_MAX_WAIT_MS_CONFIG = "fetch.max.wait.ms";
private static final String FETCH_MAX_WAIT_MS_DOC = "The maximum amount of time the server will block before answering the fetch request if there isn't sufficient data to immediately satisfy the requirement given by fetch.min.bytes.";
private static final String FETCH_MAX_WAIT_MS_DOC = "The maximum amount of time the server will block before " +
"answering the fetch request there isn't sufficient data to immediately satisfy the requirement given by " +
"fetch.min.bytes. This config is used only for local log fetch. To tune the remote fetch maximum wait " +
"time, please refer to 'remote.fetch.max.wait.ms' broker config";
public static final int DEFAULT_FETCH_MAX_WAIT_MS = 500;
/** <code>metadata.max.age.ms</code> */
@ -653,7 +657,14 @@ public class ConsumerConfig extends AbstractConfig {
Importance.MEDIUM,
CommonClientConfigs.SECURITY_PROTOCOL_DOC)
.withClientSslSupport()
.withClientSaslSupport();
.withClientSaslSupport()
.define(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG,
Type.STRING,
CommonClientConfigs.DEFAULT_METADATA_RECOVERY_STRATEGY,
ConfigDef.CaseInsensitiveValidString
.in(Utils.enumOptions(MetadataRecoveryStrategy.class)),
Importance.LOW,
CommonClientConfigs.METADATA_RECOVERY_STRATEGY_DOC);
}
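By way of illustration (placeholder values, not part of the patch), both knobs touched above are ordinary consumer properties:

    import org.apache.kafka.clients.CommonClientConfigs;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    import java.util.Properties;

    public class ConsumerConfigExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");               // placeholder
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            // Bounds local log fetches only; remote (tiered) fetches are governed by the
            // broker-side remote.fetch.max.wait.ms.
            props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
            // Re-run the bootstrap process when every known broker appears unavailable.
            props.put(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG, "rebootstrap");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // subscribe / poll as usual
            }
        }
    }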
@Override

View File

@ -297,7 +297,7 @@ public interface ConsumerPartitionAssignor {
// first try to get the class if passed in as a string
if (klass instanceof String) {
try {
klass = Class.forName((String) klass, true, Utils.getContextOrKafkaClassLoader());
klass = Utils.loadClass((String) klass, Object.class);
} catch (ClassNotFoundException classNotFound) {
throw new KafkaException(klass + " ClassNotFoundException exception occurred", classNotFound);
}
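For completeness, a config-side sketch of where such a class name normally comes from (the assignor class here is just one of the built-ins; the remaining consumer properties are omitted):

    import org.apache.kafka.clients.consumer.ConsumerConfig;

    import java.util.Properties;

    public class AssignorConfigExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            // A fully qualified class name, resolved through the code path shown above.
            props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                      "org.apache.kafka.clients.consumer.RoundRobinAssignor");
            System.out.println(props.getProperty(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG));
        }
    }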

View File

@ -26,7 +26,6 @@ import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerGroupMetadata;
import org.apache.kafka.clients.consumer.ConsumerInterceptor;
import org.apache.kafka.clients.consumer.ConsumerPartitionAssignor;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.GroupProtocol;
@ -54,7 +53,6 @@ import org.apache.kafka.clients.consumer.internals.events.ConsumerRebalanceListe
import org.apache.kafka.clients.consumer.internals.events.ErrorEvent;
import org.apache.kafka.clients.consumer.internals.events.EventProcessor;
import org.apache.kafka.clients.consumer.internals.events.FetchCommittedOffsetsEvent;
import org.apache.kafka.clients.consumer.internals.events.LeaveOnCloseEvent;
import org.apache.kafka.clients.consumer.internals.events.ListOffsetsEvent;
import org.apache.kafka.clients.consumer.internals.events.NewTopicsMetadataUpdateRequestEvent;
import org.apache.kafka.clients.consumer.internals.events.PollEvent;
@ -91,7 +89,6 @@ import org.apache.kafka.common.utils.AppInfoParser;
import org.apache.kafka.common.utils.LogContext;
import org.apache.kafka.common.utils.Time;
import org.apache.kafka.common.utils.Timer;
import org.apache.kafka.common.utils.Utils;
import org.slf4j.Logger;
import org.slf4j.event.Level;
@ -110,7 +107,6 @@ import java.util.Optional;
import java.util.OptionalLong;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
@ -235,12 +231,12 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
private final SubscriptionState subscriptions;
private final ConsumerMetadata metadata;
private int metadataVersionSnapshot;
private final Metrics metrics;
private final long retryBackoffMs;
private final int defaultApiTimeoutMs;
private final boolean autoCommitEnabled;
private volatile boolean closed = false;
private final List<ConsumerPartitionAssignor> assignors;
private final Optional<ClientTelemetryReporter> clientTelemetryReporter;
// to keep from repeatedly scanning subscriptions in poll(), cache the result during metadata updates
@ -257,6 +253,8 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
private final AtomicLong currentThread = new AtomicLong(NO_CURRENT_THREAD);
private final AtomicInteger refCount = new AtomicInteger(0);
private FetchCommittedOffsetsEvent pendingOffsetFetchEvent;
AsyncKafkaConsumer(final ConsumerConfig config,
final Deserializer<K> keyDeserializer,
final Deserializer<V> valueDeserializer) {
@ -313,6 +311,7 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
this.metadata = metadataFactory.build(config, subscriptions, logContext, clusterResourceListeners);
final List<InetSocketAddress> addresses = ClientUtils.parseAndValidateAddresses(config);
metadata.bootstrap(addresses);
this.metadataVersionSnapshot = metadata.updateVersion();
FetchMetricsManager fetchMetricsManager = createFetchMetricsManager(metrics);
FetchConfig fetchConfig = new FetchConfig(config);
@ -373,10 +372,6 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
rebalanceListenerInvoker
);
this.backgroundEventReaper = backgroundEventReaperFactory.build(logContext);
this.assignors = ConsumerPartitionAssignor.getAssignorInstances(
config.getList(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG),
config.originals(Collections.singletonMap(ConsumerConfig.CLIENT_ID_CONFIG, clientId))
);
// The FetchCollector is only used on the application thread.
this.fetchCollector = fetchCollectorFactory.build(logContext,
@ -424,7 +419,6 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
ConsumerMetadata metadata,
long retryBackoffMs,
int defaultApiTimeoutMs,
List<ConsumerPartitionAssignor> assignors,
String groupId,
boolean autoCommitEnabled) {
this.log = logContext.logger(getClass());
@ -441,11 +435,11 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
this.metrics = metrics;
this.groupMetadata.set(initializeGroupMetadata(groupId, Optional.empty()));
this.metadata = metadata;
this.metadataVersionSnapshot = metadata.updateVersion();
this.retryBackoffMs = retryBackoffMs;
this.defaultApiTimeoutMs = defaultApiTimeoutMs;
this.deserializers = deserializers;
this.applicationEventHandler = applicationEventHandler;
this.assignors = assignors;
this.kafkaConsumerMetrics = new KafkaConsumerMetrics(metrics, "consumer");
this.clientTelemetryReporter = Optional.empty();
this.autoCommitEnabled = autoCommitEnabled;
@ -460,8 +454,7 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
Deserializer<V> valueDeserializer,
KafkaClient client,
SubscriptionState subscriptions,
ConsumerMetadata metadata,
List<ConsumerPartitionAssignor> assignors) {
ConsumerMetadata metadata) {
this.log = logContext.logger(getClass());
this.subscriptions = subscriptions;
this.clientId = config.getString(ConsumerConfig.CLIENT_ID_CONFIG);
@ -472,10 +465,10 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
this.time = time;
this.metrics = new Metrics(time);
this.metadata = metadata;
this.metadataVersionSnapshot = metadata.updateVersion();
this.retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG);
this.defaultApiTimeoutMs = config.getInt(ConsumerConfig.DEFAULT_API_TIMEOUT_MS_CONFIG);
this.deserializers = new Deserializers<>(keyDeserializer, valueDeserializer);
this.assignors = assignors;
this.clientTelemetryReporter = Optional.empty();
ConsumerMetrics metricsRegistry = new ConsumerMetrics(CONSUMER_METRIC_GROUP_PREFIX);
@ -1237,8 +1230,8 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
clientTelemetryReporter.ifPresent(reporter -> reporter.initiateClose(timeout.toMillis()));
closeTimer.update();
// Prepare shutting down the network thread
prepareShutdown(closeTimer, firstException);
closeTimer.update();
swallow(log, Level.ERROR, "Failed to release assignment before closing consumer",
() -> releaseAssignmentAndLeaveGroup(closeTimer), firstException);
swallow(log, Level.ERROR, "Failed invoking asynchronous commit callback.",
() -> awaitPendingAsyncCommitsAndExecuteCommitCallbacks(closeTimer, false), firstException);
if (applicationEventHandler != null)
@ -1270,27 +1263,34 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
/**
* Prior to closing the network thread, we need to make sure the following operations happen in the right sequence:
* 1. autocommit offsets
* 2. revoke all partitions
* 3. if partition revocation completes successfully, send leave group
* 2. release assignment. This is done via a background unsubscribe event that will
* trigger the callbacks, clear the assignment on the subscription state and send the leave group request to the broker
*/
void prepareShutdown(final Timer timer, final AtomicReference<Throwable> firstException) {
private void releaseAssignmentAndLeaveGroup(final Timer timer) {
if (!groupMetadata.get().isPresent())
return;
if (autoCommitEnabled)
autoCommitSync(timer);
commitSyncAllConsumed(timer);
applicationEventHandler.add(new CommitOnCloseEvent());
completeQuietly(
() -> {
maybeRevokePartitions();
applicationEventHandler.addAndGet(new LeaveOnCloseEvent(calculateDeadlineMs(timer)));
},
"Failed to send leaveGroup heartbeat with a timeout(ms)=" + timer.timeoutMs(), firstException);
log.info("Releasing assignment and leaving group before closing consumer");
UnsubscribeEvent unsubscribeEvent = new UnsubscribeEvent(calculateDeadlineMs(timer));
applicationEventHandler.add(unsubscribeEvent);
try {
processBackgroundEvents(unsubscribeEvent.future(), timer);
log.info("Completed releasing assignment and sending leave group to close consumer");
} catch (TimeoutException e) {
log.warn("Consumer triggered an unsubscribe event to leave the group but couldn't " +
"complete it within {} ms. It will proceed to close.", timer.timeoutMs());
} finally {
timer.update();
}
}
// Visible for testing
void autoCommitSync(final Timer timer) {
void commitSyncAllConsumed(final Timer timer) {
Map<TopicPartition, OffsetAndMetadata> allConsumed = subscriptions.allConsumed();
log.debug("Sending synchronous auto-commit of offsets {} on closing", allConsumed);
try {
@ -1302,35 +1302,6 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
timer.update();
}
// Visible for testing
void maybeRevokePartitions() {
if (!subscriptions.hasAutoAssignedPartitions() || subscriptions.assignedPartitions().isEmpty())
return;
try {
SortedSet<TopicPartition> droppedPartitions = new TreeSet<>(MembershipManagerImpl.TOPIC_PARTITION_COMPARATOR);
droppedPartitions.addAll(subscriptions.assignedPartitions());
if (subscriptions.rebalanceListener().isPresent())
subscriptions.rebalanceListener().get().onPartitionsRevoked(droppedPartitions);
} catch (Exception e) {
throw new KafkaException(e);
} finally {
subscriptions.assignFromSubscribed(Collections.emptySet());
}
}
// Visible for testing
void completeQuietly(final Utils.ThrowingRunnable function,
final String msg,
final AtomicReference<Throwable> firstException) {
try {
function.run();
} catch (TimeoutException e) {
log.debug("Timeout expired before the {} operation could complete.", msg);
} catch (Exception e) {
firstException.compareAndSet(null, e);
}
}
@Override
public void wakeup() {
wakeupTrigger.wakeup();
@ -1478,12 +1449,11 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
}
/**
* TODO: remove this when we implement the KIP-848 protocol.
*
* <p>
* The contents of this method are shamelessly stolen from
* {@link ConsumerCoordinator#updatePatternSubscription(Cluster)} and are used here because we won't have access
* to a {@link ConsumerCoordinator} in this code. Perhaps it could be moved to a ConsumerUtils class?
*
* This method evaluates the regex the consumer subscribed to against the list of topic names
* from metadata and updates the list of topics in the subscription state accordingly.
*
* @param cluster Cluster from which we get the topics
*/
@ -1493,7 +1463,7 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
.collect(Collectors.toSet());
if (subscriptions.subscribeFromPattern(topicsToSubscribe)) {
applicationEventHandler.add(new SubscriptionChangeEvent());
metadata.requestUpdateForNewTopics();
this.metadataVersionSnapshot = metadata.requestUpdateForNewTopics();
}
}
@ -1506,7 +1476,8 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
Timer timer = time.timer(Long.MAX_VALUE);
UnsubscribeEvent unsubscribeEvent = new UnsubscribeEvent(calculateDeadlineMs(timer));
applicationEventHandler.add(unsubscribeEvent);
log.info("Unsubscribing all topics or patterns and assigned partitions");
log.info("Unsubscribing all topics or patterns and assigned partitions {}",
subscriptions.assignedPartitions());
try {
processBackgroundEvents(unsubscribeEvent.future(), timer);
@ -1516,7 +1487,9 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
}
resetGroupMetadata();
}
subscriptions.unsubscribe();
} catch (Exception e) {
log.error("Unsubscribe failed", e);
throw e;
} finally {
release();
}
@ -1670,27 +1643,64 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
return true;
log.debug("Refreshing committed offsets for partitions {}", initializingPartitions);
// The shorter the timeout provided to poll(), the more likely the offsets fetch will time out. To handle
// this case, a FetchCommittedOffsetsEvent is created (with potentially a longer timeout) and stored on the
// first attempt to fetch the committed offsets. If that attempt times out, subsequent attempts reuse the
// same event to keep waiting for the result.
if (!canReusePendingOffsetFetchEvent(initializingPartitions)) {
// Give the event a reasonable amount of time to complete.
final long timeoutMs = Math.max(defaultApiTimeoutMs, timer.remainingMs());
final long deadlineMs = calculateDeadlineMs(time, timeoutMs);
pendingOffsetFetchEvent = new FetchCommittedOffsetsEvent(initializingPartitions, deadlineMs);
applicationEventHandler.add(pendingOffsetFetchEvent);
}
final CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> future = pendingOffsetFetchEvent.future();
try {
final FetchCommittedOffsetsEvent event =
new FetchCommittedOffsetsEvent(
initializingPartitions,
calculateDeadlineMs(timer));
wakeupTrigger.setActiveTask(event.future());
final Map<TopicPartition, OffsetAndMetadata> offsets = applicationEventHandler.addAndGet(event);
wakeupTrigger.setActiveTask(future);
final Map<TopicPartition, OffsetAndMetadata> offsets = ConsumerUtils.getResult(future, timer);
// Clear the pending event once its result is successfully retrieved.
pendingOffsetFetchEvent = null;
refreshCommittedOffsets(offsets, metadata, subscriptions);
return true;
} catch (TimeoutException e) {
log.error("Couldn't refresh committed offsets before timeout expired");
log.debug(
"The committed offsets for the following partition(s) could not be refreshed within the timeout: {} ",
initializingPartitions
);
return false;
} catch (InterruptException e) {
throw e;
} catch (Throwable t) {
pendingOffsetFetchEvent = null;
throw ConsumerUtils.maybeWrapAsKafkaException(t);
} finally {
wakeupTrigger.clearTask();
}
}
private void throwIfNoAssignorsConfigured() {
if (assignors.isEmpty())
throw new IllegalStateException("Must configure at least one partition assigner class name to " +
ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG + " configuration property");
/**
* This determines if the {@link #pendingOffsetFetchEvent pending offset fetch event} can be reused. Reuse
* is only possible if all the following conditions are true:
*
* <ul>
* <li>A pending offset fetch event exists</li>
* <li>The partition set of the pending offset fetch event is the same as the given partition set</li>
* <li>The pending offset fetch event has not expired</li>
* </ul>
*/
private boolean canReusePendingOffsetFetchEvent(Set<TopicPartition> partitions) {
if (pendingOffsetFetchEvent == null)
return false;
if (!pendingOffsetFetchEvent.partitions().equals(partitions))
return false;
return pendingOffsetFetchEvent.deadlineMs() > time.milliseconds();
}
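The reuse rule above amounts to a small caching pattern; a standalone sketch of the same idea (names and types here are illustrative, not the consumer's actual internals):

    import java.util.Set;
    import java.util.concurrent.CompletableFuture;

    public class PendingFetchCache<K> {
        private Set<K> pendingKeys;
        private long pendingDeadlineMs;
        private CompletableFuture<Long> pendingFuture;

        /** Reuse the in-flight future when the key set matches and the deadline has not passed. */
        public synchronized CompletableFuture<Long> fetch(Set<K> keys, long nowMs, long timeoutMs) {
            boolean reusable = pendingFuture != null
                    && pendingKeys.equals(keys)
                    && pendingDeadlineMs > nowMs;
            if (!reusable) {
                pendingKeys = keys;
                pendingDeadlineMs = nowMs + timeoutMs;
                pendingFuture = startFetch(keys); // kick off a new request
            }
            return pendingFuture;
        }

        private CompletableFuture<Long> startFetch(Set<K> keys) {
            return CompletableFuture.completedFuture((long) keys.size()); // stand-in for a network call
        }
    }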
private void updateLastSeenEpochIfNewer(TopicPartition topicPartition, OffsetAndMetadata offsetAndMetadata) {
@ -1780,7 +1790,6 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
if (pattern == null || pattern.toString().isEmpty())
throw new IllegalArgumentException("Topic pattern to subscribe to cannot be " + (pattern == null ?
"null" : "empty"));
throwIfNoAssignorsConfigured();
log.info("Subscribed to pattern: '{}'", pattern);
subscriptions.subscribe(pattern, listener);
metadata.requestUpdateForNewTopics();
@ -1805,8 +1814,6 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
throw new IllegalArgumentException("Topic collection to subscribe to cannot contain null or empty topic");
}
throwIfNoAssignorsConfigured();
// Clear the buffered data which are not a part of newly assigned topics
final Set<TopicPartition> currentTopicPartitions = new HashSet<>();
@ -1818,7 +1825,7 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
fetchBuffer.retainAll(currentTopicPartitions);
log.info("Subscribed to topic(s): {}", String.join(", ", topics));
if (subscriptions.subscribe(new HashSet<>(topics), listener))
metadata.requestUpdateForNewTopics();
this.metadataVersionSnapshot = metadata.requestUpdateForNewTopics();
// Trigger subscribe event to effectively join the group if not already part of it,
// or just send the new subscription to the broker.
@ -1996,10 +2003,17 @@ public class AsyncKafkaConsumer<K, V> implements ConsumerDelegate<K, V> {
return subscriptions;
}
private void maybeUpdateSubscriptionMetadata() {
if (subscriptions.hasPatternSubscription()) {
updatePatternSubscription(metadata.fetch());
}
boolean hasPendingOffsetFetchEvent() {
return pendingOffsetFetchEvent != null;
}
private void maybeUpdateSubscriptionMetadata() {
if (this.metadataVersionSnapshot < metadata.updateVersion()) {
this.metadataVersionSnapshot = metadata.updateVersion();
if (subscriptions.hasPatternSubscription()) {
updatePatternSubscription(metadata.fetch());
}
}
}
}

View File

@ -69,6 +69,7 @@ import static org.apache.kafka.clients.consumer.internals.NetworkClientDelegate.
import static org.apache.kafka.common.protocol.Errors.COORDINATOR_LOAD_IN_PROGRESS;
public class CommitRequestManager implements RequestManager, MemberStateListener {
private final Time time;
private final SubscriptionState subscriptions;
private final LogContext logContext;
private final Logger log;
@ -133,6 +134,7 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
final OptionalDouble jitter,
final Metrics metrics) {
Objects.requireNonNull(coordinatorRequestManager, "Coordinator is needed upon committing offsets");
this.time = time;
this.logContext = logContext;
this.log = logContext.logger(getClass());
this.pendingRequests = new PendingRequests();
@ -205,6 +207,13 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
.orElse(Long.MAX_VALUE);
}
private KafkaException maybeWrapAsTimeoutException(Throwable t) {
if (t instanceof TimeoutException)
return (TimeoutException) t;
else
return new TimeoutException(t);
}
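Below, the Optional expiration timestamp gives way to deadline-based Timers (built via the deadlineTimer helper). A small sketch of the timer semantics using Kafka's public Time/Timer utilities; the assumption here is that the helper builds something equivalent:

    import org.apache.kafka.common.utils.Time;
    import org.apache.kafka.common.utils.Timer;

    public class DeadlineTimerExample {
        public static void main(String[] args) throws InterruptedException {
            Time time = Time.SYSTEM;
            long deadlineMs = time.milliseconds() + 100;                  // absolute deadline
            Timer timer = time.timer(deadlineMs - time.milliseconds());   // remaining time until that deadline
            System.out.println("expired? " + timer.isExpired());          // false
            Thread.sleep(150);
            timer.update();                                               // refresh the timer's notion of "now"
            System.out.println("expired? " + timer.isExpired());          // true: the request would now fail with a TimeoutException
        }
    }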
/**
* Generate a request to commit consumed offsets. Add the request to the queue of pending
* requests to be sent out on the next call to {@link #poll(long)}. If there are empty
@ -245,7 +254,7 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
if (autoCommitEnabled() && autoCommitState.get().shouldAutoCommit()) {
OffsetCommitRequestState requestState = createOffsetCommitRequest(
subscriptions.allConsumed(),
Optional.empty());
Long.MAX_VALUE);
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> result = requestAutoCommit(requestState);
// Reset timer to the interval (even if no request was generated), but ensure that if
// the request completes with a retriable error, the timer is reset to send the next
@ -294,14 +303,14 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
* complete exceptionally if the commit fails with a non-retriable error, or if the retry
* timeout expires.
*/
public CompletableFuture<Void> maybeAutoCommitSyncBeforeRevocation(final long retryExpirationTimeMs) {
public CompletableFuture<Void> maybeAutoCommitSyncBeforeRevocation(final long deadlineMs) {
if (!autoCommitEnabled()) {
return CompletableFuture.completedFuture(null);
}
CompletableFuture<Void> result = new CompletableFuture<>();
OffsetCommitRequestState requestState =
createOffsetCommitRequest(subscriptions.allConsumed(), Optional.of(retryExpirationTimeMs));
createOffsetCommitRequest(subscriptions.allConsumed(), deadlineMs);
autoCommitSyncBeforeRevocationWithRetries(requestState, result);
return result;
}
@ -314,9 +323,9 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
result.complete(null);
} else {
if (error instanceof RetriableException || isStaleEpochErrorAndValidEpochAvailable(error)) {
if (error instanceof TimeoutException && requestAttempt.isExpired) {
if (requestAttempt.isExpired()) {
log.debug("Auto-commit sync before revocation timed out and won't be retried anymore");
result.completeExceptionally(error);
result.completeExceptionally(maybeWrapAsTimeoutException(error));
} else if (error instanceof UnknownTopicOrPartitionException) {
log.debug("Auto-commit sync before revocation failed because topic or partition were deleted");
result.completeExceptionally(error);
@ -367,7 +376,7 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
log.debug("Skipping commit of empty offsets");
return CompletableFuture.completedFuture(null);
}
OffsetCommitRequestState commitRequest = createOffsetCommitRequest(offsets, Optional.empty());
OffsetCommitRequestState commitRequest = createOffsetCommitRequest(offsets, Long.MAX_VALUE);
pendingRequests.addOffsetCommitRequest(commitRequest);
CompletableFuture<Void> asyncCommitResult = new CompletableFuture<>();
@ -385,28 +394,26 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
* Commit offsets, retrying on expected retriable errors while the retry timeout hasn't expired.
*
* @param offsets Offsets to commit
* @param retryExpirationTimeMs Time until which the request will be retried if it fails with
* @param deadlineMs Time until which the request will be retried if it fails with
* an expected retriable error.
* @return Future that will complete when a successful response
*/
public CompletableFuture<Void> commitSync(final Map<TopicPartition, OffsetAndMetadata> offsets,
final long retryExpirationTimeMs) {
final long deadlineMs) {
CompletableFuture<Void> result = new CompletableFuture<>();
OffsetCommitRequestState requestState = createOffsetCommitRequest(
offsets,
Optional.of(retryExpirationTimeMs));
OffsetCommitRequestState requestState = createOffsetCommitRequest(offsets, deadlineMs);
commitSyncWithRetries(requestState, result);
return result;
}
private OffsetCommitRequestState createOffsetCommitRequest(final Map<TopicPartition, OffsetAndMetadata> offsets,
final Optional<Long> expirationTimeMs) {
final long deadlineMs) {
return jitter.isPresent() ?
new OffsetCommitRequestState(
offsets,
groupId,
groupInstanceId,
expirationTimeMs,
deadlineMs,
retryBackoffMs,
retryBackoffMaxMs,
jitter.getAsDouble(),
@ -415,7 +422,7 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
offsets,
groupId,
groupInstanceId,
expirationTimeMs,
deadlineMs,
retryBackoffMs,
retryBackoffMaxMs,
memberInfo);
@ -432,9 +439,9 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
result.complete(null);
} else {
if (error instanceof RetriableException) {
if (error instanceof TimeoutException && requestAttempt.isExpired) {
if (requestAttempt.isExpired()) {
log.info("OffsetCommit timeout expired so it won't be retried anymore");
result.completeExceptionally(error);
result.completeExceptionally(maybeWrapAsTimeoutException(error));
} else {
requestAttempt.resetFuture();
commitSyncWithRetries(requestAttempt, result);
@ -465,7 +472,7 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
* Enqueue a request to fetch committed offsets, that will be sent on the next call to {@link #poll(long)}.
*
* @param partitions Partitions to fetch offsets for.
* @param expirationTimeMs Time until which the request should be retried if it fails
* @param deadlineMs Time until which the request should be retried if it fails
* with expected retriable errors.
* @return Future that will complete when a successful response is received, or the request
* fails and cannot be retried. Note that the request is retried whenever it fails with
@ -473,31 +480,31 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
*/
public CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> fetchOffsets(
final Set<TopicPartition> partitions,
final long expirationTimeMs) {
final long deadlineMs) {
if (partitions.isEmpty()) {
return CompletableFuture.completedFuture(Collections.emptyMap());
}
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> result = new CompletableFuture<>();
OffsetFetchRequestState request = createOffsetFetchRequest(partitions, expirationTimeMs);
OffsetFetchRequestState request = createOffsetFetchRequest(partitions, deadlineMs);
fetchOffsetsWithRetries(request, result);
return result;
}
private OffsetFetchRequestState createOffsetFetchRequest(final Set<TopicPartition> partitions,
final long expirationTimeMs) {
final long deadlineMs) {
return jitter.isPresent() ?
new OffsetFetchRequestState(
partitions,
retryBackoffMs,
retryBackoffMaxMs,
expirationTimeMs,
deadlineMs,
jitter.getAsDouble(),
memberInfo) :
new OffsetFetchRequestState(
partitions,
retryBackoffMs,
retryBackoffMaxMs,
expirationTimeMs,
deadlineMs,
memberInfo);
}
@ -516,8 +523,9 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
result.complete(res);
} else {
if (error instanceof RetriableException || isStaleEpochErrorAndValidEpochAvailable(error)) {
if (error instanceof TimeoutException && fetchRequest.isExpired) {
result.completeExceptionally(error);
if (fetchRequest.isExpired()) {
log.debug("OffsetFetch request for {} timed out and won't be retried anymore", fetchRequest.requestedPartitions);
result.completeExceptionally(maybeWrapAsTimeoutException(error));
} else {
fetchRequest.resetFuture();
fetchOffsetsWithRetries(fetchRequest, result);
@ -612,12 +620,12 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
OffsetCommitRequestState(final Map<TopicPartition, OffsetAndMetadata> offsets,
final String groupId,
final Optional<String> groupInstanceId,
final Optional<Long> expirationTimeMs,
final long deadlineMs,
final long retryBackoffMs,
final long retryBackoffMaxMs,
final MemberInfo memberInfo) {
super(logContext, CommitRequestManager.class.getSimpleName(), retryBackoffMs,
retryBackoffMaxMs, memberInfo, expirationTimeMs);
retryBackoffMaxMs, memberInfo, deadlineTimer(time, deadlineMs));
this.offsets = offsets;
this.groupId = groupId;
this.groupInstanceId = groupInstanceId;
@ -628,13 +636,13 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
OffsetCommitRequestState(final Map<TopicPartition, OffsetAndMetadata> offsets,
final String groupId,
final Optional<String> groupInstanceId,
final Optional<Long> expirationTimeMs,
final long deadlineMs,
final long retryBackoffMs,
final long retryBackoffMaxMs,
final double jitter,
final MemberInfo memberInfo) {
super(logContext, CommitRequestManager.class.getSimpleName(), retryBackoffMs, 2,
retryBackoffMaxMs, jitter, memberInfo, expirationTimeMs);
retryBackoffMaxMs, jitter, memberInfo, deadlineTimer(time, deadlineMs));
this.offsets = offsets;
this.groupId = groupId;
this.groupInstanceId = groupInstanceId;
@ -780,40 +788,24 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
* Represents a request that can be retried or aborted, based on member ID and epoch
* information.
*/
abstract class RetriableRequestState extends RequestState {
abstract class RetriableRequestState extends TimedRequestState {
/**
* Member info (ID and epoch) to be included in the request if present.
*/
final MemberInfo memberInfo;
/**
* Time until which the request should be retried if it fails with retriable
* errors. If not present, the request is triggered without waiting for a response or
* retrying.
*/
private final Optional<Long> expirationTimeMs;
/**
* True if the request expiration time has been reached. This is set when validating the
* request expiration on {@link #poll(long)} before sending it. It is used to know if a
* request should be retried on TimeoutException.
*/
boolean isExpired;
RetriableRequestState(LogContext logContext, String owner, long retryBackoffMs,
long retryBackoffMaxMs, MemberInfo memberInfo, Optional<Long> expirationTimeMs) {
super(logContext, owner, retryBackoffMs, retryBackoffMaxMs);
long retryBackoffMaxMs, MemberInfo memberInfo, Timer timer) {
super(logContext, owner, retryBackoffMs, retryBackoffMaxMs, timer);
this.memberInfo = memberInfo;
this.expirationTimeMs = expirationTimeMs;
}
// Visible for testing
RetriableRequestState(LogContext logContext, String owner, long retryBackoffMs, int retryBackoffExpBase,
long retryBackoffMaxMs, double jitter, MemberInfo memberInfo, Optional<Long> expirationTimeMs) {
super(logContext, owner, retryBackoffMs, retryBackoffExpBase, retryBackoffMaxMs, jitter);
long retryBackoffMaxMs, double jitter, MemberInfo memberInfo, Timer timer) {
super(logContext, owner, retryBackoffMs, retryBackoffExpBase, retryBackoffMaxMs, jitter, timer);
this.memberInfo = memberInfo;
this.expirationTimeMs = expirationTimeMs;
}
/**
@ -828,13 +820,12 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
abstract CompletableFuture<?> future();
/**
* Complete the request future with a TimeoutException if the request timeout has been
* reached, based on the provided current time.
* Complete the request future with a TimeoutException if the request has been sent out
* at least once and the timeout has been reached.
*/
void maybeExpire(long currentTimeMs) {
if (retryTimeoutExpired(currentTimeMs)) {
void maybeExpire() {
if (numAttempts > 0 && isExpired()) {
removeRequest();
isExpired = true;
future().completeExceptionally(new TimeoutException(requestDescription() +
" could not complete before timeout expired."));
}
@ -846,11 +837,12 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
NetworkClientDelegate.UnsentRequest buildRequestWithResponseHandling(final AbstractRequest.Builder<?> builder) {
NetworkClientDelegate.UnsentRequest request = new NetworkClientDelegate.UnsentRequest(
builder,
coordinatorRequestManager.coordinator());
coordinatorRequestManager.coordinator()
);
request.whenComplete(
(response, throwable) -> {
long currentTimeMs = request.handler().completionTimeMs();
handleClientResponse(response, throwable, currentTimeMs);
long completionTimeMs = request.handler().completionTimeMs();
handleClientResponse(response, throwable, completionTimeMs);
});
return request;
}
@ -875,10 +867,6 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
abstract void onResponse(final ClientResponse response);
boolean retryTimeoutExpired(long currentTimeMs) {
return expirationTimeMs.isPresent() && expirationTimeMs.get() <= currentTimeMs;
}
abstract void removeRequest();
}
@ -898,10 +886,10 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
public OffsetFetchRequestState(final Set<TopicPartition> partitions,
final long retryBackoffMs,
final long retryBackoffMaxMs,
final long expirationTimeMs,
final long deadlineMs,
final MemberInfo memberInfo) {
super(logContext, CommitRequestManager.class.getSimpleName(), retryBackoffMs,
retryBackoffMaxMs, memberInfo, Optional.of(expirationTimeMs));
retryBackoffMaxMs, memberInfo, deadlineTimer(time, deadlineMs));
this.requestedPartitions = partitions;
this.future = new CompletableFuture<>();
}
@ -909,11 +897,11 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
public OffsetFetchRequestState(final Set<TopicPartition> partitions,
final long retryBackoffMs,
final long retryBackoffMaxMs,
final long expirationTimeMs,
final long deadlineMs,
final double jitter,
final MemberInfo memberInfo) {
super(logContext, CommitRequestManager.class.getSimpleName(), retryBackoffMs, 2,
retryBackoffMaxMs, jitter, memberInfo, Optional.of(expirationTimeMs));
retryBackoffMaxMs, jitter, memberInfo, deadlineTimer(time, deadlineMs));
this.requestedPartitions = partitions;
this.future = new CompletableFuture<>();
}
@ -1145,9 +1133,10 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
inflightOffsetFetches.stream().filter(r -> r.sameRequest(request)).findAny();
if (dupe.isPresent() || inflight.isPresent()) {
log.info("Duplicated OffsetFetchRequest: " + request.requestedPartitions);
log.debug("Duplicated unsent offset fetch request found for partitions: {}", request.requestedPartitions);
dupe.orElseGet(inflight::get).chainFuture(request.future);
} else {
log.debug("Enqueuing offset fetch request for partitions: {}", request.requestedPartitions);
this.unsentOffsetFetches.add(request);
}
return request.future;
@ -1165,7 +1154,7 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
.filter(request -> !request.canSendRequest(currentTimeMs))
.collect(Collectors.toList());
failAndRemoveExpiredCommitRequests(currentTimeMs);
failAndRemoveExpiredCommitRequests();
// Add all unsent offset commit requests to the unsentRequests list
List<NetworkClientDelegate.UnsentRequest> unsentRequests = unsentOffsetCommits.stream()
@ -1179,7 +1168,7 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
unsentOffsetFetches.stream()
.collect(Collectors.partitioningBy(request -> request.canSendRequest(currentTimeMs)));
failAndRemoveExpiredFetchRequests(currentTimeMs);
failAndRemoveExpiredFetchRequests();
// Add all sendable offset fetch requests to the unsentRequests list and to the inflightOffsetFetches list
for (OffsetFetchRequestState request : partitionedBySendability.get(true)) {
@ -1200,18 +1189,18 @@ public class CommitRequestManager implements RequestManager, MemberStateListener
* Find the unsent commit requests that have expired, remove them and complete their
* futures with a TimeoutException.
*/
private void failAndRemoveExpiredCommitRequests(final long currentTimeMs) {
private void failAndRemoveExpiredCommitRequests() {
Queue<OffsetCommitRequestState> requestsToPurge = new LinkedList<>(unsentOffsetCommits);
requestsToPurge.forEach(req -> req.maybeExpire(currentTimeMs));
requestsToPurge.forEach(RetriableRequestState::maybeExpire);
}
/**
* Find the unsent fetch requests that have expired, remove them and complete their
* futures with a TimeoutException.
*/
private void failAndRemoveExpiredFetchRequests(final long currentTimeMs) {
private void failAndRemoveExpiredFetchRequests() {
Queue<OffsetFetchRequestState> requestsToPurge = new LinkedList<>(unsentOffsetFetches);
requestsToPurge.forEach(req -> req.maybeExpire(currentTimeMs));
requestsToPurge.forEach(RetriableRequestState::maybeExpire);
}
private void clearAll() {
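The hunks above replace CommitRequestManager's raw expirationTimeMs with an absolute deadlineMs that is converted into a Timer via deadlineTimer(time, deadlineMs), and maybeExpire() now only fails requests that have been attempted at least once. A minimal sketch of the deadline-timer behaviour follows; it is not part of the diff and uses MockTime purely for illustration.

import org.apache.kafka.common.utils.MockTime;
import org.apache.kafka.common.utils.Timer;

public class DeadlineTimerSketch {
    public static void main(String[] args) {
        MockTime time = new MockTime();
        long deadlineMs = time.milliseconds() + 5_000;               // the request must complete within 5s
        Timer timer = time.timer(Math.max(0, deadlineMs - time.milliseconds()));

        System.out.println(timer.isExpired());                       // false, 5s remain
        time.sleep(6_000);                                            // advance the mock clock past the deadline
        timer.update();                                               // Timer caches the clock; refresh before checking
        System.out.println(timer.isExpired());                        // true, maybeExpire() would now fail the request
        // Note: per the change above, maybeExpire() additionally requires numAttempts > 0,
        // so a request that was never sent is not failed with a TimeoutException.
    }
}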

View File

@ -91,8 +91,7 @@ public class ConsumerDelegateCreator {
valueDeserializer,
client,
subscriptions,
metadata,
assignors
metadata
);
else
return new LegacyKafkaConsumer<>(

View File

@ -139,7 +139,7 @@ public class ConsumerNetworkClient implements Closeable {
public Node leastLoadedNode() {
lock.lock();
try {
return client.leastLoadedNode(time.milliseconds());
return client.leastLoadedNode(time.milliseconds()).node();
} finally {
lock.unlock();
}

View File

@ -207,8 +207,7 @@ public class ConsumerNetworkThread extends KafkaThread implements Closeable {
*/
// Visible for testing
static void runAtClose(final Collection<Optional<? extends RequestManager>> requestManagers,
final NetworkClientDelegate networkClientDelegate,
final Timer timer) {
final NetworkClientDelegate networkClientDelegate) {
// These are the optional outgoing requests at the time of closing the consumer
requestManagers.stream()
.filter(Optional::isPresent)
@ -293,21 +292,28 @@ public class ConsumerNetworkThread extends KafkaThread implements Closeable {
* Check the unsent queue one last time and poll until all requests are sent or the timer runs out.
*/
private void sendUnsentRequests(final Timer timer) {
if (networkClientDelegate.unsentRequests().isEmpty())
if (!networkClientDelegate.hasAnyPendingRequests())
return;
do {
networkClientDelegate.poll(timer.remainingMs(), timer.currentTimeMs());
timer.update();
} while (timer.notExpired() && !networkClientDelegate.unsentRequests().isEmpty());
} while (timer.notExpired() && networkClientDelegate.hasAnyPendingRequests());
if (networkClientDelegate.hasAnyPendingRequests()) {
log.warn("Close timeout of {} ms expired before the consumer network thread was able " +
"to complete pending requests. Inflight request count: {}, Unsent request count: {}",
timer.timeoutMs(), networkClientDelegate.inflightRequestCount(), networkClientDelegate.unsentRequests().size());
}
}
void cleanup() {
log.trace("Closing the consumer network thread");
Timer timer = time.timer(closeTimeout);
try {
runAtClose(requestManagers.entries(), networkClientDelegate, timer);
runAtClose(requestManagers.entries(), networkClientDelegate);
} catch (Exception e) {
log.error("Unexpected error during shutdown. Proceed with closing.", e);
log.error("Unexpected error during shutdown. Proceed with closing.", e);
} finally {
sendUnsentRequests(timer);
applicationEventReaper.reap(applicationEventQueue);

View File

@ -45,6 +45,7 @@ import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;
/**
* <p>Manages the request creation and response handling for the heartbeat. The module creates a
* {@link ConsumerGroupHeartbeatRequest} using the state stored in the {@link MembershipManager} and enqueues it to
@ -208,7 +209,11 @@ public class HeartbeatRequestManager implements RequestManager {
return new NetworkClientDelegate.PollResult(heartbeatRequestState.heartbeatIntervalMs, Collections.singletonList(leaveHeartbeat));
}
boolean heartbeatNow = membershipManager.shouldHeartbeatNow() && !heartbeatRequestState.requestInFlight();
// Case 1: The member is leaving
boolean heartbeatNow = membershipManager.state() == MemberState.LEAVING ||
// Case 2: The member state indicates it should send a heartbeat without waiting for the interval, and there is no heartbeat request currently in-flight
(membershipManager.shouldHeartbeatNow() && !heartbeatRequestState.requestInFlight());
if (!heartbeatRequestState.canSendRequest(currentTimeMs) && !heartbeatNow) {
return new NetworkClientDelegate.PollResult(heartbeatRequestState.timeToNextHeartbeatMs(currentTimeMs));
}
@ -258,7 +263,8 @@ public class HeartbeatRequestManager implements RequestManager {
pollTimer.update(pollMs);
if (pollTimer.isExpired()) {
logger.warn("Time between subsequent calls to poll() was longer than the configured " +
"max.poll.interval.ms, exceeded approximately by {} ms.", pollTimer.isExpiredBy());
"max.poll.interval.ms, exceeded approximately by {} ms. Member {} will rejoin the group now.",
pollTimer.isExpiredBy(), membershipManager.memberId());
membershipManager.maybeRejoinStaleMember();
}
pollTimer.reset(maxPollIntervalMs);

View File

@ -46,6 +46,7 @@ import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
@ -520,12 +521,15 @@ public class MembershipManagerImpl implements MembershipManager {
@Override
public void transitionToFenced() {
if (state == MemberState.PREPARE_LEAVING) {
log.debug("Member {} with epoch {} got fenced but it is already preparing to leave " +
log.info("Member {} with epoch {} got fenced but it is already preparing to leave " +
"the group, so it will stop sending heartbeat and won't attempt to rejoin.",
memberId, memberEpoch);
// Transition to UNSUBSCRIBED, ensuring that the member (that is not part of the
// group anymore from the broker point of view) will stop sending heartbeats while it
// completes the ongoing leaving operation.
// Briefly transition to LEAVING to ensure all required actions are applied even
// though there is no need to send a leave group heartbeat (ex. clear epoch and
// notify epoch listeners). Then transition to UNSUBSCRIBED, ensuring that the member
// (that is not part of the group anymore from the broker point of view) will stop
// sending heartbeats while it completes the ongoing leaving operation.
transitionToSendingLeaveGroup(false);
transitionTo(MemberState.UNSUBSCRIBED);
return;
}
@ -552,7 +556,7 @@ public class MembershipManagerImpl implements MembershipManager {
log.error("onPartitionsLost callback invocation failed while releasing assignment" +
" after member got fenced. Member will rejoin the group anyways.", error);
}
clearSubscription();
clearAssignment();
if (state == MemberState.FENCED) {
transitionToJoining();
} else {
@ -589,7 +593,7 @@ public class MembershipManagerImpl implements MembershipManager {
log.error("onPartitionsLost callback invocation failed while releasing assignment" +
" after member failed with fatal error.", error);
}
clearSubscription();
clearAssignment();
});
}
@ -605,7 +609,7 @@ public class MembershipManagerImpl implements MembershipManager {
/**
* Clear the assigned partitions in the member subscription, pending assignments and metadata cache.
*/
private void clearSubscription() {
private void clearAssignment() {
if (subscriptions.hasAutoAssignedPartitions()) {
subscriptions.assignFromSubscribed(Collections.emptySet());
}
@ -655,7 +659,7 @@ public class MembershipManagerImpl implements MembershipManager {
public CompletableFuture<Void> leaveGroup() {
if (isNotInGroup()) {
if (state == MemberState.FENCED) {
clearSubscription();
clearAssignment();
transitionTo(MemberState.UNSUBSCRIBED);
}
return CompletableFuture.completedFuture(null);
@ -664,6 +668,7 @@ public class MembershipManagerImpl implements MembershipManager {
if (state == MemberState.PREPARE_LEAVING || state == MemberState.LEAVING) {
// Member already leaving. No-op and return existing leave group future that will
// complete when the ongoing leave operation completes.
log.debug("Leave group operation already in progress for member {}", memberId);
return leaveGroupInProgress.get();
}
@ -673,8 +678,16 @@ public class MembershipManagerImpl implements MembershipManager {
CompletableFuture<Void> callbackResult = invokeOnPartitionsRevokedOrLostToReleaseAssignment();
callbackResult.whenComplete((result, error) -> {
if (error != null) {
log.error("Member {} callback to release assignment failed. Member will proceed " +
"to send leave group heartbeat", memberId, error);
} else {
log.debug("Member {} completed callback to release assignment and will send leave " +
"group heartbeat", memberId);
}
// Clear the subscription, no matter if the callback execution failed or succeeded.
clearSubscription();
subscriptions.unsubscribe();
clearAssignment();
// Transition to ensure that a heartbeat request is sent out to effectively leave the
// group (even in the case where the member had no assignment to release or when the
@ -705,6 +718,9 @@ public class MembershipManagerImpl implements MembershipManager {
SortedSet<TopicPartition> droppedPartitions = new TreeSet<>(TOPIC_PARTITION_COMPARATOR);
droppedPartitions.addAll(subscriptions.assignedPartitions());
log.debug("Member {} is triggering callbacks to release assignment {} and leave group",
memberId, droppedPartitions);
CompletableFuture<Void> callbackResult;
if (droppedPartitions.isEmpty()) {
// No assignment to release.
@ -764,7 +780,7 @@ public class MembershipManagerImpl implements MembershipManager {
* This also includes the latest member ID in the notification. If the member fails or leaves
* the group, this will be invoked with empty epoch and member ID.
*/
private void notifyEpochChange(Optional<Integer> epoch, Optional<String> memberId) {
void notifyEpochChange(Optional<Integer> epoch, Optional<String> memberId) {
stateUpdatesListeners.forEach(stateListener -> stateListener.onMemberEpochUpdated(epoch, memberId));
}
@ -794,8 +810,12 @@ public class MembershipManagerImpl implements MembershipManager {
}
} else if (state == MemberState.LEAVING) {
if (isPollTimerExpired) {
log.debug("Member {} sent heartbeat to leave due to expired poll timer. It will " +
"remain stale (no heartbeat) until it rejoins the group on the next consumer " +
"poll.", memberId);
transitionToStale();
} else {
log.debug("Member {} sent heartbeat to leave group.", memberId);
transitionToUnsubscribed();
}
}
@ -875,7 +895,7 @@ public class MembershipManagerImpl implements MembershipManager {
log.error("onPartitionsLost callback invocation failed while releasing assignment" +
" after member left group due to expired poll timer.", error);
}
clearSubscription();
clearAssignment();
log.debug("Member {} sent leave group heartbeat and released its assignment. It will remain " +
"in {} state until the poll timer is reset, and it will then rejoin the group",
memberId, MemberState.STALE);
@ -939,11 +959,13 @@ public class MembershipManagerImpl implements MembershipManager {
revokedPartitions.addAll(ownedPartitions);
revokedPartitions.removeAll(assignedTopicPartitions);
log.info("Updating assignment with local epoch {}\n" +
log.info("Reconciling assignment with local epoch {}\n" +
"\tMember: {}\n" +
"\tAssigned partitions: {}\n" +
"\tCurrent owned partitions: {}\n" +
"\tAdded partitions (assigned - owned): {}\n" +
"\tRevoked partitions (owned - assigned): {}\n",
memberId,
resolvedAssignment.localEpoch,
assignedTopicPartitions,
ownedPartitions,
@ -960,7 +982,7 @@ public class MembershipManagerImpl implements MembershipManager {
// best effort to commit the offsets in the case where the epoch might have changed while
// the current reconciliation is in process. Note this is using the rebalance timeout as
// it is the limit enforced by the broker to complete the reconciliation process.
commitResult = commitRequestManager.maybeAutoCommitSyncBeforeRevocation(getExpirationTimeForTimeout(rebalanceTimeoutMs));
commitResult = commitRequestManager.maybeAutoCommitSyncBeforeRevocation(getDeadlineMsForTimeout(rebalanceTimeoutMs));
// Execute commit -> onPartitionsRevoked -> onPartitionsAssigned.
commitResult.whenComplete((__, commitReqError) -> {
@ -986,7 +1008,7 @@ public class MembershipManagerImpl implements MembershipManager {
});
}
long getExpirationTimeForTimeout(final long timeoutMs) {
long getDeadlineMsForTimeout(final long timeoutMs) {
long expiration = time.milliseconds() + timeoutMs;
if (expiration < 0) {
return Long.MAX_VALUE;
@ -1175,11 +1197,14 @@ public class MembershipManagerImpl implements MembershipManager {
* request fails, this will proceed to invoke the user callbacks anyway,
* returning a future that will complete or fail depending on the callback execution only.
*
* @param revokedPartitions Partitions to revoke.
* @param partitionsToRevoke Partitions to revoke.
* @return Future that will complete when the commit request and user callback completes.
* Visible for testing
*/
CompletableFuture<Void> revokePartitions(Set<TopicPartition> revokedPartitions) {
CompletableFuture<Void> revokePartitions(Set<TopicPartition> partitionsToRevoke) {
// Ensure the set of partitions to revoke are still assigned
Set<TopicPartition> revokedPartitions = new HashSet<>(partitionsToRevoke);
revokedPartitions.retainAll(subscriptions.assignedPartitions());
log.info("Revoking previously assigned partitions {}", revokedPartitions.stream().map(TopicPartition::toString).collect(Collectors.joining(", ")));
logPausedPartitionsBeingRevoked(revokedPartitions);
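Besides the clearSubscription-to-clearAssignment rename, revokePartitions now intersects the requested partitions with what is still assigned before revoking. A small standalone sketch of that retainAll guard, not taken from the diff; topic names are made up.

import org.apache.kafka.common.TopicPartition;
import java.util.HashSet;
import java.util.Set;

public class RevokeIntersectionSketch {
    public static void main(String[] args) {
        Set<TopicPartition> assigned = new HashSet<>();
        assigned.add(new TopicPartition("orders", 0));                // hypothetical topic
        assigned.add(new TopicPartition("orders", 1));

        Set<TopicPartition> toRevoke = new HashSet<>();
        toRevoke.add(new TopicPartition("orders", 1));
        toRevoke.add(new TopicPartition("orders", 2));                // no longer assigned

        Set<TopicPartition> revoked = new HashSet<>(toRevoke);
        revoked.retainAll(assigned);                                  // keep only partitions still assigned
        System.out.println(revoked);                                  // [orders-1]
    }
}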

View File

@ -81,6 +81,10 @@ public class NetworkClientDelegate implements AutoCloseable {
return unsentRequests;
}
public int inflightRequestCount() {
return client.inFlightRequestCount();
}
/**
* Check if the node is disconnected and unavailable for immediate reconnection (i.e. if it is in
* reconnect backoff window following the disconnect).
@ -130,6 +134,13 @@ public class NetworkClientDelegate implements AutoCloseable {
checkDisconnects(currentTimeMs);
}
/**
* Return true if there is at least one in-flight request or unsent request.
*/
public boolean hasAnyPendingRequests() {
return client.hasInFlightRequests() || !unsentRequests.isEmpty();
}
/**
* Tries to send the requests in the unsentRequest queue. If the request doesn't have an assigned node, it will
* find the leastLoadedOne, and will be retried in the next {@code poll()}. If the request is expired, a
@ -156,7 +167,7 @@ public class NetworkClientDelegate implements AutoCloseable {
}
boolean doSend(final UnsentRequest r, final long currentTimeMs) {
Node node = r.node.orElse(client.leastLoadedNode(currentTimeMs));
Node node = r.node.orElse(client.leastLoadedNode(currentTimeMs).node());
if (node == null || nodeUnavailable(node)) {
log.debug("No broker available to send the request: {}. Retrying.", r);
return false;
@ -201,7 +212,7 @@ public class NetworkClientDelegate implements AutoCloseable {
}
public Node leastLoadedNode() {
return this.client.leastLoadedNode(time.milliseconds());
return this.client.leastLoadedNode(time.milliseconds()).node();
}
public void wakeup() {
@ -309,11 +320,20 @@ public class NetworkClientDelegate implements AutoCloseable {
@Override
public String toString() {
String remainingMs;
if (timer != null) {
timer.update();
remainingMs = String.valueOf(timer.remainingMs());
} else {
remainingMs = "<not set>";
}
return "UnsentRequest{" +
"requestBuilder=" + requestBuilder +
", handler=" + handler +
", node=" + node +
", timer=" + timer +
", remainingMs=" + remainingMs +
'}';
}
}
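hasAnyPendingRequests() lets callers treat unsent and in-flight requests uniformly when draining on close. The sketch below shows the drain-loop pattern this enables; it is illustrative only, and FakeDelegate is a hypothetical stand-in rather than a real Kafka type.

import org.apache.kafka.common.utils.MockTime;
import org.apache.kafka.common.utils.Time;
import org.apache.kafka.common.utils.Timer;

public class DrainOnCloseSketch {
    // Hypothetical stand-in for NetworkClientDelegate: two pending requests that complete one per poll().
    static class FakeDelegate {
        int pending = 2;
        boolean hasAnyPendingRequests() { return pending > 0; }
        void poll(long timeoutMs, long currentTimeMs) { if (pending > 0) pending--; }
    }

    static void drain(FakeDelegate delegate, Time time, long closeTimeoutMs) {
        Timer timer = time.timer(closeTimeoutMs);
        while (timer.notExpired() && delegate.hasAnyPendingRequests()) {
            delegate.poll(timer.remainingMs(), timer.currentTimeMs());
            timer.update();                                           // refresh the cached clock after each poll
        }
        if (delegate.hasAnyPendingRequests()) {
            System.out.println("Close timeout expired with requests still pending");
        } else {
            System.out.println("All pending requests drained before the close timeout");
        }
    }

    public static void main(String[] args) {
        drain(new FakeDelegate(), new MockTime(), 1_000);             // prints the drained message
    }
}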

View File

@ -155,6 +155,7 @@ public class RequestManagers implements Closeable {
apiVersions);
final TopicMetadataRequestManager topic = new TopicMetadataRequestManager(
logContext,
time,
config);
HeartbeatRequestManager heartbeatRequestManager = null;
MembershipManager membershipManager = null;

View File

@ -0,0 +1,71 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.kafka.clients.consumer.internals;
import org.apache.kafka.common.utils.LogContext;
import org.apache.kafka.common.utils.Time;
import org.apache.kafka.common.utils.Timer;
/**
* {@code TimedRequestState} adds to a {@link RequestState} a {@link Timer} with which to keep track
* of the request's expiration.
*/
public class TimedRequestState extends RequestState {
private final Timer timer;
public TimedRequestState(final LogContext logContext,
final String owner,
final long retryBackoffMs,
final long retryBackoffMaxMs,
final Timer timer) {
super(logContext, owner, retryBackoffMs, retryBackoffMaxMs);
this.timer = timer;
}
public TimedRequestState(final LogContext logContext,
final String owner,
final long retryBackoffMs,
final int retryBackoffExpBase,
final long retryBackoffMaxMs,
final double jitter,
final Timer timer) {
super(logContext, owner, retryBackoffMs, retryBackoffExpBase, retryBackoffMaxMs, jitter);
this.timer = timer;
}
public boolean isExpired() {
timer.update();
return timer.isExpired();
}
public long remainingMs() {
timer.update();
return timer.remainingMs();
}
public static Timer deadlineTimer(final Time time, final long deadlineMs) {
long diff = Math.max(0, deadlineMs - time.milliseconds());
return time.timer(diff);
}
@Override
protected String toStringBase() {
return super.toStringBase() + ", remainingMs=" + remainingMs();
}
}
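A short usage sketch of the new TimedRequestState, not part of the diff. It assumes the public constructor and deadlineTimer(...) shown above are accessible from test code; the backoff values are arbitrary and these internals are not a public API.

import org.apache.kafka.clients.consumer.internals.TimedRequestState;
import org.apache.kafka.common.utils.LogContext;
import org.apache.kafka.common.utils.MockTime;

public class TimedRequestStateSketch {
    public static void main(String[] args) {
        MockTime time = new MockTime();
        long deadlineMs = time.milliseconds() + 30_000;               // request must finish within 30s

        TimedRequestState state = new TimedRequestState(
            new LogContext(), "sketch", 100, 1000,
            TimedRequestState.deadlineTimer(time, deadlineMs));

        System.out.println(state.isExpired());                        // false, deadline not reached
        time.sleep(31_000);                                            // jump past the deadline
        System.out.println(state.isExpired());                         // true: isExpired() updates the timer itself
    }
}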

View File

@ -29,6 +29,7 @@ import org.apache.kafka.common.protocol.Errors;
import org.apache.kafka.common.requests.MetadataRequest;
import org.apache.kafka.common.requests.MetadataResponse;
import org.apache.kafka.common.utils.LogContext;
import org.apache.kafka.common.utils.Time;
import org.slf4j.Logger;
import java.util.Collections;
@ -61,6 +62,7 @@ import static org.apache.kafka.clients.consumer.internals.NetworkClientDelegate.
*/
public class TopicMetadataRequestManager implements RequestManager {
private final Time time;
private final boolean allowAutoTopicCreation;
private final List<TopicMetadataRequestState> inflightRequests;
private final long retryBackoffMs;
@ -68,9 +70,10 @@ public class TopicMetadataRequestManager implements RequestManager {
private final Logger log;
private final LogContext logContext;
public TopicMetadataRequestManager(final LogContext context, final ConsumerConfig config) {
public TopicMetadataRequestManager(final LogContext context, final Time time, final ConsumerConfig config) {
logContext = context;
log = logContext.logger(getClass());
this.time = time;
inflightRequests = new LinkedList<>();
retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG);
retryBackoffMaxMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG);
@ -81,7 +84,7 @@ public class TopicMetadataRequestManager implements RequestManager {
public NetworkClientDelegate.PollResult poll(final long currentTimeMs) {
// Prune any requests which have timed out
List<TopicMetadataRequestState> expiredRequests = inflightRequests.stream()
.filter(req -> req.isExpired(currentTimeMs))
.filter(TimedRequestState::isExpired)
.collect(Collectors.toList());
expiredRequests.forEach(TopicMetadataRequestState::expire);
@ -99,10 +102,10 @@ public class TopicMetadataRequestManager implements RequestManager {
*
* @return the future of the metadata request.
*/
public CompletableFuture<Map<String, List<PartitionInfo>>> requestAllTopicsMetadata(final long expirationTimeMs) {
public CompletableFuture<Map<String, List<PartitionInfo>>> requestAllTopicsMetadata(final long deadlineMs) {
TopicMetadataRequestState newRequest = new TopicMetadataRequestState(
logContext,
expirationTimeMs,
deadlineMs,
retryBackoffMs,
retryBackoffMaxMs);
inflightRequests.add(newRequest);
@ -115,11 +118,11 @@ public class TopicMetadataRequestManager implements RequestManager {
* @param topic to be requested.
* @return the future of the metadata request.
*/
public CompletableFuture<Map<String, List<PartitionInfo>>> requestTopicMetadata(final String topic, final long expirationTimeMs) {
public CompletableFuture<Map<String, List<PartitionInfo>>> requestTopicMetadata(final String topic, final long deadlineMs) {
TopicMetadataRequestState newRequest = new TopicMetadataRequestState(
logContext,
topic,
expirationTimeMs,
deadlineMs,
retryBackoffMs,
retryBackoffMaxMs);
inflightRequests.add(newRequest);
@ -131,35 +134,32 @@ public class TopicMetadataRequestManager implements RequestManager {
return inflightRequests;
}
class TopicMetadataRequestState extends RequestState {
class TopicMetadataRequestState extends TimedRequestState {
private final String topic;
private final boolean allTopics;
private final long expirationTimeMs;
CompletableFuture<Map<String, List<PartitionInfo>>> future;
public TopicMetadataRequestState(final LogContext logContext,
final long expirationTimeMs,
final long deadlineMs,
final long retryBackoffMs,
final long retryBackoffMaxMs) {
super(logContext, TopicMetadataRequestState.class.getSimpleName(), retryBackoffMs,
retryBackoffMaxMs);
retryBackoffMaxMs, deadlineTimer(time, deadlineMs));
future = new CompletableFuture<>();
this.topic = null;
this.allTopics = true;
this.expirationTimeMs = expirationTimeMs;
}
public TopicMetadataRequestState(final LogContext logContext,
final String topic,
final long expirationTimeMs,
final long deadlineMs,
final long retryBackoffMs,
final long retryBackoffMaxMs) {
super(logContext, TopicMetadataRequestState.class.getSimpleName(), retryBackoffMs,
retryBackoffMaxMs);
retryBackoffMaxMs, deadlineTimer(time, deadlineMs));
future = new CompletableFuture<>();
this.topic = topic;
this.allTopics = false;
this.expirationTimeMs = expirationTimeMs;
}
/**
@ -167,10 +167,6 @@ public class TopicMetadataRequestManager implements RequestManager {
* {@link org.apache.kafka.clients.consumer.internals.NetworkClientDelegate.UnsentRequest} if needed.
*/
private Optional<NetworkClientDelegate.UnsentRequest> send(final long currentTimeMs) {
if (currentTimeMs >= expirationTimeMs) {
return Optional.empty();
}
if (!canSendRequest(currentTimeMs)) {
return Optional.empty();
}
@ -183,10 +179,6 @@ public class TopicMetadataRequestManager implements RequestManager {
return Optional.of(createUnsentRequest(request));
}
private boolean isExpired(final long currentTimeMs) {
return currentTimeMs >= expirationTimeMs;
}
private void expire() {
completeFutureAndRemoveRequest(
new TimeoutException("Timeout expired while fetching topic metadata"));
@ -210,9 +202,8 @@ public class TopicMetadataRequestManager implements RequestManager {
private void handleError(final Throwable exception,
final long completionTimeMs) {
if (exception instanceof RetriableException) {
if (completionTimeMs >= expirationTimeMs) {
completeFutureAndRemoveRequest(
new TimeoutException("Timeout expired while fetching topic metadata"));
if (isExpired()) {
completeFutureAndRemoveRequest(new TimeoutException("Timeout expired while fetching topic metadata"));
} else {
onFailedAttempt(completionTimeMs);
}
@ -222,20 +213,12 @@ public class TopicMetadataRequestManager implements RequestManager {
}
private void handleResponse(final ClientResponse response) {
long responseTimeMs = response.receivedTimeMs();
try {
Map<String, List<PartitionInfo>> res = handleTopicMetadataResponse((MetadataResponse) response.responseBody());
future.complete(res);
inflightRequests.remove(this);
} catch (RetriableException e) {
if (responseTimeMs >= expirationTimeMs) {
completeFutureAndRemoveRequest(
new TimeoutException("Timeout expired while fetching topic metadata"));
} else {
onFailedAttempt(responseTimeMs);
}
} catch (Exception t) {
completeFutureAndRemoveRequest(t);
} catch (Exception e) {
handleError(e, response.receivedTimeMs());
}
}
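Callers of requestTopicMetadata(...) and requestAllTopicsMetadata(...) now pass an absolute deadlineMs instead of a precomputed expiration time. A minimal sketch of deriving that deadline from a relative timeout, mirroring the overflow clamp in getDeadlineMsForTimeout(...) earlier in this diff; it is illustrative only.

import org.apache.kafka.common.utils.Time;

public class DeadlineFromTimeoutSketch {
    static long deadlineMs(Time time, long timeoutMs) {
        long deadline = time.milliseconds() + timeoutMs;
        return deadline < 0 ? Long.MAX_VALUE : deadline;              // guard against long overflow for huge timeouts
    }

    public static void main(String[] args) {
        Time time = Time.SYSTEM;
        System.out.println(deadlineMs(time, 30_000));                 // e.g. pass this as deadlineMs to requestTopicMetadata
        System.out.println(deadlineMs(time, Long.MAX_VALUE));         // clamps to Long.MAX_VALUE instead of going negative
    }
}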

View File

@ -31,7 +31,7 @@ public abstract class ApplicationEvent {
COMMIT_ASYNC, COMMIT_SYNC, POLL, FETCH_COMMITTED_OFFSETS, NEW_TOPICS_METADATA_UPDATE, ASSIGNMENT_CHANGE,
LIST_OFFSETS, RESET_POSITIONS, VALIDATE_POSITIONS, TOPIC_METADATA, ALL_TOPICS_METADATA, SUBSCRIPTION_CHANGE,
UNSUBSCRIBE, CONSUMER_REBALANCE_LISTENER_CALLBACK_COMPLETED,
COMMIT_ON_CLOSE, LEAVE_ON_CLOSE
COMMIT_ON_CLOSE
}
private final Type type;

View File

@ -32,7 +32,6 @@ import org.slf4j.Logger;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;
import java.util.function.Supplier;
@ -119,10 +118,6 @@ public class ApplicationEventProcessor implements EventProcessor<ApplicationEven
process((CommitOnCloseEvent) event);
return;
case LEAVE_ON_CLOSE:
process((LeaveOnCloseEvent) event);
return;
default:
log.warn("Application event type " + event.type() + " was not expected");
}
@ -268,20 +263,6 @@ public class ApplicationEventProcessor implements EventProcessor<ApplicationEven
requestManagers.commitRequestManager.get().signalClose();
}
private void process(final LeaveOnCloseEvent event) {
if (!requestManagers.heartbeatRequestManager.isPresent()) {
event.future().complete(null);
return;
}
MembershipManager membershipManager =
Objects.requireNonNull(requestManagers.heartbeatRequestManager.get().membershipManager(), "Expecting " +
"membership manager to be non-null");
log.debug("Leaving group before closing");
CompletableFuture<Void> future = membershipManager.leaveGroup();
// The future will be completed on heartbeat sent
future.whenComplete(complete(event.future()));
}
private <T> BiConsumer<? super T, ? super Throwable> complete(final CompletableFuture<T> b) {
return (value, exception) -> {
if (exception != null)

View File

@ -43,6 +43,7 @@ public abstract class CompletableApplicationEvent<T> extends ApplicationEvent im
return future;
}
@Override
public long deadlineMs() {
return deadlineMs;
}

View File

@ -18,6 +18,7 @@ package org.apache.kafka.clients.producer;
import org.apache.kafka.clients.ClientDnsLookup;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.common.compress.GzipCompression;
import org.apache.kafka.common.compress.Lz4Compression;
import org.apache.kafka.common.compress.ZstdCompression;
@ -528,7 +529,14 @@ public class ProducerConfig extends AbstractConfig {
null,
new ConfigDef.NonEmptyString(),
Importance.LOW,
TRANSACTIONAL_ID_DOC);
TRANSACTIONAL_ID_DOC)
.define(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG,
Type.STRING,
CommonClientConfigs.DEFAULT_METADATA_RECOVERY_STRATEGY,
ConfigDef.CaseInsensitiveValidString
.in(Utils.enumOptions(MetadataRecoveryStrategy.class)),
Importance.LOW,
CommonClientConfigs.METADATA_RECOVERY_STRATEGY_DOC);
}
@Override
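The producer now accepts the same metadata recovery setting as the admin and consumer configs. A hedged example of setting it through ProducerConfig follows; the broker address is a placeholder, and REBOOTSTRAP is assumed to be the non-default value of the MetadataRecoveryStrategy enum.

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.clients.producer.ProducerConfig;
import java.util.HashMap;
import java.util.Map;

public class MetadataRecoveryConfigSketch {
    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG,
                  MetadataRecoveryStrategy.REBOOTSTRAP.name);                    // default is NONE

        ProducerConfig config = new ProducerConfig(props);
        System.out.println(config.getString(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG));
    }
}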

View File

@ -484,7 +484,7 @@ public class Sender implements Runnable {
FindCoordinatorRequest.CoordinatorType coordinatorType = nextRequestHandler.coordinatorType();
targetNode = coordinatorType != null ?
transactionManager.coordinator(coordinatorType) :
client.leastLoadedNode(time.milliseconds());
client.leastLoadedNode(time.milliseconds()).node();
if (targetNode != null) {
if (!awaitNodeReady(targetNode, coordinatorType)) {
log.trace("Target node {} not ready within request timeout, will retry when node is ready.", targetNode);

View File

@ -766,14 +766,7 @@ public class ConfigDef {
if (value instanceof Class)
return value;
else if (value instanceof String) {
ClassLoader contextOrKafkaClassLoader = Utils.getContextOrKafkaClassLoader();
// Use loadClass here instead of Class.forName because the name we use here may be an alias
// and not match the name of the class that gets loaded. If that happens, Class.forName can
// throw an exception.
Class<?> klass = contextOrKafkaClassLoader.loadClass(trimmed);
// Invoke forName here with the true name of the requested class to cause class
// initialization to take place.
return Class.forName(klass.getName(), true, contextOrKafkaClassLoader);
return Utils.loadClass(trimmed, Object.class);
} else
throw new ConfigException(name, value, "Expected a Class instance or class name.");
default:

View File

@ -320,7 +320,8 @@ public class SaslChannelBuilder implements ChannelBuilder, ListenerReconfigurabl
AuthenticateCallbackHandler callbackHandler;
String prefix = ListenerName.saslMechanismPrefix(mechanism);
@SuppressWarnings("unchecked")
Class<? extends AuthenticateCallbackHandler> clazz = (Class<? extends AuthenticateCallbackHandler>) configs.get(SaslConfigs.SASL_CLIENT_CALLBACK_HANDLER_CLASS);
Class<? extends AuthenticateCallbackHandler> clazz =
(Class<? extends AuthenticateCallbackHandler>) configs.get(prefix + BrokerSecurityConfigs.SASL_SERVER_CALLBACK_HANDLER_CLASS_CONFIG);
// AutoMQ inject start
if (clazz != null) {
if (Utils.hasConstructor(clazz, SaslChannelBuilder.class)) {

View File

@ -871,6 +871,18 @@ public class MemoryRecordsBuilder implements AutoCloseable {
return this.writeLimit >= estimatedBytesWritten() + recordSize;
}
/**
* Check if we have room for a given number of bytes.
*/
public boolean hasRoomFor(int estimatedRecordsSize) {
if (isFull()) return false;
return this.writeLimit >= estimatedBytesWritten() + estimatedRecordsSize;
}
public int maxAllowedBytes() {
return this.writeLimit - this.batchHeaderSizeInBytes;
}
public boolean isClosed() {
return builtRecords != null;
}
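The two AutoMQ additions above expose a size-based room check and the usable byte budget of a batch. The sketch below restates the arithmetic in a self-contained form; BatchSizeEstimate is hypothetical and only mirrors the writeLimit/estimatedBytesWritten bookkeeping (the real hasRoomFor also short-circuits when the builder is full).

public class BatchRoomSketch {
    static class BatchSizeEstimate {
        final int writeLimit;                 // max bytes the batch may hold
        final int batchHeaderSizeInBytes;     // fixed header overhead
        int estimatedBytesWritten;            // bytes appended so far (estimate)

        BatchSizeEstimate(int writeLimit, int headerSize) {
            this.writeLimit = writeLimit;
            this.batchHeaderSizeInBytes = headerSize;
            this.estimatedBytesWritten = headerSize;
        }

        boolean hasRoomFor(int estimatedRecordsSize) {
            return writeLimit >= estimatedBytesWritten + estimatedRecordsSize;
        }

        int maxAllowedBytes() {
            return writeLimit - batchHeaderSizeInBytes;
        }
    }

    public static void main(String[] args) {
        BatchSizeEstimate batch = new BatchSizeEstimate(16_384, 61);  // 16 KiB limit, hypothetical header size
        System.out.println(batch.hasRoomFor(1_000));                  // true: plenty of room left
        System.out.println(batch.hasRoomFor(20_000));                 // false: would exceed the write limit
        System.out.println(batch.maxAllowedBytes());                  // 16323 usable bytes after the header
    }
}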

View File

@ -47,8 +47,6 @@ public class ListOffsetsRequest extends AbstractRequest {
*/
public static final long EARLIEST_LOCAL_TIMESTAMP = -4L;
public static final long LATEST_TIERED_TIMESTAMP = -5L;
public static final int CONSUMER_REPLICA_ID = -1;
public static final int DEBUGGING_REPLICA_ID = -2;

View File

@ -94,7 +94,7 @@ public final class Utils {
// This matches URIs of formats: host:port and protocol://host:port
// IPv6 is supported with [ip] pattern
private static final Pattern HOST_PORT_PATTERN = Pattern.compile("^(?:[a-zA-Z][a-zA-Z\\d+-.]*://)?\\[?([0-9a-zA-Z\\-._%:]+)\\]?:([0-9]+)$");
private static final Pattern HOST_PORT_PATTERN = Pattern.compile("^(?:[0-9a-zA-Z\\-%._]*://)?\\[?([0-9a-zA-Z\\-%._:]*)]?:([0-9]+)");
private static final Pattern VALID_HOST_CHARACTERS = Pattern.compile("([0-9a-zA-Z\\-%._:]*)");
@ -451,7 +451,14 @@ public final class Utils {
* @return the new class
*/
public static <T> Class<? extends T> loadClass(String klass, Class<T> base) throws ClassNotFoundException {
return Class.forName(klass, true, Utils.getContextOrKafkaClassLoader()).asSubclass(base);
ClassLoader contextOrKafkaClassLoader = Utils.getContextOrKafkaClassLoader();
// Use loadClass here instead of Class.forName because the name we use here may be an alias
// and not match the name of the class that gets loaded. If that happens, Class.forName can
// throw an exception.
Class<?> loadedClass = contextOrKafkaClassLoader.loadClass(klass);
// Invoke forName here with the true name of the requested class to cause class
// initialization to take place.
return Class.forName(loadedClass.getName(), true, contextOrKafkaClassLoader).asSubclass(base);
}
/**
@ -480,7 +487,7 @@ public final class Utils {
Class<?>[] argTypes = new Class<?>[params.length / 2];
Object[] args = new Object[params.length / 2];
try {
Class<?> c = Class.forName(className, true, Utils.getContextOrKafkaClassLoader());
Class<?> c = Utils.loadClass(className, Object.class);
for (int i = 0; i < params.length / 2; i++) {
argTypes[i] = (Class<?>) params[2 * i];
args[i] = params[(2 * i) + 1];
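ConfigDef and getConstructorOrDie now both delegate to Utils.loadClass, which resolves possible aliases through the context or Kafka class loader before triggering static initialization with Class.forName. A minimal usage sketch, using a JDK class name purely for illustration:

import org.apache.kafka.common.utils.Utils;

public class LoadClassSketch {
    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> clazz = Utils.loadClass("java.util.ArrayList", Object.class);
        System.out.println(clazz.getName());                          // java.util.ArrayList, loaded and initialized
    }
}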

View File

@ -319,7 +319,7 @@ public class MockClient implements KafkaClient {
checkTimeoutOfPendingRequests(now);
// We skip metadata updates if all nodes are currently blacked out
if (metadataUpdater.isUpdateNeeded() && leastLoadedNode(now) != null) {
if (metadataUpdater.isUpdateNeeded() && leastLoadedNode(now).node() != null) {
MetadataUpdate metadataUpdate = metadataUpdates.poll();
if (metadataUpdate != null) {
metadataUpdater.update(time, metadataUpdate);
@ -588,13 +588,13 @@ public class MockClient implements KafkaClient {
}
@Override
public Node leastLoadedNode(long now) {
public LeastLoadedNode leastLoadedNode(long now) {
// Consistent with NetworkClient, we do not return nodes awaiting reconnect backoff
for (Node node : metadataUpdater.fetchNodes()) {
if (!connectionState(node.idString()).isBackingOff(now))
return node;
return new LeastLoadedNode(node, true);
}
return null;
return new LeastLoadedNode(null, false);
}
public void setWakeupHook(Runnable wakeupHook) {
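leastLoadedNode(...) now returns a LeastLoadedNode wrapper instead of a bare Node, so callers can distinguish "no node known" from "nodes exist but none can accept a new request". A small sketch of the wrapper as exercised by the tests below; the import path is assumed to be org.apache.kafka.clients and the broker address is a placeholder.

import org.apache.kafka.clients.LeastLoadedNode;
import org.apache.kafka.common.Node;

public class LeastLoadedNodeSketch {
    public static void main(String[] args) {
        Node broker = new Node(1, "localhost", 9092);                 // placeholder broker

        LeastLoadedNode ready = new LeastLoadedNode(broker, true);
        System.out.println(ready.node() + " " + ready.hasNodeAvailableOrConnectionReady());   // broker, true

        LeastLoadedNode none = new LeastLoadedNode(null, false);
        System.out.println(none.node() + " " + none.hasNodeAvailableOrConnectionReady());     // null, false
    }
}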

View File

@ -128,7 +128,16 @@ public class NetworkClientTest {
private NetworkClient createNetworkClient(long reconnectBackoffMaxMs) {
return new NetworkClient(selector, metadataUpdater, "mock", Integer.MAX_VALUE,
reconnectBackoffMsTest, reconnectBackoffMaxMs, 64 * 1024, 64 * 1024,
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, true, new ApiVersions(), new LogContext());
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, true, new ApiVersions(), new LogContext(),
MetadataRecoveryStrategy.NONE);
}
private NetworkClient createNetworkClientWithMaxInFlightRequestsPerConnection(
int maxInFlightRequestsPerConnection, long reconnectBackoffMaxMs) {
return new NetworkClient(selector, metadataUpdater, "mock", maxInFlightRequestsPerConnection,
reconnectBackoffMsTest, reconnectBackoffMaxMs, 64 * 1024, 64 * 1024,
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, true, new ApiVersions(), new LogContext(),
MetadataRecoveryStrategy.NONE);
}
private NetworkClient createNetworkClientWithMultipleNodes(long reconnectBackoffMaxMs, long connectionSetupTimeoutMsTest, int nodeNumber) {
@ -136,26 +145,30 @@ public class NetworkClientTest {
TestMetadataUpdater metadataUpdater = new TestMetadataUpdater(nodes);
return new NetworkClient(selector, metadataUpdater, "mock", Integer.MAX_VALUE,
reconnectBackoffMsTest, reconnectBackoffMaxMs, 64 * 1024, 64 * 1024,
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, true, new ApiVersions(), new LogContext());
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, true, new ApiVersions(), new LogContext(),
MetadataRecoveryStrategy.NONE);
}
private NetworkClient createNetworkClientWithStaticNodes() {
return new NetworkClient(selector, metadataUpdater,
"mock-static", Integer.MAX_VALUE, 0, 0, 64 * 1024, 64 * 1024, defaultRequestTimeoutMs,
connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, true, new ApiVersions(), new LogContext());
connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, true, new ApiVersions(), new LogContext(),
MetadataRecoveryStrategy.NONE);
}
private NetworkClient createNetworkClientWithNoVersionDiscovery(Metadata metadata) {
return new NetworkClient(selector, metadata, "mock", Integer.MAX_VALUE,
reconnectBackoffMsTest, 0, 64 * 1024, 64 * 1024,
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, false, new ApiVersions(), new LogContext());
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, false, new ApiVersions(), new LogContext(),
MetadataRecoveryStrategy.NONE);
}
private NetworkClient createNetworkClientWithNoVersionDiscovery() {
return new NetworkClient(selector, metadataUpdater, "mock", Integer.MAX_VALUE,
reconnectBackoffMsTest, reconnectBackoffMaxMsTest,
64 * 1024, 64 * 1024, defaultRequestTimeoutMs,
connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, false, new ApiVersions(), new LogContext());
connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest, time, false, new ApiVersions(), new LogContext(),
MetadataRecoveryStrategy.NONE);
}
@BeforeEach
@ -698,14 +711,18 @@ public class NetworkClientTest {
public void testLeastLoadedNode() {
client.ready(node, time.milliseconds());
assertFalse(client.isReady(node, time.milliseconds()));
assertEquals(node, client.leastLoadedNode(time.milliseconds()));
LeastLoadedNode leastLoadedNode = client.leastLoadedNode(time.milliseconds());
assertEquals(node, leastLoadedNode.node());
assertTrue(leastLoadedNode.hasNodeAvailableOrConnectionReady());
awaitReady(client, node);
client.poll(1, time.milliseconds());
assertTrue(client.isReady(node, time.milliseconds()), "The client should be ready");
// leastloadednode should be our single node
Node leastNode = client.leastLoadedNode(time.milliseconds());
leastLoadedNode = client.leastLoadedNode(time.milliseconds());
assertTrue(leastLoadedNode.hasNodeAvailableOrConnectionReady());
Node leastNode = leastLoadedNode.node();
assertEquals(leastNode.id(), node.id(), "There should be one leastloadednode");
// sleep for longer than reconnect backoff
@ -716,8 +733,29 @@ public class NetworkClientTest {
client.poll(1, time.milliseconds());
assertFalse(client.ready(node, time.milliseconds()), "After we forced the disconnection the client is no longer ready.");
leastNode = client.leastLoadedNode(time.milliseconds());
assertNull(leastNode, "There should be NO leastloadednode");
leastLoadedNode = client.leastLoadedNode(time.milliseconds());
assertFalse(leastLoadedNode.hasNodeAvailableOrConnectionReady());
assertNull(leastLoadedNode.node(), "There should be NO leastloadednode");
}
@Test
public void testHasNodeAvailableOrConnectionReady() {
NetworkClient client = createNetworkClientWithMaxInFlightRequestsPerConnection(1, reconnectBackoffMaxMsTest);
awaitReady(client, node);
long now = time.milliseconds();
LeastLoadedNode leastLoadedNode = client.leastLoadedNode(now);
assertEquals(node, leastLoadedNode.node());
assertTrue(leastLoadedNode.hasNodeAvailableOrConnectionReady());
MetadataRequest.Builder builder = new MetadataRequest.Builder(Collections.emptyList(), true);
ClientRequest request = client.newClientRequest(node.idString(), builder, now, true);
client.send(request, now);
client.poll(defaultRequestTimeoutMs, now);
leastLoadedNode = client.leastLoadedNode(now);
assertNull(leastLoadedNode.node());
assertTrue(leastLoadedNode.hasNodeAvailableOrConnectionReady());
}
@Test
@ -727,7 +765,7 @@ public class NetworkClientTest {
Set<Node> providedNodeIds = new HashSet<>();
for (int i = 0; i < nodeNumber * 10; i++) {
Node node = client.leastLoadedNode(time.milliseconds());
Node node = client.leastLoadedNode(time.milliseconds()).node();
assertNotNull(node, "Should provide a node");
providedNodeIds.add(node);
client.ready(node, time.milliseconds());
@ -800,7 +838,7 @@ public class NetworkClientTest {
client.poll(1, time.milliseconds());
// leastloadednode should return null since the node is throttled
assertNull(client.leastLoadedNode(time.milliseconds()));
assertNull(client.leastLoadedNode(time.milliseconds()).node());
}
@Test
@ -1046,7 +1084,8 @@ public class NetworkClientTest {
NetworkClient client = new NetworkClient(metadataUpdater, null, selector, "mock", Integer.MAX_VALUE,
reconnectBackoffMsTest, reconnectBackoffMaxMsTest, 64 * 1024, 64 * 1024,
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest,
time, false, new ApiVersions(), null, new LogContext(), mockHostResolver, mockClientTelemetrySender);
time, false, new ApiVersions(), null, new LogContext(), mockHostResolver, mockClientTelemetrySender,
MetadataRecoveryStrategy.NONE);
// Connect to one of the initial addresses, then change the addresses and disconnect
client.ready(node, time.milliseconds());
@ -1106,7 +1145,8 @@ public class NetworkClientTest {
NetworkClient client = new NetworkClient(metadataUpdater, null, selector, "mock", Integer.MAX_VALUE,
reconnectBackoffMsTest, reconnectBackoffMaxMsTest, 64 * 1024, 64 * 1024,
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest,
time, false, new ApiVersions(), null, new LogContext(), mockHostResolver, mockClientTelemetrySender);
time, false, new ApiVersions(), null, new LogContext(), mockHostResolver, mockClientTelemetrySender,
MetadataRecoveryStrategy.NONE);
// First connection attempt should fail
client.ready(node, time.milliseconds());
@ -1158,7 +1198,8 @@ public class NetworkClientTest {
NetworkClient client = new NetworkClient(metadataUpdater, null, selector, "mock", Integer.MAX_VALUE,
reconnectBackoffMsTest, reconnectBackoffMaxMsTest, 64 * 1024, 64 * 1024,
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest,
time, false, new ApiVersions(), null, new LogContext(), mockHostResolver, mockClientTelemetrySender);
time, false, new ApiVersions(), null, new LogContext(), mockHostResolver, mockClientTelemetrySender,
MetadataRecoveryStrategy.NONE);
// Connect to one of the initial addresses, then change the addresses and disconnect
client.ready(node, time.milliseconds());
@ -1266,7 +1307,8 @@ public class NetworkClientTest {
NetworkClient client = new NetworkClient(metadataUpdater, null, selector, "mock", Integer.MAX_VALUE,
reconnectBackoffMsTest, reconnectBackoffMaxMsTest, 64 * 1024, 64 * 1024,
defaultRequestTimeoutMs, connectionSetupTimeoutMsTest, connectionSetupTimeoutMaxMsTest,
time, true, new ApiVersions(), null, new LogContext(), new DefaultHostResolver(), mockClientTelemetrySender);
time, true, new ApiVersions(), null, new LogContext(), new DefaultHostResolver(), mockClientTelemetrySender,
MetadataRecoveryStrategy.NONE);
// Send the ApiVersionsRequest
client.ready(node, time.milliseconds());

View File

@ -0,0 +1,47 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.kafka.clients.admin;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.common.config.ConfigException;
import org.junit.jupiter.api.Test;
import java.util.HashMap;
import java.util.Map;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
public class AdminClientConfigTest {
@Test
public void testDefaultMetadataRecoveryStrategy() {
Map<String, Object> configs = new HashMap<>();
final AdminClientConfig adminClientConfig = new AdminClientConfig(configs);
assertEquals(MetadataRecoveryStrategy.NONE.name, adminClientConfig.getString(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG));
}
@Test
public void testInvalidMetadataRecoveryStrategy() {
Map<String, Object> configs = new HashMap<>();
configs.put(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG, "abc");
ConfigException ce = assertThrows(ConfigException.class, () -> new AdminClientConfig(configs));
assertTrue(ce.getMessage().contains(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG));
}
}

View File

@ -1464,6 +1464,7 @@ public class KafkaAdminClientTest {
assertEquals(0, topicDescription.partitions().get(0).partition());
assertEquals(1, topicDescription.partitions().get(1).partition());
topicDescription = topicDescriptions.get(topicName1);
assertNull(topicDescription.authorizedOperations());
assertEquals(1, topicDescription.partitions().size());
} catch (Exception e) {
fail("describe using DescribeTopics API should not fail", e);
@ -1471,6 +1472,77 @@ public class KafkaAdminClientTest {
}
}
@Test
public void testDescribeTopicPartitionsApiWithAuthorizedOps() throws ExecutionException, InterruptedException {
try (AdminClientUnitTestEnv env = mockClientEnv()) {
env.kafkaClient().setNodeApiVersions(NodeApiVersions.create());
String topicName0 = "test-0";
Uuid topicId = Uuid.randomUuid();
int authorisedOperations = Utils.to32BitField(Utils.mkSet(AclOperation.DESCRIBE.code(), AclOperation.ALTER.code()));
env.kafkaClient().prepareResponse(
prepareDescribeClusterResponse(0,
env.cluster().nodes(),
env.cluster().clusterResource().clusterId(),
2,
authorisedOperations)
);
DescribeTopicPartitionsResponseData responseData = new DescribeTopicPartitionsResponseData();
responseData.topics().add(new DescribeTopicPartitionsResponseTopic()
.setErrorCode((short) 0)
.setTopicId(topicId)
.setName(topicName0)
.setIsInternal(false)
.setTopicAuthorizedOperations(authorisedOperations));
env.kafkaClient().prepareResponse(new DescribeTopicPartitionsResponse(responseData));
DescribeTopicsResult result = env.adminClient().describeTopics(
singletonList(topicName0), new DescribeTopicsOptions().includeAuthorizedOperations(true)
);
Map<String, TopicDescription> topicDescriptions = result.allTopicNames().get();
TopicDescription topicDescription = topicDescriptions.get(topicName0);
assertEquals(new HashSet<>(asList(AclOperation.DESCRIBE, AclOperation.ALTER)),
topicDescription.authorizedOperations());
}
}
@Test
public void testDescribeTopicPartitionsApiWithoutAuthorizedOps() throws ExecutionException, InterruptedException {
try (AdminClientUnitTestEnv env = mockClientEnv()) {
env.kafkaClient().setNodeApiVersions(NodeApiVersions.create());
String topicName0 = "test-0";
Uuid topicId = Uuid.randomUuid();
int authorisedOperations = Utils.to32BitField(Utils.mkSet(AclOperation.DESCRIBE.code(), AclOperation.ALTER.code()));
env.kafkaClient().prepareResponse(
prepareDescribeClusterResponse(0,
env.cluster().nodes(),
env.cluster().clusterResource().clusterId(),
2,
authorisedOperations)
);
DescribeTopicPartitionsResponseData responseData = new DescribeTopicPartitionsResponseData();
responseData.topics().add(new DescribeTopicPartitionsResponseTopic()
.setErrorCode((short) 0)
.setTopicId(topicId)
.setName(topicName0)
.setIsInternal(false)
.setTopicAuthorizedOperations(authorisedOperations));
env.kafkaClient().prepareResponse(new DescribeTopicPartitionsResponse(responseData));
DescribeTopicsResult result = env.adminClient().describeTopics(
singletonList(topicName0), new DescribeTopicsOptions().includeAuthorizedOperations(false)
);
Map<String, TopicDescription> topicDescriptions = result.allTopicNames().get();
TopicDescription topicDescription = topicDescriptions.get(topicName0);
assertNull(topicDescription.authorizedOperations());
}
}
@SuppressWarnings({"NPathComplexity", "CyclomaticComplexity"})
@Test
public void testDescribeTopicsWithDescribeTopicPartitionsApiEdgeCase() {
@ -1547,6 +1619,7 @@ public class KafkaAdminClientTest {
assertEquals(2, topicDescription.partitions().size());
topicDescription = topicDescriptions.get(topicName2);
assertEquals(2, topicDescription.partitions().size());
assertNull(topicDescription.authorizedOperations());
} catch (Exception e) {
fail("describe using DescribeTopics API should not fail", e);
}
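The new tests above check that authorizedOperations() is populated only when it was explicitly requested. A hedged sketch of the corresponding public Admin API call; the broker address and topic name are placeholders.

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeTopicsOptions;
import org.apache.kafka.clients.admin.TopicDescription;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class DescribeTopicsSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        try (Admin admin = Admin.create(props)) {
            Map<String, TopicDescription> topics = admin
                .describeTopics(Collections.singletonList("test-0"),
                                new DescribeTopicsOptions().includeAuthorizedOperations(true))
                .allTopicNames()
                .get();
            // authorizedOperations() is non-null only when includeAuthorizedOperations(true) was set
            System.out.println(topics.get("test-0").authorizedOperations());
        }
    }
}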

View File

@ -16,6 +16,7 @@
*/
package org.apache.kafka.clients.admin.internals;
import org.apache.kafka.clients.admin.FenceProducersOptions;
import org.apache.kafka.clients.admin.internals.AdminApiHandler.ApiResult;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.message.InitProducerIdResponseData;
@ -39,11 +40,21 @@ import static org.junit.jupiter.api.Assertions.assertInstanceOf;
public class FenceProducersHandlerTest {
private final LogContext logContext = new LogContext();
private final Node node = new Node(1, "host", 1234);
private final int requestTimeoutMs = 30000;
private final FenceProducersOptions options = new FenceProducersOptions();
@Test
public void testBuildRequest() {
FenceProducersHandler handler = new FenceProducersHandler(logContext);
mkSet("foo", "bar", "baz").forEach(transactionalId -> assertLookup(handler, transactionalId));
FenceProducersHandler handler = new FenceProducersHandler(options, logContext, requestTimeoutMs);
mkSet("foo", "bar", "baz").forEach(transactionalId -> assertLookup(handler, transactionalId, requestTimeoutMs));
}
@Test
public void testBuildRequestOptionsTimeout() {
final int optionsTimeoutMs = 50000;
options.timeoutMs(optionsTimeoutMs);
FenceProducersHandler handler = new FenceProducersHandler(options, logContext, requestTimeoutMs);
mkSet("foo", "bar", "baz").forEach(transactionalId -> assertLookup(handler, transactionalId, optionsTimeoutMs));
}
@Test
@ -51,7 +62,7 @@ public class FenceProducersHandlerTest {
String transactionalId = "foo";
CoordinatorKey key = CoordinatorKey.byTransactionalId(transactionalId);
FenceProducersHandler handler = new FenceProducersHandler(logContext);
FenceProducersHandler handler = new FenceProducersHandler(options, logContext, requestTimeoutMs);
short epoch = 57;
long producerId = 7;
@ -73,7 +84,7 @@ public class FenceProducersHandlerTest {
@Test
public void testHandleErrorResponse() {
String transactionalId = "foo";
FenceProducersHandler handler = new FenceProducersHandler(logContext);
FenceProducersHandler handler = new FenceProducersHandler(options, logContext, requestTimeoutMs);
assertFatalError(handler, transactionalId, Errors.TRANSACTIONAL_ID_AUTHORIZATION_FAILED);
assertFatalError(handler, transactionalId, Errors.CLUSTER_AUTHORIZATION_FAILED);
assertFatalError(handler, transactionalId, Errors.UNKNOWN_SERVER_ERROR);
@ -83,6 +94,7 @@ public class FenceProducersHandlerTest {
assertRetriableError(handler, transactionalId, Errors.COORDINATOR_LOAD_IN_PROGRESS);
assertUnmappedKey(handler, transactionalId, Errors.NOT_COORDINATOR);
assertUnmappedKey(handler, transactionalId, Errors.COORDINATOR_NOT_AVAILABLE);
assertRetriableError(handler, transactionalId, Errors.CONCURRENT_TRANSACTIONS);
}
private void assertFatalError(
@ -136,10 +148,10 @@ public class FenceProducersHandlerTest {
return result;
}
private void assertLookup(FenceProducersHandler handler, String transactionalId) {
private void assertLookup(FenceProducersHandler handler, String transactionalId, int txnTimeoutMs) {
CoordinatorKey key = CoordinatorKey.byTransactionalId(transactionalId);
InitProducerIdRequest.Builder request = handler.buildSingleRequest(1, key);
assertEquals(transactionalId, request.data.transactionalId());
assertEquals(1, request.data.transactionTimeoutMs());
assertEquals(txnTimeoutMs, request.data.transactionTimeoutMs());
}
}
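Per the updated tests, the handler now takes FenceProducersOptions and a request timeout, and the options timeout, when set, is used as the transaction timeout of the InitProducerId request (falling back to request.timeout.ms otherwise). A hedged sketch of supplying that timeout through the public Admin API; the broker address and transactional ID are placeholders.

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.FenceProducersOptions;
import java.util.Collections;
import java.util.Properties;

public class FenceProducersSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        try (Admin admin = Admin.create(props)) {
            FenceProducersOptions options = new FenceProducersOptions();
            options.timeoutMs(50_000);    // also used as the InitProducerId transaction timeout per the change above
            admin.fenceProducers(Collections.singleton("my-transactional-id"), options)
                 .all()
                 .get();
        }
    }
}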

View File

@ -17,6 +17,7 @@
package org.apache.kafka.clients.consumer;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.common.config.ConfigException;
import org.apache.kafka.common.errors.InvalidConfigurationException;
import org.apache.kafka.common.security.auth.SecurityProtocol;
@ -191,6 +192,25 @@ public class ConsumerConfigTest {
assertEquals(remoteAssignorName, consumerConfig.getString(ConsumerConfig.GROUP_REMOTE_ASSIGNOR_CONFIG));
}
@Test
public void testDefaultMetadataRecoveryStrategy() {
Map<String, Object> configs = new HashMap<>();
configs.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, keyDeserializerClass);
configs.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, valueDeserializerClass);
final ConsumerConfig consumerConfig = new ConsumerConfig(configs);
assertEquals(MetadataRecoveryStrategy.NONE.name, consumerConfig.getString(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG));
}
@Test
public void testInvalidMetadataRecoveryStrategy() {
Map<String, Object> configs = new HashMap<>();
configs.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, keyDeserializerClass);
configs.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, valueDeserializerClass);
configs.put(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG, "abc");
ConfigException ce = assertThrows(ConfigException.class, () -> new ConsumerConfig(configs));
assertTrue(ce.getMessage().contains(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG));
}
@ParameterizedTest
@CsvSource({"consumer, true", "classic, true", "Consumer, true", "Classic, true", "invalid, false"})
public void testProtocolConfigValidation(String protocol, boolean isValid) {

View File

@ -435,7 +435,7 @@ public class KafkaConsumerTest {
}
@ParameterizedTest
@EnumSource(GroupProtocol.class)
@EnumSource(value = GroupProtocol.class, names = "CLASSIC")
public void testSubscription(GroupProtocol groupProtocol) {
consumer = newConsumer(groupProtocol, groupId);
@ -495,7 +495,7 @@ public class KafkaConsumerTest {
}
@ParameterizedTest
@EnumSource(GroupProtocol.class)
@EnumSource(value = GroupProtocol.class, names = "CLASSIC")
public void testSubscriptionWithEmptyPartitionAssignment(GroupProtocol groupProtocol) {
Properties props = new Properties();
props.setProperty(ConsumerConfig.GROUP_PROTOCOL_CONFIG, groupProtocol.name());
@ -3227,7 +3227,7 @@ public void testClosingConsumerUnregistersConsumerMetrics(GroupProtocol groupPro
}
@ParameterizedTest
@EnumSource(GroupProtocol.class)
@EnumSource(value = GroupProtocol.class, names = "CLASSIC")
public void testAssignorNameConflict(GroupProtocol groupProtocol) {
Map<String, Object> configs = new HashMap<>();
configs.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, groupProtocol.name());

View File

@ -19,7 +19,6 @@ package org.apache.kafka.clients.consumer.internals;
import org.apache.kafka.clients.Metadata.LeaderAndEpoch;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerGroupMetadata;
import org.apache.kafka.clients.consumer.ConsumerPartitionAssignor;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
@ -30,7 +29,6 @@ import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.clients.consumer.OffsetCommitCallback;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;
import org.apache.kafka.clients.consumer.RetriableCommitFailedException;
import org.apache.kafka.clients.consumer.RoundRobinAssignor;
import org.apache.kafka.clients.consumer.internals.events.ApplicationEvent;
import org.apache.kafka.clients.consumer.internals.events.ApplicationEventHandler;
import org.apache.kafka.clients.consumer.internals.events.AssignmentChangeEvent;
@ -44,7 +42,6 @@ import org.apache.kafka.clients.consumer.internals.events.ConsumerRebalanceListe
import org.apache.kafka.clients.consumer.internals.events.ErrorEvent;
import org.apache.kafka.clients.consumer.internals.events.EventProcessor;
import org.apache.kafka.clients.consumer.internals.events.FetchCommittedOffsetsEvent;
import org.apache.kafka.clients.consumer.internals.events.LeaveOnCloseEvent;
import org.apache.kafka.clients.consumer.internals.events.ListOffsetsEvent;
import org.apache.kafka.clients.consumer.internals.events.NewTopicsMetadataUpdateRequestEvent;
import org.apache.kafka.clients.consumer.internals.events.PollEvent;
@ -56,6 +53,7 @@ import org.apache.kafka.clients.consumer.internals.events.ValidatePositionsEvent
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.FencedInstanceIdException;
import org.apache.kafka.common.errors.GroupAuthorizationException;
@ -105,10 +103,10 @@ import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import static java.util.Arrays.asList;
import static java.util.Collections.emptySet;
import static java.util.Collections.singleton;
import static java.util.Collections.singletonList;
import static org.apache.kafka.clients.consumer.internals.ConsumerRebalanceListenerMethodName.ON_PARTITIONS_ASSIGNED;
@ -131,7 +129,6 @@ import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.junit.jupiter.api.Assertions.fail;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.ArgumentMatchers.eq;
import static org.mockito.Mockito.atLeast;
import static org.mockito.Mockito.doAnswer;
import static org.mockito.Mockito.doReturn;
@ -139,8 +136,10 @@ import static org.mockito.Mockito.doThrow;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.mockStatic;
import static org.mockito.Mockito.never;
import static org.mockito.Mockito.spy;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;
import static org.mockito.Mockito.clearInvocations;
@SuppressWarnings("unchecked")
public class AsyncKafkaConsumerTest {
@ -157,9 +156,15 @@ public class AsyncKafkaConsumerTest {
public void resetAll() {
backgroundEventQueue.clear();
if (consumer != null) {
consumer.close(Duration.ZERO);
try {
consumer.close(Duration.ZERO);
} catch (Exception e) {
// best effort to clean up after each test, but may throw (e.g. if callbacks were
// throwing errors)
}
}
consumer = null;
Mockito.framework().clearInlineMocks();
MockConsumerInterceptor.resetCounters();
}
@ -205,7 +210,6 @@ public class AsyncKafkaConsumerTest {
ConsumerInterceptors<String, String> interceptors,
ConsumerRebalanceListenerInvoker rebalanceListenerInvoker,
SubscriptionState subscriptions,
List<ConsumerPartitionAssignor> assignors,
String groupId,
String clientId) {
long retryBackoffMs = 100L;
@ -228,7 +232,6 @@ public class AsyncKafkaConsumerTest {
metadata,
retryBackoffMs,
defaultApiTimeoutMs,
assignors,
groupId,
autoCommitEnabled);
}
@ -236,6 +239,7 @@ public class AsyncKafkaConsumerTest {
@Test
public void testSuccessfulStartupShutdown() {
consumer = newConsumer();
completeUnsubscribeApplicationEventSuccessfully();
assertDoesNotThrow(() -> consumer.close());
}
@ -248,6 +252,7 @@ public class AsyncKafkaConsumerTest {
@Test
public void testFailOnClosedConsumer() {
consumer = newConsumer();
completeUnsubscribeApplicationEventSuccessfully();
consumer.close();
final IllegalStateException res = assertThrows(IllegalStateException.class, consumer::assignment);
assertEquals("This consumer has already been closed.", res.getMessage());
@ -480,6 +485,44 @@ public class AsyncKafkaConsumerTest {
assertTrue(callbackExecuted.get());
}
@Test
public void testSubscriptionRegexEvalOnPollOnlyIfMetadataChanges() {
SubscriptionState subscriptions = mock(SubscriptionState.class);
Cluster cluster = mock(Cluster.class);
consumer = newConsumer(
mock(FetchBuffer.class),
mock(ConsumerInterceptors.class),
mock(ConsumerRebalanceListenerInvoker.class),
subscriptions,
"group-id",
"client-id");
final String topicName = "foo";
final int partition = 3;
final TopicPartition tp = new TopicPartition(topicName, partition);
doReturn(Fetch.empty()).when(fetchCollector).collectFetch(any(FetchBuffer.class));
Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(tp, new OffsetAndMetadata(1));
completeFetchedCommittedOffsetApplicationEventSuccessfully(offsets);
doReturn(LeaderAndEpoch.noLeaderOrEpoch()).when(metadata).currentLeader(any());
doReturn(cluster).when(metadata).fetch();
doReturn(Collections.singleton(topicName)).when(cluster).topics();
consumer.subscribe(Pattern.compile("f*"));
verify(metadata).requestUpdateForNewTopics();
verify(subscriptions).matchesSubscribedPattern(topicName);
clearInvocations(subscriptions);
when(subscriptions.hasPatternSubscription()).thenReturn(true);
consumer.poll(Duration.ZERO);
verify(subscriptions, never()).matchesSubscribedPattern(topicName);
when(metadata.updateVersion()).thenReturn(2);
when(subscriptions.hasPatternSubscription()).thenReturn(true);
consumer.poll(Duration.ZERO);
verify(subscriptions).matchesSubscribedPattern(topicName);
}
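A compact, hypothetical sketch of the optimization this test pins down: the subscribed pattern is only re-evaluated against the cluster's topics when the metadata update version has changed since the last evaluation. The class and method names below are illustrative stand-ins, not the actual AsyncKafkaConsumer code.
// Illustrative only: tracks the last metadata version at which the regex was evaluated.
final class PatternEvaluationGate {
    private int lastSeenMetadataVersion = -1;

    // Returns true (and records the version) only when a pattern subscription exists
    // and the metadata version has moved since the previous evaluation.
    boolean shouldReevaluate(boolean hasPatternSubscription, int metadataUpdateVersion) {
        if (!hasPatternSubscription || metadataUpdateVersion == lastSeenMetadataVersion) {
            return false;
        }
        lastSeenMetadataVersion = metadataUpdateVersion;
        return true;
    }
}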
@Test
public void testClearWakeupTriggerAfterPoll() {
consumer = newConsumer();
@ -537,6 +580,151 @@ public class AsyncKafkaConsumerTest {
"This method is deprecated and will be removed in the next major release.", e.getMessage());
}
@Test
public void testOffsetFetchStoresPendingEvent() {
consumer = newConsumer();
long timeoutMs = 0;
doReturn(Fetch.empty()).when(fetchCollector).collectFetch(any(FetchBuffer.class));
consumer.assign(Collections.singleton(new TopicPartition("topic1", 0)));
// The first attempt at poll() creates an event, enqueues it, but its Future does not complete within the
// timeout, leaving a pending fetch.
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler).add(any(FetchCommittedOffsetsEvent.class));
CompletableApplicationEvent<Map<TopicPartition, OffsetAndMetadata>> event = getLastEnqueuedEvent();
assertThrows(TimeoutException.class, () -> ConsumerUtils.getResult(event.future(), time.timer(timeoutMs)));
assertTrue(consumer.hasPendingOffsetFetchEvent());
clearInvocations(applicationEventHandler);
// For the second attempt, the event is reused, so first verify that another FetchCommittedOffsetsEvent
// was not enqueued. On this attempt the Future returns successfully, clearing the pending fetch.
event.future().complete(Collections.emptyMap());
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler, never()).add(any(FetchCommittedOffsetsEvent.class));
assertDoesNotThrow(() -> ConsumerUtils.getResult(event.future(), time.timer(timeoutMs)));
assertFalse(consumer.hasPendingOffsetFetchEvent());
}
@Test
public void testOffsetFetchDoesNotReuseMismatchedPendingEvent() {
consumer = newConsumer();
long timeoutMs = 0;
doReturn(Fetch.empty()).when(fetchCollector).collectFetch(any(FetchBuffer.class));
// The first attempt at poll() retrieves data for partition 0 of the topic. poll() creates an event,
// enqueues it, but its Future does not complete within the timeout, leaving a pending fetch.
consumer.assign(Collections.singleton(new TopicPartition("topic1", 0)));
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler).add(any(FetchCommittedOffsetsEvent.class));
CompletableApplicationEvent<Map<TopicPartition, OffsetAndMetadata>> event1 = getLastEnqueuedEvent();
assertThrows(TimeoutException.class, () -> ConsumerUtils.getResult(event1.future(), time.timer(timeoutMs)));
assertTrue(consumer.hasPendingOffsetFetchEvent());
clearInvocations(applicationEventHandler);
// For the second attempt, the set of partitions is reassigned, causing the pending offset fetch to be replaced.
// Verify that another FetchCommittedOffsetsEvent is enqueued.
consumer.assign(Collections.singleton(new TopicPartition("topic1", 1)));
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler).add(any(FetchCommittedOffsetsEvent.class));
CompletableApplicationEvent<Map<TopicPartition, OffsetAndMetadata>> event2 = getLastEnqueuedEvent();
assertNotEquals(event1, event2);
assertThrows(TimeoutException.class, () -> ConsumerUtils.getResult(event2.future(), time.timer(timeoutMs)));
assertTrue(consumer.hasPendingOffsetFetchEvent());
clearInvocations(applicationEventHandler);
// For the third attempt, the event from attempt 2 is reused, so there should not have been another
// FetchCommittedOffsetsEvent enqueued. The Future is completed to make it return successfully in poll().
// This will finally clear out the pending fetch.
event2.future().complete(Collections.emptyMap());
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler, never()).add(any(FetchCommittedOffsetsEvent.class));
assertDoesNotThrow(() -> ConsumerUtils.getResult(event2.future(), time.timer(timeoutMs)));
assertFalse(consumer.hasPendingOffsetFetchEvent());
}
@Test
public void testOffsetFetchDoesNotReuseExpiredPendingEvent() {
consumer = newConsumer();
long timeoutMs = 0;
doReturn(Fetch.empty()).when(fetchCollector).collectFetch(any(FetchBuffer.class));
consumer.assign(Collections.singleton(new TopicPartition("topic1", 0)));
// The first attempt at poll() creates an event, enqueues it, but its Future does not complete within
// the timeout, leaving a pending fetch.
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler).add(any(FetchCommittedOffsetsEvent.class));
CompletableApplicationEvent<Map<TopicPartition, OffsetAndMetadata>> event1 = getLastEnqueuedEvent();
assertThrows(TimeoutException.class, () -> ConsumerUtils.getResult(event1.future(), time.timer(timeoutMs)));
assertTrue(consumer.hasPendingOffsetFetchEvent());
clearInvocations(applicationEventHandler);
// Sleep past the event's expiration, causing the poll() to *not* reuse the pending fetch. A new event
// is created and added to the application event queue.
time.sleep(event1.deadlineMs() - time.milliseconds());
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler).add(any(FetchCommittedOffsetsEvent.class));
CompletableApplicationEvent<Map<TopicPartition, OffsetAndMetadata>> event2 = getLastEnqueuedEvent();
assertNotEquals(event1, event2);
assertThrows(TimeoutException.class, () -> ConsumerUtils.getResult(event2.future(), time.timer(timeoutMs)));
assertTrue(consumer.hasPendingOffsetFetchEvent());
}
@Test
public void testOffsetFetchTimeoutExceptionKeepsPendingEvent() {
consumer = newConsumer();
long timeoutMs = 0;
doReturn(Fetch.empty()).when(fetchCollector).collectFetch(any(FetchBuffer.class));
consumer.assign(Collections.singleton(new TopicPartition("topic1", 0)));
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler).add(any(FetchCommittedOffsetsEvent.class));
CompletableApplicationEvent<Map<TopicPartition, OffsetAndMetadata>> event = getLastEnqueuedEvent();
assertTrue(consumer.hasPendingOffsetFetchEvent());
event.future().completeExceptionally(new TimeoutException("Test error"));
assertDoesNotThrow(() -> consumer.poll(Duration.ofMillis(timeoutMs)));
assertTrue(consumer.hasPendingOffsetFetchEvent());
}
@Test
public void testOffsetFetchInterruptExceptionKeepsPendingEvent() {
consumer = newConsumer();
long timeoutMs = 0;
doReturn(Fetch.empty()).when(fetchCollector).collectFetch(any(FetchBuffer.class));
consumer.assign(Collections.singleton(new TopicPartition("topic1", 0)));
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler).add(any(FetchCommittedOffsetsEvent.class));
CompletableApplicationEvent<Map<TopicPartition, OffsetAndMetadata>> event = getLastEnqueuedEvent();
assertTrue(consumer.hasPendingOffsetFetchEvent());
event.future().completeExceptionally(new InterruptException("Test error"));
assertThrows(InterruptException.class, () -> consumer.poll(Duration.ofMillis(timeoutMs)));
assertTrue(Thread.interrupted());
assertTrue(consumer.hasPendingOffsetFetchEvent());
}
@Test
public void testOffsetFetchUnexpectedExceptionClearsPendingEvent() {
consumer = newConsumer();
long timeoutMs = 0;
doReturn(Fetch.empty()).when(fetchCollector).collectFetch(any(FetchBuffer.class));
consumer.assign(Collections.singleton(new TopicPartition("topic1", 0)));
consumer.poll(Duration.ofMillis(timeoutMs));
verify(applicationEventHandler).add(any(FetchCommittedOffsetsEvent.class));
CompletableApplicationEvent<Map<TopicPartition, OffsetAndMetadata>> event = getLastEnqueuedEvent();
assertTrue(consumer.hasPendingOffsetFetchEvent());
event.future().completeExceptionally(new NullPointerException("Test error"));
assertThrows(KafkaException.class, () -> consumer.poll(Duration.ofMillis(timeoutMs)));
assertFalse(consumer.hasPendingOffsetFetchEvent());
}
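The pending-offset-fetch tests above exercise one caching policy: a committed-offset fetch issued by poll() is kept as a pending event and reused on later polls unless the requested partitions change, the event expires, or it fails with an unexpected (non-timeout, non-interrupt) error. A self-contained, hypothetical sketch of that policy follows; PendingOffsetFetch and the field names are illustrative stand-ins, not the AsyncKafkaConsumer implementation.
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Illustrative stand-in for the cached application event.
final class PendingOffsetFetch {
    final Set<String> partitions;
    final long deadlineMs;
    final CompletableFuture<Map<String, Long>> future = new CompletableFuture<>();

    PendingOffsetFetch(Set<String> partitions, long deadlineMs) {
        this.partitions = partitions;
        this.deadlineMs = deadlineMs;
    }
}

final class PendingOffsetFetchCache {
    private PendingOffsetFetch pending;

    boolean hasPendingOffsetFetch() {
        return pending != null;
    }

    // Reuse the cached fetch only when it targets the same partitions and has not
    // expired; otherwise enqueue (and cache) a fresh one.
    PendingOffsetFetch fetchOrReuse(Set<String> partitions, long nowMs, long deadlineMs) {
        if (pending != null && pending.partitions.equals(partitions) && nowMs < pending.deadlineMs) {
            return pending;
        }
        pending = new PendingOffsetFetch(partitions, deadlineMs);
        return pending;
    }

    // Timeout and interrupt failures keep the pending fetch for the next poll();
    // unexpected failures clear it, as the tests above assert.
    void onUnexpectedFailure() {
        pending = null;
    }
}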
@Test
public void testCommitSyncLeaderEpochUpdate() {
consumer = newConsumer();
@ -564,7 +752,6 @@ public class AsyncKafkaConsumerTest {
new ConsumerInterceptors<>(Collections.emptyList()),
mock(ConsumerRebalanceListenerInvoker.class),
subscriptions,
singletonList(new RoundRobinAssignor()),
"group-id",
"client-id");
completeCommitSyncApplicationEventSuccessfully();
@ -705,17 +892,17 @@ public class AsyncKafkaConsumerTest {
consumer.seek(tp, 20);
consumer.commitAsync();
return getLastEnqueuedEventFuture();
CompletableApplicationEvent<Void> event = getLastEnqueuedEvent();
return event.future();
}
// ArgumentCaptor's type-matching does not work reliably with Java 8, so we cannot directly capture the AsyncCommitEvent
// Instead, we capture the super-class CompletableApplicationEvent and fetch the last captured event.
private CompletableFuture<Void> getLastEnqueuedEventFuture() {
final ArgumentCaptor<CompletableApplicationEvent<Void>> eventArgumentCaptor = ArgumentCaptor.forClass(CompletableApplicationEvent.class);
private <T> CompletableApplicationEvent<T> getLastEnqueuedEvent() {
final ArgumentCaptor<CompletableApplicationEvent<T>> eventArgumentCaptor = ArgumentCaptor.forClass(CompletableApplicationEvent.class);
verify(applicationEventHandler, atLeast(1)).add(eventArgumentCaptor.capture());
final List<CompletableApplicationEvent<Void>> allValues = eventArgumentCaptor.getAllValues();
final CompletableApplicationEvent<Void> lastEvent = allValues.get(allValues.size() - 1);
return lastEvent.future();
final List<CompletableApplicationEvent<T>> allValues = eventArgumentCaptor.getAllValues();
return allValues.get(allValues.size() - 1);
}
@Test
@ -758,6 +945,7 @@ public class AsyncKafkaConsumerTest {
@Test
public void testEnsureShutdownExecutedCommitAsyncCallbacks() {
consumer = newConsumer();
completeUnsubscribeApplicationEventSuccessfully();
MockCommitCallback callback = new MockCommitCallback();
completeCommitAsyncApplicationEventSuccessfully();
assertDoesNotThrow(() -> consumer.commitAsync(new HashMap<>(), callback));
@ -769,70 +957,45 @@ public class AsyncKafkaConsumerTest {
@Test
public void testVerifyApplicationEventOnShutdown() {
consumer = newConsumer();
completeUnsubscribeApplicationEventSuccessfully();
doReturn(null).when(applicationEventHandler).addAndGet(any());
consumer.close();
verify(applicationEventHandler).addAndGet(any(LeaveOnCloseEvent.class));
verify(applicationEventHandler).add(any(UnsubscribeEvent.class));
verify(applicationEventHandler).add(any(CommitOnCloseEvent.class));
}
@Test
public void testPartitionRevocationOnClose() {
MockRebalanceListener listener = new MockRebalanceListener();
SubscriptionState subscriptions = new SubscriptionState(new LogContext(), OffsetResetStrategy.NONE);
consumer = newConsumer(
public void testUnsubscribeOnClose() {
SubscriptionState subscriptions = mock(SubscriptionState.class);
consumer = spy(newConsumer(
mock(FetchBuffer.class),
mock(ConsumerInterceptors.class),
mock(ConsumerRebalanceListenerInvoker.class),
subscriptions,
singletonList(new RoundRobinAssignor()),
"group-id",
"client-id");
consumer.subscribe(singleton("topic"), listener);
subscriptions.assignFromSubscribed(singleton(new TopicPartition("topic", 0)));
"client-id"));
completeUnsubscribeApplicationEventSuccessfully();
consumer.close(Duration.ZERO);
assertTrue(subscriptions.assignedPartitions().isEmpty());
assertEquals(1, listener.revokedCount);
verifyUnsubscribeEvent(subscriptions);
}
@Test
public void testFailedPartitionRevocationOnClose() {
// If rebalance listener failed to execute during close, we will skip sending leave group and proceed with
// closing the consumer.
ConsumerRebalanceListener listener = mock(ConsumerRebalanceListener.class);
SubscriptionState subscriptions = new SubscriptionState(new LogContext(), OffsetResetStrategy.NONE);
consumer = newConsumer(
// If rebalance listener failed to execute during close, we still send the leave group,
// and proceed with closing the consumer.
SubscriptionState subscriptions = mock(SubscriptionState.class);
consumer = spy(newConsumer(
mock(FetchBuffer.class),
new ConsumerInterceptors<>(Collections.emptyList()),
mock(ConsumerRebalanceListenerInvoker.class),
subscriptions,
singletonList(new RoundRobinAssignor()),
"group-id",
"client-id");
subscriptions.subscribe(singleton("topic"), Optional.of(listener));
TopicPartition tp = new TopicPartition("topic", 0);
subscriptions.assignFromSubscribed(singleton(tp));
doThrow(new KafkaException()).when(listener).onPartitionsRevoked(eq(singleton(tp)));
"client-id"));
doThrow(new KafkaException()).when(consumer).processBackgroundEvents(any(), any());
assertThrows(KafkaException.class, () -> consumer.close(Duration.ZERO));
verify(applicationEventHandler, never()).addAndGet(any(LeaveOnCloseEvent.class));
verify(listener).onPartitionsRevoked(eq(singleton(tp)));
assertEquals(emptySet(), subscriptions.assignedPartitions());
}
@Test
public void testCompleteQuietly() {
AtomicReference<Throwable> exception = new AtomicReference<>();
CompletableFuture<Object> future = CompletableFuture.completedFuture(null);
consumer = newConsumer();
assertDoesNotThrow(() -> consumer.completeQuietly(() -> {
future.get(0, TimeUnit.MILLISECONDS);
}, "test", exception));
assertNull(exception.get());
assertDoesNotThrow(() -> consumer.completeQuietly(() -> {
throw new KafkaException("Test exception");
}, "test", exception));
assertInstanceOf(KafkaException.class, exception.get());
verifyUnsubscribeEvent(subscriptions);
// Close operation should carry on even if the unsubscribe fails
verify(applicationEventHandler).close(any(Duration.class));
}
@Test
@ -844,13 +1007,12 @@ public class AsyncKafkaConsumerTest {
mock(ConsumerInterceptors.class),
mock(ConsumerRebalanceListenerInvoker.class),
subscriptions,
singletonList(new RoundRobinAssignor()),
"group-id",
"client-id");
consumer.subscribe(singleton("topic"), mock(ConsumerRebalanceListener.class));
subscriptions.assignFromSubscribed(singleton(new TopicPartition("topic", 0)));
subscriptions.seek(new TopicPartition("topic", 0), 100);
consumer.autoCommitSync(time.timer(100));
consumer.commitSyncAllConsumed(time.timer(100));
verify(applicationEventHandler).add(any(SyncCommitEvent.class));
}
@ -862,7 +1024,6 @@ public class AsyncKafkaConsumerTest {
mock(ConsumerInterceptors.class),
mock(ConsumerRebalanceListenerInvoker.class),
subscriptions,
singletonList(new RoundRobinAssignor()),
"group-id",
"client-id");
consumer.subscribe(singleton("topic"), mock(ConsumerRebalanceListener.class));
@ -1135,7 +1296,7 @@ public class AsyncKafkaConsumerTest {
}
return null;
}).when(applicationEventHandler).add(any());
completeUnsubscribeApplicationEventSuccessfully();
consumer.close(Duration.ZERO);
// A commit was triggered and not completed exceptionally by the wakeup
@ -1173,6 +1334,7 @@ public class AsyncKafkaConsumerTest {
completeCommitAsyncApplicationEventSuccessfully();
consumer.commitAsync(cb);
completeUnsubscribeApplicationEventSuccessfully();
assertDoesNotThrow(() -> consumer.close(Duration.ofMillis(10)));
assertEquals(1, cb.invoked);
}
@ -1187,6 +1349,7 @@ public class AsyncKafkaConsumerTest {
consumer = newConsumer(props);
assertEquals(1, MockConsumerInterceptor.INIT_COUNT.get());
completeCommitSyncApplicationEventSuccessfully();
completeUnsubscribeApplicationEventSuccessfully();
consumer.close(Duration.ZERO);
@ -1624,6 +1787,18 @@ public class AsyncKafkaConsumerTest {
assertFalse(config.unused().contains(ConsumerConfig.GROUP_REMOTE_ASSIGNOR_CONFIG));
}
@Test
public void testPartitionAssignmentStrategyUnusedInAsyncConsumer() {
final Properties props = requiredConsumerConfig();
props.put(ConsumerConfig.GROUP_ID_CONFIG, "consumerGroup1");
props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, GroupProtocol.CONSUMER.name().toLowerCase(Locale.ROOT));
props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, "CooperativeStickyAssignor");
final ConsumerConfig config = new ConsumerConfig(props);
consumer = newConsumer(config);
assertTrue(config.unused().contains(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG));
}
@Test
public void testGroupIdNull() {
final Properties props = requiredConsumerConfig();
@ -1666,7 +1841,6 @@ public class AsyncKafkaConsumerTest {
new ConsumerInterceptors<>(Collections.emptyList()),
mock(ConsumerRebalanceListenerInvoker.class),
subscriptions,
singletonList(new RoundRobinAssignor()),
"group-id",
"client-id");
final TopicPartition tp = new TopicPartition("topic", 0);
@ -1713,13 +1887,13 @@ public class AsyncKafkaConsumerTest {
if (committedOffsetsEnabled) {
// Verify there was a FetchCommittedOffsets event and no ResetPositions event
verify(applicationEventHandler, atLeast(1))
.addAndGet(ArgumentMatchers.isA(FetchCommittedOffsetsEvent.class));
.add(ArgumentMatchers.isA(FetchCommittedOffsetsEvent.class));
verify(applicationEventHandler, never())
.addAndGet(ArgumentMatchers.isA(ResetPositionsEvent.class));
} else {
// Verify there was no FetchCommittedOffsets event, but there should be a ResetPositions event
verify(applicationEventHandler, never())
.addAndGet(ArgumentMatchers.isA(FetchCommittedOffsetsEvent.class));
.add(ArgumentMatchers.isA(FetchCommittedOffsetsEvent.class));
verify(applicationEventHandler, atLeast(1))
.addAndGet(ArgumentMatchers.isA(ResetPositionsEvent.class));
}
@ -1738,7 +1912,7 @@ public class AsyncKafkaConsumerTest {
verify(applicationEventHandler, atLeast(1))
.addAndGet(ArgumentMatchers.isA(ValidatePositionsEvent.class));
verify(applicationEventHandler, atLeast(1))
.addAndGet(ArgumentMatchers.isA(FetchCommittedOffsetsEvent.class));
.add(ArgumentMatchers.isA(FetchCommittedOffsetsEvent.class));
verify(applicationEventHandler, atLeast(1))
.addAndGet(ArgumentMatchers.isA(ResetPositionsEvent.class));
}
@ -1880,6 +2054,7 @@ public class AsyncKafkaConsumerTest {
@Test
void testReaperInvokedInClose() {
consumer = newConsumer();
completeUnsubscribeApplicationEventSuccessfully();
consumer.close();
verify(backgroundEventReaper).reap(backgroundEventQueue);
}
@ -1901,6 +2076,18 @@ public class AsyncKafkaConsumerTest {
verify(backgroundEventReaper).reap(time.milliseconds());
}
private void verifyUnsubscribeEvent(SubscriptionState subscriptions) {
// Check that an unsubscribe event was generated, and that the consumer waited for it to
// complete processing background events.
verify(applicationEventHandler).add(any(UnsubscribeEvent.class));
verify(consumer).processBackgroundEvents(any(), any());
// The consumer should not clear the assignment in the app thread. The unsubscribe
// event is the one responsible for updating the assignment in the background when it
// completes.
verify(subscriptions, never()).assignFromSubscribed(any());
}
private Map<TopicPartition, OffsetAndMetadata> mockTopicPartitionOffset() {
final TopicPartition t0 = new TopicPartition("t0", 2);
final TopicPartition t1 = new TopicPartition("t0", 3);
@ -1964,6 +2151,12 @@ public class AsyncKafkaConsumerTest {
doReturn(committedOffsets)
.when(applicationEventHandler)
.addAndGet(any(FetchCommittedOffsetsEvent.class));
doAnswer(invocation -> {
FetchCommittedOffsetsEvent event = invocation.getArgument(0);
event.future().complete(committedOffsets);
return null;
}).when(applicationEventHandler).add(ArgumentMatchers.isA(FetchCommittedOffsetsEvent.class));
}
private void completeFetchedCommittedOffsetApplicationEventExceptionally(Exception ex) {

View File

@ -193,11 +193,11 @@ public class CommitRequestManagerTest {
offsets2.put(new TopicPartition("test", 4), new OffsetAndMetadata(20L));
// Add the requests to the CommitRequestManager and store their futures
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
commitManager.commitSync(offsets1, expirationTimeMs);
commitManager.fetchOffsets(Collections.singleton(new TopicPartition("test", 0)), expirationTimeMs);
commitManager.commitSync(offsets2, expirationTimeMs);
commitManager.fetchOffsets(Collections.singleton(new TopicPartition("test", 1)), expirationTimeMs);
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
commitManager.commitSync(offsets1, deadlineMs);
commitManager.fetchOffsets(Collections.singleton(new TopicPartition("test", 0)), deadlineMs);
commitManager.commitSync(offsets2, deadlineMs);
commitManager.fetchOffsets(Collections.singleton(new TopicPartition("test", 1)), deadlineMs);
// Poll the CommitRequestManager and verify that the inflightOffsetFetches size is correct
NetworkClientDelegate.PollResult result = commitManager.poll(time.milliseconds());
@ -287,8 +287,8 @@ public class CommitRequestManagerTest {
Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
new TopicPartition("topic", 1),
new OffsetAndMetadata(0));
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, expirationTimeMs);
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, deadlineMs);
sendAndVerifyOffsetCommitRequestFailedAndMaybeRetried(commitRequestManager, error, commitResult);
// We expect that request should have been retried on this sync commit.
@ -307,8 +307,8 @@ public class CommitRequestManagerTest {
new OffsetAndMetadata(0));
// Send sync offset commit that fails and verify it propagates the expected exception.
long expirationTimeMs = time.milliseconds() + retryBackoffMs;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, expirationTimeMs);
long deadlineMs = time.milliseconds() + retryBackoffMs;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, deadlineMs);
completeOffsetCommitRequestWithError(commitRequestManager, commitError);
assertFutureThrows(commitResult, expectedException);
}
@ -332,8 +332,8 @@ public class CommitRequestManagerTest {
Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
new TopicPartition("topic", 1),
new OffsetAndMetadata(0));
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, expirationTimeMs);
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, deadlineMs);
completeOffsetCommitRequestWithError(commitRequestManager, Errors.UNKNOWN_MEMBER_ID);
NetworkClientDelegate.PollResult res = commitRequestManager.poll(time.milliseconds());
@ -594,7 +594,7 @@ public class CommitRequestManagerTest {
@ParameterizedTest
@MethodSource("offsetFetchExceptionSupplier")
public void testOffsetFetchRequestErroredRequests(final Errors error, final boolean isRetriable) {
public void testOffsetFetchRequestErroredRequests(final Errors error) {
CommitRequestManager commitRequestManager = create(true, 100);
when(coordinatorRequestManager.coordinator()).thenReturn(Optional.of(mockedNode));
@ -606,7 +606,7 @@ public class CommitRequestManagerTest {
1,
error);
// We only want to purge the outbound buffer for non-retriable errors; retriable ones will be re-queued.
if (isRetriable)
if (isRetriableOnOffsetFetch(error))
testRetriable(commitRequestManager, futures);
else {
testNonRetriable(futures);
@ -614,15 +614,49 @@ public class CommitRequestManagerTest {
}
}
@ParameterizedTest
@MethodSource("offsetFetchExceptionSupplier")
public void testOffsetFetchRequestTimeoutRequests(final Errors error) {
CommitRequestManager commitRequestManager = create(true, 100);
when(coordinatorRequestManager.coordinator()).thenReturn(Optional.of(mockedNode));
Set<TopicPartition> partitions = new HashSet<>();
partitions.add(new TopicPartition("t1", 0));
List<CompletableFuture<Map<TopicPartition, OffsetAndMetadata>>> futures = sendAndVerifyDuplicatedOffsetFetchRequests(
commitRequestManager,
partitions,
1,
error);
if (isRetriableOnOffsetFetch(error)) {
futures.forEach(f -> assertFalse(f.isDone()));
// Insert a long enough sleep to force a timeout of the operation. Invoke poll() again so that each
// OffsetFetchRequestState is evaluated via isExpired().
time.sleep(defaultApiTimeoutMs);
assertFalse(commitRequestManager.pendingRequests.unsentOffsetFetches.isEmpty());
commitRequestManager.poll(time.milliseconds());
futures.forEach(f -> assertFutureThrows(f, TimeoutException.class));
assertTrue(commitRequestManager.pendingRequests.unsentOffsetFetches.isEmpty());
} else {
futures.forEach(f -> assertFutureThrows(f, KafkaException.class));
assertEmptyPendingRequests(commitRequestManager);
}
}
private boolean isRetriableOnOffsetFetch(Errors error) {
return error == Errors.NOT_COORDINATOR || error == Errors.COORDINATOR_LOAD_IN_PROGRESS || error == Errors.COORDINATOR_NOT_AVAILABLE;
}
@Test
public void testSuccessfulOffsetFetch() {
CommitRequestManager commitManager = create(false, 100);
when(coordinatorRequestManager.coordinator()).thenReturn(Optional.of(mockedNode));
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> fetchResult =
commitManager.fetchOffsets(Collections.singleton(new TopicPartition("test", 0)),
expirationTimeMs);
deadlineMs);
// Send fetch request
NetworkClientDelegate.PollResult result = commitManager.poll(time.milliseconds());
@ -667,8 +701,8 @@ public class CommitRequestManagerTest {
Set<TopicPartition> partitions = new HashSet<>();
partitions.add(new TopicPartition("t1", 0));
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> result = commitRequestManager.fetchOffsets(partitions, expirationTimeMs);
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> result = commitRequestManager.fetchOffsets(partitions, deadlineMs);
completeOffsetFetchRequestWithError(commitRequestManager, partitions, error);
@ -694,8 +728,8 @@ public class CommitRequestManagerTest {
Set<TopicPartition> partitions = new HashSet<>();
partitions.add(new TopicPartition("t1", 0));
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> result = commitRequestManager.fetchOffsets(partitions, expirationTimeMs);
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> result = commitRequestManager.fetchOffsets(partitions, deadlineMs);
NetworkClientDelegate.PollResult res = commitRequestManager.poll(time.milliseconds());
assertEquals(1, res.unsentRequests.size());
@ -748,8 +782,8 @@ public class CommitRequestManagerTest {
new OffsetAndMetadata(0));
// Send sync offset commit request that fails with retriable error.
long expirationTimeMs = time.milliseconds() + retryBackoffMs * 2;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, expirationTimeMs);
long deadlineMs = time.milliseconds() + retryBackoffMs * 2;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, deadlineMs);
completeOffsetCommitRequestWithError(commitRequestManager, Errors.REQUEST_TIMED_OUT);
// Request retried after backoff, and fails with retriable again. Should not complete yet
@ -770,8 +804,9 @@ public class CommitRequestManagerTest {
* Sync commit requests that fail with an expected retriable error should be retried
* while there is time. When time expires, they should fail with a TimeoutException.
*/
@Test
public void testOffsetCommitSyncFailedWithRetriableThrowsTimeoutWhenRetryTimeExpires() {
@ParameterizedTest
@MethodSource("offsetCommitExceptionSupplier")
public void testOffsetCommitSyncFailedWithRetriableThrowsTimeoutWhenRetryTimeExpires(final Errors error) {
CommitRequestManager commitRequestManager = create(false, 100);
when(coordinatorRequestManager.coordinator()).thenReturn(Optional.of(mockedNode));
@ -780,17 +815,21 @@ public class CommitRequestManagerTest {
new OffsetAndMetadata(0));
// Send offset commit request that fails with retriable error.
long expirationTimeMs = time.milliseconds() + retryBackoffMs * 2;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, expirationTimeMs);
completeOffsetCommitRequestWithError(commitRequestManager, Errors.COORDINATOR_NOT_AVAILABLE);
long deadlineMs = time.milliseconds() + retryBackoffMs * 2;
CompletableFuture<Void> commitResult = commitRequestManager.commitSync(offsets, deadlineMs);
completeOffsetCommitRequestWithError(commitRequestManager, error);
// Sleep to expire the request timeout. Request should fail on the next poll with a
// TimeoutException.
time.sleep(expirationTimeMs);
time.sleep(deadlineMs);
NetworkClientDelegate.PollResult res = commitRequestManager.poll(time.milliseconds());
assertEquals(0, res.unsentRequests.size());
assertTrue(commitResult.isDone());
assertFutureThrows(commitResult, TimeoutException.class);
if (error.exception() instanceof RetriableException)
assertFutureThrows(commitResult, TimeoutException.class);
else
assertFutureThrows(commitResult, KafkaException.class);
}
/**
@ -829,8 +868,8 @@ public class CommitRequestManagerTest {
Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(new TopicPartition("topic", 1),
new OffsetAndMetadata(0));
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
commitRequestManager.commitSync(offsets, expirationTimeMs);
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
commitRequestManager.commitSync(offsets, deadlineMs);
NetworkClientDelegate.PollResult res = commitRequestManager.poll(time.milliseconds());
assertEquals(1, res.unsentRequests.size());
res.unsentRequests.get(0).handler().onFailure(time.milliseconds(), new TimeoutException());
@ -911,8 +950,8 @@ public class CommitRequestManagerTest {
when(coordinatorRequestManager.coordinator()).thenReturn(Optional.of(mockedNode));
// Send request that is expected to fail with invalid epoch.
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
commitRequestManager.fetchOffsets(partitions, expirationTimeMs);
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
commitRequestManager.fetchOffsets(partitions, deadlineMs);
// Mock member has a new valid epoch.
int newEpoch = 8;
@ -950,9 +989,9 @@ public class CommitRequestManagerTest {
when(coordinatorRequestManager.coordinator()).thenReturn(Optional.of(mockedNode));
// Send request that is expected to fail with invalid epoch.
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> requestResult =
commitRequestManager.fetchOffsets(partitions, expirationTimeMs);
commitRequestManager.fetchOffsets(partitions, deadlineMs);
// Mock member not having a valid epoch anymore (left/failed/fenced).
commitRequestManager.onMemberEpochUpdated(Optional.empty(), Optional.empty());
@ -983,10 +1022,10 @@ public class CommitRequestManagerTest {
TopicPartition tp = new TopicPartition("topic", 1);
subscriptionState.assignFromUser(singleton(tp));
subscriptionState.seek(tp, 5);
long expirationTimeMs = time.milliseconds() + retryBackoffMs * 2;
long deadlineMs = time.milliseconds() + retryBackoffMs * 2;
// Send commit request expected to be retried on STALE_MEMBER_EPOCH error while it does not expire
commitRequestManager.maybeAutoCommitSyncBeforeRevocation(expirationTimeMs);
commitRequestManager.maybeAutoCommitSyncBeforeRevocation(deadlineMs);
int newEpoch = 8;
String memberId = "member1";
@ -1094,7 +1133,7 @@ public class CommitRequestManagerTest {
}
/**
* @return {@link Errors} that could be received in OffsetCommit responses.
* @return {@link Errors} that could be received in {@link ApiKeys#OFFSET_COMMIT} responses.
*/
private static Stream<Arguments> offsetCommitExceptionSupplier() {
return Stream.of(
@ -1113,25 +1152,27 @@ public class CommitRequestManagerTest {
Arguments.of(Errors.UNKNOWN_MEMBER_ID));
}
// Supplies (error, isRetriable)
/**
* @return {@link Errors} that could be received in {@link ApiKeys#OFFSET_FETCH} responses.
*/
private static Stream<Arguments> offsetFetchExceptionSupplier() {
// The offset fetch path only retries on a subset of retriable errors
return Stream.of(
Arguments.of(Errors.NOT_COORDINATOR, true),
Arguments.of(Errors.COORDINATOR_LOAD_IN_PROGRESS, true),
Arguments.of(Errors.UNKNOWN_SERVER_ERROR, false),
Arguments.of(Errors.GROUP_AUTHORIZATION_FAILED, false),
Arguments.of(Errors.OFFSET_METADATA_TOO_LARGE, false),
Arguments.of(Errors.INVALID_COMMIT_OFFSET_SIZE, false),
Arguments.of(Errors.UNKNOWN_TOPIC_OR_PARTITION, false),
Arguments.of(Errors.COORDINATOR_NOT_AVAILABLE, true),
Arguments.of(Errors.REQUEST_TIMED_OUT, false),
Arguments.of(Errors.FENCED_INSTANCE_ID, false),
Arguments.of(Errors.TOPIC_AUTHORIZATION_FAILED, false),
Arguments.of(Errors.UNKNOWN_MEMBER_ID, false),
Arguments.of(Errors.NOT_COORDINATOR),
Arguments.of(Errors.COORDINATOR_LOAD_IN_PROGRESS),
Arguments.of(Errors.UNKNOWN_SERVER_ERROR),
Arguments.of(Errors.GROUP_AUTHORIZATION_FAILED),
Arguments.of(Errors.OFFSET_METADATA_TOO_LARGE),
Arguments.of(Errors.INVALID_COMMIT_OFFSET_SIZE),
Arguments.of(Errors.UNKNOWN_TOPIC_OR_PARTITION),
Arguments.of(Errors.COORDINATOR_NOT_AVAILABLE),
Arguments.of(Errors.REQUEST_TIMED_OUT),
Arguments.of(Errors.FENCED_INSTANCE_ID),
Arguments.of(Errors.TOPIC_AUTHORIZATION_FAILED),
Arguments.of(Errors.UNKNOWN_MEMBER_ID),
// Adding STALE_MEMBER_EPOCH as non-retriable here because it is only retried if a new
// member epoch is received. Tested separately.
Arguments.of(Errors.STALE_MEMBER_EPOCH, false));
Arguments.of(Errors.STALE_MEMBER_EPOCH),
Arguments.of(Errors.UNSTABLE_OFFSET_COMMIT));
}
/**
@ -1155,9 +1196,9 @@ public class CommitRequestManagerTest {
TopicPartition tp2 = new TopicPartition("t2", 3);
partitions.add(tp1);
partitions.add(tp2);
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
CompletableFuture<Map<TopicPartition, OffsetAndMetadata>> future =
commitRequestManager.fetchOffsets(partitions, expirationTimeMs);
commitRequestManager.fetchOffsets(partitions, deadlineMs);
NetworkClientDelegate.PollResult res = commitRequestManager.poll(time.milliseconds());
assertEquals(1, res.unsentRequests.size());
@ -1215,9 +1256,9 @@ public class CommitRequestManagerTest {
int numRequest,
final Errors error) {
List<CompletableFuture<Map<TopicPartition, OffsetAndMetadata>>> futures = new ArrayList<>();
long expirationTimeMs = time.milliseconds() + defaultApiTimeoutMs;
long deadlineMs = time.milliseconds() + defaultApiTimeoutMs;
for (int i = 0; i < numRequest; i++) {
futures.add(commitRequestManager.fetchOffsets(partitions, expirationTimeMs));
futures.add(commitRequestManager.fetchOffsets(partitions, deadlineMs));
}
NetworkClientDelegate.PollResult res = commitRequestManager.poll(time.milliseconds());

View File

@ -339,6 +339,27 @@ public class ConsumerNetworkThreadTest {
verify(applicationEventReaper).reap(any(Long.class));
}
@Test
void testSendUnsentRequest() {
String groupId = "group-id";
NetworkClientDelegate.UnsentRequest request = new NetworkClientDelegate.UnsentRequest(
new FindCoordinatorRequest.Builder(
new FindCoordinatorRequestData()
.setKeyType(FindCoordinatorRequest.CoordinatorType.TRANSACTION.id())
.setKey(groupId)),
Optional.empty());
networkClient.add(request);
assertTrue(networkClient.hasAnyPendingRequests());
assertFalse(networkClient.unsentRequests().isEmpty());
assertFalse(client.hasInFlightRequests());
consumerNetworkThread.cleanup();
assertTrue(networkClient.unsentRequests().isEmpty());
assertFalse(client.hasInFlightRequests());
assertFalse(networkClient.hasAnyPendingRequests());
}
private void prepareOffsetCommitRequest(final Map<TopicPartition, Long> expectedOffsets,
final Errors error,
final boolean disconnected) {

View File

@ -174,7 +174,7 @@ public class ConsumerTestBuilder implements Closeable {
backgroundEventHandler,
logContext));
this.topicMetadataRequestManager = spy(new TopicMetadataRequestManager(logContext, config));
this.topicMetadataRequestManager = spy(new TopicMetadataRequestManager(logContext, time, config));
if (groupInfo.isPresent()) {
GroupInformation gi = groupInfo.get();

View File

@ -21,6 +21,7 @@ import org.apache.kafka.clients.ClientRequest;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.KafkaClient;
import org.apache.kafka.clients.Metadata;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.clients.MockClient;
import org.apache.kafka.clients.NetworkClient;
import org.apache.kafka.clients.NodeApiVersions;
@ -369,7 +370,7 @@ public class FetchRequestManagerTest {
// NOTE: by design the FetchRequestManager doesn't perform network I/O internally. That means that calling
// the close() method with a Timer will NOT send out the close session requests on close. The network
// I/O logic is handled inside ConsumerNetworkThread.runAtClose, so we need to run that logic here.
ConsumerNetworkThread.runAtClose(singletonList(Optional.of(fetcher)), networkClientDelegate, timer);
ConsumerNetworkThread.runAtClose(singletonList(Optional.of(fetcher)), networkClientDelegate);
// the network is polled during the last state of clean up.
networkClientDelegate.poll(time.timer(1));
// validate that closing the fetcher has sent a request with final epoch. 2 requests are sent, one for the
@ -1909,7 +1910,8 @@ public class FetchRequestManagerTest {
Node node = cluster.nodes().get(0);
NetworkClient client = new NetworkClient(selector, metadata, "mock", Integer.MAX_VALUE,
1000, 1000, 64 * 1024, 64 * 1024, 1000, 10 * 1000, 127 * 1000,
time, true, new ApiVersions(), metricsManager.throttleTimeSensor(), new LogContext());
time, true, new ApiVersions(), metricsManager.throttleTimeSensor(), new LogContext(),
MetadataRecoveryStrategy.NONE);
ApiVersionsResponse apiVersionsResponse = TestUtils.defaultApiVersionsResponse(
400, ApiMessageType.ListenerType.ZK_BROKER);

View File

@ -21,6 +21,7 @@ import org.apache.kafka.clients.ClientRequest;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.FetchSessionHandler;
import org.apache.kafka.clients.Metadata;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.clients.MockClient;
import org.apache.kafka.clients.NetworkClient;
import org.apache.kafka.clients.NodeApiVersions;
@ -1905,7 +1906,8 @@ public class FetcherTest {
Node node = cluster.nodes().get(0);
NetworkClient client = new NetworkClient(selector, metadata, "mock", Integer.MAX_VALUE,
1000, 1000, 64 * 1024, 64 * 1024, 1000, 10 * 1000, 127 * 1000,
time, true, new ApiVersions(), metricsManager.throttleTimeSensor(), new LogContext());
time, true, new ApiVersions(), metricsManager.throttleTimeSensor(), new LogContext(),
MetadataRecoveryStrategy.NONE);
ApiVersionsResponse apiVersionsResponse = TestUtils.defaultApiVersionsResponse(
400, ApiMessageType.ListenerType.ZK_BROKER);

View File

@ -277,7 +277,7 @@ public class HeartbeatRequestManagerTest {
result = heartbeatRequestManager.poll(time.milliseconds());
assertEquals(0, result.unsentRequests.size(), "No heartbeat should be sent while a " +
"previous one is in-flight");
time.sleep(DEFAULT_HEARTBEAT_INTERVAL_MS);
result = heartbeatRequestManager.poll(time.milliseconds());
assertEquals(0, result.unsentRequests.size(), "No heartbeat should be sent when the " +
@ -752,6 +752,25 @@ public class HeartbeatRequestManagerTest {
assertEquals(1, result.unsentRequests.size(), "Fenced member should resume heartbeat after transitioning to JOINING");
}
@ParameterizedTest
@ApiKeyVersionsSource(apiKey = ApiKeys.CONSUMER_GROUP_HEARTBEAT)
public void testSendingLeaveGroupHeartbeatWhenPreviousOneInFlight(final short version) {
mockStableMember();
time.sleep(DEFAULT_HEARTBEAT_INTERVAL_MS);
NetworkClientDelegate.PollResult result = heartbeatRequestManager.poll(time.milliseconds());
assertEquals(1, result.unsentRequests.size());
result = heartbeatRequestManager.poll(time.milliseconds());
assertEquals(0, result.unsentRequests.size(), "No heartbeat should be sent while a previous one is in-flight");
membershipManager.leaveGroup();
ConsumerGroupHeartbeatRequest heartbeatToLeave = getHeartbeatRequest(heartbeatRequestManager, version);
assertEquals(ConsumerGroupHeartbeatRequest.LEAVE_GROUP_MEMBER_EPOCH, heartbeatToLeave.data().memberEpoch());
NetworkClientDelegate.PollResult pollAgain = heartbeatRequestManager.poll(time.milliseconds());
assertEquals(0, pollAgain.unsentRequests.size());
}
private void assertHeartbeat(HeartbeatRequestManager hrm, int nextPollMs) {
NetworkClientDelegate.PollResult pollResult = hrm.poll(time.milliseconds());
assertEquals(1, pollResult.unsentRequests.size());

View File

@ -365,6 +365,8 @@ public class MembershipManagerImplTest {
// because member is already out of the group in the broker).
completeCallback(callbackEvent, membershipManager);
assertEquals(MemberState.UNSUBSCRIBED, membershipManager.state());
assertEquals(ConsumerGroupHeartbeatRequest.LEAVE_GROUP_MEMBER_EPOCH, membershipManager.memberEpoch());
verify(membershipManager).notifyEpochChange(Optional.empty(), Optional.empty());
assertTrue(membershipManager.shouldSkipHeartbeat());
}

View File

@ -43,6 +43,7 @@ import static org.apache.kafka.clients.consumer.ConsumerConfig.KEY_DESERIALIZER_
import static org.apache.kafka.clients.consumer.ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG;
import static org.apache.kafka.clients.consumer.ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertNull;
import static org.junit.jupiter.api.Assertions.assertTrue;
@ -140,6 +141,34 @@ public class NetworkClientDelegateTest {
assertEquals(REQUEST_TIMEOUT_MS, ncd.unsentRequests().poll().timer().timeoutMs());
}
@Test
public void testHasAnyPendingRequests() throws Exception {
try (NetworkClientDelegate networkClientDelegate = newNetworkClientDelegate()) {
NetworkClientDelegate.UnsentRequest unsentRequest = newUnsentFindCoordinatorRequest();
networkClientDelegate.add(unsentRequest);
// unsent
assertTrue(networkClientDelegate.hasAnyPendingRequests());
assertFalse(networkClientDelegate.unsentRequests().isEmpty());
assertFalse(client.hasInFlightRequests());
networkClientDelegate.poll(0, time.milliseconds());
// in-flight
assertTrue(networkClientDelegate.hasAnyPendingRequests());
assertTrue(networkClientDelegate.unsentRequests().isEmpty());
assertTrue(client.hasInFlightRequests());
client.respond(FindCoordinatorResponse.prepareResponse(Errors.NONE, GROUP_ID, mockNode()));
networkClientDelegate.poll(0, time.milliseconds());
// get response
assertFalse(networkClientDelegate.hasAnyPendingRequests());
assertTrue(networkClientDelegate.unsentRequests().isEmpty());
assertFalse(client.hasInFlightRequests());
}
}
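As a complement to the assertions above, a minimal, hypothetical sketch of what "any pending request" means here: a request counts as pending while it is still queued as unsent or already in flight on the underlying client. The class below is illustrative only, not the NetworkClientDelegate implementation.
import java.util.ArrayDeque;
import java.util.Queue;

final class PendingRequestCheck {
    private final Queue<Object> unsentRequests = new ArrayDeque<>();
    private int inFlightRequests;

    // Mirrors the three phases asserted above: unsent -> pending, in-flight -> pending,
    // response received -> no longer pending.
    boolean hasAnyPendingRequests() {
        return !unsentRequests.isEmpty() || inFlightRequests > 0;
    }
}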
public NetworkClientDelegate newNetworkClientDelegate() {
LogContext logContext = new LogContext();
Properties properties = new Properties();

View File

@ -0,0 +1,96 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.kafka.clients.consumer.internals;
import org.apache.kafka.common.utils.LogContext;
import org.apache.kafka.common.utils.MockTime;
import org.apache.kafka.common.utils.Time;
import org.apache.kafka.common.utils.Timer;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;
public class TimedRequestStateTest {
private final static long DEFAULT_TIMEOUT_MS = 30000;
private final Time time = new MockTime();
@Test
public void testIsExpired() {
TimedRequestState state = new TimedRequestState(
new LogContext(),
this.getClass().getSimpleName(),
100,
1000,
time.timer(DEFAULT_TIMEOUT_MS)
);
assertFalse(state.isExpired());
time.sleep(DEFAULT_TIMEOUT_MS);
assertTrue(state.isExpired());
}
@Test
public void testRemainingMs() {
TimedRequestState state = new TimedRequestState(
new LogContext(),
this.getClass().getSimpleName(),
100,
1000,
time.timer(DEFAULT_TIMEOUT_MS)
);
assertEquals(DEFAULT_TIMEOUT_MS, state.remainingMs());
time.sleep(DEFAULT_TIMEOUT_MS);
assertEquals(0, state.remainingMs());
}
@Test
public void testDeadlineTimer() {
long deadlineMs = time.milliseconds() + DEFAULT_TIMEOUT_MS;
Timer timer = TimedRequestState.deadlineTimer(time, deadlineMs);
assertEquals(DEFAULT_TIMEOUT_MS, timer.remainingMs());
timer.sleep(DEFAULT_TIMEOUT_MS);
assertEquals(0, timer.remainingMs());
}
@Test
public void testAllowOverdueDeadlineTimer() {
long deadlineMs = time.milliseconds() - DEFAULT_TIMEOUT_MS;
Timer timer = TimedRequestState.deadlineTimer(time, deadlineMs);
assertEquals(0, timer.remainingMs());
}
@Test
public void testToStringUpdatesTimer() {
TimedRequestState state = new TimedRequestState(
new LogContext(),
this.getClass().getSimpleName(),
100,
1000,
time.timer(DEFAULT_TIMEOUT_MS)
);
assertToString(state, DEFAULT_TIMEOUT_MS);
time.sleep(DEFAULT_TIMEOUT_MS);
assertToString(state, 0);
}
private void assertToString(TimedRequestState state, long timerMs) {
assertTrue(state.toString().contains("remainingMs=" + timerMs + "}"));
}
}
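A minimal sketch of the deadline arithmetic the new TimedRequestState tests rely on, assuming deadlineTimer() simply clamps the remaining time at zero; the class name is illustrative and the real implementation may do more than this.
public final class DeadlineTimerSketch {
    // An already-overdue deadline yields 0 remaining, matching testAllowOverdueDeadlineTimer().
    static long remainingMs(long nowMs, long deadlineMs) {
        return Math.max(0, deadlineMs - nowMs);
    }

    public static void main(String[] args) {
        System.out.println(remainingMs(0L, 30_000L));      // 30000
        System.out.println(remainingMs(60_000L, 30_000L)); // 0 (overdue)
    }
}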

View File

@ -74,6 +74,7 @@ public class TopicMetadataRequestManagerTest {
props.put(VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
this.topicMetadataRequestManager = spy(new TopicMetadataRequestManager(
new LogContext(),
time,
new ConsumerConfig(props)));
}

View File

@ -36,14 +36,10 @@ import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import static org.apache.kafka.clients.consumer.internals.events.CompletableEvent.calculateDeadlineMs;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;
public class ApplicationEventProcessorTest {
private final Time time = new MockTime(1);
@ -90,16 +86,6 @@ public class ApplicationEventProcessorTest {
verify(commitRequestManager).signalClose();
}
@Test
public void testPrepClosingLeaveGroupEvent() {
LeaveOnCloseEvent event = new LeaveOnCloseEvent(calculateDeadlineMs(time, 100));
when(heartbeatRequestManager.membershipManager()).thenReturn(membershipManager);
when(membershipManager.leaveGroup()).thenReturn(CompletableFuture.completedFuture(null));
processor.process(event);
verify(membershipManager).leaveGroup();
assertTrue(event.future().isDone());
}
private List<NetworkClientDelegate.UnsentRequest> mockCommitResults() {
return Collections.singletonList(mock(NetworkClientDelegate.UnsentRequest.class));
}

View File

@ -18,6 +18,7 @@ package org.apache.kafka.clients.producer;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.KafkaClient;
import org.apache.kafka.clients.LeastLoadedNode;
import org.apache.kafka.clients.MockClient;
import org.apache.kafka.clients.NodeApiVersions;
import org.apache.kafka.clients.consumer.ConsumerConfig;
@ -735,8 +736,8 @@ public class KafkaProducerTest {
// let mockClient#leastLoadedNode return the node directly so that we can isolate Metadata calls from KafkaProducer for idempotent producer
MockClient mockClient = new MockClient(Time.SYSTEM, metadata) {
@Override
public Node leastLoadedNode(long now) {
return NODE;
public LeastLoadedNode leastLoadedNode(long now) {
return new LeastLoadedNode(NODE, true);
}
};

View File

@ -17,6 +17,7 @@
package org.apache.kafka.clients.producer;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.common.config.ConfigException;
import org.apache.kafka.common.security.auth.SecurityProtocol;
import org.apache.kafka.common.serialization.ByteArraySerializer;
@ -98,6 +99,25 @@ public class ProducerConfigTest {
assertTrue(ce.getMessage().contains(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG));
}
@Test
public void testDefaultMetadataRecoveryStrategy() {
Map<String, Object> configs = new HashMap<>();
configs.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, keySerializerClass);
configs.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, valueSerializerClass);
final ProducerConfig producerConfig = new ProducerConfig(configs);
assertEquals(MetadataRecoveryStrategy.NONE.name, producerConfig.getString(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG));
}
@Test
public void testInvalidMetadataRecoveryStrategy() {
Map<String, Object> configs = new HashMap<>();
configs.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, keySerializerClass);
configs.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, valueSerializerClass);
configs.put(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG, "abc");
ConfigException ce = assertThrows(ConfigException.class, () -> new ProducerConfig(configs));
assertTrue(ce.getMessage().contains(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG));
}
@Test
public void testCaseInsensitiveSecurityProtocol() {
final String saslSslLowerCase = SecurityProtocol.SASL_SSL.name.toLowerCase(Locale.ROOT);

View File

@ -19,7 +19,9 @@ package org.apache.kafka.clients.producer.internals;
import org.apache.kafka.clients.ApiVersions;
import org.apache.kafka.clients.ClientRequest;
import org.apache.kafka.clients.ClientResponse;
import org.apache.kafka.clients.LeastLoadedNode;
import org.apache.kafka.clients.Metadata;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.clients.MetadataSnapshot;
import org.apache.kafka.clients.MockClient;
import org.apache.kafka.clients.NetworkClient;
@ -299,7 +301,8 @@ public class SenderTest {
Node node = cluster.nodes().get(0);
NetworkClient client = new NetworkClient(selector, metadata, "mock", Integer.MAX_VALUE,
1000, 1000, 64 * 1024, 64 * 1024, 1000, 10 * 1000, 127 * 1000,
time, true, new ApiVersions(), throttleTimeSensor, logContext);
time, true, new ApiVersions(), throttleTimeSensor, logContext,
MetadataRecoveryStrategy.NONE);
ApiVersionsResponse apiVersionsResponse = TestUtils.defaultApiVersionsResponse(
400, ApiMessageType.ListenerType.ZK_BROKER);
@ -3797,12 +3800,12 @@ public class SenderTest {
client = new MockClient(time, metadata) {
volatile boolean canSendMore = true;
@Override
public Node leastLoadedNode(long now) {
public LeastLoadedNode leastLoadedNode(long now) {
for (Node node : metadata.fetch().nodes()) {
if (isReady(node, now) && canSendMore)
return node;
return new LeastLoadedNode(node, true);
}
return null;
return new LeastLoadedNode(null, false);
}
@Override
@ -3821,7 +3824,7 @@ public class SenderTest {
while (!client.ready(node, time.milliseconds()))
client.poll(0, time.milliseconds());
client.send(request, time.milliseconds());
while (client.leastLoadedNode(time.milliseconds()) != null)
while (client.leastLoadedNode(time.milliseconds()).node() != null)
client.poll(0, time.milliseconds());
}
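The two producer-test hunks above reflect the same client API change: leastLoadedNode(long) now returns a LeastLoadedNode wrapper rather than a nullable Node. A minimal migration sketch for call sites, assuming only what the diff itself shows (a (Node, boolean) constructor and a node() accessor that may return null):

    // Before this change: Node node = client.leastLoadedNode(now);  // could be null
    LeastLoadedNode leastLoaded = client.leastLoadedNode(time.milliseconds());
    Node node = leastLoaded.node();  // may still be null when no node qualifies
    if (node != null) {
        // use the node exactly as before
    }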

View File

@ -21,6 +21,8 @@ import org.apache.kafka.test.TestUtils;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.Timeout;
import org.junit.jupiter.api.function.Executable;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;
import org.mockito.stubbing.OngoingStubbing;
import java.io.Closeable;
@ -106,31 +108,35 @@ public class UtilsTest {
}
}
@Test
public void testGetHost() {
// valid
@ParameterizedTest
@CsvSource(value = {"PLAINTEXT", "SASL_PLAINTEXT", "SSL", "SASL_SSL"})
public void testGetHostValid(String protocol) {
assertEquals("mydomain.com", getHost(protocol + "://mydomain.com:8080"));
assertEquals("MyDomain.com", getHost(protocol + "://MyDomain.com:8080"));
assertEquals("My_Domain.com", getHost(protocol + "://My_Domain.com:8080"));
assertEquals("::1", getHost(protocol + "://[::1]:1234"));
assertEquals("2001:db8:85a3:8d3:1319:8a2e:370:7348", getHost(protocol + "://[2001:db8:85a3:8d3:1319:8a2e:370:7348]:5678"));
assertEquals("2001:DB8:85A3:8D3:1319:8A2E:370:7348", getHost(protocol + "://[2001:DB8:85A3:8D3:1319:8A2E:370:7348]:5678"));
assertEquals("fe80::b1da:69ca:57f7:63d8%3", getHost(protocol + "://[fe80::b1da:69ca:57f7:63d8%3]:5678"));
assertEquals("127.0.0.1", getHost("127.0.0.1:8000"));
assertEquals("mydomain.com", getHost("PLAINTEXT://mydomain.com:8080"));
assertEquals("MyDomain.com", getHost("PLAINTEXT://MyDomain.com:8080"));
assertEquals("My_Domain.com", getHost("PLAINTEXT://My_Domain.com:8080"));
assertEquals("::1", getHost("[::1]:1234"));
assertEquals("2001:db8:85a3:8d3:1319:8a2e:370:7348", getHost("PLAINTEXT://[2001:db8:85a3:8d3:1319:8a2e:370:7348]:5678"));
assertEquals("2001:DB8:85A3:8D3:1319:8A2E:370:7348", getHost("PLAINTEXT://[2001:DB8:85A3:8D3:1319:8A2E:370:7348]:5678"));
assertEquals("fe80::b1da:69ca:57f7:63d8%3", getHost("PLAINTEXT://[fe80::b1da:69ca:57f7:63d8%3]:5678"));
}
// invalid
assertNull(getHost("PLAINTEXT://mydo)main.com:8080"));
assertNull(getHost("PLAINTEXT://mydo(main.com:8080"));
assertNull(getHost("PLAINTEXT://mydo()main.com:8080"));
assertNull(getHost("PLAINTEXT://mydo(main).com:8080"));
@ParameterizedTest
@CsvSource(value = {"PLAINTEXT", "SASL_PLAINTEXT", "SSL", "SASL_SSL"})
public void testGetHostInvalid(String protocol) {
assertNull(getHost(protocol + "://mydo)main.com:8080"));
assertNull(getHost(protocol + "://mydo(main.com:8080"));
assertNull(getHost(protocol + "://mydo()main.com:8080"));
assertNull(getHost(protocol + "://mydo(main).com:8080"));
assertNull(getHost(protocol + "://[2001:db)8:85a3:8d3:1319:8a2e:370:7348]:5678"));
assertNull(getHost(protocol + "://[2001:db(8:85a3:8d3:1319:8a2e:370:7348]:5678"));
assertNull(getHost(protocol + "://[2001:db()8:85a3:8d3:1319:8a2e:370:7348]:5678"));
assertNull(getHost(protocol + "://[2001:db(8:85a3:)8d3:1319:8a2e:370:7348]:5678"));
assertNull(getHost("ho)st:9092"));
assertNull(getHost("ho(st:9092"));
assertNull(getHost("ho()st:9092"));
assertNull(getHost("ho(st):9092"));
assertNull(getHost("PLAINTEXT://[2001:db)8:85a3:8d3:1319:8a2e:370:7348]:5678"));
assertNull(getHost("PLAINTEXT://[2001:db(8:85a3:8d3:1319:8a2e:370:7348]:5678"));
assertNull(getHost("PLAINTEXT://[2001:db()8:85a3:8d3:1319:8a2e:370:7348]:5678"));
assertNull(getHost("PLAINTEXT://[2001:db(8:85a3:)8d3:1319:8a2e:370:7348]:5678"));
}
@Test

View File

@ -37,7 +37,7 @@ controller.quorum.voters=1@localhost:9093
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://localhost:9092
listeners=PLAINTEXT://:9092
# Name of listener used for communication between brokers.
inter.broker.listener.name=PLAINTEXT
@ -133,7 +133,7 @@ log.retention.check.interval.ms=300000
elasticstream.enable=true
# The data buckets
# the full url format for s3 is 0@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey]"
# the full url format for s3 is 0@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey][&checksumAlgorithm=$checksumAlgorithm]"
# - pathStyle: true|false. The object storage access path style. When using MinIO, it should be set to true."
# - authType: instance|static.
# - When set to instance, it will use instance profile to auth.
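# Illustrative sketch only (not part of this change): a hypothetical static-auth bucket URL
# following the documented format could look like
#   0@s3://my-bucket?region=us-east-1&endpoint=http://minio:9000&pathStyle=true&authType=static&accessKey=minioadmin&secretKey=minioadmin&checksumAlgorithm=CRC32
# where the bucket, region, endpoint, credentials and checksum algorithm are placeholders.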

View File

@ -126,7 +126,7 @@ log.retention.check.interval.ms=300000
elasticstream.enable=true
# The data buckets
# the full url format for s3 is 0@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey]"
# the full url format for s3 is 0@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey][&checksumAlgorithm=$checksumAlgorithm]"
# - pathStyle: true|false. The object storage access path style. When using MinIO, it should be set to true."
# - authType: instance|static.
# - When set to instance, it will use instance profile to auth.

View File

@ -136,7 +136,7 @@ log.retention.check.interval.ms=300000
elasticstream.enable=true
# The data buckets
# the full url format for s3 is 0@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey]"
# the full url format for s3 is 0@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey][&checksumAlgorithm=$checksumAlgorithm]"
# - pathStyle: true|false. The object storage access path style. When using MinIO, it should be set to true."
# - authType: instance|static.
# - When set to instance, it will use instance profile to auth.

View File

@ -213,11 +213,15 @@ public class ConnectSchema implements Schema {
validateValue(null, schema, value);
}
public static void validateValue(String name, Schema schema, Object value) {
public static void validateValue(String field, Schema schema, Object value) {
validateValue(schema, value, field == null ? "value" : "field: \"" + field + "\"");
}
private static void validateValue(Schema schema, Object value, String location) {
if (value == null) {
if (!schema.isOptional())
throw new DataException("Invalid value: null used for required field: \"" + name
+ "\", schema type: " + schema.type());
throw new DataException("Invalid value: null used for required " + location
+ ", schema type: " + schema.type());
return;
}
@ -236,8 +240,8 @@ public class ConnectSchema implements Schema {
exceptionMessage.append(" \"").append(schema.name()).append("\"");
}
exceptionMessage.append(" with type ").append(schema.type()).append(": ").append(value.getClass());
if (name != null) {
exceptionMessage.append(" for field: \"").append(name).append("\"");
if (location != null) {
exceptionMessage.append(" for ").append(location);
}
throw new DataException(exceptionMessage.toString());
}
@ -251,19 +255,33 @@ public class ConnectSchema implements Schema {
break;
case ARRAY:
List<?> array = (List<?>) value;
for (Object entry : array)
validateValue(schema.valueSchema(), entry);
String entryLocation = "element of array " + location;
Schema arrayValueSchema = assertSchemaNotNull(schema.valueSchema(), entryLocation);
for (Object entry : array) {
validateValue(arrayValueSchema, entry, entryLocation);
}
break;
case MAP:
Map<?, ?> map = (Map<?, ?>) value;
String keyLocation = "key of map " + location;
String valueLocation = "value of map " + location;
Schema mapKeySchema = assertSchemaNotNull(schema.keySchema(), keyLocation);
Schema mapValueSchema = assertSchemaNotNull(schema.valueSchema(), valueLocation);
for (Map.Entry<?, ?> entry : map.entrySet()) {
validateValue(schema.keySchema(), entry.getKey());
validateValue(schema.valueSchema(), entry.getValue());
validateValue(mapKeySchema, entry.getKey(), keyLocation);
validateValue(mapValueSchema, entry.getValue(), valueLocation);
}
break;
}
}
private static Schema assertSchemaNotNull(Schema schema, String location) {
if (schema == null) {
throw new DataException("No schema defined for " + location);
}
return schema;
}
private static List<Class<?>> expectedClassesFor(Schema schema) {
List<Class<?>> expectedClasses = LOGICAL_TYPE_CLASSES.get(schema.name());
if (expectedClasses == null)

View File

@ -330,4 +330,144 @@ public class ConnectSchemaTest {
new Struct(emptyStruct);
}
private void assertInvalidValueForSchema(String fieldName, Schema schema, Object value, String message) {
Exception e = assertThrows(DataException.class, () -> ConnectSchema.validateValue(fieldName, schema, value));
assertEquals(message, e.getMessage());
}
@Test
public void testValidateFieldWithInvalidValueType() {
String fieldName = "field";
assertInvalidValueForSchema(fieldName, new FakeSchema(), new Object(),
"Invalid Java object for schema \"fake\" with type null: class java.lang.Object for field: \"field\"");
assertInvalidValueForSchema(null, Schema.INT8_SCHEMA, new Object(),
"Invalid Java object for schema with type INT8: class java.lang.Object for value");
assertInvalidValueForSchema(fieldName, Schema.INT8_SCHEMA, new Object(),
"Invalid Java object for schema with type INT8: class java.lang.Object for field: \"field\"");
}
@Test
public void testValidateFieldWithInvalidValueMismatchTimestamp() {
long longValue = 1000L;
String fieldName = "field";
ConnectSchema.validateValue(fieldName, Schema.INT64_SCHEMA, longValue);
assertInvalidValueForSchema(fieldName, Timestamp.SCHEMA, longValue,
"Invalid Java object for schema \"org.apache.kafka.connect.data.Timestamp\" " +
"with type INT64: class java.lang.Long for field: \"field\"");
}
@Test
public void testValidateList() {
String fieldName = "field";
// Optional element schema
Schema optionalStrings = SchemaBuilder.array(Schema.OPTIONAL_STRING_SCHEMA);
ConnectSchema.validateValue(fieldName, optionalStrings, Collections.emptyList());
ConnectSchema.validateValue(fieldName, optionalStrings, Collections.singletonList("hello"));
ConnectSchema.validateValue(fieldName, optionalStrings, Collections.singletonList(null));
ConnectSchema.validateValue(fieldName, optionalStrings, Arrays.asList("hello", "world"));
ConnectSchema.validateValue(fieldName, optionalStrings, Arrays.asList("hello", null));
ConnectSchema.validateValue(fieldName, optionalStrings, Arrays.asList(null, "world"));
assertInvalidValueForSchema(fieldName, optionalStrings, Collections.singletonList(true),
"Invalid Java object for schema with type STRING: class java.lang.Boolean for element of array field: \"field\"");
// Required element schema
Schema requiredStrings = SchemaBuilder.array(Schema.STRING_SCHEMA);
ConnectSchema.validateValue(fieldName, requiredStrings, Collections.emptyList());
ConnectSchema.validateValue(fieldName, requiredStrings, Collections.singletonList("hello"));
assertInvalidValueForSchema(fieldName, requiredStrings, Collections.singletonList(null),
"Invalid value: null used for required element of array field: \"field\", schema type: STRING");
ConnectSchema.validateValue(fieldName, requiredStrings, Arrays.asList("hello", "world"));
assertInvalidValueForSchema(fieldName, requiredStrings, Arrays.asList("hello", null),
"Invalid value: null used for required element of array field: \"field\", schema type: STRING");
assertInvalidValueForSchema(fieldName, requiredStrings, Arrays.asList(null, "world"),
"Invalid value: null used for required element of array field: \"field\", schema type: STRING");
assertInvalidValueForSchema(fieldName, optionalStrings, Collections.singletonList(true),
"Invalid Java object for schema with type STRING: class java.lang.Boolean for element of array field: \"field\"");
// Null element schema
Schema nullElements = SchemaBuilder.type(Schema.Type.ARRAY);
assertInvalidValueForSchema(fieldName, nullElements, Collections.emptyList(),
"No schema defined for element of array field: \"field\"");
assertInvalidValueForSchema(fieldName, nullElements, Collections.singletonList("hello"),
"No schema defined for element of array field: \"field\"");
assertInvalidValueForSchema(fieldName, nullElements, Collections.singletonList(null),
"No schema defined for element of array field: \"field\"");
assertInvalidValueForSchema(fieldName, nullElements, Arrays.asList("hello", "world"),
"No schema defined for element of array field: \"field\"");
assertInvalidValueForSchema(fieldName, nullElements, Arrays.asList("hello", null),
"No schema defined for element of array field: \"field\"");
assertInvalidValueForSchema(fieldName, nullElements, Arrays.asList(null, "world"),
"No schema defined for element of array field: \"field\"");
assertInvalidValueForSchema(fieldName, nullElements, Collections.singletonList(true),
"No schema defined for element of array field: \"field\"");
}
@Test
public void testValidateMap() {
String fieldName = "field";
// Optional element schema
Schema optionalStrings = SchemaBuilder.map(Schema.OPTIONAL_STRING_SCHEMA, Schema.OPTIONAL_STRING_SCHEMA);
ConnectSchema.validateValue(fieldName, optionalStrings, Collections.emptyMap());
ConnectSchema.validateValue(fieldName, optionalStrings, Collections.singletonMap("key", "value"));
ConnectSchema.validateValue(fieldName, optionalStrings, Collections.singletonMap("key", null));
ConnectSchema.validateValue(fieldName, optionalStrings, Collections.singletonMap(null, "value"));
ConnectSchema.validateValue(fieldName, optionalStrings, Collections.singletonMap(null, null));
assertInvalidValueForSchema(fieldName, optionalStrings, Collections.singletonMap("key", true),
"Invalid Java object for schema with type STRING: class java.lang.Boolean for value of map field: \"field\"");
assertInvalidValueForSchema(fieldName, optionalStrings, Collections.singletonMap(true, "value"),
"Invalid Java object for schema with type STRING: class java.lang.Boolean for key of map field: \"field\"");
// Required element schema
Schema requiredStrings = SchemaBuilder.map(Schema.STRING_SCHEMA, Schema.STRING_SCHEMA);
ConnectSchema.validateValue(fieldName, requiredStrings, Collections.emptyMap());
ConnectSchema.validateValue(fieldName, requiredStrings, Collections.singletonMap("key", "value"));
assertInvalidValueForSchema(fieldName, requiredStrings, Collections.singletonMap("key", null),
"Invalid value: null used for required value of map field: \"field\", schema type: STRING");
assertInvalidValueForSchema(fieldName, requiredStrings, Collections.singletonMap(null, "value"),
"Invalid value: null used for required key of map field: \"field\", schema type: STRING");
assertInvalidValueForSchema(fieldName, requiredStrings, Collections.singletonMap(null, null),
"Invalid value: null used for required key of map field: \"field\", schema type: STRING");
assertInvalidValueForSchema(fieldName, requiredStrings, Collections.singletonMap("key", true),
"Invalid Java object for schema with type STRING: class java.lang.Boolean for value of map field: \"field\"");
assertInvalidValueForSchema(fieldName, requiredStrings, Collections.singletonMap(true, "value"),
"Invalid Java object for schema with type STRING: class java.lang.Boolean for key of map field: \"field\"");
// Null key schema
Schema nullKeys = SchemaBuilder.type(Schema.Type.MAP);
assertInvalidValueForSchema(fieldName, nullKeys, Collections.emptyMap(),
"No schema defined for key of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullKeys, Collections.singletonMap("key", "value"),
"No schema defined for key of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullKeys, Collections.singletonMap("key", null),
"No schema defined for key of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullKeys, Collections.singletonMap(null, "value"),
"No schema defined for key of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullKeys, Collections.singletonMap(null, null),
"No schema defined for key of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullKeys, Collections.singletonMap("key", true),
"No schema defined for key of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullKeys, Collections.singletonMap(true, "value"),
"No schema defined for key of map field: \"field\"");
// Null value schema
Schema nullValues = SchemaBuilder.mapWithNullValues(Schema.OPTIONAL_STRING_SCHEMA);
assertInvalidValueForSchema(fieldName, nullValues, Collections.emptyMap(),
"No schema defined for value of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullValues, Collections.singletonMap("key", "value"),
"No schema defined for value of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullValues, Collections.singletonMap("key", null),
"No schema defined for value of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullValues, Collections.singletonMap(null, "value"),
"No schema defined for value of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullValues, Collections.singletonMap(null, null),
"No schema defined for value of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullValues, Collections.singletonMap("key", true),
"No schema defined for value of map field: \"field\"");
assertInvalidValueForSchema(fieldName, nullValues, Collections.singletonMap(true, "value"),
"No schema defined for value of map field: \"field\"");
}
}

View File

@ -304,39 +304,6 @@ public class StructTest {
e.getMessage());
}
@Test
public void testValidateFieldWithInvalidValueType() {
String fieldName = "field";
FakeSchema fakeSchema = new FakeSchema();
Exception e = assertThrows(DataException.class, () -> ConnectSchema.validateValue(fieldName,
fakeSchema, new Object()));
assertEquals("Invalid Java object for schema \"fake\" with type null: class java.lang.Object for field: \"field\"",
e.getMessage());
e = assertThrows(DataException.class, () -> ConnectSchema.validateValue(fieldName,
Schema.INT8_SCHEMA, new Object()));
assertEquals("Invalid Java object for schema with type INT8: class java.lang.Object for field: \"field\"",
e.getMessage());
e = assertThrows(DataException.class, () -> ConnectSchema.validateValue(Schema.INT8_SCHEMA, new Object()));
assertEquals("Invalid Java object for schema with type INT8: class java.lang.Object", e.getMessage());
}
@Test
public void testValidateFieldWithInvalidValueMismatchTimestamp() {
String fieldName = "field";
long longValue = 1000L;
// Does not throw
ConnectSchema.validateValue(fieldName, Schema.INT64_SCHEMA, longValue);
Exception e = assertThrows(DataException.class, () -> ConnectSchema.validateValue(fieldName,
Timestamp.SCHEMA, longValue));
assertEquals("Invalid Java object for schema \"org.apache.kafka.connect.data.Timestamp\" " +
"with type INT64: class java.lang.Long for field: \"field\"", e.getMessage());
}
@Test
public void testPutNullField() {
final String fieldName = "fieldName";

View File

@ -241,15 +241,15 @@ public class JsonConverter implements Converter, HeaderConverter, Versioned {
/**
* Creates a JsonConverter, initializing its serializer and deserializer.
*
* @param enableModules permits to enable/disable the registration of additional Jackson modules.
@param enableAfterburner whether to register the Jackson Afterburner module.
* <p>
* NOTE: This is visible only for testing
*/
public JsonConverter(boolean enableModules) {
public JsonConverter(boolean enableAfterburner) {
serializer = new JsonSerializer(
mkSet(),
JSON_NODE_FACTORY,
enableModules
enableAfterburner
);
deserializer = new JsonDeserializer(
@ -259,7 +259,7 @@ public class JsonConverter implements Converter, HeaderConverter, Versioned {
DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS
),
JSON_NODE_FACTORY,
enableModules
enableAfterburner
);
}
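A brief hedged sketch of the renamed test-only constructor above, showing nothing beyond what the new javadoc states:

    JsonConverter withAfterburner = new JsonConverter(true);   // ObjectMappers register only the Jackson Afterburner module
    JsonConverter withoutExtras = new JsonConverter(false);    // no additional Jackson modules are registered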

View File

@ -16,13 +16,15 @@
*/
package org.apache.kafka.connect.json;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Deserializer;
import com.fasterxml.jackson.core.json.JsonReadFeature;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Deserializer;
import com.fasterxml.jackson.module.afterburner.AfterburnerModule;
import java.util.Collections;
import java.util.Set;
@ -51,13 +53,13 @@ public class JsonDeserializer implements Deserializer<JsonNode> {
JsonDeserializer(
final Set<DeserializationFeature> deserializationFeatures,
final JsonNodeFactory jsonNodeFactory,
final boolean enableModules
final boolean enableAfterburner
) {
objectMapper.enable(JsonReadFeature.ALLOW_LEADING_ZEROS_FOR_NUMBERS.mappedFeature());
deserializationFeatures.forEach(objectMapper::enable);
objectMapper.setNodeFactory(jsonNodeFactory);
if (enableModules) {
objectMapper.findAndRegisterModules();
if (enableAfterburner) {
objectMapper.registerModule(new AfterburnerModule());
}
}

View File

@ -16,12 +16,14 @@
*/
package org.apache.kafka.connect.json;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Serializer;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Serializer;
import com.fasterxml.jackson.module.afterburner.AfterburnerModule;
import java.util.Collections;
import java.util.Set;
@ -50,12 +52,12 @@ public class JsonSerializer implements Serializer<JsonNode> {
JsonSerializer(
final Set<SerializationFeature> serializationFeatures,
final JsonNodeFactory jsonNodeFactory,
final boolean enableModules
final boolean enableAfterburner
) {
serializationFeatures.forEach(objectMapper::enable);
objectMapper.setNodeFactory(jsonNodeFactory);
if (enableModules) {
objectMapper.findAndRegisterModules();
if (enableAfterburner) {
objectMapper.registerModule(new AfterburnerModule());
}
}

View File

@ -33,7 +33,6 @@ public class AutoMQIdentityReplicationPolicy extends IdentityReplicationPolicy {
if (offsetSyncsTopic == null) {
return super.offsetSyncsTopic(clusterAlias);
}
log.info("Using offset syncs topic: {}", offsetSyncsTopic);
return offsetSyncsTopic;
}
@ -43,7 +42,6 @@ public class AutoMQIdentityReplicationPolicy extends IdentityReplicationPolicy {
if (checkpointsTopic == null) {
return super.checkpointsTopic(clusterAlias);
}
log.info("Using checkpoints topic: {}", checkpointsTopic);
return checkpointsTopic;
}
@ -53,7 +51,22 @@ public class AutoMQIdentityReplicationPolicy extends IdentityReplicationPolicy {
if (heartbeatsTopic == null) {
return super.heartbeatsTopic();
}
log.info("Using heartbeats topic: {}", heartbeatsTopic);
return heartbeatsTopic;
}
@Override
public boolean isCheckpointsTopic(String topic) {
String checkpointsTopic = System.getenv(CHECKPOINTS_TOPIC_ENV_KEY);
return super.isCheckpointsTopic(topic) || topic.equals(checkpointsTopic);
}
@Override
public boolean isHeartbeatsTopic(String topic) {
return super.isHeartbeatsTopic(topic) || topic.equals(heartbeatsTopic());
}
@Override
public boolean isMM2InternalTopic(String topic) {
return super.isMM2InternalTopic(topic) || isHeartbeatsTopic(topic) || isCheckpointsTopic(topic);
}
}

View File

@ -219,16 +219,12 @@ public class MirrorConnectorsIntegrationBaseTest {
.build();
primary.start();
primary.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Workers of " + PRIMARY_CLUSTER_ALIAS + "-connect-cluster did not start in time.");
waitForTopicCreated(primary, "mm2-status.backup.internal");
waitForTopicCreated(primary, "mm2-offsets.backup.internal");
waitForTopicCreated(primary, "mm2-configs.backup.internal");
backup.start();
backup.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Workers of " + BACKUP_CLUSTER_ALIAS + "-connect-cluster did not start in time.");
primaryProducer = initializeProducer(primary);
backupProducer = initializeProducer(backup);

View File

@ -124,7 +124,7 @@ import static org.apache.kafka.connect.runtime.ConnectorConfig.VALUE_CONVERTER_C
*/
public abstract class AbstractHerder implements Herder, TaskStatus.Listener, ConnectorStatus.Listener {
private final Logger log = LoggerFactory.getLogger(AbstractHerder.class);
private static final Logger log = LoggerFactory.getLogger(AbstractHerder.class);
private final String workerId;
protected final Worker worker;
@ -1039,21 +1039,30 @@ public abstract class AbstractHerder implements Herder, TaskStatus.Listener, Con
return result;
}
public boolean taskConfigsChanged(ClusterConfigState configState, String connName, List<Map<String, String>> taskProps) {
public static boolean taskConfigsChanged(ClusterConfigState configState, String connName, List<Map<String, String>> rawTaskProps) {
int currentNumTasks = configState.taskCount(connName);
boolean result = false;
if (taskProps.size() != currentNumTasks) {
log.debug("Connector {} task count changed from {} to {}", connName, currentNumTasks, taskProps.size());
if (rawTaskProps.size() != currentNumTasks) {
log.debug("Connector {} task count changed from {} to {}", connName, currentNumTasks, rawTaskProps.size());
result = true;
} else {
}
if (!result) {
for (int index = 0; index < currentNumTasks; index++) {
ConnectorTaskId taskId = new ConnectorTaskId(connName, index);
if (!taskProps.get(index).equals(configState.taskConfig(taskId))) {
if (!rawTaskProps.get(index).equals(configState.rawTaskConfig(taskId))) {
log.debug("Connector {} has change in configuration for task {}-{}", connName, connName, index);
result = true;
}
}
}
if (!result) {
Map<String, String> appliedConnectorConfig = configState.appliedConnectorConfig(connName);
Map<String, String> currentConnectorConfig = configState.connectorConfig(connName);
if (!Objects.equals(appliedConnectorConfig, currentConnectorConfig)) {
log.debug("Forcing task restart for connector {} as its configuration appears to be updated", connName);
result = true;
}
}
if (result) {
log.debug("Reconfiguring connector {}: writing new updated configurations for tasks", connName);
} else {

View File

@ -17,6 +17,7 @@
package org.apache.kafka.connect.runtime.distributed;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;
@ -96,6 +97,10 @@ public class DistributedConfig extends WorkerConfig {
public static final String REBALANCE_TIMEOUT_MS_CONFIG = CommonClientConfigs.REBALANCE_TIMEOUT_MS_CONFIG;
private static final String REBALANCE_TIMEOUT_MS_DOC = CommonClientConfigs.REBALANCE_TIMEOUT_MS_DOC;
public static final String METADATA_RECOVERY_STRATEGY_CONFIG = CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG;
private static final String METADATA_RECOVERY_STRATEGY_DOC = CommonClientConfigs.METADATA_RECOVERY_STRATEGY_DOC;
public static final String DEFAULT_METADATA_RECOVERY_STRATEGY = CommonClientConfigs.DEFAULT_METADATA_RECOVERY_STRATEGY;
/**
* <code>worker.sync.timeout.ms</code>
*/
@ -512,7 +517,14 @@ public class DistributedConfig extends WorkerConfig {
(name, value) -> validateVerificationAlgorithms(crypto, name, (List<String>) value),
() -> "A list of one or more MAC algorithms, each supported by the worker JVM"),
ConfigDef.Importance.LOW,
INTER_WORKER_VERIFICATION_ALGORITHMS_DOC);
INTER_WORKER_VERIFICATION_ALGORITHMS_DOC)
.define(METADATA_RECOVERY_STRATEGY_CONFIG,
ConfigDef.Type.STRING,
DEFAULT_METADATA_RECOVERY_STRATEGY,
ConfigDef.CaseInsensitiveValidString
.in(Utils.enumOptions(MetadataRecoveryStrategy.class)),
ConfigDef.Importance.LOW,
METADATA_RECOVERY_STRATEGY_DOC);
}
private final ExactlyOnceSourceSupport exactlyOnceSourceSupport;
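A minimal sketch of how the new worker setting resolves to a strategy, assuming the standard client strategy names "none" and "rebootstrap"; the same forName call appears later in this diff in WorkerGroupMember:

    // The string value is what an operator would put in the worker properties under
    // CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG; "rebootstrap" here is an assumed example.
    MetadataRecoveryStrategy strategy = MetadataRecoveryStrategy.forName("rebootstrap");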

View File

@ -2229,11 +2229,11 @@ public class DistributedHerder extends AbstractHerder implements Runnable {
}
private void publishConnectorTaskConfigs(String connName, List<Map<String, String>> taskProps, Callback<Void> cb) {
if (!taskConfigsChanged(configState, connName, taskProps)) {
List<Map<String, String>> rawTaskProps = reverseTransform(connName, configState, taskProps);
if (!taskConfigsChanged(configState, connName, rawTaskProps)) {
return;
}
List<Map<String, String>> rawTaskProps = reverseTransform(connName, configState, taskProps);
if (isLeader()) {
writeTaskConfigs(connName, rawTaskProps);
cb.onCompletion(null, null);

View File

@ -20,6 +20,7 @@ import org.apache.kafka.clients.ApiVersions;
import org.apache.kafka.clients.ClientUtils;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.Metadata;
import org.apache.kafka.clients.MetadataRecoveryStrategy;
import org.apache.kafka.clients.NetworkClient;
import org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient;
import org.apache.kafka.clients.GroupRebalanceConfig;
@ -119,7 +120,9 @@ public class WorkerGroupMember {
time,
true,
new ApiVersions(),
logContext);
logContext,
MetadataRecoveryStrategy.forName(config.getString(CommonClientConfigs.METADATA_RECOVERY_STRATEGY_CONFIG))
);
this.client = new ConsumerNetworkClient(
logContext,
netClient,

View File

@ -47,7 +47,7 @@ public class CreateConnectorRequest {
return config;
}
@JsonProperty
@JsonProperty("initial_state")
public InitialState initialState() {
return initialState;
}

View File

@ -519,10 +519,10 @@ public class StandaloneHerder extends AbstractHerder {
}
List<Map<String, String>> newTaskConfigs = recomputeTaskConfigs(connName);
List<Map<String, String>> rawTaskConfigs = reverseTransform(connName, configState, newTaskConfigs);
if (taskConfigsChanged(configState, connName, newTaskConfigs)) {
if (taskConfigsChanged(configState, connName, rawTaskConfigs)) {
removeConnectorTasks(connName);
List<Map<String, String>> rawTaskConfigs = reverseTransform(connName, configState, newTaskConfigs);
configBackingStore.putTaskConfigs(connName, rawTaskConfigs);
createConnectorTasks(connName);
}

View File

@ -0,0 +1,66 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.kafka.connect.storage;
import org.apache.kafka.connect.runtime.WorkerConfigTransformer;
import java.util.Map;
/**
* Wrapper class for a connector configuration that has been used to generate task configurations.
* Supports lazy {@link WorkerConfigTransformer#transform(Map) transformation}.
*/
public class AppliedConnectorConfig {
private final Map<String, String> rawConfig;
private Map<String, String> transformedConfig;
/**
* Create a new applied config that has not yet undergone
* {@link WorkerConfigTransformer#transform(Map) transformation}.
* @param rawConfig the non-transformed connector configuration; may be null
*/
public AppliedConnectorConfig(Map<String, String> rawConfig) {
this.rawConfig = rawConfig;
}
/**
* If necessary, {@link WorkerConfigTransformer#transform(Map) transform} the raw
* connector config, then return the result. Transformed configurations are cached and
* returned in all subsequent calls.
* <p>
* This method is thread-safe: different threads may invoke it at any time and the same
* transformed config should always be returned, with transformation still only ever
* taking place once before its results are cached.
* @param configTransformer the transformer to use, if no transformed connector
* config has been cached yet; may be null
* @return the possibly-cached, transformed, connector config; may be null
*/
public synchronized Map<String, String> transformedConfig(WorkerConfigTransformer configTransformer) {
if (transformedConfig != null || rawConfig == null)
return transformedConfig;
if (configTransformer != null) {
transformedConfig = configTransformer.transform(rawConfig);
} else {
transformedConfig = rawConfig;
}
return transformedConfig;
}
}
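A short usage sketch of the new class (the config value is hypothetical; passing a null transformer is allowed per the javadoc and simply caches the raw config):

    import java.util.Collections;
    import java.util.Map;
    import org.apache.kafka.connect.storage.AppliedConnectorConfig;

    Map<String, String> raw = Collections.singletonMap("topic", "${file:/tmp/secrets.properties:topic}");
    AppliedConnectorConfig applied = new AppliedConnectorConfig(raw);
    // First call resolves (or, with a null transformer, simply adopts) the raw config...
    Map<String, String> first = applied.transformedConfig(null);
    // ...subsequent calls return the cached result, even when invoked from other threads.
    Map<String, String> second = applied.transformedConfig(null);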

View File

@ -43,6 +43,7 @@ public class ClusterConfigState {
Collections.emptyMap(),
Collections.emptyMap(),
Collections.emptyMap(),
Collections.emptyMap(),
Collections.emptySet(),
Collections.emptySet());
@ -55,6 +56,7 @@ public class ClusterConfigState {
final Map<ConnectorTaskId, Map<String, String>> taskConfigs;
final Map<String, Integer> connectorTaskCountRecords;
final Map<String, Integer> connectorTaskConfigGenerations;
final Map<String, AppliedConnectorConfig> appliedConnectorConfigs;
final Set<String> connectorsPendingFencing;
final Set<String> inconsistentConnectors;
@ -66,6 +68,7 @@ public class ClusterConfigState {
Map<ConnectorTaskId, Map<String, String>> taskConfigs,
Map<String, Integer> connectorTaskCountRecords,
Map<String, Integer> connectorTaskConfigGenerations,
Map<String, AppliedConnectorConfig> appliedConnectorConfigs,
Set<String> connectorsPendingFencing,
Set<String> inconsistentConnectors) {
this(offset,
@ -76,6 +79,7 @@ public class ClusterConfigState {
taskConfigs,
connectorTaskCountRecords,
connectorTaskConfigGenerations,
appliedConnectorConfigs,
connectorsPendingFencing,
inconsistentConnectors,
null);
@ -89,6 +93,7 @@ public class ClusterConfigState {
Map<ConnectorTaskId, Map<String, String>> taskConfigs,
Map<String, Integer> connectorTaskCountRecords,
Map<String, Integer> connectorTaskConfigGenerations,
Map<String, AppliedConnectorConfig> appliedConnectorConfigs,
Set<String> connectorsPendingFencing,
Set<String> inconsistentConnectors,
WorkerConfigTransformer configTransformer) {
@ -100,6 +105,7 @@ public class ClusterConfigState {
this.taskConfigs = taskConfigs;
this.connectorTaskCountRecords = connectorTaskCountRecords;
this.connectorTaskConfigGenerations = connectorTaskConfigGenerations;
this.appliedConnectorConfigs = appliedConnectorConfigs;
this.connectorsPendingFencing = connectorsPendingFencing;
this.inconsistentConnectors = inconsistentConnectors;
this.configTransformer = configTransformer;
@ -158,6 +164,19 @@ public class ClusterConfigState {
return connectorConfigs.get(connector);
}
/**
* Get the most recent configuration for the connector from which task configs have
* been generated. The configuration will have been transformed by
* {@link org.apache.kafka.common.config.ConfigTransformer}
* @param connector name of the connector
* @return the connector config, or null if no config exists from which task configs have
* been generated
*/
public Map<String, String> appliedConnectorConfig(String connector) {
AppliedConnectorConfig appliedConfig = appliedConnectorConfigs.get(connector);
return appliedConfig != null ? appliedConfig.transformedConfig(configTransformer) : null;
}
/**
* Get the target state of the connector
* @param connector name of the connector
@ -303,4 +322,5 @@ public class ClusterConfigState {
inconsistentConnectors,
configTransformer);
}
}
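Tying this accessor to the herder change earlier in the diff, a hedged sketch of the extra check AbstractHerder.taskConfigsChanged now performs (configState and connName are assumed to be in scope):

    Map<String, String> appliedConfig = configState.appliedConnectorConfig(connName);
    Map<String, String> currentConfig = configState.connectorConfig(connName);
    // If the connector config changed after its last task configs were generated,
    // the herder forces a task restart even when the raw task configs are identical.
    boolean forceTaskRestart = !java.util.Objects.equals(appliedConfig, currentConfig);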

View File

@ -318,6 +318,7 @@ public class KafkaConfigBackingStore extends KafkaTopicBasedBackingStore impleme
final Map<String, Integer> connectorTaskCountRecords = new HashMap<>();
final Map<String, Integer> connectorTaskConfigGenerations = new HashMap<>();
final Map<String, AppliedConnectorConfig> appliedConnectorConfigs = new HashMap<>();
final Set<String> connectorsPendingFencing = new HashSet<>();
private final WorkerConfigTransformer configTransformer;
@ -478,6 +479,7 @@ public class KafkaConfigBackingStore extends KafkaTopicBasedBackingStore impleme
new HashMap<>(taskConfigs),
new HashMap<>(connectorTaskCountRecords),
new HashMap<>(connectorTaskConfigGenerations),
new HashMap<>(appliedConnectorConfigs),
new HashSet<>(connectorsPendingFencing),
new HashSet<>(inconsistent),
configTransformer
@ -997,11 +999,8 @@ public class KafkaConfigBackingStore extends KafkaTopicBasedBackingStore impleme
synchronized (lock) {
if (value.value() == null) {
// Connector deletion will be written as a null value
processConnectorRemoval(connectorName);
log.info("Successfully processed removal of connector '{}'", connectorName);
connectorConfigs.remove(connectorName);
connectorTaskCounts.remove(connectorName);
taskConfigs.keySet().removeIf(taskId -> taskId.connector().equals(connectorName));
deferredTaskUpdates.remove(connectorName);
removed = true;
} else {
// Connector configs can be applied and callbacks invoked immediately
@ -1064,6 +1063,22 @@ public class KafkaConfigBackingStore extends KafkaTopicBasedBackingStore impleme
private void processTasksCommitRecord(String connectorName, SchemaAndValue value) {
List<ConnectorTaskId> updatedTasks = new ArrayList<>();
synchronized (lock) {
// Edge case: connector was deleted before these task configs were published,
// but compaction took place and both the original connector config and the
// tombstone message for it have been removed from the config topic
// We should ignore these task configs
Map<String, String> appliedConnectorConfig = connectorConfigs.get(connectorName);
if (appliedConnectorConfig == null) {
processConnectorRemoval(connectorName);
log.debug(
"Ignoring task configs for connector {}; it appears that the connector was deleted previously "
+ "and that log compaction has since removed any trace of its previous configurations "
+ "from the config topic",
connectorName
);
return;
}
// Apply any outstanding deferred task updates for the given connector. Note that just because we
// encounter a commit message does not mean it will result in consistent output. In particular due to
// compaction, there may be cases where the resulting view is not consistent. For example, if we have the following sequence of writes:
@ -1111,6 +1126,11 @@ public class KafkaConfigBackingStore extends KafkaTopicBasedBackingStore impleme
connectorTaskConfigGenerations.compute(connectorName, (ignored, generation) -> generation != null ? generation + 1 : 0);
}
inconsistent.remove(connectorName);
appliedConnectorConfigs.put(
connectorName,
new AppliedConnectorConfig(appliedConnectorConfig)
);
}
// Always clear the deferred entries, even if we didn't apply them. If they represented an inconsistent
// update, then we need to see a completely fresh set of configs after this commit message, so we don't
@ -1168,7 +1188,7 @@ public class KafkaConfigBackingStore extends KafkaTopicBasedBackingStore impleme
log.debug("Setting task count record for connector '{}' to {}", connectorName, taskCount);
connectorTaskCountRecords.put(connectorName, taskCount);
// If a task count record appears after the latest task configs, the connectors doesn't need a round of zombie
// If a task count record appears after the latest task configs, the connector doesn't need a round of zombie
// fencing before it can start tasks with the latest configs
connectorsPendingFencing.remove(connectorName);
}
@ -1244,6 +1264,14 @@ public class KafkaConfigBackingStore extends KafkaTopicBasedBackingStore impleme
}
}
private void processConnectorRemoval(String connectorName) {
connectorConfigs.remove(connectorName);
connectorTaskCounts.remove(connectorName);
taskConfigs.keySet().removeIf(taskId -> taskId.connector().equals(connectorName));
deferredTaskUpdates.remove(connectorName);
appliedConnectorConfigs.remove(connectorName);
}
private ConnectorTaskId parseTaskId(String key) {
String[] parts = key.split("-");
if (parts.length < 3) return null;
@ -1314,5 +1342,6 @@ public class KafkaConfigBackingStore extends KafkaTopicBasedBackingStore impleme
else
throw new ConnectException("Expected integer value to be either Integer or Long");
}
}

View File

@ -21,6 +21,8 @@ import org.apache.kafka.connect.runtime.SessionKey;
import org.apache.kafka.connect.runtime.TargetState;
import org.apache.kafka.connect.runtime.WorkerConfigTransformer;
import org.apache.kafka.connect.util.ConnectorTaskId;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.Collections;
import java.util.HashMap;
@ -36,6 +38,8 @@ import java.util.concurrent.TimeUnit;
*/
public class MemoryConfigBackingStore implements ConfigBackingStore {
private static final Logger log = LoggerFactory.getLogger(MemoryConfigBackingStore.class);
private final Map<String, ConnectorState> connectors = new HashMap<>();
private UpdateListener updateListener;
private WorkerConfigTransformer configTransformer;
@ -61,6 +65,7 @@ public class MemoryConfigBackingStore implements ConfigBackingStore {
Map<String, Map<String, String>> connectorConfigs = new HashMap<>();
Map<String, TargetState> connectorTargetStates = new HashMap<>();
Map<ConnectorTaskId, Map<String, String>> taskConfigs = new HashMap<>();
Map<String, AppliedConnectorConfig> appliedConnectorConfigs = new HashMap<>();
for (Map.Entry<String, ConnectorState> connectorStateEntry : connectors.entrySet()) {
String connector = connectorStateEntry.getKey();
@ -69,6 +74,9 @@ public class MemoryConfigBackingStore implements ConfigBackingStore {
connectorConfigs.put(connector, connectorState.connConfig);
connectorTargetStates.put(connector, connectorState.targetState);
taskConfigs.putAll(connectorState.taskConfigs);
if (connectorState.appliedConnConfig != null) {
appliedConnectorConfigs.put(connector, connectorState.appliedConnConfig);
}
}
return new ClusterConfigState(
@ -80,6 +88,7 @@ public class MemoryConfigBackingStore implements ConfigBackingStore {
taskConfigs,
Collections.emptyMap(),
Collections.emptyMap(),
appliedConnectorConfigs,
Collections.emptySet(),
Collections.emptySet(),
configTransformer
@ -123,6 +132,7 @@ public class MemoryConfigBackingStore implements ConfigBackingStore {
HashSet<ConnectorTaskId> taskIds = new HashSet<>(state.taskConfigs.keySet());
state.taskConfigs.clear();
state.appliedConnConfig = null;
if (updateListener != null)
updateListener.onTaskConfigUpdate(taskIds);
@ -137,6 +147,8 @@ public class MemoryConfigBackingStore implements ConfigBackingStore {
Map<ConnectorTaskId, Map<String, String>> taskConfigsMap = taskConfigListAsMap(connector, configs);
state.taskConfigs = taskConfigsMap;
state.applyConfig();
if (updateListener != null)
updateListener.onTaskConfigUpdate(taskConfigsMap.keySet());
}
@ -187,6 +199,7 @@ public class MemoryConfigBackingStore implements ConfigBackingStore {
private TargetState targetState;
private Map<String, String> connConfig;
private Map<ConnectorTaskId, Map<String, String>> taskConfigs;
private AppliedConnectorConfig appliedConnConfig;
/**
* @param connConfig the connector's configuration
@ -197,6 +210,11 @@ public class MemoryConfigBackingStore implements ConfigBackingStore {
this.targetState = targetState == null ? TargetState.STARTED : targetState;
this.connConfig = connConfig;
this.taskConfigs = new HashMap<>();
this.appliedConnConfig = null;
}
public void applyConfig() {
this.appliedConnConfig = new AppliedConnectorConfig(connConfig);
}
}

View File

@ -134,11 +134,6 @@ public class BlockingConnectorTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(
NUM_WORKERS,
"Initial group of workers did not start in time"
);
}
@After

View File

@ -17,19 +17,28 @@
package org.apache.kafka.connect.integration;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.config.provider.FileConfigProvider;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.utils.LogCaptureAppender;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.errors.ConnectException;
import org.apache.kafka.connect.json.JsonConverter;
import org.apache.kafka.connect.json.JsonConverterConfig;
import org.apache.kafka.connect.runtime.distributed.DistributedConfig;
import org.apache.kafka.connect.runtime.distributed.DistributedHerder;
import org.apache.kafka.connect.runtime.rest.entities.ConnectorOffset;
import org.apache.kafka.connect.runtime.rest.entities.ConnectorOffsets;
import org.apache.kafka.connect.runtime.rest.entities.CreateConnectorRequest;
import org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource;
import org.apache.kafka.connect.runtime.rest.errors.ConnectRestException;
import org.apache.kafka.connect.sink.SinkConnector;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;
import org.apache.kafka.connect.storage.KafkaConfigBackingStore;
import org.apache.kafka.connect.storage.StringConverter;
import org.apache.kafka.connect.util.ConnectorTaskId;
import org.apache.kafka.connect.util.SinkUtils;
import org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster;
import org.apache.kafka.connect.util.clusters.WorkerHandle;
import org.apache.kafka.test.IntegrationTest;
@ -39,11 +48,15 @@ import org.junit.Before;
import org.junit.Rule;
import org.junit.Test;
import org.junit.experimental.categories.Category;
import org.junit.rules.TemporaryFolder;
import org.junit.rules.TestRule;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.event.Level;
import java.io.File;
import java.io.FileOutputStream;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
@ -53,9 +66,14 @@ import java.util.Properties;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import static javax.ws.rs.core.Response.Status.INTERNAL_SERVER_ERROR;
import static org.apache.kafka.clients.CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG;
import static org.apache.kafka.common.config.AbstractConfig.CONFIG_PROVIDERS_CONFIG;
import static org.apache.kafka.common.config.TopicConfig.DELETE_RETENTION_MS_CONFIG;
import static org.apache.kafka.common.config.TopicConfig.SEGMENT_MS_CONFIG;
import static org.apache.kafka.connect.integration.BlockingConnectorTest.TASK_STOP;
import static org.apache.kafka.connect.integration.MonitorableSourceConnector.TOPIC_CONFIG;
import static org.apache.kafka.connect.runtime.ConnectorConfig.CONNECTOR_CLASS_CONFIG;
@ -71,6 +89,7 @@ import static org.apache.kafka.connect.runtime.TopicCreationConfig.REPLICATION_F
import static org.apache.kafka.connect.runtime.WorkerConfig.CONNECTOR_CLIENT_POLICY_CLASS_CONFIG;
import static org.apache.kafka.connect.runtime.WorkerConfig.OFFSET_COMMIT_INTERVAL_MS_CONFIG;
import static org.apache.kafka.connect.runtime.WorkerConfig.TASK_SHUTDOWN_GRACEFUL_TIMEOUT_MS_CONFIG;
import static org.apache.kafka.connect.runtime.distributed.DistributedConfig.CONFIG_STORAGE_PREFIX;
import static org.apache.kafka.connect.runtime.distributed.DistributedConfig.CONFIG_TOPIC_CONFIG;
import static org.apache.kafka.connect.runtime.distributed.DistributedConfig.SCHEDULED_REBALANCE_MAX_DELAY_MS_CONFIG;
import static org.apache.kafka.connect.runtime.distributed.DistributedConfig.REBALANCE_TIMEOUT_MS_CONFIG;
@ -108,6 +127,9 @@ public class ConnectWorkerIntegrationTest {
@Rule
public TestRule watcher = ConnectIntegrationTestUtils.newTestWatcher(log);
@Rule
public TemporaryFolder tmp = new TemporaryFolder();
@Before
public void setup() {
// setup Connect worker properties
@ -150,9 +172,6 @@ public class ConnectWorkerIntegrationTest {
// set up props for the source connector
Map<String, String> props = defaultSourceConnectorProps(TOPIC_NAME);
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
// start a source connector
connect.configureConnector(CONNECTOR_NAME, props);
@ -196,9 +215,6 @@ public class ConnectWorkerIntegrationTest {
props.put(TASKS_MAX_CONFIG, Objects.toString(numTasks));
props.put(CONNECTOR_CLIENT_PRODUCER_OVERRIDES_PREFIX + BOOTSTRAP_SERVERS_CONFIG, "nobrokerrunningatthisaddress");
connect.assertions().assertExactlyNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
// Try to start the connector and its single task.
connect.configureConnector(CONNECTOR_NAME, props);
@ -236,9 +252,6 @@ public class ConnectWorkerIntegrationTest {
// set up props for the source connector
Map<String, String> props = defaultSourceConnectorProps(TOPIC_NAME);
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
// start a source connector
connect.configureConnector(CONNECTOR_NAME, props);
@ -290,9 +303,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
// base connector props
Map<String, String> props = defaultSourceConnectorProps(TOPIC_NAME);
props.put(CONNECTOR_CLASS_CONFIG, MonitorableSourceConnector.class.getSimpleName());
@ -330,8 +340,6 @@ public class ConnectWorkerIntegrationTest {
.build();
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(1, "Initial group of workers did not start in time.");
// and when the connector is not configured to create topics
Map<String, String> props = defaultSourceConnectorProps("nonexistenttopic");
props.remove(DEFAULT_TOPIC_CREATION_PREFIX + REPLICATION_FACTOR_CONFIG);
@ -383,9 +391,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
// Want to make sure to use multiple tasks
final int numTasks = 4;
Map<String, String> props = defaultSourceConnectorProps(TOPIC_NAME);
@ -475,9 +480,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
Map<String, String> props = defaultSourceConnectorProps(TOPIC_NAME);
// Fail the connector on startup
props.put("connector.start.inject.error", "true");
@ -550,11 +552,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(
NUM_WORKERS,
"Initial group of workers did not start in time."
);
connect.configureConnector(CONNECTOR_NAME, defaultSourceConnectorProps(TOPIC_NAME));
connect.assertions().assertConnectorAndExactlyNumTasksAreRunning(
CONNECTOR_NAME,
@ -578,9 +575,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
CreateConnectorRequest createConnectorRequest = new CreateConnectorRequest(
CONNECTOR_NAME,
defaultSourceConnectorProps(TOPIC_NAME),
@ -612,9 +606,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
Map<String, String> props = defaultSourceConnectorProps(TOPIC_NAME);
// Configure the connector to produce a maximum of 10 messages
@ -665,9 +656,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
// Create topic and produce 10 messages
connect.kafka().createTopic(TOPIC_NAME);
for (int i = 0; i < 10; i++) {
@ -732,9 +720,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
// Create a connector with PAUSED initial state
CreateConnectorRequest createConnectorRequest = new CreateConnectorRequest(
CONNECTOR_NAME,
@ -784,9 +769,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
connect.kafka().createTopic(TOPIC_NAME);
Map<String, String> props = defaultSinkConnectorProps(TOPIC_NAME);
@ -837,8 +819,6 @@ public class ConnectWorkerIntegrationTest {
.numWorkers(1)
.build();
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(1,
"Worker did not start in time");
Map<String, String> connectorConfig1 = defaultSourceConnectorProps(TOPIC_NAME);
Map<String, String> connectorConfig2 = new HashMap<>(connectorConfig1);
@ -905,8 +885,6 @@ public class ConnectWorkerIntegrationTest {
connect.start();
connect.assertions().assertExactlyNumWorkersAreUp(1, "Worker not brought up in time");
Map<String, String> connectorWithBlockingTaskStopConfig = new HashMap<>();
connectorWithBlockingTaskStopConfig.put(CONNECTOR_CLASS_CONFIG, BlockingConnectorTest.BlockingSourceConnector.class.getName());
connectorWithBlockingTaskStopConfig.put(TASKS_MAX_CONFIG, "1");
@ -986,11 +964,6 @@ public class ConnectWorkerIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(
NUM_WORKERS,
"Initial group of workers did not start in time."
);
Map<String, String> connectorProps = defaultSourceConnectorProps(TOPIC_NAME);
int maxTasks = 1;
connectorProps.put(TASKS_MAX_CONFIG, Integer.toString(maxTasks));
@ -1123,6 +1096,249 @@ public class ConnectWorkerIntegrationTest {
);
}
/**
* Task configs are not removed from the config topic after a connector is deleted.
* When topic compaction takes place, this can cause the tombstone message for the
* connector config to be deleted, leaving the task configs in the config topic with no
* explicit record of the connector's deletion.
* <p>
* This test guarantees that those older task configs are never used, even when the
* connector is recreated later.
*/
@Test
public void testCompactedDeletedOlderConnectorConfig() throws Exception {
brokerProps.put("log.cleaner.backoff.ms", "100");
brokerProps.put("log.cleaner.delete.retention.ms", "1");
brokerProps.put("log.cleaner.max.compaction.lag.ms", "1");
brokerProps.put("log.cleaner.min.cleanable.ratio", "0");
brokerProps.put("log.cleaner.min.compaction.lag.ms", "1");
brokerProps.put("log.cleaner.threads", "1");
final String configTopic = "kafka-16838-configs";
final int offsetCommitIntervalMs = 100;
workerProps.put(CONFIG_TOPIC_CONFIG, configTopic);
workerProps.put(CONFIG_STORAGE_PREFIX + SEGMENT_MS_CONFIG, "100");
workerProps.put(CONFIG_STORAGE_PREFIX + DELETE_RETENTION_MS_CONFIG, "1");
workerProps.put(OFFSET_COMMIT_INTERVAL_MS_CONFIG, Integer.toString(offsetCommitIntervalMs));
final int numWorkers = 1;
connect = connectBuilder
.numWorkers(numWorkers)
.build();
// start the clusters
connect.start();
final String connectorTopic = "connector-topic";
connect.kafka().createTopic(connectorTopic, 1);
ConnectorHandle connectorHandle = RuntimeHandles.get().connectorHandle(CONNECTOR_NAME);
connectorHandle.expectedCommits(NUM_TASKS * 2);
Map<String, String> connectorConfig = defaultSourceConnectorProps(connectorTopic);
connect.configureConnector(CONNECTOR_NAME, connectorConfig);
connect.assertions().assertConnectorAndExactlyNumTasksAreRunning(
CONNECTOR_NAME,
NUM_TASKS,
"Connector or its tasks did not start in time"
);
connectorHandle.awaitCommits(offsetCommitIntervalMs * 3);
connect.deleteConnector(CONNECTOR_NAME);
// Roll the entire cluster
connect.activeWorkers().forEach(connect::removeWorker);
// Miserable hack: produce directly to the config topic and then wait a little bit
// in order to trigger segment rollover and allow compaction to take place
connect.kafka().produce(configTopic, "garbage-key-1", null);
Thread.sleep(1_000);
connect.kafka().produce(configTopic, "garbage-key-2", null);
Thread.sleep(1_000);
for (int i = 0; i < numWorkers; i++)
connect.addWorker();
connect.assertions().assertAtLeastNumWorkersAreUp(
numWorkers,
"Workers did not start in time after the cluster was rolled."
);
final TopicPartition connectorTopicPartition = new TopicPartition(connectorTopic, 0);
final long initialEndOffset = connect.kafka().endOffset(connectorTopicPartition);
assertTrue(
"Source connector should have published at least one record to Kafka",
initialEndOffset > 0
);
connectorHandle.expectedCommits(NUM_TASKS * 2);
// Re-create the connector with a different config (targets a different topic)
final String otherConnectorTopic = "other-topic";
connect.kafka().createTopic(otherConnectorTopic, 1);
connectorConfig.put(TOPIC_CONFIG, otherConnectorTopic);
connect.configureConnector(CONNECTOR_NAME, connectorConfig);
connect.assertions().assertConnectorAndExactlyNumTasksAreRunning(
CONNECTOR_NAME,
NUM_TASKS,
"Connector or its tasks did not start in time"
);
connectorHandle.awaitCommits(offsetCommitIntervalMs * 3);
// See if any new records got written to the old topic
final long nextEndOffset = connect.kafka().endOffset(connectorTopicPartition);
assertEquals(
"No new records should have been written to the older topic",
initialEndOffset,
nextEndOffset
);
}
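// Illustrative sketch only, not part of the upstream test: dump the surviving records of the
// compacted config topic so the scenario described in the Javadoc above is observable. It
// assumes the "connector-<name>" / "task-<name>-<n>" key convention used by
// KafkaConfigBackingStore, and that the usual client imports (ConsumerConfig, KafkaConsumer,
// ConsumerRecord, ConsumerRecords, StringDeserializer, ByteArrayDeserializer, Duration) are
// available in this file.
private Map<String, Integer> configTopicRecordCountsByKey(String configTopic) {
    Map<String, Object> consumerProps = new HashMap<>();
    consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, connect.kafka().bootstrapServers());
    Map<String, Integer> countsByKey = new HashMap<>();
    try (KafkaConsumer<String, byte[]> consumer =
            new KafkaConsumer<>(consumerProps, new StringDeserializer(), new ByteArrayDeserializer())) {
        // The config topic always has a single partition, so read it from the beginning
        TopicPartition configPartition = new TopicPartition(configTopic, 0);
        consumer.assign(Collections.singletonList(configPartition));
        consumer.seekToBeginning(Collections.singletonList(configPartition));
        // A single poll is enough for the handful of records these tests write. After
        // aggressive compaction, the tombstone for "connector-<name>" may be gone while a
        // stale "task-<name>-0" record survives, which is exactly what the test guards against.
        ConsumerRecords<String, byte[]> records = consumer.poll(Duration.ofSeconds(5));
        for (ConsumerRecord<String, byte[]> record : records)
            countsByKey.merge(record.key(), 1, Integer::sum);
    }
    return countsByKey;
}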
/**
* If a connector has existing tasks, and then generates new task configs, workers compare the
* new and existing configs before publishing them to the config topic. If there is no difference,
* workers do not publish task configs (this is a workaround to prevent infinite loops with eager
* rebalancing).
* <p>
* This test tries to guarantee that, if the old task configs become invalid because of
* an invalid config provider reference, it will still be possible to reconfigure the connector.
*/
@Test
public void testReconfigureConnectorWithFailingTaskConfigs() throws Exception {
final int offsetCommitIntervalMs = 100;
workerProps.put(CONFIG_PROVIDERS_CONFIG, "file");
workerProps.put(CONFIG_PROVIDERS_CONFIG + ".file.class", FileConfigProvider.class.getName());
workerProps.put(OFFSET_COMMIT_INTERVAL_MS_CONFIG, Integer.toString(offsetCommitIntervalMs));
final int numWorkers = 1;
connect = connectBuilder
.numWorkers(numWorkers)
.build();
// start the clusters
connect.start();
final String firstConnectorTopic = "connector-topic-1";
connect.kafka().createTopic(firstConnectorTopic);
final File secretsFile = tmp.newFile("test-secrets");
final Properties secrets = new Properties();
final String throughputSecretKey = "secret-throughput";
secrets.put(throughputSecretKey, "10");
try (FileOutputStream secretsOutputStream = new FileOutputStream(secretsFile)) {
secrets.store(secretsOutputStream, null);
}
ConnectorHandle connectorHandle = RuntimeHandles.get().connectorHandle(CONNECTOR_NAME);
connectorHandle.expectedCommits(NUM_TASKS * 2);
Map<String, String> connectorConfig = defaultSourceConnectorProps(firstConnectorTopic);
connectorConfig.put(
"throughput",
"${file:" + secretsFile.getAbsolutePath() + ":" + throughputSecretKey + "}"
);
connect.configureConnector(CONNECTOR_NAME, connectorConfig);
connect.assertions().assertConnectorAndExactlyNumTasksAreRunning(
CONNECTOR_NAME,
NUM_TASKS,
"Connector or its tasks did not start in time"
);
connectorHandle.awaitCommits(offsetCommitIntervalMs * 3);
// Delete the secrets file, which should render the old task configs invalid
assertTrue("Failed to delete secrets file", secretsFile.delete());
// Use a start latch here instead of assertConnectorAndExactlyNumTasksAreRunning
// since failure to reconfigure the tasks (which may occur if the bug this test was written
// to help catch resurfaces) will not cause existing tasks to fail or stop running
StartAndStopLatch restarts = connectorHandle.expectedStarts(1);
final String secondConnectorTopic = "connector-topic-2";
connect.kafka().createTopic(secondConnectorTopic, 1);
// Stop using the config provider for this connector, and instruct it to start writing to the
// new topic
connectorConfig.put("throughput", "10");
connectorConfig.put(TOPIC_CONFIG, secondConnectorTopic);
connect.configureConnector(CONNECTOR_NAME, connectorConfig);
assertTrue(
"Connector tasks were not restarted in time",
restarts.await(10, TimeUnit.SECONDS)
);
// Wait for at least one task to commit offsets after being restarted
connectorHandle.expectedCommits(1);
connectorHandle.awaitCommits(offsetCommitIntervalMs * 3);
final long endOffset = connect.kafka().endOffset(new TopicPartition(secondConnectorTopic, 0));
assertTrue(
"Source connector should have published at least one record to new Kafka topic "
+ "after being reconfigured",
endOffset > 0
);
}
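// Illustrative sketch only, not part of the upstream test: how the "${file:<path>:<key>}"
// reference configured above is resolved by the worker. Once the secrets file is deleted,
// FileConfigProvider#get throws a ConfigException, which is what makes the previously
// published task configs unusable and forces the reconfiguration path this test exercises.
// Assumes java.io.IOException is imported; FileConfigProvider is already referenced above.
private static String resolveFileReference(String secretsPath, String key) throws IOException {
    FileConfigProvider provider = new FileConfigProvider();
    provider.configure(Collections.emptyMap());
    try {
        // Returns the raw value for the key, e.g. "10" for the throughput secret above;
        // throws a ConfigException if the file can no longer be read.
        return provider.get(secretsPath, Collections.singleton(key)).data().get(key);
    } finally {
        provider.close();
    }
}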
@Test
public void testRuntimePropertyReconfiguration() throws Exception {
final int offsetCommitIntervalMs = 1_000;
// force fast offset commits
workerProps.put(OFFSET_COMMIT_INTERVAL_MS_CONFIG, Integer.toString(offsetCommitIntervalMs));
connect = connectBuilder.build();
// start the clusters
connect.start();
final String topic = "kafka9228";
connect.kafka().createTopic(topic, 1);
connect.kafka().produce(topic, "non-json-value");
Map<String, String> connectorConfig = new HashMap<>();
connectorConfig.put(CONNECTOR_CLASS_CONFIG, EmptyTaskConfigsConnector.class.getName());
connectorConfig.put(TASKS_MAX_CONFIG, "1");
connectorConfig.put(TOPICS_CONFIG, topic);
// Initially configure the connector to use the JSON converter, which should cause task failure(s)
connectorConfig.put(VALUE_CONVERTER_CLASS_CONFIG, JsonConverter.class.getName());
connectorConfig.put(
VALUE_CONVERTER_CLASS_CONFIG + "." + JsonConverterConfig.SCHEMAS_ENABLE_CONFIG,
"false"
);
connect.configureConnector(CONNECTOR_NAME, connectorConfig);
connect.assertions().assertConnectorIsRunningAndTasksHaveFailed(
CONNECTOR_NAME,
1,
"Connector did not start or task did not fail in time"
);
assertEquals(
"Connector should not have any committed offsets when only task fails on first record",
new ConnectorOffsets(Collections.emptyList()),
connect.connectorOffsets(CONNECTOR_NAME)
);
// Reconfigure the connector to use the string converter, which should not cause any more task failures
connectorConfig.put(VALUE_CONVERTER_CLASS_CONFIG, StringConverter.class.getName());
connectorConfig.remove(
VALUE_CONVERTER_CLASS_CONFIG + "." + JsonConverterConfig.SCHEMAS_ENABLE_CONFIG
);
connect.configureConnector(CONNECTOR_NAME, connectorConfig);
connect.assertions().assertConnectorAndExactlyNumTasksAreRunning(
CONNECTOR_NAME,
1,
"Connector or tasks did not start in time"
);
Map<String, Object> expectedOffsetKey = new HashMap<>();
expectedOffsetKey.put(SinkUtils.KAFKA_TOPIC_KEY, topic);
expectedOffsetKey.put(SinkUtils.KAFKA_PARTITION_KEY, 0);
Map<String, Object> expectedOffsetValue = Collections.singletonMap(SinkUtils.KAFKA_OFFSET_KEY, 1);
ConnectorOffset expectedOffset = new ConnectorOffset(expectedOffsetKey, expectedOffsetValue);
ConnectorOffsets expectedOffsets = new ConnectorOffsets(Collections.singletonList(expectedOffset));
// Wait for it to commit offsets, signaling that it has successfully processed the record we produced earlier
waitForCondition(
() -> expectedOffsets.equals(connect.connectorOffsets(CONNECTOR_NAME)),
offsetCommitIntervalMs * 2,
"Task did not successfully process record and/or commit offsets in time"
);
}
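// Illustrative sketch only, not part of the upstream test: why the initial JsonConverter
// configuration above fails on the "non-json-value" record while the StringConverter
// reconfiguration succeeds. Assumes org.apache.kafka.connect.errors.DataException and
// java.nio.charset.StandardCharsets are imported; the converter classes are already
// referenced by the test above.
private static void demonstrateConverterBehavior() {
    byte[] nonJson = "non-json-value".getBytes(StandardCharsets.UTF_8);

    JsonConverter jsonConverter = new JsonConverter();
    jsonConverter.configure(Collections.singletonMap(JsonConverterConfig.SCHEMAS_ENABLE_CONFIG, "false"), false);
    try {
        // The sink task hits this on its very first record, which is why it fails before
        // committing any offsets.
        jsonConverter.toConnectData("kafka9228", nonJson);
    } catch (DataException expected) {
        // expected: the payload is not valid JSON
    }

    StringConverter stringConverter = new StringConverter();
    stringConverter.configure(Collections.emptyMap(), false);
    // Succeeds: the value is handed to the task as a plain string, so the record can be
    // processed and offset 1 committed, as asserted at the end of the test above.
    stringConverter.toConnectData("kafka9228", nonJson);
}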
private Map<String, String> defaultSourceConnectorProps(String topic) {
// setup props for the source connector
Map<String, String> props = new HashMap<>();
@@ -1137,4 +1353,60 @@ public class ConnectWorkerIntegrationTest {
props.put(DEFAULT_TOPIC_CREATION_PREFIX + PARTITIONS_CONFIG, String.valueOf(1));
return props;
}
public static class EmptyTaskConfigsConnector extends SinkConnector {
@Override
public String version() {
return "0.0";
}
@Override
public void start(Map<String, String> props) {
// no-op
}
@Override
public Class<? extends Task> taskClass() {
return SimpleTask.class;
}
@Override
public List<Map<String, String>> taskConfigs(int maxTasks) {
return IntStream.range(0, maxTasks)
.mapToObj(i -> Collections.<String, String>emptyMap())
.collect(Collectors.toList());
}
@Override
public void stop() {
// no-op
}
@Override
public ConfigDef config() {
return new ConfigDef();
}
}
public static class SimpleTask extends SinkTask {
@Override
public String version() {
return "0.0";
}
@Override
public void start(Map<String, String> props) {
// no-op
}
@Override
public void put(Collection<SinkRecord> records) {
// no-op
}
@Override
public void stop() {
// no-op
}
}
}


@@ -121,8 +121,6 @@ public class ConnectorClientPolicyIntegrationTest {
// start the clusters
connect.start();
connect.assertions().assertAtLeastNumWorkersAreUp(NUM_WORKERS,
"Initial group of workers did not start in time.");
return connect;
}

Some files were not shown because too many files have changed in this diff.