Includes:
- New API to authorize by resource type
- Default implementation for the method that supports super users and ACLs
- Optimized implementation in AclAuthorizer that supports ACLs, super users and allow.everyone.if.no.acl.found
- Benchmarks and tests
- `InitProducerIdRequest` is authorized for Cluster:IdempotentWrite or WRITE to any topic; `ProduceRequest` is authorized only against the topic, even when idempotent (see the sketch below)
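For illustration, a hedged sketch of how a broker-side check might use the new entry point; the helper class and method here are hypothetical, not the actual KafkaApis code:
```java
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.resource.ResourceType;
import org.apache.kafka.server.authorizer.AuthorizableRequestContext;
import org.apache.kafka.server.authorizer.AuthorizationResult;
import org.apache.kafka.server.authorizer.Authorizer;

final class IdempotentWriteAuthSketch {
    // InitProducerId may proceed if the principal has IDEMPOTENT_WRITE on the
    // cluster, or WRITE on any topic; one call per resource type instead of
    // enumerating every topic ACL.
    static boolean canInitProducerId(Authorizer authorizer, AuthorizableRequestContext ctx) {
        return authorizer.authorizeByResourceType(ctx, AclOperation.IDEMPOTENT_WRITE, ResourceType.CLUSTER)
                    == AuthorizationResult.ALLOWED
            || authorizer.authorizeByResourceType(ctx, AclOperation.WRITE, ResourceType.TOPIC)
                    == AuthorizationResult.ALLOWED;
    }
}
```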
Reviewers: Lucas Bradstreet <lucas@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
This patch follows up https://github.com/apache/kafka/pull/9547. It refactors AbstractFetcherThread and its descendants to use `OffsetForLeaderEpochRequestData.OffsetForLeaderPartition` instead of `OffsetsForLeaderEpochRequest.PartitionData`. The patch relies on existing tests.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
* The naming for `ListOffsets` was inconsistent: some places used `ListOffset` and others used
`ListOffsets`. Standardized on the latter since it was used in metrics and the protocol documentation.
* Removed unused methods in ApiKeys.
* Deleted `CommonFields`.
* Added `lowestSupportedVersion` and `highestSupportedVersion` to `ApiMessageType`.
* Removed tests in `MessageTest` that are no longer relevant.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Print out the feature flags received at DEBUG level, as well as the other version information.
Example log line:
[2020-11-03 17:47:17,076] DEBUG Node 0 has finalizedFeaturesEpoch: 42, finalizedFeatures: [FinalizedFeatureKey(name='feature_1', maxVersionLevel=2, minVersionLevel=1), FinalizedFeatureKey(name='feature_2', maxVersionLevel=4, minVersionLevel=3)], supportedFeatures: [SupportedFeatureKey(name='feature_1', minVersion=1, maxVersion=2), SupportedFeatureKey(name='feature_2', minVersion=3, maxVersion=4)] (org.apache.kafka.clients.NetworkClient:926)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This PR introduces a new interface `TransferableChannel` to replace GatheringByteChannel and avoid casting in the write path. `TransferableChannel` extends GatheringByteChannel with the minimal set of methods required by the Send interface. Supporting TLS and efficient zero-copy transfers are the main reasons for the additional methods.
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Missed this in #9729. The substitution in `markCoordinatorUnknown` does not work because the argument is not provided as a parameter.
Reviewers: Ismael Juma <ismael@juma.me.uk>
When a consumer encounters an issue that causes it to mark the coordinator as unknown, the error message it prints gives little context about what triggered the transition. This change includes the response error that triggered the transition, or any other cause if it was not triggered by an error code in a response.
Reviewers: Jason Gustafson <jason@confluent.io>
This patch updates the request logger to output request and response payloads in JSON. Payloads are converted to JSON based on their auto-generated schema.
Reviewers: Lucas Bradstreet <lucas@confluent.io>, David Mao <dmao@confluent.io>, David Jacot <djacot@confluent.io>
Set it as a cluster action and update the handler in KafkaApis. We keep the `throttleTimeMs` field
since we intend to enable throttling in the future (especially relevant when we switch to the
built-in quorum mode).
Reviewers: David Arthur <mumrah@gmail.com>
This PR adds support for IP entities to the `DescribeClientQuotas` and `AlterClientQuotas` APIs. This PR also adds support for describing/altering IP quotas via `kafka-configs` tooling.
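For example, a hedged sketch of altering an IP connection-creation-rate quota through the Admin API; the quota key name is an assumption based on KIP-612:
```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.quota.ClientQuotaAlteration;
import org.apache.kafka.common.quota.ClientQuotaEntity;
import org.apache.kafka.common.quota.ClientQuotaFilter;

public class IpQuotaSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Quota entity keyed by IP address rather than user/client-id.
            ClientQuotaEntity entity = new ClientQuotaEntity(
                Collections.singletonMap(ClientQuotaEntity.IP, "10.0.0.1"));
            // "connection_creation_rate" is assumed here as the IP quota key.
            ClientQuotaAlteration alteration = new ClientQuotaAlteration(
                entity,
                Collections.singleton(new ClientQuotaAlteration.Op("connection_creation_rate", 10.0)));
            admin.alterClientQuotas(Collections.singleton(alteration)).all().get();
            System.out.println(admin.describeClientQuotas(ClientQuotaFilter.all()).entities().get());
        }
    }
}
```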
Reviewers: Brian Byrne <bbyrne@confluent.io>, Anna Povzner <anna@confluent.io>, David Jacot <djacot@confluent.io>
As suggested, ensure InvalidProducerEpoch gets caught properly on the Streams side.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Connection id is now only present in `NetworkSend`, which is now
the class used by `Selector`/`NetworkClient`/`KafkaChannel` (which
works well since `NetworkReceive` is the class used for
received data).
The previous `NetworkSend` was also responsible for adding a size
prefix. This logic is already present in `SendBuilder`, but for the
minority of cases where `SendBuilder` is not used (including
a number of tests), we now have `ByteBufferSend.sizePrefixed()`.
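For context, the framing that `sizePrefixed()` provides amounts to the following (a self-contained sketch of the wire format, not the actual implementation):
```java
import java.nio.ByteBuffer;

final class SizePrefixSketch {
    // Returns the buffers a gathering write would transmit: a 4-byte
    // big-endian length header followed by the payload itself.
    static ByteBuffer[] sizePrefixed(ByteBuffer payload) {
        ByteBuffer size = ByteBuffer.allocate(4);
        size.putInt(payload.remaining());
        size.flip();
        return new ByteBuffer[] { size, payload };
    }
}
```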
With regard to the request/message utilities:
* Renamed `toByteBuffer`/`toBytes` in `MessageUtil` to
`toVersionPrefixedByteBuffer`/`toVersionPrefixedBytes` for clarity.
* Introduced new `MessageUtil.toByteBuffer` that does not include
the version as the prefix.
* Renamed `serializeBody` in `AbstractRequest/Response` to
`serialize` for symmetry with `parse`.
* Introduced `RequestTestUtils` and moved relevant methods from
`TestUtils`.
* Moved `serializeWithHeader` methods that were only used in
tests to `RequestTestUtils`.
* Deleted `MessageTestUtil`.
Finally, a couple of changes to simplify coding patterns:
* Added `flip()` and `buffer()` to `ByteBufferAccessor`.
* Added `MessageSizeAccumulator.sizeExcludingZeroCopy`.
* Used lambdas instead of `TestCondition`.
* Used `Arrays.copyOf` instead of `System.arraycopy` in `MessageUtil`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
This reverts commit 8a59a22881 since it breaks
client configurations like `bootstrap.servers=SASL_PLAINTEXT://localhost:49767`.
A KIP will be submitted to discuss the details and an adjusted change will
be submitted depending on the outcome of that.
This patch ensures that the leader is included among the voters in the `LeaderChangeMessage`. It also adds an additional field for the set of granting voters, which was originally specified in KIP-595.
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Jason Gustafson <jason@confluent.io>
Add a total timeout for forwarding, including the underlying broker-to-controller channel timeout setting.
Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>
Generated request/response classes have code to serialize/deserialize directly to
`ByteBuffer` so the intermediate conversion to `Struct` can be skipped for them.
We have recently completed the transition to generated request/response classes,
so we can also remove the `Struct` based fallbacks.
Additional noteworthy changes:
* `AbstractRequest.parseRequest` has a more efficient computation of request size that
relies on the received buffer instead of the parsed `Struct`.
* Use `SendBuilder` for `AbstractRequest/Response` `toSend`, made the superclass
implementation final and removed the overrides that are no longer necessary.
* Removed request/response constructors that assume latest version as they are unsafe
outside of tests.
* Removed redundant version fields in requests/responses.
* Removed unnecessary work in `OffsetFetchResponse`'s constructor when version >= 2.
* Made `AbstractResponse.throttleTimeMs()` abstract.
* Using `toSend` in `SaslClientAuthenticator` instead of `serialize`.
* Various changes in Request/Response classes to make them more consistent and to
rely on the Data classes as much as possible when it comes to their state.
* Remove the version argument from `AbstractResponse.toString`.
* Fix `getErrorResponse` for `ProduceRequest` and `DescribeClientQuotasRequest` to
use `ApiError` which processes the error message sent back to the clients. This was
uncovered by an accidental fix to a `RequestResponseTest` test (it was calling
`AbstractResponse.toString` instead of `AbstractResponse.toString(short)`).
Rely on existing protocol tests to ensure this refactoring does not change
observed behavior (aside from improved performance).
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This PR adds support for generating snapshot for KIP-630.
1. Adds the interfaces `RawSnapshotWriter` and `RawSnapshotReader` and the implementations `FileRawSnapshotWriter` and `FileRawSnapshotReader`, respectively. These interfaces and implementations are a low-level API for writing and reading snapshots. They are internal to the Raft implementation and are not exposed to the users of `RaftClient`. They operate at the `Record` level. These types are exposed to the `RaftClient` through the `ReplicatedLog` interface.
2. Adds a buffered snapshot writer: `SnapshotWriter<T>`. This is a higher-level type exposed through the `RaftClient` interface. A future PR will add the related `SnapshotReader<T>`, which will be used by the state machine to load a snapshot.
Reviewers: Jason Gustafson <jason@confluent.io>
For MemberIdRequiredException, we no longer print the exception at INFO with the full exception message, since it may introduce more confusion than clarity.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Boyang Chen <boyang@confluent.io>
This patch follows up https://github.com/apache/kafka/pull/9547. It refactors KafkaApis, ReplicaManager and Partition to use `OffsetForLeaderEpochResponseData.EpochEndOffset` instead of `EpochEndOffset`. It also removes `OffsetsForLeaderEpochRequest#epochsByTopicPartition` and `OffsetsForLeaderEpochResponse#responses` and replaces their usages with direct use of the automated protocol. Finally, it removes old constructors in `OffsetsForLeaderEpochResponse`. The patch relies on existing tests.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
As a follow-up from [KIP-482](https://cwiki.apache.org/confluence/display/KAFKA/KIP-482%3A+The+Kafka+Protocol+should+Support+Optional+Tagged+Fields), this PR bumps the version for several
RPCs to enable tagged fields via the flexible versioning mechanism.
Additionally, a new IBP version `KAFKA_2_8_IV0` is introduced to
allow replication to take advantage of these new RPC versions for
OffsetForLeaderEpoch and ListOffset.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
We use a background thread for Kerberos to perform re-login before tickets expire. The thread performs logout() followed by login(), relying on the Java library to clear and then populate credentials in Subject. This leaves a timing window where clients fail to authenticate because credentials are not available. We cannot introduce any form of locking since authentication is performed on the network thread. So this commit treats NO_CRED as a transient failure rather than a fatal authentication exception in clients.
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
This patch factors out some common parsing logic from `NetworkClient.parseResponse` and `AbstractResponse.parseResponse`. As a result of this refactor, we are now verifying the correlationId in forwarded requests. This patch also adds a test case to verify the handling of this case.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Boyang Chen <boyang@confluent.io>
This patch changes the grouping of `Send` objects created by `SendBuilder` in order to reduce the number of generated `Send` objects and thereby the number of system writes.
Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
This PR migrates the OffsetsForLeaderEpoch request/response to the automated protocol. It also refactors the OffsetsForLeaderEpochClient to use directly the internal structs generated by the automated protocol. It relies on the existing tests.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
As decided in KIP-516, the UUID class should be named Uuid. Change all instances of
org.apache.kafka.common.UUID to org.apache.kafka.common.Uuid.
Also modify Uuid so that it stores two `long` fields instead of wrapping java.util.UUID
to reduce memory usage.
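A hedged sketch of the new layout; the real class also has factory methods, a base64 string form, and so on:
```java
public class Uuid {
    // Two primitive longs instead of a reference to java.util.UUID: this saves
    // an object header and a pointer per id, which matters at partition scale.
    private final long mostSignificantBits;
    private final long leastSignificantBits;

    public Uuid(long mostSigBits, long leastSigBits) {
        this.mostSignificantBits = mostSigBits;
        this.leastSignificantBits = leastSigBits;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        Uuid other = (Uuid) obj;
        return mostSignificantBits == other.mostSignificantBits
            && leastSignificantBits == other.leastSignificantBits;
    }

    @Override
    public int hashCode() {
        long xor = mostSignificantBits ^ leastSignificantBits;
        return (int) (xor >> 32) ^ (int) xor;
    }
}
```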
Reviewers: Ismael Juma <ismael@juma.me.uk>
Ensures INVALID_PRODUCER_EPOCH is recognizable on the client side, and ensures that ProduceResponse always uses the old error code, INVALID_PRODUCER_EPOCH.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
This patch creates a new `SendBuilder` class which allows us to avoid copying "zero copy" types when transmitting an api message over the network. This generalizes the pattern that was previously used only for `FetchResponse`. Initially we only apply this optimization to the `Envelope` types and `FetchResponse`, but in the future, it can be the default implementation for `toSend`.
The patch also contains a few minor cleanups such as moving envelope parsing logic into `RequestContext`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Zstd-jni 1.4.5-6 allocates large internal buffers inside of ZstdInputStream and ZstdOutputStream. This caused a lot of allocation and GC activity when creating and closing the streams. It also does not buffer the reads or writes, which is inefficient when DefaultRecord.writeTo() does a series of small single-byte reads using various ByteUtils methods. The JNI layer is more efficient if writes of uncompressed data are flushed in large pieces rather than byte by byte, due to the expense of context switching between the Java code and the native code; the same applies to reads. Per luben/zstd-jni#141, the maintainer of zstd-jni and I agreed not to buffer reads and writes in favor of having the caller do that, so here we are updating the caller.
In this patch, I upgraded to the most recent zstd-jni version with the buffer reuse built in. This was done in luben/zstd-jni#143 and luben/zstd-jni#146. Since we decided not to add additional buffering of input/output within zstd-jni, I added BufferedInputStream and BufferedOutputStream to CompressionType.ZSTD, just like we currently do for CompressionType.GZIP, which is also inefficient for single-byte reads and writes. I used the same buffer sizes as that existing implementation.
NOTE: if so desired, we could pass a wrapped BufferSupplier into the Zstd*Stream classes to have Kafka decide how the buffer recycling occurs. This functionality was added in the latter PR linked above. I am holding off on this since, based on JMH benchmarking, the performance gains were not clear, and I don't know if it is worth the complexity of trying to hack around the reflection at this point in time. zstd-jni uses a default recycler very similar to snappy's, which seems to provide decent efficiency. While this PR fixes the defect, I feel that using BufferSupplier in both zstd-jni and snappy is outside the scope of this bugfix and should be considered a separate improvement. I would prefer this change be merged on its own since the performance gains here are very significant relative to the more incremental and minor optimizations which could be achieved by doing that separate work.
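A minimal sketch of the buffering approach, assuming the zstd-jni stream classes and illustrative buffer sizes (the actual wiring lives in CompressionType.ZSTD):
```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import com.github.luben.zstd.ZstdInputStream;
import com.github.luben.zstd.ZstdOutputStream;

final class ZstdBufferingSketch {
    private static final int BUFFER_SIZE = 16 * 1024; // illustrative; mirrors the GZIP path

    // Batches the single-byte reads issued by DefaultRecord before crossing the JNI boundary.
    static InputStream wrapForDecompression(InputStream raw) throws IOException {
        return new BufferedInputStream(new ZstdInputStream(raw), BUFFER_SIZE);
    }

    // Flushes uncompressed data to the native encoder in large pieces.
    static OutputStream wrapForCompression(OutputStream raw) throws IOException {
        return new BufferedOutputStream(new ZstdOutputStream(raw), BUFFER_SIZE);
    }
}
```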
There are some noticeable improvements in the JMH benchmarks (excerpt):
BEFORE:
Benchmark (bufferSupplierStr) (bytes) (compressionType) (maxBatchSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed CREATE RANDOM ZSTD 200 1000 2 thrpt 15 27743.260 ± 673.869 ops/s
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 3399.966 ± 82.608 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 134968.010 ± 0.012 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 3850.985 ± 84.476 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 152881.128 ± 942.189 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Survivor_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 174.241 ± 3.486 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Survivor_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 6917.758 ± 82.522 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 1689.000 counts
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 82621.000 ms
JMH benchmarks done
Benchmark (bufferSupplierStr) (bytes) (compressionType) (maxBatchSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage CREATE RANDOM ZSTD 200 1000 2 thrpt 15 24095.711 ± 895.866 ops/s
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2932.289 ± 109.465 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 134032.012 ± 0.013 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 3282.912 ± 115.042 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 150073.914 ± 1342.235 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Survivor_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 149.697 ± 5.786 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Survivor_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 6842.462 ± 64.515 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 1449.000 counts
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 82518.000 ms
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize CREATE RANDOM ZSTD 200 1000 2 thrpt 15 1449.060 ± 230.498 ops/s
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 198.051 ± 31.532 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 150502.519 ± 0.186 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 200.064 ± 31.879 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 152569.341 ± 13826.686 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 91.000 counts
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 75869.000 ms
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2609.660 ± 1145.160 ops/s
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 815.441 ± 357.818 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 344309.097 ± 0.238 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 808.952 ± 354.975 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 345712.061 ± 51434.034 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.019 ± 0.042 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 18.615 ± 42.045 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Survivor_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 24.132 ± 12.254 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Survivor_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 13540.960 ± 14649.192 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 148.000 counts
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 23848.000 ms
JMH benchmarks done
AFTER:
Benchmark (bufferSupplierStr) (bytes) (compressionType) (maxBatchSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed CREATE RANDOM ZSTD 200 1000 2 thrpt 15 147792.454 ± 2721.318 ops/s
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2708.481 ± 50.012 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 20184.002 ± 0.002 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2732.667 ± 59.258 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 20363.460 ± 120.585 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.042 ± 0.033 MB/sec
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.316 ± 0.249 B/op
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 833.000 counts
CompressedRecordBatchValidationBenchmark.measureValidateMessagesAndAssignOffsetsCompressed:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 8390.000 ms
JMH benchmarks done
Benchmark (bufferSupplierStr) (bytes) (compressionType) (maxBatchSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage CREATE RANDOM ZSTD 200 1000 2 thrpt 15 166786.092 ± 3285.702 ops/s
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2926.914 ± 57.464 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 19328.002 ± 0.002 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2938.541 ± 66.850 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 19404.357 ± 177.485 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.516 ± 0.100 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 3.409 ± 0.657 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Survivor_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.032 ± 0.131 MB/sec
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.churn.G1_Survivor_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.207 ± 0.858 B/op
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 834.000 counts
RecordBatchIterationBenchmark.measureIteratorForBatchWithSingleMessage:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 9370.000 ms
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize CREATE RANDOM ZSTD 200 1000 2 thrpt 15 15988.116 ± 137.427 ops/s
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 448.636 ± 3.851 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 30907.698 ± 0.020 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 450.905 ± 5.587 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 31064.113 ± 291.190 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.043 ± 0.007 MB/sec
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2.931 ± 0.493 B/op
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 790.000 counts
RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 999.000 ms
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize CREATE RANDOM ZSTD 200 1000 2 thrpt 15 11345.169 ± 206.528 ops/s
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.alloc.rate CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2314.800 ± 42.094 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.alloc.rate.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 224714.266 ± 0.028 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2320.213 ± 45.521 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Eden_Space.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 225235.965 ± 803.309 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen CREATE RANDOM ZSTD 200 1000 2 thrpt 15 0.026 ± 0.005 MB/sec
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.churn.G1_Old_Gen.norm CREATE RANDOM ZSTD 200 1000 2 thrpt 15 2.551 ± 0.455 B/op
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.count CREATE RANDOM ZSTD 200 1000 2 thrpt 15 994.000 counts
RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize:·gc.time CREATE RANDOM ZSTD 200 1000 2 thrpt 15 1189.000 ms
JMH benchmarks done
Reviewers: Ismael Juma <ismael@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
This PR follows up 0814e4f to migrate the remaining RPCs which need forwarding, including:
CreateAcls/DeleteAcls/CreateDelegationToken/RenewDelegationToken/ExpireDelegationToken/AlterPartitionReassignment/CreatePartition/DeleteTopics/UpdateFeatures/Scram
Reviewers: David Arthur <mumrah@gmail.com>
When initializing the raft state machine after shutting down as a leader, we were previously entering the "unattached" state, which means we have no leader and no voted candidate. This was a bug because it allowed a reinitialized leader to cast a vote for a candidate in the same epoch that it was already the leader of. This patch fixes the problem by introducing a new "resigned" state which allows us to retain the leader state so that we cannot change our vote and we will not accept additional appends.
This patch also revamps the shutdown logic to make use of the new "resigned" state. Previously we had a separate path in `KafkaRaftClient.poll` for the shutdown logic which resulted in some duplication. Instead now we incorporate shutdown behavior into each state's respective logic.
Finally, this patch changes the shutdown logic so that `EndQuorumEpoch` is only sent by resigning leaders. Previously we allowed this request to be sent by candidates as well.
Reviewers: dengziming <dengziming1993@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
This PR adds support for forwarding of the following RPCs:
AlterConfigs
IncrementalAlterConfigs
AlterClientQuotas
CreateTopics
Co-authored-by: Jason Gustafson <jason@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>
This PR eliminates the facility in the Admin#describeFeatures API, and its implementation, to optionally send a describeFeatures request to the controller. This feature was not seen as particularly useful, and it also poses a hindrance in the post-KIP-500 world, where no client will be able to access the controller directly.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>
In #9418, we add a listener to the `RaftClient` interface. In that patch, we used it only to send commit notifications for writes from the leader. In this PR, we extend the `handleCommit` API to accept all committed data and we remove the pull-based `read` API. Additionally, we add two new callbacks to the listener interface in order to notify the state machine when the raft client has claimed or resigned leadership.
Finally, this patch allows the `RaftClient` to support multiple listeners. This is necessary for KIP-500 because we will have one listener for the controller role and one for the broker role.
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Boyang Chen <boyang@confluent.io>
A couple of failures were observed after KAFKA-9627: Replace ListOffset request/response with automated protocol (https://github.com/apache/kafka/pull/8295).
1. The latest consumer fails to consume from 0.10.0.1 brokers. The following system tests are failing:
kafkatest.tests.client.client_compatibility_features_test.ClientCompatibilityFeaturesTest
kafkatest.tests.client.client_compatibility_produce_consume_test.ClientCompatibilityProduceConsumeTest
Solution: The current default value for MaxNumOffsets is 0; due to this, brokers do not return offsets for v0 requests. Set the default value for the MaxNumOffsets field to 1. This is similar to the previous [approach](https://github.com/apache/kafka/blob/2.6/clients/src/main/java/org/apache/kafka/common/requests/ListOffsetRequest.java#L204).
2. In some scenarios, the latest consumer fails with the below error when connecting to a Kafka cluster which consists of newer and older (<=2.0) Kafka brokers:
`org.apache.kafka.common.errors.UnsupportedVersionException: Attempted to write a non-default currentLeaderEpoch at version 3`
Solution: After #8295, the consumer can set a non-default CurrentLeaderEpoch value on v3 and below requests. One solution is to make CurrentLeaderEpoch ignorable.
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: David Jacot <djacot@confluent.io>
Closes #9540 from omkreddy/fix-listoffsets
This reverts commit 21dc5231ce, as we decided to use Envelope for redirection instead of the initial principal.
Reviewers: Jason Gustafson <jason@confluent.io>
The patch adds `quorum.append.linger.ms` behavior to the raft implementation. This gives users a powerful knob to tune the impact of fsync. When an append is accepted from the state machine, it is held in an accumulator (similar to the producer) until the configured linger time is exceeded. This allows the implementation to amortize fsync overhead at the expense of some write latency.
The patch also improves our methodology for testing performance. Up to now, we have relied on the producer performance test, but it is difficult to simulate expected controller loads because producer performance is limited by other factors such as the number of producer clients and head-of-line blocking. Instead, this patch adds a workload generator which runs on the leader after election.
Finally, this patch brings us nearer to the write semantics expected by the KIP-500 controller. It makes the following changes:
- Introduce `RecordSerde<T>` interface which abstracts the underlying log implementation from `RaftClient`. The generic type is carried over to `RaftClient<T>` and is exposed through the read/write APIs.
- `RaftClient.append` is changed to `RaftClient.scheduleAppend` and returns the last offset of the expected log append.
- `RaftClient.scheduleAppend` accepts a list of records and ensures that the full set are included in a single batch.
- Introduce `RaftClient.Listener` with a single `handleCommit` API which will eventually replace `RaftClient.read` in order to surface committed data to the controller state machine. Currently `handleCommit` is only used for records appended by the leader; a usage sketch follows below.
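A hedged usage sketch of the semantics described above, with toy interfaces that mirror the description rather than the raft module's exact signatures:
```java
import java.util.Arrays;
import java.util.List;

final class RaftWriteSketch {
    interface Listener<T> {
        // Surfaces committed records to the state machine (leader-appended only, for now).
        void handleCommit(int epoch, long lastOffset, List<T> records);
    }

    interface RaftClient<T> {
        // Accepts records for append; the full list is included in a single batch.
        // Returns the last offset of the expected log append.
        Long scheduleAppend(int epoch, List<T> records);
    }

    static void write(RaftClient<String> client, int epoch) {
        // The append sits in the accumulator until quorum.append.linger.ms elapses,
        // amortizing fsync across concurrent writes.
        Long lastOffset = client.scheduleAppend(epoch, Arrays.asList("rec-1", "rec-2"));
        System.out.println("expected last offset: " + lastOffset);
    }
}
```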
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
A regression was introduced by 466f8fd21c. The owned partition field must be ignorable for version < 1, otherwise serialization fails with an unsupported version exception.
Reviewers: Jason Gustafson <jason@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
`KafkaAdminClient.describeUserScramCredentials` should not fail with an NPE when `users` is `null`, as `null` means that all the users must be returned.
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
The transaction manager currently does not handle producer-fenced errors returned from an offset commit request.
This patch adds handling for those errors.
Reviewers: Boyang Chen <boyang@apache.org>, John Roesler <vvcephei@apache.org>
We currently stop polling in `Sender` in a transactional producer if there is only one broker in the bootstrap server list and `max.in.flight.requests.per.connection=1` and Metadata response is pending when InitProducerId request is ready to be sent. In this scenario, we attempt to send FindCoordinator to `leastLoadedNode`, but since that is blocked due to `max.in.flight=1` as a result of the pending metadata response, we never unblock unless we poll. This PR ensures we poll in this case.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>, David Jacot <djacot@confluent.io>
In order to support topic IDs, we need to create a public UUID class. This class will be used in protocols. This PR creates the class, modifies code to use the class in the message protocol and changes the code surrounding the existing messages/json that used the old UUID class.
SimpleExampleMessage was used only for testing, so all usages of UUID have been switched to the new class.
SubscriptionInfoData uses UUID extensively for processId. It also relies on java.util.UUID's implementation of Comparable so that UUIDs can be ordered. Since this functionality is not necessary for the UUIDs used for topic IDs, the new class is converted to java.util.UUID on the boundary of SubscriptionInfoData. Sorting was used only for testing, though, so this still may be changed.
Also added tests for the methods of the new UUID class. The existing SimpleExampleMessage tests should be sufficient for testing the new UUID class in message protocols.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
UpdateMetadataRequestTest.testVersionLogic's assertions must verify the deserialized request instead of the original one.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Using a Set is not necessary as the caller only cares about having the list of timed out connections/nodes.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Other than a Stack Overflow comment (see https://stackoverflow.com/a/61738065) by Colin Patrick McCabe and a proposed design note on KIP-117 wiki, there is no source that verifies the thread-safety of KafkaAdminClient.
This patch updates JavaDoc of KafkaAdminClient to clarify its thread-safety.
Reviewers: Tom Bentley <tbentley@redhat.com>, Chia-Ping Tsai <chia7712@gmail.com>
Fix a bug introduced by change 86013dc that resulted in incorrect behavior when
deleting through an iterator.
The bug is that the hash table relies on a denseness invariant: if you remove an element,
you might have to move some other elements. Calling removeElementAtSlot will do this;
calling removeFromList alone is not enough. A generic sketch of the invariant follows.
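To make the invariant concrete, here is a generic linear-probing deletion sketch (Knuth's Algorithm R), not the actual ImplicitLinkedHashCollection code:
```java
final class DensenessSketch {
    // After emptying slot j, later elements of the same probe chain must be
    // shifted back, or lookups hit the premature hole and miss them.
    static void removeElementAtSlot(Object[] slots, int j) {
        int n = slots.length;
        slots[j] = null;
        int i = j;
        while (true) {
            i = (i + 1) % n;
            if (slots[i] == null)
                return; // end of the probe chain; invariant restored
            int k = Math.floorMod(slots[i].hashCode(), n); // home slot of element at i
            // If the home slot k lies cyclically in (j, i], the element is still
            // reachable from k and can stay; otherwise move it into the hole.
            boolean reachable = (j < i) ? (k > j && k <= i) : (k > j || k <= i);
            if (!reachable) {
                slots[j] = slots[i];
                slots[i] = null;
                j = i;
            }
        }
    }
}
```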
Reviewers: Jason Gustafson <jason@confluent.io>
This replaces code and comment occurrences as described in the KIP.
Author: Xavier Léauté <xvrl@apache.org>
Reviewers: Gwen Shapira, Mickael Maison
Closes #9366 from xvrl/kafka-10571
In this PR, I have addressed the review comments from @chia7712 in #9001 which were provided after #9001 was merged. The changes are made mainly to KafkaAdminClient:
- Improve the error message in the updateFeatures api when a feature name is empty.
- Propagate the top-level error message in the updateFeatures api.
- Add an empty-parameter variety for the describeFeatures api.
- Minor documentation updates to @param and @return to make these resemble other apis.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>
Summary:
In this PR, I have implemented the write path of the feature versioning system (KIP-584). Here is a summary of what's in this PR:
New APIs in org.apache.kafka.clients.admin.Admin interface, and their client and server implementations. These APIs can be used to describe features and update finalized features. These APIs are: Admin#describeFeatures and Admin#updateFeatures.
The write path is provided by the Admin#updateFeatures API. The corresponding server-side implementation is provided in KafkaApis and KafkaController classes. This can be a good place to start the code review.
The write path is supplemented by the Admin#describeFeatures client API. This does not translate 1:1 to a server-side API; instead, under the hood the API makes an explicit ApiVersionsRequest to the broker to fetch the supported and finalized features.
Implemented a suite of integration tests in UpdateFeaturesTest.scala that thoroughly exercises the various cases in the write path.
Other changes:
The data type of the FinalizedFeaturesEpoch field in ApiVersionsResponse has been modified from int32 to int64. This change is to conform with the latest changes to the KIP explained in the voting thread.
Along the way, the class SupportedFeatures has been renamed to BrokerFeatures, and it now holds both supported features and default minimum version levels.
For the purpose of testing, both the BrokerFeatures and FinalizedFeatureCache classes are no longer singletons. Instead, they are instantiated once and maintained in KafkaServer, and these instances are passed around to various classes as needed.
Reviewers: Boyang Chen <boyang@confluent.io>, Jun Rao <junrao@gmail.com>
This field is leftover from the early days of the KIP when it covered reassignment. The API is not exposed yet, so there is no harm updating the first version.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Adds support for SSL key and trust stores to be specified in PEM format either as files or directly as configuration values.
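A hedged configuration sketch (config key names follow KIP-651; the inline PEM bodies are elided placeholders):
```java
import java.util.Properties;

final class PemConfigSketch {
    static Properties clientSslProps() {
        Properties props = new Properties();
        props.put("security.protocol", "SSL");
        // PEM content may be embedded directly as the config value...
        props.put("ssl.keystore.type", "PEM");
        props.put("ssl.keystore.certificate.chain",
            "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----");
        props.put("ssl.keystore.key",
            "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----");
        // ...or the existing location configs may point at a .pem file instead.
        props.put("ssl.truststore.type", "PEM");
        props.put("ssl.truststore.location", "/path/to/ca-certs.pem");
        return props;
    }
}
```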
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Add the following checks to `KafkaConsumer.groupMetadata`:
1. A null check of the coordinator (replacing the NPE with `InvalidGroupIdException`, consistent with other methods).
2. A concurrent-access check (`groupMetadata` is not thread-safe, so the check is necessary).
Reviewers: Jason Gustafson <jason@confluent.io>
This PR adds the logic to preserve the ThrottlingQuotaExceededException when topics are retried. The throttleTimeMs is also adjusted accordingly as the request could remain pending or in-flight for quite a long time.
I have run various tests on clusters with quotas enabled and, indeed, found it better to preserve the exception; otherwise, the caller does not really understand what is going on. Preserving it allows the caller to take appropriate measures and to take the throttleTimeMs into consideration.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Some minor logging adjustments to standardize the grammar of rebalance-related messages and make it easy to query the logs for quick debugging results.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
This PR moves the consumer protocol to the automated protocol instead of plain old structs.
Reviewers: Jason Gustafson <jason@confluent.io>
This is the core Raft implementation specified by KIP-595: https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum. We have created a separate "raft" module where most of the logic resides. The new APIs introduced in this patch in order to support Raft election and such are disabled in the server until the integration with the controller is complete. Until then, there is a standalone server which can be used for testing the performance of the Raft implementation. See `raft/README.md` for details.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Boyang Chen <boyang@confluent.io>
Co-authored-by: Boyang Chen <boyang@confluent.io>
Co-authored-by: Guozhang Wang <wangguoz@gmail.com>
There are no checks on the header key, so eagerly instantiating the key (converting bytes to a string) is unnecessary.
One implication is that conversion failures will be detected a bit later, but this is consistent
with how we handle the header value.
**JMH RESULTS**
1. ops: +12%
2. The memory usage optimization is very small, as the cost of creating the extra `ByteBuffer` is
almost the same as the byte array copy (used to construct the `String`). Using a larger key yields a
bigger improvement, but I don't think large keys are the common case.
**BEFORE**
```
Benchmark (bufferSupplierStr) (bytes) (compressionType) (headerKeySize) (maxBatchSize) (maxHeaderSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureValidation NO_CACHING RANDOM NONE 10 200 5 1000 2 thrpt 15 2035938.174 ± 1653.566 ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm NO_CACHING RANDOM NONE 10 200 5 1000 2 thrpt 15 2040.000 ± 0.001 B/op
```
```
Benchmark (bufferSupplierStr) (bytes) (compressionType) (headerKeySize) (maxBatchSize) (maxHeaderSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureValidation NO_CACHING RANDOM NONE 30 200 5 1000 2 thrpt 15 1979193.376 ± 1239.286 ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm NO_CACHING RANDOM NONE 30 200 5 1000 2 thrpt 15 2120.000 ± 0.001 B/op
```
**AFTER**
```
Benchmark (bufferSupplierStr) (bytes) (compressionType) (headerKeySize) (maxBatchSize) (maxHeaderSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureValidation NO_CACHING RANDOM NONE 10 200 5 1000 2 thrpt 15 2289115.973 ± 2661.856 ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm NO_CACHING RANDOM NONE 10 200 5 1000 2 thrpt 15 2032.000 ± 0.001 B/op
```
```
Benchmark (bufferSupplierStr) (bytes) (compressionType) (headerKeySize) (maxBatchSize) (maxHeaderSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureValidation NO_CACHING RANDOM NONE 30 200 5 1000 2 thrpt 15 2222625.706 ± 908.358 ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm NO_CACHING RANDOM NONE 30 200 5 1000 2 thrpt 15 2040.000 ± 0.001 B/op
```
Reviewers: Ismael Juma <ismael@juma.me.uk>
Currently the docs have HTML ids for each config key. That doesn't work
correctly for config keys like bootstrap.servers, which occur across
producer, consumer, and admin configs: we generate duplicate ids. So arrange
for each config to prefix the ids it generates with the HTML id of its
section heading.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
This patch fixes a couple of problems with the use of the `StructRegistry`. First, it fixes registration so that it is consistently based on the type name of the struct. Previously, structs were registered under the field name, which meant that fields referring to common structs resulted in multiple entries. Second, the patch fixes `SchemaGenerator` so that common structs are considered first.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch changes the Fetch response schema to include both the diverging epoch and its end offset rather than just the offset. This allows for more accurate truncation on the follower. This is the schema that was originally specified in KIP-595, but we altered it during the discussion.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
This patch bumps the `Fetch` protocol as specified by KIP-595: https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum. The main differences are the following:
- Truncation detection
- Leader discovery through the response
- Flexible version support
The most notable change is truncation detection. This patch adds logic in the request handling path to detect truncation, but it does not change the replica fetchers to make use of this capability. This will be done separately.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
This PR fixes two issues introduced by #9114.
- When the metric was switched from Rate to TokenBucket in the ControllerMutationQuotaManager, the metrics were mixed up. That broke the quota update path.
- When a quota is updated, the ClientQuotaManager updates the MetricConfig of the KafkaMetric. That update was not reflected in the Sensor, so the Sensor was still using the MetricConfig it had been created with.
Reviewers: Anna Povzner <anna@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
This change sets the groundwork for migrating other modules incrementally.
Main changes:
- Replace `junit` 4.13 with `junit-jupiter` and `junit-vintage` 5.7.0-RC1.
- All modules except for `tools` depend on `junit-vintage`.
- `tools` depends on `junit-jupiter`.
- Convert `tools` tests to JUnit 5.
- Update `PushHttpMetricsReporterTest` to use `mockito` instead of `powermock` and `easymock`
(powermock doesn't seem to work well with JUnit 5 and we don't need it since mockito can mock
static methods).
- Update `mockito` to 3.5.7.
- Update `TestUtils` to use JUnit 5 assertions since `tools` depends on it.
Unrelated clean-ups:
- Remove `unit` from package names in a few `core` tests.
- Replace `try/catch/fail` with `assertThrows` in a number of places.
- Tag `CoordinatorTest` as integration test.
- Remove unnecessary type parameters when invoking methods and constructors.
Tested with IntelliJ and gradle. Verified that the following commands work as expected:
* ./gradlew tools:unitTest
* ./gradlew tools:integrationTest
* ./gradlew tools:test
* ./gradlew core:unitTest
* ./gradlew core:integrationTest
* ./gradlew clients:test
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
1. Split the consumer coordinator's REBALANCING state into PREPARING_REBALANCE and COMPLETING_REBALANCE. The first is when the join-group request is sent, and the second is after the join-group response is received. During the first state we should still not send heartbeats, since the heartbeat shares the same socket with the join-group request and the group coordinator has disabled the timeout; however, when we transition to the second state we should start sending heartbeats in case the leader's assignment takes a long time. This also fixes KAFKA-10122.
2. When deciding coordinator#timeToNextPoll, do not count timeToNextHeartbeat if the state is UNJOINED or PREPARING_REBALANCE, since we would disable heartbeats and hence the timer would not be updated.
3. On the broker side, allow heartbeats received during PREPARING_REBALANCE and return the NONE error code instead of REBALANCE_IN_PROGRESS. However, on the client side, we still need to ignore REBALANCE_IN_PROGRESS if the state is COMPLETING_REBALANCE, in case we are talking to an old-versioned broker.
4. Piggy-back a log4j improvement on the broker coordinator for the rebalance-triggering reason, as I found it a bit unclear during the investigation. Also subsumes #9038 with log4j improvements.
Allowing heartbeats during COMPLETING_REBALANCE is tricky in two ways: 1) before the sync-group response is received, a heartbeat response may have reset the generation; and even after the sync-group response but before the callback is triggered, a heartbeat response can still reset the generation, so we need to handle both cases by checking the generation/state. 2) With the heartbeat thread enabled, the sync-group request may be sent by the heartbeat thread even if the caller thread has not called poll yet.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, John Roesler <john@confluent.io>
Fixes flakiness in `KafkaAdminClientTest` as a result of #8864. Addresses the following flaky tests:
- testAlterReplicaLogDirsPartialFailure
- testDescribeLogDirsPartialFailure
- testMetadataRetries
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
The schema specification allows a struct type name to differ from the field name. This works with the generated `Message` classes, but not with the generated JSON converter. The patch fixes the problem, which is that the type name is getting replaced with the field name when the struct is registered in the `StructRegistry`.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Add ImplicitLinkedHashCollection#moveToEnd.
Refactor ImplicitLinkedHashCollectionIterator to be a little bit more
robust against concurrent modifications to the map (which admittedly
should not happen).
Reviewers: Jason Gustafson <jason@confluent.io>
Implement the KIP-554 API to create, describe, and alter SCRAM user configurations via the AdminClient. Add ducktape tests, and modify JUnit tests to test and use the new API where appropriate.
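A hedged sketch of the new Admin calls (class and method names follow KIP-554):
```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.ScramCredentialInfo;
import org.apache.kafka.clients.admin.ScramMechanism;
import org.apache.kafka.clients.admin.UserScramCredentialUpsertion;

public class ScramAdminSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Create (or update) a SCRAM-SHA-256 credential for user "alice".
            ScramCredentialInfo info = new ScramCredentialInfo(ScramMechanism.SCRAM_SHA_256, 8192);
            admin.alterUserScramCredentials(Collections.singletonList(
                new UserScramCredentialUpsertion("alice", info, "alice-secret")))
                .all().get();
            // Describe returns mechanisms and iteration counts, never the secrets.
            System.out.println(admin.describeUserScramCredentials().all().get());
        }
    }
}
```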
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Rajini Sivaram <rajinisivaram@googlemail.com>
For the generated message code, put the JSON conversion functionality
in a separate JsonConverter class.
Make MessageDataGenerator simply another generator class, alongside the
new JsonConverterGenerator class. Move some of the utility functions
from MessageDataGenerator into FieldSpec and other places, so that they
can be used by other generator classes.
Use argparse4j to support a better command-line for the generator.
Reviewers: David Arthur <mumrah@gmail.com>
Adds avg, min, and max e2e latency metrics at the new TRACE level. Also adds the missing avg task-level metric at the INFO level.
I think where we left off with the KIP, the TRACE-level metrics were still defined to be "stateful-processor-level". I realized this doesn't really make sense, and would be pretty much impossible to define given the DFS processing approach of Streams, and felt that store-level metrics made more sense to begin with. I haven't updated the KIP yet so I can get some initial feedback on this.
Reviewers: Bruno Cadonna <bruno@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Only check if positions need validation if there is new metadata.
Also fix some inefficient java.util.stream code in the hot path of SubscriptionState.
This patch fixes the generated serde logic for the 'records' type so that it uses the compact byte array representation consistently when flexible versions are enabled.
Reviewers: David Arthur <mumrah@gmail.com>
Add a separate error code PRODUCER_FENCED to differentiate from INVALID_PRODUCER_EPOCH. On the broker side, replace INVALID_PRODUCER_EPOCH with PRODUCER_FENCED when the request version is the latest, while still returning INVALID_PRODUCER_EPOCH to older clients (see the sketch below). On the client side, simply handle INVALID_PRODUCER_EPOCH the same as PRODUCER_FENCED when it comes from txn coordinator APIs.
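A minimal sketch of that version gate; the threshold handling is illustrative, not the actual broker code:
```java
import org.apache.kafka.common.protocol.Errors;

final class FencedErrorMappingSketch {
    // New clients get the clearer PRODUCER_FENCED; older request versions keep
    // receiving the error code they already understand.
    static Errors producerFencedError(short requestVersion, short latestVersion) {
        return requestVersion >= latestVersion
            ? Errors.PRODUCER_FENCED
            : Errors.INVALID_PRODUCER_EPOCH;
    }
}
```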
Reviewers: Guozhang Wang <wangguoz@gmail.com>
The message generator was missing conversion logic for tagged structures. This led to casting errors when either `fromStruct` or `toStruct` were invoked. This patch also adds missing null checks in the serialization of tagged byte arrays, which was found from improved test coverage.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch removes the PartitionHeader grouping from the Fetch response. With old versions of the protocol, there was no cost for this grouping, but once we add flexible version support, then it adds an extra byte to the schema for tagged fields with little apparent benefit.
Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>
Based on the discussion in #9072, I have put together an alternative approach. It does the following:
Instead of changing the implementation of the Rate to behave like a token bucket, it actually uses two different metrics: the regular Rate and a new TokenBucket. The latter is used to enforce the quota.
The token bucket algorithm uses the rate of the quota as the refill rate for the credits and computes the burst based on the number of samples and their length (# samples * sample length * quota).
The token bucket can go below zero in order to handle unlimited bursts (e.g., creating a topic with a number of partitions higher than the burst). Throttling kicks in when the number of credits is below zero.
The throttle time is computed as credits under zero / refill rate (or quota).
Only the controller mutation quota uses it for now.
The remaining number of credits in the bucket is exposed via the tokens metric per user/clientId.
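A self-contained sketch of the algorithm as described above (field names and the time handling are illustrative, not the actual TokenBucket metric):
```java
final class TokenBucketSketch {
    private final double quota;   // refill rate, in credits per second
    private final double burst;   // max credits: # samples * sample length * quota
    private double credits;
    private long lastUpdateMs;

    TokenBucketSketch(double quota, int samples, int sampleLengthSeconds, long nowMs) {
        this.quota = quota;
        this.burst = samples * sampleLengthSeconds * quota;
        this.credits = burst;
        this.lastUpdateMs = nowMs;
    }

    private void refill(long nowMs) {
        credits = Math.min(burst, credits + quota * (nowMs - lastUpdateMs) / 1000.0);
        lastUpdateMs = nowMs;
    }

    // Records a mutation; the balance may go below zero to admit unlimited bursts.
    void record(double value, long nowMs) {
        refill(nowMs);
        credits -= value;
    }

    // Throttle time: zero while credits remain, else credits under zero / refill rate.
    long throttleTimeMs(long nowMs) {
        refill(nowMs);
        return credits >= 0 ? 0 : (long) (-credits / quota * 1000.0);
    }
}
```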
Reviewers: Anna Povzner <anna@confluent.io>, Jun Rao <junrao@gmail.com>
Enhance the understandability of the constrainedAssign and generalAssign methods by adding more detailed meta comments.
Co-authored-by: A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Refactored FetchRequest and FetchResponse to use the generated message classes for serialization and deserialization. This allows us to bypass unnecessary Struct conversion in a few places. A new "records" type was added to the message protocol which uses BaseRecords as the field type. When sending, we can set a FileRecords instance on the message, and when receiving the message class will use MemoryRecords.
Also included a few JMH benchmarks which indicate a small performance improvement for requests with high partition counts or small record sizes.
Reviewers: Jason Gustafson <jason@confluent.io>, Boyang Chen <boyang@confluent.io>, David Jacot <djacot@confluent.io>, Lucas Bradstreet <lucas@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Colin P. McCabe <cmccabe@apache.org>
Reviewers: Mickael Maison <mickael.maison@gmail.com>, David Jacot <djacot@confluent.io>, Lee Dongjin <dongjin@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
Modified KafkaProducer.sendOffsetsToTransaction() to be bounded by max.block.ms, and added timeout tests for blocking methods.
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Xi Hu <huxi_2b@hotmail.com>
Add a broker to controller channel manager for use cases such as redirection and AlterIsr.
Reviewers: David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
Co-authored-by: Viktor Somogyi <viktorsomogyi@gmail.com>
Co-authored-by: Boyang Chen <boyang@confluent.io>
While debugging a rebalance scenario, I found that when we trigger a rebalance inside rejoinNeededOrPending due to metadata or subscription changes, it is not logged; hence it is actually a bit tricky to find out the reason for a triggered rebalance. I'm adding two INFO log4j entries to fill in the gap.
Other requestRejoin() calls are already covered.
Reviewers: Boyang Chen <boyang@confluent.io>
This PR implements the broker side changes of KIP-599, except the changes of the Rate implementation which will be addressed separately. The PR changes/introduces the following:
- It introduces the protocol changes.
- It introduces a new quota manager ControllerMutationQuotaManager which is another specialization of the ClientQuotaManager.
- It enforces the quota in the KafkaApis and in the AdminManager. This part handles new and old clients as described in the KIP.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
- part of KIP-572
- deprecates producer config `retries` (still in use)
- deprecates admin config `retries` (still in use)
- deprecates Kafka Streams config `retries` (will be ignored)
- adds new Kafka Streams config `task.timeout.ms` (follow up PRs will leverage this new config)
Reviewers: John Roesler <john@confluent.io>, Jason Gustafson <jason@confluent.io>, Randall Hauch <randall@confluent.io>
- After #8312, older brokers return empty configs to the latest `adminClient.describeConfigs`. Old brokers receive empty configNames in the `AdminManager.describeConfigs()` method and, since they do not handle empty configKeys, they filter out all the configs.
- Update ClientCompatibilityTest to verify describe configs
- Add test case to test describe configs with empty configuration Keys
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#9046 from omkreddy/KAFKA-9432
Brokers currently return NOT_LEADER_FOR_PARTITION to producers and REPLICA_NOT_AVAILABLE to consumers if a replica is not available on the broker during reassignments. Non-Java clients treat REPLICA_NOT_AVAILABLE as a non-retriable exception, while Java consumers handle this error by explicitly matching the error code even though it is not an InvalidMetadataException. This PR renames NOT_LEADER_FOR_PARTITION to NOT_LEADER_OR_FOLLOWER and uses the same error for producers and consumers. This is compatible with both Java and non-Java clients since all clients handle this error code (6) as a retriable exception (see the sketch after the list below). The PR also makes ReplicaNotAvailableException a subclass of InvalidMetadataException.
- ALTER_REPLICA_LOG_DIRS continues to return REPLICA_NOT_AVAILABLE. Retained this for compatibility since this request never returned NOT_LEADER_FOR_PARTITION earlier.
- MetadataRequest version 0 also returns REPLICA_NOT_AVAILABLE as topic-level error code for compatibility. Newer versions filter these out and return Errors.NONE, so didn't change this.
- Partition responses in MetadataRequest return REPLICA_NOT_AVAILABLE to indicate that one of the replicas is not available. Did not change this since NOT_LEADER_FOR_PARTITION is not suitable in this case.
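A hedged sketch of the client-side implication (the producer already retries retriable errors internally, so a callback only needs to distinguish retriable from fatal errors):

```java
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.RetriableException;

class SendErrorHandlingSketch {
    static void send(Producer<byte[], byte[]> producer, ProducerRecord<byte[], byte[]> record) {
        producer.send(record, (metadata, exception) -> {
            if (exception == null) {
                // success
            } else if (exception instanceof RetriableException) {
                // e.g. NotLeaderOrFollowerException during a reassignment:
                // safe to re-send after a metadata refresh
            } else {
                // fatal: surface to the application
            }
        });
    }
}
```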
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Bob Barrett <bob.barrett@confluent.io>
The intention of using poll(0) is to not block on rebalance but still return some data; however, `updateAssignmentMetadataIfNeeded` contains three different pieces of logic: 1) discover coordinator if necessary, 2) join group if necessary, 3) refresh metadata and fetch position if necessary. We only want 2) to be non-blocking, not the others, since e.g. when the coordinator is down the heartbeat would expire and cause the consumer to fetch with timeout 0 as well, causing unnecessarily high CPU usage.
Since splitting this function is a rather big change to make as a last-minute blocker fix for 2.6, I made a smaller change that gives updateAssignmentMetadataIfNeeded an optional boolean flag indicating whether 2) should wait until either expired or complete; otherwise it does not wait on the join-group future and just polls with a zero timer.
Reviewers: Jason Gustafson <jason@confluent.io>
This PR fixes a bug introduced in #8683.
While processing connection set-up timeouts, we iterate through the connecting nodes and disconnect within the loop, removing the entry from the very set the loop is iterating over. That raises a ConcurrentModificationException. The current unit test did not catch this because it was using only one node.
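A minimal sketch of the bug class and the usual fix (node names are placeholders):

```java
import java.util.HashSet;
import java.util.Set;

Set<String> connectingNodes = new HashSet<>(Set.of("node-1", "node-2"));

// Buggy pattern: disconnect(node) removes the node from connectingNodes,
// the very set this loop is iterating over -> ConcurrentModificationException.
// for (String node : connectingNodes)
//     disconnect(node);

// Safe pattern: iterate over a snapshot instead.
for (String node : new HashSet<>(connectingNodes)) {
    connectingNodes.remove(node); // stand-in for disconnect(node)
}
```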
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
The documentation for max.block.ms said it affected only send()
and partitionsFor(), but it actually also affects initTransactions(),
abortTransaction() and commitTransaction(). So rework the
documentation to cover these methods too.
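A hedged sketch of the calls now documented as bounded by max.block.ms (broker address and transactional id are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.ByteArraySerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-tx-id");        // placeholder
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 5_000); // fail fast after 5s

try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
    producer.initTransactions();      // bounded by max.block.ms
    producer.partitionsFor("topic");  // bounded by max.block.ms
    // send(), commitTransaction() and abortTransaction() are likewise bounded
}
```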
Reviewers: Boyang Chen <boyang@confluent.io>
This PR includes 3 MessageFormatters for MirrorMaker2 internal topics:
- HeartbeatFormatter
- CheckpointFormatter
- OffsetSyncFormatter
This also introduces a new public interface org.apache.kafka.common.MessageFormatter that users can implement to build custom formatters.
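A hedged sketch of a custom formatter against the new interface (the class is hypothetical; default methods may differ slightly by version):

```java
import java.io.PrintStream;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.MessageFormatter;

public class PlainTextFormatter implements MessageFormatter {
    @Override
    public void configure(Map<String, ?> configs) {
        // no configuration needed for this sketch
    }

    @Override
    public void writeTo(ConsumerRecord<byte[], byte[]> record, PrintStream output) {
        output.printf("%s-%d@%d: %d value bytes%n",
                record.topic(), record.partition(), record.offset(),
                record.value() == null ? 0 : record.value().length);
    }
}
```

Such a class can then be passed to the console consumer via its `--formatter` option.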
Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>, Ryanne Dolan <ryannedolan@gmail.com>, David Jacot <djacot@confluent.io>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>
Since https://issues.apache.org/jira/browse/KAFKA-8834, describing topics with the TopicCommand requires privileges to use ListPartitionReassignments or fails to describe the topics with the following error:
> Error while executing topic command : Cluster authorization failed.
This is quite a hard restriction, as most secure clusters do not authorize non-admin members to access ListPartitionReassignments.
This patch catches the `ClusterAuthorizationException` exception and gracefully falls back (a sketch follows below). We already do this when the API is not available, so it remains consistent.
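A Java-flavored sketch of the fallback (the command itself is Scala; the Admin API calls below are from the Java client):

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.PartitionReassignment;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.ClusterAuthorizationException;
import org.apache.kafka.common.errors.UnsupportedVersionException;

class ReassignmentFallbackSketch {
    static Map<TopicPartition, PartitionReassignment> reassignmentsOrEmpty(Admin admin)
            throws InterruptedException, ExecutionException {
        try {
            return admin.listPartitionReassignments().reassignments().get();
        } catch (ExecutionException e) {
            if (e.getCause() instanceof ClusterAuthorizationException
                    || e.getCause() instanceof UnsupportedVersionException) {
                return Collections.emptyMap(); // degrade gracefully, describe without reassignment info
            }
            throw e;
        }
    }
}
```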
Author: David Jacot <djacot@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#8947 from dajac/KAFKA-10212
We inadvertently changed the binary schema of the suppress buffer changelog
in 2.4.0 without bumping the schema version number. As a result, it is impossible
to upgrade from 2.3.x to 2.4+ if you are using suppression.
* Refactor the schema compatibility test to use serialized data from older versions
as a more foolproof compatibility test.
* Refactor the upgrade system test to use the smoke test application so that we
actually exercise a significant portion of the Streams API during upgrade testing
* Add more recent versions to the upgrade system test matrix
* Fix the compatibility bug by bumping the schema version to 3
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
The enum `State` is private, so it is fine to fix the typo without breaking compatibility.
Author: Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#8932 from chia7712/MINOR-8932
Add unit tests for KafkaProducer.close(), KafkaProducer.abortTransaction(), and KafkaProducer.flush() in the KafkaProducerTest.
Increases KafkaProducer test code coverage from 82% of methods and 82% of lines to 86% of methods and 87% of lines as of merge.
Reviewers: Boyang Chen <boyang@confluent.io>
This patch fixes a bug in the constructor of `LogTruncationException`. We were passing the divergent offsets to the super constructor as the fetch offsets. There is no way to fix this without breaking compatibility, but the harm is probably minimal since this exception was not getting raised properly until KAFKA-9840 anyway.
Note that I have also moved the check for unknown offset and epoch into `SubscriptionState`, which ensures that the partition is still awaiting validation and that the fetch offset hasn't changed. Finally, I made some minor improvements to the logging and exception messages to ensure that we always have the fetch offset and epoch as well as the divergent offset and epoch included.
Reviewers: Boyang Chen <boyang@confluent.io>, David Arthur <mumrah@gmail.com>
Since the admin client lets us use a flexible offset-spec, we can always set it to use read-uncommitted regardless of the EOS config (see the sketch below).
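A hedged sketch of the corresponding admin call (topic and partition are placeholders):

```java
import java.util.Map;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.ListOffsetsOptions;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.common.IsolationLevel;
import org.apache.kafka.common.TopicPartition;

class EndOffsetsSketch {
    static long endOffset(Admin admin) throws Exception {
        TopicPartition tp = new TopicPartition("topic", 0); // placeholder
        return admin.listOffsets(
                        Map.of(tp, OffsetSpec.latest()),
                        new ListOffsetsOptions(IsolationLevel.READ_UNCOMMITTED))
                .partitionResult(tp).get().offset();
    }
}
```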
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
## Background
When a partition subscription is initialized it has a `null` position and is in the INITIALIZING state. Depending on the consumer, it will then transition to one of the other states. Typically a consumer will either reset the offset to earliest/latest, or it will provide an offset (with or without offset metadata). For the reset case, we still have no position to act on so fetches should not occur.
Recently we made changes for KAFKA-9724 (#8376) to prevent clients from entering the AWAIT_VALIDATION state when targeting older brokers. New logic to bypass offset validation as part of this change exposed this new issue.
## Bug and Fix
In the partition subscriptions, the AWAIT_RESET state was incorrectly reporting that it had a position. In some cases a position might actually exist (e.g., if we were resetting offsets during a fetch after a truncation), but in the initialization case no position had been set. We saw this issue in system tests where there is a race between the offset reset completing and the first fetch request being issued.
Since AWAIT_RESET#hasPosition was incorrectly returning `true`, the new logic to bypass offset validation was transitioning the subscription to FETCHING (even though no position existed).
The fix was simply to have AWAIT_RESET#hasPosition return `false`, which should have been the case from the start. Additionally, this fix includes some guards against NPE when reading the position from the subscription.
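A simplified, hypothetical sketch of the state machine fix (not the actual SubscriptionState internals):

```java
enum FetchStateSketch {
    INITIALIZING     { boolean hasPosition() { return false; } },
    AWAIT_RESET      { boolean hasPosition() { return false; } }, // returned true before the fix
    AWAIT_VALIDATION { boolean hasPosition() { return true;  } },
    FETCHING         { boolean hasPosition() { return true;  } };

    // Fetches may only be issued from states that report a usable position.
    abstract boolean hasPosition();
}
```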
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
* Add documentation for using transformation predicates (an example configuration appears after this list).
* Add `PredicateDoc` for generating predicate config docs, following the style of `TransformationDoc`.
* Fix the header depth mismatch.
* Avoid generating HTML ids based purely on the config name, since these are very likely to conflict (e.g. #name). Instead allow passing a function which can be used to generate an id from a config key.
The docs have been generated and tested locally.
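For reference, a hedged example of a connector configuration using a predicate; the predicate and transform names and the topic pattern are placeholders:

```java
import java.util.Map;

Map<String, String> connectorConfig = Map.of(
    "transforms", "dropFooter",
    "transforms.dropFooter.type", "org.apache.kafka.connect.transforms.Filter",
    "transforms.dropFooter.predicate", "isFooter",   // apply the SMT only when the predicate matches
    "predicates", "isFooter",
    "predicates.isFooter.type",
        "org.apache.kafka.connect.transforms.predicates.TopicNameMatches",
    "predicates.isFooter.pattern", ".*-footer"       // placeholder pattern
);
```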
Reviewer: Konstantine Karantasis <konstantine@confluent.io>
In this PR, I have implemented various classes and integration for the read path of the feature versioning system (KIP-584). The ultimate plan is that the cluster-wide finalized features information is going to be stored in ZK under the node /feature. The read path implemented in this PR is centered around reading this finalized features information from ZK, and, processing it inside the Broker.
Here is a summary of what's in this PR (a lot of it is new classes):
A facility is provided in the broker to declare its supported features, and advertise its supported features via its own BrokerIdZNode under a features key.
A facility is provided in the broker to listen to and propagate cluster-wide finalized feature changes from ZK.
When new finalized features are read from ZK, feature incompatibilities are detected by comparing against the broker's own supported features.
ApiVersionsResponse is now served containing supported and finalized feature information (using the newly added tagged fields).
Reviewers: Boyang Chen <boyang@confluent.io>, Jun Rao <junrao@gmail.com>
This change adds a check to the KafkaConfigBackingStore, KafkaOffsetBackingStore, and KafkaStatusBackingStore to use the admin client to verify that the internal topics are compacted and do not use the `delete` cleanup policy.
Connect will already create the internal topics with `cleanup.policy=compact` if the topics do not yet exist when the Connect workers are started; new topics are always created as compacted, overwriting any user-specified `cleanup.policy`. However, if the topics already exist, the worker did not previously verify that the internal topics were compacted, such as when a user manually creates the internal topics before starting Connect or manually changes the topic settings after the fact.
The current change helps guard against users running Connect with topics that have delete cleanup policy enabled, which will remove all connector configurations, source offsets, and connector & task statuses that are older than the retention time. This means that, for example, the configuration for a long-running connector could be deleted by the broker, and this will cause restart issues upon a subsequent rebalance or restarting of Connect worker(s).
Connect behavior requires that its internal topics are compacted and not deleted after some retention time. Therefore, this additional check is simply enforcing the existing expectations, and therefore does not need a KIP.
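A hedged sketch of the kind of check being added (simplified; the real check lives in the backing stores):

```java
import java.util.Collections;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;

class CompactionCheckSketch {
    static void verifyCompact(Admin admin, String topic) throws Exception {
        ConfigResource resource = new ConfigResource(ConfigResource.Type.TOPIC, topic);
        Config config = admin.describeConfigs(Collections.singleton(resource))
                .all().get().get(resource);
        String policy = config.get(TopicConfig.CLEANUP_POLICY_CONFIG).value();
        if (!TopicConfig.CLEANUP_POLICY_COMPACT.equals(policy)) {
            throw new IllegalStateException("Internal topic '" + topic
                    + "' must use cleanup.policy=compact, but has: " + policy);
        }
    }
}
```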
Author: Randall Hauch <rhauch@gmail.com>
Reviewer: Konstantine Karantasis <konstantine@confluent.io>, Chris Egerton <chrise@confluent.io>
Ensure all channels get closed in `Selector.close`, even if some of them raise errors.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
There is some confusion over the compression rate metrics, as the meaning of the value isn't clearly stated in the metric description: the value is the average ratio of compressed to uncompressed size, so lower values indicate better compression, yet it was assumed that a higher compression rate value meant better compression. This PR clarifies the meaning of the value to prevent misunderstandings.
Reviewers: Jason Gustafson <jason@confluent.io>
In the first version of the incremental cooperative protocol, in the presence of a failed sync request by the leader, the assignor was designed to treat the unapplied assignments as lost and trigger a rebalance delay.
This commit applies optimizations in these cases to avoid the unnecessary activation of the rebalancing delay. First, if the worker that loses the sync group request or response is the leader, it detects this failure by checking whether the generation it observes when performing task assignment is the expected one. If it is not, the leader resets its view of the previous assignment, because that assignment wasn't successfully applied and doesn't represent a correct state. Furthermore, if the worker that missed the assignment sync is an ordinary worker, the leader is able to detect that there are lost assignments, and instead of triggering a rebalance delay among the same members of the group, it treats the lost tasks as new tasks and reassigns them immediately. If the lost assignment included revocations that were not applied, the leader reapplies these revocations again.
Existing unit tests and integration tests are adapted to test the proposed optimizations.
Reviewers: Randall Hauch <rhauch@gmail.com>
The current "about" string incorrectly describes the session epoch as the partition epoch. Rename to `SessionEpoch` to make usage clearer. Also rename `MaxWait` to `MaxWaitMs` to make the time unit clear and `FetchableTopic` to `FetchTopic` for consistency with `FetchPartition`.
Reviewers: Ismael Juma <ismael@juma.me.uk>
_unknownTaggedFields contains tagged fields which we don't understand
with the current schema. However, we still want to keep the data around
for various purposes. For example, if we are printing out a JSON form of
the message we received, we want to include a section containing the
tagged fields that couldn't be parsed. To leave these out would give an
incorrect impression of what was sent over the wire. Since the unknown
tagged fields represent real data, they should be included in the fields
checked by equals().
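A hand-written, hypothetical sketch of what the generated equals() now covers (`ExampleData` stands in for a generated message class):

```java
import java.util.List;
import java.util.Objects;
import org.apache.kafka.common.protocol.types.RawTaggedField;

public class ExampleData {
    int knownField;
    List<RawTaggedField> _unknownTaggedFields;

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof ExampleData)) return false;
        ExampleData other = (ExampleData) obj;
        return knownField == other.knownField
            // unknown tagged fields are real wire data, so compare them too
            && Objects.equals(_unknownTaggedFields, other._unknownTaggedFields);
    }

    @Override
    public int hashCode() {
        return Objects.hash(knownField, _unknownTaggedFields);
    }
}
```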
Reviewers: Ismael Juma <ismael@juma.me.uk>, Boyang Chen <boyang@confluent.io>
`SelectorMetrics` creates per-connection metrics, which means the number of `MetricName` objects and the strings associated with them (such as group name and description) grows with the number of connections in the client. This overhead of duplicate string objects is amplified when there are multiple instances of Kafka clients within the same JVM. This patch addresses some of the memory overhead by making `metricGrpName` a constant field and introducing a new field `perConnectionMetricGrpName`.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
This PR provides two fixes:
1. Skip offset validation if the current leader epoch cannot be reliably determined.
2. Raise an out of range error if the leader returns an undefined offset in response to the OffsetsForLeaderEpoch request.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Also added a new unit test to verify the functionality and expectations.
Author: Randall Hauch <rhauch@gmail.com>
Reviewer: Konstantine Karantasis <konstantine@confluent.io>
cc omkreddy this should also get backported to 2.6.x
Author: Xavier Léauté <xvrl@apache.org>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#8813 from xvrl/fix-jmx-reset
Fixes KAFKA-10033.
Replace AdminOperationException with UnknownTopicOrPartitionException if topic does not exist when validating topic configs in AdminZkClient.
Author: gnkoshelev <gnkoshelev@gmail.com>
Author: Gregory <gnkoshelev@gmail.com>
Reviewers: Brian Byrne <bbyrne@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#8715 from gnkoshelev/KAFKA-10033
This applies to the producer, consumer, admin client, connect worker
and inter broker communication.
`ClientDnsLookup.DEFAULT` has been deprecated and a warning
will be logged if it's explicitly set in a client config.
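A hedged sketch of setting the lookup mode explicitly (applies to any of the affected clients):

```java
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;

Properties props = new Properties();
// "use_all_dns_ips" is the non-deprecated mode; ClientDnsLookup.DEFAULT
// now logs a deprecation warning if set explicitly.
props.put(CommonClientConfigs.CLIENT_DNS_LOOKUP_CONFIG, "use_all_dns_ips");
```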
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Minimum fix needed to stop this test failing and unblock others
Co-authored-by: Luke Chen <showuon@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
1. Enables `TLSv1.3` by default with Java 11 or newer (a config sketch follows below).
2. Add unit tests that cover the various TLSv1.2 and TLSv1.3 combinations.
3. Extend `benchmark_test.py` and `replication_test.py` to run with 'TLSv1.2'
or 'TLSv1.3'.
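A hedged sketch of pinning the TLS versions explicitly (config keys from `SslConfigs`; TLSv1.3 requires Java 11 or newer):

```java
import java.util.Properties;
import org.apache.kafka.common.config.SslConfigs;

Properties props = new Properties();
props.put(SslConfigs.SSL_PROTOCOL_CONFIG, "TLSv1.3");
props.put(SslConfigs.SSL_ENABLED_PROTOCOLS_CONFIG, "TLSv1.3,TLSv1.2");
```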
Reviewers: Ismael Juma <ismael@juma.me.uk>
Fix the failing testMultiConsumerStickyAssignment by correcting a logic error in the allSubscriptionsEqual method.
We create consumerToOwnedPartitions to keep the set of previously owned partitions encoded in each Subscription; it is the basis for the reassignment. In allSubscriptionsEqual, we get the member generation of each subscription and, whenever a higher generation is seen, remove all previously owned partitions recorded so far as invalid. However, the previous logic also removed the member with the current highest generation from consumerToOwnedPartitions, even though that member should be kept since it belongs to the current highest generation. This fixes that logic error (see the sketch below).
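A simplified, hypothetical sketch of the corrected bookkeeping (stand-in types and names, not the assignor's real internals):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GenerationFilterSketch {
    // Only members on the highest generation seen so far keep their claimed
    // partitions, including the member that raised the maximum itself.
    static Map<String, List<String>> ownedByCurrentGeneration(
            Map<String, Integer> memberGenerations,
            Map<String, List<String>> claimedPartitions) {
        int maxGeneration = Integer.MIN_VALUE;
        Map<String, List<String>> result = new HashMap<>();
        for (Map.Entry<String, Integer> e : memberGenerations.entrySet()) {
            int generation = e.getValue();
            if (generation > maxGeneration) {
                result.clear();          // claims from older generations are invalid
                maxGeneration = generation;
            }
            if (generation == maxGeneration) // the highest-generation member is kept
                result.put(e.getKey(), claimedPartitions.get(e.getKey()));
        }
        return result;
    }
}
```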
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Fixed spotBugs error introduced by c6633a1:
>Dead store to isFreshAssignment in org.apache.kafka.clients.consumer.internals.AbstractStickyAssignor.generalAssign(Map, Map)
Reviewers: Ismael Juma <ismael@juma.me.uk>
The motivation and pseudo-code algorithm are in the ticket.
Added a scale test with a large number of topic partitions and consumers and a 30s timeout.
With these changes, assignment with 2,000 consumers and 200 topics with 2,000 partitions each completes within a few seconds.
Porting the same test to trunk, it took 2 minutes even with a 100x reduction in the number of topics (i.e., 2 minutes for 2,000 consumers and 2 topics with 2,000 partitions each).
Should be cherry-picked to 2.6, 2.5, and 2.4
Reviewers: Guozhang Wang <wangguoz@gmail.com>
1. Move KafkaProducer#propsToMap to Utils#propsToMap
2. Apply Utils#propsToMap to the constructor of KafkaConsumer
Reviewers: Noa Resare <noa@resare.com>, Ismael Juma <ismael@juma.me.uk>