This is a long story, but the incident started in KAFKA-13419, where we observed a member reporting a topic partition owned in a previous generation after it missed a rebalance cycle due to REBALANCE_IN_PROGRESS.
This patch changes the AbstractStickyAssignor.allSubscriptionsEqual method. In short, it should no longer validate ownership against only the highest generation. Instead, we consider 3 cases:
1. A member continues to hold on to its partition if there are no other claimed owners.
2. If multiple members claim the same partition, the one with the highest generation wins.
3. If two members of the same generation claim the same partition, we log an error and remove the partition from both assignments. (Same as the current logic.)
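The three cases above can be sketched as a small ownership-resolution helper. This is an illustrative standalone sketch, not the actual AbstractStickyAssignor code; the method and parameter names are invented for the example:

```java
import java.util.*;

public class OwnerResolution {
    // Given the members claiming one partition, mapped to the generation in
    // which each claims ownership, decide who (if anyone) keeps it.
    static Optional<String> resolveOwner(Map<String, Integer> claimsByMember) {
        if (claimsByMember.isEmpty())
            return Optional.empty();               // case 1 handled by caller: sole owner keeps it
        int maxGen = Collections.max(claimsByMember.values());
        List<String> top = new ArrayList<>();
        for (Map.Entry<String, Integer> e : claimsByMember.entrySet())
            if (e.getValue() == maxGen)
                top.add(e.getKey());
        if (top.size() == 1)
            return Optional.of(top.get(0));        // case 2: highest generation wins
        // Case 3: multiple claims at the same generation -> log an error and
        // remove the partition from every claimant's assignment.
        System.err.println("Conflicting ownership at generation " + maxGen + ": " + top);
        return Optional.empty();
    }
}
```

A sole claimant trivially falls into the single-entry branch, so all three cases reduce to "keep the unique highest-generation claimant, if any".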
Here are some important notes that lead to the patch:
- If a member is kicked out of the group, `UNKNOWN_MEMBER_ID` is thrown.
- It seems to be a common situation that members are late to joinGroup and therefore get the `REBALANCE_IN_PROGRESS` error. This is why we don't want to reset the generation: doing so might cause lots of revocations and can be disruptive.
To summarize the current behavior of different errors:
`REBALANCE_IN_PROGRESS`
- heartbeat: requestRejoin if member state is stable
- joinGroup: rejoin immediately
- syncGroup: rejoin immediately
- commit: requestRejoin and fail the commit. Raise this exception if the generation is stale, i.e. another rebalance is already in progress.
`UNKNOWN_MEMBER_ID`
- heartbeat: resetStateAndRejoin if the generation hasn't changed; otherwise, ignore
- joinGroup: resetStateAndRejoin if generation unchanged, otherwise rejoin immediately
- syncGroup: resetStateAndRejoin if generation unchanged, otherwise rejoin immediately
`ILLEGAL_GENERATION`
- heartbeat: resetStateAndRejoin if the generation hasn't changed; otherwise, ignore
- syncGroup: raise the exception if the generation has been reset or the member hasn't completed rebalancing; then resetStateAndRejoin if the generation is unchanged, otherwise rejoin immediately
Reviewers: David Jacot <djacot@confluent.io>
1. Add ZkMigrationReady to ApiVersionsResponse
2. Check that all nodes report ZkMigrationReady before moving to the next migration state
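The gating logic in step 2 amounts to an all-nodes check. A minimal sketch, with invented names (the real code reads the flag from each broker's ApiVersionsResponse):

```java
import java.util.Map;

public class MigrationGate {
    // Only advance the migration state once every known broker reports
    // zkMigrationReady = true in its ApiVersionsResponse.
    static boolean canAdvance(Map<Integer, Boolean> readyByBrokerId) {
        return !readyByBrokerId.isEmpty()
            && readyByBrokerId.values().stream().allMatch(Boolean::booleanValue);
    }
}
```

An empty map is treated as "not ready", since no broker has reported yet.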
Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
Don't allow setting negative or zero values for quotas. Don't allow SCRAM mechanism names to be
used as client quota names. SCRAM mechanisms are not client quotas. (The confusion arose because of
internal ZK representation details that treated them both as "client configs.")
Add unit tests for ClientQuotaControlManager.isValidIpEntity and
ClientQuotaControlManager.configKeysForEntityType.
This change doesn't affect metadata record application, only input validation. If there are bad
client quotas that are set currently, this change will not alter the current behavior (of throwing
an exception and ignoring the bad quota).
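The validation rules described above can be sketched as follows. This is an illustrative standalone sketch, not the actual ClientQuotaControlManager code; the class and method names are invented:

```java
import java.util.Set;

public class QuotaValidation {
    // SCRAM mechanism names are not client quotas, even though the internal
    // ZK representation historically treated both as "client configs".
    static final Set<String> SCRAM_MECHANISMS =
        Set.of("SCRAM-SHA-256", "SCRAM-SHA-512");

    static void validateQuota(String name, double value) {
        if (SCRAM_MECHANISMS.contains(name))
            throw new IllegalArgumentException(
                name + " is a SCRAM mechanism name, not a client quota name");
        if (value <= 0)
            throw new IllegalArgumentException(
                "quota " + name + " must be strictly positive, got " + value);
    }
}
```

Note this runs at input-validation time only; records already in the metadata log are applied as before.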
The sslEngine.setUseClientMode(false) call was duplicated in DefaultSslEngineFactory.java when creating an SSLEngine in server mode. This patch removes the duplicated call.
Reviewers: maulin-vasavada <maulin.vasavada@gmail.com>, Divij Vaidya <diviv@amazon.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
We were handling the complex ListOffsets workflow by chaining together MetadataCall and ListOffsetsCall instances, which involved a lot of complex and error-prone logic. In this PR we rewrote it on top of the `AdminApiDriver` infrastructure. Notable improvements over the old logic:
1. Retry the lookup stage on `NOT_LEADER_OR_FOLLOWER` and `LEADER_NOT_AVAILABLE`, whereas in the past we failed the partition directly without retrying.
2. Removed the class field `supportsMaxTimestamp` and calculate it on the fly to avoid mutable state; this doesn't change any client behavior.
3. Retry the fulfillment stage on any `RetriableException`, whereas in the past we only retried the fulfillment stage on `InvalidMetadataException`; this means we now also retry on `TimeoutException` and other `RetriableException`s.
We also added `handleUnsupportedVersionException` to `AdminApiHandler` and `AdminApiLookupStrategy`; these keep consistency with the old logic, and we can continue to improve them.
Reviewers: Ziming Deng <dengziming1993@gmail.com>, David Jacot <djacot@confluent.io>
KafkaStatusBackingStore used infinite retry logic in the producer send callback, which could lead to a stack overflow.
To avoid the problem, a background thread was added, and the callback now submits the retry onto that background thread.
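The shape of the fix can be sketched as below. This is an illustrative standalone sketch with invented names, not the actual KafkaStatusBackingStore code; `send.run()` stands in for `producer.send(record, callback)`:

```java
import java.util.concurrent.*;

public class RetryOffload {
    // Retrying recursively inside the send callback can grow the call stack
    // without bound; instead, the failure path schedules the retry on a
    // dedicated background thread, so each attempt starts from a fresh stack.
    private final ScheduledExecutorService retryThread =
        Executors.newSingleThreadScheduledExecutor(task -> {
            Thread t = new Thread(task, "status-store-retry");
            t.setDaemon(true);
            return t;
        });

    void sendWithRetry(Runnable send, int attemptsLeft) {
        try {
            send.run();
        } catch (RuntimeException e) {
            if (attemptsLeft > 0)
                retryThread.schedule(() -> sendWithRetry(send, attemptsLeft - 1),
                                     10, TimeUnit.MILLISECONDS);
        }
    }
}
```

Each retry is a fresh task on the executor rather than a deeper stack frame, which is the essence of the fix.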
Added a check for an ongoing transaction.
Added a thread to send and receive verify-only AddPartitionsToTxn requests.
Code to send on the request thread courtesy of @artemlivshits.
Reviewers: Artem Livshits <alivshits@confluent.io>, Jun Rao <junrao@gmail.com>
The SnapshotReader exposes the "last contained log time". This is mainly used during snapshot cleanup. The previous implementation used the append time of the snapshot record. This is not accurate, as that is the time when the snapshot was created, not the log append time of the last record included in the snapshot.
The log append time of the last record included in the snapshot is stored in the header control record of the snapshot. The header control record is the first record of the snapshot.
To be able to read this record, this change extends the RecordsIterator to decode and expose the control records in the Records type.
Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Jordan Moore <crikket.007@gmail.com>, Chris Egerton <fearthecellos@gmail.com>
Implement KIP-900
Update kafka-storage to be able to add SCRAM records to the bootstrap metadata file at format time so that SCRAM is enabled at initial start (bootstrap) of KRaft cluster. Includes unit tests.
Update ./core/src/test/scala/integration/kafka/api/SaslScramSslEndToEndAuthorizationTest.scala to use bootstrap and
enable the test to run with both ZK and KRaft quorum.
Moved the single test from ScramServerStartupTest.scala into SaslScramSslEndToEndAuthorizationTest.scala. That test is really small, so there was no point in recreating all the bootstrap startup for a 5-line test when it could easily run elsewhere.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>
Best-effort rack alignment for sticky assignors when both consumer racks and partition racks are available with the protocol changes introduced in KIP-881. Rack-aware assignment is enabled by configuring client.rack for consumers. The assignment builders attempt to align on racks on a best-effort basis, but prioritize balanced assignment over rack alignment.
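Rack-aware assignment is opt-in via consumer configuration. A minimal consumer config sketch (the rack value and group name are placeholders):

```properties
# Enable rack-aware assignment (KIP-881): the consumer reports its rack
# to the group coordinator via client.rack.
client.rack=us-east-1a
group.id=my-group
partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor
```

If `client.rack` is unset, or partition replica racks are unavailable, the assignors fall back to their usual rack-unaware behavior.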
Reviewers: David Jacot <djacot@confluent.io>
When `client.rack` is configured for consumers, we perform rack-aware consumer partition assignment to improve locality. After or during reassignments, replica racks may change, so to ensure optimal consumer assignment, the leader triggers a rebalance when the set of racks of any partition changes.
Reviewers: David Jacot <djacot@confluent.io>
On startup, we always update the metadata, and the topic ID goes from null to defined. Move the epoch-is-null check before the topic ID check to prevent log spam.
Reviewers: David Jacot <djacot@confluent.io>, Jason Gustafson <jason@confluent.io>
In this PR, I implemented the committed API. Here are the specifics:
* the CommitRequestManager handles committed() request.
* I implemented an UnsentOffsetFetchRequestState to dedupe requests, because we don't want to send identical requests repeatedly.
* I implemented the retry mechanism: some retriable errors are retried automatically.
* ClientResponse errors are handled in the handlers.
* Some of the top-level APIs were refactored lightly.
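The deduplication idea can be sketched as sharing one in-flight future per identical request. This is an illustrative standalone sketch with invented names, not the actual CommitRequestManager/UnsentOffsetFetchRequestState code:

```java
import java.util.*;
import java.util.concurrent.*;

public class DedupedFetch {
    // Callers asking for the committed offsets of the same partition set
    // share a single in-flight future instead of re-sending the request.
    private final Map<Set<String>, CompletableFuture<Map<String, Long>>> inflight =
        new HashMap<>();

    synchronized CompletableFuture<Map<String, Long>> fetchCommitted(Set<String> partitions) {
        return inflight.computeIfAbsent(partitions, p -> new CompletableFuture<>());
    }

    // Invoked when the broker response arrives: complete and drop the future.
    synchronized void complete(Set<String> partitions, Map<String, Long> offsets) {
        CompletableFuture<Map<String, Long>> f = inflight.remove(partitions);
        if (f != null)
            f.complete(offsets);
    }
}
```

Two concurrent `committed()` calls for the same partitions then resolve from the same response.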
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Fix for an NPE caused by referring to a local variable instead of the instance variable of the deserializers.
Co-authored-by: Robert Yokota <1761488+rayokota@users.noreply.github.com>
Reviewers: Robert Yokota <1761488+rayokota@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
* KAFKA-14365: Extract common logic from Fetcher
Extract logic from Fetcher into AbstractFetcher.
Also introduce FetchConfig as a more concise way to delineate state from
incoming configuration.
Formalized the defaults in CommonClientConfigs and ConsumerConfig to be
accessible elsewhere.
* Removed overridden methods in favor of synchronizing where needed
Reviewers: Guozhang Wang <wangguoz@gmail.com>
This patch is the first part of KIP-903. It updates the FetchRequest to include the new tagged ReplicaState field which replaces the now deprecated ReplicaId field. The FetchRequest version is bumped to version 15 and the MetadataVersion to 3.5-IV1.
Reviewers: David Jacot <djacot@confluent.io>
This commit refactors AbstractStickyAssignor without changing any logic, to make it easier to add rack-awareness. The class currently consists of a lot of collections that are passed around various methods, with some methods updating some collections. Adding rack-awareness would make this class, which already has very large methods, even more complex and harder to read. The new code moves the two assignment methods into their own classes so that the state can be maintained as instance fields rather than local variables.
Reviewers: David Jacot <djacot@confluent.io>
The Fetcher class is used internally by the KafkaConsumer to fetch records from the brokers. There is ongoing work to create a new consumer implementation with a significantly refactored threading model. The threading refactor work requires a similarly refactored Fetcher.
This task includes refactoring Fetcher by extracting out the inner classes into top-level (though still in internal) so that those classes can be referenced by forthcoming refactored fetch logic.
Reviewers: Philip Nee <philipnee@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Minor changes to `Sender` and `NetworkClient` so that we can log timeouts during `ProduceRequest` with a more precise error message, denoting a timeout vs. "generic" network error.
Reviewers: Philip Nee <pnee@confluent.io>, Guozhang Wang <guozhang@apache.org>, David Jacot <djacot@confluent.io>
A binary value (array of bytes) can be a BinaryNode or a TextNode. When it is a BinaryNode, the method binaryValue() always returns non-null. When it is a TextNode, the method binaryValue() will return non-null if the value is a base64 string. For all other JSON nodes binaryValue() returns null.
Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
The goal of this PR is to add more tests to the PrototypeAsyncConsumer to test
* Successful startup and shutdown.
* Commit.
I also added integration tests:
* Test commitAsync()
* Test commitSync()
Note that I still need to implement committed() to test if commitSync() has been successfully committed.
Additional things:
Change KafkaConsumer<K, V> to Consumer<K, V> to use different implementations
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Guozhang Wang <wangguoz@gmail.com>
Part 1 of KIP-890
I've updated the API spec and related classes.
Clients should only be able to send requests up to version 3, and this is enforced by using a client builder.
Version 4 and above requests only require cluster permissions, as they are initiated from other brokers. API version 4 is marked as unstable for now.
I've added tests for the batched requests and for the verifyOnly mode.
Also -- minor change to the KafkaApis method to properly match the request name.
Reviewers: Jason Gustafson <jason@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, David Jacot <djacot@confluent.io>
To avoid mistakes during dynamic broker config updates that could potentially affect clients, we restrict changes that can be performed dynamically without a broker restart. For broker keystore updates, we require the DN to be the same for the old and new certificates, since it could contain host names used for host name verification by clients. DNs are compared using the standard Java implementation of X500Principal.equals(), which compares canonical names. If the tags of fields change from one with a printable string representation to one without, or vice versa, the canonical-name check fails even if the actual name is the same, since the canonical representation converts to hex only for some tags. We can relax the verification to allow dynamic updates in this case by accepting the update if either the canonical name or the RFC2253 string representation of the DN matches.
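The relaxed comparison can be sketched with the standard `javax.security.auth.x500.X500Principal` API. This is an illustrative sketch, not the actual broker code; the helper name is invented:

```java
import javax.security.auth.x500.X500Principal;

public class DnMatch {
    // Accept the keystore update if either the canonical form matches
    // (X500Principal.equals compares canonical names) or the RFC2253
    // string representation matches.
    static boolean dnMatches(X500Principal oldDn, X500Principal newDn) {
        return oldDn.equals(newDn)
            || oldDn.getName(X500Principal.RFC2253)
                    .equals(newDn.getName(X500Principal.RFC2253));
    }
}
```

The interesting case is a DN whose canonical form differs (e.g. due to hex-encoding of fields without a printable-string representation) while the RFC2253 form is identical; the second clause accepts it.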
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Kalpesh Patel <kpatel@confluent.io>
This commit adds support to store the SCRAM credentials in a cluster with KRaft quorum servers and
no ZK cluster backing the metadata. This includes creating ScramControlManager in the controller,
and adding support for SCRAM to MetadataImage and MetadataDelta.
Change UserScramCredentialRecord to contain only a single tuple (name, mechanism, salt, pw, iter)
rather than a mapping between name and a list. This will avoid creating an excessively large record
if a single user has many entries. Because record ID 11 (UserScramCredentialRecord) has not been
used before, this is a compatible change. SCRAM will be supported in 3.5-IV0 and later.
This commit does not include KIP-900 SCRAM bootstrapping support, or updating the credential cache
on the controller (as opposed to broker). We will implement these in follow-on commits.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
Throw an exception directly from the foreground thread's callers upon abnormal exit of the heartbeat thread.
Reviewers: Luke Chen <showuon@gmail.com>, Philip Nee <philipnee@gmail.com>
Best-effort rack alignment for range assignor when both consumer racks and partition racks are available with the protocol changes introduced in KIP-881. Rack-aware assignment is enabled by configuring client.rack for consumers. Balanced assignment per topic is prioritized over rack-alignment. For topics with equal partitions and the same set of subscribers, co-partitioning is prioritized over rack-alignment.
Reviewers: David Jacot <djacot@confluent.io>
In AbstractCoordinator#joinGroupIfNeeded, the joinGroup request is retried without proper backoff due to the expired timer. This is an uncommon scenario and possibly only appears during testing, but I think it makes sense to require the client to drive the join group via poll.
Reviewers: Guozhang Wang <wangguoz@gmail.com>