The original behavior was implemented to match the classic consumer, whose ConsumerCoordinator does the same when handling the OffsetFetchResponse. This behavior is being updated for the legacy coordinator as part of KAFKA-17279, to retry on all retriable errors.
We should review and update the CommitRequestManager to align with this, and retry on all retriable errors, which seems sensible when fetching offsets.
The corresponding PR for the classic consumer is #16826
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Currently there are four handler functions for handling ShareAcknowledge responses: ShareConsumeRequestManager defined an interface and the respective handlers implemented it. Using AcknowledgeRequestType, we can instead merge the code into just two handler functions, one for ShareAcknowledge success and one for ShareAcknowledge failure, eliminating the need for the interface.
This PR also fixes a bug: while handling the ShareAcknowledge response, we were using an outdated time instead of the time at which the response was received.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
In AsyncKafkaConsumer, the FindCoordinatorRequest is sent by the background thread. MockClient#prepareResponseFrom only matches a response to a future request, so under a race condition a FindCoordinatorResponse may not match any FindCoordinatorRequest. It's better to call MockClient#prepareResponseFrom before the request is sent, to avoid a flaky test.
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Per KIP-289, the use of an empty value for group.id configuration was deprecated back in 2.2.0.
In 3.7, the AsyncKafkaConsumer implementation will throw an error (see KAFKA-14438).
This task is to update the LegacyKafkaConsumer implementation to throw an error in 4.0.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
AccessTokenRetrieverFactory uses the value of sasl.oauthbearer.header.urlencode provided by the user, or null if no value was provided for that configuration. When the HttpAccessTokenRetriever is created and the JVM attempts to unbox that value into a boolean, a NullPointerException is thrown.
The fix is to check the Boolean explicitly and, if it's null, use Boolean.FALSE.
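A minimal sketch of the fix, assuming the value arrives through a generic config map (the names here are illustrative, not the exact factory code):

```java
// Hedged sketch: avoid auto-unboxing a possibly-null Boolean.
// `configs` and the way the value is fetched are assumptions for illustration.
Boolean urlencodeValue = (Boolean) configs.get("sasl.oauthbearer.header.urlencode");
boolean urlencodeHeader = urlencodeValue != null ? urlencodeValue : Boolean.FALSE;
```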
Reviewers: bachmanity1 <81428651+bachmanity1@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>
Fix to ensure that the HB request to leave the group is generated when closing the HBRequestManager if the state is LEAVING. This is needed because we could end up closing the network thread without giving the HB manager a chance to generate the request. Consider this flow on consumer.close() with a short timeout:
1. app thread triggers releaseAssignmentAndLeaveGroup
2. background thread transitions to LEAVING
2.1 the next run of the background thread should poll the HB manager and generate a request
3. app thread: releaseAssignmentAndLeaveGroup times out and moves on to close the network thread (stops polling managers; calls pollOnClose to gather the final requests and send them along with the unsent ones)
If 3 happens in the app thread before 2.1 happens in the background, the HB manager won't have a chance to generate the leave request. This PR implements pollOnClose to generate the final request if needed, as sketched below.
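A rough sketch of that pollOnClose, with hypothetical names (the real manager API may differ):

```java
// Hypothetical sketch: generate the final leave-group heartbeat when polled on close
public PollResult pollOnClose(long currentTimeMs) {
    if (membershipManager.state() == MemberState.LEAVING) {
        UnsentRequest leaveHeartbeat = makeHeartbeatRequest(currentTimeMs); // hypothetical helper
        return new PollResult(0, Collections.singletonList(leaveHeartbeat));
    }
    return PollResult.EMPTY; // nothing to send
}
```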
Reviewers: Kirk True <kirk@kirktrue.pro>, TaiJuWu <tjwu1217@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
The patch adds support for altering and describing group configs in kafka-configs.sh.
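For illustration, a hedged sketch of the equivalent Admin API call; the GROUP resource type is what this support builds on, and `consumer.session.timeout.ms` is an assumed example key:

```java
// Hedged sketch: alter a group config programmatically (example key is an assumption)
static void alterGroupConfig(Admin admin) throws Exception {
    ConfigResource group = new ConfigResource(ConfigResource.Type.GROUP, "my-group");
    AlterConfigOp op = new AlterConfigOp(
        new ConfigEntry("consumer.session.timeout.ms", "45000"), AlterConfigOp.OpType.SET);
    admin.incrementalAlterConfigs(Map.of(group, List.of(op))).all().get();
}
```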
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
This is a follow-up after KIP-919, extending support for BOOTSTRAP_CONTROLLERS_CONFIG to both Admin#alterPartitionReassignments and Admin#listPartitionReassignments.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Refactor the heartbeat request managers for consumer groups and share groups so that almost all of the code is shared.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Add a version check on the client side when building a ListOffsetsRequest for a specific timestamp:
1) the version must be >=8 if timestamp=-4L (EARLIEST_LOCAL_TIMESTAMP)
2) the version must be >=9 if timestamp=-5L (LATEST_TIERED_TIMESTAMP)
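A hedged sketch of the two checks above; the timestamp constants exist on ListOffsetsRequest, while the surrounding variable names are assumptions:

```java
// Client-side check before building the request
if (timestamp == ListOffsetsRequest.EARLIEST_LOCAL_TIMESTAMP && version < 8)
    throw new UnsupportedVersionException(
        "EARLIEST_LOCAL_TIMESTAMP requires ListOffsets version 8 or above");
if (timestamp == ListOffsetsRequest.LATEST_TIERED_TIMESTAMP && version < 9)
    throw new UnsupportedVersionException(
        "LATEST_TIERED_TIMESTAMP requires ListOffsets version 9 or above");
```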
Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
Add an integration test for share group list and describe admin operations.
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
- Describe the different behavior of nonexistent resources. For example: a nonexistent broker causes a timeout; a nonexistent topic produces an UnknownTopicOrPartitionException; a nonexistent group returns static/default configs; client_metrics returns empty configs.
- Update the out-of-date description of the supported resources ("topic and broker resource types are currently supported").
- Add some JUnit tests.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
Currently we were not updating the result count when we merged commitAsync() requests into one batch in ShareConsumeRequestManager, so fewer acknowledgements were sent to the application thread (ShareConsumerImpl) than expected.
Fix: if the acknowledge response came from a commitAsync(), we do not wait for other requests to complete; we always prepare a background event to be sent.
This PR also fixes a bug in ShareConsumeRequestManager so that the final ShareAcknowledge sent during close() also picks up any piggybacked acknowledgements which were waiting to be sent along with a ShareFetch.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
Avoids stream allocation on a hot code path in Admin#listOffsets.
This patch avoids allocating the Stream reference pipeline and spliterator by explicitly allocating a pre-sized Node[] and using a for loop with int induction over the given List of IDs.
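The pattern, as a minimal sketch (nodeById is a hypothetical lookup; this is not the actual Admin code):

```java
// Pre-sized array plus int-indexed loop: no Stream pipeline or spliterator allocation
static Node[] toNodeArray(List<Integer> ids) {
    Node[] nodes = new Node[ids.size()];
    for (int i = 0; i < ids.size(); i++) {
        nodes[i] = nodeById(ids.get(i)); // hypothetical per-id lookup
    }
    return nodes;
}
```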
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Kirk True <kirk@kirktrue.pro>, David Arthur <mumrah@gmail.com>
This patch extends the DescribeConfigs API to support group configs.
Reviewers: Andrew Schofield <aschofield@confluent.io>, David Jacot <djacot@confluent.io>
Implemented the handleShareAcknowledge request RPC in KafkaApis.scala. This method is called whenever the client sends a ShareAcknowledge request to the broker. The acknowledge logic is handled asynchronously and the results are handled appropriately.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Jun Rao <junrao@gmail.com>
This patch adds resources to store and handle consumer group configs.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, David Jacot <djacot@confluent.io>
In MetadataVersion 3.7-IV2 and above, the broker's AssignmentsManager sends an RPC to the
controller informing it about which directory we have chosen to place each new replica on.
Unfortunately, the code does not check to see if the topic still exists in the MetadataImage before
sending the RPC. It will also retry infinitely. Therefore, after a topic is created and deleted in
rapid succession, we can get stuck including the now-defunct replica in our subsequent
AssignReplicasToDirsRequests forever.
In order to prevent this problem, the AssignmentsManager should check if a topic still exists (and
is still present on the broker in question) before sending the RPC. In order to prevent log spam,
we should not log any error messages until several minutes have gone past without success.
Finally, rather than creating a new EventQueue event for each assignment request, we should simply
modify a shared data structure and schedule a deferred event to send the accumulated RPCs. This
will improve efficiency.
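A generic sketch of that accumulate-and-defer pattern (all names here are hypothetical, not the actual AssignmentsManager API):

```java
// Shared data structure holding the pending assignments
private final Map<TopicIdPartition, Uuid> pending = new ConcurrentHashMap<>();

void onAssignment(TopicIdPartition partition, Uuid directoryId) {
    pending.put(partition, directoryId);
    scheduleDeferredFlush(); // hypothetical: re-scheduling the same deferred event coalesces calls
}

void flush() {
    Map<TopicIdPartition, Uuid> batch = new HashMap<>(pending);
    pending.keySet().removeAll(batch.keySet());
    sendAssignReplicasToDirsRequest(batch); // hypothetical sender for the accumulated RPC
}
```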
Reviewers: Igor Soarez <i@soarez.me>, Ron Dagostino <rndgstn@gmail.com>
The initial drop of ShareMembershipManager contained a lot of code duplicated from MembershipManagerImpl. The plan was always to share as much code as possible between the membership managers for consumer groups and share groups. This issue refactors the membership managers so that almost all of the code is in common.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia Chuan Yu <yujuan476@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
This change implements the KRaft voter sending UpdateVoter request. The
UpdateVoter RPC is used to update a voter's listeners and supported
kraft versions. The UpdateVoter RPC is sent if the replicated voter set
(VotersRecord in the log) doesn't match the local voter's supported
kraft versions and controller listeners.
To not starve the Fetch request, the UpdateVoter request is sent at most once
every 3 fetch timeouts. This is required to make sure that replication
is making progress and eventually the voter set in the replicated log
matches the local voter configuration.
This change also modifies the semantic for UpdateVoter. Now the
UpdateVoter response is sent right after the leader has created the new
voter set. This is required so that updating voter can transition from
sending UpdateVoter request to sending Fetch request. If the leader
waits for the VotersRecord control record to commit before sending the
UpdateVoter response, it may never send the UpdateVoter response. This
can happen if the leader needs that voter's Fetch request to commit the
control record.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This PR adds support for add-controller and remove-controller in the kafka-metadata-quorum.sh
command-line tool. It also fixes some minor server-side bugs that blocked the tool from working.
In kafka-metadata-quorum.sh, the implementation of remove-controller is fairly straightforward. It
just takes some command-line flags and uses them to invoke AdminClient. The add-controller
implementation is a bit more complex because we have to look at the new controller's configuration
file. The parsing logic for the advertised.listeners and listeners server configurations that we
need was previously implemented in the :core module. However, the gradle module where
kafka-metadata-quorum.sh lives, :tools, cannot depend on :core. Therefore, I moved listener parsing
into SocketServerConfigs.listenerListToEndPoints. This will be a small step forward in our efforts
to move Kafka configuration out of :core.
I also made some minor changes in kafka-metadata-quorum.sh and kafka-storage.sh to handle
--help without displaying a backtrace on the screen, and give slightly better error messages on
stderr. Also, in DynamicVoter.toString, we now enclose the host in brackets if it contains a colon
(as IPv6 addresses can).
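The bracketing rule, as a small sketch (not the exact DynamicVoter code):

```java
// Bracket hosts that contain a colon so IPv6 literals remain parseable in host:port text
static String endpointText(String host, int port) {
    String hostText = host.contains(":") ? "[" + host + "]" : host;
    return hostText + ":" + port;
}
```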
This PR fixes our handling of clusterId in addRaftVoter and removeRaftVoter, in two ways. Firstly,
it marks clusterId as nullable in the AddRaftVoterRequest.json and RemoveRaftVoterRequest.json
schemas, as it was always intended to be. Secondly, it allows AdminClient to optionally send
clusterId, by using AddRaftVoterOptions and RemoveRaftVoterOptions. We now also remember to
properly set timeoutMs in AddRaftVoterRequest. This PR adds unit tests for
KafkaAdminClient#addRaftVoter and KafkaAdminClient#removeRaftVoter, to make sure they are sending
the right things.
Finally, I fixed some minor server-side bugs that were blocking the handling of these RPCs.
Firstly, ApiKeys.ADD_RAFT_VOTER and ApiKeys.REMOVE_RAFT_VOTER are now marked as forwardable so that
forwarding from the broker to the active controller works correctly. Secondly,
org.apache.kafka.raft.KafkaNetworkChannel has now been updated to enable API_VERSIONS_REQUEST and
API_VERSIONS_RESPONSE.
Co-authored-by: Murali Basani <muralidhar.basani@aiven.io>
Reviewers: José Armando García Sancio <jsancio@apache.org>, Alyssa Huang <ahuang@confluent.io>
In ShareConsumeRequestManager, currently every time we perform a commitSync/commitAsync/acknowledgeOnClose we create one AcknowledgeRequestState for each call. But this can be optimised further as we can batch up the acknowledgements to be sent to the same node before the next poll() is invoked.
This will ensure that between two polls, the acknowledgements are accumulated into one request per node and then sent during poll, resulting in fewer RPC calls.
To achieve this, we store a pair of acknowledge request states for every node: the first denotes the request state for commitAsync(), and the second denotes the request state for commitSync() and acknowledgeOnClose(). All acknowledgements destined for a particular node are stored in the corresponding AcknowledgeRequestState based on whether they were synchronous or asynchronous.
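A hedged sketch of that bookkeeping (type and method names assumed):

```java
// Hypothetical sketch: one async and one sync request state per node
record NodeAckStates(AcknowledgeRequestState asyncState, AcknowledgeRequestState syncState) { }

private final Map<Integer, NodeAckStates> statesByNode = new HashMap<>();

void enqueue(int nodeId, Acknowledgements acks, boolean isAsync) {
    NodeAckStates states = statesByNode.computeIfAbsent(nodeId,
        id -> new NodeAckStates(new AcknowledgeRequestState(), new AcknowledgeRequestState()));
    AcknowledgeRequestState target = isAsync ? states.asyncState() : states.syncState();
    target.addAcknowledgements(acks); // accumulated here, sent as one request on the next poll()
}
```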
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
Handle local log deletion when remote.log.copy.disabled=true based on the KIP-950.
When tiered storage is disabled or becomes read-only on a topic, the local retention configuration becomes irrelevant, and all data expiration follows the topic-wide retention configuration exclusively.
- Added a remoteLogEnabledAndRemoteCopyEnabled method to check whether the topic has tiered storage enabled with remote log copy enabled. We should apply local.retention.ms/bytes only when remote.storage.enable=true and remote.log.copy.disable=false.
- Changed to use retention.bytes/retention.ms when remote copy is disabled.
- Added validation to ask users to set local.retention.ms == retention.ms and local.retention.bytes == retention.bytes
- Added tests
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Satish Duggana <satishd@apache.org>, Christo Lolov <lolovc@amazon.com>
The replicaDirectoryId field for FetchRequest and FetchSnapshotRequest should be ignorable. This allows data objects with the directory id to be serialized to any version of the requests.
Reviewers: José Armando García Sancio <jsancio@apache.org>, Chia-Ping Tsai <chia7712@apache.org>
Adds the methods to the admin client for listing and describing share groups.
There are some unit tests, but integration tests will be in a follow-on PR.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
The actual type name in the JSON descriptor files is "bool", not "boolean".
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Andrew Schofield <andrew_schofield@live.com>
Add support for handling the update voter RPC. The update voter RPC is used to automatically update
the voter's supported kraft versions and available endpoints as the operator upgrades and
reconfigures the KRaft controllers.
The UpdateVoter RPC is handled as follows:
1. Check that the leader has fenced the previous leader(s) by checking that the HWM is known;
otherwise, return the REQUEST_TIMED_OUT error.
2. Check that the cluster supports kraft.version 1; otherwise, return the UNSUPPORTED_VERSION error.
3. Check that there are no uncommitted voter changes, otherwise return the REQUEST_TIMED_OUT error.
4. Check that the updated voter still supports the currently finalized kraft.version; otherwise
return the INVALID_REQUEST error.
5. Check that the updated voter is still listening on the default listener.
6. Append the updated VotersRecord to the log. The KRaft internal listener will read this
uncommitted record from the log and update the voter in the set of voters.
7. Wait for the VotersRecord to commit using the majority of the voters. Return a REQUEST_TIMED_OUT
error if it doesn't commit in time.
8. Send the UpdateVoter successful response to the voter.
This change also implements the ability for the leader to update its own entry in the voter
set when it becomes leader for an epoch. This is done by updating the voter set and writing a
control batch as the first batch in a new leader epoch.
Finally, fix a bug in KafkaAdminClient's handling of removeRaftVoterResponse where we tried to cast
the response to the wrong type.
Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
The purpose of this PR is to remove ConsumerTestBuilder.java since it is no longer needed. The following PRs have eliminated the use of ConsumerTestBuilder:
#14930, #16140, #16200, #16312
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This PR supports EarliestLocalSpec and LatestTierSpec in GetOffsetShell, and adds integration tests.
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, PoAn Yang <payang@apache.org>
Fix the log on consumer fatal error to show the member ID only if present. If there is no member ID, the log now clearly indicates that the member has no member ID (instead of showing an empty string as it used to).
Apply the same fix consistently to all other log lines that include the member ID.
Reviewers: Kirk True <kirk@kirktrue.pro>, PoAn Yang <payang@apache.org>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
1. Use oldestAllowedVersion 9 if using ListOffsetsRequest#EARLIEST_LOCAL_TIMESTAMP or ListOffsetsRequest#LATEST_TIERED_TIMESTAMP.
2. Add test cases to ListOffsetsRequestTest#testListOffsetsRequestOldestVersion to make sure requireTieredStorageTimestamp returns 9 as minVersion.
3. Add EarliestLocalSpec and LatestTierSpec to OffsetSpec.
4. Add more cases to KafkaAdminClient#getOffsetFromSpec.
5. Add testListOffsetsEarliestLocalSpecMinVersion and testListOffsetsLatestTierSpecSpecMinVersion to KafkaAdminClientTest to make sure request builder has oldestAllowedVersion as 9.
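For illustration, a hedged sketch of using the new specs through Admin#listOffsets (the factory-method names are assumptions):

```java
// Hedged sketch: query the earliest-local offset for a partition
TopicPartition tp = new TopicPartition("my-topic", 0);
Map<TopicPartition, OffsetSpec> specs = Map.of(tp, OffsetSpec.earliestLocal()); // assumed factory name
ListOffsetsResult result = admin.listOffsets(specs);
long earliestLocalOffset = result.partitionResult(tp).get().offset();
```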
Signed-off-by: PoAn Yang <payang@apache.org>
Reviewers: Luke Chen <showuon@gmail.com>
describe --status now includes directory id and endpoint information for voters and observers.
describe --replication now includes directory id.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@apache.org>
The Kafka consumer client registers node/connection latency metrics in Selector.java, but the values for those metrics are never recorded. This appears to have been an issue since inception.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>
As mentioned in the ticket, the config property name "rebalanceTimeoutMs" is misleading, since the property only limits the client's commit portion of the process. It is used in MembershipManagerImpl as a means to limit the client's efforts when it is repeatedly trying to commit but failing. Accordingly, the property name has been updated to "commitTimeoutDuringReconciliation" (as suggested in the ticket) in GroupRebalanceConfig and in all other classes where the property/variable is used.
Reviewers: Kirk True <kirk@kirktrue.pro>, Chia-Ping Tsai <chia7712@gmail.com>
As discussed in #16657 (comment), we should make loggers static to avoid creating multiple logger instances.
I used the regex private.*Logger.*LoggerFactory to search for all loggers and check whether each needs to be static.
There are some exceptions where loggers don't need to be static:
1) Loggers in inner classes, since Java 8 doesn't support static fields in inner classes.
https://github.com/apache/kafka/blob/trunk/clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetchRequestManagerTest.java#L3676
2) Custom loggers for each instance (non-static + non-final). In this case, multiple logger instances are actually needed.
https://github.com/apache/kafka/blob/trunk/storage/src/test/java/org/apache/kafka/server/log/remote/storage/LocalTieredStorage.java#L166
3) Loggers initialized in the constructor by LogContext. Many non-static but final loggers fall into this category, which is why I used .*LoggerFactory to only check the loggers that are assigned an initial value at declaration.
4) protected final Logger log = Logger.getLogger(getClass())
This pattern lets a subclass log under its own name instead of the superclass's name.
But if the log's access modifier is private, that purpose cannot be achieved, since the subclass cannot access the log defined in the superclass. So if the access modifier is private, we can replace getClass() with <className>.class.
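For reference, the resulting patterns look roughly like this (class names are illustrative):

```java
// Typical case: one shared logger per class, static and final
private static final Logger LOG = LoggerFactory.getLogger(SomeManager.class);

// Category 3: per-instance logger created from a LogContext (non-static, but final)
private final Logger log;
SomeManager(LogContext logContext) {
    this.log = logContext.logger(SomeManager.class);
}

// Category 4: protected non-static logger so subclasses log under their own name;
// if the field must be private, use the concrete class literal instead of getClass()
protected final Logger logger = LoggerFactory.getLogger(getClass());
```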
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Extend LeaderChangeMessage schema to support version 1 of the message. The leader will continue to write version 0 of the schema. This is needed so that in the future the leader can write version 1 of the message and be guaranteed that all of the replicas in the cluster support version 1 of the schema.
Reviewers: José Armando García Sancio <jsancio@apache.org>
Implement the RemoveVoter RPC. The general algorithm is as follows:
1. Check that the leader has fenced the previous leader(s) by checking that the HWM is known;
otherwise return the REQUEST_TIMED_OUT error.
2. Check that the cluster supports kraft.version 1; otherwise return the UNSUPPORTED_VERSION error.
3. Check that there are no uncommitted voter changes; otherwise return the REQUEST_TIMED_OUT error.
4. Append the updated VotersRecord to the log. The KRaft internal listener will read this uncommitted
record from the log and add the new voter to the set of voters.
5. Wait for the VotersRecord to commit using the majority of the new set of voters. Return a
REQUEST_TIMED_OUT error if it doesn't commit in time.
6. Send the RemoveVoter successful response to the client.
7. Resign the leadership if the leader is not in the new voter set.
Note that now that KRaft supports both the RemoveVoter and AddVoter RPCs, only one change can be
pending at a time. This is achieved in the following ways: the AddVoter RPC checks whether there is
a pending AddVoter or RemoveVoter RPC; the RemoveVoter RPC checks whether there is a pending
AddVoter or RemoveVoter RPC; and both RPCs check that there is no uncommitted VotersRecord.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
In this PR I have completely migrated HeartbeatRequestManagerTest away from ConsumerTestBuilder. I removed all instances of spy objects and used mocks wherever possible. 31/31 tests are passing. I also removed one test that already existed in another test file.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This change implements the AddVoter RPC. The high-level algorithm is as follows:
1. Check that the leader has fenced the previous leader(s) by checking that the HWM is known;
otherwise, return the REQUEST_TIMED_OUT error.
2. Check that the cluster supports kraft.version 1; otherwise return the UNSUPPORTED_VERSION error.
3. Check that there are no uncommitted voter changes; otherwise return the REQUEST_TIMED_OUT error.
4. Check that the new voter's id is not part of the existing voter set, otherwise return the
DUPLICATE_VOTER error.
5. Send an API_VERSIONS RPC to the first (default) listener to discover the supported kraft.version
of the new voter.
6. Check that the new voter supports the current kraft.version, otherwise return the
INVALID_REQUEST error.
7. Check that the new voter is caught up to the log end offset of the leader, otherwise return a
REQUEST_TIMED_OUT error.
8. Append the updated VotersRecord to the log. The KRaft internal listener will read this
uncommitted record from the log and add the new voter to the set of voters.
9. Wait for the VotersRecord to commit using the majority of the new set of voters. Return a
REQUEST_TIMED_OUT error if it doesn't commit in time.
10. Send the AddVoter successful response to the client.
The leader implements the above algorithm by tracking 3 events: the ADD_VOTER request is received,
the API_VERSIONS response is received and finally the HWM is updated.
The state of the ADD_VOTER operation is tracked by LeaderState using the AddVoterHandlerState. The
algorithm is implemented by the AddVoterHandler type.
This change also fixes a small issue introduced by the bootstrap checkpoint (0-0.checkpoint). The
internal partition listener (KRaftControlRecordStateMachine) and the external partition listener
(KafkaRaftClient.ListenerContext) were using "nextOffset = 0" as the initial state of the reading
cursor. This was causing the bootstrap checkpoint to keep getting reloaded until the leader wrote a
record to the log. Changing the initial offset (nextOffset) to -1 allows the listeners to
distinguish between the initial state (nextOffset == -1) and the state where the bootstrap
checkpoint has been loaded but the log segment is empty (nextOffset == 0).
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Improvement to ensure that the consumer unsubscribe operation waits for a response to the leave group request before moving on to close the consumer. This makes it consistent with the behaviour of the legacy consumer.
This will avoid undesired interactions on close, which triggers a leave group and shuts down the network thread when it completes (and which, before this PR, could lead to responses to disconnected clients).
Note that this PR does not change the transitions of the state machine on leave group, only the completion of the leave group future.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Kirk True <ktrue@confluent.io>
Introduce the KRaftVersion enum to describe the current value of kraft.version. Change a bunch of places in the code that were using raw shorts over to using this new enum.
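A hedged sketch of what such an enum looks like (values inferred from kraft.version 0 and 1; the actual class may differ):

```java
// Typed replacement for raw shorts
public enum KRaftVersion {
    KRAFT_VERSION_0((short) 0), // static voter set from controller.quorum.voters
    KRAFT_VERSION_1((short) 1); // dynamic voter set (KIP-853)

    private final short featureLevel;

    KRaftVersion(short featureLevel) {
        this.featureLevel = featureLevel;
    }

    public short featureLevel() {
        return featureLevel;
    }
}
```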
In BrokerServer.scala, fix a bug that could cause null pointer exceptions during shutdown if we tried to shut down before fully coming up.
Do not send finalized features that are finalized at level 0, since that is a no-op.
Reviewers: dengziming <dengziming1993@gmail.com>, José Armando García Sancio <jsancio@apache.org>
This is the initial version of the share group consumer client code. It implements the complete ShareConsumer interface.
There are unit tests, but not integration tests yet since those would depend upon complete broker code, which is not available at this point.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Lianet Magrans <lianetmr@gmail.com>
As part of KIP-584, brokers expose a range of supported versions for each feature. For example, metadata.version might be supported from 1 to 21. (Note that feature level ranges are always inclusive, so this would include both level 1 and 21.)
These supported ranges are supposed to be able to include 0. For example, it should be possible for a broker to support a kraft.version between 0 and 1. However, in older software versions, there is an assertion in org.apache.kafka.common.feature.SupportedVersionRange that prevents this. This causes problems when the older software attempts to deserialize an ApiVersionsResponse containing such a range.
In order to resolve this dilemma, this PR bumps the version of ApiVersionsRequest from 3 to 4. Clients which send v4 promise to be able to handle ranges including 0. Clients which send v3 will not be exposed to these ranges. The feature will show up as having a minimum version of 1 instead. This work is part
of KIP-1022.
Similarly, this PR also introduces a new version of BrokerRegistrationRequest, and specifies that the
older versions of that RPC cannot handle supported version ranges including 0. Therefore, 0 is translated to 1 in the older requests.
Reviewers: Jun Rao <junrao@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
We recently increased that log's verbosity from DEBUG to INFO, but this was causing some confusion among users when an idle connection is disconnected, which is normal activity. This change makes it clear that the connection being disconnected has expired.
Reviewers: Dimitar Dimitrov <ddimitrov@confluent.io>, Justine Olshan <jolshan@confluent.io>
Add a remote.log.disable.policy config, at the topic level only, as part of KIP-950.
Reviewers: Kamal Chandraprakash <kchandraprakash@uber.com>, Luke Chen <showuon@gmail.com>, Murali Basani <muralidhar.basani@aiven.io>
The response data should change according to the input; however, with the current design it does
not change even if the input changes. We remove this cache logic to avoid returning wrong data.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>, Igor Soarez <soarez@apple.com>
Update the walkFileTree override implementation to handle parallel file deletion, so as to
prevent the Kafka broker process from crashing when logs are deleted by other threads due to
retention expiry.
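A hedged sketch of the tolerant visitor (not the exact Kafka code):

```java
// Keep walking when a file disappears mid-walk instead of crashing the broker
Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
        if (exc instanceof NoSuchFileException)
            return FileVisitResult.CONTINUE; // deleted concurrently by retention; ignore
        throw exc;
    }
});
```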
Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Igor Soarez <soarez@apple.com>
This change implements response handling for the new version of Vote, Fetch, FetchSnapshot, BeginQuorumEpoch and EndQuorumEpoch. All of these responses were extended to include the leader's endpoint when the leader is known.
This change also includes sending the new version of the requests for Vote, Fetch, FetchSnapshot, BeginQuorumEpoch and EndQuorumEpoch. The two most notable changes are that:
1. The leader is going to include all of its endpoints in the BeginQuorumEpoch and EndQuorumEpoch requests.
2. The replica is going to include the destination replica key for the Vote and BeginQuorumEpoch request.
QuorumState was extended so that the replica transitions to UnattachedState instead of FollowerState during startup, if the leader is known but the leader's endpoint is not known. This can happen if the known leader is not part of the voter set replicated by the replica. The expectation is that the replica will rediscover the leader from Fetch responses from the bootstrap servers or from the BeginQuorumEpoch request from the leader.
To make sure that replicas never forget the leader of a given epoch the unattached state was extended to allow an optional leader id for when the leader is known but the leader's endpoint is not known.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch is an attempt at simplifying GroupMetadataManager#consumerGroupHeartbeat and GroupMetadataManager#classicGroupJoinToConsumerGroup by sharing more of the common logic. It slightly changes how static members are replaced too: now, we generate the records to replace the member and then update the member if needed.
Reviewers: Dongnuo Lyu <dlyu@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
Finishes the migration of MembershipManagerImplTest away from ConsumerTestBuilder and removes all spy objects.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Philip Nee <pnee@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Prior to KIP-853, users were not allowed to enumerate listeners specified in `controller.listener.names` in `advertised.listeners`. This decision was made in 3.3 because the `controller.quorum.voters` property is in effect the list of advertised listeners for all of the controllers.
KIP-853 is moving away from `controller.quorum.voters` in favor of a dynamic set of voters. This means that users need a way of specifying the advertised listeners for controllers.
This change allows users to specify listener names from `controller.listener.names` in `advertised.listeners`. To make this change forwards compatible (a valid configuration from 3.8 remains valid in 3.9), the controller's advertised listeners are computed by looking up the endpoint in `advertised.listeners`; if it doesn't exist, the controller looks up the endpoint in the `listeners` configuration.
This change also includes a fix to the BeginQuorumEpoch request, where the default value for VoterId was 0 instead of -1.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Create 3 new metadata versions:
- 3.8-IV0, for the upcoming 3.8 release.
- 3.9-IV0, to add support for KIP-1005.
- 3.9-IV1, as the new release vehicle for KIP-966.
Create ListOffsetsRequest v9, which will be used in 3.9-IV0 to support KIP-1005. v9 is currently an unstable API version.
Reviewers: Jun Rao <junrao@gmail.com>, Justine Olshan <jolshan@confluent.io>
Implement request handling for the updated versions of the KRaft RPCs (Fetch, FetchSnapshot, Vote,
BeginQuorumEpoch and EndQuorumEpoch). This doesn't add support for KRaft replicas to send the new
version of the KRaft RPCs. That will be implemented in KAFKA-16529.
All of the RPCs responses were extended to include the leader's endpoint for the listener of the
channel used in the request. EpochState was extended to include the leader's endpoint information
but only the FollowerState and LeaderState know the leader id and its endpoint(s).
For the Fetch request, the replica directory id was added. The leader now tracks the follower's log
end offset using both the replica id and replica directory id.
For the FetchSnapshot request, the replica directory id was added. This is not used by the KRaft
leader and it is there for consistency with Fetch and for help debugging.
For the Vote request, the replica key for both the voter (destination) and the candidate (source)
were added. The voter key is checked for consistency. The candidate key is persisted when the vote
is granted.
For the BeginQuorumEpoch request, all of the leader's endpoints are included. This is needed so
that the voters can return the leader's endpoint for all of the supported listeners.
For the EndQuorumEpoch request, all of the leader's endpoints are included. This is needed so that
the voters can return the leader's endpoint for all of the supported listeners. The successor list
has been extended to include the directory id. Receiving voters can use the entire replica key when
searching their position in the successor list.
Updated the existing tests in KafkaRaftClientTest and KafkaRaftClientSnapshotTest to execute using
both the old version and new version of the RPCs.
Reviewers: Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
KIP-932 introduces FindCoordinator v6 for finding share coordinators. The initial implementation:
- Checks that share coordinators are only requested with v6 or above.
- Authorizes share coordinator requests as cluster actions (this is for inter-broker use only).
- Responds with COORDINATOR_NOT_AVAILABLE because share coordinators are not yet available.
When the share coordinator code is delivered, the request handling will be gated by configurations which enable share groups and the share coordinator specifically. If these are not enabled, COORDINATOR_NOT_AVAILABLE is the response.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Add the cause of TimeoutException for Producer send() errors.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Artem Livshits <alivshits@confluent.io>, Justine Olshan <jolshan@confluent.io>
This PR fixes consumer close to avoid updating the subscription state object in the app thread. Now the close simply triggers an UnsubscribeEvent that is handled in the background to trigger callbacks, clear assignment, and send leave heartbeat. Note that after triggering the event, the unsubscribe will continuously process background events until the event completes, to ensure that it allows for callbacks to run in the app thread.
The logic around what happens if the unsubscribe fails remains unchanged: close will log, keep the first exception, and carry on.
It also removes the redundant LeaveOnClose event (it used to do the exact same thing as the UnsubscribeEvent, with both calling membershipMgr.leaveGroup).
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
In `GroupMetadataManager#toTopicPartitions`, we generate a list of `ConsumerGroupHeartbeatRequestData.TopicPartitions` from the input deserialized subscription. Currently the input subscription is `ConsumerPartitionAssignor.Subscription`, where the topic partitions are stored as (topic, partition) pairs, whereas `ConsumerGroupHeartbeatRequestData.TopicPartitions` needs the topic partitions stored as (topic, partition list) pairs.
`ConsumerProtocolSubscription` is an intermediate data structure in the deserialization where the topic partitions are stored as (topic, partition list) pairs. This PR uses `ConsumerProtocolSubscription` instead as the input subscription to make `toTopicPartitions` more efficient.
Reviewers: David Jacot <djacot@confluent.io>
We observed that some runs of the test suite caused CI pipelines to stall.
A thread dump revealed that the test runner was blocked trying to read from a
socket, while attempting to close the socket [[0]]. It turns out this is
due to a bug in the JDK which is very similar to JDK-8274524, but it affects
the else branch of `SSLSocketImpl::bruteForceCloseInput` [[1]] which wasn't
fixed in JDK-8274524.
Since the blocking happens in a native call, the test runner's timeouts have
no effect as the blocked test runner thread doesn't seem to respond to
interrupts.
As a mitigation in Kafka's test suite, this change adds `SO_TIMEOUT` of
30 seconds to all the TLS sockets handled by `EchoServer`. The timeout is
reasonably high for tests and a finite upper bound avoids infinite
blocking of the test suite.
[0]: https://issues.apache.org/jira/secure/attachment/13066427/timeout.log
[1]: 890adb6410/src/java.base/share/classes/sun/security/ssl/SSLSocketImpl.java (L808)
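The mitigation itself is essentially one line per accepted socket (EchoServer specifics elided):

```java
// Bound every blocking TLS read so a JDK-level hang cannot stall the suite indefinitely
Socket socket = serverSocket.accept();
socket.setSoTimeout(30_000); // blocked reads now fail with SocketTimeoutException
```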
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Define the interfaces and RPCs for share-group persistence. (KIP-932). This PR is just RPCs and interfaces to allow building of the broker components which depend upon them. The implementation will follow in subsequent PRs.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
Allow the committed offsets fetch to run for as long as needed. This handles the case where a user invokes Consumer.poll() with a very small timeout (including zero).
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
In a previous PR (#16048), I mistakenly excluded the underscore (_) from the set of valid characters for the protocol,
resulting in the inability to correctly parse the connection string for SASL_PLAINTEXT. This bug fix addresses the
issue and includes corresponding tests.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Luke Chen <showuon@gmail.com>
Fixes for the leave group flow (unsubscribe/close):
- Fix to send the heartbeat to leave the group on close, even if the callbacks fail.
- Fix to ensure that if a member gets fenced while blocked on callbacks (e.g. on unsubscribe), it clears its epoch so it is not included in commit requests.
- Fix to avoid a race on the subscription state object on unsubscribe, updating it only on the background thread when the callbacks to leave complete (success or failure).
Also improves logging in this area.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Philip Nee <pnee@confluent.io>
Completely migrates ConsumerNetworkThreadTest away from ConsumerTestBuilder, removing all usages of spy objects and replacing them with mocks. Removes testEnsureMetadataUpdateOnPoll() since it was doing integration testing. Also adds new tests to get more complete test coverage of ConsumerNetworkThread.
Reviewers: Kirk True <kirk@kirktrue.pro>, Lianet Magrans <lianetmr@gmail.com>, Philip Nee <pnee@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This commit implements KIP-899: Allow producer and consumer clients to rebootstrap. It introduces the new setting `metadata.recovery.strategy`, applicable to all the types of clients.
Reviewers: Greg Harris <gharris1727@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
KAFKA-16570 FenceProducers API returns "unexpected error" when successful
* Client handling of ConcurrentTransactionsException as retriable
* Unit test
* Integration test
Reviewers: Chris Egerton <chrise@aiven.io>, Justine Olshan <jolshan@confluent.io>
This patch is the continuation of https://github.com/apache/kafka/pull/15964. It introduces records coalescing to the CoordinatorRuntime. It also introduces a new configuration `group.coordinator.append.linger.ms` which allows administrators to choose the linger time or disable it with zero. The new configuration defaults to 10ms.
Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
Add support for KIP-953 KRaft Quorum reconfiguration in the DescribeQuorum request and response.
Also add support to AdminClient.describeQuorum, so that users will be able to find the current set of
quorum nodes, as well as their directories, via these RPCs.
Reviewers: Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Andrew Schofield <aschofield@confluent.io>
Propagate metadata errors from the background thread to the application thread.
This fix ensures that metadata errors are thrown to the user on consumer.poll().
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Philip Nee <pnee@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
Fix to complete a Future which was stuck because exception.getCause() threw an error.
The fix in #16217 unblocked the blocked thread, but the exception handled in the catch block was wrapped in an ExecutionException only for the blocking get() call, which is not the case in the async workflow; hence getCause() is not needed.
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This PR includes changes for AsyncKafkaConsumer to avoid evaluating the subscription regex on every poll if metadata hasn't changed. The metadataVersionSnapshot was introduced to identify whether metadata has changed; if it has, the current subscription regex is re-evaluated.
This is the same mechanism used by the LegacyConsumer.
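A hedged sketch of the check (field and method names assumed):

```java
// Re-evaluate the subscription regex only when metadata has actually changed
if (metadataVersionSnapshot != metadata.updateVersion()) {
    metadataVersionSnapshot = metadata.updateVersion();
    updatePatternSubscription(metadata.fetch()); // evaluate the regex against current topics
}
```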
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Matthias J. Sax <matthias@confluent.io>
Improve consistency and correctness for user-provided timeouts at the Consumer network request layer, per the Java client Consumer timeouts design (https://cwiki.apache.org/confluence/display/KAFKA/Java+client+Consumer+timeouts). While the changes introduced in KAFKA-15974 enforce timeouts at the Consumer's event layer, this change enforces timeouts at the network request layer.
The changes mostly fit into the following areas:
1. Create shared code and idioms so timeout handling logic is consistent across current and future RequestManager implementations
2. Use deadlineMs instead of expirationMs, expirationTimeoutMs, retryExpirationTimeMs, timeoutMs, etc.
3. Update "preemptive pruning" to remove expired requests that have had at least one attempt
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Bruno Cadonna <cadonna@apache.org>
Remove usage of the partition.assignment.strategy config in the new consumer. This config is deprecated with the new consumer protocol, so the AsyncKafkaConsumer should not use or validate the property.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Implement the add voter, remove voter, and update voter RPCs for
KIP-853. This is just adding the RPC handling; the current
implementation in RaftManager just throws UnsupportedVersionException.
Reviewers: Andrew Schofield <aschofield@confluent.io>, José Armando García Sancio <jsancio@apache.org>
Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers,
or options.timeoutMs if specified, as the transaction timeout.
No transaction will be started with this timeout, but
ReplicaManager.appendRecords uses this value as its timeout.
Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append
to allow for replication to take place.
Co-Authored-By: Adrian Preston <prestona@uk.ibm.com>
This PR introduces ShareConsumer and KafkaShareConsumer. It is focused entirely on the minimal additions required to introduce the external programming interfaces.
Reviewers: Apoorv Mittal <amittal@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
This PR defines the initial set of RPCs for KIP-932. The RPCs for the admin client and state management are not in this PR.
Reviewers: Apoorv Mittal <amittal@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
Fix the bug where the heartbeat is not sent when a newly created consumer is immediately closed.
When there is a heartbeat request in flight and the consumer is then closed, the HeartbeatRequestManager does not correctly send the closing heartbeat, because a previous heartbeat request is still in flight. The closing heartbeat is only sent once, so in this situation the broker will not know that the consumer has left the consumer group until the consumer's heartbeat times out.
This situation causes the broker to wait until the consumer's heartbeat times out before triggering a consumer group rebalance, which in turn affects message consumption.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
As part of KIP-1022, I have created an interface for all the new features to be used when parsing the command line arguments, doing validations, getting default versions, etc.
I've also added the --feature flag to the storage tool to show how it will be used.
Created a TestFeatureVersion to show an implementation of the interface (besides MetadataVersion which is unique) and added tests using this new test feature.
I will add the unstable config and tests in a followup.
Reviewers: David Mao <dmao@confluent.io>, David Jacot <djacot@confluent.io>, Artem Livshits <alivshits@confluent.io>, Jun Rao <junrao@apache.org>
Rate reports values in the form sumOrCount/monitoredWindowSize. It has a bug in the monitoredWindowSize calculation, which leads to spikes in the resulting values.
Reviewers: Jun Rao <junrao@gmail.com>
Implements KIP-1036.
Adds raw ConsumerRecord data to RecordDeserializationException to make DLQ implementation easier.
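A hedged sketch of a DLQ handler; keyBuffer()/valueBuffer() are assumed KIP-1036-style accessors, and the producer is assumed to use ByteBufferSerializer for key and value:

```java
try {
    process(consumer.poll(Duration.ofSeconds(1)));
} catch (RecordDeserializationException e) {
    // Forward the raw bytes to a dead-letter topic, then skip the poison record
    producer.send(new ProducerRecord<>("orders-dlq", e.keyBuffer(), e.valueBuffer())); // assumed accessors
    consumer.seek(e.topicPartition(), e.offset() + 1);
}
```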
Reviewers: Kirk True <ktrue@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Handle the FencedInstanceIdException that a consumer may receive in the heartbeat response. This will be the case when a static consumer is removed from the group by an admin client and another member joins with the same group.instance.id (which is allowed). The first member will receive a FencedInstanceIdException on its next heartbeat. The expectation is that this should be handled as a fatal error.
There are no actual changes in logic with this PR, given that without being handled, the FencedInstanceId was being treated as an "unexpected error", which are all treated as fatal errors, so the outcome remains the same. But we're introducing this small change just for accuracy in the logic and the logs: FencedInstanceId is expected during heartbeat, a log line is shown describing the situation and why it happened (and it's treated as a fatal error, just like it was before this PR).
This PR also improves the test to ensure that the error propagated to the app thread matches the one received in the HB.
Reviewers: Andrew Schofield <aschofield@confluent.io>, David Jacot <djacot@confluent.io>
The intention of the CompletableApplicationEvent is for a Consumer to enqueue the event and then block, waiting for it to complete. The application thread will block up to the amount of the timeout. This change introduces a consistent manner in which events are expired, by checking their timeout values.
The CompletableEventReaper is a new class that tracks CompletableEvents that are enqueued. Both the application thread and the network I/O thread maintain their own reaper instances. The application thread will track any CompletableBackgroundEvents that it receives and the network I/O thread will do the same with any CompletableApplicationEvents it receives. The application and network I/O threads will check their tracked events, and if any are expired, the reaper will invoke each event's CompletableFuture.completeExceptionally() method with a TimeoutException.
On closing the AsyncKafkaConsumer, both threads will invoke their respective reapers to cancel any unprocessed events in their queues. In this case, the reaper will invoke each event's CompletableFuture.completeExceptionally() method with a CancellationException instead of a TimeoutException to differentiate the two cases.
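A minimal sketch of a reaping pass (names assumed; not the actual CompletableEventReaper):

```java
// Expire any tracked event whose deadline has passed, then drop completed events
void reap(long currentTimeMs) {
    for (CompletableEvent<?> event : tracked) {
        if (event.deadlineMs() <= currentTimeMs)
            event.future().completeExceptionally(
                new TimeoutException("event not completed within its timeout"));
    }
    tracked.removeIf(e -> e.future().isDone());
}
```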
The overall design for the expiration mechanism is captured on the Apache wiki and the original issue (KAFKA-15848) has more background on the cause.
Note: this change only handles the event expiration and does not cover the network request expiration. That is handled in a follow-up Jira (KAFKA-16200) that builds atop this change.
This change also includes some minor refactoring of the EventProcessor and its implementations. This allows the event processor logic to focus on processing individual events rather than also the handling of batches of events.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Philip Nee <pnee@confluent.io>, Bruno Cadonna <cadonna@apache.org>
When client telemetry is configured in a cluster, Kafka producers and consumers push metrics to the brokers periodically. There is a special push of metrics that occurs when the client is terminating. A state machine in the client telemetry reporter controls its behaviour in different states.
Sometimes, when a client was terminating, it attempted an invalid state transition from TERMINATING_PUSH_IN_PROGRESS to PUSH_NEEDED when it received a response to a PushTelemetry RPC. This was essentially harmless, because the state transition did not occur, but it did cause unsightly log lines to be generated. This PR checks for the terminating states when receiving the response and simply remains in the current state.
I added a test to validate the state management in this case. Actually, the test passes before the code change in the PR, but with unsightly log lines.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <amittal@confluent.io>
The AsyncKafkaConsumer would occasionally fail to stop when wakeup() was invoked. It turns out that there's a race condition between the thread that invokes wakeup() and the thread that is performing an action on the consumer. If the operation's Future is already completed by thread A when thread B invokes completeExceptionally() inside wakeup(), the WakeupException will be ignored. We should use the return value from completeExceptionally() to determine whether that call actually triggered completion of the Future. If that method returns false, it signals that the Future was already completed and the exception we passed to completeExceptionally() was ignored. Therefore, we then need to return a new WakeupFuture instead of null, so that the next call to setActiveTask() will throw the WakeupException.
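A hedged sketch of the race fix (WakeupFuture per the description above; other names assumed):

```java
// completeExceptionally() returns true only if this call transitioned the Future
boolean triggered = activeFuture.completeExceptionally(new WakeupException());
if (triggered)
    return null; // the in-flight operation will observe the WakeupException
// The Future was already completed, so our exception was ignored: remember the wakeup
return new WakeupFuture(); // next setActiveTask() throws the pending WakeupException
```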
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
This PR fixes two kinds of bugs in the new(ish) rack-aware part of the sticky assignment algorithm:
First, when reassigning "owned partitions" to their previous owners, we now have to take the rack placement into account and might not immediately assign a previously-owned partition to its old consumer during this phase. There is a small chance this partition will be assigned to its previous owner during a later stage of the assignment, but if it's not then by definition it has been "revoked" and must be removed from the assignment during the adjustment phase of the CooperativeStickyAssignor according to the cooperative protocol. We need to make sure any partitions removed in this way end up in the "partitionsTransferringOwnership".
Second, the sticky algorithm works in part by keeping track of how many consumers are still "unfilled" when they are at the "minQuota", meaning we may need to assign one more partition to get to the expected number of consumers at the "maxQuota". During the rack-aware round-robin assignment phase, we were not properly clearing the set of unfilled & minQuota consumers once we reached the expected number of "maxQuota" consumers (since by definition that means no more minQuota consumers need to or can be given any more partitions, as that would bump them up to maxQuota and exceed the expected count). This bug would result in the entire assignment being failed due to a correctness check at the end which verifies that the "unfilled members" set is empty before returning the assignment. An IllegalStateException would be thrown, failing the rebalancing and sending the group into an endless rebalancing loop until/unless it was lucky enough to produce a new assignment that didn't hit this bug.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Validate that a control batch in the batch accumulator has at least one control record.
Reviewers: Jun Rao <junrao@apache.org>, Chia-Ping Tsai <chia7712@apache.org>
Improve the consumer log for an expired poll timer by showing by how much max.poll.interval.ms was exceeded. This should be helpful in guiding the user to tune that config in the common case of long-running processing causing the consumer to leave the group. Inspired by other clients that log this information in the same situation.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Matthias Sax <mjsax@apache.org>,
Andrew Schofield <andrew_schofield@live.com>, Kirk True <kirk@kirktrue.pro>
Fix to allow initializing positions for newly assigned partitions while the onPartitionsAssigned callback is running, even though the partitions remain non-fetchable until the callback completes.
Before this PR, we were not allowing initialization or fetching while the callback was running. The fix here only allows initializing the newly assigned partition position, and keeps the existing logic for making sure that the partition remains non-fetchable until the callback completes.
The need for this fix came out in one of the Connect system tests, which attempts to retrieve a newly assigned partition position with a call to consumer.position from within the onPartitionsAssigned callback (WorkerSinkTask). With this PR, we allow such calls (test added), matching the behaviour of the legacy consumer.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
The contract of KafkaConsumer.poll(Duration) says that it throws InterruptException "if the calling thread is interrupted before or while this function is called". The new KafkaConsumer implementation was not doing this if the thread was interrupted before the poll was called, specifically with a very short timeout. If it ever waited for records, it did check the thread state. If it did not wait for records because of a short timeout, it did not.
Some of the log messages in the code erroneously mentioned timeouts, when they really meant interruption.
Also adds a test for this specific scenario.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>