TestUtils.tempDirectory already registers a shutdown hook for deleting the temp directory. There's no reason to also call File.deleteOnExit, since that just registers another hook to do the same thing.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Fix for the known issue that the logic for updating fetch positions in the new consumer was being performed partly in the app thread, party in the background thread, potentially leading to race conditions on the subscription state.
This PR moves the logic for updateFetchPositions to the background thread as a single event (instead of triggering separate events to validate, fetchOffsets, listOffsets). A new UpdateFetchPositionsEvent is triggered from the app thread and processed in the background, where it performs those same operations and updates the subscription state accordingly, without blocking the background thread.
This PR maintains the existing logic for keeping a pendingOffsetFetchRequest that does not complete within the lifetime of the updateFetchPositions attempt, and may be used on the next call to updateFetchPositions.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
Currently the code in ShareConsumeRequestManager works on the basis that there can only be one commitSync()/close() at a time. But there is a chance these calls timeout on the application thread, but are still sent later on the background thread. This will mean the incoming commitSync()/close() will not be processed, resulting in possible loss of acknowledgements.
To cover this case, we will now have a list of AcknowledgeRequestStates to store the commitSyncs() and a separate requestState to store the close(). This queue will be processed one by one until its empty. For close(), we are still assuming there can only be one active close() at a time.
eviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
This is the equivalent of #16755 for the share group consumer.
The node request-latency-max and request-latency-avg were not being recorded and thus reported as NaN for the share group consumer.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
The 3.8 controller assumes the unknown features have min version = 0, but KAFKA-17011 replace the min=0 by min=1 when BrokerRegistrationRequest < 4. Hence, to support upgrading from 3.8.0 to 3.9, this PR changes the implementation of ApiVersionsResponse (<4) and BrokerRegistrationRequest (<4) to skip features with supported minVersion of 0 instead of replacing 0 with 1
Reviewers: Jun Rao <junrao@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
Introduces the share coordinator. This coordinator is built on the new coordinator runtime framework. It
is responsible for persistence of share-group state in a new internal topic named "__share_group_state".
The responsibility for being a share coordinator is distributed across the brokers in a cluster.
Reviewers: David Arthur <mumrah@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
The original behavior was implemented to maintain the behavior of the Classic consumer, where the ConsumerCoordinator would do the same when handling the OffsetFetchResponse. This behavior is being updated for the legacy coordinator as part of KAFKA-17279, to retry on all retriable errors.
We should review and update the CommitRequestManager to align with this, and retry on all retriable errors, which seems sensible when fetching offsets.
The corresponding PR for classic consumer is #16826
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Currently there are 4 handler functions present for handling ShareAcknowledge responses. ShareConsumeRequestManager had an interface and the respective handlers would implement it. Instead of having 4 different handlers for this, now using AcknowledgeRequestType, we could merge the code and have only 2 handler functions, one for ShareAcknowledge success and one for ShareAcknowledge failure, eliminating the need for the interface.
This PR also fixes a bug - We were not using the time at which the response was received while handling the ShareAcknowledge response, we were using an outdated time. Now the bug is fixed.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
In AsyncKafkaConsumer, FindCoordinatorRequest is sent by background thread. In MockClient#prepareResponseFrom, it only matches the response to a future request. If there is some race condition, FindCoordinatorResponse may not match to a FindCoordinatorRequest. It's better to put MockClient#prepareResponseFrom before the request to avoid flaky test.
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Per KIP-289, the use of an empty value for group.id configuration was deprecated back in 2.2.0.
In 3.7, the AsyncKafkaConsumer implementation will throw an error (see KAFKA-14438).
This task is to update the LegacyKafkaConsumer implementation to throw an error in 4.0.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
AccessTokenRetrieverFactory uses the value of sasl.oauthbearer.header.urlencode provided by the user, or null if no value was provided for that configuration. When the HttpAccessTokenRetriever is created the JVM attempts to unbox the value into a boolean, a NullPointerException is thrown.
The fix is to explicitly check the Boolean, and if it's null, use Boolean.FALSE.
Reviewers: bachmanity1 <81428651+bachmanity1@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>
Fix to ensure that the HB request to leave the group is generated when closing the HBRequestManager if the state is LEAVING. This is needed because we could end up closing the network thread without giving a chance to the HBManager to generate the request. This flow on consumer.close with short timeout:
1. app thread triggers releaseAssignmentAndLeaveGroup
2. background thread transitions to LEAVING
2.1 the next run of the background thread should poll the HB manager and generate a request
3. app thread releaseAssignmentAndLeaveGroup times out and moves on to close the network thread (stops polling managers. Calls pollOnClose to gather the final requests and send them along with the unsent)
If 3 happens in the app thread before 2.1 happens in the background, the HB manager won't have a chance to generate the request to leave. This PR implements the pollOnClose to generate the final request if needed.
Reviewers: Kirk True <kirk@kirktrue.pro>, TaiJuWu <tjwu1217@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
The patch adds support of alter/describe configs for group in kafka-configs.sh.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
This is a follow-up after KIP-919, extending support for BOOTSTRAP_CONTROLLERS_CONFIG to both Admin#alterPartitionReassignments and Admin#listPartitionReassignments.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Refactor the heartbeat request managers for consumer groups and share groups. Almost all of the code can be shared which is definitely good.
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Add the version check to client side when building ListOffsetRequest for the specific timestamp:
1) the version must be >=8 if timestamp=-4L (EARLIEST_LOCAL_TIMESTAMP)
2) the version must be >=9 if timestamp=-5L (LATEST_TIERED_TIMESTAMP)
Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
Add an integration test for share group list and describe admin operations.
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
- The different behavior of nonexistent resource. For example: nonexistent broker will cause timeout; nonexistent topic will produce UnknownTopicOrPartitionException; nonexistent group will return static/default configs; client_metrics will return empty configs
- The resources (topic and broker resource types are currently supported) this description is out-of-date
- Add some junit test
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
Currently we were not updating the result count when we merged commitAsync() requests into one batch in ShareConsumeRequestManager, so this led to lesser acknowledgements sent to the application thread (ShareConsumerImpl) than expected.
Fix : Now if the acknowledge response came from a commitAsync, then we do not wait for other requests to complete, we always prepare a background event to be sent.
This PR also fixes a bug in ShareConsumeRequestManager, where during the final ShareAcknowledge sent during close(), we also pick up any piggybacked acknowledgements which were waiting to be sent along with ShareFetch.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
Avoids stream allocation on hot code path in Admin#listOffsets
This patch avoids allocating the stream reference pipeline & spliterator for this case by explicitly allocating the pre-sized Node[] and using a for loop with int induction over the specified IDs List argument.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Kirk True <kirk@kirktrue.pro>, David Arthur <mumrah@gmail.com>