Commit Graph

3456 Commits

Author SHA1 Message Date
ShivsundarR 8fde6dedea
KAFKA-18155 : Fix bug in response handler for ShareAcknowledge (#18029)
In the response handler for ShareAcknowledge, we are passing the clientResponse.receivedTimeMs() to the handler methods. But when there is a disconnect or when the response received is null, we should be passing the current time instead.

This bug was causing consumer to hang as it did not call the handler methods on disconnect, and further requests were blocked waiting for its completion.

Reviewers: Andrew Schofield <aschofield@confluent.io>,  Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-12-05 12:59:13 +05:30
Ken Huang 2b43c49f51
KAFKA-18050 Upgrade the checkstyle version to 10.20.2 (#17999)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-05 10:59:18 +08:00
Kirk True 4362ab7090
KAFKA-17947: Update currentLag(), pause(), and resume() to update SubscriptionState in background thread (#17699)
Reviewers: Lianet Magrans <lmagrans@confluent.io>
2024-12-04 21:31:44 -05:00
Lianet Magrans bd0ea70912
KAFKA-18096: Allow join with regex if no matching topics (#18024)
Reviewers: David Jacot <djacot@confluent.io>
2024-12-04 11:35:42 -05:00
Lianet Magrans f60382bf21
KAFKA-18127 Validate SubscriptionPattern used on v0 HB (#17989)
Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-12-04 19:55:12 +08:00
PoAn Yang fe88232b07
KAFKA-17750 Extend kafka-consumer-groups command line tool to support new consumer group (part 1) (#17958)
1) Bump validVersions of ConsumerGroupDescribeRequest.json and ConsumerGroupDescribeResponse.json to "0-1".

2) Add MemberType field to ConsumerGroupDescribeResponse.json. Default value is -1 (unknown). 0 is for classic member and 1 is for consumer member.

3) When ConsumerGroupMember#useClassicProtocol is true, return MemberType field as 0. Otherwise, return 1.

Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-12-04 06:08:39 +08:00
Kuan-Po Tseng ac8b3dfbf0
KAFKA-18060 new coordinator does not handle TxnOffsetCommitRequest with empty member id when using CONSUMER group (#17914)
There are two issues in KAFKA-18060:

1) New coordinator can't handle the TxnOffsetCommitRequest with empty member id, and TxnOffsetCommitRequest v0-v2 do definitely has an empty member ID, causing ConsumerGroup#validateOffsetCommit to throw an UnknownMemberIdException. This prevents the old producer from calling sendOffsetsToTransaction. Note that TxnOffsetCommitRequest versions v0-v2 are included in KIP-896, so it seems the new coordinator should support that operations

2) The deprecated API Producer#sendOffsetsToTransaction does not use v0-v2 to send TxnOffsetCommitRequest with an empty member ID. Unfortunately, it has been released for a while. Therefore, the new coordinator needs to handle TxnOffsetCommitRequest with an empty member ID for all versions.

Taken from the two issues above, we need to handle empty member id in all API versions when new coordinator are dealing with TxnOffsetCommitRequest.

Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-12-04 02:55:19 +08:00
Ken Huang ff44f5e0a5
KAFKA-17554 Flaky testFutureCompletionOutsidePoll in ConsumerNetworkClientTest (#17217)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-03 08:58:56 +08:00
TengYao Chi 6fd951a9c0
KAFKA-17610 Drop alterConfigs (#18002)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-02 23:26:06 +08:00
Manikumar Reddy ae3c5dec99
KAFKA-18013: Add support for duration based offset reset strategy to Kafka Consumer (#17972)
Update AutoOffsetResetStrategy.java to support duration based reset strategy
Update OffsetFetcher related classes and unit tests

Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lmagrans@confluent.io>
2024-11-29 22:38:57 +05:30
Lianet Magrans 6237325fb1
KAFKA-15561 [5/N]: Integration tests for new subscribe API with Re2J pattern (#17964)
- integration tests for new subscribe api with RE2J pattern
- fix to ensure all topics are included in metadata requests when consumer is subscribed to RE2J pattern

Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-30 01:02:39 +08:00
Ken Huang 9d23f89e05
KAFKA-17338 ConsumerConfig should prevent using partition assignors with CONSUMER group protocol (#16899)
Reviewers: Kirk True <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet Magrans <lmagrans@confluent.io>
2024-11-29 09:36:29 -05:00
Andrew Schofield e7bbcdb251
KAFKA-18090: Add ShareMemberDescription and Assignment (#17975)
Introduce ShareMemberDescription and ShareMemberAssignment as distinct classes for share groups. Although the correspondence with consumer groups is fairly close, the concepts are likely to diverge over time and separating these concepts now makes sense.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-29 10:20:01 +05:30
HYUNSANG HAN (한현상, Travis) 700bdd5fee
KAFKA-17997 Remove deprecated config log.message.timestamp.difference.max.ms (#17928)
Reviewers: Divij Vaidya <diviv@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-29 03:15:46 +08:00
Chia-Chuan Yu c446e799be
KAFKA-17010 Remove `DescribeLogDirsResponse#LogDirInfo`, `DescribeLogDirsResponse#ReplicaInfo`, and `DescribeLogDirsResult#all` (#17953)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-28 04:42:34 +08:00
Lianet Magrans a39c984d21
KAFKA-15561 [4/N]: MockConsumer support for SubscriptionPattern (#17962)
Reviewers: David Jacot <djacot@confluent.io>
2024-11-27 14:33:28 -05:00
David Jacot aae42ef656
KAFKA-17593; [9/N] Mark ConsumerGroupHeartbeat API v1 as stable (#17961)
Reviewers: Lianet Magrans <lmagrans@confluent.io>
2024-11-27 13:03:46 -05:00
Lianet Magrans 37b4d9b01d
KAFKA-15561 [3/N]: Client support for SubscriptionPattern in HB (#17951)
Reviewers: David Jacot <djacot@confluent.io>
2024-11-27 12:01:12 -05:00
Ken Huang c32a49549d
MINOR: Remove duplicate valid value in document (#17947)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-27 21:07:36 +08:00
ShivsundarR 866d66229d
KAFKA-18056: Fixed bug in handling commitAsync responses (#17909)
There was a bug in handling the ShareAcknowledgeResponse for commitAsync(). Currently after we receive a response, we send out a background event to the application thread to update the acknowledgement commit callbacks for EVERY TopicIdPartition.
The map that was sent was not cleared after sending the event. This meant we ended up sending responses for partitions that were already sent in the previous event. So there will be duplicate calls to the callback.

The PR fixes the bug and adds a unit test for the same.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-26 20:15:19 +05:30
Lianet Magrans 0b081fc310
KAFKA-15561 [2/N]: Background event and subscription state changes for RE2J pattern (#17918)
Reviewers: David Jacot <djacot@confluent.io>
2024-11-26 14:49:13 +01:00
Andrew Schofield 5480d54d18
KAFKA-17544: Add log message for early access use of KafkaShareConsumer (#17940)
When a KafkaShareConsumer is constructed in AK 4.0, a log message is written warning about the early access nature of the feature.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-26 10:15:43 +05:30
Ritika Reddy 4fc9e442c3
KAFKA-17898: Refine Epoch Bumping Logic (#17849)
With KAFKA-14562, we implemented epoch bump on both the client and the server. Mentioned below are the different epoch bump scenarios we have on hand after enabled tv2

Non-Transactional Producers
• Epoch bumping is always allowed.
• Different code paths are used to handle epoch bumping.

Transactional Producers

No Epoch Bump Allowed
• coordinatorSupportsBumpingEpoch = false when initPIDVersion < 3 or initPIDVersion = null.

Client-Triggered Epoch Bump Allowed
• coordinatorSupportsBumpingEpoch = true when initPIDVersion >= 3.
• TransactionVersion2Enabled = false when endTxnVersion < 5.

Only Server-Triggered Epoch Bump Allowed
• TransactionVersion2Enabled = true and endTxnVersion >= 5.

We want to refine the code and make it more structured to correctly handle epoch bumping in the above mentioned cases.

The changes made in this patch are:

Rename epochBumpRequired to epochBumpTriggerRequired to symbolize a manual epoch bump request from the client
Modify canEpochBump method according to the above mentioned scenarios

Reviewers: Artem Livshits <alivshits@confluent.io>, Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-11-25 14:29:15 -08:00
Bill Bejeck 7f8a592ad1
KAFKA-17869: Adding tests to ensure KIP-1076 doesn't interfere producer metrics[2/3] (#17783)
Adding producer tests to ensure the KIP-1076 methods don't interfere with existing metrics
Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-25 16:24:16 -05:00
Rajini Sivaram 0f33b16fdf
KAFKA-18085: Abort inflight requests on existing connections while rebootstrapping (#17939)
When disconnecting channels before rebootstrapping due to the rebootstrap conditions introduced in KIP-1102, we should ensure that inflight requests are aborted similar to other disconnections like request timeout in clients. With the earlier rebootstrapping from KIP-899, we only rebootstrapped when there were no connections, so no disconnections are required.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-25 17:58:11 +00:00
Andrew Schofield d17a149205
KAFKA-17956 Remove Admin.listShareGroups (#17912)
KIP-1043 introduced Admin.listGroups as the way to list all types of groups. As a result, Admin.listShareGroups has been removed. This PR is the final step of the removal.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-25 22:05:35 +08:00
ClarkChen 54843e6e1e
KAFKA-18077 Remove deprecated JmxReporter(String) (#17923)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-25 21:54:50 +08:00
Manikumar Reddy 3268435fd6
KAFKA-18013: Add AutoOffsetResetStrategy internal class (#17858)
- Deprecates OffsetResetStrategy enum
- Adds new internal class AutoOffsetResetStrategy
- Replaces all OffsetResetStrategy enum usages with AutoOffsetResetStrategy
- Deprecate old/Add new constructors to MockConsumer

 Reviewers: Andrew Schofield <aschofield@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-11-25 19:11:12 +05:30
Lianet Magrans 654ebe10f4
KAFKA-18071: Avoid event to refresh regex if no pattern subscription (#17917)
Reviewers: David Jacot <djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2024-11-24 21:39:11 -05:00
Kuan-Po Tseng b04a498317
MINOR: Enhance error message in KafkaProducer#throwIfInvalidGroupMetadata (#17915)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-23 14:07:12 +08:00
Mickael Maison d5e270482c
MINOR: Various cleanups in clients (#17895)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-22 20:38:31 +01:00
Lianet Magrans 0d7c765981
KAFKA-15561 [1/N]: Introduce new subscribe api for RE2J regex (#17897)
Reviewers: David Jacot <djacot@confluent.io>
2024-11-22 11:58:20 -05:00
Nick Guo 7db4d53f18
KAFKA-12690 Remove deprecated Producer#sendOffsetsToTransaction (#17865)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-22 18:07:10 +08:00
Bill Bejeck 1c998f8ef3
KAFKA-17869: Adding tests to ensure KIP-1076 doesn't interfere with consumer metrics[1/3] (#17781)
Adding tests to ensure the KIP-1076 methods don't interfere with existing metrics in clients

Reviewers: Apoorv Mittal <amittal@confluent.io>, Matthias Sax <mjsax@apache.org>
2024-11-21 13:41:29 -05:00
Andrew Schofield 32c887b05e
KAFKA-17949: Introduce GroupState and replace ShareGroupState (#17763)
This PR introduces the unified GroupState enum for all group types from KIP-1043. This PR also removes ShareGroupState and begins the work to replace Admin.listShareGroups with Admin.listGroups. That will complete in a future PR.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-19 21:17:12 +05:30
David Arthur 5f4cbd4aa4
KAFKA-17767 Automatically quarantine new tests [5/n] (#17725)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-19 09:56:36 +08:00
ShivsundarR eafa78d99d
KAFKA-18016: Modified handling of piggyback acknowledgements in ShareConsumeRequestManager. (#17824)
What
There was a bug in handling piggyback acknowledgements in ShareConsumeRequestManager, where the fetchAcknowledgementsMap could be updated when the request was in flight and when the ShareFetch response is received, we were removing any acknowledgements(without actually sending them) which came when the request was in flight.

Fix
Now we are maintaining 2 separate maps(one which has the acknowledgements to send and one which keeps track of the acknowledgements in flight).

 Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>,  Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-18 17:15:42 +05:30
Andrew Schofield a592912ec9
KAFKA-17663 Add metadata caching in PartitionLeaderStrategy (#17367)
Admin API operations have two phases: lookup and fulfilment. The lookup phase involves a METADATA request whose details depend upon the operation being performed.

For some operations, the METADATA request can be quite expensive to serve. For example, if the user calls Admin.listOffsets for 1000 topics, the METADATA request will include all 1000 topics and the response will contain the leader information for all of these topics. And then the actual fulfilment phase does the real work of the operation.

In cases where a long-running application is performing repeated admin operations which need the same metadata information about partition leadership, it is not necessary to send the METADATA request for every single admin operation.

This PR adds a cache of the mapping from topic-partition to leader id to the admin client. The cache doesn't need to be very sophisticated because the admin client will retry if the information becomes stale, and the cache can be updated as a result of the retry.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-18 14:45:06 +08:00
TengYao Chi e1dcd383bc
KAFKA-17927 Disallow users to configure `max.in.flight.requests.per.connection` bigger than 5 (#17717)
Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-18 14:01:16 +08:00
Ritika Reddy e4c0034679
KAFKA-18019: Make INVALID_PRODUCER_ID_MAPPING a fatal error (#17822)
This patch contains changes to the handling of the INVALID_PRODUCER_ID_MAPPING error.
Quoted from KIP-890
Since we bump epoch on abort, we no longer need to call InitProducerId to fence requests. InitProducerId will only be called when the producer starts up to fence a previous instance.

With this change, some other calls to InitProducerId were inspected including the call after receiving an InvalidPidMappingException. This exception was changed to abortable as part of KIP-360: Improve reliability of idempotent/transactional producer. However, this change means that we can violate EOS guarantees. As an example:

Consider an application that is copying data from one partition to another

Application instance A processes to offset 4
Application instance B comes up and fences application instance A
Application instance B processes to offset 5
Application instances A and B are idle for transaction.id.expiration.ms, transaction id expires on server
Application instance A attempts to process offset 5 (since in its view, that is next) -- if we recover from invalid pid mapping, we can duplicate this processing
Thus, INVALID_PID_MAPPING should be fatal to the producer.

This is consistent with KIP-1050: Consistent error handling for Transactions where errors that are fatal to the producer are in the "application recoverable" category. This is a grouping that indicates to the client that the producer needs to restart and recovery on the application side is necessary. KIP-1050 is approved so we are consistent with that decision.

This PR also fixes the flakiness of TransactionsExpirationTest.

Reviewers:  Artem Livshits <alivshits@confluent.io>, Justine Olshan <jolshan@confluent.io>, Calvin Liu <caliu@confluent.io>
2024-11-17 18:43:04 -08:00
Lianet Magrans 5cf9872e8f
KAFKA-18017: Fix call order for HB error and group manager (#17805)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-17 19:12:25 -05:00
PoAn Yang 5725a51453
KAFKA-16460: New consumer times out consuming records in multiple consumer_test.py system tests (#17777)
Reviewers: Lianet Magrans <lmagrans@confluent.io>
2024-11-15 19:41:39 +01:00
PoAn Yang cc20e78474
KAFKA-17648: AsyncKafkaConsumer#unsubscribe and #close swallow TopicAuthorizationException and GroupAuthorizationException (#17516)
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True <ktrue@confluent.io>
2024-11-15 15:15:26 +01:00
Andrew Schofield 13f497fe87
KAFKA-18009: Remove extra public constructor for KafkaShareConsumer (#17806)
KafkaShareConsumer had a constructor which was marked as public in error and should have been package-private.
2024-11-14 14:05:00 +05:30
Kirk True b6b2c9ebc4
KAFKA-16985: Ensure consumer attempts to send leave request on close even if interrupted (#16686)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet Magrans <lmagrans@confluent.io>, Philip Nee <pnee@confluent.io>
2024-11-13 20:26:40 +01:00
David Arthur 48ff6a6b53
MINOR Fix a few test names (#17788)
Remove or update custom display names to make sure we actually include the test method as the first part of the display name.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bill Bejeck <bill@confluent.io>
2024-11-13 13:28:38 -05:00
Rajini Sivaram 52d2fa5c8b
KAFKA-17885: Enable clients to rebootstrap based on timeout or error code (KIP-1102) (#17720)
Implementation of https://cwiki.apache.org/confluence/display/KAFKA/KIP-1102%3A+Enable+clients+to+rebootstrap+based+on+timeout+or+error+code
- Introduces rebootstrap trigger interval config metadata.recovery.rebootstrap.trigger.ms, set to 5 minutes by default
- Makes rebootstrap the default for metadata.recovery.strategy
- Adds new error code REBOOTSTRAP_REQUIRED, introduces top-level error code in metadata response. On this error, clients rebootstrap.
- Configs apply to producers, consumers, share consumers, admin clients, Connect and KStreams clients.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>,  Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-13 13:01:08 +00:00
Dániel Urbán 8563955e39
KAFKA-17632: Fix RoundRobinPartitioner for even partition counts (#17620)
RoundRobinPartitioner does not handle the fact that on new batch creation, the partition method is called twice.

Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
2024-11-12 15:44:36 +01:00
Ken Huang 207b35901c
KAFKA-17314 Fix the typo: `maxlifeTimeMs` (#17749)
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-12 16:26:29 +08:00
TengYao Chi 81019b6e9f
MINOR: Delete unused member from KafkaAdminClient (#17729)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-09 23:39:55 +08:00