Commit Graph

2879 Commits

Author SHA1 Message Date
Eaugene Thomas 2a41beb0f4
MINOR: Check the existence of AppInfo for the given ID before creating a new mbean of the same name (#14287)
When using kafka consumer in apache pinot , we did see couple of WARN as we are trying to create kafka consumer class with the same name . We currently have to use a added suffix to create a new mBean as each new kafka consumer in same process creates a mBean . Adding support here to skip creation of mBean if its already existing

Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
2023-09-14 14:53:57 +08:00
Lianet Magrans a7e865c0a7
KAFKA-15115 - KAFKA-15163; Reset/Validate positions implementation & API integration (#14346)
Implementation of the required functionality for resetting and validating positions in the new async consumer.

This PR includes:
1. New async application events ResetPositionsApplicationEvent and ValidatePositionsApplicationEvent, both handled by the same OfffsetsRequestManager.
2. Integration of the reset/validate functionality in the new async consumer, to update fetch positions using the partitions offsets.
3. Minor refactoring to extract functionality that is reused from both consumer implementations (moving logic without changes from OffsetFetcher into OffsetFetchUtils, and from OffsetsForLeaderEpochClient into OffsetsForLeaderEpochUtils)

Reviewers: Philip Nee <pnee@confluent.io>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com>
2023-09-13 07:49:41 -07:00
Lianet Magrans 0e47fa7537
KAFKA-15275 - Client state machine basic components, states and initial transitions (#14323)
Initial definition of the core components for maintaining group membership on the client following the new consumer group protocol.

This PR includes:
- Membership management for keeping member state and assignment, based on the heartbeat responses received.
- Assignor selection logic to support server side assignors.
This only includes the basic initial states and transitions, to be extended as we implement the protocol.

This is intended to be used from the heartbeat and assignment requests manager that actually build and process the heartbeat and assignment related requests.

Reviewers: Philip Nee <pnee@confluent.io>, Kirk True <ktrue@confluent.io>, David Jacot <djacot@confluent.io>
2023-09-13 05:07:56 -07:00
Luke Chen 8a7e5e8ea0
MINOR: Fix errors in javadoc and docs in tiered storage (#14379)
Reviewers: Satish Duggana <satishd@apache.org>
2023-09-13 12:45:36 +05:30
David Jacot 6123071fc0
KAFKA-14499: [7/7] Add integration tests for OffsetCommit API and OffsetFetch API (#14353)
This patch adds integration tests for the OffsetCommit API and the OffsetFetch API. The tests runs against the old and the new group coordinator and with the new and the old consumer rebalance protocol.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-09-11 10:48:02 -07:00
Roon 7fc65003fc
KAFKA-15315: Use getOrDefault rather than get in TransactionState (#14167)
Reviewers: Ziming Deng dengziming1993@gmail.com.
2023-09-11 19:30:57 +08:00
Lucas Brutschy 01b91af59c
MINOR: fix currentLag javadoc (#14224)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-09-07 19:25:31 -07:00
Kirk True a2de7d32c8
KAFKA-14274 #1: basic refactoring (#14305)
This change introduces some basic clean up and refactoring for forthcoming commits related to the revised fetch code for the consumer threading refactor project.

Reviewers: Christo Lolov <lolovc@amazon.com>, Jun Rao <junrao@gmail.com>
2023-09-07 15:23:44 -07:00
Colin Patrick McCabe 41b695b6e3
KAFKA-15369: Implement KIP-919: Allow AC to Talk Directly with Controllers (#14306)
Implement KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum and add
Controller Registration. This KIP adds a new version of DescribeClusterRequest which is supported
by KRaft controllers. It also teaches AdminClient how to use this new DESCRIBE_CLUSTER request to
talk directly with the controller quorum. This is all gated behind a new MetadataVersion,
IBP_3_7_IV0.

In order to share the DESCRIBE_CLUSTER logic between broker and controller, this PR factors it out
into AuthHelper.computeDescribeClusterResponse.

The KIP adds three new errors codes: MISMATCHED_ENDPOINT_TYPE, UNSUPPORTED_ENDPOINT_TYPE, and
UNKNOWN_CONTROLLER_ID. The endpoint type errors can be returned from DescribeClusterRequest

On the controller side, the controllers now try to register themselves with the current active
controller, by sending a CONTROLLER_REGISTRATION request. This, in turn, is converted into a
RegisterControllerRecord by the active controller. ClusterImage, ClusterDelta, and all other
associated classes have been upgraded to propagate the new metadata. In the metadata shell, the
cluster directory now contains both broker and controller subdirectories.

QuorumFeatures previously had a reference to the ApiVersions structure used by the controller's
NetworkClient. Because this PR removes that reference, QuorumFeatures now contains only immutable
data. Specifically, it contains the current node ID, the locally supported features, and the list
of quorum node IDs in the cluster.

Reviewers: David Arthur <mumrah@gmail.com>, Ziming Deng <dengziming1993@gmail.com>, Luke Chen <showuon@gmail.com>
2023-09-07 15:21:52 -07:00
Chris Egerton 77a91be22e
KAFKA-15425: Fail fast in Admin::listOffsets when topic (but not partition) metadata is not found (#14314)
This restores previous behavior for Admin::listOffsets, which was to fail immediately if topic metadata could not be found, and only retry if metadata for one or more specific partitions could not be found.

There is a subtle difference here: prior to https://github.com/apache/kafka/pull/13432, the operation would be retried if any metadata error was reported for any individual topic partition, even if an error was also reported for the entire topic. With this change, the operation always fails if an error is reported for the entire topic, even if an error is also reported for one or more individual topic partitions.

I am not aware of any cases where brokers might return both topic- and topic partition-level errors for a metadata request, and if there are none, then this change should be safe. However, if there are such cases, we may need to refine this PR to remove the discrepancy in behavior.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-09-07 14:02:57 -07:00
Max Riedel 90e646052a
KAFKA-14509; [1/2] Define ConsumerGroupDescribe API request and response schemas and classes. (#14124)
This patch adds the schemas of the new ConsumerGroupDescribe API (KIP-848) and adds an handler to the KafkaApis.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>
2023-09-07 08:05:04 -07:00
David Jacot 7054625c45
KAFKA-14499: [6/N] Add MemberId and MemberEpoch to OffsetFetchRequest (#14321)
This patch adds the MemberId and the MemberEpoch fields to the OffsetFetchRequest. Those fields will be populated when the new consumer group protocol is used to ensure that the member fetching the offset has the correct member id and epoch. If it does not, UNKNOWN_MEMBER_ID or STALE_MEMBER_EPOCH are returned to the client.

Our initial idea was to implement the same for the old protocol. The field is called GenerationIdOrMemberEpoch in KIP-848 to materialize this. As a second though, I think that we should only do it for the new protocol. The effort to implement it in the old protocol is not worth it in my opinion.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-09-05 23:36:38 -07:00
Andrew Schofield b49013b73e
KAFKA-9800: Exponential backoff for Kafka clients - KIP-580 (#14111)
Implementation of KIP-580 to add exponential back-off to situations in which retry.backoff.ms
is used to delay backoff attempts. This KIP adds exponential backoff behavior with a maximum
controlled by a new config retry.backoff.max.ms, together with a +/- 20% of jitter to spread the
retry attempts of the client fleet.

Reviewers: Mayank Shekhar Narula <mayanks.narula@gmail.com>, Milind Luthra <i.milind.luthra@gmail.com>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com>
2023-09-05 11:57:51 -07:00
Lianet Magrans 1bb8c11f5a
KAFKA-14965 - OffsetsRequestsManager implementation & API integration (#14308)
Implementation of the OffsetRequestsManager, responsible for building requests and processing responses for requests related to partition offsets.

In this PR, the manager includes support for ListOffset requests, generated when the user makes any of the following consumer API calls:

beginningOffsets
endOffsets
offsetsForTimes
All previous consumer API calls interact with the OffsetsRequestsManager by generating a ListOffsetsApplicationEvent.

Includes tests to cover the new functionality and to ensure no API level changes are introduced.

This covers KAFKA-14965 and KAFKA-15081.

Reviewers: Philip Nee <pnee@confluent.io>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com>
2023-09-01 13:57:17 -07:00
Philip Nee 945d21953e
KAFKA-14875: Implement wakeup (#14118)
Summary
Implemented wakeup() mechanism using a WakeupTrigger class to store the pending wakeup item, and when wakeup() is invoked, it checks whether there's an active task or a wakeup task.

If there's an active task: the task will be completed exceptionally and the atomic reference will be freed up.
If there an wakedup task, which means wakeup() was invoked before a blocking call was issued. Therefore, the current task will be completed exceptionally immediately.

This PR also addressed minor issues such as:

Throwing WakeupException at the right place: As wakeups are thrown by completing an active future exceptionally. The WakeupException is wrapped inside of the ExecutionException.

mockConstruction is a thread-lock mock; therefore, we need to free up the reference before completing the test. Otherwise, other tests will continue using the thread-lock mock.

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com>
2023-08-29 12:03:15 -07:00
Calvin Liu b41b2dfcf2
KAFKA-15353: make sure AlterPartitionRequest.build() is idempotent (#14236)
As described in https://issues.apache.org/jira/browse/KAFKA-15353
When the AlterPartitionRequest version is < 3 and its builder.build is called multiple times, both newIsrWithEpochs and newIsr will all be empty. This can happen if the sender retires on errors.

Reviewers: Luke Chen <showuon@gmail.com>
2023-08-28 17:59:48 +08:00
Maros Orsak 5785796f98
MINOR: Add a few test cases to clients (#14211)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-08-25 11:31:21 +02:00
Phuc-Hong-Tran 8d12c1175c
KAFKA-15152: Fix incorrect format specifiers when formatting string (#14026)
Reviewers: Divij Vaidya <diviv@amazon.com>

Co-authored-by: phuchong.tran <phuchong.tran@servicenow.com>
2023-08-24 19:38:45 +02:00
Mehari Beyene 25b128de81
KAFKA-14991: KIP-937-Improve message timestamp validation (#14135)
This implementation introduces two new configurations `log.message.timestamp.before.max.ms` and `log.message.timestamp.after.max.ms` and deprecates `log.message.timestamp.difference.max.ms`.

The default value for all these three configs is maintained to be Long.MAX_VALUE for backward compatibility but with the newly added configurations we can have a finer control when validating message timestamps that are in the past and the future compared to the broker's timestamp.

To maintain backward compatibility if the default value of `log.message.timestamp.before.max.ms` is not changed, we are assuming users are still using the deprecated config `log.message.timestamp.difference.max.ms` and validation is done using its value. This ensures that existing customers who have customized the value of `log.message.timestamp.difference.max.ms` will continue to see no change in behavior.

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>
2023-08-24 12:04:55 +02:00
Okada Haruki 87a30b73b5
KAFKA-15391: Handle concurrent dir rename which makes log-dir to be offline unexpectedly (#14280)
A race condition between async flush and segment rename (for deletion purpose) might cause the entire log directory to be marked offline when we delete a topic. This PR fixes the bug by ignoring NoSuchFileException when we flush a directory.

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-08-24 10:24:01 +02:00
olalamichelle 9972297e51
KAFKA-14780: Fix flaky test 'testSecondaryRefreshAfterElapsedDelay' (#14078)
"The test RefreshingHttpsJwksTest#testSecondaryRefreshAfterElapsedDelay relies on the actual system clock, which makes it frequently fail. The fix adds a second constructor that allows for passing a ScheduledExecutorService to manually execute the scheduled tasks before refreshing. The fixed task is much more robust and stable.

Co-authored-by: Fei Xie <feixie@MacBook-Pro.attlocal.net>

Reviewers: Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com>
2023-08-24 10:59:16 +08:00
Lianet Magrans e9f358eef6
KAFKA-14937; [2/N]: Refactoring for client code to reduce boilerplate (#14218)
This PR main refactoring relates to :

1. serializers/deserializers used in clients - unified in a Deserializers class
2. logic for configuring ClusterResourceListeners moved to ClientUtils
3. misc refactoring of the new async consumer in preparation for upcoming Request Managers

Reviewers: Jun Rao <junrao@gmail.com>
2023-08-22 21:43:06 -07:00
Proven Provenzano c2759df067
KAFKA-15219: KRaft support for DelegationTokens (#14083)
Reviewers: David Arthur <mumrah@gmail.com>, Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Viktor Somogyi <viktor.somogyi@cloudera.com>
2023-08-19 14:01:08 -04:00
José Armando García Sancio 3f4816dd3e
KAFKA-15345; KRaft leader notifies leadership when listener reaches epoch start (#14213)
In a non-empty log the KRaft leader only notifies the listener of leadership when it has read to the leader's epoch start offset. This guarantees that the leader epoch has been committed and that the listener has read all committed offsets/records.

Unfortunately, the KRaft leader doesn't do this when the log is empty. When the log is empty the listener is notified immediately when it has become leader. This makes the API inconsistent and harder to program against.

This change fixes that by having the KRaft leader wait for the listener's nextOffset to be greater than the leader's epochStartOffset before calling handleLeaderChange.

The RecordsBatchReader implementation is also changed to include control records. This makes it possible for the state machine learn about committed control records. This additional information can be used to compute the committed offset or for counting those bytes when determining when to snapshot the partition.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io>
2023-08-17 18:40:17 -07:00
Philip Nee b97e8203eb
MINOR: CommitRequestManager should only poll when the coordinator node is known (#14179)
As title, we discovered a flaky bug during testing that the commit request manager would seldomly throw a NOT_COORDINATOR exception, which means the request was routed to a non-coordinator node. We discovered that if we don't check the coordinator node in the commitRequestManager, the request manager will pass on an empty node to the NetworkClientDelegate, which implies the request can be sent to any node in the cluster. This behavior is incorrect as the commit requests need to be routed to a coordinator node.

Because the timing coordinator's discovery during integration testing isn't entirely deterministic; therefore, the test became extremely flaky. After fixing this: The coordinator node is mandatory before attempt to enqueue these commit request to the NetworkClient.

Reviewers: Jun Rao <junrao@gmail.com>
2023-08-15 15:01:28 -07:00
Greg Harris 1a001c1e88
KAFKA-15336: Add ServiceLoader Javadocs for Connect plugins (#14194)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-08-15 13:21:45 -07:00
Kirk True 67b527460e
KAFKA-14937: Refactoring for client code to reduce boilerplate (#13990)
Move common code from the client implementations to the ClientUtils
class or (consumer) Utils class, where passible.

There are a number of places in the client code where the same basic
calls are made by more than one client implementation. Minor
refactoring will reduce the amount of boilerplate code necessary for
the client to construct its internal state.


Reviewers: Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com>
2023-08-14 10:08:20 -07:00
vveicc 43751d8d05
KAFKA-15289: Support KRaft mode in RequestQuotaTest (#14201)
Enable kraft mode for RequestQuotaTest, there are 2 works left to be done.

Reviewers: dengziming <dengziming1993@gmail.com>
2023-08-14 17:04:15 +08:00
Philip Nee f6b8b39747
MINOR: Fix committed API in the PrototypeAsyncConsumer timeout (#14123)
Discovered the committed() API timeout during the integration test. After investigation, this is because the future was not completed in the ApplicationEventProcessor. Also added toString methods to the event class for debug purposes.

Reviewers: Jun Rao <junrao@gmail.com>
2023-08-11 13:15:30 -07:00
vveicc 594156e01b
KAFKA-15287: Change NodeApiVersions.create() to support both zk and kraft (#14185)
Reviewers: dengziming <dengziming1993@gmail.com>
2023-08-11 10:18:13 +08:00
Federico Valeri 8de3e0436a
KAFKA-15239: Fix system tests using producer performance service (#14092)
Reviewers: Greg Harris <greg.harris@aiven.io>
2023-08-10 14:23:43 -07:00
vveicc 393b563bb5
KAFKA-15288: Change BrokerApiVersionsCommandTest to support kraft mode (#14175)
Use ApiKeys.clientApis() to replace ApiKeys.zkBrokerApis() to support kraft mode.

Reviewers: dengziming <dengziming1993@gmail.com>
2023-08-10 14:26:37 +08:00
Qichao Chu c72065a632
MINOR: Add test for describe topic with ID (#14110)
* MINOR: Add test for describe topic with ID

Add a simple test to verify topic description with topic IDs.

Reviewers: Divij Vaidya <diviv@amazon.com>, dengziming <dengziming1993@gmail.com>
2023-08-10 10:16:30 +08:00
flashmouse e0b7499103
KAFKA-15106: Fix AbstractStickyAssignor isBalanced predict (#13920)
in 3.5.0 AbstractStickyAssignor may run useless loop in performReassignments  because isBalanced have a trivial mistake, and result in rebalance timeout in some situation.

Co-authored-by: lixy <lixy@tuya.com>
Reviewers: Ritika Reddy <rreddy@confluent.io>, Philip Nee <pnee@confluent.io>, Kirk True <kirk@mustardgrain.com>, Guozhang Wang <wangguoz@gmail.com>
2023-08-03 11:17:08 -07:00
Philip Nee 811ae01723
MINOR: Test assign() and assignment() in the integration test (#14086)
A missing piece from KAFKA-14950. This is to test assign() and assignment() in the integration test.

Also fixed an accidental mistake in the committed API.

Reviewers: Jun Rao <junrao@gmail.com>
2023-07-28 09:11:20 -07:00
James Shaw afe631cd73
KAFKA-14967: fix NPE in CreateTopicsResult in MockAdminClient (#13671)
Co-authored-by: James Shaw <james.shaw@masabi.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-07-28 11:45:15 +02:00
Justine Olshan 6f39ef02ca
MINOR: Adjust Invalid Record Exception for Invalid Txn State as mentioned in KIP-890 (#14088)
Invalid record is a newer error. INVALID_TXN_STATE has been around as long as transactions and is not retriable. This is the desired behavior.
2023-07-27 09:36:32 -07:00
David Jacot 29825ee24f
KAFKA-14499: [3/N] Implement OffsetCommit API (#14067)
This patch introduces the `OffsetMetadataManager` and implements the `OffsetCommit` API for both the old rebalance protocol and the new rebalance protocol. It introduces version 9 of the API but keeps it as unstable for now. The patch adds unit tests to test the API. Integration tests will be done separately.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-27 13:18:10 +02:00
vamossagar12 ff390ab60a
[MINOR] Fix Javadoc comment in KafkaFuture#toCompletionStage (#14100)
Fix Javadoc comment in KafkaFuture#toCompletionStage

Reviewers: Luke Chen <showuon@gmail.com>
2023-07-26 20:26:20 +08:00
tison 8b027b6fef
MINOR: Fix typo in ProduceRequest.json (#14070)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-07-25 17:56:49 +02:00
Philip Nee 1656591d0b
KAFKA-14950: implement assign() and assignment() (#13797)
We will explicitly send an assignment change event to the background thread to invoke auto-commit if the group.id is configured. After updating the subscription state, a NewTopicsMetadataUpdateRequestEvent will also be sent to the background thread to update the metadata.

Co-authored-by: Kirk True <kirk@kirktrue.pro>
Reviewers: Jun Rao <junrao@gmail.com>
2023-07-21 13:59:00 -07:00
David Jacot 69659b70fc
KAFKA-14499: [1/N] Introduce OffsetCommit API version 9 and add new StaleMemberEpochException error (#14046)
This patch does a few things:
1) It introduces version 9 of the OffsetCommit API. This new version has no schema changes but it can return a StaleMemberEpochException if the new consumer group protocol is used. Note the use of `"latestVersionUnstable": true` in the request schema. This means that this new version is not available yet unless activated.
2) It renames the `generationId` field in the request to `GenerationIdOrMemberEpoch`. This is backward compatible change.
3) It introduces the new StaleMemberEpochException error.
4) It does a minor refactoring in OffsetCommitRequest class.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Arthur <mumrah@gmail.com>, Justine Olshan <jolshan@confluent.io>
2023-07-21 20:08:06 +02:00
Greg Harris 125dbb9286
KAFKA-14760: Move ThroughputThrottler from tools to clients, remove tools dependency from connect-runtime (#13313)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2023-07-20 12:58:48 -07:00
Jeff Kim a500c3ecf9
KAFKA-14500; [5/N] Implement JoinGroup protocol in new GroupCoordinator (#13870)
This patch implements the existing JoinGroup protocol within the new group coordinator. 

Some notable differences:
* Methods return a CoordinatorResult to the runtime framework, which includes records to append to the log as well as a future to complete after the append succeeds/fails.
* The coordinator runtime ensures that only a single thread will be processing a group at any given time, therefore there is no more locking on groups.
* Instead of using on purgatories, we rely on the Timer interface to schedule/cancel delayed operations.

Reviewers: David Jacot <djacot@confluent.io>
2023-07-19 09:15:13 +02:00
Max Riedel 15418db69d
KAFKA-15123: Add tests for ChunkedBytesStream (#13941)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-07-15 15:26:36 +02:00
Justine Olshan ea0bb00126
KAFKA-14884: Include check transaction is still ongoing right before append (take 2) (#13787)
Introduced extra mapping to track verification state.

When verifying, there is a race condition that the add partitions verification response returns that the partition is in the ongoing transaction, but an abort marker is written before we get to append. Therefore, we track any given transaction we are verifying with an object unique to that transaction.

We check this unique state upon the first append to the log. After that, we can rely on currentTransactionFirstOffset. We remove the verification state on appending to the log with a transactional data record or marker.

We will also clean up lingering verification state entries via the producer state entry expiration mechanism. We do not update the the timestamp on retrying a verification for a transaction, so each entry must be verified before producer.id.expiration.ms.

There were a few other fixes:
- Moved the transaction manager handling for failed batch into the future completed exceptionally block to avoid processing it twice (this caused issues in unit tests)
- handle interrupted exceptions encountered when callback thread encountered them
- change handling to throw error if we try to set verification state and leaderLogIfLocal is None.

Reviewers: David Jacot <djacot@confluent.io>, Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-07-14 15:18:11 -07:00
Okada Haruki ab71c56973
KAFKA-12261: Mention about potential delivery loss on increasing partition when auto.offset.reset = latest (#10167)
Splitting partitions while setting auto.offset.reset to latest may cause message delivery loss, but users might not be aware about that since currently it isn't documented anywhere.

Reviewers: Luke Chen <showuon@gmail.com>
2023-07-14 11:14:31 +08:00
Alyssa Huang 5b5f6fcafb
[KAFKA-15137] Do not log entire request payload in KRaftControllerChannelManager (#13988)
Reviewers: David Arthur <mumrah@gmail.com>
2023-07-11 10:48:53 +02:00
Cheryl Simmons e98508747a
Doc fixes: Fix format and other small errors in config documentation (#13661)
Various formatting fixes in the config docs.

Reviewers: Bill Bejeck <bbejeck@apache.org>
2023-07-10 12:48:35 -04:00
Lianet Magrans 4a61b48d3d
KAFKA-14966; [2/N] Extract OffsetFetcher reusable logic (#13898)
This is a follow up on the initial OffsetFetcher refactoring to extract reusable logic, needed for the new consumer implementation (initial refactoring merged with PR-13815).

Similar to the initial refactoring, this PR brings no changes to the existing logic, just extracting functions or pieces of logic.

There were no individual tests for the extracted functions, so no tests were migrated.

Reviewers: Jun Rao <junrao@gmail.com>
2023-07-05 17:20:49 -07:00
Gantigmaa Selenge b2d647904c
KAFKA-8982: Add retry of fetching metadata to Admin.deleteRecords (#13760)
Use AdminApiDriver class to refresh the metadata and retry the request that failed with retriable errors.

Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Mickael Maison <mmaison@redhat.com>, Dimitar Dimitrov <30328539+dimitarndimitrov@users.noreply.github.com>
2023-07-03 09:13:55 +08:00
Ismael Juma 1f4cbc5d53
MINOR: Add JDK 20 CI build and remove some branch builds (#12948)
It's good for us to add support for Java 20 in preparation for Java 21 - the next LTS.

Given that Scala 2.12 support has been deprecated, a Scala 2.12 variant is not included.

Also remove some branch builds that add load to the CI, but have
low value: JDK 8 & Scala 2.13 (JDK 8 support has been deprecated),
JDK 11 & Scala 2.12 (Scala 2.12 support has been deprecated) and
JDK 17 & Scala 2.12 (Scala 2.12 support has been deprecated).

A newer version of Mockito (4.9.0 -> 4.11.0) is required for Java 20 support, but we
only use it with Scala 2.13+ since it causes compilation errors with Scala 2.12. Similarly,
we upgrade easymock when the Java version is 16 or newer as it's incompatible
with powermock (which doesn't support Java 16 or newer).

Filed KAFKA-15117 for a test that fails with Java 20 (SslTransportLayerTest.testValidEndpointIdentificationCN).

Finally, fixed some lossy conversions that were added after #13582 was submitted.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2023-06-30 01:12:00 -07:00
Kirk True a81f35c1c8
KAFKA-14831: Illegal state errors should be fatal in transactional producer (#13591)
Poison the transaction manager if we detect an illegal transition in the Sender thread. A ThreadLocal in is stored in TransactionManager so that the Sender can inform TransactionManager which thread it's using.

Reviewers: Daniel Urban <durban@cloudera.com>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-06-29 11:21:15 -07:00
Bo Gao 005416879e
KAFKA-15053: Use case insensitive validator for security.protocol config (#13831)
Fixed a regression described in KAFKA-15053 that security.protocol only allows uppercase values like PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. With this fix, both lower case and upper case values will be supported (e.g. PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL, plaintext, ssl, sasl_plaintext, sasl_ssl)

Reviewers: Chris Egerton <chrise@aiven.io>, Divij Vaidya <diviv@amazon.com>
2023-06-29 10:13:21 +02:00
José Armando García Sancio ee88a3d1b9
MINOR; Failed atomic file move should be logged at WARN (#13917)
When Kafka fails to perform an atomic file move the error is getting swallowed. Kafka should log these cases at least at WARN level.

Reviewers: Ron Dagostino <rndgstn@gmail.com>, Kirk True <kirk@kirktrue.pro>
2023-06-28 15:55:50 -07:00
Justine Olshan 2f71708955
KAFKA-15028: AddPartitionsToTxnManager metrics (#13798)
Adding the following metrics as per kip-890:

VerificationTimeMs – number of milliseconds from adding partition info to the manager to the time the response is sent. This will include the round trip to the transaction coordinator if it is called. This will also account for verifications that fail before the coordinator is called.

VerificationFailureRate – rate of verifications that returned in failure either from the AddPartitionsToTxn response or through errors in the manager.

AddPartitionsToTxnVerification metrics – separating the verification request metrics from the typical add partitions ones similar to how fetch replication and fetch consumer metrics are separated.

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-28 09:00:37 -07:00
minjian.cai e71f68d6c9
MINOR: fix typos for client (#13884)
Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kirk True <ktrue@confluent.io>
2023-06-28 16:47:42 +02:00
Manyanda Chitimbo c5889fcedd
MINOR: Split ConsumerCoordinator#testCommitOffsetMetadata onto two test cases testing commitSync and commitAsync (#13665)
Split ConsumerCoordinator#testCommitOffsetMetadata onto two test cases testing commitSync and commitAsync 

Reviewers:  Luke Chen <showuon@gmail.com>
2023-06-24 12:32:21 +08:00
Jeff Kim 1dbcb7da9e
KAFKA-14694: RPCProducerIdManager should not wait on new block (#13267)
RPCProducerIdManager initiates an async request to the controller to grab a block of producer IDs and then blocks waiting for a response from the controller.

This is done in the request handler threads while holding a global lock. This means that if many producers are requesting producer IDs and the controller is slow to respond, many threads can get stuck waiting for the lock.

This patch aims to:
* resolve the deadlock scenario mentioned above by not waiting for a new block and returning an error immediately
* remove synchronization usages in RpcProducerIdManager.generateProducerId()
* handle errors returned from generateProducerId() so that KafkaApis does not log unexpected errors
* confirm producers backoff before retrying
* introduce backoff if manager fails to process AllocateProducerIdsResponse

Reviewers: Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-06-22 10:19:39 -07:00
Ismael Juma 9c8aaa2c35
MINOR: Fix lossy conversions flagged by Java 20 (#13582)
An example of the warning:
> warning: [lossy-conversions] implicit cast from long to int in compound assignment is possibly lossy

There should be no change in behavior as part of these changes - runtime logic ensured
we didn't run into issues due to the lossy conversions.

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-22 08:05:55 -07:00
Mickael Maison 190e6de809
MINOR: Close code tag in Producer configs (#13875)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-19 16:58:55 +02:00
Daniel Scanteianu 6f7682d2f4
MINOR: add comment that timestamp unit is milliseconds (#13341)
Reviewers:  Divij Vaidya <diviv@amazon.com>
2023-06-19 13:22:09 +02:00
Luke Chen 74238656dc
KAFKA-15066: add "remote.log.metadata.manager.listener.name" config to rlmm (#13828)
add "remote.log.metadata.manager.listener.name" config to rlmm to allow producer/consumer to connect to the server. Also add tests.

Reviewers: Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org>
2023-06-16 20:56:13 +08:00
Sushant Mahajan 5afce2de68
KAFKA-15077: Code to trim token in FileTokenRetriever (#13835)
The FileTokenRetriever class is used to read the access_token from a file on the clients system and then it is passed along with the jaas config to the OAuthBearerSaslServer. In case the token was sent using FileTokenRetriever on the client side, some EOL character is getting appended to the token, causing authentication to fail with the message:


Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2023-06-11 11:52:25 +05:30
Lianet Magrans 4af4bccbbf
KAFKA-14966: Extract OffsetFetcher reusable logic (#13815)
The OffsetFetcher is internally used by the KafkaConsumer to fetch offsets, validate and reset positions. For the new KafkaConsumer with a refactored threading model, similar functionality will be needed.

This is an initial refactoring for extracting logic from the OffsetFetcher, that will be reused by the new consumer implementation. No changes to the existing logic, just extracting classes, functions or pieces of logic.

All the functionality moved out of the OffsetFetcher is already covered by tests in OffsetFetcherTest and FetcherTest. There were no individual tests for the extracted functions, so no tests were migrated.

Reviewers: Jun Rao <junrao@gmail.com>
2023-06-08 14:03:45 -07:00
Lucas Brutschy ff77b3ad04
KAFKA-14278: Fix InvalidProducerEpochException and InvalidTxnStateException handling in producer clients (#13811)
This PR fixes three issues:

InvalidProducerEpochException was not handled consistently. InvalidProducerEpochException used to be able to be return via both transactional response and produce response, but as of KIP-588 (2.7+), transactional responses should not return InvalidProducerEpochException anymore, only produce responses can. It can happen that older brokers may still return InvalidProducerEpochException for transactional responses; these must be converted to the newer ProducerFencedException. This conversion wasn't done for TxnOffsetCommit (sent to the group coordinator).

InvalidTxnStateException was double-wrapped in KafkaException, whereas other exceptions are usually wrapped only once. Furthermore, InvalidTxnStateException was not handled at all for in AddOffsetsToTxn response, where it should be a possible error as well, according to API documentation.

According to API documentation, UNSUPPORTED_FOR_MESSAGE_FORMAT is not possible for TxnOffsetCommit, but it looks like it is, and it is being handled there, so I updated the API documentation.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-06-07 09:48:14 -07:00
Erik van Oosten 59d30a06fc
KAFKA-10337: await async commits in commitSync even if no offsets given (#13678)
The contract for Consumer#commitSync() guarantees that the callbacks for all prior async commits will be invoked before it returns. Prior to this patch the contract could be violated if an empty offsets map were passed in to Consumer#commitSync().

Reviewers: Philip Nee <philipnee@gmail.com>, David Jacot <djacot@confluent.io>
2023-06-07 16:55:03 +02:00
David Jacot 7d147cf241
KAFKA-14462; [14/N] Add PartitionWriter (#13675)
This patch introduces the `PartitionWriter` interface in the `group-coordinator` module. The `ReplicaManager` resides in the `core` module and it is thus not accessible from the `group-coordinator` one. The `CoordinatorPartitionWriter` is basically an implementation of the interface residing in `core` which interfaces with the `ReplicaManager`.

One notable difference from the usual produce path is that the `PartitionWriter` returns the offset following the written records. This is then used by the coordinator runtime to track when the request associated with the write can be completed.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-06-06 16:24:48 +02:00
mojh7 04f2f6a26a
MINOR: Typo and unused method removal (#13739)
clean up unused private method and removed typos

Reviewers:  Divij Vaidya <diviv@amazon.com>,  Manyanda Chitimbo <manyanda.chitimbo@gmail.com>,  Daniel Scanteianu, Josep Prat <josep.prat@aiven.io>
2023-06-06 10:50:56 +02:00
Gabriel Oliveira 443bd1dd82
MINOR: Add "versions" tag to recently added ReplicaState field on Fetch Request (#13680)
Reviewers: David Jacot <djacot@confluent.io>
2023-06-05 13:40:20 +02:00
Divij Vaidya fe6a827e20
KAFKA-14633: Reduce data copy & buffer allocation during decompression (#13135)
After this change,

    For broker side decompression: JMH benchmark RecordBatchIterationBenchmark demonstrates 20-70% improvement in throughput (see results for RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize).
    For consumer side decompression: JMH benchmark RecordBatchIterationBenchmark a mix bag of single digit regression for some compression type to 10-50% improvement for Zstd (see results for RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize).

Reviewers: Luke Chen <showuon@gmail.com>, Manyanda Chitimbo <manyanda.chitimbo@gmail.com>, Ismael Juma <mail@ismaeljuma.com>
2023-06-05 15:04:49 +08:00
Daniel Scanteianu 6d72c26731
KAFKA-14841 Handle callbacks to ConsumerRebalanceListener in MockConsumer (#13455)
Reviewers: Philip Nee <philipnee@gmail.com>
2023-05-26 08:33:03 -05:00
Manyanda Chitimbo c14e0df617
KAFKA-14961: harden DefaultBackgroundThreadTest.testStartupAndTearDown test (#13664)
1. Ensures that NPE are not thrown
2. Ensures that the background thread has been started to avoid flasky
   assertions failures on isRunning
3. Add a check that the thread is not running when closed

Reviewers: Philip Nee <philipnee@gmail.com>, Kirk True <kirk@kirktrue.pro>
2023-05-26 08:28:03 -05:00
Aaron Ai c90a08c37e
MINOR: Fix the outdated comments of ConfigDef (#13710)
Fix the outdated comments of ConfigDef since the signature of the corresponding method has been updated.

Reviewers: Luke Chen <showuon@gmail.com>
2023-05-19 13:00:20 +08:00
Divij Vaidya bb10ae4273
KAFKA-14962: Trim whitespace from ACL configuration (#13670)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Christo Lolov <lolovc@amazon.com>
2023-05-12 23:51:00 +05:30
dengziming a7c9842f70
KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse (#13679)
The KRaft controller return empty finalized features in `ApiVersionResponse`, the brokers are not infected by this, so this problem doesn't have any impact currently, but it's worth fixing it to avoid unexpected problems.

And there is a bunch of of confusing methods in `ApiVersionResponse` which are only used in test code, I moved them to TestUtils to make the code more clear, and force everyone to pass in the correct parameters instead of the default zero parameters, for example, empty supported features and empty finalized features.

Reviewers: Luke Chen <showuon@gmail.com>
2023-05-12 13:46:06 +08:00
vamossagar12 86daf8ce65
KAFKA-14913: Using ThreadUtils.shutdownExecutorServiceQuietly to close executors in Connect Runtime (#13594)
#13557 introduced a utils method to close executors silently. This PR leverages that method to close executors in connect runtime. There was duplicate code while closing the executors which isn't the case with this PR.

Note that there are a few more executors used in Connect runtime but their close methods don't follow this pattern of shutdown, await and shutdown. Some of them have some logic like executor like Worker, so not changing at such places.

---------

Co-authored-by: Sagar Rao <sagarrao@Sagars-MacBook-Pro.local>

Reviewers: Daniel Urban <durban@cloudera.com>, Yash Mayya <yash.mayya@gmail.com>, Viktor Somogyi-Vass <viktorsomogyi@gmail.com>
2023-05-08 16:39:47 +02:00
Federico Valeri 2607e0edb7
MINOR: Fix producer Callback comment (#13669)
Fixes the wrong exception name: OffsetMetadataTooLargeException.

Reviewers: Manyanda Chitimbo <manyanda.chitimbo@gmail.com>, Luke Chen <showuon@gmail.com>
2023-05-08 14:33:31 +08:00
Chia-Ping Tsai 6e7144ac24
MINOR: add docs to remind reader that impl of ConsumerPartitionAssign… (#13659)
Reviewers: David Jacot <djacot@confluent.io>, Kirk True <kirk@kirktrue.pro>
2023-05-06 02:56:26 +08:00
Divij Vaidya 6bcc497c36
KAFKA-14766: Improve performance of VarInt encoding and decoding (#13312)
Motivation

Reading/writing the protocol buffer varInt32 and varInt64 (also called varLong in our code base) is in the hot path of data plane code in Apache Kafka. We read multiple varInt in a record and in long. Hence, even a minor change in performance could extrapolate to larger performance benefit.

In this PR, we only update varInt32 encoding/decoding.
Changes

This change uses loop unrolling and reduces the amount of repetition of calculations. Based on the empirical results from the benchmark, the code has been modified to pick up the best implementation.
Results

Performance has been evaluated using JMH benchmarks on JDK 17.0.6. Various implementations have been added in the benchmark and benchmarking has been done for different sizes of varints and varlongs. The benchmark for various implementations have been added at ByteUtilsBenchmark.java

Reviewers: Ismael Juma <mlists@juma.me.uk>, Luke Chen <showuon@gmail.com>, Alexandre Dupriez <alexandre.dupriez@gmail.com>
2023-05-05 20:05:20 +08:00
Justine Olshan ffd814d25f
KAFKA-14916: Fix code that assumes transactional ID implies all records are transactional (#13607)
Also modifies verification to only add a partition to verify if it is transactional.

When verifying we look at all the transactional producer IDs and throw INVALID_RECORD on the request if one is different.

Reviewers: Kirk True <ktrue@confluent.io>, Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-05-04 09:55:45 -07:00
Philip Nee ea81e99e59
KAFKA-13668: Retry upon missing initProducerId due to authorization error (#12149)
Producers used to throw a fatal error upon failing initProducerId, which can be caused by authorization errors. In this case, the user will need to instantiate a producer.

This PR makes authorization errors non-fatal so that the user can retry until the permission is fixed by an admin.

Here we first transition the producer to the ABORTABLE state, then to the UNINITIALIZED state (so that the producer is recoverable). Upon the subsequent send, the producer will transition to INITIALIZING and attempt to send another InitProducerIdRequest.

Reviewers: Kirk True <ktrue@confluent.io>, David Jacot <djacot@confluent.io>, Jason Gustafson <jason@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-05-04 09:20:01 -07:00
Christo Lolov dc7819d7f1
KAFKA-14594: Move LogDirsCommand to tools module (#13122)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-05-04 12:00:33 +02:00
David Mao d46c3f259c
MINOR: Reduce number of threads created for integration test brokers (#13655)
The integration tests seem to create an unnecessarily large number of threads. This reduces the number of threads created per integration test harness broker.

Reviewers: Luke Chen <showuon@gmail.com>. Justine Olshan <jolshan@confluent.io>
2023-05-03 17:09:43 -07:00
Jason Gustafson c08120f83f
MINOR: Allow tagged fields with version subset of flexible version range (#13551)
The generated message types are missing a range check for the case when the tagged version range is a subset of
the flexible version range. This causes the tagged field count, which is computed correctly, to conflict with the
number of tags serialized.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-05-03 15:25:32 -07:00
Christo Lolov f44ee4fab7
MINOR: Remove unnecessary code in client/connect (#13259)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-05-02 17:39:31 +02:00
Philip Nee 64ebbc577d
MINOR: Fixing typos in the ConsumerCoordinator (#13618)
Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, David Jacot <djacot@confluent.io>
2023-04-28 17:46:40 +02:00
Anton Agestam e55fbceb66
MINOR: Fix incorrect description of SessionLifetimeMs (#13649)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-04-28 12:54:28 +02:00
Philip Nee c6ad151ac3
KAFKA-14639: A single partition may be revoked and assign during a single round of rebalance (#13550)
This is a really long story, but the incident started in KAFKA-13419 when we observed a member sending out a topic partition owned from the previous generation when a member missed a rebalance cycle due to REBALANCE_IN_PROGRESS.

This patch changes the AbstractStickyAssignor.AllSubscriptionsEqual method.  In short, it should no long check and validate only the highest generation.  Instead, we consider 3 cases:
1. Member will continue to hold on to its partition if there are no other owners
2. If there are 1+ owners to the same partition. One with the highest generation will win.
3. If two members of the same generation hold on to the same partition.  We will log an error but remove both from the assignment. (Same with the current logic)

Here are some important notes that lead to the patch:
- If a member is kicked out of the group, and `UNKNOWN_MEMBER_ID` will be thrown.
- It seems to be a common situation that members are late to joinGroup and therefore get `REBALANCE_IN_PROGRESS` error.  This is why we don't want to reset generation because it might cause lots of revocations and can be disruptive

To summarize the current behavior of different errors:
`REBALANCE_IN_PROGRESS`
- heartbeat: requestRejoin if member state is stable
- joinGroup: rejoin immediately
- syncGroup: rejoin immediately
- commit: requestRejoin and fail the commit. Raise this exception if the generation is staled, i.e. another rebalance is already in progress.

`UNKNOWN_MEMBER_ID`
- heartbeat: resetStateAndRejoinif generation hasn't changed. otherwise, ignore
- joinGroup: resetStateAndRejoin if generation unchanged, otherwise rejoin immediately
- syncGroup:  resetStateAndRejoin if generation unchanged, otherwise rejoin immediately

`ILLEGAL_GENERATION`
- heartbeat: resetStateAndRejoinif generation hasn't changed. otherwise, ignore
- syncGroup: raised the exception if generation has been resetted or the member hasn't completed rebalancing.  then resetStateAndRejoin if generation unchanged, otherwise rejoin immediately

Reviewers: David Jacot <djacot@confluent.io>
2023-04-28 11:08:32 +02:00
Luke Chen d796480fe8
KAFKA-14909: check zkMigrationReady tag before migration (#13631)
1. add ZkMigrationReady in apiVersionsResponse
2. check all nodes if ZkMigrationReady are ready before moving to next migration state

Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
2023-04-28 14:35:12 +08:00
Colin P. McCabe 7049333617 KAFKA-14943: Fix ClientQuotaControlManager validation
Don't allow setting negative or zero values for quotas. Don't allow SCRAM mechanism names to be
used as client quota names. SCRAM mechanisms are not client quotas. (The confusion arose because of
internal ZK representation details that treated them both as "client configs.")

Add unit tests for ClientQuotaControlManager.isValidIpEntity and
ClientQuotaControlManager.configKeysForEntityType.

This change doesn't affect metadata record application, only input validation. If there are bad
client quotas that are set currently, this change will not alter the current behavior (of throwing
an exception and ignoring the bad quota).
2023-04-27 10:42:32 -07:00
LinShunKang dd6690a7a0
KAFKA-14944: Reduce CompletedFetch#parseRecord() memory copy (#12545)
This implements KIP-863: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=225152035
Direct use ByteBuffer instead of byte[] to deserialize.

Reviewers: Luke Chen <showuon@gmail.com>, Kirk True <kirk@kirktrue.pro>
2023-04-27 10:44:08 +08:00
Manyanda Chitimbo d83a734c41
MINOR: only set sslEngine#setUseClientMode to false once when ssl mode is server (#13626)
The sslEngine.setUseClientMode(false) was duplicated when ssl mode is server during SSLEngine creation
in DefaultSslEngineFactory.java. The patch attemps to remove the duplicated call.

Reviewers:   maulin-vasavada <maulin.vasavada@gmail.com>, Divij Vaidya <diviv@amazon.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2023-04-25 15:32:24 +05:30
Dimitar Dimitrov e14dd8024a
KAFKA-14821 Implement the listOffsets API with AdminApiDriver (#13432)
We are handling complex workflows ListOffsets by chaining together MetadataCall instances and ListOffsetsCall instances, there are many complex and error-prone logic. In this PR we rewrote it with the `AdminApiDriver` infra, notable changes better than old logic:
1. Retry lookup stage on receiving `NOT_LEADER_OR_FOLLOWER` and `LEADER_NOT_AVAILABLE`, whereas in the past we failed the partition directly without retry.
2. Removing class field `supportsMaxTimestamp` and calculating it on the fly to avoid the mutable state, this won't change any behavior of  the client.
3. Retry fulfillment stage on `RetriableException`, whereas in the past we just retry fulfillment stage on `InvalidMetadataException`, this means we will retry on `TimeoutException` and other `RetriableException`.

We also `handleUnsupportedVersionException` to `AdminApiHandler` and `AdminApiLookupStrategy`, they are used to keep consistency with old logic, and we can continue improvise them. 

Reviewers: Ziming Deng <dengziming1993@gmail.com>, David Jacot <djacot@confluent.io>
2023-04-20 11:29:27 +08:00
Dániel Urbán 454b72161a
KAFKA-14902: KafkaStatusBackingStore retries on a dedicated background thread to avoid stack overflows (#13557)
KafkaStatusBackingStore uses an infinite retry logic on producer send, which can lead to a stack overflow.
To avoid the problem, a background thread was added, and the callback submits the retry onto the background thread.
2023-04-18 09:40:14 +02:00
Justine Olshan 56dcb837a2
KAFKA-14561: Improve transactions experience for older clients by ensuring ongoing transaction (#13391)
Added check for ongoing transaction
Thread to send and receive verify only add partition to txn requests
Code to send on request thread courtesy of @artemlivshits

Reviewers: Artem Livshits <alivshits@confluent.io>, Jun Rao <junrao@gmail.com>
2023-04-12 17:04:51 -07:00
Rajini Sivaram b64ac94a8c
KAFKA-14891: Fix rack-aware range assignor to assign co-partitioned subsets (#13539)
Reviewers: David Jacot <djacot@confluent.io>
2023-04-12 08:35:03 +01:00
Gantigmaa Selenge 751a8af1f0
KAFKA-14420: Use incrementalAlterConfigs API for syncing topic configurations in MirrorMaker 2 (KIP-894) (#13373)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chris Egerton <chrise@aiven.io>
2023-04-10 11:55:49 -04:00
José Armando García Sancio 672dd3ab6a
KAFKA-13020; Implement reading Snapshot log append timestamp (#13345)
The SnapshotReader exposes the "last contained log time". This is mainly used during snapshot cleanup. The previous implementation used the append time of the snapshot record. This is not accurate as this is the time when the snapshot was created and not the log append time of the last record included in the snapshot.

The log append time of the last record included in the snapshot is store in the header control record of the snapshot. The header control record is the first record of the snapshot.

To be able to read this record, this change extends the RecordsIterator to decode and expose the control records in the Records type.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
2023-04-07 09:25:54 -07:00
Chia-Ping Tsai 637bc92ba1
MINOR: move RecordReader from org.apache.kafka.tools (client module) to org.apache.kafka.tools.api (tools-api module) (#13454)
Reviewers: Jun Rao <junrao@gmail.com>
2023-04-07 00:20:56 +08:00