4945 Commits

Author SHA1 Message Date
Chung, Ming-Yen 66655ab49a
KAFKA-17095 Fix the typo from "CreateableTopicConfig" to "CreatableTopicConfig" (#16623)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-19 11:09:08 +08:00
TengYao Chi cd48fa682d
KAFKA-17077 The node.id is inconsistent to broker.id when "broker.id.generation.enable=true". (#16540)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-19 10:41:53 +08:00
Volk 43fdc6ae08
KAFKA-17122 Change the type of `clusterId` from `UUID` to `String` (#16590)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-19 10:37:58 +08:00
xijiu 4fd0f4095e
KAFKA-17118 remove useless code in StorageTool (#16622)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-19 10:04:32 +08:00
brenden20 b4682da5ae
MINOR: Improve ProducerIdExpirationTest (#16619)
This PR improves a test case in ProducerIdExpirationTest.scala, specifically testTransactionAfterTransactionIdExpiresButProducerIdRemains(). The test was slightly flaky, so this PR removes the assertion on the producerState size that caused the flakiness.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-07-18 14:39:58 -07:00
Kuan-Po Tseng f595802cc7
KAFKA-16975 The error message of creating `__cluster_metadata` should NOT be "Authorization failed" (#16372)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-18 19:06:45 +08:00
Kuan-Po Tseng 94f5a4f63e
KAFKA-17135 Add unit test for `ProducerStateManager#readSnapshot` and `ProducerStateManager#writeSnapshot` (#16603)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-18 18:23:29 +08:00
Federico Valeri 04d34f3676
KAFKA-17124: Fix flaky DumpLogSegmentsTest#testDumpRemoteLogMetadataNonZeroStartingOffset (#16580)
This change should fix the flakiness reported for DumpLogSegmentsTest#testDumpRemoteLogMetadataNonZeroStartingOffset.
I was not able to reproduce locally, but the issue was that the second segment was not created in time:

Missing required argument "[files]"

The fix consists of getting the log path directly from the rolled segment.

We were also creating the log twice, and that was producing this warning:

[2024-07-12 00:57:28,368] WARN [LocalLog partition=kafka-832386, dir=/tmp/kafka-2956913950351159820] Trying to roll a new log segment with start offset 0 =max(provided offset = Some(0), LEO = 0) while it already exists and is active with size 0. Size of time index: 873813, size of offset index: 1310720. (kafka.log.LocalLog:70)

This is also fixed.

Reviewers: Luke Chen <showuon@gmail.com>
2024-07-18 10:13:29 +08:00
Greg Harris c97421c100
KAFKA-17150: Use Utils.loadClass instead of Class.forName to resolve aliases correctly (#16608)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Chris Egerton <chrise@aiven.io>, Chia-Ping Tsai <chia7712@gmail.com>, Josep Prat <josep.prat@aiven.io>
2024-07-17 16:00:45 -07:00
Dmitry Werner a66a59f427
KAFKA-17148: Remove print MetaPropertiesEnsemble from kafka-storage tool (#16607)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Greg Harris <greg.harris@aiven.io>
2024-07-17 08:16:29 -07:00
Abhijeet Kumar 24ed31739e
KAFKA-16853: Split RemoteLogManagerScheduledThreadPool (#16502)
As part of KIP-950, we want to split the RemoteLogManagerScheduledThreadPool into separate thread pools (one for copy and another for expiration). In this change, we are splitting it into three thread pools (one for copy, one for expiration, and another one for follower). We are reusing the same thread pool configuration for all three thread pools. We can introduce new user-facing configurations later.
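
The split described above can be pictured with a minimal sketch. The class name `RemoteLogPools`, its field names, and the `poolSize` parameter are hypothetical and are not taken from the PR; the only point illustrated is three independent scheduled pools sharing one size setting.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;

/** Hypothetical holder for three independent scheduled pools that share one size setting. */
final class RemoteLogPools implements AutoCloseable {
    final ScheduledThreadPoolExecutor copyPool;
    final ScheduledThreadPoolExecutor expirationPool;
    final ScheduledThreadPoolExecutor followerPool;

    RemoteLogPools(int poolSize) {
        // The same configured size is reused for all three pools,
        // mirroring "reusing the same thread pool configuration".
        this.copyPool = new ScheduledThreadPoolExecutor(poolSize);
        this.expirationPool = new ScheduledThreadPoolExecutor(poolSize);
        this.followerPool = new ScheduledThreadPoolExecutor(poolSize);
    }

    @Override
    public void close() {
        // Stop all three pools; pending tasks are discarded in this sketch.
        copyPool.shutdownNow();
        expirationPool.shutdownNow();
        followerPool.shutdownNow();
    }
}
```

With separate pools, a backlog of slow copy tasks cannot occupy the threads that expiration or follower tasks need.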

Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
2024-07-17 16:43:23 +05:30
PoAn Yang b015a83f6d
KAFKA-17017 AsyncKafkaConsumer#unsubscribe clean the assigned partitions (#16449)
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-17 04:18:33 +08:00
Colin Patrick McCabe 4d3e366bc2
KAFKA-16772: Introduce kraft.version to support KIP-853 (#16230)
Introduce the KRaftVersion enum to describe the current value of kraft.version. Change a bunch of places in the code that were using raw shorts over to using this new enum.
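
As a rough illustration of replacing raw shorts with an enum, consider the sketch below. The name `KRaftVersionSketch`, its constants, and its methods are illustrative assumptions and may not match the enum added by the PR.

```java
/** Illustrative enum wrapping a kraft.version feature level; names are assumptions. */
enum KRaftVersionSketch {
    KRAFT_VERSION_0((short) 0),
    KRAFT_VERSION_1((short) 1);

    private final short featureLevel;

    KRaftVersionSketch(short featureLevel) {
        this.featureLevel = featureLevel;
    }

    /** The raw level, for places that still need to serialize a short. */
    short featureLevel() {
        return featureLevel;
    }

    /** Converts a raw level into the enum, rejecting unknown levels early. */
    static KRaftVersionSketch fromFeatureLevel(short level) {
        for (KRaftVersionSketch version : values()) {
            if (version.featureLevel == level) {
                return version;
            }
        }
        throw new IllegalArgumentException("Unknown kraft.version feature level: " + level);
    }
}
```

Callers that previously compared shorts can then compare enum constants, and an unsupported level fails at the conversion boundary rather than deep inside the code that consumes it.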

In BrokerServer.scala, fix a bug that could cause null pointer exceptions during shutdown if we tried to shut down before fully coming up.

Do not send finalized features that are finalized as level 0, since it is a no-op.

Reviewers: dengziming <dengziming1993@gmail.com>, José Armando García Sancio <jsancio@apache.org>
2024-07-16 09:31:10 -07:00
Apoorv Mittal 3a442ffe32
KAFKA-16743,KAFKA-16744: KafkaApis support for share group heartbeat and describe (#16574)
Added handling of share group heartbeat and describe requests in KafkaApis. The heartbeat and describe logic itself is implemented in the group coordinator.

Reviewers:  Manikumar Reddy <manikumar.reddy@gmail.com>, Rahul <rahul.nirgude@mastercard.com>
2024-07-15 19:06:20 +05:30
David Arthur 8aee314a46
KAFKA-16667 Avoid stale read in KRaftMigrationDriver (#15918)
When becoming the active KRaftMigrationDriver, there is another race condition similar to KAFKA-16171. This time, the race is due to a stale read from ZK. After writing to /controller and /controller_epoch, it is possible that a read on /migration is not linearized with the writes that were just made. In other words, we get a stale read on /migration. This leads to an inability to sync metadata to ZK due to incorrect zkVersion on the migration ZNode.

The non-linearizability of reads is in fact documented behavior for ZK, so we need to handle it.

To fix the stale read, this patch adds a write to /migration after updating /controller and /controller_epoch. This allows us to learn the correct zkVersion for the migration ZNode before leaving the BECOME_CONTROLLER state.

This patch also adds a check on the current leader epoch when running certain events in KRaftMigrationDriver. Historically, we did not include this check because it is not necessary for correctness. Writes to ZK are gated on the /controller_epoch zkVersion, and RPCs sent to brokers are gated on the controller epoch. However, during a time of rapid failover, there is a lot of processing happening on the controller (i.e., full metadata sync to ZK and full UMRs sent to brokers), so it is best to avoid running events we know will fail.

There is also a small fix in here to improve the logging of ZK operations. The log messages are changed to past tense to reflect the fact that they have already happened by the time the log message is created.

Reviewers: Igor Soarez <soarez@apple.com>
2024-07-15 09:32:06 -04:00
Apoorv Mittal 0b6086ed88
KAFKA-16741: Add ShareGroupHeartbeat API support - 2/N (KIP-932) (#16573)
ShareGroupHeartbeat API support as defined in KIP-932. The heartbeat persists Group and Member information to the __consumer_offsets topic.

The PR also moves some of the ShareGroupConfigs to GroupCoordinatorConfigs as they should only be used in group coordinator.


Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-15 16:14:55 +05:30
Vinay Agarwal 4cec840baf
KAFKA-16661: Added a lower `log.initial.task.delay.ms` value (#16221)
Added a lower log.initial.task.delay.ms value (500ms) to the integration test framework.

Reviewers: Luke Chen <showuon@gmail.com>
2024-07-15 17:06:54 +08:00
Alyssa Huang 7495e70365
KAFKA-16532; Support for first leader bootstrapping the voter set (#16518)
The first leader of a KRaft topic partition must rewrite the content of the bootstrap checkpoint (0-0.checkpoint) to the log so that it is replicated. Bootstrap checkpoints are not replicated to the followers.

The control records for KRaftVersionRecord and VotersRecord in the bootstrap checkpoint will be written in one batch along with the LeaderChangeMessage. The leader will write these control records before accepting data records from the state machine (Controller).

The leader determines that the bootstrap checkpoint has not been written to the log if the latest set of voters is located at offset -1. This is the last contained offset for the bootstrap checkpoint (0-0.checkpoint).

This change also improves the RaftClientTestContext to allow for better testing of the reconfiguration functionality. This is mainly done by allowing the voter set to be configured statically or through the bootstrap checkpoint.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Colin P. McCabe <cmccabe@apache.org>
Co-authors: José Armando García Sancio <jsancio@apache.org>
2024-07-12 13:44:21 -07:00
Okada Haruki 01cf24a1ca
KAFKA-17061 Improve the performance of isReplicaOnline (#16529)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-12 12:52:59 +08:00
Colin Patrick McCabe ede289db93
KAFKA-17011: Fix a bug preventing features from supporting v0 (#16421)
As part of KIP-584, brokers expose a range of supported versions for each feature. For example, metadata.version might be supported from 1 to 21. (Note that feature level ranges are always inclusive, so this would include both level 1 and 21.)

These supported ranges are supposed to be able to include 0. For example, it should be possible for a broker to support a kraft.version between 0 and 1. However, in older software versions, there is an assertion in org.apache.kafka.common.feature.SupportedVersionRange that prevents this. This causes problems when the older software attempts to deserialize an ApiVersionsResponse containing such a range.

In order to resolve this dilemma, this PR bumps the version of ApiVersionsRequest from 3 to 4. Clients which send v4 promise to be able to handle ranges including 0. Clients which send v3 will not be exposed to these ranges. The feature will show up as having a minimum version of 1 instead. This work is part of KIP-1022.

Similarly, this PR also introduces a new version of BrokerRegistrationRequest, and specifies that the older versions of that RPC cannot handle supported version ranges including 0. Therefore, 0 is translated to 1 in the older requests.
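
The translation for older clients might look roughly like the following. This is a sketch under stated assumptions, not the code from the PR: `VersionRange` and `forRequestVersion` are hypothetical names, and the threshold of 4 is taken from the ApiVersionsRequest description above.

```java
/** Hypothetical stand-in for a supported feature-level range (inclusive at both ends). */
final class VersionRange {
    final short min;
    final short max;

    VersionRange(short min, short max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("Invalid range " + min + "-" + max);
        }
        this.min = min;
        this.max = max;
    }

    /**
     * Returns the range to expose to a client that sent the given request version.
     * Clients on v4 or later promise to handle a minimum of 0; older clients see 1 instead.
     */
    VersionRange forRequestVersion(short requestVersion) {
        if (requestVersion >= 4 || min > 0) {
            return this;
        }
        if (max < 1) {
            // Assumption of this sketch: a 0-0 range has no valid form for older clients.
            throw new IllegalStateException("A 0-0 range cannot be shown to older clients");
        }
        // Older clients reject ranges that include 0, so report a minimum of 1.
        return new VersionRange((short) 1, max);
    }
}
```

For example, a range of 0 to 1 is returned unchanged for a v4 request and as 1 to 1 for a v3 request.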

Reviewers: Jun Rao <junrao@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-07-10 16:10:25 -07:00
Omnia Ibrahim 25d775b742
KAFKA-15853 Refactor ShareGroupConfig with AbstractConfig (#16506)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-11 01:37:50 +08:00
PoAn Yang 0b11971f2c
KAFKA-17016 Align the behavior of GaugeWrapper and MeterWrapper (#16426)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-10 15:26:17 +08:00
Federico Valeri 8491e84aad
KAFKA-16228: Add remote log metadata flag to the dump log tool (#16475)
This change adds the --remote-log-metadata-decoder flag to the kafka-dump-log.sh tool. This new flag can be used to decode the payload of the __remote_log_metadata records produced by the default RemoteLogMetadataManager.

Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>
2024-07-10 11:20:18 +08:00
Christo Lolov f369771bf2
KAFKA-16851: Add remote.log.disable.policy (#16132)
Add remote.log.disable.policy, at the topic level only, as part of KIP-950.

Reviewers: Kamal Chandraprakash <kchandraprakash@uber.com>, Luke Chen <showuon@gmail.com>, Murali Basani <muralidhar.basani@aiven.io>
2024-07-10 11:18:48 +08:00
Abhinav Dixit 67e6859632
KAFKA-17071: SharePartition - Add more unit tests and minor enhancement (#16530)
The PR contains the following -

1. Added a minor enhancement to skip the write state RPC call when there are 0 states to update.
2. Removed the unused function deliveryCount()
3. Added more unit tests covering functionalities across SharePartition.java

Reviewers:  Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-09 11:45:15 +05:30
Kuan-Po Tseng a533e246e3
KAFKA-17081 Tweak GroupCoordinatorConfig: re-introduce local attributes and validation (#16524)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-08 01:15:18 +08:00
Kuan-Po Tseng d45596a2a1
MINOR: Move related getters to RemoteLogManagerConfig (#16538)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-08 01:00:42 +08:00
TaiJuWu f0e0db1aad
MINOR: Wait all brokers to get ready (#16537)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-08 00:32:21 +08:00
Sushant Mahajan 932759bd70
KAFKA-16731: Added share group metrics class. (#16488)
Added ShareGroupMetrics inner class to SharePartitionManager.
Added code to record metrics at various checkpoints in code.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-07 17:51:13 +05:30
TingIāu "Ting" Kì 4a49a1249d
KAFKA-17059 Remove `dynamicConfigOverride` from KafkaConfig (#16523)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-05 07:11:02 +08:00
José Armando García Sancio 376365d9da
MINOR; Add property methods to LogOffsetMetadata (#16521)
This change simply adds property methods to LogOffsetMetadata. It
changes all of the callers to use the new property methods instead of
using the fields directly.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-07-04 15:03:32 -04:00
Apoorv Mittal e0dcfa7b51
KAFKA-16741: Add share group classes for Heartbeat API (1/N) (KIP-932) (#16516)
Defined share group, member and simple assignor classes with API definition for Share Group Heartbeat and Describe API.

The ShareGroup and ShareGroupMember extend the common ModernGroup and ModernGroupMember respectively.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-04 20:31:47 +05:30
Abhijeet Kumar 641469e4ac
KAFKA-17069: Remote log copy throttle metrics (#16086)
As part of KIP-956, we have added quota for remote copies to remote storage. In this PR, we are adding the following metrics for remote copy throttling.
1. remote-copy-throttle-time-avg: The average time in millis remote copies was throttled by a broker
2. remote-copy-throttle-time-max: The max time in millis remote copies was throttled by a broker

Added unit test for the metrics.

Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
2024-07-04 12:27:59 +08:00
Apoorv Mittal ae192bdd41
KAFKA-16754: Removing partitions from release API (KIP-932) (#16513)
The release API exposed Partitions, which should be an internal implementation detail of the releaseAcquiredRecords API. Also reduced the scope of the cached topic partitions method, as wider access is not needed.


Reviewers:  Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, Abhinav Dixit <adixit@confluent.io>
2024-07-03 20:19:03 +05:30
Abhinav Dixit 35baa0ac4f
KAFKA-17026: Implement updateCacheAndOffsets functionality on LSO movement (#16459)
Implemented the functionality which takes care of archiving the records when LSO moves past them. Implemented the following functions -

1. updateCacheAndOffsets - Updates the cached state, start and end offsets of the share partition as per the new log start offset. The method is called when the log start offset is moved for the share partition.
2. archiveAvailableRecordsOnLsoMovement - This function archives all the available records when they are behind the LSO.
3. archivePerOffsetBatchRecords - It archives all the available records in the per offset tracked batch passed to this function.
4. archiveCompleteBatch - It archives all the available records of the complete batch passed to this function.
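
The core idea behind these functions can be sketched in simplified form. The class `LsoArchiveSketch`, its per-offset map, and the method `onLogStartOffsetMoved` are hypothetical simplifications; the PR itself tracks batches as well as individual offsets, which this sketch does not model.

```java
import java.util.Map;
import java.util.TreeMap;

/** Simplified, hypothetical model of per-offset record state in a share partition. */
final class LsoArchiveSketch {
    enum RecordState { AVAILABLE, ACQUIRED, ARCHIVED }

    final TreeMap<Long, RecordState> states = new TreeMap<>();
    long startOffset = 0L;

    /**
     * Archives AVAILABLE records that fall behind the new log start offset
     * and advances the cached start offset. Returns the number archived.
     */
    int onLogStartOffsetMoved(long logStartOffset) {
        if (logStartOffset <= startOffset) {
            return 0; // The log start offset did not move forward.
        }
        int archived = 0;
        // headMap(..., false) is a live view of all offsets strictly below the new start.
        for (Map.Entry<Long, RecordState> entry : states.headMap(logStartOffset, false).entrySet()) {
            if (entry.getValue() == RecordState.AVAILABLE) {
                entry.setValue(RecordState.ARCHIVED);
                archived++;
            }
        }
        startOffset = logStartOffset;
        return archived;
    }
}
```

Only records in the AVAILABLE state are archived here, so a record that is currently acquired keeps its state even when it sits behind the new log start offset.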

Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-03 16:01:44 +05:30
Abhijeet Kumar b2da186f21
KAFKA-15265: Remote fetch throttle metrics (#16087)
As part of [KIP-956](https://cwiki.apache.org/confluence/display/KAFKA/KIP-956+Tiered+Storage+Quotas), we have added quota for remote fetches from remote storage. In this PR, we are adding the following metrics for remote fetch throttling.

remote-fetch-throttle-time-avg : The average time in millis remote fetches was throttled by a broker 
remote-fetch-throttle-time-max : The max time in millis remote fetches was throttled by a broker

Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
2024-07-03 14:11:18 +05:30
Apoorv Mittal f2dbc55d24
KAFKA-17047: Refactored group coordinator classes to modern package (KIP-932) (#16474)
Following the discussion and suggestion by @dajac, https://github.com/apache/kafka/pull/16054#discussion_r1613638293, the PR refactors the common classes to build TargetAssignment in `modern` package. `consumer` package has been moved inside `modern` package with classes exclusive to `consumer group`.

This PR completes the refactoring and lays the base for introducing the `share` package inside `modern`. The subsequent PRs will define the implementation specific to Share Groups while re-using the common functionality from `modern` package classes.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
2024-07-03 00:16:40 -07:00
TingIāu "Ting" Kì c97d4ce026
KAFKA-17032 NioEchoServer should generate meaningful id instead of incremental number (#16460)
Reviewers: Greg Harris <greg.harris@aiven.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-03 03:28:42 +08:00
abhi-ksolves 6897b06b03
KAFKA-3346 Rename Mode to ConnectionMode (#16403)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-03 02:46:04 +08:00
Chirag Wadhwa e55c28c60b
KAFKA-16750: Added acknowledge code in SharePartitionManager including unit tests (#16457)
About
This PR adds acknowledge code in SharePartitionManager. Internally, the record acknowledgements happen at the SharePartition level. SharePartitionManager identifies the SharePartitions and calls their acknowledge method to actually acknowledge the individual records.

Testing
Added unit tests to cover the new functionality added in SharePartitionManagerTest

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-02 22:25:14 +05:30
David Jacot e7a75805fe
KAFKA-17050: Revert `group.version` (#16482)
This patch partially reverts `group.version` in trunk. I kept the `GroupVersion` class but removed it from `Features` so it is not advertised. I also kept all the changes in the test framework. I removed the logic to require `group.version=1` to enable the new consumer rebalance protocol. The new protocol is enabled based on the static configuration.

For context, I prefer to revert it in trunk now so we don't forget to revert it in the 3.9 release. I will bring it back for the 4.0 release.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-02 07:21:18 -07:00
Abhinav Dixit d0dfefbe63
KAFKA-15727: Added KRaft support in AlterUserScramCredentialsRequestNotAuthorizedTest (#15224)
This PR adds KRaft support to the following tests in AlterUserScramCredentialsRequestNotAuthorizedTest

Co-authored-by: Manikumar Reddy <manikumar.reddy@gmail.com>

Reviewers:  Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-02 12:24:41 +05:30
Kuan-Po (Cooper) Tseng 206d0f809a
KAFKA-16909 Refactor GroupCoordinatorConfig with AbstractConfig (#16458)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-01 23:31:53 +08:00
Logan Zhu 33f5995ec3
MINOR: Eliminate warnings for AdminUtils#assignReplicasToBrokersRackAware (#16470)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-30 22:15:55 +08:00
Xiduo You bb3c35b265
MINOR: Align coordinator loader metrics data type (#16481)
Reviewers: David Jacot <david.jacot@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-30 00:31:53 +08:00
TingIāu "Ting" Kì e57cbe0346
KAFKA-17022 Fix error-prone in KafkaApis#handleFetchRequest (#16455)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-28 14:53:07 +08:00
José Armando García Sancio 9be27e715a
MINOR; Fix incompatible change to the kafka config (#16464)
Prior to KIP-853, users were not allowed to enumerate listeners specified in `controller.listener.names` in the `advertised.listeners`. This decision was made in 3.3 because the `controller.quorum.voters` property is in effect the list of advertised listeners for all of the controllers.

KIP-853 is moving away from `controller.quorum.voters` in favor of a dynamic set of voters. This means that the user needs to have a way of specifying the advertised listeners for controller.

This change allows the users to specify listener names in `controller.listener.names` in `advertised.listeners`. To make this change forwards compatible (use a valid configuration from 3.8 in 3.9), the controller's advertised listeners are going to get computed by looking up the endpoint in `advertised.listeners`. If it doesn't exist, the controller will look up the endpoint in the `listeners` configuration.
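
The lookup order described above can be sketched as follows. This is a minimal illustration, assuming both configurations have already been parsed into maps from listener name to endpoint string; the class `ControllerEndpointLookup` and its `resolve` method are hypothetical and not part of the PR.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper showing the fallback order for a controller listener's endpoint. */
final class ControllerEndpointLookup {
    private ControllerEndpointLookup() { }

    /**
     * Looks up the endpoint in the advertised listeners first and falls back to the
     * plain listeners when no advertised entry exists for that listener name.
     */
    static Optional<String> resolve(
            String listenerName,
            Map<String, String> advertisedListeners,
            Map<String, String> listeners) {
        String endpoint = advertisedListeners.get(listenerName);
        if (endpoint == null) {
            endpoint = listeners.get(listenerName);
        }
        return Optional.ofNullable(endpoint);
    }
}
```

Because of the fallback, a configuration that never lists the controller listener in `advertised.listeners` still resolves to an endpoint.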

This change also includes a fix to the BeginQuorumEpoch request where the default value for VoterId was 0 instead of -1.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-06-27 21:24:25 -04:00
Colin Patrick McCabe ebaa108967
KAFKA-16968: Introduce 3.8-IV0, 3.9-IV0, 3.9-IV1
Create 3 new metadata versions:

- 3.8-IV0, for the upcoming 3.8 release.
- 3.9-IV0, to add support for KIP-1005.
- 3.9-IV1, as the new release vehicle for KIP-966.

Create ListOffsetRequest v9, which will be used in 3.9-IV0 to support KIP-1005. v9 is currently an unstable API version.

Reviewers: Jun Rao <junrao@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-06-27 14:03:03 -07:00
Abhinav Dixit 49e9bd4a5b
KAFKA-16754: Implemented release acquired records functionality to SharePartition (#16430)
About
Implemented release acquired records functionality in SharePartition. This functionality is used when a share session gets closed, hence all the acquired records should either move to AVAILABLE or ARCHIVED state. Implemented the following functions -

1. releaseAcquiredRecords - This function is executed when the acquisition lock timeout is reached. The function releases the acquired records.
2. releaseAcquiredRecordsForCompleteBatch - Function which releases acquired records maintained at a batch level.
3. releaseAcquiredRecordsForPerOffsetBatch - Function which releases acquired records maintained at an offset level.
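
A simplified picture of the release step is sketched below. The commit message does not say how the choice between AVAILABLE and ARCHIVED is made; the rule used here (archive once a hypothetical delivery-count limit is reached) is an assumption for illustration, as are the names `ReleaseSketch`, `Tracked`, and `maxDeliveryCount`.

```java
import java.util.TreeMap;

/** Hypothetical, per-offset-only model of releasing acquired records. */
final class ReleaseSketch {
    enum RecordState { AVAILABLE, ACQUIRED, ARCHIVED }

    /** State tracked for a single offset in this sketch. */
    static final class Tracked {
        RecordState state;
        final int deliveryCount;

        Tracked(RecordState state, int deliveryCount) {
            this.state = state;
            this.deliveryCount = deliveryCount;
        }
    }

    final TreeMap<Long, Tracked> records = new TreeMap<>();
    final int maxDeliveryCount; // Assumed limit; not stated in the commit message.

    ReleaseSketch(int maxDeliveryCount) {
        this.maxDeliveryCount = maxDeliveryCount;
    }

    /** Moves every ACQUIRED record to AVAILABLE or ARCHIVED. Returns the number released. */
    int releaseAcquiredRecords() {
        int released = 0;
        for (Tracked tracked : records.values()) {
            if (tracked.state != RecordState.ACQUIRED) {
                continue; // Only acquired records are affected by a release.
            }
            tracked.state = tracked.deliveryCount >= maxDeliveryCount
                    ? RecordState.ARCHIVED
                    : RecordState.AVAILABLE;
            released++;
        }
        return released;
    }
}
```

After the call, no record remains in the ACQUIRED state, which matches the requirement that all acquired records move to either AVAILABLE or ARCHIVED.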

Testing
Added unit tests to cover the new functionality added.


Reviewers:  Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-06-27 15:33:46 +05:30
Abhinav Dixit 399949ebcf
KAFKA-16751: Implemented release acquired records functionality in SharePartitionManager (#16446)
About
Implemented releaseAcquiredRecords functionality in SharePartitionManager which will act as a bridge between the call from KafkaApis to SharePartition for releasing the acquired records when a share session gets closed.

Testing
The added function has been tested with unit tests.

Reviewers:  Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-06-27 02:18:54 +05:30