Commit Graph

4885 Commits

Author SHA1 Message Date
NICOLAS GUYOMAR 8db7b31c8b
MINOR: Lower InvalidProducerEpochException log level (#16407)
An InvalidProducerEpochException requires no action from a Kafka administrator. Logging it at ERROR level is alarming, even though the exception can occur, by design, for a variety of valid reasons.

Proposing to lower the log level from ERROR to INFO

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-06-20 16:24:53 -07:00
Andras Katona 7b23976e78
KAFKA-17000 Separating creation and configuration of Authorizer in AuthorizerTest (#16401)
Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Dániel Urbán <urb.daniel7@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-21 00:51:32 +08:00
Kamal Chandraprakash 4fe08f3b29
KAFKA-16976 Update the current/dynamic config inside RemoteLogManagerConfig (#16394)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-20 23:33:35 +08:00
Ken Huang f79e429044
KAFKA-16939 Revisit ConfigCommandIntegrationTest (#16317)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-20 09:40:47 +08:00
Andrew Schofield a6718dbbdb
KAFKA-16725: Adjust share group configs to match KIP (#16368)
A few of the share group configs introduced for KIP-932 were defined with limits that do not match the KIP. This PR corrects the limits.


Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-06-19 21:31:29 +05:30
TaiJuWu c0add50a99
MINOR: Add interface for aliveBroker and isShutdown for Brokers. (#16323)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-19 22:32:08 +08:00
gongxuanzhang 96989e4b64
KAFKA-10787 Apply spotless to core module (#16392)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-19 14:24:42 +08:00
Apoorv Mittal f2a552a1eb
KAFKA-16753: Implement share acknowledge API in partition (KIP-932) (#16339)
The share-partition leader keeps track of the state and delivery attempts for in-flight records. However, delivery-attempt tracking follows at-least-once semantics.

The consumer processes the records and acknowledges them upon successful consumption. This successful attempt triggers a transition to the "Acknowledged" state.

The code implements the functionality to acknowledge the offset/batches in the request to in-memory cached data.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-06-18 22:37:59 +05:30
TingIāu "Ting" Kì 6b341e6ca7
KAFKA-16547 Add test for DescribeConfigsOptions#includeDocumentation (#16355)
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-18 18:45:31 +08:00
Kamal Chandraprakash 8abeaf3cb4
KAFKA-15265: Reapply dynamic remote configs after broker restart (#16353)
The following remote log configs can be set dynamically:
1. remote.log.manager.copy.max.bytes.per.second
2. remote.log.manager.fetch.max.bytes.per.second
3. remote.log.index.file.cache.total.size.bytes

If these values have been configured dynamically, this change ensures that a broker restart loads the dynamic values instead of the static values from the config.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
2024-06-18 09:39:35 +05:30
Igor Soarez ceab1fe658
KAFKA-16969: Log error if config conflicts with MV (#16366)
When the broker configuration is incompatible with the current Metadata Version, the Broker should log an error-level message but avoid shutting down.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-18 11:10:34 +08:00
Gantigmaa Selenge 166d9e8059
KAFKA-15751, KAFKA-15752: Enable KRaft for BaseAdminIntegrationTest and SaslSslAdminIntegrationTest (#15175)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-06-17 19:38:52 +02:00
Chia Chuan Yu 768e90f667
KAFKA-16669 Remove extra collection copy when generating DescribeAclsResource (#15924)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-17 14:47:44 +08:00
Harry Fallows 9c7d81b436
KAFKA-10190 Set dynamic broker configs for entity default (#16280)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-17 13:53:13 +08:00
PoAn Yang a9d71d1312
KAFKA-16898 move TimeIndexTest and TransactionIndexTest to storage module (#16341)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-17 09:11:17 +08:00
Andrew Schofield fecbfb8133
KAFKA-16950: Define Persister interfaces and RPCs (#16335)
Define the interfaces and RPCs for share-group persistence (KIP-932). This PR contains only the RPCs and interfaces, to allow building the broker components which depend upon them. The implementation will follow in subsequent PRs.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-06-15 20:52:49 +05:30
Apoorv Mittal adee6bae45
KAFKA-16740: Added additional APIs for Share Partition (#16340)
Added additional APIs for SharePartition which shall be used by SharePartitionManager.

The lock API on SharePartition prevents concurrent fetch requests from being issued to the replica manager for the same SharePartition. The updateCacheAndOffsets API updates the cache and the corresponding offsets when SharePartitionManager encounters an exception caused by movement of the Log Start Offset.
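
As a rough illustration of the locking idea only (not the actual SharePartition code), a compare-and-set flag lets exactly one caller start a fetch at a time. The class `FetchLockSketch` and the method names `maybeAcquireFetchLock` and `releaseFetchLock` are hypothetical names chosen for this sketch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the fetch lock on a share partition.
class FetchLockSketch {
    private final AtomicBoolean fetchLock = new AtomicBoolean(false);

    // Returns true only for the single caller that wins the lock.
    boolean maybeAcquireFetchLock() {
        return fetchLock.compareAndSet(false, true);
    }

    // Releases the lock so that a later fetch can proceed.
    void releaseFetchLock() {
        fetchLock.set(false);
    }

    public static void main(String[] args) {
        FetchLockSketch partition = new FetchLockSketch();
        if (partition.maybeAcquireFetchLock()) {
            try {
                System.out.println("fetching for this share partition");
            } finally {
                partition.releaseFetchLock();
            }
        }
    }
}
```

A caller that fails to acquire the lock would simply skip or defer its fetch, which is one way to keep two fetches for the same partition from overlapping.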

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-06-15 12:25:05 +05:30
Francois Visconte 817da3fb5d
KAFKA-16895 fix off-by-one bug in RemoteCopyLagSegments (#16210)
Reviewers: Kamal Chandraprakash <kchandraprakash@uber.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-15 09:10:25 +08:00
Kirk True 8f86b9c4ec
KAFKA-16637 AsyncKafkaConsumer removes offset fetch responses from cache too aggressively (#16310)
Allow the committed offsets fetch to run for as long as needed. This handles the case where a user invokes Consumer.poll() with a very small timeout (including zero).

Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-15 08:48:53 +08:00
TingIāu "Ting" Kì 4e2f26bfc6
KAFKA-16917 DescribeTopicsResult should use mutable map in order to keep compatibility (#16250)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-14 23:48:35 +08:00
Omnia Ibrahim e99da2446c
KAFKA-15853: Move KafkaConfig.configDef out of core (#16116)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-14 17:26:00 +02:00
Abhinav Dixit 8f6e0513df
KAFKA-16747: Implemented share sessions and contexts for share fetch requests (#16263)
About

KIP-932 introduces share sessions for share groups. This PR implements share sessions and contexts for incoming share fetch requests on broker. The changes include:

Defined the CachedSharePartition class, whose instances are stored in share sessions.
Defined the ShareSessionKey and ShareSession classes.
Defined the ShareSessionCache class, which caches all share sessions and applies the eviction policy defined in KIP-932.

Defined two types of context:
a. ShareSessionContext - for share session fetch request.
b. FinalContext - for final share fetch request (epoch = -1).

Defined the newContext function, which returns the created or updated context when the broker receives a share fetch request.
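
A heavily simplified sketch of the selection described above, assuming the epoch alone decides which context is returned. All type names ending in `Sketch` and the `ShareContextSelector` class are hypothetical; the real classes carry session state and cache lookups that are omitted here.

```java
// Hypothetical, simplified context types for illustration only.
interface ShareFetchContextSketch {}

final class SessionContextSketch implements ShareFetchContextSketch {
    final int epoch;

    SessionContextSketch(int epoch) {
        this.epoch = epoch;
    }
}

final class FinalContextSketch implements ShareFetchContextSketch {}

class ShareContextSelector {
    // The commit message states that the final share fetch request uses epoch = -1.
    static final int FINAL_EPOCH = -1;

    // Picks the context type from the request epoch (assumption: nothing else matters).
    static ShareFetchContextSketch newContext(int requestEpoch) {
        if (requestEpoch == FINAL_EPOCH) {
            return new FinalContextSketch();
        }
        return new SessionContextSketch(requestEpoch);
    }

    public static void main(String[] args) {
        System.out.println(newContext(-1).getClass().getSimpleName());
        System.out.println(newContext(0).getClass().getSimpleName());
    }
}
```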

Testing
The added code has been tested with the help of unit tests present in the PR.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-06-14 16:55:27 +05:30
Apoorv Mittal 1565d41cd7
KAFKA-16752: Implemented acquire functionality for Fetch (KIP-932) (#16274)
Implements the share-fetch next-fetch-offset in the share partition and the acquisition of records from the log.

The Next Fetch Offset (NFO) determines where the Share Partition should initiate the next data read from the Replica Manager. It is typically the last offset of the most recently returned batch plus one (last offset + 1), but there are exceptions. Messages marked available again due to release acknowledgements or lock timeouts can cause the NFO to shift.

The acquire method caches the batches as acquired in-memory and spawns a timer task for lock timeout.

Cache
1. Per-offset Metadata: Simple to implement but inefficient. Every offset requires in-memory storage and traversal, leading to high memory usage and processing overhead, especially for per-batch acknowledgements (mostly the way records would be acknowledged).

2. Per-Replica Fetch Batch: This approach aligns with the Replica Manager fetch batches. Since a full Replica Manager batch is retrieved whenever the requested offset falls within that batch's boundaries, a single Share Fetch request will likely receive an entire Replica Manager batch. However, there's a trade-off. Replica Manager batches are based on producer batching. If producers don't batch effectively, the in-flight metadata becomes heavily reliant on the producer's batching behavior.

For per-message acknowledgements, per-offset tracking will be necessary, which again requires splitting in-flight batches based on state. Splitting batches is inefficient as it requires updating the cache, which maintains sorted order. Therefore, we propose a hybrid approach:

Implemented a combination of option 2 (per-in-flight batch tracking) with option 1 (per-offset tracking). This aligns well with Replica Manager batching.

States shall be maintained per in-flight batch. If state inconsistencies arise within in-flight batches due to per-message acknowledgements, switch state tracking for the respective batch to option 1 (per-offset tracking).
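
The hybrid approach can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the PR: the names `RecordStateSketch`, `InFlightBatchSketch`, `update`, and `stateOf` are hypothetical, and lock timers and delivery counts are left out.

```java
import java.util.TreeMap;

// Hypothetical record states for the sketch.
enum RecordStateSketch { AVAILABLE, ACQUIRED, ACKNOWLEDGED }

class InFlightBatchSketch {
    final long firstOffset;
    final long lastOffset;
    // State for the whole batch while it is tracked per batch.
    RecordStateSketch batchState;
    // Per-offset states; stays null until the batch needs per-offset tracking.
    TreeMap<Long, RecordStateSketch> offsetState;

    InFlightBatchSketch(long firstOffset, long lastOffset, RecordStateSketch state) {
        this.firstOffset = firstOffset;
        this.lastOffset = lastOffset;
        this.batchState = state;
    }

    // Applies a new state to the inclusive offset range [from, to].
    void update(long from, long to, RecordStateSketch newState) {
        if (from < firstOffset || to > lastOffset || from > to) {
            throw new IllegalArgumentException("range outside batch");
        }
        // A full-batch update keeps the cheap per-batch tracking.
        if (offsetState == null && from == firstOffset && to == lastOffset) {
            batchState = newState;
            return;
        }
        // A partial update would make the batch inconsistent, so switch to per-offset tracking.
        if (offsetState == null) {
            offsetState = new TreeMap<>();
            for (long o = firstOffset; o <= lastOffset; o++) {
                offsetState.put(o, batchState);
            }
        }
        for (long o = from; o <= to; o++) {
            offsetState.put(o, newState);
        }
    }

    RecordStateSketch stateOf(long offset) {
        return offsetState == null ? batchState : offsetState.get(offset);
    }

    boolean isPerOffset() {
        return offsetState != null;
    }
}
```

In this sketch a batch pays the per-offset memory cost only after a per-message acknowledgement touches part of it; batches acknowledged as a whole never allocate the map.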


Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, Abhinav Dixit <144765188+adixitconfluent@users.noreply.github.com>
2024-06-14 10:31:56 +05:30
Kamal Chandraprakash a5c71bd6cb
KAFKA-16948: Reset tier lag metrics on becoming follower (#16321)
When the node transitions from leader to follower for a partition, the tier-lag metrics should be reset to zero. Otherwise, the metrics would report false positives. This change also addresses a concurrency issue when emitting the metrics.

Reviewers: Satish Duggana <satishd@apache.org>, Francois Visconte <f.visconte@gmail.com>
2024-06-14 09:50:12 +05:30
dujian0068 133f2b0f31
KAFKA-16879 SystemTime should use singleton mode (#16266)
Reviewers: Greg Harris <gharris1727@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-14 08:49:19 +08:00
Edoardo Comar f380cd1b64
MINOR: Add integration tag to AdminFenceProducersIntegrationTest (#16326)
Add @Tag("integration") to AdminFenceProducersIntegrationTest

Reviewers: Chris Egerton <chrise@aiven.io>
2024-06-13 15:01:08 +01:00
gongxuanzhang 596b945072
KAFKA-16643 Add ModifierOrder checkstyle rule (#15890)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-13 15:39:32 +08:00
TingIāu "Ting" Kì 0a203a9622
KAFKA-16938 non-dynamic props get corrupted due to circular reference between DynamicBrokerConfig and DynamicConfig. (#16302)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-13 09:47:51 +08:00
Kamal Chandraprakash cf5a86b654
KAFKA-16890: Compute valid log-start-offset when deleting overlapping remote segments (#16237)
The listRemoteLogSegments returns the metadata list sorted by the start-offset. However, the returned metadata list contains all the uploaded segment information including the duplicate and overlapping remote-log-segments. The reason for duplicate/overlapping remote-log-segments cases is explained [here](https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/server/log/remote/metadata/storage/RemoteLogLeaderEpochState.java#L103).

The list returned by RLMM#listRemoteLogSegments can contain duplicate segment metadata at the end of the list. So, while computing the next log-start-offset, we should take the maximum of (end-offset + 1) across the segments rather than relying on the last segment in the list.
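
A minimal sketch of that computation, assuming each segment exposes only a start and end offset. The names `RemoteSegmentSketch` and `LogStartOffsetSketch.nextLogStartOffset` are hypothetical and do not correspond to the real remote-log classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, minimal view of a remote log segment.
class RemoteSegmentSketch {
    final long startOffset;
    final long endOffset;

    RemoteSegmentSketch(long startOffset, long endOffset) {
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

class LogStartOffsetSketch {
    // Takes the maximum of (endOffset + 1) over all deleted segments,
    // so a shorter duplicate at the end of the list cannot move the result backwards.
    static long nextLogStartOffset(long currentLogStartOffset, List<RemoteSegmentSketch> deleted) {
        long next = currentLogStartOffset;
        for (RemoteSegmentSketch segment : deleted) {
            next = Math.max(next, segment.endOffset + 1);
        }
        return next;
    }

    public static void main(String[] args) {
        // Sorted by start offset; the last entry overlaps the previous one and ends earlier.
        List<RemoteSegmentSketch> deleted = Arrays.asList(
            new RemoteSegmentSketch(0, 99),
            new RemoteSegmentSketch(100, 199),
            new RemoteSegmentSketch(100, 149));
        System.out.println(nextLogStartOffset(0, deleted));
    }
}
```

With the example list, using only the last segment would give 150, while taking the maximum gives 200.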

Reviewers: Satish Duggana <satishd@apache.org>
2024-06-13 05:18:30 +05:30
Ken Huang 05b1380ecb
KAFKA-16897 Move OffsetIndexTest and OffsetMapTest to storage module (#16244)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-13 06:24:23 +08:00
Ivan Yurchenko dd755b7ea9
KAFKA-8206: KIP-899: Allow client to rebootstrap (#13277)
This commit implements KIP-899: Allow producer and consumer clients to rebootstrap. It introduces the new setting `metadata.recovery.strategy`, applicable to all the types of clients.
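
For illustration, client properties enabling the new behaviour might be built as below. The value `rebootstrap` is taken from KIP-899 rather than from this commit message, and the class name `RebootstrapConfigSketch` and the broker addresses are placeholders.

```java
import java.util.Properties;

class RebootstrapConfigSketch {
    // Builds client properties with rebootstrapping enabled.
    static Properties clientProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Assumption from KIP-899: valid values are "none" and "rebootstrap".
        props.setProperty("metadata.recovery.strategy", "rebootstrap");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder addresses for the sketch.
        Properties props = clientProps("broker1:9092,broker2:9092");
        System.out.println(props.getProperty("metadata.recovery.strategy"));
    }
}
```

The same properties object could then be passed to a producer, consumer, or admin client constructor, since the setting applies to all client types.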

Reviewers: Greg Harris <gharris1727@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
2024-06-12 20:48:32 +01:00
Edoardo Comar 615e6e705c
KAFKA-16570 FenceProducers API returns "unexpected error" when succes… (#16229)
KAFKA-16570 FenceProducers API returns "unexpected error" when successful

* Client handling of ConcurrentTransactionsException as retriable
* Unit test
* Integration test

Reviewers: Chris Egerton <chrise@aiven.io>, Justine Olshan <jolshan@confluent.io>
2024-06-12 17:07:33 +01:00
Abhijeet Kumar b5fb6543a2
KAFKA-15265: Dynamic broker configs for remote fetch/copy quotas (#16078)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Satish Duggana <satishd@apache.org>
2024-06-12 19:47:46 +05:30
Dmitry Werner faee6a4385
MINOR: Use predetermined dir IDs in ReplicationQuotasTest
Use predetermined directory IDs instead of Uuid.randomUuid() in ReplicationQuotasTest.

Reviewers: Igor Soarez <soarez@apple.com>
2024-06-12 11:44:11 +01:00
David Jacot 638844f833
KAFKA-16770; [2/2] Coalesce records into bigger batches (#16215)
This patch is the continuation of https://github.com/apache/kafka/pull/15964. It introduces record coalescing to the CoordinatorRuntime. It also introduces a new configuration, `group.coordinator.append.linger.ms`, which allows administrators to choose the linger time or disable it with zero. The new configuration defaults to 10ms.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-06-11 23:29:50 -07:00
Abhijeet Kumar 23fe71d579
KAFKA-15265: Integrate RLMQuotaManager for throttling copies to remote storage (#15820)
- Added the integration of the quota manager to throttle copy requests to the remote storage. Reference KIP-956
- Added unit-tests for the copy throttling logic.

Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
2024-06-12 06:27:02 +05:30
Nikolay aecaf44475
KAFKA-16520: Support KIP-853 in DescribeQuorum (#16106)
Add support for KIP-853 KRaft Quorum reconfiguration in the DescribeQuorum request and response.
Also add support to AdminClient.describeQuorum, so that users will be able to find the current set of
quorum nodes, as well as their directories, via these RPCs.

Reviewers: Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Andrew Schofield <aschofield@confluent.io>
2024-06-11 10:01:35 -07:00
Kamal Chandraprakash f3dbd7ed08
KAFKA-16904: Metric to measure the latency of remote read requests (#16209)
Reviewers: Satish Duggana <satishd@apache.org>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>
2024-06-11 21:07:12 +05:30
Abhinav Dixit 99eacf1b61
KAFKA-16914: Added share group dynamic and broker configs (#16268)
KIP-932 introduces a bunch of broker and dynamic configs for share groups. This PR adds those new configs. The changes include:

1. Defined the ShareGroupConfigs class, which stores various share group configurations.
2. Used the defined share configs in KafkaConfig.scala to make them available to BrokerServer.
3. Added a few tests to validate the conditions on these new configs.


Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-06-11 16:10:15 +05:30
Gyeongwon, Do 1426e8e920
KAFKA-16764: New consumer should throw InvalidTopicException on poll when invalid topic in metadata. (#16043)
Propagate metadata errors from the background thread to the application thread.
This fix ensures that metadata errors are thrown to the user on consumer.poll().

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Philip Nee <pnee@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
2024-06-10 18:30:29 +02:00
Kamal Chandraprakash f359908fcd
KAFKA-15776: Support added to update remote.fetch.max.wait.ms dynamically (#16203)
Reviewers: Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
2024-06-10 20:42:12 +05:30
Max Riedel 40de07dab5
KAFKA-14509; [4/4] Handle includeAuthorizedOperations in ConsumerGroupDescribe API (#16158)
This patch implements the handling of `includeAuthorizedOperations` flag in the ConsumerGroupDescribe API.

Reviewers: David Jacot <djacot@confluent.io>
2024-06-10 05:07:45 -07:00
Gantigmaa Selenge 53b048bf0b
KAFKA-15718: Refactor UncleanLeaderElectionTest to enable KRaft later (#16157)
Refactor UncleanLeaderElectionTest so that KRaft can be enabled for it later

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-10 19:15:34 +08:00
Chia Chuan Yu e5b8712993
KAFKA-16885 Renamed the enableRemoteStorageSystem to isRemoteStorageSystemEnabled (#16256)
Reviewers: Kamal Chandraprakash <kchandraprakash@uber.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-10 02:14:15 +08:00
Gaurav Narula b8b248b1e8
KAFKA-16920: close BrokerLifecycleManager in tests (#16252)
Tests in BrokerLifecycleManagerTest do not close BrokerLifecycleManager
if an assertion fails.

This change makes BrokerLifecycleManager an instance variable that is
closed in an `@AfterEach` method.

Reviewers: Igor Soarez <i@soarez.me>
2024-06-09 00:03:00 +01:00
PoAn Yang 3d5d1504f7
KAFKA-16878 Remove powermock and easymock from code base (#16236)
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-09 00:17:43 +08:00
Igor Soarez 5a5a292146
MINOR: Fix broken ReassignPartitionsCommandTest test (#16251)
KAFKA-16606 (#15834) introduced a change that broke
ReassignPartitionsCommandTest.testReassignmentCompletionDuringPartialUpgrade.

The point was to validate that the MetadataVersion supports JBOD
in KRaft when multiple log directories are configured.
We do that by checking the version used in
kafka-features.sh upgrade --metadata, and the version discovered
via a FeatureRecord for metadata.version in the cluster metadata.

There's no point in checking inter.broker.protocol.version in
KafkaConfig, since in KRaft, that configuration is deprecated
and ignored — always assuming the value of MINIMUM_KRAFT_VERSION.

The test that was broken sets inter.broker.protocol.version in
KRaft mode and configures 3 directories. So alternatively, we
could change the test to not configure this property.
Since the property isn't forbidden in KRaft mode, just ignored,
and operators may forget to remove it, it seems better to remove
the fail condition in KafkaConfig.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-08 14:06:46 +03:00
Igor Soarez 7c0fff7c36
KAFKA-16606 Gate JBOD configuration on 3.7-IV2 (#15834)
Support for multiple log directories in KRaft exists from
MetadataVersion 3.7-IV2.

When migrating a ZK broker to KRaft, we already check that
the IBP is high enough before allowing the broker to start up.

With KIP-584 and KIP-778, Brokers in KRaft mode do not require
the IBP configuration - the configuration is deprecated.
In KRaft mode inter.broker.protocol.version defaults to
MetadataVersion.MINIMUM_KRAFT_VERSION (IBP_3_0_IV1).

Instead KRaft brokers discover the MetadataVersion by reading
the "metadata.version" FeatureLevelRecord from the cluster metadata.

This change adds a new configuration validation step upon discovering
the "metadata.version" from the cluster metadata.

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-06-07 09:11:57 +01:00
Kirk True d6cd83e2fb
KAFKA-16200: Enforce that RequestManager implementations respect user-provided timeout (#16031)
Improve consistency and correctness for user-provided timeouts at the Consumer network request layer, per the Java client Consumer timeouts design (https://cwiki.apache.org/confluence/display/KAFKA/Java+client+Consumer+timeouts). While the changes introduced in KAFKA-15974 enforce timeouts at the Consumer's event layer, this change enforces timeouts at the network request layer.

The changes mostly fit into the following areas:

1. Create shared code and idioms so timeout handling logic is consistent across current and future RequestManager implementations
2. Use deadlineMs instead of expirationMs, expirationTimeoutMs, retryExpirationTimeMs, timeoutMs, etc.
3. Update "preemptive pruning" to remove expired requests that have had at least one attempt
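
The deadline idiom in the list above can be sketched as follows. This is an illustration under stated assumptions, not the shared code from the PR: the class `DeadlineSketch` and its method names are hypothetical, and `nowMs` is assumed to be a non-negative wall-clock reading.

```java
class DeadlineSketch {
    // Converts a relative timeout into an absolute deadline.
    // Assumes nowMs >= 0 and timeoutMs >= 0; saturates instead of overflowing.
    static long calculateDeadlineMs(long nowMs, long timeoutMs) {
        if (timeoutMs > Long.MAX_VALUE - nowMs) {
            return Long.MAX_VALUE;
        }
        return nowMs + timeoutMs;
    }

    // A request is expired once the current time reaches its deadline.
    static boolean isExpired(long nowMs, long deadlineMs) {
        return nowMs >= deadlineMs;
    }

    // Time left before the deadline, never negative.
    static long remainingMs(long nowMs, long deadlineMs) {
        return Math.max(0, deadlineMs - nowMs);
    }

    // Mirrors item 3: prune only expired requests that were attempted at least once.
    static boolean shouldPrune(int attempts, long nowMs, long deadlineMs) {
        return attempts >= 1 && isExpired(nowMs, deadlineMs);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long deadlineMs = calculateDeadlineMs(now, 0); // zero timeout, as with poll(Duration.ZERO)
        System.out.println(shouldPrune(0, now, deadlineMs));
        System.out.println(shouldPrune(1, now, deadlineMs));
    }
}
```

Carrying one absolute `deadlineMs` through retries avoids recomputing a relative timeout at every layer, and requiring at least one attempt before pruning gives a request with a zero timeout a chance to be sent.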

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Bruno Cadonna <cadonna@apache.org>
2024-06-07 09:53:27 +02:00
Apoorv Mittal c01279b92a
KAFKA-16905: Fix blocking DescribeCluster call in AdminClient DescribeTopics (#16217)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, David Arthur <mumrah@gmail.com>
2024-06-06 18:11:43 -04:00