15214 Commits

Author SHA1 Message Date
TaiJuWu 934b0159bb
KAFKA-18089: Upgrade Caffeine lib to 3.1.8 (#18004)
- Fixed the RemoteIndexCacheTest that fails with caffeine > 3.1.1

Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
2025-02-18 21:51:38 +05:30
TaiJuWu 4c8d96c0f0
KAFKA-18767: Add client side config check for shareConsumer (#18850)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-18 15:57:56 +00:00
Parker Chang ed366e6b89
MINOR: Align assertFutureThrows method signature with JUnit conventions (#18825)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-02-18 15:56:42 +00:00
Mickael Maison 0a2fab9310
KAFKA-14484: Decouple UnifiedLog and RemoteLogManager (#18460)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2025-02-18 15:10:31 +01:00
Andrew Schofield 6c14f64245
MINOR: Rename NoOpShareStatePersister for consistency (#18933)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-18 14:07:59 +00:00
Chirag Wadhwa 63229a768c
KAFKA-16718 [1/n]: Added DeleteShareGroupOffsets request and response schema (#18927)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-18 14:06:24 +00:00
Bruno Cadonna d6b6952d48
KAFKA-18736: Add Streams group heartbeat request manager (1/N) (#18870)
This commit adds the Streams group heartbeat request manager
to the async consumer. The Streams group heartbeat request
manager is responsible for sending heartbeat requests and
processing their responses.

This commit implements:
- sending of full heartbeat request (independent of any state)
- processing successful response

Reviewers: Bill Bejeck <bill@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2025-02-18 13:45:01 +01:00
David Jacot 657154dfb8
MINOR: Update LICENSE-binary (#18943)
Before the patch:
```
% python3 ./committer-tools/verify_license.py

...

All libs from ./libs are present in the LICENSE file.

The following entries are in the LICENSE file but not present in ./libs. These should be removed from the LICENSE-binary file:
 - audience-annotations-0.12.0
 - jackson-jaxrs-base-2.16.2
 - jackson-jaxrs-json-provider-2.16.2
 - jackson-module-jaxb-annotations-2.16.2
 - jakarta.inject-2.6.1
 - javax.servlet-api-3.1.0
 - jetty-continuation-9.4.56.v20240826
 - jetty-servlet-9.4.56.v20240826
 - jetty-servlets-9.4.56.v20240826
 - jetty-util-ajax-9.4.56.v20240826
 - jsr305-3.0.2
 - log4j-core-test-2.24.1
```

After the patch:
```
% python3 ./committer-tools/verify_license.py

...

All libs from ./libs are present in the LICENSE file.

No extra dependencies in the LICENSE file.
```

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2025-02-18 13:24:03 +01:00
David Jacot 5413063441
MINOR: Add verify_license tool (#18931)
This patch adds the verify_license.py tool. It compares the libraries shipped within the tarball to the LICENSE file, and vice versa, to ensure that they are aligned. It also slightly updates the format of the LICENSE file to make it easier to parse.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
2025-02-18 12:07:37 +01:00
Andrew Schofield 385b7ad355
MINOR: Align share group admin authz with consumer group (#18936)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2025-02-18 09:12:07 +00:00
Luke Chen 1f47a78a10
MINOR: add docs for "org.apache.kafka.sasl.oauthbearer.allowed.urls" (#18938)
add docs for "org.apache.kafka.sasl.oauthbearer.allowed.urls"

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2025-02-18 16:47:43 +08:00
xijiu 1dcdbf78bb
KAFKA-18798 The replica placement policy used by ReassignPartitionsCommand is not aligned with kraft controller (#18914)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-18 16:02:00 +08:00
Kamal Chandraprakash da3643c6b4
KAFKA-18787: RemoteIndexCache fails to delete invalid files on init (#18888)
The stale/invalid files that end with ".deleted" and ".tmp" should be cleaned up when the broker is restarted.

- fix the remote-index-cache test to use the logDir instead of topicDir
- fix the flaky test
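
A minimal sketch of the cleanup idea described above, not the actual RemoteIndexCache code: on startup, scan a cache directory and delete files whose names end with ".deleted" or ".tmp". The class name `StaleIndexFileCleaner` and both method names are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; the names here are assumptions, not Kafka APIs.
final class StaleIndexFileCleaner {

    // A file is treated as stale if its name ends with one of the two suffixes
    // mentioned in the commit message.
    static boolean isStale(String fileName) {
        return fileName.endsWith(".deleted") || fileName.endsWith(".tmp");
    }

    // Deletes stale regular files directly under cacheDir and returns the deleted paths.
    static List<Path> deleteStaleFiles(Path cacheDir) {
        List<Path> candidates;
        try (Stream<Path> entries = Files.list(cacheDir)) {
            candidates = entries
                .filter(Files::isRegularFile)
                .filter(p -> isStale(p.getFileName().toString()))
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<Path> deleted = new ArrayList<>();
        for (Path p : candidates) {
            try {
                if (Files.deleteIfExists(p)) {
                    deleted.add(p);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return deleted;
    }
}
```

Keeping the suffix check in a separate predicate makes it easy to unit-test without touching the file system.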

Reviewers: Luke Chen <showuon@gmail.com>
2025-02-18 12:56:03 +05:30
Sean Quah 1c9190d6b1
KAFKA-18807; Fix thread idle ratio metric (#18934)
When group.coordinator.threads is greater than 1, we lose track of thread idle time because of integer arithmetic. Use doubles instead.
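
The following sketch illustrates the general pitfall, under the assumption that each thread's idle time is divided by the thread count before being accumulated; it is not the actual metric code, and the class and method names are hypothetical. With integer arithmetic, small per-thread values truncate to zero, while doubles preserve the fractional share.

```java
// Hypothetical illustration of integer truncation; not the Kafka implementation.
final class IdleTimeSketch {

    // Integer division drops the remainder on every sample.
    static long accumulateWithLongs(long[] idleMsSamples, int numThreads) {
        long total = 0;
        for (long idleMs : idleMsSamples) {
            total += idleMs / numThreads;
        }
        return total;
    }

    // Floating-point division keeps the fractional part of every sample.
    static double accumulateWithDoubles(long[] idleMsSamples, int numThreads) {
        double total = 0.0;
        for (long idleMs : idleMsSamples) {
            total += (double) idleMs / numThreads;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] samples = {1, 1, 1, 1};
        System.out.println(accumulateWithLongs(samples, 4));   // prints 0
        System.out.println(accumulateWithDoubles(samples, 4)); // prints 1.0
    }
}
```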

Reviewers: David Jacot <djacot@confluent.io>
2025-02-18 08:11:38 +01:00
David Jacot 2ecc16b987
MINOR: Remove dropwizard metrics from dependencies.gradle (#18932)
This patch removes dropwizard metrics in the dependency list as it is not used any more. It was introduced in 4f5b4c868e because it was required by Zookeeper. Zookeeper is no longer there so we can remove it too.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2025-02-18 08:10:06 +01:00
Matthias J. Sax 87f797811b
HOTFIX: StoreChangelogReader should require stable consumer group (#18901)
Fixing regression bug, introduced by beac86f049

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bruno Cadonna <bruno@confluent.io>
2025-02-17 12:53:13 -08:00
Kaushik Raina 35420eb11b
KAFKA-18684: Add base exception classes (#18871)
Introduced two new exception classes to the Kafka error handling framework:

ApplicationRecoverableException: This exception signals that the error is recoverable, but the producer needs to be restarted. It helps in scenarios where recovery actions (like re-balancing or restoring from checkpoints) are needed.

RefreshRetriableException: This exception occurs when metadata is outdated or invalid and needs to be refreshed before retrying the request. It helps handle retries that depend on updated metadata.

Both classes are abstract and in upcoming PRs they will be extended by relevant classes as mentioned in KIP-1050:Exception Table.

Reviewers: Justine Olshan <jolshan@confluent.io>, Sanskar Jhajharia <jhajharia.sanskar@gmail.com>
2025-02-17 12:11:51 -08:00
Apoorv Mittal 06ce3e890b
KAFKA-18733: Updating share group record acks metric (2/N) (#18924)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-17 18:12:58 +00:00
Jimmy Wang 98a7ce5caa
KAFKA-18801 Remove ClusterGenerator and revise ClusterTemplate javadoc (#18907)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>
2025-02-17 10:39:21 -05:00
Lucas Brutschy d0c65a1fd2
KAFKA-18730: Add replaying streams group state from offset topic (#18809)
Adds streams group to the GroupMetadataManager, and implements loading
the records from the offset topic into state. The state also contains
two timers (rebalance timeout and session timeout) that are started
after the group coordinator has been loaded.

Reviewers: Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bill@confluent.io>
2025-02-17 16:13:21 +01:00
PoAn Yang 2b6e868538
KAFKA-18784 Fix ConsumerWithLegacyMessageFormatIntegrationTest (#18889)
In PR #18267, we removed the old message format for cases in ConsumerWithLegacyMessageFormatIntegrationTest. Although the test cases pass, they don't fulfill their original purpose. We can't send the old message format since 4.0, so I changed the cases to append old records through ReplicaManager directly.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 20:43:29 +08:00
Andrew Schofield 9b7ad6ec32
MINOR: Mark testQuotaOverrideDelete as flaky (#18925)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 15:20:35 +08:00
Ken Huang d1db3d8e14
KAFKA-18805: add synchronized block for Consumer Heartbeat close (#18920)
add synchronized block for Consumer Heartbeat close.

Reviewers: Luke Chen <showuon@gmail.com>
2025-02-17 14:38:20 +08:00
Jimmy Wang 85c337af44
KAFKA-18755 Align timeout in kafka-share-groups.sh (#18908)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 04:48:25 +08:00
TengYao Chi 5cbe00e375
MINOR: Remove unused member in DynamicBrokerConfig (#18915)
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 04:46:25 +08:00
Sushant Mahajan 5235e11d4d
KAFKA-18809 Set min in sync replicas for __share_group_state. (#18922)
- The share.coordinator.state.topic.min.isr config defined in ShareCoordinatorConfig was not being used in the AutoTopicCreationManager.
- The AutoTopicCreationManager calls ShareCoordinatorService.shareGroupStateTopicConfigs to get the configs for the topic to create.
- The method ShareCoordinatorService.shareGroupStateTopicConfigs was not setting the supplied config value for share.coordinator.state.topic.min.isr to min.insync.replicas.
- In this PR, we remedy the situation by setting the value.
- A test has been added to ShareCoordinatorServiceTest so that this is not repeated for any configs.
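
As a rough sketch of the mapping described in the list above, the broker-level value of share.coordinator.state.topic.min.isr is copied into the topic-level min.insync.replicas property. The class and method below are hypothetical stand-ins and do not reproduce the real ShareCoordinatorService signature.

```java
import java.util.Properties;

// Hypothetical stand-in; only the two config names come from the commit message.
final class ShareStateTopicConfigSketch {

    // Builds topic-level configs from the supplied
    // share.coordinator.state.topic.min.isr value.
    static Properties topicConfigs(short stateTopicMinIsr) {
        Properties props = new Properties();
        props.put("min.insync.replicas", String.valueOf(stateTopicMinIsr));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(topicConfigs((short) 2).getProperty("min.insync.replicas")); // prints 2
    }
}
```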

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 04:22:48 +08:00
Jhen-Yung Hsu d0e516a872
KAFKA-18803 The acls would appear at the wrong level of the metadata shell "tree" (#18916)
Reviewers: David Arthur <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 03:53:18 +08:00
David Arthur e330f0bf25
MINOR Always keep thread dumps after build timeouts
Reviewers: Matthias J. Sax <matthias@confluent.io>
2025-02-14 20:36:01 -05:00
Matthias J. Sax 36fd33a9d9
HOTFIX: fix broken :streams:javadocs target
2025-02-14 15:18:11 -08:00
Matthias J. Sax bcc58b4cfe
MINOR: cleanup top level class JavaDocs for main interfaces of Kafka Streams DSL (2/N) (#18882)
Reviewers: Bill Bejeck <bill@confluent.io>
2025-02-14 13:47:23 -08:00
Matthias J. Sax 835d8f3097
MINOR: cleanup top level class JavaDocs for main interfaces of Kafka Streams DSL (1/N) (#18881)
Reviewers: Bill Bejeck <bill@confluent.io>
2025-02-14 13:46:27 -08:00
Ming-Yen Chung e828767062
KAFKA-18790 Fix testCustomQuotaCallback (#18906)
Frequently updating the trust store can cause unexpected termination of the AsyncConsumer background thread.

1. To resolve this issue, reuse the same AdminClient instead of recreating it.
2. Add error logging when resource initialization fails for the consumer network thread.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-15 03:07:59 +08:00
Andrew Schofield 79e853d68e
KAFKA-18761: Complete listing of share group offsets [1/N] (#18894)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-14 18:55:20 +00:00
Jimmy Wang 6a6b80215d
KAFKA-16717 [1/2]: Add AdminClient.alterShareGroupOffsets (#18819)
KAFKA-16720 aims to add the support for the AlterShareGroupOffsets AdminClient. Key Changes in the PR:

1. Added handling of alterShareGroupOffsets() in KafkaAdminClient and introduced the AlterShareGroupOffsetRequest/AlterShareGroupOffsetResponse/AlterShareGroupOffsetsOptions classes.
2. Corresponding test in KafkaAdminClientTest.
3. Added the ALTER_SHARE_GROUP_OFFSETS API (it and the share coordinator pieces will be finished in the next PR).

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-15 02:35:46 +08:00
Justine Olshan 48283ad2e5
MINOR: Add release notes for Transactions Server Side Defense (KIP-890) (#18896)
Add some notes about upgrading and performance

Reviewers: David Jacot <djacot@confluent.io>
2025-02-14 08:41:08 -08:00
Calvin Liu 53c2b1604d
MINOR: TransactionManager logs the epoch bump less frequently. (#18895)
Reviewers: Justine Olshan <jolshan@confluent.io>
2025-02-14 08:37:23 -08:00
David Jacot aec0e555be
MINOR: Mark IBP_4_0_IV3 as production ready! (#18902)
This patch marks IBP_4_0_IV3 as production ready for the Apache Kafka 4.0 release. It also introduces IBP_4_1_IV0 as the next development version.

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-02-14 08:17:11 -08:00
David Jacot 1cbd0a2bd7
MINOR: Add KIP-848's metric to the doc (#18890)
This patch updates the documentation to include all the new metrics introduced by KIP-848.

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-02-14 07:36:36 -08:00
Jimmy Wang ea5d0864d5
KAFKA-18772 Define share group config defaults for Docker (#18899)
Co-authored-by: jimmy <wangzhiwang@qq.com>
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-14 12:26:18 +00:00
Apoorv Mittal 53543bcf63
KAFKA-18733: Updating share group metrics (1/N) (#18826)
Reviewers: Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-02-14 08:48:41 +00:00
Calvin Liu e7a2af8414
KAFKA-18634: Fix ELR metadata version issues (#18680)
This patch cleans up the places that should not use MV to determine whether ELR is enabled, and marks 4.0IV1 stable.

Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2025-02-13 23:40:31 -08:00
陳昱霖(Yu-Lin Chen) 2bbd25841e
KAFKA-18298 Fix flaky testConsumerGroupsDeprecatedConsumerGroupState and testConsumerGroups in PlaintextAdminIntegrationTest (#18513)
It's related to KAFKA-18298 and KAFKA-18297. The root cause of the flaky tests is members rejoining after member removal. To prevent members from rejoining after being removed, this fix calls `consumers.close` in ConsumerThread before removing group members. It also extracts the flaky member removal test into a new test, `testConsumerGroupWithMemberRemoval`.

Flow of member removal test:
1. Set 2 static consumers + 1 dynamic consumer.
2. Close all consumers.
3. Remove one static member.
4. Remove remaining members.

Before KIP-1092, the member count differs between ClassicConsumer and AsyncConsumer. (AsyncConsumer removes the dynamic member after the consumer is closed.)

To get more details, please refer to the discussion under KAFKA-18297 and this PR:
- discussion : [Link](https://issues.apache.org/jira/browse/KAFKA-18297?focusedCommentId=17912537&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17912537)
- review: https://github.com/apache/kafka/pull/18513#pullrequestreview-2589110367

This PR fixes the following flaky errors:

1. **PlaintextAdminIntegrationTest#testConsumerGroups**
  a.  `org.opentest4j.AssertionFailedError: expected: <2> but was: <3>` ([Report](https://ge.apache.org/s/lt3lpviv45cns/tests/task/:core:test/details/kafka.api.PlaintextAdminIntegrationTest/testConsumerGroups(String%2C%20String)%5B1%5D?top-execution=1))
  b.  `org.opentest4j.AssertionFailedError: expected: <true> but was: <false>` ([Report](https://ge.apache.org/s/jlxo446xalpoa/tests/task/:core:test/details/kafka.api.PlaintextAdminIntegrationTest/testConsumerGroups(String%2C%20String)%5B1%5D?top-execution=1))

2. **PlaintextAdminIntegrationTest#testConsumerGroupsDeprecatedConsumerGroupState**
  a.  `org.opentest4j.AssertionFailedError: expected: <2> but was: <3>` ([Report](https://ge.apache.org/s/ndoj6s2stb446/tests/task/:core:test/details/kafka.api.PlaintextAdminIntegrationTest/testConsumerGroupsDeprecatedConsumerGroupState(String%2C%20String)%5B1%5D?top-execution=1))
  b. `org.opentest4j.AssertionFailedError: expected: <true> but was: <false>` ([Report](https://ge.apache.org/s/kh3jze2tc5qeu/tests/task/:core:test/details/kafka.api.PlaintextAdminIntegrationTest/testConsumerGroupsDeprecatedConsumerGroupState(String%2C%20String)%5B1%5D?top-execution=1))

Reviewers: David Jacot <djacot@confluent.io>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-14 07:28:45 +08:00
Apoorv Mittal e6b835f0b4
MINOR: Marking testVerifyFetchAndCloseImplicit flaky (#18893)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-14 04:57:06 +08:00
Bill Bejeck 3aae6f5402
MINOR: Adjust javadoc to reflect the correct status of standby task TopicPartition (#18892)
KIP-744 introduced the StreamsMetadata class as part of the implementation. In the KIP, the javadoc for the standbyTopicPartitions states that the method returns the set of source TopicPartition that it represents as a standby. The current javadoc states that it represents the changelog TopicPartition(s). While the partitions of the source and changelog topics will match, the javadoc needs to be updated to reflect the correct behavior.

Note that the deprecated o.a.k.streams.state.StreamsMetadata#standbyTopicPartitions method also describes the set of TopicPartition being source TopicPartition.

Reviewers: Matthias Sax <mjsax@apache.org>
2025-02-13 14:06:01 -05:00
Kirk True 057460e807
KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#18795)
Reviewers: Jun Rao <jun@confluent.io>, Lianet Magrans <lmagrans@confluent.io>, Jeff Kim <jeff.kim@confluent.io>
2025-02-13 13:53:56 -05:00
Andrew Schofield 952113e8e0
KAFKA-16720: Support multiple groups in DescribeShareGroupOffsets RPC (#18834)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2025-02-13 18:27:05 +00:00
Calvin Liu 9cb271f1e1
KAFKA-18654[2/2]: Transaction V2 retry add partitions on the server side when handling produce request. (#18810)
During the transaction commit phase, it is normal to hit a CONCURRENT_TRANSACTION error before the transaction markers are fully propagated. Instead of letting the client retry the produce request, it is better to retry on the server side.

Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan <jolshan@confluent.io>
2025-02-13 09:30:58 -08:00
Matthias J. Sax 9fbf14d544
MINOR: fix warn log message in Kafka Streams (#18878)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bill Bejeck <bill@confluent.io>
2025-02-13 09:30:07 -08:00
Lianet Magrans 6eb6a5e578
KAFKA-18776: Fix flaky coordinator disconnect test & fix log level (#18866)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-13 12:11:45 -05:00
Lianet Magrans c465cf6b4b
KAFKA-17298: Update upgrade notes for 4.0 KIP-848 (#18756)
Reviewers: David Jacot <djacot@confluent.io>
2025-02-13 11:51:56 -05:00