14855 Commits

Author SHA1 Message Date
David Jacot 6f1e1884e3 MINOR: Tweak default group coordinator config & upgrade notes (#18948)
This patch changes the default value of `group.coordinator.threads` to `4` and sets its priority to `HIGH`. This change makes it consistent with how we handle `num.network.threads` and `num.io.threads`. The patch also tweaks the upgrade notes.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2025-02-18 20:06:24 +01:00
Parker Chang b91f64ceb9 MINOR: Fix the missing and updated licenses (#18950)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-19 01:38:49 +08:00
PoAn Yang 546d9ce39b KAFKA-18773 Migrate the log4j1 config to log4j 2 for native image and README (#18872)
- update reflection-config.json and resource-config.json to include log4j2 and jackson
- remove unused jackson scala library
- fix the incorrect path of log4j2.yaml
- adopt workaround (--standalone) to make this PR work (it will be fixed by KAFKA-18737)

Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-19 00:49:23 +08:00
David Jacot 421b9a6f0c MINOR: Update LICENSE-binary (#18943)
Before the patch:
```
% python3 ./committer-tools/verify_license.py

...

All libs from ./libs are present in the LICENSE file.

The following entries are in the LICENSE file but not present in ./libs. These should be removed from the LICENSE-binary file:
 - audience-annotations-0.12.0
 - jackson-jaxrs-base-2.16.2
 - jackson-jaxrs-json-provider-2.16.2
 - jackson-module-jaxb-annotations-2.16.2
 - jakarta.inject-2.6.1
 - javax.servlet-api-3.1.0
 - jetty-continuation-9.4.56.v20240826
 - jetty-servlet-9.4.56.v20240826
 - jetty-servlets-9.4.56.v20240826
 - jetty-util-ajax-9.4.56.v20240826
 - jsr305-3.0.2
 - log4j-core-test-2.24.1
```

After the patch:
```
% python3 ./committer-tools/verify_license.py

...

All libs from ./libs are present in the LICENSE file.

No extra dependencies in the LICENSE file.
```

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2025-02-18 13:24:31 +01:00
David Jacot e050a1f2d5 MINOR: Add verify_license tool (#18931)
This patch adds the verify_license.py tool. It compares the libraries shipped within the tarball to the LICENSE file, and vice versa, to ensure that they are aligned. It also slightly updates the format of the LICENSE file to make it easier to parse.
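
The two-way comparison described above can be pictured as a pair of set differences. This is a minimal sketch, not the actual verify_license.py code: the function name `compare_libs_to_license` and the `example-lib-1.0.0` entry are made up for illustration.

```python
def compare_libs_to_license(shipped_libs, license_entries):
    """Return (missing_from_license, extra_in_license) as sorted lists.

    Hypothetical helper: both arguments are iterables of "name-version" strings.
    """
    shipped = set(shipped_libs)
    licensed = set(license_entries)
    # Shipped in ./libs but not mentioned in the LICENSE file.
    missing_from_license = sorted(shipped - licensed)
    # Mentioned in the LICENSE file but not shipped in ./libs.
    extra_in_license = sorted(licensed - shipped)
    return missing_from_license, extra_in_license


# Illustrative input only; "example-lib-1.0.0" is not a real dependency.
missing, extra = compare_libs_to_license(
    ["example-lib-1.0.0"],
    ["jsr305-3.0.2"],
)
print("missing from LICENSE:", missing)
print("extra in LICENSE:", extra)
```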

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
2025-02-18 12:08:15 +01:00
Luke Chen 10c849f55d MINOR: add docs for "org.apache.kafka.sasl.oauthbearer.allowed.urls" (#18938)
add docs for "org.apache.kafka.sasl.oauthbearer.allowed.urls"

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2025-02-18 16:48:51 +08:00
Sean Quah e0580b1ea4 KAFKA-18807; Fix thread idle ratio metric (#18934)
When group.coordinator.threads is greater than 1, we lose track of thread idle time because of integer arithmetic. Use doubles instead.
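
To illustrate the kind of truncation involved, here is a minimal sketch, assuming the idle time of each thread is divided by the thread count before being summed. The function names and the exact formula are hypothetical and are not taken from the Kafka code.

```python
def idle_ratio_integer(idle_ms_per_thread, window_ms):
    """Integer division: each thread's share is truncated toward zero."""
    n = len(idle_ms_per_thread)
    return sum(idle_ms // n for idle_ms in idle_ms_per_thread) / window_ms


def idle_ratio_double(idle_ms_per_thread, window_ms):
    """Floating-point division: fractional shares are preserved."""
    n = len(idle_ms_per_thread)
    return sum(idle_ms / n for idle_ms in idle_ms_per_thread) / window_ms


# Four threads, each idle for 3 ms within a 4 ms window.
samples = [3, 3, 3, 3]
print(idle_ratio_integer(samples, 4))
print(idle_ratio_double(samples, 4))
```

With a single thread the two versions agree, which is consistent with the problem only showing up when more than one thread is configured.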

Reviewers: David Jacot <djacot@confluent.io>
2025-02-18 08:12:00 +01:00
David Jacot 8aea5d55c7 MINOR: Remove dropwizard metrics from dependencies.gradle (#18932)
This patch removes dropwizard metrics in the dependency list as it is not used any more. It was introduced in 4f5b4c868e because it was required by Zookeeper. Zookeeper is no longer there so we can remove it too.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2025-02-18 08:10:33 +01:00
Matthias J. Sax 7a749b589f HOTFIX: StoreChangelogReader should require stable consumer group (#18901)
Fixing regression bug, introduced by beac86f049

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bruno Cadonna <bruno@confluent.io>
2025-02-17 12:56:50 -08:00
PoAn Yang c4ea05f684 KAFKA-18784 Fix ConsumerWithLegacyMessageFormatIntegrationTest (#18889)
In PR #18267, we removed the old message format from the cases in ConsumerWithLegacyMessageFormatIntegrationTest. Although the test cases pass, they no longer fulfill their original purpose. We can't send the old message format since 4.0, so this change makes the cases append old records through ReplicaManager directly.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 20:43:49 +08:00
Jhen-Yung Hsu 8797c903ee KAFKA-18803 The acls would appear at the wrong level of the metadata shell "tree" (#18916)
Reviewers: David Arthur <mumrah@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-17 03:56:53 +08:00
Ming-Yen Chung 6abb4775b9 KAFKA-18790 Fix testCustomQuotaCallback (#18906)
Frequently updating the trust store can cause unexpected termination of the AsyncConsumer background thread.

1. To resolve this issue, reuse the same AdminClient instead of recreating it.
2. Add error logging when failing to initialize resources for the consumer network thread.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-15 03:08:30 +08:00
Justine Olshan e6609d3acd MINOR: Add release notes for Transactions Server Side Defense (KIP-890) (#18896)
Add some notes about upgrading and performance

Reviewers: David Jacot <djacot@confluent.io>
2025-02-14 08:42:02 -08:00
Calvin Liu eddae2ee19 MINOR: TransactionManager logs the epoch bump less frequently. (#18895)
Reviewers: Justine Olshan <jolshan@confluent.io>
2025-02-14 08:38:14 -08:00
David Jacot e37af20760 MINOR: Mark IBP_4_0_IV3 as production ready! (#18902)
This patch marks IBP_4_0_IV3 as production ready for the Apache Kafka 4.0 release. It also introduces IBP_4_1_IV0 as the next development version.

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-02-14 17:17:30 +01:00
David Jacot ffa281029a MINOR: Add KIP-848's metric to the doc (#18890)
This patch updates the documentation to include all the new metrics introduced by KIP-848.

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-02-14 16:37:13 +01:00
Calvin Liu ba067caa54 KAFKA-18634: Fix ELR metadata version issues (#18680)
This patch cleans up the places that should not use MV to determine whether ELR is enabled, and marks 4.0IV1 stable.

Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2025-02-14 08:41:05 +01:00
Bill Bejeck 91958fce6a MINOR: Adjust javadoc to reflect the correct status of standby task TopicPartition (#18892)
KIP-744 introduced the StreamsMetadata class as part of the implementation. In the KIP, the javadoc for the standbyTopicPartitions states that the method returns the set of source TopicPartition that it represents as a standby. The current javadoc states that it represents the changelog TopicPartition(s). While the partitions of the source and changelog topics will match, the javadoc needs to be updated to reflect the correct behavior.

Note that the deprecated o.a.k.streams.state.StreamsMetadata#standbyTopicPartitions method also describes the set of TopicPartition being source TopicPartition.

Reviewers: Matthias Sax <mjsax@apache.org>
2025-02-13 14:11:02 -05:00
Kirk True b60d05ca1b KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#18795)
Reviewers: Jun Rao <jun@confluent.io>, Lianet Magrans <lmagrans@confluent.io>, Jeff Kim <jeff.kim@confluent.io>
2025-02-13 13:54:55 -05:00
Calvin Liu 720f4f446d KAFKA-18654[2/2]: Transaction V2 retry add partitions on the server side when handling produce request. (#18810)
During the transaction commit phase, it is normal to hit a CONCURRENT_TRANSACTION error before the transaction markers are fully propagated. Instead of letting the client retry the produce request, it is better to retry on the server side.
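
The idea of retrying on the server rather than bouncing the error back to the client can be sketched as a bounded retry loop. This is only an illustration: `add_partitions_with_retry`, the `verify` callback, and the `"NONE"` success value are hypothetical and do not reflect the broker's actual implementation.

```python
def add_partitions_with_retry(verify, max_attempts=3):
    """Call verify() until it stops returning CONCURRENT_TRANSACTION.

    Returns (last_result, attempts_used). A real server would also bound
    the total time spent and back off between attempts.
    """
    result = None
    for attempt in range(1, max_attempts + 1):
        result = verify()
        if result != "CONCURRENT_TRANSACTION":
            return result, attempt
    # Retries exhausted: surface the last error to the caller.
    return result, max_attempts


# Simulated responses: the markers finish propagating on the third attempt.
responses = iter(["CONCURRENT_TRANSACTION", "CONCURRENT_TRANSACTION", "NONE"])
result, attempts = add_partitions_with_retry(lambda: next(responses), max_attempts=5)
print(result, attempts)
```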

Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan <jolshan@confluent.io>
2025-02-13 09:33:20 -08:00
Lianet Magrans f5f58e88f7 KAFKA-17298: Update upgrade notes for 4.0 KIP-848 (#18756)
Reviewers: David Jacot <djacot@confluent.io>
2025-02-13 11:52:53 -05:00
Swikar Patel 44a0dce3de KAFKA-15443: Upgrade RocksDB to 9.7.3 (#18275)
This PR upgrades RocksDB from 7.9.2 to 9.7.3 and addresses the following compatibility issues introduced by the RocksDB upgrade:

- Removal of AccessHint: The AccessHint class was completely removed in RocksDB 9.7.3. This required removing all import statements, variable declarations, method parameters, method return types, and static method calls related to AccessHint in RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapter.java and RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest.java. Unused methods are removed in RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapter.java.
- Removal of NO_FILE_CLOSES: The NO_FILE_CLOSES metric was also removed in RocksDB 9.7.3. The calculation for numberOfOpenFiles in RocksDBMetricsRecorder.java has been adjusted to now track the total number of file opens since the last reset. The previous calculation, which subtracted NO_FILE_CLOSES from NO_FILE_OPENS, is no longer possible. The reason RocksDB removed NO_FILE_CLOSES seems to be that it did not properly work: https://github.com/search?q=repo%3Afacebook%2Frocksdb+NO_FILE_CLOSES&type=issues
- Removal of methods related to compressed block cache configuration in BlockBasedTableConfig
- Change of the signature of org.rocksdb.Options.setLogger()

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Matthias J. Sax <matthias@confluent.io>, Bruno Cadonna <cadonna@apache.org>
2025-02-12 10:26:25 +01:00
PoAn Yang da641dcc62 KAFKA-18771: fix Flaky test KRaftClusterTest.testDescribeQuorumRequestToControllers (#18859)
The case testDescribeQuorumRequestToControllers shuts down the raft client but not the controller. This gives the client a chance to send a request to the controller and get a NOT_LEADER_OR_FOLLOWER error. However, if the raft client finishes shutting down before handling the request, the request will not be handled. Shut down the controller before doing KafkaFuture#get for the client request, so we can make sure the request is eventually handled by another controller.

Signed-off-by: PoAn Yang <payang@apache.org>

Reviewers: Luke Chen <showuon@gmail.com>
2025-02-12 16:18:18 +08:00
David Jacot c1953c4e60 MINOR: Fix reassign partitions system test (#18860)
The tests which set reassign_from_offset_zero=False have a setup phase which produces records with old timestamps to the topic and waits until they are cleaned by the retention in order to run the main phase of the test based on non-zero offsets. The setup phases did not wait enough for the cleaning task to kick in, mainly because the scheduled task was not started yet due to log.initial.task.delay.ms being set to 30s by default. Reducing it to 5s helps to stabilize the test. The patch also changes the sleep to 12s in order to have a bit more head room.

```
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id:       2025-02-11--016
run time:         26 minutes 9.451 seconds
tests run:        12
passed:           12
flaky:            0
failed:           0
ignored:          0
================================================================================
```

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-11 15:46:40 +01:00
David Jacot 57f794432e MINOR: Fix log compaction system test (#18857)
`log.segment.bytes` must be greater than or equal to 1MB (KIP-1030).

```
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id:       2025-02-10--003
run time:         55.903 seconds
tests run:        1
passed:           1
flaky:            0
failed:           0
ignored:          0
================================================================================
```

Reviewers: Divij Vaidya <diviv@amazon.com>
2025-02-11 14:51:09 +01:00
Edoardo Comar 7fb3879799 KAFKA-18758: NullPointerException in shutdown following InvalidConfigurationException (#18833)
* KAFKA-18758:  NullPointerException in shutdown following InvalidConfigurationException

Add checks for null in shutdown as BrokerLifecycleManager is not instantiated if the LogManager constructor throws an Exception
2025-02-11 10:18:10 +00:00
TengYao Chi 438fbf5f8e KAFKA-18396: Migrate log4j1 configuration to log4j2 in KafkaDockerWrapper (#18394)
After log4j migration, we need to update the logging configuration in KafkaDockerWrapper from log4j1 to log4j2.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2025-02-11 13:26:47 +05:30
Ken Huang 8e8423fed0 KAFKA-18366 Remove KafkaConfig.interBrokerProtocolVersion (#18820)
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-11 06:24:44 +08:00
Jhen-Yung Hsu 2452d67f2e KAFKA-18743 Remove leader.imbalance.per.broker.percentage as it is not supported by Kraft (#18821)
Remove `leader.imbalance.per.broker.percentage` from config.
Add `leader.imbalance.per.broker.percentage` to release note

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-11 04:02:24 +08:00
Ken Huang ad6db0952b KAFKA-18225 ClientQuotaCallback#updateClusterMetadata is unsupported by kraft (#18196)
This commit ensures that the ClientQuotaCallback#updateClusterMetadata method is executed in KRaft mode. This method is triggered whenever a topic or cluster metadata change occurs. However, in KRaft mode, the current implementation of the updateClusterMetadata API is inefficient due to the requirement of creating a full Cluster object. To address this, a follow-up issue (KAFKA-18239) has been created to explore more efficient mechanisms for providing cluster information to the ClientQuotaCallback without incurring the overhead of a full Cluster object creation.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-11 01:07:28 +08:00
David Jacot c3ca56e66b MINOR: Accept specifying consumer group assignors by their short names (#18832)
At the moment, we require specifying builtin server side assignors by their full class name. This is not convenient and also exposes their full class name as part of our public API. This patch changes it to accept specifying builtin server side assignors by their short name (uniform or range) while continuing to accept the full class name for custom assignors.
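
A minimal sketch of this kind of lookup, assuming a simple map from short name to class name with a fall-through for anything else. The fully qualified class names below are assumptions used for illustration, and `resolve_assignor` is a hypothetical helper rather than the broker's code.

```python
# Assumed class names for the builtin assignors; shown for illustration only.
BUILTIN_ASSIGNORS = {
    "uniform": "org.apache.kafka.coordinator.group.assignor.UniformAssignor",
    "range": "org.apache.kafka.coordinator.group.assignor.RangeAssignor",
}


def resolve_assignor(name):
    """Map a builtin short name to its class name; pass other names through."""
    return BUILTIN_ASSIGNORS.get(name, name)


print(resolve_assignor("uniform"))
# A custom assignor is still given by its full class name (made-up example).
print(resolve_assignor("com.example.MyAssignor"))
```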

Reviewers: Jeff Kim <jeff.kim@confluent.io>
2025-02-10 11:59:30 +01:00
Matthias J. Sax 562ecf4a83 MINOR: make leaking public member in StreamsConfig non-public
KIP-1112 specified PROCESSOR_WRAPPER_CLASS_DOC as `private` and it should not be public.
We need to make it package-private though, to allow TopologyConfig to use it.
2025-02-07 21:23:10 -08:00
Lucas Brutschy 54eadda130 MINOR: Fix streams smoke test flush records (#18830)
In the streams smoke test, flush records that are appended to the input topics, to advance the stream time so that all suppressed windows are flushed at the end of the test. The records are created with record time equal to current time + 2 days. caf0b67 changed the broker defaults so that records more than one hour in the future are rejected by the broker. This breaks the flush messages. By moving all record time stamps 2 days into the past, the existing logic should work correctly with the new default broker configuration.

A similar thing happens in the relational smoke test, where data is emitted 4 days into the future. To avoid running into retention / compaction, the window retention time is increased for both tests.
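
The effect of the new broker default on the flush records can be sketched with a simple timestamp check. This is an illustration only: `accepted_by_broker` is a hypothetical function, it models just the one-hour future bound mentioned above, and it ignores any bound on how far in the past a record may be.

```python
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS


def accepted_by_broker(record_ts_ms, now_ms, max_future_ms=HOUR_MS):
    """Accept a record unless its timestamp is too far ahead of broker time."""
    return record_ts_ms - now_ms <= max_future_ms


now_ms = 1_700_000_000_000  # arbitrary fixed "current time" for the example

# Old behaviour of the test: flush records stamped 2 days in the future.
print(accepted_by_broker(now_ms + 2 * DAY_MS, now_ms))
# New behaviour: all record timestamps shifted 2 days into the past.
print(accepted_by_broker(now_ms - 2 * DAY_MS, now_ms))
```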

Reviewers: Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bill@confluent.io>
2025-02-07 20:48:28 +01:00
PoAn Yang b3837f831e KAFKA-17833: Convert DescribeAuthorizedOperationsTest to use KRaft (#18252)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2025-02-07 15:45:05 +01:00
Matthias J. Sax dcc27ec9c7 MINOR: update Kafka Streams `Topology` JavaDocs (#18814)
JavaDocs changes extracted from
https://github.com/apache/kafka/pull/18778 for 4.0 backport.

Reviewers: Bill Bejeck <bill@confluent.io>
2025-02-06 11:01:19 -08:00
PoAn Yang 5bacdbeb1c MINOR: add missing <li> to upgrade.html (#18817)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-07 00:11:24 +08:00
Nick Guo 0c487c2f65 KAFKA-18741 document the removal of `inter.broker.protocol.version` (#18818)
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-07 00:07:51 +08:00
Colin Patrick McCabe b6e6a3c68a KAFKA-18360 Remove zookeeper configurations (#18566)
Remove broker.id.generation.enable and reserved.broker.max.id, which are not used in KRaft mode.
Remove inter.broker.protocol.version, which is not used in KRaft mode.

Reviewers: PoAn Yang <payang@apache.org>, Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 22:22:36 +08:00
Ken Huang cf8d3ac49e KAFKA-18530 Remove ZooKeeperInternals (#18641)
Since zk has been removed in 4.0, config handlers no longer need to handle the "<default>" value. This PR streamlines the config update process by eliminating the unnecessary string checks for "<default>"

Reviewers: Christo Lolov <lolovc@amazon.com>, Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 18:56:52 +08:00
TengYao Chi f3d2607cf4 MINOR: Remove unused QuotaConfgHandler (#18617)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 18:54:06 +08:00
mingdaoy 8b0ef93bb4 KAFKA-18499 Clean up zookeeper from LogConfig (#18583)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 18:49:52 +08:00
Ming-Yen Chung 10105f9cf3 MINOR: Fix wrong config property in KafkaConfigTest (#18815)
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 17:10:17 +08:00
Apoorv Mittal a9b154968f MINOR: Removing share module from settings (#18806)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 02:04:59 +08:00
Kuan-Po Tseng 4c9d335bcb KAFKA-18206: EmbeddedKafkaCluster must set features (#18189)
Related to KAFKA-18206: set features in EmbeddedKafkaCluster in both the streams and connect modules. Note that this PR also fixes a potential transaction with empty records in the sendPrivileged method, as transaction version 2 doesn't allow this kind of scenario.

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-02-05 09:17:48 -08:00
TengYao Chi 6684319185 KAFKA-18645: New consumer should align close timeout handling with classic consumer (#18702)
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-05 09:11:09 -05:00
Ming-Yen Chung 5d3d7250f6 KAFKA-18675 Add tests for valid and invalid broker addresses (#18781)
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-05 17:02:20 +08:00
kevin-wu24 98d238aef0 KAFKA-16524; Metrics for KIP-853 (#18304)
This change implements some of the metrics enumerated in KIP-853.

The KafkaRaftMetrics object now exposes number-of-voters, number-of-observers and uncommitted-voter-change. The number-of-observers and uncommitted-voter-change metrics are only present on the active controller or leader, since it does not make sense for other replicas to report these metrics.

In order to make these two metrics thread-safe, KafkaRaftMetrics needs to be passed into LeaderState, and therefore QuorumState. This introduces a circularity since the KafkaRaftMetrics constructor takes in QuorumState. To break the circularity for now, the logic using QuorumState will be moved to the KafkaRaftMetrics#initialize method.
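
Breaking a constructor cycle by deferring one side to an `initialize` step can be sketched as follows. The class names mirror the ones mentioned above, but the bodies are hypothetical and heavily simplified; they are not the Kafka implementation.

```python
class KafkaRaftMetrics:
    """Simplified stand-in: constructed first, wired to QuorumState later."""

    def __init__(self):
        self._quorum_state = None

    def initialize(self, quorum_state):
        # The logic that needs QuorumState runs here instead of in __init__.
        self._quorum_state = quorum_state

    def current_state(self):
        if self._quorum_state is None:
            raise RuntimeError("initialize() has not been called")
        return self._quorum_state.name


class QuorumState:
    """Simplified stand-in: needs the metrics object at construction time."""

    def __init__(self, metrics):
        self.metrics = metrics
        self.name = "unattached"


# Construction order: metrics, then quorum state, then the deferred wiring.
metrics = KafkaRaftMetrics()
quorum = QuorumState(metrics)
metrics.initialize(quorum)
print(metrics.current_state())
```

The trade-off of this two-phase pattern is that the metrics object is briefly usable in a half-initialized state, which is why the sketch guards against calls made before `initialize`.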

The BrokerServerMetrics object now exposes ignored-static-voters. The ControllerServerMetrics object now exposes IgnoredStaticVoters. To implement both metrics for "ignored static voters", this PR introduces the ExternalKRaftMetrics interface, which allows for higher layer metrics objects to be accessible within the raft module.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2025-02-04 14:19:11 -08:00
Calvin Liu 226532a966 KAFKA-18635: reenable the unclean shutdown detection (#18277)
We need to re-enable the unclean shutdown detection when in ELR mode, which was inadvertently removed during the development process.

Reviewers: David Mao <dmao@confluent.io>, Jun Rao <junrao@gmail.com>
2025-02-04 14:07:27 -08:00
Calvin Liu eb352ff00b KAFKA-18649: complete ClearElrRecord handling (#18708)
Implement ClearElrRecord handling in the TopicDelta. Also, the ReplicationControlManager should not merge updates if ELR/LastKnownElr are empty, because that will cause an unnecessary partition epoch bump.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-02-04 13:56:01 -08:00
Calvin Liu ffc57421f5 KAFKA-16540: Clear ELRs when min.insync.replicas is changed. (#18148)
In order to maintain the integrity of replication, we need to clear the ELRs of affected partitions when min.insync.replicas is changed. This could happen at the topic level, or at a global level if the cluster level default is changed.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-02-04 13:51:32 -08:00