Commit Graph

263 Commits

Author SHA1 Message Date
ClarkChen 269e2d898b
KAFKA-18849 Add "strict min ISR" to the docs of "min.insync.replicas" (#19016)
KIP-966 adds strict min ISR rule, so this PR improves the docs of min.insync.replicas to include that change.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-27 16:05:24 +08:00
Sushant Mahajan 6e76736890
KAFKA-18827: Initialize share group state persister impl [2/N]. (#18992)
* In this PR, we have provided implementation for the initialize share
group state RPC from the persister perspective.
* Tests have been added wherever applicable.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-23 08:03:13 +00:00
Calvin Liu 8f13e7c207
MINOR: Move the ELR default version to 4.1 (#18954)
- ELR is enabled (ELRV_1) by default if the cluster is created with its bootstrap metadata version >= IBP_4_1_IV0.
- ELRV_1 can be manually enabled iff the metadata version is >= IBP_4_0_IV1.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Colin P. McCabe <cmccabe@apache.org>, David Jacot <djacot@confluent.io>
2025-02-21 16:13:11 +01:00
Sushant Mahajan c2cb543a1e
KAFKA-18629: Delete share group state RPC group coordinator impl. [3/N] (#18848)
* In this PR, we have added GC side impl to call the delete state share
coord RPC using the persister.
* We will be using the existing `GroupCoordinatorService.deleteGroups`.
The logic will be modified as follows:
* After sanitization, we will call a new
`runtime.scheduleWriteOperation` (not read for consistency) with
callback `GroupCoordinatorShard.sharePartitions`. This will return a Map
of share partitions of the groups which are of SHARE type. We need to
pass all groups as WE CANNOT DETERMINE the type of the group in the
service class.
* Then using the map we will create requests which could be passed to
the persister and make the appropriate calls.
* Once this future completes, we will continue with the existing flow of
group deletion.
* If the group under inspection is not share group - the read callback
should return an empty map.
* Tests have been added wherever applicable.

Reviewers: David Jacot <djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-02-21 12:13:16 +00:00
Matthias J. Sax 538a60e1b3
MINOR: disallow rawtypes and fail build (#18877)
Cleanup code to avoid rawtype, and add suppressions where necessary.
Change the build to fail on rawtype warning.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-02-19 13:11:49 -08:00
Ismael Juma 3dba3125e9
KAFKA-18601: Assume a baseline of 3.3 for server protocol versions (#18845)
3.3.0 was the first KRaft release that was deemed production-ready and also
when KIP-778 (KRaft to KRaft upgrades) landed. Given that, it's reasonable
for 4.x to only support upgrades from 3.3.0 or newer (the metadata version also
needs to be set to "3.3" or newer before upgrading).

Noteworthy changes:
1. `AlterPartition` no longer includes topic names, which makes it possible to
simplify `AlterParitionManager` logic.
2. Metadata versions older than `IBP_3_3_IV3` have been removed and
`IBP_3_3_IV3` is now the minimum version.
3. `MINIMUM_BOOTSTRAP_VERSION` has been removed.
4. Removed `isLeaderRecoverySupported`, `isNoOpsRecordSupported`,
`isKRaftSupported`, `isBrokerRegistrationChangeRecordSupported` and
`isInControlledShutdownStateSupported` - these are always `true` now.
Also removed related conditional code.
5. Removed default metadata version or metadata version fallbacks in
multiple places - we now fail-fast instead of potentially using an incorrect
metadata version.
6. Update `MetadataBatchLoader.resetToImage` to set `hasSeenRecord`
based on whether image is empty - this was a previously existing issue that
became more apparent after the changes in this PR.
7. Remove `ibp` parameter from `BootstrapDirectory`
8. A number of tests were not useful anymore and have been removed.

I will update the upgrade notes via a separate PR as there are a few things that
need changing and it would be easier to do so that way.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>, David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Justine Olshan <jolshan@confluen.io>, Ken Huang <s7133700@gmail.com>
2025-02-19 05:35:42 -08:00
xijiu 4c4458c17a
KAFKA-18799 Remove AdminUtils (#18946)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-19 06:25:43 +08:00
Parker Chang ed366e6b89
MINOR: Align assertFutureThrows method signature with JUnit conventions (#18825)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-02-18 15:56:42 +00:00
Andrew Schofield 6c14f64245
MINOR: Rename NoOpShareStatePersister for consistency (#18933)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-18 14:07:59 +00:00
David Jacot aec0e555be
MINOR: Mark IBP_4_0_IV3 as production ready! (#18902)
This patch marks IBP_4_0_IV3 as production ready for the Apache Kafka 4.0 release. It also introduced IBP_4_1_IV0 as the next development version.

Reviewers: Justine Olshan <jolshan@confluent.io>
2025-02-14 08:17:11 -08:00
Calvin Liu e7a2af8414
KAFKA-18634: Fix ELR metadata version issues (#18680)
This patch cleans up the places that should not use MV to determine ELR is enabled marks 4.0IV1 stable.

Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2025-02-13 23:40:31 -08:00
Sushant Mahajan 9e5503b04f
KAFKA-18745: Handle network related errors in persister. (#18824)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-07 15:59:06 +00:00
Ken Huang a3d9d881e1
KAFKA-18530 Remove ZooKeeperInternals (#18641)
Since zk has been removed in 4.0, config handlers no longer need to handle the "<default>" value. This PR streamlines the config update process by eliminating the unnecessary string checks for "<default>"

Reviewers: Christo Lolov <lolovc@amazon.com>, Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
2025-02-06 17:48:17 +08:00
Sushant Mahajan 0bd1ff936f
KAFKA-18629: Add persister impl and tests for DeleteShareGroupState RPC. [2/N] (#18748)
Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-02-05 14:51:19 +00:00
Sanskar Jhajharia 7dbed2f6e8
[KAFKA-16720] AdminClient Support for ListShareGroupOffsets (2/2) (#18671)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>
2025-02-05 14:38:09 +00:00
TengYao Chi aac62a32d9
KAFKA-18698: Migrate suitable classes to records in server and server-common modules (#18783)
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Christo Lolov <lolovc@amazon.com>
2025-02-05 10:00:11 +00:00
Ming-Yen Chung 9f78771a1f
KAFKA-18693 Remove PasswordEncoder (#18790)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-02-04 13:18:41 +08:00
Calvin Liu 6235a73622
KAFKA-16540: Clear ELRs when min.insync.replicas is changed. (#18148)
In order to maintain the integrity of replication, we need to clear the ELRs of affected partitions when min.insync.replicas is changed. This could happen at the topic level, or at a global level if the cluster level default is changed.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2025-01-24 10:57:33 -08:00
TengYao Chi fa2df3bca7
KAFKA-18559 Cleanup FinalizedFeatures (#18593)
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
2025-01-24 19:39:01 +08:00
Ken Huang d0bca67b27
KAFKA-18363: Remove ZooKeeper mentiosn in broker configs (#18346)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, TengYao Chi <kitingiao@gmail.com>
2025-01-20 18:22:40 +01:00
TengYao Chi a842c02b88
KAFKA-18553: Update javadoc and comments of ConfigType (#18567)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <amittal@confluent.io>
2025-01-20 15:20:36 +01:00
Sanskar Jhajharia bcbc72e29b
[KAFKA-16720] AdminClient Support for ListShareGroupOffsets (1/n) (#18571)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
2025-01-20 07:47:14 +00:00
TengYao Chi 3490cf7bfc
MINOR: Cleanup MetadataVersion (#18623)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2025-01-19 23:48:37 +08:00
Ismael Juma b90ef861dd
Convert BrokerEndPoint to record (#18577)
Tests for this class is being removed via #18573 - making it
a record avoids the need for the tests being removed.

Reviewers: Andrew Schofield <aschofield@confluent.io>
2025-01-17 07:08:43 -08:00
Jason Taylor 23c459286b
KAFKA-16368: Update defaults for LOG_MESSAGE_TIMESTAMP_AFTER_MAX_MS_DEFAULT and NUM_RECOVERY_THREADS_PER_DATA_DIR_CONFIG (#18106)
Reviewers: Divij Vaidya <diviv@amazon.com>
2025-01-16 15:30:00 +01:00
Apoorv Mittal a1f7457389
KAFKA-18514: Refactor share module code to server and server-common (#18524)
As per the discussion with @ijuma and @mumrah, the `share` module seems not required and it's advised to user `server` and `server-common` instead. The PR moves the classes from `share` module to respective server related modules.

Following has been refactored in the PR:

- Moved Share Fetch, Acknowledge, Session, Context and Cache related classes to `server` module as the classes are used by `core` and `tools` modules.
- Moved `Persister` releated classes from `share` to `server-common` as the Persister classes though currently just being used by `core` module but in [near future](https://github.com/apache/kafka/pull/17775) will also be used by `group-coordinator`. Hence the Persister classes shouldn't go in `server`. The debate is mostly between `coordinator-common` vs `server-common`. We have kept the Persister in `server-common` for now, the classes are more related to the server than the coordinator. Persister is basically an abstraction in the server to let you choose how you want to persist the share group progress.
- Updated build.gradle to remove `share` module.
- Removed `import-control-share.xml`

Reviewers: Ismael Juma <ismael@juma.me.uk>
2025-01-16 00:14:25 -08:00
Xuan-Zhang Gong 1ee715473c
KAFKA-18446 Remove MetadataCacheControllerNodeProvider (#18437)
Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-01-12 02:12:35 +08:00
Ismael Juma d4aee71e36
KAFKA-18465: Remove MetadataVersions older than 3.0-IV1 (#18468)
Apache Kafka 4.0 will only support KRaft and 3.0-IV1 is the minimum version supported by KRaft. So, we can assume that Apache Kafka 4.0 will only communicate with brokers that are 3.0-IV1 or newer.

Note that KRaft was only marked as production-ready in 3.3, so we could go further and set the baseline to 3.3. I think we should have that discussion, but it made sense to start with the non controversial parts.

Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <david.jacot@gmail.com>
2025-01-11 09:42:39 -08:00
mingdaoy 23e77ed2d4
KAFKA-18374 Remove EncryptingPasswordEncoder, CipherParamsEncoder, GcmParamsEncoder, IvParamsEncoder, and the unused static variables in PasswordEncoder (#18355)
Reviewers: TengYao Chi <kitingiao@gmail.com>, TaiJuWu <tjwu1217@gmail.com>, Christo Lolov <lolovc@amazon.com>, PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2025-01-07 03:48:33 +08:00
Ismael Juma 409a43eff7
MINOR: Collection/Option usage simplification via methods introduced in Java 9 & 11 (#18305)
Relevant methods:
1. `List.of`, `Set.of`, `Map.of` and similar (introduced in Java 9)
2. Optional: `isEmpty` (introduced in Java 11), `stream` (introduced in Java 9).

Reviewers: Mickael Maison <mimaison@users.noreply.github.com>
2025-01-03 16:13:39 -08:00
Ismael Juma d6f24d3665
Use `instanceof` pattern to avoid explicit cast (#18373)
This feature was introduced in Java 16.

Reviewers: David Arthur <mumrah@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
2025-01-02 09:32:51 -08:00
TengYao Chi 3161115ada
KAFKA-18361 Remove PasswordEncoderConfigs (#18347)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-31 00:18:23 +08:00
PoAn Yang b4be178599
KAFKA-17393: Remove log.message.format.version/message.format.version (KIP-724) (#18267)
Based on [KIP-724](https://cwiki.apache.org/confluence/display/KAFKA/KIP-724%3A+Drop+support+for+message+formats+v0+and+v1), the `log.message.format.version` and `message.format.version` can be removed in 4.0.

These configs effectively a no-op with inter-broker protocol version 3.0 or higher
since Apache Kafka 3.0, so the impact should be minimal.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2024-12-21 15:35:15 -08:00
Justine Olshan 8bd3746e0c
KAFKA-17705: Add Transactions V2 system tests and mark as production ready (#18132)
Added transaction version 2 to some of the system tests. Also marking TV2 as production ready.

Also fixes the defaultVersion test. 

Reviewers: Jun Rao <jun@confluent.io>
2024-12-21 14:01:54 -08:00
Ismael Juma fe56fc98fa
KAFKA-18269: Remove deprecated protocol APIs support (KIP-896, KIP-724) (#18218)
Included in this change:
1. Remove deprecated protocol api versions from json files.
3. Remove fields that are no longer used from json files (affects ListOffsets, OffsetCommit, DescribeConfigs).
4. Remove record down-conversion support from KafkaApis.
5. No longer return `Errors.UNSUPPORTED_COMPRESSION_TYPE` on the fetch path[1].
6. Deprecate `TopicConfig. MESSAGE_DOWNCONVERSION_ENABLE_CONFIG` and made the relevant
configs (`message.downconversion.enable` and `log.message.downcoversion.enable`) no-ops since
down-conversion is no longer supported. It was an oversight not to deprecate this via KIP-724.
7. Fix `shouldRetainsBufferReference` to handle null request schemas for a given version.
8. Simplify producer logic since it only supports the v2 record format now.
9. Fix tests so they don't exercise protocol api versions that have been removed.
10. Add upgrade note.

Testing:
1. System tests have a lot of failures, but those tests fail for trunk too and I didn't see any issues specific to this change - it's hard to be sure given the number of failing tests, but let's not block on that given the other testing that has been done (see below).
3. Java producers and consumers with version 0.9-0.10.1 don't have api versions support and hence they fail in an ungraceful manner: the broker disconnects and the clients reconnect until the relevant timeout is triggered.
4. Same thing seems to happen for the console producer 0.10.2 although it's unclear why since api versions should be supported. I will look into this separately, it's unlikely to be related to this PR.
5. Console consumer 0.10.2 fails with the expected error and a reasonable message[2].
6. Console producer and consumer 0.11.0 works fine, newer versions should naturally also work fine.
7. kcat 1.5.0 (based on librdkafka 1.1.0) produce and consume fail with a reasonable message[3][4].
8. kcat 1.6.0-1.7.0 (based on librdkafka 1.5.0 and 1.7.0 respectively) consume fails with a reasonable message[5].
9. kcat 1.6.0-1.7.0 produce works fine.
10. kcat 1.7.1  (based on librdkafka 1.8.2) works fine for consumer and produce.
11. confluent-go-client (librdkafka based) 1.8.2 works fine for consumer and produce.
12. I will test more clients, but I don't think we need to block the PR on that.

Note that this also completes part of KIP-724: produce v2 and lower as well as fetch v3 and lower are no longer supported.

Future PRs will remove conditional code that is no longer needed (some of that has been done in KafkaApis,
but only what was required due to the schema changes). We can probably do that in master only as it does
not change behavior.

Note that I did not touch `ignorable` fields even though some of them could have been
changed. The reasoning is that this could result in incompatible changes for clients
that use new protocol versions without setting such fields _if_ we don't manually
validate their presence. I will file a JIRA ticket to look into this carefully for each
case (i.e. if we do validate their presence for the appropriate versions, we can
set them to ignorable=false in the json file).

[1] We would return this error if a fetch < v10 was used and the compression topic config was set
to zstd, but we would not do the same for the case where zstd was compressed at the producer
level (the most common case). Since there is no efficient way to do the check for the common
case, I made it consistent for both by having no checks.
[2] ```org.apache.kafka.common.errors.UnsupportedVersionException: The broker is too new to support JOIN_GROUP version 1```
[3]```METADATA|rdkafka#producer-1| [thrd:main]: localhost:9092/bootstrap: Metadata request failed: connected: Local: Required feature not supported by broker (0ms): Permanent```
[4]```METADATA|rdkafka#consumer-1| [thrd:main]: localhost:9092/bootstrap: Metadata request failed: connected: Local: Required feature not supported by broker (0ms): Permanent```
[5] `ERROR: Topic test-topic [0] error: Failed to query logical offset END: Local: Required feature not supported by broker`

Reviewers: David Arthur <mumrah@gmail.com>
2024-12-20 19:52:00 -08:00
TengYao Chi b37b89c668
KAFKA-9366 Upgrade log4j to log4j2 (#17373)
This pull request replaces Log4j with Log4j2 across the entire project, including dependencies, configurations, and code. The notable changes are listed below:

1. Introduce Log4j2 Instead of Log4j
2. Change Configuration File Format from Properties to YAML
3. Adds warnings to notify users if they are still using Log4j properties, encouraging them to transition to Log4j2 configurations

Co-authored-by: Lee Dongjin <dongjin@apache.org>

Reviewers: Luke Chen <showuon@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-12-14 01:14:31 +08:00
Justine Olshan 3cf8745243
MINOR: Add clientTransactionVersion to AddPartitionsToTxn requests and persist the value across transitions (#18086)
We can better keep track of which transactions use TV_2 by storing this information in the clientTransactionVersion field and persisting it across state transitions. Also updated some logging and equality code to include this information.

Added a test to ensure version persists. There aren't many TV2 transitions that don't specify TV, but I did test the InitProducerId + epoch overflow case.

Reviewers: Artem Livshits <alivshits@confluent.io>, Jeff Kim <jeff.kim@confluent.io>
2024-12-10 12:59:01 -08:00
Mickael Maison 4630628701
KAFKA-18144 Move the storage exceptions out of the core module (#18021)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-07 22:43:00 +08:00
Dongnuo Lyu e30edb3eff
KAFKA-18052: Decouple the dependency of feature stable version to the metadata version (#17886)
Currently the validation of feature upgrade relies on the supported version range generated during registration. For a given feature, its max supported feature version in production is set to be the default version value (the latest feature version with bootstrap metadata value smaller or equal to the latest production metadata value).

This patch introduces a LATEST_PRODUCTION value independent from the metadata version to each feature so that the highest supported feature version can be customized by the feature owner.

The change only applies to dynamic feature upgrade. During formatting, we still use the default value associated the metadata version.

Reviewers: Justine Olshan <jolshan@confluent.io>, Jun Rao <junrao@gmail.com>
2024-12-05 11:07:47 -08:00
HYUNSANG HAN (한현상, Travis) 700bdd5fee
KAFKA-17997 Remove deprecated config log.message.timestamp.difference.max.ms (#17928)
Reviewers: Divij Vaidya <diviv@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-29 03:15:46 +08:00
Ken Huang c32a49549d
MINOR: Remove duplicate valid value in document (#17947)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-27 21:07:36 +08:00
Calvin Liu 2b2b3cd355
KAFKA-18062: use feature version to enable ELR (#17867)
Replace the ELR static config with feature version.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-11-26 14:40:23 -08:00
Colin Patrick McCabe 5fba067aaa
KAFKA-18063: SnapshotRegistry should not leak memory (#17898)
SnapshotRegistry needs to have a reference to all snapshot data structures. However, this should
not be a strong reference, but a weak reference, so that these data structures can be garbage
collected as needed. This PR also adds a scrub mechanism so that we can eventually reclaim the
slots used by GC'ed Revertable objects in the SnapshotRegistry.revertables array.

Reviewers: David Jacot <david.jacot@gmail.com>
2024-11-21 13:27:53 -08:00
Abhinav Dixit aa7a3dbd30
KAFKA-18027: MINOR: Correct DelayedOperationPurgatory code around adding of an already completed operation (#17842)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Jun Rao <junrao@gmail.com>
2024-11-20 09:26:44 -08:00
Abhinav Dixit 77cc8ffbab
KAFKA-17948: Potential issue during tryComplete and onComplete simultaneous calls to access global variables (#17739)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, Jun Rao <junrao@gmail.com>
2024-11-15 10:26:54 -08:00
Mickael Maison 5d2074e8f2
MINOR: Move StopPartition to server-common (#17704)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-10 03:13:12 +08:00
Mickael Maison 0049b967e5
KAFKA-17890: Move DelayedOperationPurgatory to server-common (#17636)
Reviewers: Jun Rao <jun@confluent.io>, Apoorv Mittal <amittal@confluent.io>
2024-11-08 09:55:09 +01:00
Linsiyuan9 af53758746
KAFKA-17814 Use `final` declaration to replace the suppression `this-escape` (#17613)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-03 15:00:02 +08:00
Apoorv Mittal ff116df015
KAFKA-17002: Integrated partition leader epoch for Persister APIs (KIP-932) (#16842)
The PR integrates leader epoch for partition while invoking Persister APIs. The write RPC is retried once on leader epoch failure.

Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>, David Arthur <mumrah@gmail.com>
2024-10-30 14:41:39 -07:00
Ken Huang 30b1bdfc74
KAFKA-17835 Move ProducerIdManager and RPCProducerIdManager to transaction-coordinator module (#17562)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-31 02:40:47 +08:00