kafka

Commit Graph

Author	SHA1	Message	Date
TaiJuWu	1c82b89b4c	KAFKA-18712 Move Endpoint to server module (#18803 ) Reviewers: Ismael Juma <ismael@juma.me.uk>, Mickael Maison <mickael.maison@gmail.com>, Christo Lolov <lolovc@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-25 14:02:51 +08:00
PoAn Yang	10873e4210	KAFKA-18281: Kafka is improperly validating non-advertised listeners for routable controller addresses (#18387 ) When a cluster is configured with a dynamic controller quorum, KRaft replicas' endpoints are computed using the advertised.listeners property and not the controller.quorum.voters property. This change in the configuration makes it difficult to keep all previous node configurations compatible with the new endpoint discovery functionality. The least intrusive solution is to rely on Kafka's reverse hostname lookup when the hostname is not specified. The effective advertised controller listener now removes the '0.0.0.0' hostname if the endpoint came from the listeners configuration and not the advertised.listeners configuration. Reviewers: José Armando García Sancio <jsancio@apache.org>, Alyssa Huang <ahuang@confluent.io>	2025-02-24 21:51:28 -05:00
Nick Guo	d23a61738a	KAFKA-17937 Cleanup AbstractFetcherThreadTest (#18900 ) - Remove AbstractFetcherThreadWithIbp26Test as it tests unsupported IBP - cleanup AbstractFetcherThreadTest to remove unreachable paths, variables, and code Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-25 07:45:47 +08:00
Calvin Liu	009bee75ab	KIP-966 part 1 release doc (#18898 ) Add notes to explain how ELR works and how to manage it. Reviewers: Colin P. McCabe <cmccabe@apache.org>	2025-02-24 15:19:18 -08:00
David Arthur	cb33e98dfc	KAFKA-18748 Run new tests separately in PRs (#18770 ) Split the JUnit tests into "new", "flaky", and the remainder. On PR builds, "new" tests are anything that does not exist on trunk. They are run with zero tolerance for flakiness. On trunk builds, "new" tests are anything added in the last 7 days. They are run with some tolerance for flakiness. Another change included here is that we will not update the test catalog if any test job fails on a trunk build. We have had difficulty determining if all the tests had run or not (due to timeout or failures in upstream Gradle tasks). By requiring green ":test" jobs, we can be sure that the resulting catalog will be valid. --- The purpose of this change is to discourage contributors from adding flaky tests, but give some leeway for trunk so we have successful builds. The "quarantinedTest" Gradle target has been consolidated into the regular "test" target. There are now some runtime properties to control what tests are run. * kafka.test.catalog.file: path to test catalog * kafka.test.run.new: include new tests. this selection depends on the age of the loaded test catalog * kafka.test.run.flaky: include tests marked as `@Flaky` (replaces the `excludeTags 'flaky'` directive) * kafka.test.verbose: include additional logging from new JUnit classes (enabled by default if re-running GitHub workflow with debug logging enabled) * maxTestRetries: how many retries to allow via Develocity retry plugin (default 0) * maxTestRetryFailures: how many failures to allow before stopping retries (default 0) Thanks to Jun Rao for inspiring the idea. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>	2025-02-24 17:08:15 -05:00
Calvin Liu	10da082184	MINOR: update truncation test (#18952 ) Reduce the minISR to be 1 for the truncation test in order to skip the protection from KIP-966 Reviewers: David Jacot <djacot@confluent.io>, Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-25 04:32:29 +08:00
Apoorv Mittal	48a506b7b8	KAFKA-18522: Slice records for share fetch (#18804 ) The PR handles slicing of fetched records based on the acquire response for share fetch. Additional bytes may be fetched from the log while the acquired offsets are only a subset, typically because of the `max fetch records` configuration. Rather than sending the additional bytes of fetched data to the client, we should slice the file and send only the needed batches over the wire. Note: If the acquired offsets are within a batch then we need to send the entire batch within the file record. Hence, rather than checking individual batches, the PR finds the first and last acquired offsets and trims the file to all batches between (inclusive) these two offsets. Reviewers: Christo Lolov <lolovc@amazon.com>, Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>	2025-02-24 09:55:24 -08:00
Ismael Juma	38c984307c	MINOR: Test showing MetadataLoader waits until metadata version is known (#19012 ) Reviewers: David Arthur <mumrah@gmail.com>	2025-02-24 08:38:45 -08:00
Ismael Juma	48527a1e7f	MINOR: Clean-up imports and unused parameter in upgrade_test.py (#19018 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-24 06:35:07 -08:00
Sebastien Viale	3ce5f23295	KAFKA-18023: Enforcing Explicit Naming for Kafka Streams Internal Topics (#18233 ) This pull request implements KIP-1111. It adds a configuration that prevents a Kafka Streams application from starting if any of its internal topics have auto-generated names, thereby enforcing explicit naming for all internal topics and enhancing the stability of the application’s topology. - Repartition Topics: All repartition topics are created in the KStreamImpl.createRepartitionedSource(...) static method. This method either receives a name explicitly provided by the user or null and then builds the final repartition topic name. - Changelog Topics and State Store Names: There are several scenarios where these are created: In the MaterializedInternal constructor. During KStream/KStream joins. During KStream/KTable joins with grace periods. When key-value buffers are used in suppressions. Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Sophie Blee-Goldman <sophie@responsive.dev>	2025-02-24 11:41:42 +01:00
Shivsundar R	2880e04129	KAFKA-18779: Validate responses from broker in client for ShareFetch and ShareAcknowledge RPCs. (#18939 ) - Currently if we received extraneous topic partitions in the response or if the response was missing some partitions requested, we were processing the response as it came and even populated the callback with these partitions. - These invalid responses should be parsed at the `ShareConsumeRequestManager`. - If the response missed any acknowledgements for partitions that were requested, then we fail the request with `InvalidRecordStateException` and populate the callbacks. - For any extraneous partitions in the response, we log an error and ignore them. Some refactors are also done in this PR in ShareConsumeRequestManager to make the code more readable. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-24 10:27:24 +00:00
mingdaoy	289e958c39	MINOR: Fix validateResourceNameIsNodeId's exception message (#19017 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-24 09:30:02 +08:00
Sushant Mahajan	6e76736890	KAFKA-18827: Initialize share group state persister impl [2/N]. (#18992 ) * In this PR, we have provided implementation for the initialize share group state RPC from the persister perspective. * Tests have been added wherever applicable. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-23 08:03:13 +00:00
Calvin Liu	a1372ced69	KAFKA-15583 doc update for the "strict min ISR" rule (#18880 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Dave Troiano <dtroiano@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-23 13:06:50 +08:00
Ismael Juma	13cb87c2d0	MINOR: Remove request log space added inadvertently (#19011 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-23 11:30:19 +08:00
Sanskar Jhajharia	a206feb4ba	MINOR: Clean up share-coordinator (#19007 ) Given that we now support Java 17 on our brokers, this PR replaces the use of `Collections.singletonList()` and `Collections.emptyList()` with `List.of()` Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-23 11:27:38 +08:00
Sushant Mahajan	3fc103b48b	KAFKA-18629: ShareGroupDeleteState admin client impl. (#18928 ) * In this PR, we add various infra classes needed to support the `deleteShareGroups` functionality via the `kafka-share-groups.sh` script, as well as the implementation of `kafka-share-groups.sh --delete`. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-22 16:21:10 +00:00
Apoorv Mittal	6e45ab7d84	KAFKA-17351: Update tests and acquire API to allow discard batches from compacted topics (1/N) (#18978 ) The PR does the following: 1. Adds `fetchOffset` to the `acquire` API in SharePartition. 2. Adds a ShareFetchPartitionData class to efficiently handle the propagation of fetchOffset information. 3. Updates SharePartitionTests to share common code so that such improvements do not require changes to all tests in future PRs. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-22 16:14:09 +00:00
Sushant Mahajan	4f28973bd1	KAFKA-18827: Initialize share state, share coordinator impl. [1/N] (#18968 ) In this PR, we have added the share coordinator and KafkaApis side impl of the initialize share group state RPC. ref: https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka#KIP932:QueuesforKafka-InitializeShareGroupStateAPI Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-22 16:12:08 +00:00
Ken Huang	c6335c2ae8	MINOR: Fix fail e2e transactions_upgrade_test.py::TransactionsUpgradeTest.test_transactions_upgrade (#19004 ) The root cause is `3dba3125e9`, which removed the metadata versions older than 3.3, so this test fails when it uses metadata version 3.2 or 3.1 Reviewers: David Jacot <djacot@confluent.io>	2025-02-22 14:45:39 +01:00
David Jacot	407431499e	MINOR: Update version in doc (#19006 ) This patch updates the version in the documentation.	2025-02-22 12:37:15 +01:00
TengYao Chi	1e9565788c	MINOR: Fix fail e2e TestUpgrade#test_combined_mode_upgrade and test_isolated_mode_upgrade (#19003 ) #18845 assumed a baseline of 3.3 for server protocol versions so that the lower version couldn't roll up to 4.0. Hence, the `TestUpgrade#test_combined_mode_upgrade` and `test_isolated_mode_upgrade` failed for the 3.1 and 3.2 versions. e2e tests result with this patch on jenkins: ![Screenshot from 2025-02-22 13-22-17](https://github.com/user-attachments/assets/2de6f707-8281-4f30-b5d0-83dd4de9666d) e2e tests result with this patch on local machine: ![Screenshot from 2025-02-22 13-28-16](https://github.com/user-attachments/assets/2e5e563a-1ac4-4894-ba30-593304697d1d) Reviewers: David Jacot <djacot@confluent.io>	2025-02-22 08:53:34 +01:00
Ken Huang	d820559751	MINOR: Fix fail e2e TransactionsMixedVersionsTest#test_transactions_mixed_versions (#19002 ) The root cause is `3dba3125e9`, which removed the metadata versions older than 3.3, so this test fails when it uses metadata version 3.2 or 3.1 Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>	2025-02-22 08:52:22 +01:00
David Jacot	14ebec345a	MINOR: Update release script for 4.0 (#18999 ) This patch updates the release script to use JDK 21 to build the release. We could also use JDK 17 but using JDK 21 directly does not change much. We have to verify anyway that the server works with 17 and the client with 11. Reviewers: Ismael Juma <ismael@juma.me.uk>	2025-02-21 20:30:33 +01:00
xijiu	118818a7ca	KAFKA-18795 Remove `Records#downConvert` (#18897 ) Since we no longer convert records to the old format for fetch requests, this code is no longer used in production. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-22 02:29:58 +08:00
Lianet Magrans	c580874fc2	KAFKA-18813: [3/N] Client support for TopicAuthException in DescribeConsumerGroup path (#18996 ) Reviewers: David Jacot <djacot@confluent.io>	2025-02-21 12:42:00 -05:00
Apoorv Mittal	f543eac4fe	KAFKA-18733: Implemented fetch ratio and partition acquire time metrics (3/N) (#18959 ) PR implements the final set of ShareGroupMetrics, RequestTopicPartitionsFetchRatio and TopicPartitionsAcquireTimeMs, as defined in KIP-1103: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1103%3A+Additional+metrics+for+cooperative+consumption Note: Metric `RequestTopicPartitionsFetchRatio` is calculated as percentage as Histogram API doesn't record double. Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>	2025-02-21 17:01:39 +00:00
Calvin Liu	8f13e7c207	MINOR: Move the ELR default version to 4.1 (#18954 ) - ELR is enabled (ELRV_1) by default if the cluster is created with its bootstrap metadata version >= IBP_4_1_IV0. - ELRV_1 can be manually enabled iff the metadata version is >= IBP_4_0_IV1. Reviewers: Ismael Juma <ismael@juma.me.uk>, Colin P. McCabe <cmccabe@apache.org>, David Jacot <djacot@confluent.io>	2025-02-21 16:13:11 +01:00
Shivsundar R	7da1a6cbff	KAFKA-18033: Remove flaky tag in ShareConsumerTest (#18995 ) 3 tests which were marked flaky in ShareConsumerTest do not have any failure on trunk since the test was converted to use `ClusterTestExtensions`. Reviewers: Sushant Mahajan <smahajan@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-02-21 13:50:08 +00:00
Lianet Magrans	c56c9faee2	KAFKA-18813: [2/N] Client support for TopicAuthException in HB path (#18986 ) Reviewers: David Jacot <djacot@confluent.io>	2025-02-21 08:45:20 -05:00
TengYao Chi	767a62ade6	KAFKA-18737 KafkaDockerWrapper setup functions fails due to storage format command (#18844 ) The current Docker Hub documentation for Kafka is based on the use of static voters. Since Kafka 4.0 utilizes dynamic voters, users following the doc of docker hub may encounter unexpected behavior. Due to the limited time available for the 4.0.0 release, a simple and quick solution is to revert to using static voters within the Docker image. This can be achieved by adding a configuration file with static voter definitions to the kafka/docker folder, keeping it separate from the main kafka/config directory. This approach allows us to encourage the use of dynamic voters in typical deployments while maintaining compatibility within the Docker image. Reviewers: Vedarth Sharma <142404391+VedarthConfluent@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-21 20:43:41 +08:00
David Jacot	2124511431	MINOR: Rearrange configs in GroupCoordinatorConfigs (#18970 ) I was looking into GroupCoordinatorConfigs to review configurations that we will ship with Apache Kafka 4.0. I found out that it was pretty disorganised. This patch cleans up the format and re-groups the configurations which are related. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-21 13:20:58 +01:00
Sushant Mahajan	c2cb543a1e	KAFKA-18629: Delete share group state RPC group coordinator impl. [3/N] (#18848 ) * In this PR, we have added GC side impl to call the delete state share coord RPC using the persister. * We will be using the existing `GroupCoordinatorService.deleteGroups`. The logic will be modified as follows: * After sanitization, we will call a new `runtime.scheduleWriteOperation` (not read for consistency) with callback `GroupCoordinatorShard.sharePartitions`. This will return a Map of share partitions of the groups which are of SHARE type. We need to pass all groups as WE CANNOT DETERMINE the type of the group in the service class. * Then using the map we will create requests which could be passed to the persister and make the appropriate calls. * Once this future completes, we will continue with the existing flow of group deletion. * If the group under inspection is not share group - the read callback should return an empty map. * Tests have been added wherever applicable. Reviewers: David Jacot <djacot@confluent.io>, Andrew Schofield <aschofield@confluent.io>	2025-02-21 12:13:16 +00:00
TengYao Chi	d31cbf59de	KAFKA-18831 Migrating to log4j2 introduce behavior changes of adjusting level dynamically (#18969 ) fix the following behavior changes. 1) in log4j 1, users can't change the logger by parent if the logger is declared by properties explicitly. For example, `org.apache.kafka.controller` has level explicitly in the properties. Hence, we can't use "org.apache.kafka=INFO" to change the level of `org.apache.kafka.controller` to INFO. By contrast, log4j2 allows us to change all child loggers by the parent logger. 2) in log4j2, we can change the level of root to impact all loggers' level. By contrast, log4j 1 can't. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-21 16:12:58 +08:00
Matthias J. Sax	acea35ddf3	MINOR: cleanup SinkNode generics (#18975 ) Reviewers: Andrew Schofield <aschofield@confluent.io>, Bill Bejeck <bill@confluent.io>	2025-02-20 17:47:39 -08:00
TengYao Chi	709bfc506a	KAFKA-18641: AsyncKafkaConsumer could lose records with auto offset commit (#18737 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>, Jun Rao <jun@confluent.io>, Kirk True <ktrue@confluent.io>	2025-02-20 12:11:01 -05:00
Calvin Liu	1eecd02ce8	MINOR: Deflake EligibleLeaderReplicasIntegrationTest (#18923 ) Make sure to give enough time for the partition ISR updates. Reviewers: David Jacot <djacot@confluent.io>	2025-02-20 05:14:15 -08:00
Sushant Mahajan	c89fd2bff6	KAFKA-18828: Update share group metrics per new init and call mechanism. (#18962 ) * Due to recent changes in the way group count metrics are initialized and updated, the current share group count code has become obsolete as well as non-functional. * The update method for the share group count which should be called from `ShareGroup` cannot be called either. This is because the constructor has been changed to NOT accept the `GroupCoordinatorShardMetrics` ref. * In this PR, we remedy the situation by bringing share group count code at par with consumer and streams groups. * Additionally the metric name for share groups with group state attributes was not aligned with streams and consumer groups as mentioned in https://github.com/apache/kafka/pull/17011#discussion_r1960309578. This PR aligns them too. Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-20 10:23:37 +00:00
Ken Huang	eda8fc84ae	KAFKA-16918 TestUtils#assertFutureThrows should use future.get with timeout (#18891 ) Reviewers: TengYao Chi <kitingiao@gmail.com>, Luke Chen <showuon@gmail.com>, Parker Chang <45290853+Parkerhiphop@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>	2025-02-20 07:22:31 +08:00
Matthias J. Sax	9f23b25f6e	MINOR: fix Kafka Streams "smoke test" pass criteria (#18835 ) Reviewers: Bill Bejeck <bill@confluent.io>, Bruno Cadonna <bruno@confluent.io>	2025-02-19 14:33:31 -08:00
Matthias J. Sax	538a60e1b3	MINOR: disallow rawtypes and fail build (#18877 ) Cleanup code to avoid rawtype, and add suppressions where necessary. Change the build to fail on rawtype warning. Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>	2025-02-19 13:11:49 -08:00
Ismael Juma	3a59a526d9	MINOR: Remove redundant quorum parameter from *AdminIntegrationTest classes (#18965 ) Reviewers: Lianet Magrans <lmagrans@confluent.io>	2025-02-19 15:57:47 -05:00
David Arthur	7d628401e9	KAFKA-18791 Set default commit to PR title and description [2/n] (#18967 ) Reviewers: Justine Olshan <jolshan@confluent.io>	2025-02-19 14:46:53 -05:00
Calvin Liu	f85c7d4696	MINOR: Fix incorrect return value from upgradeFeatures #18958 Reviewers: Colin P. McCabe <cmccabe@apache.org>	2025-02-19 09:41:06 -08:00
Shivsundar R	3603c8fe35	KAFKA-18829: Added check before converting to IMPLICIT mode (#18964 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-19 17:34:28 +00:00
Ismael Juma	6aab304542	MINOR: Update upgrade notes for 4.0.0 (#18960 ) Details: 1. Upgrades to 4.0.x are only supported from 3.3.x and for kraft mode clusters 2. Add rolling upgrade instructions for 4.0.x 3. Clarify the message for zk to kraft migrations 4. Remove all the upgrade instructions for older versions (they can still be found in the upgrade notes for older releases) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>	2025-02-19 09:32:50 -08:00
Kaushik Raina	469c55cf02	Add TransactionAbortableException and Timeout Exception handling instruction in docs (#18942 ) Add instructions for handling TransactionAbortableException and TimeoutException on the application side. Reviewers: Justine Olshan <jolshan@confluent.io>	2025-02-19 09:15:47 -08:00
David Arthur	9b29e91218	KAFKA-18791 Enable new asf.yaml parser [1/n] (#18955 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>	2025-02-19 10:48:23 -05:00
Ismael Juma	3dba3125e9	KAFKA-18601: Assume a baseline of 3.3 for server protocol versions (#18845 ) 3.3.0 was the first KRaft release that was deemed production-ready and also when KIP-778 (KRaft to KRaft upgrades) landed. Given that, it's reasonable for 4.x to only support upgrades from 3.3.0 or newer (the metadata version also needs to be set to "3.3" or newer before upgrading). Noteworthy changes: 1. `AlterPartition` no longer includes topic names, which makes it possible to simplify `AlterPartitionManager` logic. 2. Metadata versions older than `IBP_3_3_IV3` have been removed and `IBP_3_3_IV3` is now the minimum version. 3. `MINIMUM_BOOTSTRAP_VERSION` has been removed. 4. Removed `isLeaderRecoverySupported`, `isNoOpsRecordSupported`, `isKRaftSupported`, `isBrokerRegistrationChangeRecordSupported` and `isInControlledShutdownStateSupported` - these are always `true` now. Also removed related conditional code. 5. Removed default metadata version or metadata version fallbacks in multiple places - we now fail-fast instead of potentially using an incorrect metadata version. 6. Update `MetadataBatchLoader.resetToImage` to set `hasSeenRecord` based on whether image is empty - this was a previously existing issue that became more apparent after the changes in this PR. 7. Remove `ibp` parameter from `BootstrapDirectory` 8. A number of tests were not useful anymore and have been removed. I will update the upgrade notes via a separate PR as there are a few things that need changing and it would be easier to do so that way. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>, David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Justine Olshan <jolshan@confluent.io>, Ken Huang <s7133700@gmail.com>	2025-02-19 05:35:42 -08:00
ShivsundarR	a6a588fbed	KAFKA-18198: Added check to prevent acknowledgements on initial ShareFetchRequest. (#18944 ) Reviewers: Andrew Schofield <aschofield@confluent.io>	2025-02-19 10:49:58 +00:00
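
Commit 48a506b7b8 (KAFKA-18522) describes trimming fetched data to every batch between the first and last acquired offsets, always keeping whole batches. The following is only a rough sketch of that selection rule. The `Batch` record, the `slice` method, and the sample offsets are hypothetical and are not taken from the Kafka code base; the actual PR slices a file record, which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.List;

public class ShareFetchSliceSketch {
    // Hypothetical stand-in for a record batch: an inclusive offset range.
    record Batch(long baseOffset, long lastOffset) {}

    // Keep every whole batch that overlaps [firstAcquired, lastAcquired].
    // A batch that only partially overlaps the range is kept in full,
    // mirroring the note that an entire batch must be sent.
    static List<Batch> slice(List<Batch> fetched, long firstAcquired, long lastAcquired) {
        List<Batch> kept = new ArrayList<>();
        for (Batch b : fetched) {
            boolean overlaps = b.lastOffset() >= firstAcquired && b.baseOffset() <= lastAcquired;
            if (overlaps) {
                kept.add(b);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Batch> fetched = List.of(
            new Batch(0, 9), new Batch(10, 19), new Batch(20, 29), new Batch(30, 39));
        // Acquired offsets 12..25 fall inside the second and third batches.
        System.out.println(slice(fetched, 12, 25));
    }
}
```

In this sketch the batches before offset 10 and after offset 29 are dropped, while the two batches that contain the acquired range are returned untouched.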

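Commit a206feb4ba replaces `Collections.singletonList()` and `Collections.emptyList()` with `List.of()`. A minimal sketch of that substitution follows. The class name, method names, and sample value are hypothetical and do not come from the share-coordinator code. One behavioral difference worth keeping in mind is that `List.of` rejects null elements, whereas `Collections.singletonList` accepts them.

```java
import java.util.Collections;
import java.util.List;

public class ListOfMigration {
    // Before: older idioms for immutable empty and single-element lists.
    static List<String> topicsBefore(String topic) {
        return topic == null ? Collections.emptyList() : Collections.singletonList(topic);
    }

    // After: List.of() covers both cases. The explicit null check matters here,
    // because List.of(null) would throw NullPointerException.
    static List<String> topicsAfter(String topic) {
        return topic == null ? List.of() : List.of(topic);
    }

    public static void main(String[] args) {
        // Both variants produce equal lists for the same input.
        System.out.println(topicsBefore("orders").equals(topicsAfter("orders")));
        System.out.println(topicsBefore(null).equals(topicsAfter(null)));
    }
}
```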
15172 Commits

All Branches