CoordinatorRecordSerde does not validate that the version of the value is supported by the current version of the software. This is problematic if a future, unsupported version of the record is read by an older version of the software, because it would misinterpret the bytes. Hence CoordinatorRecordSerde must throw an error if the version is unknown. This is also consistent with the handling in the old coordinator.
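A minimal sketch of the missing guard, with assumed names (the real serde and exception types may differ):

```java
import java.nio.ByteBuffer;

// Sketch only: read the value's version and fail fast when it is newer
// than anything this build understands, instead of misreading the bytes.
final class ValueVersionGuard {
    static final short HIGHEST_SUPPORTED_VERSION = 1; // illustrative value

    static short readValueVersion(ByteBuffer value) {
        short version = value.getShort();
        if (version < 0 || version > HIGHEST_SUPPORTED_VERSION)
            throw new IllegalStateException("Unknown record value version: " + version);
        return version;
    }
}
```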
Reviewers: Jeff Kim <jeff.kim@confluent.io>
Return produce v0-v2 as supported versions in `ApiVersionsResponse`, but disable support
for them everywhere else.
Since clients pick the highest supported version by both client and broker during version
negotiation, this solves the problem with minimal tech debt (even though it's not ideal that
`ApiVersionsResponse` becomes inconsistent with the actual protocol support).
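To illustrate why advertising the versions is sufficient, here is a simplified model of the version negotiation (not the actual client code): the client picks the highest version in the intersection of both ranges, so a client whose own range starts above v2 never actually sends v0-v2.

```java
// Simplified model of API version negotiation.
final class VersionNegotiation {
    static short negotiate(short clientMin, short clientMax,
                           short brokerMin, short brokerMax) {
        // Pick the highest version supported by both sides.
        short version = (short) Math.min(clientMax, brokerMax);
        if (version < Math.max(clientMin, brokerMin))
            throw new IllegalStateException("No common version for this API");
        return version;
    }
}
```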
Add one test for the socket server handling (in `ProcessorTest`) and one test for the
client behavior (in `ProduceRequestTest`). Adjust a couple of API versions tests to verify
the new behavior.
Finally, include a few clean-ups in `ApiKeys`, `Protocol`, `ProduceRequest`,
`ProduceRequestTest` and `BrokerApiVersionsCommandTest`.
Reference to related librdkafka issue:
https://github.com/confluentinc/librdkafka/issues/4956
Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
This PR corrects the check introduced in #5332, which verifies that a read position falls within the boundaries of the file. The check
position > currentSizeInBytes - start
is incorrect, since the position is already relative to `start`.
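A hedged sketch of the corrected bound, assuming a FileRecords-style class where `start` and `end` are absolute file offsets and `position` is relative to `start`:

```java
// Sketch only (assumed fields): `position` and the slice size share the
// same start-relative coordinates, so `start` must not be subtracted again.
final class SliceBoundsCheck {
    static void checkPosition(int position, int start, int end) {
        int currentSizeInBytes = end - start; // slice size, already start-relative
        if (position > currentSizeInBytes)
            throw new IllegalArgumentException(
                "Invalid position: " + position + ", slice size: " + currentSizeInBytes);
    }
}
```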
Reviewers: Jun Rao <junrao@gmail.com>
The full class names of the assignors are part of our public API. Hence, we should ensure that they are not changed by mistake. This patch adds a unit test verifying them.
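A minimal sketch of such a test, using the classic client-side RangeAssignor purely as an illustration (the actual test covers the relevant assignors):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.apache.kafka.clients.consumer.RangeAssignor;
import org.junit.jupiter.api.Test;

// Pin the fully qualified class name so an accidental move or rename
// fails the build.
class AssignorNameTest {
    @Test
    void testRangeAssignorClassName() {
        assertEquals("org.apache.kafka.clients.consumer.RangeAssignor",
            RangeAssignor.class.getName());
    }
}
```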
Reviewers: Sean Quah <squah@confluent.io>, Jeff Kim <jeff.kim@confluent.io>
This patch renames kraft_upgrade_test.py to upgrade_test.py. This is enough to cover the old upgrade/downgrade tests.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This change implements some of the metrics enumerated in KIP-853.
The KafkaRaftMetrics object now exposes number-of-voters, number-of-observers and uncommitted-voter-change. The number-of-observers and uncommitted-voter-change metrics are only present on the active controller or leader, since it does not make sense for other replicas to report these metrics.
In order to make these two metrics thread-safe, KafkaRaftMetrics needs to be passed into LeaderState, and therefore into QuorumState. This introduces a circularity, since the KafkaRaftMetrics constructor takes in QuorumState. To break the circularity for now, the logic using QuorumState is moved to the KafkaRaftMetrics#initialize method.
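A simplified sketch of the two-phase initialization that breaks the cycle (class shapes assumed, not the actual code):

```java
// Construct the metrics object first, then QuorumState with a reference
// to it, and only register state-dependent gauges in initialize().
class RaftMetricsSketch {
    private QuorumStateSketch state;

    void initialize(QuorumStateSketch state) {
        this.state = state;
        // register gauges such as number-of-voters that read from `state`
    }
}

class QuorumStateSketch {
    private final RaftMetricsSketch metrics;

    QuorumStateSketch(RaftMetricsSketch metrics) {
        this.metrics = metrics; // leader state can update metrics directly
    }
}
```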
The BrokerServerMetrics object now exposes ignored-static-voters. The ControllerServerMetrics object now exposes IgnoredStaticVoters. To implement both metrics for "ignored static voters", this PR introduces the ExternalKRaftMetrics interface, which allows for higher layer metrics objects to be accessible within the raft module.
Reviewers: José Armando García Sancio <jsancio@apache.org>
Fixed a typo that returned the wrong producer ID and epoch, so that epoch overflow is handled correctly.
We also had to rearrange the concurrent transaction handling so that we don't self-fence when we start the new transaction with the new producer ID.
I also tested this with a modified version of the code where epoch overflow happens on the first epoch bump (i.e., every request gets a new producer ID).
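For context, a simplified sketch of the overflow handling (not the coordinator's actual code): producer epochs are 16-bit, so when an epoch can no longer be bumped, the transaction must move to a fresh producer ID.

```java
import org.apache.kafka.common.utils.ProducerIdAndEpoch;

// Simplified: bump the epoch while it still fits in a short; on
// overflow, switch to a newly allocated producer id at epoch 0.
final class EpochBump {
    static ProducerIdAndEpoch bump(long producerId, short epoch, long newProducerId) {
        if (epoch < Short.MAX_VALUE - 1)
            return new ProducerIdAndEpoch(producerId, (short) (epoch + 1));
        return new ProducerIdAndEpoch(newProducerId, (short) 0);
    }
}
```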
Reviewers: Artem Livshits <alivshits@confluent.io>, Jeff Kim <jeff.kim@confluent.io>
This change reduces fetch session cache evictions on the broker for AsyncKafkaConsumer by altering the logic that determines which partitions it includes in fetch requests.
Background
Consumer implementations fetch data from the cluster and temporarily buffer it in memory until the user next calls Consumer.poll(). When a fetch request is being generated, partitions that already have buffered data are not included in the fetch request.
The ClassicKafkaConsumer performs much of its fetch logic and network I/O in the application thread. On poll(), if there is any locally-buffered data, the ClassicKafkaConsumer does not fetch any new data and simply returns the buffered data to the user from poll().
On the other hand, the AsyncKafkaConsumer consumer splits its logic and network I/O between two threads, which results in a potential race condition during fetch. The AsyncKafkaConsumer also checks for buffered data on its application thread. If it finds there is none, it signals the background thread to create a fetch request. However, it's possible for the background thread to receive data from a previous fetch and buffer it before the fetch request logic starts. When that occurs, as the background thread creates a new fetch request, it skips any buffered data, which has the unintended result that those partitions get added to the fetch request's "to remove" set. This signals to the broker to remove those partitions from its internal cache.
This issue is technically possible in the ClassicKafkaConsumer too, since the heartbeat thread performs network I/O in addition to the application thread. However, because of the frequency at which the AsyncKafkaConsumer's background thread runs, it is ~100x more likely to happen.
Options
The core decision is: what should the background thread do if it is asked to create a fetch request and discovers there is buffered data? There were multiple proposals to address this issue in the AsyncKafkaConsumer. Among them are:
1. The background thread should omit buffered partitions from the fetch request as before (this is the existing behavior)
2. The background thread should skip the fetch request generation entirely if there are any buffered partitions
3. The background thread should include buffered partitions in the fetch request, but use a small "max bytes" value
4. The background thread should skip fetching from the nodes that have buffered partitions
Option 4 won out. The change is localized to AbstractFetch where the basic idea is to skip fetch requests to a given node if that node is the leader for buffered data. By preventing a fetch request from being sent to that node, it won't have any "holes" where the buffered partitions should be.
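A simplified sketch of that idea (not the literal AbstractFetch code): when selecting nodes to fetch from, exclude any node that currently leads a buffered partition.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

import org.apache.kafka.common.Node;
import org.apache.kafka.common.TopicPartition;

// A node is fetchable only if it leads no partition with buffered data.
final class BufferedNodeFilter {
    static Set<Node> fetchableNodes(Map<TopicPartition, Node> leaders,
                                    Set<TopicPartition> buffered) {
        Set<Node> excluded = buffered.stream()
            .map(leaders::get)
            .filter(Objects::nonNull)
            .collect(Collectors.toSet());
        return leaders.values().stream()
            .filter(node -> !excluded.contains(node))
            .collect(Collectors.toSet());
    }
}
```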
Reviewers: Lianet Magrans <lmagrans@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Jun Rao <junrao@gmail.com>
Add a class with helper methods to create the records stored in the __consumer_offsets topic.
Compared to the feature branch, I added unit tests (most functions were not tested) and adopted the new interface for constructing coordinator records introduced by David.
Reviewers: Bruno Cadonna <cadonna@apache.org>
This patch removes the explicit failure of test tasks in Gradle when there is a flaky test. This also fixes a fall-through case in junit.py where we did not recognize an error prior to running the tests (such as the javadoc task).
Additionally, this patch removes usages of ignoreFailures in our CI and changes the XML copy task to a finalizer task instead of a doLast closure.
Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Implement ClearElrRecord handling in the TopicDelta. Also, the ReplicationControlManager should not merge updates if ELR/LastKnownElr are empty, because that would cause an unnecessary partition epoch bump.
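A hedged sketch of the guard, with assumed field names:

```java
import java.util.List;

// Sketch only: skip the merge when both ELR fields are empty, since an
// empty merge would still bump the partition epoch for no reason.
final class ElrMergeGuard {
    static boolean shouldMergeElrUpdate(List<Integer> elr, List<Integer> lastKnownElr) {
        return !(elr.isEmpty() && lastKnownElr.isEmpty());
    }
}
```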
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Once a StreamThread receives its assignment, it will close the startup tasks. During the closing process, the StandbyTask.closeClean() method eventually calls StateManagerUtil.closeStateManager, which needs to lock the state directory; locking requires that the calling thread be the current owner of the lock. Since the main thread grabs the lock on startup but moves on without releasing it, we need to update the ownership explicitly here so that the stream thread can close the startup task and begin processing.
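A minimal illustration of the ownership transfer (not the actual StateDirectory code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: per-task lock ownership keyed by thread, with an explicit
// transfer so a thread other than the original locker may proceed.
class TaskLockRegistry {
    private final Map<String, Thread> owners = new ConcurrentHashMap<>();

    void lock(String taskId) {
        Thread owner = owners.computeIfAbsent(taskId, id -> Thread.currentThread());
        if (owner != Thread.currentThread())
            throw new IllegalStateException("Lock on " + taskId + " held by " + owner);
    }

    // The fix in spirit: the main thread hands ownership to the stream
    // thread so it can close the startup task cleanly.
    void transferOwnership(String taskId, Thread newOwner) {
        owners.put(taskId, newOwner);
    }
}
```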
Reviewers: Matthias Sax <mjsax@apache.org>, Nick Telford
While testing, we found that the NOT_ENOUGH_REPLICAS error was very common and could easily be confused. Since we are already bumping the request version, we can signify that the produce request may return this error, and new clients can handle it.
(Note, the java client should be able to handle this already as a retriable error, but other client libraries may need to implement this change)
Reviewers: Justine Olshan <jolshan@confluent.io>
HdrHistogram can throw an exception if the recorded value is greater than a configured limit. Expand the ceiling from per-metric to all invocations.
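For reference, the failure mode looks like this (illustrative values):

```java
import org.HdrHistogram.Histogram;

// Recording above the configured highest trackable value throws, which
// is the exception this patch guards against.
public class HdrCeilingDemo {
    public static void main(String[] args) {
        Histogram histogram = new Histogram(1000L, 3); // ceiling of 1,000
        histogram.recordValue(500L);  // fine
        histogram.recordValue(5000L); // throws ArrayIndexOutOfBoundsException
    }
}
```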
Reviewers: David Jacot <djacot@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
Ensure we always return empty records (including cases where an error is returned).
We also remove `nullable` from `records` since it is effectively expected to be
non-null by a large percentage of clients in the wild.
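A sketch of the invariant using the public generated types (simplified from the actual fix):

```java
import org.apache.kafka.common.message.FetchResponseData;
import org.apache.kafka.common.record.MemoryRecords;

// Even on an error path, the records field is set to an empty record
// set rather than left null.
final class EmptyRecordsInvariant {
    static FetchResponseData.PartitionData errorPartition(short errorCode) {
        return new FetchResponseData.PartitionData()
            .setErrorCode(errorCode)
            .setRecords(MemoryRecords.EMPTY);
    }
}
```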
This behavior regressed in fe56fc9 (KAFKA-18269). Empty records were
previously set via `FetchResponse.recordsOrFail(partitionData)` in the
now-removed `maybeConvertFetchedData` method.
Added an integration test that fails without this fix and also updated many
tests to set `records` to `empty` instead of leaving them as `null`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>
During testing, we identified that kafka-python (and aiokafka) rely on metadata request v0 and
hence we need to add these versions back to comply with the premise of KIP-896, i.e. it should not
break the clients listed within it.
I reverted the changes from #18218 related to the removal of metadata versions 0-3.
I will submit a separate PR to undeprecate these API versions on the relevant 3.x branches.
kafka-python (and aiokafka) work correctly (produce & consume) with this change on
top of the 4.0 branch.
Reviewers: David Arthur <mumrah@gmail.com>