3143 Commits

Author SHA1 Message Date
David Jacot 84049369c1
MINOR: Bump trunk to 4.1.0-SNAPSHOT (#18213)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-12-16 16:47:13 +01:00
A. Sophie Blee-Goldman bac8928521
KAFKA-18026: KIP-1112, migrate foreign-key joins to use ProcessorSupplier#stores (#18194)
Convert FKJ processors to implementing the #stores method

Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>
2024-12-15 20:00:42 -08:00
Rohan ed10fc63a9
KAFKA-18026: supply stores for KTable#mapValues using ProcessorSupplier#stores (#18155)

Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-12-14 20:18:49 -08:00
A. Sophie Blee-Goldman 91575892d2
HOTFIX: RocksDBMetricsRecorder#init should null check taskId (#18151)
This appears to be a typo in the code: the error message indicates that the check is for taskId being null, but we accidentally check the streams metrics twice instead.
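
The kind of mistake described above can be sketched as follows. This is a minimal illustration, not the actual RocksDBMetricsRecorder code: the class name, method names, parameter types, and messages are all hypothetical, and only the shape of the bug (the same argument checked twice) comes from the description.

```java
import java.util.Objects;

// Hypothetical sketch; not the real RocksDBMetricsRecorder#init.
final class InitNullCheckSketch {

    // Buggy shape: the second check repeats streamsMetrics, so a null taskId passes.
    static void initBuggy(Object streamsMetrics, Object taskId) {
        Objects.requireNonNull(streamsMetrics, "Streams metrics must not be null");
        Objects.requireNonNull(streamsMetrics, "task ID must not be null");
    }

    // Fixed shape: each message matches the argument that is actually checked.
    static void initFixed(Object streamsMetrics, Object taskId) {
        Objects.requireNonNull(streamsMetrics, "Streams metrics must not be null");
        Objects.requireNonNull(taskId, "task ID must not be null");
    }
}
```

With the buggy version a null taskId is silently accepted, while the fixed version fails fast with a NullPointerException.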

Reviewers: Matthias Sax <mjsax@apache.org>, Bruno Cadonna <cadonna@apache.org>, Lucas Brutschy <lbrutschy@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
2024-12-13 20:36:08 -08:00
Lianet Magrans 84bc0c26ee
KAFKA-18224: Explicit group protocol setting in streams resetter (#18172)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-12-13 14:31:50 -05:00
TengYao Chi b37b89c668
KAFKA-9366 Upgrade log4j to log4j2 (#17373)
This pull request replaces Log4j with Log4j2 across the entire project, including dependencies, configurations, and code. The notable changes are listed below:

1. Introduce Log4j2 Instead of Log4j
2. Change Configuration File Format from Properties to YAML
3. Add warnings to notify users if they are still using Log4j properties, encouraging them to transition to Log4j2 configurations

Co-authored-by: Lee Dongjin <dongjin@apache.org>

Reviewers: Luke Chen <showuon@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-12-14 01:14:31 +08:00
A. Sophie Blee-Goldman ef2e4600f3
KAFKA-18026: KIP-1112, migrate stream-stream joins to use ProcessorSupplier#stores (#18111)
Covers wrapping of processors and state stores for KStream-KStream joins.

Includes self-joins and the spurious results fix optimization

Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>
2024-12-12 14:54:58 -08:00
Almog Gavra 9b776ffc50
KAFKA-18026: KIP-1112 convert StreamToTableNode (#18149)
Covers wrapping of processors and state stores for StreamToTableSource

Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-12-12 14:52:21 -08:00
Lianet Magrans 7a64623e40
Set protocol for streams tests (#18160)
Reviewers: Bill Bejeck <bill@confluent.io>
2024-12-12 13:33:43 -05:00
Matthias J. Sax a0a501952b
MINOR: improve Kafka Streams metrics documentation (#17900)
Reviewers: Bruno Cadonna <bruno@confluent.io>, Guozhang Wang <guozhang.wang.us@gmail.com>
2024-12-11 18:34:43 -08:00
Almog Gavra 21563380f3
KAFKA-18026: KIP-1112, migrate table-table joins to use ProcessorSupplier#stores (#18048)
Covers wrapping of processors and state stores for KTable-KTable joins

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Guozhang Wang <guozhang.wang.us@gmail.com>
2024-12-11 17:37:34 -08:00
Matthias J. Sax 6cdb8c352a
KAFKA-18015: add byDuration auto.offset.reset to Kafka Streams (#18115)
Part of KIP-1106.

Adds support for "by_duration" and "none" reset strategy
to the Kafka Streams runtime.

Reviewers: Bill Bejeck <bill@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-12-11 15:12:16 -08:00
Matthias J. Sax 990c8c750c
MINOR: remove old processor API MockInternalProcessorContext (#18103)
Reviewers: Bill Bejeck <bill@confluent.io>
2024-12-11 15:09:13 -08:00
Matthias J. Sax ab2facca58
KAFKA-12829: Remove deprecated KStream.process() for old Processor API (#18088)
Reviewers: Bill Bejeck <bill@confluent.io>
2024-12-11 14:28:47 -08:00
KApolinario1120 d83f09d014
KAFKA-18015: Add support for duration based offset reset strategy to Kafka Streams (#17973)
Part of KIP-1106.

Adds the public APIs to Kafka Streams to support the newly added "by_duration" reset policy,
and adds the missing "none" reset policy. Deprecates the enum `Topology.AutoOffsetReset` and
all related methods, and replaces them with new overloads that use the new `AutoOffsetReset` class.

Co-authored-by: Matthias J. Sax <matthias@confluent.io>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-12-11 10:47:25 -08:00
mingdaoy 4603f7495e
KAFKA-18030 Remove old upgrade-system-tests modules (#17843)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-10 11:19:14 +08:00
TengYao Chi e8837465a5
KAFKA-18067: Kafka Streams can leak Producer client under EOS (#17931)
To avoid leaking producers, we should add a `closed` flag to `StreamProducer` indicating whether we should reset the producer.

Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-12-09 16:12:05 -08:00
Joao Pedro Fonseca Dantas d5c2029434
KAFKA-16339: [4/4 KStream#flatTransformValues] Remove Deprecated "transformer" methods and classes (#17882)
Reviewer: Matthias J. Sax <matthias@confluent.io>
2024-12-08 20:10:11 -08:00
yx9o 38e727fe4d
KAFKA-17864: add descriptions to fields in the protocol (#17681)
Improve descriptive information in Kafka protocol documentation.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
2024-12-07 18:47:11 +00:00
Andrew Schofield e7d986e48c
KAFKA-17550: DescribeGroups v6 exploitation (#17706)
This PR introduces the DescribeGroups v6 API as part of KIP-1043. It adds an error message for the described groups so that it is possible to get some context on the error. It also changes the behaviour when the group ID cannot be found by returning error code GROUP_ID_NOT_FOUND rather than NONE.

Reviewers: David Jacot <djacot@confluent.io>
2024-12-05 23:12:24 -08:00
A. Sophie Blee-Goldman 09e8fa2dbe
KAFKA-18026: KIP-1112, migrate stream-table joins to use ProcessorSupplier#stores (#18047)
Covers wrapping of processors and state stores for KStream-KTable joins

Reviewers: Almog Gavra <almog@responsive.dev>, Guozhang Wang <guozhang.wang.us@gmail.com>
2024-12-05 10:06:11 -08:00
Ken Huang 2b43c49f51
KAFKA-18050 Upgrade the checkstyle version to 10.20.2 (#17999)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-05 10:59:18 +08:00
A. Sophie Blee-Goldman 31d97bc3c9
KAFKA-18026: KIP-1112, skip re-registering aggregate stores in StatefulProcessorNode (#18015)
Minor followup to #17929 based on this discussion

Also includes some very minor refactoring/renaming on the side. The only real change is in the KGroupedStreamImpl class

Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>
2024-12-03 22:18:55 -08:00
Peter Lee c76fb5cb9b
KAFKA-17893: Support record keys in the foreignKeyExtractor argument of KTable foreign join (#17756)
Currently, KTable foreign key joins only allow extracting the foreign key from the value of the source record. This forces users to duplicate data that might already exist in the key into the value when the foreign key needs to be derived from both the key and value. This leads to:

- Data duplication
- Additional storage overhead
- Potential data inconsistency if the duplicated data gets out of sync
- Less intuitive API when the foreign key is naturally derived from both key and value

This change allows users to extract the foreign key from both the key and the value of the source record.
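
The difference between the two extractor shapes can be sketched with plain java.util.function types. This is only an illustration under stated assumptions: it does not use the Kafka Streams API, and the class name, the "customer#order" key layout, and the "region:customer" foreign-key format are hypothetical.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch; plain Java functions stand in for the join's extractor argument.
final class ForeignKeyExtractorSketch {

    // Value-only extractor: anything needed from the record key would have to be
    // duplicated into the value first.
    static final Function<String, String> VALUE_ONLY = region -> region;

    // Key-and-value extractor: derives the foreign key from both parts of the record.
    // Assumes keys look like "<customer>#<order>" and the value is a region code.
    static final BiFunction<String, String, String> KEY_AND_VALUE =
        (orderKey, region) -> region + ":" + orderKey.substring(0, orderKey.indexOf('#'));
}
```

For a record with key "cust42#order7" and value "EU", the second extractor builds its result from both inputs, so the customer id does not need to be copied into the value.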

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-12-03 17:34:13 +01:00
A. Sophie Blee-Goldman 184b64fb41
KAFKA-18026: migrate KStream and KTable aggregates to use ProcessorSupplier#stores (#17929)
As part of KIP-1112, to maximize the utility of the new ProcessorWrapper, we need to migrate the DSL operators to the new method of attaching state stores by implementing ProcessorSupplier#stores, which makes these stores available for inspection by the user's wrapper.

This PR covers the aggregate operator for both KStream and KTable.


Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Rohan Desai <rohan@responsive.dev>
2024-12-03 02:09:43 -08:00
TengYao Chi 6fd951a9c0
KAFKA-17610 Drop alterConfigs (#18002)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-02 23:26:06 +08:00
Almog Gavra 5243fb9a7d
KAFKA-18026: migrate KTableSource to use ProcessorSupplier#stores (#17903)
This PR is part of the implementation for KIP-1112 (KAFKA-18026). In order to have DSL operators be properly wrapped by the interface suggested in KIP-1112, we need to make sure they all use the ConnectedStoreProvider#stores method to connect stores instead of manually calling addStateStore.

This is a refactor only; there is no new behavior.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-11-27 14:04:27 -08:00
Bill Bejeck d334f60944
MINOR: KStreamRepartitionIntegrationTest bug (#17963)
The KStreamRepartitionIntegrationTest.shouldThrowAnExceptionWhenNumberOfPartitionsOfRepartitionOperationDoNotMatchSourceTopicWhenJoining test was taking two minutes due to not reaching an expected condition. By updating the StreamsUncaughtExceptionHandler to return a response of SHUTDOWN_CLIENT, the expected ERROR state is now reached. The root cause was using the Thread.UncaughtExceptionHandler to handle the exception.

Without this fix the test takes 2 minutes to run; with it, the test takes 1 second.

Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-27 16:08:05 -05:00
Joao Pedro Fonseca Dantas 3f834781a4
KAFKA-12844: clean up TaskId (#17904)
Rename topicGroupId as subtopology.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-11-26 17:06:36 -08:00
Matthias J. Sax f5d712396b
MINOR: fix warnings in Kafka Streams state store tests (#17855)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-27 01:54:21 +08:00
Matthias J. Sax 95947d2f58
KAFKA-17299: add unit tests for previous fix (#17919)
https://github.com/apache/kafka/pull/17899 fixed the issue, but did not
add any unit tests.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-11-25 12:03:57 -08:00
Kaushik Raina 7908a4838b
Fix long running RangeQueryIntegrationTest. (#17933)
Noticed that RangeQueryIntegrationTest was taking approximately 20-30 minutes to run.
A deep dive into the logs showed consumer rebalancing errors, and the test was stuck in a loop.
It seems that, because the same application.id is used across tests, the Kafka Streams application fails to track its state correctly across rebalances.

Reviewers: Bill Bejeck <bbejeck@apache.org>
2024-11-25 11:42:02 -05:00
Andrew Schofield d17a149205
KAFKA-17956 Remove Admin.listShareGroups (#17912)
KIP-1043 introduced Admin.listGroups as the way to list all types of groups. As a result, Admin.listShareGroups has been removed. This PR is the final step of the removal.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-25 22:05:35 +08:00
Manikumar Reddy 3268435fd6
KAFKA-18013: Add AutoOffsetResetStrategy internal class (#17858)
- Deprecates OffsetResetStrategy enum
- Adds new internal class AutoOffsetResetStrategy
- Replaces all OffsetResetStrategy enum usages with AutoOffsetResetStrategy
- Deprecate old/Add new constructors to MockConsumer

Reviewers: Andrew Schofield <aschofield@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-11-25 19:11:12 +05:30
A. Sophie Blee-Goldman 87b902d35d
KAFKA-18026: KIP-1112, ProcessorWrapper API with PAPI and partial DSL implementation (#17892)
This PR includes the API for KIP-1112 and a partial implementation, which wraps any processors added through the PAPI and the DSL processors that are written to the topology through the ProcessorParameters#addProcessorTo method.

Further PRs will complete the implementation by converting the remaining DSL operators to using the #addProcessorTo method, and future-proof the processor writing mechanism to prevent new DSL operators from being implemented incorrectly/without the wrapper

Reviewers: Almog Gavra <almog@responsive.dev>, Guozhang Wang <guozhang.wang.us@gmail.com>
2024-11-23 21:19:19 -08:00
Laxman Ch d36b24f45f
KAFKA-17299: Fix Kafka Streams consumer hang issue (#17899)
When Kafka Streams skips over corrupted messages, it might not resume previously paused partitions
if more than one record is skipped at once and the buffer drops below the max-buffer limit at the same time.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-11-22 18:32:00 -08:00
Joao Pedro Fonseca Dantas 866f0cc308
KAFKA-16339: [3/4 KStream#transformValues] Remove Deprecated "transformer" methods and classes (#17266)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-11-22 15:07:03 -08:00
Matthias J. Sax 240efbb99d
MINOR: improve JavaDocs for Kafka Streams exceptions and error handlers (#17856)
Reviewers: Bill Bejeck <bill@confluent.io>
2024-11-21 11:46:23 -08:00
Matthias J. Sax 2519e4af0c
KAFKA-18038: fix flakey test StreamThreadTest.shouldLogAndRecordSkippedRecordsForInvalidTimestamps (#17889)
With KAFKA-17872, we changed some internals that affect the conditions
of this test, introducing a race condition when the expected log
messages are printed.

This PR adds additional wait-conditions to the test to close the race
condition.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-11-21 11:42:28 -08:00
Bill Bejeck fd9de50de1
KAFKA-18041: Update key for storing global consumer instance id for consistency (#17869)
This PR updates the key for storing the KIP-714 client instance id for the global consumer to follow a more consistent pattern of the other embedded Kafka Streams consumer clients.

Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-20 16:14:03 -05:00
Sebastien Viale 615c8c0e11
KAFKA-17850: fix leaking internal exception in state manager (#17711)
Following KIP-1033, a FailedProcessingException is passed to the Streams-specific uncaught exception handler.

The goal of this PR is to unwrap a FailedProcessingException into a StreamsException when an exception occurs during the flushing or closing of a store.

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-11-19 10:51:07 +01:00
Nick Telford 57299cfbb1
KAFKA-17954: Error getting oldest-iterator-open-since-ms from JMX (#17713)
The thread that evaluates the gauge for the oldest-iterator-open-since-ms runs concurrently
with threads that open/close iterators (stream threads and interactive query threads). This PR
fixes a race condition between `openIterators.isEmpty()` and `openIterators.first()` by catching
a potential exception. Because we expect the race condition to be rare, we catch the
exception rather than introducing a guard via locking.
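
The approach described above can be sketched as follows. This is a minimal illustration, not the actual Kafka Streams code: the class and method names are hypothetical, and the use of a ConcurrentSkipListSet of start timestamps is an assumption; only the `isEmpty()`/`first()` race and the catch-instead-of-lock choice come from the description.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical sketch of a gauge that reads a set mutated by other threads.
final class OldestOpenIteratorGauge {

    // Assumed representation: start timestamps (ms) of the currently open iterators.
    private final ConcurrentSkipListSet<Long> openIterators = new ConcurrentSkipListSet<>();

    void opened(long startMs) {
        openIterators.add(startMs);
    }

    void closed(long startMs) {
        openIterators.remove(startMs);
    }

    // Returns the oldest start timestamp, or null when no iterator is open.
    Long oldestOpenSinceMs() {
        try {
            return openIterators.isEmpty() ? null : openIterators.first();
        } catch (NoSuchElementException e) {
            // Another thread removed the last element between isEmpty() and first().
            return null;
        }
    }
}
```

Catching the exception keeps the common path lock-free; the cost is an occasional exception in the rare case where the last iterator closes mid-read.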

Reviewers: Matthias J. Sax <matthias@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-11-18 17:45:49 -08:00
Bill Bejeck 50c15b94c9
KAFKA-17561: KIP-1091 add operator metrics (#17820)
Implementation of KIP-1091 adding operator metrics to Kafka Streams
Updated existing tests to validate added metrics
Reviewers: Bruno Cadonna <cadonna@apache.org>, Matthias Sax <mjsax@apache.org>
2024-11-18 10:30:09 -05:00
TengYao Chi 84fe66827d
KAFKA-18006: Add 3.9.0 to end-to-end test (streams) (#17800)
This commit adds AK 3.9 to the system tests on trunk.
Follow-up of #17797

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bruno Cadonna <cadonna@apache.org>
2024-11-15 14:58:24 +01:00
Matthias J. Sax f02c28b21d
KAFKA-17994 Checked exceptions are not handled (#17817)
Reviewers: Bill Bejeck <bill@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-15 20:36:03 +08:00
David Arthur 48ff6a6b53
MINOR Fix a few test names (#17788)
Remove or update custom display names to make sure we actually include the test method as the first part of the display name.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bill Bejeck <bill@confluent.io>
2024-11-13 13:28:38 -05:00
Bill Bejeck a22fe10544
Update javadoc on split to mention first matching (#17799)
Clarify the functionality of split matching on first predicate
Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-13 11:41:48 -05:00
Rajini Sivaram 52d2fa5c8b
KAFKA-17885: Enable clients to rebootstrap based on timeout or error code (KIP-1102) (#17720)
Implementation of https://cwiki.apache.org/confluence/display/KAFKA/KIP-1102%3A+Enable+clients+to+rebootstrap+based+on+timeout+or+error+code
- Introduces rebootstrap trigger interval config metadata.recovery.rebootstrap.trigger.ms, set to 5 minutes by default
- Makes rebootstrap the default for metadata.recovery.strategy
- Adds new error code REBOOTSTRAP_REQUIRED, introduces top-level error code in metadata response. On this error, clients rebootstrap.
- Configs apply to producers, consumers, share consumers, admin clients, Connect and KStreams clients.
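
As a rough sketch, the two settings named above could be set explicitly on a client like this. The config keys and the 5-minute default come from the list above; the class and method names are hypothetical, and the remaining client configuration (bootstrap servers, serializers, and so on) is omitted.

```java
import java.util.Properties;

// Hypothetical helper; only the two config keys are taken from the commit message.
final class RebootstrapConfigSketch {

    static Properties clientProps() {
        Properties props = new Properties();
        // Rebootstrap is described as the new default strategy; set here for clarity.
        props.setProperty("metadata.recovery.strategy", "rebootstrap");
        // Trigger interval: 5 minutes, expressed in milliseconds.
        props.setProperty("metadata.recovery.rebootstrap.trigger.ms", "300000");
        return props;
    }
}
```

Because both values match the described defaults, setting them is only needed to make the behaviour explicit or to change the interval.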

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-13 13:01:08 +00:00
Colin Patrick McCabe 085b27ec6e
KAFKA-17987 Remove assorted ZK-related files (#17768)
Remove zookeeper files in bin:
- bin/zookeeper-security-migration.sh
- bin/zookeeper-server-start.sh
- bin/zookeeper-server-stop.sh
- bin/zookeeper-shell.sh

Remove files used to configure Kafka in zookeeper mode in config:
- config/server.properties
- config/zookeeper.properties

Remove ZK references from all remaining Kafka configuration files.

Remove ZK references from all log4j.properties files.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-13 20:32:18 +08:00
Danica Fine 9682e63c11
KAFKA-17109: Reduce log message load for failed locking (#16705)
Reduces log volume by removing the stack trace.

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-11-13 12:32:40 +01:00