KAFKA-18026: supply stores for KTable#mapValues using ProcessorSupplier#stores
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This appears to be a typo in the code: the error message indicates that the check is for taskId being null, but we accidentally check the streams metrics twice instead.
Reviewers: Matthias Sax <mjsax@apache.org>, Bruno Cadonna <cadonna@apache.org>, Lucas Brutschy <lbrutschy@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
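A minimal sketch of the kind of fix described in the taskId null-check change, assuming a constructor that validates two arguments. The class name `TaskSensors`, its fields, and the message strings are hypothetical and are not taken from the Kafka code base.

```java
import java.util.Objects;

// Hypothetical stand-in for a class that validates two distinct constructor arguments.
final class TaskSensors {
    private final Object streamsMetrics;
    private final String taskId;

    TaskSensors(final Object streamsMetrics, final String taskId) {
        // Each argument gets its own check, and each message names the argument it guards.
        this.streamsMetrics = Objects.requireNonNull(streamsMetrics, "streamsMetrics cannot be null");
        this.taskId = Objects.requireNonNull(taskId, "taskId cannot be null");
    }

    Object streamsMetrics() {
        return streamsMetrics;
    }

    String taskId() {
        return taskId;
    }
}
```

With the check written this way, a null taskId fails fast with a message that matches the argument actually being checked.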
This pull request replaces Log4j with Log4j2 across the entire project, including dependencies, configurations, and code. The notable changes are listed below:
1. Introduce Log4j2 Instead of Log4j
2. Change Configuration File Format from Properties to YAML
3. Add warnings to notify users who are still using Log4j properties, encouraging them to transition to Log4j2 configurations
Co-authored-by: Lee Dongjin <dongjin@apache.org>
Reviewers: Luke Chen <showuon@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Covers wrapping of processors and state stores for KStream-KStream joins.
Includes self-joins and the spurious results fix optimization
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>
Covers wrapping of processors and state stores for StreamToTableSource
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Covers wrapping of processors and state stores for KTable-KTable joins
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Guozhang Wang <guozhang.wang.us@gmail.com>
Part of KIP-1106.
Adds support for the "by_duration" and "none" reset strategies
to the Kafka Streams runtime.
Reviewers: Bill Bejeck <bill@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Part of KIP-1106.
Adds the public APIs to Kafka Streams to support the newly added "by_duration" reset policy,
plus adds the missing "none" reset policy. Deprecates the enum `Topology.AutoOffsetReset` and
all related methods, and replaces them with new overloads using the new `AutoOffsetReset` class.
Co-authored-by: Matthias J. Sax <matthias@confluent.io>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
To avoid leaking producers, we should add a `closed` flag to `StreamProducer` indicating whether we should reset the producer.
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
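A rough sketch of the `closed` flag idea from the producer-leak fix, assuming a wrapper that recreates its producer on reset. `GuardedProducer`, its nested `Producer` interface, and the choice to throw on reset-after-close are all assumptions made for illustration; the actual change may handle this case differently.

```java
import java.util.function.Supplier;

// Hypothetical wrapper; not the actual Kafka Streams class.
final class GuardedProducer {
    // Minimal stand-in for a producer that owns resources and must be closed.
    interface Producer {
        void close();
    }

    private final Supplier<Producer> factory;
    private Producer producer;
    private boolean closed = false;

    GuardedProducer(final Supplier<Producer> factory) {
        this.factory = factory;
        this.producer = factory.get();
    }

    void close() {
        producer.close();
        closed = true;
    }

    // Replaces the current producer with a fresh one, unless the wrapper was already closed.
    void reset() {
        if (closed) {
            // Assumption: creating a new producer after close would leak it, so refuse.
            throw new IllegalStateException("producer is closed; refusing to create a new one");
        }
        producer.close();
        producer = factory.get();
    }
}
```

The flag makes the leak impossible by construction: once `close()` has run, no code path can create another producer.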
Improve descriptive information in Kafka protocol documentation.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
This PR introduces the DescribeGroups v6 API as part of KIP-1043. This adds an error message for the described groups so that it is possible to get some context on the error. It also changes the behaviour for when the group ID cannot be found by returning error code GROUP_ID_NOT_FOUND rather than NONE.
Reviewers: David Jacot <djacot@confluent.io>
Covers wrapping of processors and state stores for KStream-KTable joins
Reviewers: Almog Gavra <almog@responsive.dev>, Guozhang Wang <guozhang.wang.us@gmail.com>
Minor followup to #17929 based on this discussion
Also includes some very minor refactoring/renaming on the side. The only real change is in the KGroupedStreamImpl class
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>
Currently, KTable foreign key joins only allow extracting the foreign key from the value of the source record. This forces users to duplicate data that might already exist in the key into the value when the foreign key needs to be derived from both the key and value. This leads to:
- Data duplication
- Additional storage overhead
- Potential data inconsistency if the duplicated data gets out of sync
- Less intuitive API when the foreign key is naturally derived from both key and value
This change allows users to extract the foreign key from both the key and the value of the source record.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
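To illustrate the foreign-key change, here is a small sketch of an extractor that derives the foreign key from both the record key and the record value. The `OrderKey` and `OrderValue` types and the `tenantId:productId` key layout are hypothetical, and modelling the extractor as a `java.util.function.BiFunction` is an assumption about the shape of the new overloads rather than something stated in this message.

```java
import java.util.function.BiFunction;

// Hypothetical record types; only the extractor shape matters here.
final class ForeignKeyFromKeyAndValue {
    static final class OrderKey {
        final String tenantId;
        final String orderId;

        OrderKey(final String tenantId, final String orderId) {
            this.tenantId = tenantId;
            this.orderId = orderId;
        }
    }

    static final class OrderValue {
        final String productId;
        final int quantity;

        OrderValue(final String productId, final int quantity) {
            this.productId = productId;
            this.quantity = quantity;
        }
    }

    // The tenant lives only in the key and the product only in the value,
    // so nothing needs to be duplicated into the value to build the foreign key.
    static final BiFunction<OrderKey, OrderValue, String> FOREIGN_KEY =
        (key, value) -> key.tenantId + ":" + value.productId;
}
```

With a value-only extractor, `tenantId` would have to be copied into `OrderValue`, which is the duplication the change is meant to avoid.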
As part of KIP-1112, to maximize the utility of the new ProcessorWrapper, we need to migrate the DSL operators to the new method of attaching state stores by implementing ProcessorSupplier#stores, which makes these stores available for inspection by the user's wrapper.
This PR covers the aggregate operator for both KStream and KTable.
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Rohan Desai <rohan@responsive.dev>
This PR is part of the implementation for KIP-1112 (KAFKA-18026). In order to have DSL operators be properly wrapped by the interface suggested in KIP-1112, we need to make sure they all use the ConnectedStoreProvider#stores method to connect stores instead of manually calling addStateStore.
This is a refactor only; there is no new behavior.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
The KStreamRepartitionIntegrationTest.shouldThrowAnExceptionWhenNumberOfPartitionsOfRepartitionOperationDoNotMatchSourceTopicWhenJoining test was taking two minutes due to not reaching an expected condition. By updating the StreamsUncaughtExceptionHandler to return a response of SHUTDOWN_CLIENT, the expected ERROR state is now reached. The root cause was using the Thread.UncaughtExceptionHandler to handle the exception.
Without this fix, the test takes 2 minutes to run; with it, the test takes 1 second.
Reviewers: Matthias Sax <mjsax@apache.org>
Noticed that RangeQueryIntegrationTest was taking approximately 20-30 minutes to run.
A deep dive into the logs showed consumer rebalancing errors, with the test stuck in a loop.
It seems that, because the same application.id was used across tests, the Kafka Streams application failed to track its state correctly across rebalances.
Reviewers: Bill Bejeck <bbejeck@apache.org>
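One way to avoid sharing an application.id across tests, sketched under assumptions: derive the id from the test name. The helper name `streamsConfigFor`, the `range-query-` prefix, and the sanitizing rule are hypothetical; only the `application.id` config key comes from Kafka Streams.

```java
import java.util.Properties;

// Hypothetical test helper; not taken from RangeQueryIntegrationTest.
final class UniqueAppIdConfig {
    // Builds a per-test config so that no two tests share an application.id.
    static Properties streamsConfigFor(final String testName) {
        final Properties props = new Properties();
        // Replace characters outside [A-Za-z0-9._-] with '_' so the id stays safe to embed in names.
        final String safeName = testName.replaceAll("[^A-Za-z0-9._-]", "_");
        props.setProperty("application.id", "range-query-" + safeName);
        return props;
    }
}
```

Because each test gets its own id, a slow shutdown in one test can no longer interfere with the consumer group of the next.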
KIP-1043 introduced Admin.listGroups as the way to list all types of groups. As a result, Admin.listShareGroups has been removed. This PR is the final step of the removal.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
- Deprecates OffsetResetStrategy enum
- Adds new internal class AutoOffsetResetStrategy
- Replaces all OffsetResetStrategy enum usages with AutoOffsetResetStrategy
- Deprecates old constructors and adds new constructors to MockConsumer
Reviewers: Andrew Schofield <aschofield@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This PR includes the API for KIP-1112 and a partial implementation, which wraps any processors added through the PAPI and the DSL processors that are written to the topology through the ProcessorParameters#addProcessorTo method.
Further PRs will complete the implementation by converting the remaining DSL operators to use the #addProcessorTo method, and by future-proofing the processor writing mechanism to prevent new DSL operators from being implemented incorrectly or without the wrapper.
Reviewers: Almog Gavra <almog@responsive.dev>, Guozhang Wang <guozhang.wang.us@gmail.com>
When Kafka Streams skips over corrupted messages, it might not resume previously paused partitions
if more than one record is skipped at once and the buffer drops below the max-buffer limit at the same time.
Reviewers: Matthias J. Sax <matthias@confluent.io>
With KAFKA-17872, we changed some internals that affect the conditions
of this test, introducing a race condition when the expected log
messages are printed.
This PR adds additional wait-conditions to the test to close the race
condition.
Reviewers: Bill Bejeck <bill@confluent.io>
This PR updates the key for storing the KIP-714 client instance id for the global consumer to follow a more consistent pattern of the other embedded Kafka Streams consumer clients.
Reviewers: Matthias Sax <mjsax@apache.org>
Following KIP-1033, a FailedProcessingException is passed to the Streams-specific uncaught exception handler.
The goal of this PR is to unwrap a FailedProcessingException into a StreamsException when an exception occurs during the flushing or closing of a store.
Reviewer: Bruno Cadonna <cadonna@apache.org>
The thread that evaluates the gauge for the oldest-iterator-open-since-ms runs concurrently
with threads that open/close iterators (stream threads and interactive query threads). This PR
fixes a race condition between `openIterators.isEmpty()` and `openIterators.first()`, by catching
a potential exception. Because we expect the race condition to be rare, we prefer catching the
exception over introducing a guard via locking.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
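The catch-instead-of-lock approach from the iterator gauge fix can be sketched as follows. This is an illustration under assumptions: the class name `OldestOpenIteratorGauge` is hypothetical, the set holds plain timestamps rather than iterator objects, and returning `null` for "no open iterator" is a choice made for this sketch.

```java
import java.util.NavigableSet;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical gauge holder; the set stores open-since timestamps in milliseconds.
final class OldestOpenIteratorGauge {
    private final NavigableSet<Long> openIterators = new ConcurrentSkipListSet<>();

    void opened(final long timestampMs) {
        openIterators.add(timestampMs);
    }

    void closed(final long timestampMs) {
        openIterators.remove(timestampMs);
    }

    // Called from the metrics thread; returns null when no iterator is open.
    Long oldestOpenSinceMs() {
        try {
            return openIterators.isEmpty() ? null : openIterators.first();
        } catch (final NoSuchElementException e) {
            // Another thread removed the last element between isEmpty() and first().
            // The race is expected to be rare, so catching is cheaper than locking every call.
            return null;
        }
    }
}
```

`first()` on an empty `ConcurrentSkipListSet` throws `NoSuchElementException`, which is why the check-then-act pair is unsafe without either the catch or a lock.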
This commit adds AK 3.9 to the system tests on trunk.
Follow-up of #17797
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bruno Cadonna <cadonna@apache.org>
Remove or update custom display names to make sure we actually include the test method as the first part of the display name.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bill Bejeck <bill@confluent.io>
Implementation of https://cwiki.apache.org/confluence/display/KAFKA/KIP-1102%3A+Enable+clients+to+rebootstrap+based+on+timeout+or+error+code
- Introduces rebootstrap trigger interval config metadata.recovery.rebootstrap.trigger.ms, set to 5 minutes by default
- Makes rebootstrap the default for metadata.recovery.strategy
- Adds new error code REBOOTSTRAP_REQUIRED, introduces top-level error code in metadata response. On this error, clients rebootstrap.
- Configs apply to producers, consumers, share consumers, admin clients, Connect and KStreams clients.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
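As an illustration of the rebootstrap settings listed above, the following sketch builds client properties that set them explicitly. The two `metadata.recovery.*` keys and the 5-minute value come from the description; the helper name `clientProps` and the `bootstrap.servers` value passed in are assumptions. Since these values are described as the new defaults, setting them explicitly is only needed when overriding or documenting them.

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;

// Hypothetical helper that spells out the rebootstrap-related client settings.
final class RebootstrapClientConfig {
    static Properties clientProps(final String bootstrapServers) {
        final Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Rebootstrap instead of giving up when metadata cannot be recovered.
        props.setProperty("metadata.recovery.strategy", "rebootstrap");
        // Trigger interval described above: 5 minutes, expressed in milliseconds.
        props.setProperty("metadata.recovery.rebootstrap.trigger.ms",
            Long.toString(TimeUnit.MINUTES.toMillis(5)));
        return props;
    }
}
```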