Minor followup to #18195 that I split out into a separate PR since that one was getting a bit long. Should be rebased & reviewed after that one is merged.
Introduces a new class for windowed graph nodes with a grace period defined to improve (slightly) the type safety
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Almog Gavra <almog@responsive.dev>
Avoiding 'WARN Method #getTimestampedKeyValueStore() should be
used to access a TimestampedKeyValueStore.' running
KTableKTableForeignKeyJoinIntegrationTest.
Reviewers: Bill Bejeck <bill@confluent.io>
RocksDBTimeOrderedKeyValueBuffer is not initialize with serdes provides
via Joined, but always uses serdes from StreamsConfig.
Reviewers: Bill Bejeck <bill@confluent.io>
TransactionAbortedException is a follow up error to a previous error,
and such a previous error would already be handled when
`producer.abortTransaction()` is called. Thus, a
TransactionAbortedException can just be silently swallowed.
Reviewers: Bill Bejeck <bill@confluent.io>
The current log is really helpful, this PR adds a bit more information to that log to help debug some issues. In particular, it is interesting to be able to debug situations that have long intervals between polls. It also includes a reference to how long it has been since it last logged so you don't have to find the previous time it was logged to compute quick per-second ratios.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR fixed a race-condition in KafkaStreamsTest, by replacing a non-deterministic
sleep with a CountDownLatch to fully control how long a thread blocks.
Reviewers: David Arthur <mumrah@gmail.com>, Matthias J. Sax <matthias@confluent.io>
This PR upgrades RocksDB from 7.9.2 to 9.7.3 and addresses the following compatibility issues introduced by the RocksDB upgrade:
- Removal of AccessHint: The AccessHint class was completely removed in RocksDB 9.7.3. This required removing all import statements, variable declarations, method parameters, method return types, and static method calls related to AccessHint in RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapter.java RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest.java Unused methods are removed in RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapter.java
- Removal of NO_FILE_CLOSES: The NO_FILE_CLOSES metric was also removed in RocksDB 9.7.3. The calculation for numberOfOpenFiles in RocksDBMetricsRecorder.java has been adjusted to now track the total number of file opens since the last reset. The previous calculation, which subtracted NO_FILE_CLOSES from NO_FILE_OPENS, is no longer possible. The reason RocksDB removed NO_FILE_CLOSES seems to be that it did not properly work: https://github.com/search?q=repo%3Afacebook%2Frocksdb+NO_FILE_CLOSES&type=issues
- Removal of methods related to compressed block cache configuration in BlockBasedTableConfig
- Change of the signature of org.rocksdb.Options.setLogger()
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Matthias J. Sax <matthias@confluent.io>, Bruno Cadonna <cadonna@apache.org>
See https://issues.apache.org/jira/browse/KAFKA-18326 for more information. The main bug here is that in the old implementation, deleted cache entries would be skipped so long as they didn't equal the next store key, which resulted in potentially skipping tombstones for future keys in the store.
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Relevant methods:
1. `List.of`, `Set.of`, `Map.of` and similar (introduced in Java 9)
2. Optional: `isEmpty` (introduced in Java 11), `stream` (introduced in Java 9).
Reviewers: Mickael Maison <mimaison@users.noreply.github.com>
Because of how we have to wrap StoreFactory and StoreBuilder layers on top of each other for various parts of the topology building process, we need to make sure both of these are capable of configuration and will check for & delegate to an underlying layer if it exists
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>
We remove the deprecated overload of StateStore.init() and thus do not
need to cast any longer. This PR removes all unnecessary casts, and
additionally cleans ups all related classed to reduce warnings.
Reviewers: Bill Bejeck <bill@confluent.io>
Reviewers: John Huang <pegasashjy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Bill Bejeck <bill@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
The tests are flaky because the tests end before the verified calls
are executed. This happens because the state updater thread executes
the verified calls, but the thread that executes the
tests with the verifications is a different thread.
This commit fixes the flaky tests by enusring that the calls were
performed by the state updater by either shutting down the state
updater or waiting for the condition.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Migrates KTableSuppressProcessorSupplier to use the the ProcessorSupplier#stores() method
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This patch transitions the KTable#filter implementation to provide the materialized store via the ProcessorSupplier so that it can be wrapped by the processor wrapper if the wrapper is configured
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
KAFKA-18026: supply stores for KTable#mapValues using ProcessorSupplier#stores
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Appears to be a typo in the code, since the error message indicates this check is for taskId being null, but instead we accidentally check the streams metrics twice
Reviewers: Matthias Sax <mjsax@apache.org>, runo Cadonna <cadonna@apache.org>, Lucas Brutschy <lbrutschy@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
This pull request replaces Log4j with Log4j2 across the entire project, including dependencies, configurations, and code. The notable changes are listed below:
1. Introduce Log4j2 Instead of Log4j
2. Change Configuration File Format from Properties to YAML
3. Adds warnings to notify users if they are still using Log4j properties, encouraging them to transition to Log4j2 configurations
Co-authored-by: Lee Dongjin <dongjin@apache.org>
Reviewers: Luke Chen <showuon@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Covers wrapping of processors and state stores for KStream-KStream joins.
Includes self-joins and the spurious results fix optimization
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>
Covers wrapping of processors and state stores for StreamToTableSource
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Covers wrapping of processors and state stores for KTable-KTable joins
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Guozhang Wang <guozhang.wang.us@gmail.com>
Part of KIP-1106.
Adds support for "by_duration" and "none" reset strategy
to the Kafka Streams runtime.
Reviewers: Bill Bejeck <bill@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Part of KIP-1106.
Adds the public APIs to Kafka Streams, to support the the newly added "by_duration" reset policy,
plus adds the missing "none" reset policy. Deprecates the enum `Topology.AutoOffsetReset` and
all related methods, and replaced them with new overload using the new `AutoOffsetReset` class.
Co-authored-by: Matthias J. Sax <matthias@confluent.io>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
To avoid leaking producers, we should add a 'closedflag toStreamProducer` indicating whether we should reset prouder.
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Improve descriptive information in Kafka protocol documentation.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>
This PR introduces the DescribeGroups v6 API as part of KIP-1043. This adds an error message for the described groups so that it is possible to get some context on the error. It also changes the behaviour for when the group ID cannot be found but returning error code GROUP_ID_NOT_FOUND rather than NONE.
Reviewers: David Jacot <djacot@confluent.io>
Covers wrapping of processors and state stores for KStream-KTable joins
Reviewers: Almog Gavra <almog@responsive.dev>, Guozhang Wang <guozhang.wang.us@gmail.com>
Minor followup to #17929 based on this discussion
Also includes some very minor refactoring/renaming on the side. The only real change is in the KGroupedStreamImpl class
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>
Currently, KTable foreign key joins only allow extracting the foreign key from the value of the source record. This forces users to duplicate data that might already exist in the key into the value when the foreign key needs to be derived from both the key and value. This leads to:
- Data duplication
- Additional storage overhead
- Potential data inconsistency if the duplicated data gets out of sync
- Less intuitive API when the foreign key is naturally derived from both key and value
This change allows user to extract the foreign key from the key and value of the source record.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
As part of KIP-1112, to maximize the utility of the new ProcessorWrapper, we need to migrate the DSL operators to the new method of attaching state stores by implementing ProcessorSupplier#stores, which makes these stores available for inspection by the user's wrapper.
This PR covers the aggregate operator for both KStream and KTable.
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Rohan Desai <rohan@responsive.dev>
This PR is part of the implementation for KIP-1112 (KAFKA-18026). In order to have DSL operators be properly wrapped by the interface suggestion in 1112, we need to make sure they all use the ConnectedStoreProvider#stores method to connect stores instead of manually calling addStateStore.
This is a refactor only, there is no new behaviors.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
The KStreamRepartitionIntegrationTest.shouldThrowAnExceptionWhenNumberOfPartitionsOfRepartitionOperationDoNotMatchSourceTopicWhenJoining test was taking two minutes due not reaching an expected condition. By updating to have the StreamsUncaughtExceptionHandler return a response of SHUTDOWN_CLIENT the expected ERROR state is now reached. The root cause was using the Thread.UncaughtExceptionHandler to handle the exception.
Without this fix, the test takes 2 minutes to run, and now it's 1 second.
Reviewers: Matthias Sax <mjsax@apache.org>
Noticed that RangeQueryIntegrationTest is taking ~approx 20 - 30min to run
Upon deep dive in logs, noticed that there were error for consumer rebalancing and test was stuck in loop
Seems like due to same application.id across tests, Kafka Streams application is failing to track its state correctly across rebalances.
Reviewers: Bill Bejeck <bbejeck@apache.org>
KIP-1043 introduced Admin.listGroups as the way to list all types of groups. As a result, Admin.listShareGroups has been removed. This PR is the final step of the removal.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
- Deprecates OffsetResetStrategy enum
- Adds new internal class AutoOffsetResetStrategy
- Replaces all OffsetResetStrategy enum usages with AutoOffsetResetStrategy
- Deprecate old/Add new constructors to MockConsumer
Reviewers: Andrew Schofield <aschofield@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This PR includes the API for KIP-1112 and a partial implementation, which wraps any processors added through the PAPI and the DSL processors that are written to the topology through the ProcessorParameters#addProcessorTo method.
Further PRs will complete the implementation by converting the remaining DSL operators to using the #addProcessorTo method, and future-proof the processor writing mechanism to prevent new DSL operators from being implemented incorrectly/without the wrapper
Reviewers: Almog Gavra <almog@responsive.dev>, Guozhang Wang <guozhang.wang.us@gmail.com>
When Kafka Streams skips overs corrupted messages, it might not resume previously paused partitions,
if more than one record is skipped at once, and if the buffer drop below the max-buffer limit at the same time.
Reviewers: Matthias J. Sax <matthias@confluent.io>
With KAFKA-17872, we changed some internals that effects the conditions
of this test, introducing a race condition when the expected log
messages are printed.
This PR adds additional wait-conditions to the test to close the race
condition.
Reviewers: Bill Bejeck <bill@confluent.io>
This PR updates the key for storing the KIP-714 client instance id for the global consumer to follow a more consistent pattern of the other embedded Kafka Streams consumer clients.
Reviewers: Matthias Sax <mjsax@apache.org>
Following the KIP-1033 a FailedProcessingException is passed to the Streams-specific uncaught exception handler.
The goal of the PR is to unwrap a FailedProcessingException into a StreamsException when an exception occurs during the flushing or closing of a store
Reviewer: Bruno Cadonna <cadonna@apache.org>
The thread that evaluates the gauge for the oldest-iterator-open-since-ms runs concurrently
with threads that open/close iterators (stream threads and interactive query threads). This PR
fixed a race condition between `openIterators.isEmpty()` and `openIterators.first()`, by catching
a potential exception. Because we except the race condition to be rare, we rather catch the
exception in favor of introducing a guard via locking.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This commit adds AK 3.9 to the system tests on trunk.
Follow-up of #17797
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bruno Cadonna <cadonna@apache.org>
Remove or update custom display names to make sure we actually include the test method as the first part of the display name.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bill Bejeck <bill@confluent.io>
Implementation of https://cwiki.apache.org/confluence/display/KAFKA/KIP-1102%3A+Enable+clients+to+rebootstrap+based+on+timeout+or+error+code
- Introduces rebootstrap trigger interval config metadata.recovery.rebootstrap.trigger.ms, set to 5 minutes by default
- Makes rebootstrap the default for metadata.recovery.strategy
- Adds new error code REBOOTSTRAP_REQUIRED, introduces top-level error code in metadata response. On this error, clients rebootstrap.
- Configs apply to producers, consumers, share consumers, admin clients, Connect and KStreams clients.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
The logs for Kafka Streams local state restoration incorrectly refer to the StreamThread instead of the StateUpdater thread, which is responsible for decoupling the restoration process. The restore consumer also references StreamThread instead of StateUpdater.
This commit corrects the log message for more clarity.
Reviewer: Bruno Cadonna <cadonna@apache.org>
When we introduced "startup tasks" in #16922, we initialized them with
no input partitions, because they aren't known until assignment.
However, when we update them during assignment, it's possible that we
update the topology with the incorrect source topics for some internal
topics, due to a difference in the way internal topics are handled for
StandbyTasks.
To resolve this, we now initialize startup tasks with the correct input
partitions, by calculating them from the Topology.
When we assign our startup tasks, we now conditionally update their
input partitions only if they've actually changed, just as we do for
regular StandbyTasks.
With this, the E2E tests now pass, as expected.
Reviewer: Bruno Cadonna <cadonna@apache.org>
TimestampExtractor allows to drop records by returning a timestamp of -1. For this case, we still need to update consumed offsets to allows us to commit progress.
Reviewers: Bill Bejeck <bill@confluent.io>
Fixes an issue with the TTD in the specific case where users don't specify an initial time for the driver and also don't specify a start timestamp for the TestInputTopic, then pipe input records without timestamps. This combination results in a slight mismatch in the expected timestamps for the piped records, which can be noticeable when writing tests with very small time deltas.
The problem is that, while both the TTD and the TestInputTopic will be initialized to the "current time" when not otherwise specified, it's possible for some milliseconds to have passed between the creation of the TTD and the creation of the TestInputTopic. This can result in a TestInputTopic getting a start timestamp that's several ms larger than the driver's time, and ultimately causing the piped input records to have timestamps slightly in the future relative to the driver.
In practice even those who hit this issue might not notice it if they aren't manipulating time in their tests, or are advancing time by enough to negate the several-milliseconds of difference. However we noticed a test fail due to this because we were testing a ttl-based processor and had advanced the driver time by only 1 millisecond past the ttl. The piped record should have been expired, but because it's timestamp was a few milliseconds longer than the driver's start time, this test ended up failing.
Reviewers: Matthias Sax <mjsax@apache.org>, Bruno Cadonna <cadonna@apache.org>, Lucas Brutschy < lbrutschy@confluent.io>
Kafka Streams actively purges records from repartition topics. Prior to this PR, Kafka Streams would retrieve the offset from the consumedOffsets map, but here are a couple of edge cases where the consumedOffsets can get ahead of the commitedOffsets map. In these cases, this means Kafka Streams will potentially purge a repartition record before it's committed.
Updated the current StreamTask test to cover this case
Reviewers: Matthias Sax <mjsax@apache.org>
Implementation of KIP-1076 to allow for adding client application metrics to the KIP-714 framework
Reviewers: Apoorv Mittal <amittal@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Matthias Sax <mjsax@apache.org>
Updates KafkaStreams.clientInstanceIds method to correctly populate the client-id -> clientInstanceId map that was altered in a previous refactoring.
Added a test that confirms ClientInstanceIds is correctly storing consumer and producer instance ids
Reviewers: Matthias Sax <mjsax@apache.org>