Each KafkaStreams instance maintains a map from threadId to state
to use to aggregate to a KafkaStreams app state. The map is updated
on every state change, and when a new thread is created. State change
updates are done in a synchronized blocks, however the update that
happens on thread creation is not, which can raise
ConcurrentModificationException. This patch moves this update
into the listener object and protects it using the object's lock.
It also moves ownership of the state map into the listener so that
its less likely that future changes access it without locking
Reviewers: Matthias J. Sax <matthias@confluent.io>
The new default standby task assignment in TaskAssignment should only assign standby tasks for changelogged tasks, not all stateful tasks.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
KStreamKStreamJoin was recently refactored to be type safe with regard to left and right value-types. This PR updates the corresponding test to use different value types instead of the same one to improve test coverage.
Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
To avoid confusion in 3.8/until we fully remove all the old task assignors and internal config, we should rename the old internal assignor classes like the StickyTaskAssignor so that they won't be mixed up with the new version of the assignor (which is also named StickyTaskAssignor)
Reviewers: Bruno Cadonna <cadonna@apache.org>, Josep Prat <josep.prat@aiven.io>
Since the new public API for TaskAssignor shared a name, this rename will prevent users from confusing the internal definition with the public one.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Also moved the assignment validation test from StreamsPartitionAssignorTest to TaskAssignmentUtilsTest.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
We have already enabled the state updater by default once.
However, we ran into issues that forced us to disable it again.
We think that we fixed those issues. So we want to enable the
state updater again by default.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>
We now provide a way to more easily customize the rack aware
optimizations that we provide by way of a configuration class called
RackAwareOptimizationParams.
We also simplified the APIs for the optimizeXYZ utility functions since
they were mutating the inputs anyway.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.
This PR brings ProcessingExceptionHandler in Streams configuration.
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>
Reviewer: Bruno Cadonna <cadonna@apache.org>
The StreamsConfig class was not parsing the new task assignment
configuration flag properly, which made it impossible to properly
configure a custom task assignor.
This PR fixes this and adds a bit of INFO logging to help users diagnose
assignor misconfiguration issues.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Added more testing for the StickyTaskAssignor, which includes a large-scale test with rack aware enabled.
Also added a no-op change to StreamsAssignmentScaleTest.java to allow for rack aware optimization testing.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR updates all of the streams task assignment code to use the new AssignmentConfigs public class.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
A task does not know anything about a produce error thrown
by a different task. That might lead to a InvalidTxnStateException
when a task attempts to do a transactional operation on a producer
that failed due to a different task.
This commit stores the produce exception in the streams producer
on completion of a send instead of the record collector since the
record collector is on task level whereas the stream producer
is on stream thread level. Since all tasks use the same streams
producer the error should be correctly propagated across tasks
of the same stream thread.
For EOS alpha, this commit does not change anything because
each task uses its own producer. The send error is still
on task level but so is also the transaction.
Reviewers: Matthias J. Sax <matthias@confluent.io>
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.
This PR brings ProcessingExceptionHandler interface and default implementations.
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>
Reviewer: Bruno Cadonna <cadonna@apache.org>
Fixed the calculation of the store name list based on the subtopology being accessed.
Also added a new test to make sure this new functionality works as intended.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR takes care of making the call back toTaskAssignor.onAssignmentComputed.
It also contains a change to the public AssignmentConfigs API, as well as some simplifications of the StickyTaskAssignor.
This PR also changes the rack information fetching to happen lazily in the case where the TaskAssignor makes its decisions without said rack information.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This fills in the implementation details of the standby task assignment utility functions within TaskAssignmentUtils.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.
This PR brings ProcessingExceptionHandler in Streams configuration.
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>
Reviewer: Bruno Cadonna <cadonna@apache.org>
This PR implements the rack aware optimization of active tasks that can be used by the assignors themselves. It takes in the full output of the assignment and tries to reorganize it so as to minimize cross-rack traffic.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR adds the logic and wiring necessary to make the callback to
TaskAssignor::onAssignmentComputed with the necessary parameters.
We also fixed some log statements in the actual assignment error
computation, as well as modified the ApplicationState::allTasks method
to return a Map instead of a Set of TaskInfos.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.
This PR brings ProcessingExceptionHandler interface and default implementations.
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>
Reviewer: Bruno Cadonna <cadonna@apache.org>
The introduced changes provide a cleaner definition of the join side in KStreamKStreamJoin. Before, this was done by using a Boolean flag, which led to returning a raw LeftOrRightValue without generic arguments because the generic type arguments depended on the boolean input.
Reviewers: Greg Harris <greg.harris@aiven.io>, Bruno Cadonna <cadonna@apache.org>
This PR adds the post-processing of the TaskAssignment to figure out if the new assignment is valid, and return an AssignmentError otherwise.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR creates the new public config of KIP-924 in StreamsConfig and uses it to instantiate user-created TaskAssignors. If such a TaskAssignor is found and successfully created we then use that assignor to perform the task assignment, otherwise we revert back to the pre KIP-924 world with the internal task assignors.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Almog Gavra <almog@responsive.dev>
Part of [KIP-989](https://cwiki.apache.org/confluence/x/9KCzDw).
This new `StateStore` metric tracks the timestamp that the oldest
surviving Iterator was created.
This timestamp should continue to climb, and closely track the current
time, as old iterators are closed and new ones created. If the timestamp
remains very low (i.e. old), that suggests an Iterator has leaked, which
should enable users to isolate the affected store.
It will report no data when there are no currently open Iterators.
Reviewers: Matthias J. Sax <matthias@confluent.io>
`LongAdder` performs better than `AtomicInteger` when under contention
from many threads. Since it's possible that many Interactive Query
threads could create a large number of `KeyValueIterator`s, we don't
want contention on a metric to be a performance bottleneck.
The trade-off is memory, as `LongAdder` uses more memory to space out
independent counters across different cache lines. In practice, I don't
expect this to cause too many problems, as we're only constructing 1
per-store.
Reviewers: Matthias J. Sax <matthias@confluent.io>
This PR uses the new TaskTopicPartition structure to simplify the build
process for the ApplicationState, which is the input to the new
TaskAssignor#assign call.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Part of [KIP-989](https://cwiki.apache.org/confluence/x/9KCzDw).
This new `StateStore` metric tracks the average and maximum amount of
time between creating and closing Iterators.
Iterators with very high durations can indicate to users performance
problems that should be addressed.
If a store reports no data for these metrics, despite the user opening
Iterators on the store, it suggests those iterators are not being
closed, and have therefore leaked.
Reviewers: Matthias J. Sax <matthias@confluent.io>
For task assignment purposes, the user needs to have a set of information available for each topic partition affecting the desired tasks.
This PR introduces a new interface for a read-only container class that allows all the important and relevant information to be found in one place.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Rack aware graphs don't actually need any topology information about the system, but rather require a simple ordered (not sorted) grouping of tasks.
This PR changes the internal constructors and some interface signatures of RackAwareGraphConstructor and its implementations to allow reuse by future components that may not have access to the actual subtopology information.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This rack information is required to compute rack-aware assignments, which many of the current assigners do.
The internal ClientMetadata class was also edited to pass around this rack information.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Part of [KIP-989](https://cwiki.apache.org/confluence/x/9KCzDw).
This new `StateStore` metric tracks the number of `Iterator` instances
that have been created, but not yet closed (via `AutoCloseable#close()`).
This will aid users in detecting leaked iterators, which can cause major
performance problems.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This PR creates the required methods to post-process the result of TaskAssignor.assign into the required ClientMetadata map. This allows most of the internal logic to remain intact after the user's assignment code runs.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Added unit tests of two processors included in foreignKey join : SubscriptionSendProcessorSupplier and ForeignTableJoinProcessorSupplier.
Renamed ForeignTableJoinProcessorSupplierTest to SubscriptionJoinProcessorSupplierTest as that's the processor which the test class is testing.
Reviewers: Walker Carlson <wcarlson@apache.org>