Commit Graph

3175 Commits

Author SHA1 Message Date
Sebastien Viale f1ef7a5a9f
KAFKA-16448: Handle processing exceptions in punctuate (#16300)
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR actually catches processing exceptions from punctuate.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-31 15:53:47 -07:00
Loïc GREFFIER 0eb9ac2bd0
KAFKA-16448: Unify class cast exception handling for both key and value (#16736)
Part of KIP-1033. Minor code cleanup.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-07-31 13:24:15 -07:00
Matthias J. Sax 8f2679bebf
MINOR: update SuppressionDurabilityIntegrationTest (#16740)
Refactor test to move off deprecated `transform()` in favor of `process()`.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-31 11:28:09 -07:00
Matthias J. Sax 6224feee65
MINOR: update StandbyTaskCreationIntegrationTest (#16700)
Refactor test to move off deprecated `transform()` in favor of
`process()`.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-31 11:27:38 -07:00
Matthias J. Sax 08b3d9f37b
MINOR: update KStreamAggregationIntegrationTest (#16699)
Refactor test to move off deprecated `transform()` in favor of
`process()`.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-31 11:27:09 -07:00
Matthias J. Sax d6a41ac3ca
MINOR: update EosV2UpgradeIntegrationTest (#16698)
Refactor test to move off deprecated `transform()` in favor of
`process()`.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-31 11:26:25 -07:00
Matthias J. Sax 1528264f02
MINOR: update EosIntegrationTest (#16697)
Refactor test to move off deprecated `transform()` in favor of
`process()`.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-31 11:25:44 -07:00
Sebastien Viale 0dc9b9e4ee
KAFKA-16448: Handle fatal user exception during processing error (#16675)
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR catch the exceptions thrown while handling a processing exception

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-30 22:58:07 -07:00
Matthias J. Sax e9d8109658
MINOR: simplify code which calles `Punctuator.punctuate()` (#16725)
Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-30 17:10:40 -07:00
PoAn Yang fd2cd046f8
KAFKA-17203 StreamThread leaking producer instances (#16730)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-31 04:23:32 +08:00
Matthias J. Sax 683b8a2bee
MINOR: update AdjustStreamThreadCountTest (#16696)
Refactor test to move off deprecated `transform()` in favor of
`process()`.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-30 12:47:51 -07:00
Chung, Ming-Yen 7c0a96d08d
KAFKA-17185 Declare Loggers as static to prevent multiple logger instances (#16680)
As discussed in #16657 (comment) , we should make logger as static to avoid creating multiple logger instances.
I use the regex private.*Logger.*LoggerFactory to search and check all the results if certain logs need to be static.

There are some exceptions that loggers don't need to be static:
1) The logger in the inner class. Since java8 doesn't support static field in the inner class.
        https://github.com/apache/kafka/blob/trunk/clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetchRequestManagerTest.java#L3676

2) Custom loggers for each instance (non-static + non-final). In this case, multiple logger instances is actually really needed.
        https://github.com/apache/kafka/blob/trunk/storage/src/test/java/org/apache/kafka/server/log/remote/storage/LocalTieredStorage.java#L166

3) The logger is initialized in constructor by LogContext. Many non-static but with final modifier loggers are in this category, that's why I use .*LoggerFactory to only check the loggers that are assigned initial value when declaration.
    
4) protected final Logger log = Logger.getLogger(getClass())
    This is for subclass can do logging with subclass name instead of superclass name.
    But in this case, if the log access modifier is private, the purpose cannot be achieved since subclass cannot access the log defined in superclass. So if access modifier is private, we can replace getClass() with <className>.class

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-31 02:37:36 +08:00
Josep Prat e299a006c8
KAFKA-17214: Add 3.8.0 version to streams system tests (#16728)
* KAFKA-17214: Add 3.8.0 version to streams system tests

Reviewers: Bill Bejeck <bbejeck@gmail.com>
2024-07-30 19:04:38 +02:00
Matthias J. Sax 3c580e25bf
MINOR: update flaky CustomHandlerIntegrationTest (#16710)
This PR reduces the MAX_BLOCK_MS config which defaults to 60sec to
10sec, to avoid a race condition with the 60sec test timeout.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-30 08:13:59 -07:00
Matthias J. Sax 3db4a78167 HOTFIX: fix compilation error 2024-07-29 21:07:49 -07:00
Sebastien Viale faaef527d7
KAFKA-16448: Add ErrorHandlerContext in deserialization exception handler (#16432)
This PR is part of KIP1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR expose the new ErrorHandlerContext as a parameter to the Deserialization exception handlers and deprecate the previous handle signature.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-29 20:33:33 -07:00
Sebastien Viale b6d5f0556c
KAFKA-16448: Add ErrorHandlerContext in production exception handler (#16433)
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR expose the new ErrorHandlerContext as a parameter to the Production exception handler and deprecate the previous handle signature.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-29 20:17:15 -07:00
Colin P. McCabe 0ec520a2af Bump trunk to 4.0.0-SNAPSHOT 2024-07-29 15:51:54 -07:00
Matthias J. Sax b6c1cb0eec
MINOR: update CachingPersistentWindowStoreTest (#16701)
Refactor test to move off deprecated `transform()` in favor of
`process()`.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-29 12:45:13 -07:00
TengYao Chi b348b556be
KAFKA-17202 surround consumer with try-resource statement (#16702)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-30 01:06:11 +08:00
Kirk True d260b06180
KAFKA-17060 Rename LegacyKafkaConsumer to ClassicKafkaConsumer (#16683)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-29 20:56:23 +08:00
TengYao Chi a07294a732
KAFKA-17204: KafkaStreamsCloseOptionsIntegrationTest.before leaks AdminClient (#16692)
To avoid a resource leak, we need to close the AdminClient after the test.

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2024-07-26 10:32:39 -07:00
Sebastien Viale 09be14bb09
KAFKA-16448: Fix processing exception handler (#16663)
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Minor code improvements across different classed, related to the `ProcessingExceptionHandler` implementation (KIP-1033).

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-25 16:17:31 -07:00
dujian0068 7efb58f321
KAFKA-16584 Make log processing summary configurable or debug (#16509)
KAFKA-16584 Make log processing summary configurable or debug

Reviewers: Matthias Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@apache.org>
2024-07-23 16:09:25 -04:00
PoAn Yang defcbb51ee
KAFKA-17082 replace kafka.utils.LogCaptureAppender with org.apache.kafka.common.utils.LogCaptureAppender (#16601)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-21 18:22:05 +08:00
Loïc GREFFIER 4de83d38c9
KAFKA-16448: Catch and handle processing exceptions (#16093)
This PR is part of KAFKA-16448 (KIP-1033) which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR actually catches processing exceptions.

Authors:
@Dabz
@sebastienviale
@loicgreffier

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Bruno Cadonna <cadonna@apache.org>
2024-07-19 11:24:25 +02:00
Matthias J. Sax a9b2b36908
MINOR: Add more debug logging to EOSUncleanShutdownIntegrationTest (#16490)
Reviewers: Bruno Cadonna <bruno@confluent.io>
2024-07-14 10:28:31 -07:00
Bruno Cadonna a0309be7f4
KAFKA-17098: Re-add task to state updater if transit to RUNNING fails (#16570)
When Streams tries to transit a restored active task to RUNNING, the
first thing it does is getting the committed offsets for this task.
If getting the offsets expires a timeout, Streams does not re-throw
the error initially, but tries to get the committed offsets later
until a Streams-specific timeout is hit.

Restored active tasks from the state updater are removed from the
output queue of the restored tasks in the state updater. If a
timeout occurs, the restored task is neither added to the
task registry nor re-added to the state updater. The task is lost
since it is not maintained anywhere. This means the task is also
not closed. When the same task is created again on the same
stream thread since the stream thread does not know about this
lost task, the state stores are opened again and RocksDB will
throw the "No locks available" error.

This commit re-adds the task to the state updater if the
committed request times out.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-07-11 17:45:56 +02:00
Bruno Cadonna d7fbdcfd83
KAFKA-17085: Handle tasks in state updater before tasks in task registry (#16555) (#16561)
When a active tasks are revoked they land as suspended tasks in the
task registry. If they are then reassigned, the tasks are resumed
and put into restoration. On assignment, we first handle the tasks
in the task registry and then the tasks in the state updater. That
means that if a task is re-assigned after a revocation, we remove
the suspended task from the task registry, resume it, add it
to the state updater, and then remove it from the list of tasks
to create. After that we iterate over the tasks in the state
updater and remove from there the tasks that are not in the list
of tasks to create. However, now the state updater contains the
resumed tasks that we removed from the task registry before but
are no more in the list of tasks to create. In other words, we
remove the resumed tasks from the state updater and close them
although we just got them assigned.

This commit ensures that we first handle the tasks in the
state updater and then the tasks in the task registry.

Cherry-pick of 4ecbb75

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-07-11 10:41:55 +02:00
Bruno Cadonna 25230b5388
KAFKA-10199: Close pending active tasks to init on partitions lost (#16545) (#16550)
With enabled state updater tasks that are created but not initialized
are stored in a set. In each poll iteration the stream thread drains
that set, intializes the tasks, and adds them to the state updater.

On partition lost, all active tasks are closed.

This commit ensures that active tasks pending initialization in
the set mentioned above are closed cleanly on partition lost.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-07-08 21:14:15 +02:00
Bill Bejeck 20e101c2e4
KAFKA-16991: Flaky PurgeRepartitionTopicIntegrationTest (#16503)
When the PurgeRepartitionTopicintegrationTest was written, the InitialTaskDelayMs was hard-coded on the broker requiring setting a timeout in the test to wait for the delay to expire. But I believe this creates a race condition where the test times out before the broker deletes the inactive segment. PR #15719 introduced an internal config to control the IntitialTaskDelayMs config for speeding up tests, and this PR leverages this internal config to reduce the task delay to 0 to eliminate this race condition.
2024-07-03 10:02:28 -04:00
abhi-ksolves 6897b06b03
KAFKA-3346 Rename Mode to ConnectionMode (#16403)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-03 02:46:04 +08:00
Alieh Saeedi 15a4501bde
KAFKA-16508: Streams custom handler should handle the timeout exceptions (#16450)
For a non-existing output topic, Kafka Streams ends up in an infinite retry loop, because the returned TimeoutException extends RetriableException.

This PR updates the error handling pass for this case and instead of retrying calls the ProductionExceptionHandler to allow breaking the infinite retry loop.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-30 11:52:36 -07:00
Matthias J. Sax dc7c9ad068
MINOR: pass in timeout to Admin.close() (#16422)
Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Apoorv Mittal <amittal@confluent.io>, Bruno Cadonna <bruno@confluent.io>
2024-06-27 12:14:05 -07:00
gongxuanzhang 1040d78372
KAFKA-10787 Apply spotless to all module (#16467)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-27 20:50:06 +08:00
Ken Huang 9b4f13efbc
KAFKA-15623 Remove junit 4 from stream module (#16447)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-27 15:11:32 +08:00
dujian0068 b07688b063
MINOR: Improve TaskAssignor#onAssignmentComputed() javadoc (#16434) 2024-06-25 22:28:36 +08:00
PoAn Yang db1c8a80c4
KAFKA-15623 (5/N) Migrate KafkaStreamsTest to JUnit 5 (#16424)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-25 18:18:49 +08:00
TingIāu "Ting" Kì 0e346d3103
KAFKA-15623 (4/N) Migrate streams tests (processor) module to JUnit 5 (#16396)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-24 13:06:35 +08:00
Ken Huang 2d692e4506
KAFKA-15623 (3/N) Migrate test of Stream module to Junit5 (Stream state) (#16356)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-21 02:17:36 +08:00
Kuan-Po (Cooper) Tseng 26b9163487
KAFKA-15623 (2/N) Migrate remaining tests in streams module to JUnit 5 (integration & internals) (#16360)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-20 09:24:44 +08:00
Rohan a0bfc64470
KAFKA-15774: use the default dsl store supplier for fkj subscriptions (#16380)
Foreign key joins have an additional "internal" state store used for subscriptions, which is not exposed for configuration via Materialized or StoreBuilder which means there is no way to plug in a different store implementation via the DSL operator. However, we should respect the configured default dsl store supplier if one is configured, to allow these stores to be customized and conform to the store type selection logic used for other DSL operator stores

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-19 00:32:59 -07:00
gongxuanzhang 5a331acad0
KAFKA-10787 apply spotless to `streams:examples` and `streams-scala` (#16378)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-18 18:56:46 +08:00
dujian0068 823d6f7555
KAFKA-16958 add STRICT_STUBS to EndToEndLatencyTest, OffsetCommitCallbackInvokerTest, ProducerPerformanceTest, and TopologyTest (#16348)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-18 18:51:43 +08:00
TingIāu "Ting" Kì 92d8d4bd1f
KAFKA-16970 Fix hash implementation of `ScramCredentialValue`, `ScramCredentialData`, and `ContextualRecord` (#16359)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-17 22:29:22 +08:00
gongxuanzhang 09b4ef416a
KAFKA-10787 Apply spotless to `stream:test-utils` and `:streams:upgrade-system-tests-xxxx` (#16357)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-17 19:48:44 +08:00
Rohan 9a239c6142
KAFKA-16955: fix synchronization of streams threadState (#16337)
Each KafkaStreams instance maintains a map from threadId to state
to use to aggregate to a KafkaStreams app state. The map is updated
on every state change, and when a new thread is created. State change
updates are done in a synchronized blocks, however the update that
happens on thread creation is not, which can raise
ConcurrentModificationException. This patch moves this update
into the listener object and protects it using the object's lock.
It also moves ownership of the state map into the listener so that
its less likely that future changes access it without locking

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-14 10:44:36 -07:00
Antoine Pourchet 8afa5e74ac
KAFKA-15045: (KIP-924 pt. 26) default standby task assignment nit (#16331)
The new default standby task assignment in TaskAssignment should only assign standby tasks for changelogged tasks, not all stateful tasks.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-14 00:08:31 -07:00
PoAn Yang c32693d556
KAFKA-15623 (1/N) Migrate streams tests module to JUnit 5 (#16254)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-14 12:54:11 +08:00
Greg Harris dfe0fcfe28
MINOR: Change KStreamKstreamOuterJoinTest to use distinct left and right types (#15513)
KStreamKStreamJoin was recently refactored to be type safe with regard to left and right value-types. This PR updates the corresponding test to use different value types instead of the same one to improve test coverage.

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-06-13 18:32:25 -07:00
dujian0068 133f2b0f31
KAFKA-16879 SystemTime should use singleton mode (#16266)
Reviewers: Greg Harris <gharris1727@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-14 08:49:19 +08:00
A. Sophie Blee-Goldman 4333af5c9f
KAFKA-15045: (KIP-924 pt. 25) Rename old internal StickyTaskAssignor to LegacyStickyTaskAssignor (#16322)
To avoid confusion in 3.8/until we fully remove all the old task assignors and internal config, we should rename the old internal assignor classes like the StickyTaskAssignor so that they won't be mixed up with the new version of the assignor (which is also named StickyTaskAssignor)

Reviewers: Bruno Cadonna <cadonna@apache.org>, Josep Prat <josep.prat@aiven.io>
2024-06-13 11:27:50 -07:00
gongxuanzhang 596b945072
KAFKA-16643 Add ModifierOrder checkstyle rule (#15890)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-06-13 15:39:32 +08:00
Antoine Pourchet 103ff5c0f0
KAFKA-15045: (KIP-924 pt. 24) internal TaskAssignor rename to LegacyTaskAssignor (#16318)
Since the new public API for TaskAssignor shared a name, this rename will prevent users from confusing the internal definition with the public one.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-13 00:32:39 -07:00
Antoine Pourchet 16e2b68b73
KAFKA-15045: (KIP-924 pt. 23) More TaskAssignmentUtils tests (#16292)
Also moved the assignment validation test from StreamsPartitionAssignorTest to TaskAssignmentUtilsTest.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-12 14:25:47 -07:00
Bruno Cadonna 39ffdea6d3
KAFKA-10199: Enable state updater by default (#16107)
We have already enabled the state updater by default once.
However, we ran into issues that forced us to disable it again.
We think that we fixed those issues. So we want to enable the
state updater again by default.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-06-12 07:51:38 +02:00
Antoine Pourchet 0782232fbe
KAFKA-15045: (KIP-924 pt. 22) Add RackAwareOptimizationParams and other minor TaskAssignmentUtils changes (#16294)
We now provide a way to more easily customize the rack aware
optimizations that we provide by way of a configuration class called
RackAwareOptimizationParams.

We also simplified the APIs for the optimizeXYZ utility functions since
they were mutating the inputs anyway.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-11 21:31:43 -07:00
Antoine Pourchet 8b6013f851
KAFKA-15045: (KIP-924 pt. 21) UUID to ProcessId migration (#16269)
This PR changes the assignment process to use ProcessId instead of UUID.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-11 12:24:44 -07:00
Loïc GREFFIER 2533a07ad0
KAFKA-16448: Add ProcessingExceptionHandler in Streams configuration (#16188)
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR brings ProcessingExceptionHandler in Streams configuration.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-06-10 13:23:10 +02:00
Antoine Pourchet 34afdb9627
KAFKA-15045: (KIP-924 pt. 20) Custom task assignment configuration fix (#16245)
The StreamsConfig class was not parsing the new task assignment
configuration flag properly, which made it impossible to properly
configure a custom task assignor.

This PR fixes this and adds a bit of INFO logging to help users diagnose
assignor misconfiguration issues.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-08 19:55:13 -07:00
Antoine Pourchet 824900b97a
KAFKA-15045: (KIP-924 pt. 18) Better assignment testing (#16201)
Added more testing for the StickyTaskAssignor, which includes a large-scale test with rack aware enabled.

Also added a no-op change to StreamsAssignmentScaleTest.java to allow for rack aware optimization testing.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-08 19:48:43 -07:00
Antoine Pourchet ee834d9214
KAFKA-15045: (KIP-924 pt. 19) Update to new AssignmentConfigs (#16219)
This PR updates all of the streams task assignment code to use the new AssignmentConfigs public class.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-06 14:20:48 -07:00
Bruno Cadonna 8a2bc3a221
KAFKA-16903: Consider produce error of different task (#16222)
A task does not know anything about a produce error thrown
by a different task. That might lead to a InvalidTxnStateException
when a task attempts to do a transactional operation on a producer
that failed due to a different task.

This commit stores the produce exception in the streams producer
on completion of a send instead of the record collector since the
record collector is on task level whereas the stream producer
is on stream thread level. Since all tasks use the same streams
producer the error should be correctly propagated across tasks
of the same stream thread.

For EOS alpha, this commit does not change anything because
each task uses its own producer. The send error is still
on task level but so is also the transaction.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-06-06 12:19:52 -07:00
Loïc GREFFIER ebe1e964ff
KAFKA-16448: Add ProcessingExceptionHandler interface and implementations (#16187)
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR brings ProcessingExceptionHandler interface and default implementations.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-06-06 13:40:31 +02:00
Ayoub Omari 1134520dec
KAFKA-16573: Specify node and store where serdes are needed (#15790)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-05 15:05:04 -07:00
Antoine Pourchet 0109a3f718
KAFKA-15045: (KIP-924 pt. 17) State store computation fixed (#16194)
Fixed the calculation of the store name list based on the subtopology being accessed.

Also added a new test to make sure this new functionality works as intended.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-05 13:09:37 -07:00
Antoine Pourchet 5ce4b91dfa
KAFKA-15045: (KIP-924 pt. 16) TaskAssignor.onAssignmentComputed handling (#16147)
This PR takes care of making the call back toTaskAssignor.onAssignmentComputed.

It also contains a change to the public AssignmentConfigs API, as well as some simplifications of the StickyTaskAssignor.

This PR also changes the rack information fetching to happen lazily in the case where the TaskAssignor makes its decisions without said rack information.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-06-04 11:33:48 -07:00
Josep Prat 7e81cc5e68
MINOR: Bump trunk to 3.9.0-SNAPSHOT (#16150)
Signed-off-by: Josep Prat <josep.prat@aiven.io>

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 16:41:44 +02:00
Bruno Cadonna 76d1f18e42
Revert "KAFKA-16448: Add ProcessingExceptionHandler interface and implementations (#16090)" (#16142)
This reverts commit 8d11d95795.

We decided to not release KIP-1033 with AK 3.8

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-31 09:56:36 +02:00
Antoine Pourchet 370e5ea1f8
KAFKA-15045: (KIP-924 pt. 15) Implement #defaultStandbyTaskAssignment and finish rack-aware standby optimization (#16129)
This fills in the implementation details of the standby task assignment utility functions within TaskAssignmentUtils.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-30 15:11:33 -07:00
Bruno Cadonna fea3eeb7f7
Revert "KAFKA-16448: Add ProcessingExceptionHandler in Streams configuration (#16092)" (#16141)
This reverts commit 3f70c46874.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-30 17:52:07 +02:00
Loïc GREFFIER 3f70c46874
KAFKA-16448: Add ProcessingExceptionHandler in Streams configuration (#16092)
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR brings ProcessingExceptionHandler in Streams configuration.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-05-30 10:39:38 +02:00
Antoine Pourchet 5c08ee0062
KAFKA-15045: (KIP-924 pt. 9) TaskAssignmentUtils implementation of optimizeRackAwareActiveTasks (#16033)
This PR implements the rack aware optimization of active tasks that can be used by the assignors themselves. It takes in the full output of the assignment and tries to reorganize it so as to minimize cross-rack traffic.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-29 16:11:37 -07:00
Antoine Pourchet cc269b0d43
KAFKA-15045: (KIP-924 pt. 14) Callback to TaskAssignor::onAssignmentComputed (#16123)
This PR adds the logic and wiring necessary to make the callback to
TaskAssignor::onAssignmentComputed with the necessary parameters.

We also fixed some log statements in the actual assignment error
computation, as well as modified the ApplicationState::allTasks method
to return a Map instead of a Set of TaskInfos.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-29 13:15:02 -07:00
Loïc GREFFIER 8d11d95795
KAFKA-16448: Add ProcessingExceptionHandler interface and implementations (#16090)
This PR is part of KAFKA-16448 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR brings ProcessingExceptionHandler interface and default implementations.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: sebastienviale <sebastien.viale@michelin.com>

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-05-29 14:09:22 +02:00
Ramin Gharib b73f4798a4
KAFKA-16362: Fix type-unsafety in KStreamKStreamJoin caused by isLeftSide (#15601)
The introduced changes provide a cleaner definition of the join side in KStreamKStreamJoin. Before, this was done by using a Boolean flag, which led to returning a raw LeftOrRightValue without generic arguments because the generic type arguments depended on the boolean input.

Reviewers: Greg Harris <greg.harris@aiven.io>, Bruno Cadonna <cadonna@apache.org>
2024-05-29 13:12:54 +02:00
A. Sophie Blee-Goldman 9562143f08
HOTFIX: remove unnecessary list creation (#16117)
Removing a redundant list declaration in the new StickyTaskAssignor implementation

Reviewers: Antoine Pourchet <antoine@responsive.dev>
2024-05-28 21:35:02 -07:00
Antoine Pourchet d64e3fbb2b
KAFKA-15045: (KIP-924 pt. 13) AssignmentError calculation added (#16114)
This PR adds the post-processing of the TaskAssignment to figure out if the new assignment is valid, and return an AssignmentError otherwise.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-28 19:43:30 -07:00
Antoine Pourchet 8d243dfbd4
KAFKA-15045: (KIP-924 pt. 12) Wiring in new assignment configs and logic (#16074)
This PR creates the new public config of KIP-924 in StreamsConfig and uses it to instantiate user-created TaskAssignors. If such a TaskAssignor is found and successfully created we then use that assignor to perform the task assignment, otherwise we revert back to the pre KIP-924 world with the internal task assignors.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Almog Gavra <almog@responsive.dev>
2024-05-28 18:01:18 -07:00
Antoine Pourchet 56ee1392e8
KAFKA-15045: (KIP-924 pt. 11) Implemented StickyTaskAssignor (#16052)
This PR implements the StickyTaskAssignor with the new KIP 924 API.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-28 17:05:51 -07:00
Nick Telford 59ba555142
KAFKA-15541: Add oldest-iterator-open-since-ms metric (#16041)
Part of [KIP-989](https://cwiki.apache.org/confluence/x/9KCzDw).

This new `StateStore` metric tracks the timestamp that the oldest
surviving Iterator was created.

This timestamp should continue to climb, and closely track the current
time, as old iterators are closed and new ones created. If the timestamp
remains very low (i.e. old), that suggests an Iterator has leaked, which
should enable users to isolate the affected store.

It will report no data when there are no currently open Iterators.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-05-28 16:23:23 -07:00
Omnia Ibrahim 64f699aeea
KAFKA-15853: Move general configs out of KafkaConfig (#16040)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-28 16:22:54 +02:00
Nick Telford d9ee9c96dd
KAFKA-15541: Use LongAdder instead of AtomicInteger (#16076)
`LongAdder` performs better than `AtomicInteger` when under contention
from many threads. Since it's possible that many Interactive Query
threads could create a large number of `KeyValueIterator`s, we don't
want contention on a metric to be a performance bottleneck.

The trade-off is memory, as `LongAdder` uses more memory to space out
independent counters across different cache lines. In practice, I don't
expect this to cause too many problems, as we're only constructing 1
per-store.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-05-25 12:22:56 -07:00
Kuan-Po (Cooper) Tseng de3202832d
KAFKA-16828 RackAwareTaskAssignorTest failed (#16044)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-24 05:25:53 +08:00
Antoine Pourchet 93238ae312
KAFKA-15045: (KIP-924 pt. 10) Topic partition rack annotation simplified (#16034)
This PR uses the new TaskTopicPartition structure to simplify the build
process for the ApplicationState, which is the input to the new
TaskAssignor#assign call.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-23 12:45:29 -07:00
Nick Telford bef83ce89b
KAFKA-15541: Add iterator-duration metrics (#16028)
Part of [KIP-989](https://cwiki.apache.org/confluence/x/9KCzDw).

This new `StateStore` metric tracks the average and maximum amount of
time between creating and closing Iterators.

Iterators with very high durations can indicate to users performance
problems that should be addressed.

If a store reports no data for these metrics, despite the user opening
Iterators on the store, it suggests those iterators are not being
closed, and have therefore leaked.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-05-22 21:34:31 -07:00
Antoine Pourchet 06739d5aa0
KAFKA-15045: (KIP-924 pt. 8) Added TopicPartitionAssignmentInfo (#16024)
For task assignment purposes, the user needs to have a set of information available for each topic partition affecting the desired tasks.

This PR introduces a new interface for a read-only container class that allows all the important and relevant information to be found in one place.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-22 15:52:53 -07:00
Antoine Pourchet 27a6c156c4
KAFKA-15045: (KIP-924 pt. 7) Simplify requirements for rack aware graphs (#16004)
Rack aware graphs don't actually need any topology information about the system, but rather require a simple ordered (not sorted) grouping of tasks.

This PR changes the internal constructors and some interface signatures of RackAwareGraphConstructor and its implementations to allow reuse by future components that may not have access to the actual subtopology information.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-22 13:25:18 -07:00
Antoine Pourchet ef2c5e41a5
KAFKA-15045: (KIP-924 pt. 5) Add rack information to ApplicationState (#15972)
This rack information is required to compute rack-aware assignments, which many of the current assigners do.

The internal ClientMetadata class was also edited to pass around this rack information.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-22 13:24:26 -07:00
Nick Telford 5552f5c26d
KAFKA-15541: Add num-open-iterators metric (#15975)
Part of [KIP-989](https://cwiki.apache.org/confluence/x/9KCzDw).

This new `StateStore` metric tracks the number of `Iterator` instances
that have been created, but not yet closed (via `AutoCloseable#close()`).

This will aid users in detecting leaked iterators, which can cause major
performance problems.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-05-21 23:29:50 -07:00
Antoine Pourchet 6339e3a6bf
KAFKA-15045: (KIP-924 pt. 6) Post process new assignment structure (#16002)
This PR creates the required methods to post-process the result of TaskAssignor.assign into the required ClientMetadata map. This allows most of the internal logic to remain intact after the user's assignment code runs.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-21 13:14:39 -07:00
Ayoub Omari 4cc99cbf3f
KAFKA-16343: Add unit tests of foreignKeyJoin classes (#15564)
Added unit tests of two processors included in foreignKey join : SubscriptionSendProcessorSupplier and ForeignTableJoinProcessorSupplier.
Renamed ForeignTableJoinProcessorSupplierTest to SubscriptionJoinProcessorSupplierTest as that's the processor which the test class is testing.

Reviewers: Walker Carlson <wcarlson@apache.org>
2024-05-21 14:23:04 -05:00
Bruno Cadonna 69fc4c5da4
MINOR: Migrate tests in o.a.k.streams to JUnit 5 (except KafkaStreamsTest) (#15942)
Reviewers: Christo Lolov <lolovc@amazon.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-18 03:31:05 +08:00
Bruno Cadonna c58c21cc20
KAFKA-16774: Delete flaky test since it is redundant (#15978)
The test shouldCloseAllTaskProducersOnCloseIfEosEnabled() in
StreamThreadTest is flaky with a concurrent modification exception.
The concurrent modification exception is caused by the test itself
because it starts a stream thread and at the same time the thread
that executes the test calls methods on the stream thread. The stream
thread was not designed for such a concurrency.
The tests verifies that under EOS the streams producer are closed
during shutdown. Actually the test is not needed since we already
have a test that verifies that when the stream thread shuts down
also the task manager shuts down and for the tasks manager we have
tests that verify that the producers are closed when the task manager
shuts down.

This commit verifies that those tests are run under EOS and ALOS.

Reviewer: Matthias J. Sax <matthias@confluent.io>
2024-05-17 08:36:07 +02:00
Antoine Pourchet fafa3c76dc
KAFKA-15045: (KIP-924 pt. 4) Generify rack graph solving utilities (#15956)
The graph solving utilities are currently hardcoded to work with ClientState, but don't actually depend on anything in those state classes.

This change allows the MinTrafficGraphConstructor and BalanceSubtopologyGraphConstructor to be reused with KafkaStreamsStates instead.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Almog Gavra <almog@responsive.dev>
2024-05-16 11:37:59 -07:00
Bruno Cadonna ba19eedb90
KAFKA-7342: Migrate tests in remaining packages in o.a.k.streams (#15963)
Migrates tests in the following packages (excluding subpackages)
to JUnit 5:
- org.apache.kafka.streams.internals
- org.apache.kafka.streams.kstream
- org.apache.kafka.streams.processor
- org.apache.kafka.streams.query
- org.apache.kafka.streams.state
- org.apache.kafka.streams.tests
- org.apache.kafka.streams.utils

Reviewer: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-15 20:28:34 +02:00
Bruno Cadonna d2e6c86632
KAFKA-10199: Remove queue-based remove from state updater (#15896)
Removes the unused remove operation from the state updater
that asynchronously removed tasks and put them into an
output queue.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-15 09:48:33 +02:00
Mickael Maison 34ec3fac15
MINOR: Fix warnings in streams javadoc (#15955)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-15 12:51:47 +08:00
Antoine Pourchet cb968845ec
KAFKA-15045: (KIP-924 pt. 3) Implement KafkaStreamsAssignment (#15944)
This PR changes KafkaStreamsAssignment from an interface to a container class, and implements said class.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-14 17:33:52 -07:00
Antoine Pourchet 0c5e8d3966
KAFKA-15045: (KIP-924 pt. 2) Implement ApplicationState and KafkaStreamsState (#15920)
This PR implements read-only container classes for ApplicationState and KafkaStreamsState, and initializes those within StreamsPartitionAssignor#assign.

New internal methods were also added to the ClientState to easily pass this data through to the KafkaStreamsState.

One test was added to check the lag sorting within the implementation of KafkaStreamsState, which is the counterpart to the test that existed for the ClientState class.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-14 17:29:54 -07:00
Walker Carlson 57d30d3450
KAFKA-16699: Have Streams treat InvalidPidMappingException like a ProducerFencedException (#15919)
KStreams is able to handle the ProducerFenced (among other errors) cleanly. It does this by closing the task dirty and triggering a rebalance amongst the worker threads to rejoin the group. The producer is also recreated. Due to how streams works (writing to and reading from various topics), the application is able to figure out the last thing the fenced producer completed and continue from there. InvalidPidMappingException should be treated the same way.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Justine Olshan <jolshan@confluent.io>
2024-05-14 11:07:55 -05:00
Bruno Cadonna 5439914c32
KAFKA-10199: Shutdown with new remove operation in state updater (#15894)
Uses the new remove operation of the state updater that returns
a future to shutdown the task manager.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-13 14:33:58 +02:00
Bruno Cadonna cfffe4e2e8
KAFKA-10199: Handle assignment with new remove operation in state updater (#15882)
Uses the new remove operation of the state updater that returns
a future to handle task assignment.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-13 11:12:30 +02:00
Christo Lolov 4bece0131f
KAFKA-14133 Move StreamTaskTest to Mockito (#14716)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-10 11:07:06 +08:00
Lucia Cerchie 31528f581d
KAFKA-15307: update/note deprecated configs (#14360)
Configs default.windowed.value.serde.inner and default.windowed.key.serde.inner
were replace with windowed.inner.class.serde. This PR updates the docs accordingly,
plus a few more side cleanups.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-05-09 19:46:00 -07:00
ChickenchickenLove ff6d01c90f
KAFKA-15951: MissingSourceTopicException should include topic names (#15573)
MissingSourceTopicException should contain the name of the missing topic.
There is one corner case for which we don't have the topic name at hand, but we can log the topic
name somewhere else.

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-05-09 19:35:36 -07:00
Antoine Pourchet 8fd6596454
KAFKA-15045: (KIP-924) New interfaces and stubbed utility classes for pluggable TaskAssignors. (#15887)
This is the first PR in a sequence to support custom task assignors in Kafka Streams, which was described in KIP 924. It creates and exposes all of the interfaces that will need to be implemented during the refactor of the current task assignment logic.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-05-09 19:14:53 -07:00
Ayoub Omari 29f3260a9c
MINOR: Fix streams javadoc links (#15900)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-09 10:52:49 +08:00
Bruno Cadonna f7b242f94e
KAFKA-10199: Revoke tasks from state updater with new remove (#15871)
Uses the new remove operation of the state updater that returns
a future to remove revoked tasks from the state updater.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-08 09:53:58 +02:00
Bruno Cadonna cb35ddc5ca
KAFKA-10199: Remove lost tasks in state updater with new remove (#15870)
Uses the new remove operation of the state updater that returns a future to remove lost tasks from the state udpater.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-07 14:26:23 +02:00
Matthias J. Sax d76352e215
MINOR: log newly created processId (#15851)
Reviewers: Colt McNealy <colt@littlehorse.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-07 14:14:35 +08:00
Bruno Cadonna 366aeab488
KAFKA-10199: Add remove operation with future to state updater (#15852)
Adds a remove operation to the state updater that returns a future
instead of adding the removed tasks to an output queue. Code that
uses the state updater can then wait on the future.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-06 11:27:40 +02:00
Chia Chuan Yu 55a00be4e9
MINOR: Replaced Utils.join() with JDK API. (#15823)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-05-06 15:13:01 +08:00
Bruno Cadonna 240243b91d
KAFKA-10199: Accept only one task per element in output queue for failed tasks (#15849)
Currently, the state updater writes multiple tasks per exception in the output
queue for failed tasks. To add the functionality to remove tasks synchronously
from the state updater, it is simpler that each element of the output queue for
failed tasks holds one single task.

This commit refactors the class that holds exceptions and failed tasks
in the state updater -- i.e., ExceptionAndTasks -- to just hold one single
task.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-03 10:52:12 +02:00
Matthias J. Sax 49587777c1
MINOR: fix timeouts of EosIntegrationTest (#15811)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-26 15:04:05 +08:00
Gaurav Narula 025f9816f1
MINOR: fix javadoc warnings (#15527)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-26 08:31:52 +08:00
TingIāu "Ting" Kì 864744ffd4
KAFKA-16610 Replace "Map#entrySet#forEach" by "Map#forEach" (#15795)
Reviewers: Apoorv Mittal <amittal@confluent.io>, Igor Soarez <soarez@apple.com>
2024-04-25 01:52:24 +01:00
Omnia Ibrahim 1b301b3020
KAFKA-15853 Move socket configs into org.apache.kafka.network.SocketServerConfigs (#15772)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-23 17:39:36 +08:00
Gyeongwon, Do e6b6b6c1c2
MINOR: Remove extra "}" logged in KafkaStreams close (#15783)
Reviewers: Igor Soarez <soarez@apple.com>
2024-04-23 09:45:36 +01:00
Omnia Ibrahim ecb2dd4cdc
KAFKA-15853 Move KafkaConfig log properties and docs out of core (#15569)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Nikolay <nizhikov@apache.org>, Federico Valeri <fvaleri@redhat.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-04-20 04:14:23 +08:00
Matthias J. Sax 5783159175
MINOR: disable internal result emit throttling in TTD (#15660)
Kafka Streams DSL operators use internal wall-clock based throttling
parameters for performance reasons. These configs make the usage of TTD
difficult: users need to advance the mocked wall-clock time in
their test code, or set these internal configs to zero.

To simplify testing, TDD should disable both configs automatically.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-04-18 18:49:10 -07:00
Calvin Liu 53ff1a5a58
KAFKA-15585: DescribeTopicPartitions client side change. (#15470)
Add the support for DescribeTopicPartitions API to AdminClient. For this initial implementation, we are simply loading all of the results into memory on the client side. 

Reviewers: Andrew Schofield <aschofield@confluent.io>, Kirk True <ktrue@confluent.io>, David Jacot <djacot@confluent.io>, Artem Livshits <alivshits@confluent.io>, David Arthur <mumrah@gmail.com>
2024-04-18 12:09:14 -04:00
Mickael Maison aee9724ee1
MINOR: Remove unneeded explicit type arguments (#15736)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-17 21:55:58 +02:00
Ayoub Omari e63efbc5d7
MINOR: fix duplicated return and other streams docs typos (#15713)
Reviewers: Johnny Hsu <johnnyhsu@fb.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-04-17 21:26:04 +08:00
Omnia Ibrahim 363f4d2847
KAFKA-15853 Move consumer group and group coordinator configs out of core (#15684)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-17 20:41:22 +08:00
Omnia Ibrahim 8c0458861c
KAFKA-15853 Move KafkaConfig Replication properties and docs out of … (#15575)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-16 15:28:35 +08:00
Omnia Ibrahim 61baa7ac6b
KAFKA-15853 Move transactions configs out of core (#15670)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-13 00:29:51 +08:00
Ayoub Omari 753cc4b7c8
MINOR: Update docs link to deprecated method TimeWindows.grace (#15700)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-12 11:16:10 +08:00
Bruno Cadonna 5c855be015
MINOR: Remove dead code of metric forward-rate (#15686)
Kafka Streams announced the removal  of metric forward-rate in
KIP-444 and removed it completely in AK 3.0. However, we forgot
to remove some code for this metric.
This commit removes the code to create the metric forward-rate.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2024-04-10 09:28:11 +02:00
Victor van den Hoven ee61bb721e
KAFKA-15417: move outerJoinBreak-flags out of the loop (#15510)
Follow up PR for https://github.com/apache/kafka/pull/14426 to fix a bug introduced by the previous PR.

Cf https://github.com/apache/kafka/pull/14426#discussion_r1518681146

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-04-02 06:46:54 -07:00
Walker Carlson 8b274d8c1b
KAFKA-7663: Reprocessing on user added global stores restore (#15414)
When custom processors are added via StreamBuilder#addGlobalStore they will now reprocess all records through the custom transformer instead of loading directly.

We do this so that users that transform the records will not get improperly formatted records down stream.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-03-28 10:30:18 -05:00
Nikolay 355873aa54
MINOR: Use CONFIG suffix in ZkConfigs (#15614)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
Co-authored-by: n.izhikov <n.izhikov@vk.team>
2024-03-28 15:52:34 +01:00
Nikolay 6f38fe5e0a
KAFKA-14588 ZK configuration moved to ZkConfig (#15075)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-27 22:37:01 +08:00
PoAn Yang 6f8d4fe26b
KAFKA-15949: Unify metadata.version format in log and error message (#15505)
There were different words for metadata.version like metadata version or metadataVersion. Unify format as metadata.version.

Reviewers: Luke Chen <showuon@gmail.com>
2024-03-26 20:09:29 +08:00
Christo Lolov 997ca14f80
KAFKA-14133: Move stateDirectory mock in TaskManagerTest to Mockito (#15254)
This pull requests migrates the StateDirectory mock in TaskManagerTest from EasyMock to Mockito.
The change is restricted to a single mock to minimize the scope and make it easier for review.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Bruno Cadonna <cadonna@apache.org>
2024-03-22 10:43:53 +01:00
Kuan-Po (Cooper) Tseng 12a1d85362
KAFKA-12187 replace assertTrue(obj instanceof X) with assertInstanceOf (#15512)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-20 10:36:25 +08:00
Matthias J. Sax 313574e329
MINOR: fix flaky EosIntegrationTest (#15494)
Bumping some timeout due to slow Jenkins build.

Reviewers: Bruno Cadonna <bruno@confluent.io>
2024-03-16 18:06:45 -07:00
A. Sophie Blee-Goldman 96bfac4216
MINOR: Update javadocs and exception string in "deprecated" ProcessorRecordContext#hashcode (#15508)
This PR updates the javadocs for the "deprecated" hashCode() method of ProcessorRecordContext, as well as the UnsupportedOperationException thrown in its implementation, to actually explain why the class is mutable and therefore unsafe for use in hash collections. They now point out the mutable field in the class (namely the Headers)

Reviewers: Matthias Sax <mjsax@apache.org>, Bruno Cadonna <cadonna@apache.org>
2024-03-14 23:08:39 -07:00
Matthias J. Sax 612a1fe1bb
MINOR: Kafka Streams docs fixes (#15517)
- add missing section to TOC
- add default value for client.id

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Bruno Cadonna <bruno@confluent.io>
2024-03-13 21:54:06 -07:00
Cheryl Simmons 2c613b2d42
MINOR: Tweak streams config doc (#15518)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-03-12 15:21:43 -07:00
Christo Lolov 8b72a2c72f
KAFKA-14133: Move consumer mock in TaskManagerTest to Mockito - part 3 (#15497)
The previous pull request in this series was #15261.

This pull request continues the migration of the consumer mock in TaskManagerTest test by test for easier reviews.

The next pull request in the series will be #15254 which ought to complete the Mockito migration for the TaskManagerTest class

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-03-11 12:51:20 +01:00
Matthias J. Sax 2fcafbd497 HOTFIX: fix html markup 2024-03-08 14:28:39 -08:00
Matthias J. Sax 861fe68cee TRIVIAL: fix typo 2024-03-08 13:52:33 -08:00
Daan Gerits b9a5b4a805
KAFKA-10892: Shared Readonly State Stores ( revisited ) (#12742)
Implements KIP-813.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Walker Carlson <wcarlson@confluent.io>
2024-03-08 10:57:56 -08:00
Matthias J. Sax d8dd068a62
KAFKA-15964: fix flaky StreamsAssignmentScaleTest (#15485)
This PR bumps some timeouts due to slow Jenkins builds.

Reviewers: Bruno Cadonna <bruno@confluent.io>
2024-03-07 09:17:52 -08:00
Christo Lolov a33c47ea4d
KAFKA-14133: Move consumer mock in TaskManagerTest to Mockito - part 2 (#15261)
The previous pull request in this series was #15112.

This pull request continues the migration of the consumer mock in TaskManagerTest test by test for easier reviews.

I envision there will be at least 1 more pull request to clean things up. For example, all calls to taskManager.setMainConsumer should be removed.

Reviewer: Bruno Cadonna <cadonna@apache.org>
2024-03-07 10:33:31 +01:00
Matthias J. Sax ccf4bd5f46
MINOR: Add 3.7 to Kafka Streams system tests (#15443)
Reviewers: Bruno Cadonna <bruno@confluent.io>
2024-03-06 12:02:58 -08:00
Matthias J. Sax 6d7b25bb25
KAFKA-15797: Fix flaky EOS_v2 upgrade test (#15449)
Originally, we set commit-interval to MAX_VALUE for this test,
to ensure we only commit expliclity. However, we needed to decrease it
later on when adding the tx-timeout verification.

We did see failing test for which commit-interval hit, resulting in
failing test runs. This PR increase the commit-interval close to
test-timeout to avoid commit-interval from triggering.

Reviewers: Bruno Cadonna <bruno@confluent.io>
2024-03-06 10:27:12 -08:00
Victor van den Hoven e81379d3fe
KAFKA-15417: flip joinSpuriousLookBackTimeMs and emit non-joined items (#14426)
Kafka Streams support asymmetric join windows. Depending on the window configuration
we need to compute window close time etc differently.

This PR flips `joinSpuriousLookBackTimeMs`, because they were not correct, and
introduced the `windowsAfterIntervalMs`-field that is used to find if emitting records can be skipped.

Reviewers: Hao Li <hli@confluent.io>, Guozhang Wang <guozhang.wang.us@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2024-03-05 17:06:20 -08:00
Nikolay eea369af94
KAFKA-14588 Log cleaner configuration move to CleanerConfig (#15387)
In order to move ConfigCommand to tools we must move all it's dependencies which includes KafkaConfig and other core classes to java. This PR moves log cleaner configuration to CleanerConfig class of storage module.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-05 18:11:56 +08:00
Ayoub Omari 4f92a3f0af
KAFKA-14747: record discarded FK join subscription responses (#15395)
A foreign-key-join might drop a "subscription response" message, if the value-hash changed.
This PR adds support to record such event via the existing "dropped records" sensor.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-03-04 15:56:40 -08:00
Ayoub Omari 7dbdc15c66
KAFKA-15625: Do not flush global state store at each commit (#15361)
Global state stores are currently flushed at each commit, which may impact performance, especially for EOS (commit each 200ms).
The goal of this improvement is to flush global state stores only when the delta between the current offset and the last checkpointed offset exceeds a threshold.
This is the same logic we apply on local state store, with a threshold of 10000 records.
The implementation only flushes if the time interval elapsed and the threshold of 10000 records is exceeded.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Bruno Cadonna <cadonna@apache.org>
2024-03-04 10:19:59 +01:00
Ayoub Omari 907e945c0b
MINOR: fix SessionStore java doc (#15412)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-04 01:03:04 +08:00
Almog Gavra 1c9f360f4a
KAFKA-15215: migrate StreamedJoinTest to Mockito (#15424)
Migrate StreamedJoinTest to Mockito

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, 
Divij Vaidya <diviv@amazon.com>
2024-02-26 18:52:25 -08:00
Daan Gerits 06392f7ae2
MINOR: Update of the PAPI testing classes to the latest implementation (#12740)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-02-22 18:15:24 -08:00
Justine Olshan 661e41c92f
KAFKA-16302: Remove check for log message that is no longer present (fix builds) (#15422)
a3528a3 removed this log but not the test asserting it.

Builds are currently red because for some reason these tests can't retry. We should address that as a followup.

Reviewers:  Greg Harris <greg.harris@aiven.io>,  Matthias J. Sax <matthias@confluent.io>
2024-02-22 17:10:11 -08:00
Matthias J. Sax a3528a316f
MINOR: remove unnecessary logging (#15396)
We already record dropping record via metrics and logging at WARN level
is too noise. This PR removes the unnecessary logging.

Reviewers: Kalpesh Patel <kpatel@confluent.io>, Walker Carlson <wcarlson@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2024-02-21 08:01:11 -08:00
Lucas Brutschy fcbfd3412e
KAFKA-16284: Fix performance regression in RocksDB (#15393)
A performance regression introduced in commit 5bc3aa4 reduces the write performance in RocksDB by ~3x. The bug is that we fail to pass the WriteOptions that disable the write-ahead log into the DB accessor.

For testing, the time to write 10 times 1 Million records into one RocksDB each were measured:

Before 5bc3aa4: 7954ms, 12933ms
After 5bc3aa4: 30345ms, 31992ms
After 5bc3aa4 with this fix: 8040ms, 10563ms
On current trunk with this fix: 9508ms, 10441ms

Reviewers: Bruno Cadonna <bruno@confluent.io>, Nick Telford <nick.telford@gmail.com>
2024-02-21 09:01:53 +01:00
Matthias J. Sax 4c70581eb6
KAFKA-15770: IQv2 must return immutable position (#15219)
ConsistencyVectorIntegrationTest failed frequently because the return
Position from IQv2 is not immutable while the test assume immutability.
To return a Position with a QueryResult that does not change, we need to
deep copy the Position object.

Reviewers: John Roesler <john@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2024-02-20 12:24:32 -08:00
Minha, Jeong 553f45bca8
MINOR: Fix toString method of IsolationLevel (#14782)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Ashwin Pankaj <apankaj@confluent.io>
2024-02-15 19:07:18 -08:00
Mickael Maison 0bf830fc9c
KAFKA-14576: Move ConsoleConsumer to tools (#15274)
Reviewers: Josep Prat <josep.prat@aiven.io>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
2024-02-13 19:24:07 +01:00
Owen Leung d233eb98f7
KAFKA-14957: Update-Description-String (#13909)
HTML code for configs is auto-generated and for Kafka Streams config `state.dir` produces a confusing default value.
This PR adds a new property `alternativeString` to set a "default" value which should be rendered in HTML instead of the actual default value.

Reviewers: Manyanda Chitimbo <manyanda.chitimbo@gmail.com>, @eziosudo <eziosudo@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2024-02-10 12:46:51 -08:00
Kohei Nozaki 87c7fc0cd7
KAFKA-16055: Thread unsafe use of HashMap stored in QueryableStoreProvider#storeProviders (#15121)
This PR replaces a HashMap by a ConcurrentHashMap so that the local state store queries can be made from multiple threads. See this for additional context: https://lists.apache.org/thread/gpct1275bfqovlckptn3lvf683qpoxol

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Sagar Rao <sagarmeansocean@gmail.com>
2024-02-07 13:02:42 -08:00
Greg Harris 5f35b41e92
KAFKA-15834: Remove more leaky NamedTopologyIntegrationTest tests (#15243)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Anna Sophie Blee-Goldman <sophie@responsive.dev>
2024-02-05 11:39:47 -08:00
Lucas Brutschy a101d20c40
KAFKA-16220: Increase timeout in KTableKTableForeignKeyInnerJoinCustomPartitionerIntegrationTest (#15307)
The test is flaky, since sometimes one of the threads haven't processed a single record that
cause the ERROR state in test run (due to unlucky rebalancing).

Reviewers: Bruno Cadonna <bruno@confluent.io>
2024-02-02 15:07:53 +01:00
Matthias J. Sax 93712eca15
KAFKA-15594: Add version 3.6 to Kafka Streams system tests (#15151)
Reviewers: Walker Carlson <wcarlson@confluent.io>
2024-01-26 14:59:24 -08:00
Matthias J. Sax aaccf542d1
KAFKA-16141: Fix StreamsStandbyTask system test (#15217)
KAFKA-15629 added `TimestampedByteStore` interface to
`KeyValueToTimestampedKeyValueByteStoreAdapter` which break the restore
code path and thus some system tests.

This PR reverts this change for now.

Reviewers: Almog Gavra <almog.gavra@gmail.com>, Walker Carlson <wcarlson@confluent.io>
2024-01-19 09:23:42 -08:00
Matthias J. Sax 2dc3fff14a
KAFKA-16139: Fix StreamsUpgradeTest (#15207)
Adds version 3.6 to the possible values for config upgrade_from.

Reviewers: Bruno Cadonna <bruno@confluent.io>
2024-01-17 13:28:12 -08:00
Lucas Brutschy 26465c6409
KAFKA-16097: Disable state updater in trunk (#15204)
Several problems are still appearing while running 3.7 with
the state updater. This change will disable the state updater
by default also in trunk.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2024-01-17 15:25:26 +01:00
Bruno Cadonna e563aad4ee
KAFKA-16139: Fix StreamsUpgradeTest (#15199)
Adds version 3.5 to the possible values for config upgrade_from.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-01-16 17:25:33 -08:00
Greg Harris 055ff2b831
KAFKA-15834: Remove NamedTopologyIntegrationTest case which leaks clients (#15185)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Anna Sophie Blee-Goldman <sophie@responsive.dev>, Matthias J. Sax <matthias@confluent.io>
2024-01-16 10:20:39 -08:00
Nick Telford 8904317518
KAFKA-16089: Fix memory leak in RocksDBStore (#15174)
`ColumnFamilyDescriptor` is _not_ a `RocksObject`, which in theory means
it's not backed by any native memory allocated by RocksDB.

However, in practice, `ColumnFamilyHandle#getDescriptor()`, which
returns a `ColumnFamilyDescriptor`, allocates an internal
`rocksdb::db::ColumnFamilyDescriptor`, copying the name and handle of
the column family into it.

Since the Java `ColumnFamilyDescriptor` is not a `RocksObject`, it's not
possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling
`ColumnFamilyHandle#getDescriptor()`, since we're only interested in the
column family name, which is already available on
`ColumnFamilyHandle#getName()`, which does not leak memory.

We can also optimize away the temporary `Options`, which was previously a
source of memory leaks, because `userSpecifiedOptions` is an instance of
`Options`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-01-11 18:09:54 +01:00
Divij Vaidya 65424ab484
MINOR: New year code cleanup - include final keyword (#15072)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Sagar Rao <sagarmeansocean@gmail.com>
2024-01-11 17:53:35 +01:00
Bruno Cadonna fbbfafe1f5
KAFKA-16098: Verify pending recycle action when standby is re-assigned (#15168)
When a standby is recycled to an active and then re-assigned as
a standby again, it might happen that the recycling is still
pending when the standby is reassigned. That causes an illegal
state exception from the main consumer since the active task
that results from the recycling is actually not assigned to
the main consumer anymore, but it was re-assigned as a
standby in the most recent rebalance.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2024-01-10 17:59:06 +01:00
Christo Lolov 5a0a4c5a54
MINOR: Address occasional UnnecessaryStubbingException in StreamThreadTest (#15134)
Reviewers: Divij Vaidya <diviv@amazon.com>, Lucas Brutschy <lbrutschy@confluent.io>
2024-01-10 14:39:59 +01:00
Christo Lolov ee96935c60
KAFKA-14133: Migrate consumer mock in TaskManagerTest to Mockito (#15112)
Reviewers: Divij Vaidya <diviv@amazon.com>
2024-01-10 14:34:03 +01:00
Lucas Brutschy 0349f2310c
KAFKA-16097: Add suspended tasks back to the state updater when reassigned (#15163)
When a partition is revoked, the corresponding task gets a pending action
"SUSPEND". This pending action may overwrite a previous pending action.

If the task was previously removed from the state updater, e.g. because
we were fenced, the pending action is overwritten with suspend, and in
handleAssigned, upon reassignment of that task, then SUSPEND action is
removed.

Then, once the state updater executes the removal, no pending action
is registered anymore, and we run into an IllegalStateException.

This commit solves the problem by adding back reassigned tasks to the
state updater, since they may have been removed from the state updater
for another reason than being restored completely.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2024-01-10 10:21:38 +01:00
sanepal f1a0207cbb
KAFKA-16025: Fix orphaned locks when rebalancing and store cleanup race on unassigned task directories (#15088)
KAFKA-16025 describes the race condition sequence in detail. When this occurs, it can cause the impacted task's initializing to block indefinitely, blocking progress on the impacted task, and any other task assigned to the same stream thread. The fix I have implemented is pretty simple, simply re-check whether a directory is still empty after locking it during the start of rebalancing, and if it is, unlock it immediately. This preserves the idempotency of the method when it coincides with parallel state store cleanup executions.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-01-08 14:49:48 -08:00
Lucas Brutschy c0b6493455
KAFKA-16077: Streams with state updater fails to close task upon fencing (#15117)
* KAFKA-16077: Streams fails to close task after restoration when input partitions are updated

There is a race condition in the state updater that can cause the following:

 1. We have an active task in the state updater
 2. We get fenced. We recreate the producer, transactions now uninitialized. We ask the state updater to give back the task, add a pending action to close the task clean once it’s handed back
 3. We get a new assignment with updated input partitions. The task is still owned by the state updater, so we ask the state updater again to hand it back and add a pending action to update its input partition
 4. The task is handed back by the state updater. We update its input partitions but forget to close it clean (pending action was overwritten)
 5. Now the task is in an initialized state, but the underlying producer does not have transactions initialized

This can cause an IllegalStateException: `Invalid transition attempted from state UNINITIALIZED to state IN_TRANSACTION` when running in EOSv2.

To fix this, we introduce a new pending action CloseReviveAndUpdateInputPartitions that is added when we handle a new assignment with updated input partitions, but we still need to close the task before reopening it.

We should not remove the task twice, otherwise, we'll end up in this situation

1. We have an active task in the state updater
2. We get fenced. We recreate the producer, transactions now uninitialized. We ask the state updater to give back the task, add a pending action to close the task clean once it’s handed back
3. The state updater moves the task from the updating tasks to the removed tasks
4. We get a new assignment with updated input partitions. The task is still owned by the state updater, so we ask the state updater again to hand it back (adding a task+remove into the task and action queue) and add a pending action to close, revive and update input partitions
5. The task is handed back by the state updater. We close revive and update input partitions, and add the task back to the state updater
6. The state updater executes the "task+remove" action that is still in its task + action queue, and hands the task immediately back to the main thread
7. The main thread discoveres a removed task that was not restored and has no pending action attached to it. IllegalStateException

Reviewers: Bruno Cadonna <cadonna@apache.org>
2024-01-05 19:32:33 +01:00
Nick Telford cce63274f2
KAFKA-16086: Fix memory leak in RocksDBStore (#15135)
We allocate an `Options` in order to list column families while opening
the `RocksDBStore`, but never explicitly `close()` it.

`Options` is a RocksDB native object, which needs to be explicitly
closed to free the resources it allocates in native memory.

Failing to close this causes a memory leak when repeatedly
opening/closing stores.

It's an `AutoCloseable`, and all usage of it is confined to the
surrounding `try` block, so we can just hook it out to the `try` to
auto-close it when done.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-01-05 15:40:00 +01:00
Christo Lolov c703ce2563
KAFKA-14133: Migrate remaining mocks in StoreChangelogReaderTest to Mockito (#15125)
Reviewers: Divij Vaidya <diviv@amazon.com>
2024-01-04 21:54:15 +01:00
Christo Lolov e6f2624c48
KAFKA-14133: Migrate storeMetadata mock in StoreChangelogReaderTest to Mockito (#15116)
Reviewers: Divij Vaidya <diviv@amazon.com>
2024-01-04 13:50:14 +01:00
Nick Telford 5bc3aa4280
KAFKA-14412: Decouple RocksDB access from CF (#15105)
To support future use-cases that use different strategies for accessing
RocksDB, we need to de-couple the RocksDB access strategy from the
Column Family access strategy.

To do this, we now have two separate accessors:

  * `DBAccessor`: dictates how we access RocksDB. Currently only one
    strategy is supported: `DirectDBAccessor`, which access RocksDB
    directly, via the `RocksDB` class for all operations. In the future, a
    `BatchedDBAccessor` will be added, which enables transactions via
    `WriteBatch`.
  * `ColumnFamilyAccessor`: maps StateStore operations to operations on
    one or more column families. This is a rename of the old
    `RocksDBDBAccessor`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-01-04 11:42:30 +01:00
Matthias J. Sax c078e51c8f
MINOR: improve logging for state management (#15045)
Increase log level to INFO similar to other log statement in this class, to surface important information on the non-critical code path.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-01-04 02:21:52 -08:00
Christo Lolov 43a5ff2570
KAFKA-14133: Migrate activeStateManager and standbyStateManager mocks in StoreChangelogReaderTest to Mockito (#15106)
This pull request takes a similar approach to how TaskManagerTest is being migrated to Mockito mock by mock for easier reviews.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-01-03 09:48:51 +01:00
Almog Gavra e6875f378c
KAFKA-16046: also fix stores for outer join (#15073)
This is the corollary to #15061 for outer joins, which don't use timestamped KV stores either (compared to just window stores).

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Lucas Brutschy <lbrutschy@confluent.io>
2024-01-02 15:07:46 -08:00
Lucas Brutschy e01eed32ab
KAFKA-9545: Fix IllegalStateException in updateLags (#15096)
We attempt to update lags when in state PENDING_SHUTDOWN or PARTITIONS_REVOKED. In these states,
however, our representation of the assignment may not be up-to-date with the subscription
object inside the consumer. This can cause a bug, in particular, when we subscribe to a
set of topics via a regular expression, and the underlying topic is deleted. The consumer
subscription may reflect that topic deletion already, while our internal state still
contains references to the deleted topic, because `onAssignment` has not yet been
executed. Therefore, we will attempt to call `currentLag` on partitions that are not
assigned to us any more inside the consumer, leading to an `IllegalStateException`.

This bug causes flakiness of the test
`RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bruno Cadonna <cadonna@apache.org>
2024-01-02 16:35:31 +01:00
Christo Lolov 65a28246ad
KAFKA-14133: Migrate stateManager mock in StoreChangelogReaderTest to Mockito (#14929)
Reviewers: Divij Vaidya <diviv@amazon.com>
2024-01-02 13:36:52 +01:00
Nikolay 45bd19f2ef
KAFKA-14588: Move ConfigType to server-common (#14867)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-12-22 18:35:27 +01:00
Almog Gavra 18a65b25c1
KAFKA-16046: fix stream-stream-join store types (#15061)
Before #14648, the KStreamImplJoin class would always create non-timestamped persistent windowed stores. After that PR, the default was changed to create timestamped stores. This wasn't compatible because, during restoration, timestamped stores have their values transformed to prepend the timestamp to the value. This caused serialization errors when trying to read from the store because the deserializers did not expect the timestamp to be prepended.

To fix this, we allow creating non-timestamped stores using the DslWindowParams

Testing was done both manually as well as adding a unit test to ensure that the stores created are not timestamped. I also confirmed that the only place in the code persistentWindowStore was used before #14648 was in the StreamJoined code.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2023-12-22 15:23:58 +01:00
Divij Vaidya 6250049e10
KAFKA-13950: Fix resource leak in error scenarios (#12228)
We are not properly closing Closeable resources in the code base at multiple places especially when we have an exception. This code change fixes multiple of these leaks.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Luke Chen <showuon@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
2023-12-21 13:47:22 +01:00
Bruno Cadonna 19727f8d51
KAFKA-16017: Checkpoint restored offsets instead of written offsets (#15044)
Kafka Streams checkpoints the wrong offset when a task is closed during
restoration. If under exactly-once processing guarantees a
TaskCorruptedException happens, the affected task is closed dirty, its
state content is wiped out and the task is re-initialized. If during
the following restoration the task is closed cleanly, the task writes
the offsets that it stores in its record collector to the checkpoint
file. Those offsets are the offsets that the task wrote to the changelog
topics. In other words, the task writes the end offsets of its changelog
topics to the checkpoint file. Consequently, when the task is
initialized again on the same Streams client, the checkpoint file is
read and the task assumes it is fully restored although the records
between the last offsets the task restored before closing clean and
the end offset of the changelog topics are missing locally.

The fix is to clear the offsets in the record collector on close.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2023-12-21 10:15:04 +01:00
Nick Telford b903d786ba
KAFKA-14412: Generalise over RocksDB WriteBatch (#14853)
* KAFKA-14412: Generalise over RocksDB WriteBatch

The type hierarchy of RocksDB's `WriteBatch` looks like this:

```
        +---------------------+
        | WriteBatchInterface |
        +---------------------+
                   ^
                   |
        +---------------------+
        |  AbstractWriteBatch |
        +---------------------+
                   ^
                   |
        +----------+----------+
        |                     |
 +------------+    +---------------------+
 | WriteBatch |    | WriteBatchWithIndex |
 +------------+    +---------------------+
```

By switching our `BatchWritingStore` methods from `WriteBatch` to
`WriteBatchInterface`, we enable the use of `WriteBatchWithIndex` as
well.

* Improve error reporting for unknown batch type

We should be using an `IllegalStateException`, and we should log a
message informing the user that this is a bug.

This branch should be unreachable, as both of the possible
implementations of `WriteBatchInterface` are matched in the previous two
branches.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2023-12-19 13:10:24 -08:00
Stanislav Kozlovski b352cc6b4e
MINOR: Bump trunk to 3.8.0-SNAPSHOT (#14993)
This patch bumps the next release version to 3.8.0-SNAPSHOT.

Following the Release Process, I created the 3.7 branch and am following the steps to bump these versions:

Modify the version in trunk to bump to the next one (eg. "0.10.1.0-SNAPSHOT") in the following files:

docs/js/templateData.js
gradle.properties
kafka-merge-pr.py
streams/quickstart/java/pom.xml
streams/quickstart/java/src/main/resources/archetype-resources/pom.xml
streams/quickstart/pom.xml
tests/kafkatest/__init__.py
2023-12-14 09:07:18 +01:00
Bruno Cadonna 87e3cbe4da
MINOR: Add junit properties to display parameterized test names (#14983)
In many parameterized tests, the display name is broken. Example - testMetadataFetch appears as [1] true, [2] false link
This is because the constant in @ParameterizedTest

String DEFAULT_DISPLAY_NAME = "[{index}] {argumentsWithNames}";

This PR adds a new junit-platform.properties which overrides to add a {displayName} which shows the the display name of the method

For existing tests which override the name, should work as is. The precedence rules are explained

    name attribute in @ParameterizedTest, if present
    value of the junit.jupiter.params.displayname.default configuration parameter, if present
    DEFAULT_DISPLAY_NAME constant defined in @ParameterizedTest

Source: https://junit.org/junit5/docs/current/user-guide/#writing-tests-parameterized-tests-display-names

Sample test run output
Before: [1] true link
After: testMetadataExpiry(boolean).false link

This commit is an extension of bdf6d46b41 which needed to reverted due to introduces test failures.

Reviewers: David Jacot <djacot@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2023-12-13 09:42:18 +01:00
Matthias J. Sax 083aa61a96
KAFKA-15662: Add support for clientInstanceIds in Kafka Stream (#14936)
Part of KIP-714.

Add support to collect client instance id of the restore consumer.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2023-12-12 08:54:45 -08:00
Hao Li 85cee984ac
MINOR: Fix rack-aware assignment tests (#14965)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-12-11 01:38:57 -08:00
Matthias J. Sax f52575b172
KAFKA-15662: Add support for clientInstanceIds in Kafka Stream (#14948)
Part of KIP-714.

Adds support to expose producer client instance id.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2023-12-11 00:20:01 -08:00
Matthias J. Sax fb5d45d26e
KAFKA-15662: Add support for clientInstanceIds in Kafka Stream (#14935)
Part of KIP-714.

Add support to collect client instance id of the global consumer.

Reviewers: Walker Carlson <wcarlson@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2023-12-08 09:42:32 -08:00
David Jacot b96ded9859
Revert "MINOR: Add junit properties to display parameterized test names (#14687)" (#14961)
This reverts commit bdf6d46b41. We found out that this commit introduced flakiness in Streams' tests. We will revise it.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2023-12-07 23:20:03 -08:00
Hanyu Zheng 5ba7bfaa57
KAFKA-15629: Support ResultOrder to TimestampedRangeQuery. (#14907)
Update to KIP-992.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-12-07 15:29:29 -08:00
Matthias J. Sax 7dabd27f8d
KAFKA-15662: Add support for clientInstanceIds in Kafka Stream (#14922)
Part of KIP-714.

Adds support to expose main consumer client instance id.

Reviewers: Walker Carlson <wcarlson@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2023-12-07 10:39:39 -08:00
Hanyu Zheng 9d2297ad2d
KAFKA-15527: Support ResultOrder to reverseRange and reverseAll query over kv-store in IQv2 (#14906)
Update to KIP-985.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-12-07 08:32:16 -08:00
Alieh Saeedi 6694ea424a
KAFKA-15347: fix unit test (#14947)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-12-06 19:53:43 -08:00
A. Sophie Blee-Goldman 0ca6e3e359
MINOR: fully encapsulate user restore listener in the DelegatingRestoreListener (#14886)
Minor cleanup to make it easier to follow the restore listener logic. Currently, the KafkaStreams class tracks two restore listener fields: there is a non-final, nullable "globalRestoreListener" that holds the restore listener specified by the user (if any), and then there is a final "delegatingRestoreListener" that's used to encapsulate the null checks for the user-specified restore listener. It's a bit confusing to follow along with what each of these restore listener fields is doing and the relationship between them when they're on equal footing like this, when in reality they're more hierarchical and the DelegatingRestoreListener is actually a wrapper over the user-specified globalRestoreListener. The term "global" is also a bit misleading as it can get mixed up with global state stores, when it's really meant to be "global" in the sense that it applies to all state stores in the application.

It would be nice to just move the user listener completely inside the DelegatingRestoreListener class and then make that class static, as well as renaming the field to "userRestoreListener"

Reviewers: Matthias J. Sax <mjsax@apache.org>
2023-12-06 10:38:54 -08:00
Alok Thatikunta bdf6d46b41
MINOR: Add junit properties to display parameterized test names (#14687)
In many parameterized tests, the display name is broken. Example - `testMetadataFetch` appears as `[1] true`, `[2] false`  [link](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14607/9/testReport/junit/org.apache.kafka.clients.producer/KafkaProducerTest/) 
This is because the constant in `@ParameterizedTest`
```java
String DEFAULT_DISPLAY_NAME = "[{index}] {argumentsWithNames}";
```

This PR adds a new `junit-platform.properties` which overrides to add a `{displayName}` which shows the `the display name of the method`

For existing tests which override the name, should work as is. The precedence rules are explained

> 1. `name` attribute in `@ParameterizedTest`, if present
> 2. value of the `junit.jupiter.params.displayname.default` configuration parameter, if present
> 3. `DEFAULT_DISPLAY_NAME` constant defined in `@ParameterizedTest`

Source: https://junit.org/junit5/docs/current/user-guide/#writing-tests-parameterized-tests-display-names

Sample test run output 
Before: `[1] true` [link](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14607/9/testReport/junit/org.apache.kafka.clients.producer/KafkaProducerTest/)
After: `testMetadataExpiry(boolean).false` [link](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14687/1/testReport/junit/org.apache.kafka.clients.producer/KafkaProducerTest/)

Reviewers: Divij Vaidya <diviv@amazon.com>, Bruno Cadonna <cadonna@apache.org>, David Jacot <djacot@confluent.io>
2023-12-06 08:42:45 -08:00
Hao Li 6be2e5c131
KAFKA-15022: tests for HA assignor and StickyTaskAssignor (#14921)
Part of KIP-925.

Tests for HAAssignor and StickyAssignor.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-12-06 08:12:47 -08:00
Alieh Saeedi 9658942366
KAFKA-15347: add support for 'single key multi timestamp' IQs with versioned state stores (KIP-968) (#14626)
Implements KIP-968.

Add new query type MultiVersionedKeyQuery for IQv2 supported by versioned state stores.
2023-12-06 07:56:12 -08:00
Eduwer Camacaro 83110e2d42
KAFKA-15448: Streams Standby Update Listener (KIP-988) (#14735)
Implementation for KIP-988, adds the new StandbyUpdateListener interface

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Colt McNealy <colt@littlehorse.io>
2023-12-06 01:27:38 -08:00
Hao Li f6560ab1cd
KAFKA-15022: introduce interface to control graph constructor (#14714)
Part of KIP-925.

Refactor graph construction and assignment in RackAwareAssignor to new interface.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-12-05 22:00:04 -08:00
Florin Akermann 4a958c6cb1
Kafka-14748: Relax non-null FK left-join requirement (#14107)
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <wcarlson@apace.org>
2023-12-05 18:04:32 -06:00
Matthias J. Sax 45f5d0f621
KAFKA-15662: Add support for clientInstanceIds in Kafka Stream (#14908)
- Part of KIP-714
- Add new configs and public API for Kafka Streams
- Implement support to get admin client instance id

Reviewers: Andrew Schofield <aschofield@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>, Apoorv Mittal <amittal@confluent.io>, Walker Carlson <wcarlson@confluent.io>
2023-12-05 12:19:56 -08:00
ashwinpankaj f2aeff0026
KAFKA-9545: Fix Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted` (#14910)
RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted does not wait to ensure that test-topic-A is deleted. The second assignment condition times out in 15sec.

We should wait for the topic to be deleted (default timeout = 30sec) and then check the assignment.

Reviewers: Walker Carlson <wcarlson@apache.org>
2023-12-05 11:16:06 -06:00
Nick Telford 20a223061c
KAFKA-14412: Better Rocks column family management (#14852)
When opening RocksDB, we were checking for an error in
`RocksDBTimestampedStore` to detect if the `keyValueWithTimestamp` CF is
missing.

The `openRocksDB` method now supports any number of column families, not
just the extra one used by `RocksDBTimestampedStore`. We now check for
the existing column families _before_ opening the database, which allows
us to create any missing column families.

Supporting automatic creation of any number of missing column families
is a pre-requisite for KIP-892: Transactional StateStores.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Bruno Cadonna <cadonna@apache.org>
2023-12-05 10:02:04 +01:00
Christo Lolov d4c95cfc2a
KAFKA-14133: Migrate ProcessorStateManagerTest and StreamThreadTest to Mockito (#13932)
This pull request is an attempt to get what has started in #12524 to completion as part of the Streams project migration to Mockito.

Reviewers: Divij Vaidya <diviv@amazon.com>, Bruno Cadonna <cadonna@apache.org>
2023-12-04 18:37:57 +01:00
Lucas Brutschy 59ac9be21c
HOTFIX: fix ConsistencyVectorIntegrationTest failure (#14895)
#14570 changed the result for KeyQuery from ValueAndTimestamp<V> to
V, but forgot to update ConsistencyVectorIntegrationTest accordingly.
2023-12-03 23:06:41 +01:00
Matthias J. Sax 1a2f74be67 MINOR: fix typo 2023-12-01 15:39:32 -08:00
Matthias J. Sax b22bbd656c
MINOR: cleanup internal Iterator impl (#14889)
makeNext() is internal and visibility should not be extended to `public`

Reviewers: Walker Carlson <wcarlson@confluent.io>
2023-12-01 11:53:07 -08:00
Lucas Brutschy bfee3b3c6b
KAFKA-15690: Fix restoring tasks on partition loss, flaky EosIntegrationTest (#14869)
The following race can happen in the state updater code path

Task is restoring, owned by state updater
We fall out of the consumer group, lose all partitions
We therefore register a "TaskManager.pendingUpdateAction", to CLOSE_DIRTY
We also register a "StateUpdater.taskAndAction" to remove the task
We get the same task reassigned. Since it's still owned by the state updater, we don't do much
The task completes restoration
The "StateUpdater.taskAndAction" to remove will be ignored, since it's already restored
Inside "handleRestoredTasksFromStateUpdater", we close the task dirty because of the pending update action
We now have the task assigned, but it's closed.
To fix this particular race, we cancel the "close" pending update action. Furthermore, since we may have made progress in other threads during the missed rebalance, we need to add the task back to the state updater, to at least check if we are still at the end of the changelog. Finally, it seems we do not need to close dirty here, it's enough to close clean when we lose the task, related to KAFKA-10532.

This should fix the flaky EOSIntegrationTest.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2023-12-01 18:57:27 +01:00
Hanyu Zheng f1cd11dcc5
KAFKA-15629: Proposal to introduce IQv2 Query Types: TimestampedKeyQuery and TimestampedRangeQuery (#14570)
Implements KIP-992.

Adds TimestampedKeyQuery and TimestampedRangeQuery (IQv2) for ts-ks-store, plus changes semantics of existing KeyQuery and RangeQuery if issues against a ts-kv-store, now unwrapping value-and-timestamp and only returning the plain value.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-11-30 12:14:23 -08:00
Nick Telford 96b43bf16f
KAFKA-14412: Add ProcessingThread tag interface (#14839)
This interface provides a common supertype for `StreamThread` and
`DefaultTaskExecutor.TaskExecutorThread`, which will be used by KIP-892
to differentiate between "processing" threads and interactive query
threads.

This is needed because `DefaultTaskExecutor.TaskExecutorThread` is
`private`, so cannot be seen directly from `RocksDBStore`.

Reviewer: Bruno Cadonna <cadonna@apache.org>
2023-11-30 09:44:02 +01:00
Greg Harris 9f896ed6c9
KAFKA-15816: Fix leaked sockets in streams tests (#14769)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Matthias J. Sax <mjsax@apache.org>
2023-11-29 11:53:34 -08:00
Hao Li e7b9bd5a26
KAFKA-15022: add config for balance subtopology in rack aware task assignment (#14711)
Part of KIP-925.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-11-29 11:33:52 -08:00
Hao Li 10555ec6de
KAFKA-15022: Only relax edge when path exist (#14198)
If there is no path from u to v, we should not represent it at Integer.MAX_VALUE but null instead.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-11-28 20:44:12 -08:00
Hao Li bbd75b80ce
KAFKA-15022: Detect negative cycle from one source (#14696)
Introduce a dummy node connected to every other node and run Bellman-ford from the dummy node once instead of from every node in the graph.

Reviewers: Qichao Chu (@ex172000), Matthias J. Sax <matthias@confluent.io>
2023-11-28 00:29:00 -08:00
Lucas Brutschy fe58cb1ebd
KAFKA-13531: Disable flaky NamedTopologyIntegrationTest (#14830)
Named topologies is a feature that is planned to be removed from AK with 4.0 and was never used via the public interface. It was used in a few versions of KSQL only, but was disabled there as well. While we do not want to remove it in 3.7 yet, we should disable flaky tests in that feature, that are disruptive to AK development.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2023-11-24 10:44:25 +01:00
Almog Gavra 9309653219
KAFKA-15215: [KIP-954] support custom DSL store providers (#14648)
Implementation for KIP-954: support custom DSL store providers

Testing Strategy:
- Updated the topology tests to ensure that the configuration is picked up in the topology builder
- Manually built a Kafka Streams application using a customer DslStoreSuppliers class and verified that it was used

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Guozhang Wang <guozhang.wang.us@gmail.com>
2023-11-21 13:51:39 -08:00
Bruno Cadonna 922d0d0e5c
MINOR: Do not check whether updating tasks exist in the waiting loop (#14791)
The state updater waits on a condition variable if no tasks exist that need to be updated. The condition variable is wrapped by a loop to account for spurious wake-ups. The check whether updating tasks exist is done in the condition of the loop. Actually, the state updater thread can change whether updating tasks exists, but since the state updater thread is waiting for the condition variable the check for the existence of updating tasks will always return the same value as long as the state updater thread is waiting. Thus, the check only need to be done once before entering the loop.

This commit moves check before the loop making also the usage of mocks more robust since the processing becomes more deterministic.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2023-11-18 21:10:21 +01:00
vamossagar12 e7f4f5dfe7
[MINOR] Removing unused variables from StreamThreadTest (#14777)
A few variables which aren't being used anymore but still exist. This commit removes those unused variables.

Co-authored-by: Sagar Rao <sagarrao@Sagars-MacBook-Pro.local>
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2023-11-16 11:00:11 +01:00
Alieh 0489b7cd33
KAFKA-15346: add support for 'single key single timestamp' IQs with versioned state stores (KIP-960) (#14596)
This PR implements KIP-960 which add support for `VersionedKeyQuery`.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-11-15 17:34:54 -08:00
Kirk True 22f7ffe5e1
KAFKA-15277: Design & implement support for internal Consumer delegates (#14670)
The consumer refactoring project introduced another `Consumer` implementation, creating two different, coexisting implementations of the `Consumer` interface:

* `KafkaConsumer` (AKA "existing", "legacy" consumer)
* `PrototypeAsyncConsumer` (AKA "new", "refactored" consumer)

The goal of this task is to refactor the code via the delegation pattern so that we can keep a top-level `KafkaConsumer` but then delegate to another implementation under the covers. There will be two delegates at first:

* `LegacyKafkaConsumer`
* `AsyncKafkaConsumer`

`LegacyKafkaConsumer` is essentially a renamed `KafkaConsumer`. That implementation handles the existing group protocol. `AsyncKafkaConsumer` is renamed from `PrototypeAsyncConsumer` and will implement the new consumer group protocol from KIP-848. Both of those implementations will live in the `internals` sub-package to discourage their use.

This task is part of the work to implement support for the new KIP-848 consumer group protocol.

Reviewers: Philip Nee <pnee@confluent.io>, Andrew Schofield <aschofield@confluent.io>, David Jacot <djacot@confluent.io>
2023-11-15 05:00:40 -08:00
Zihao Lin 7c562a776d
HOTFIX: Fix compilation error for JDK 21 caused by this-escape warning (#14740)
This patch fixes the compilation error for JDK 21 introduced in https://github.com/apache/kafka/pull/14708.

Reviewers: Ismael Juma <ismael@juma.me.uk>, David Jacot <djacot@confluent.io>
2023-11-12 08:59:40 +01:00
Almog Gavra 39cacca89b
KAFKA-15774: refactor windowed stores to use StoreFactory (#14708)
This is a follow up from #14659 that ports the windowed classes to use the StoreFactory abstraction as well. There's a side benefit of not duplicating the materialization code twice for each StreamImpl/CogroupedStreamImpl class as well.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Matthias Sax <mjsax@apache.org>
2023-11-10 18:19:11 -08:00
Bruno Cadonna 81cceedf7e
MINOR: Delete task-level commit sensor (#14677)
The task-level commit metrics were removed without deprecation in KIP-447 and the corresponding PR #8218. However, we forgot to update the docs and to remove the code that creates the task-level commit sensor.
This PR removes the task-level commit metrics from the docs and removes the code that creates the task-level commit sensor. The code was effectively dead since it was only used in tests.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2023-11-09 15:37:13 +01:00
Bruno Cadonna f1e58a35d7
MINOR: Do not checkpoint standbys when handling corrupted tasks (#14709)
When a task is corrupted, uncorrupted tasks are committed. That is also true for standby tasks. Committing standby tasks actually means that they are checkpointed.

When the state updater is enabled, standbys are owned by the state updater. The stream thread should not checkpoint them when handling corrupted tasks. That is not a big limitation since the state updater periodically checkpoints standbys anyway. Additionally, when handling corrupted tasks the important thing is to commit active running tasks to abort the transaction. Committing standby tasks do not abort any transaction.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2023-11-08 16:09:24 +00:00
Almog Gavra febf0fb573
KAFKA-15774: introduce internal StoreFactory (#14659)
This PR sets up the necessary prerequisites to respect configurations such as dsl.default.store.type and the dsl.store.suppliers.class (introduced in #14648) without requiring them to be passed in to StreamBuilder#new(TopologyConfig) (passing them only into new KafkaStreams(...).

It essentially makes StoreBuilder an external-only API and internally it uses the StoreFactory equivalent. It replaces KeyValueStoreMaterializer with an implementation of StoreFactory that creates the store builder only after configure() is called (in a Future PR we will create the missing equivalents for all of the places where the same thing happens for window stores, such as in all the *WindowKStreamImpl classes)

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2023-11-06 17:30:58 -08:00
gongzhongqiang d682b15eeb
KAFKA-15769: Fix logging with exception trace (#14683)
Reviewers: Divij Vaidya <diviv@amazon.com>, hudeqi <1217150961@qq.com>
2023-11-06 11:02:05 +01:00
Christo Lolov ba394aa28a
KAFKA-14133: Move StandbyTaskTest to Mockito (#14679)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-11-06 10:41:37 +01:00
Christo Lolov 760abfbdab
KAFKA-14133: Move StreamsMetricsImplTest to Mockito (#14623)
Reviewers: Divij Vaidya <diviv@amazon.com>, Ismael Juma <ismael@juma.me.uk>
2023-11-01 12:13:06 +01:00
Florin Akermann b5c24974ae
Kafka 12317: Relax non-null key requirement in Kafka Streams (#14174)
[KIP-962](https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams)

The key requirments got relaxed for the followinger streams dsl operator:

left join Kstream-Kstream: no longer drop left records with null-key and call ValueJoiner with 'null' for right value.
outer join Kstream-Kstream: no longer drop left/right records with null-key and call ValueJoiner with 'null' for right/left value.
left-foreign-key join Ktable-Ktable: no longer drop left records with null-foreign-key returned by the ForeignKeyExtractor and call ValueJoiner with 'null' for right value.
left join KStream-Ktable: no longer drop left records with null-key and call ValueJoiner with 'null' for right value.
left join KStream-GlobalTable: no longer drop records when KeyValueMapper returns 'null' and call ValueJoiner with 'null' for right value.

Reviewers: Walker Carlson <wcarlson@apache.org>
2023-10-31 11:09:42 -05:00
James Cheng b9f2874c44
MINOR: Fix typo in a comment at KTableFilter (#14665)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-10-30 10:16:12 +01:00
bachmanity1 f0e97397c0
KAFKA-14133: Replace Easymock with Mockito in StreamsProducerTest, TopologyMetadataTest & GlobalStateStoreProviderTest (#14410)
Reviewers: Ismael Juma <ismael@juma.me.uk>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>
2023-10-27 10:45:25 +02:00
Hanyu Zheng 834f72b03d
KAFKA-15527: Update docs and JavaDocs (#14600)
Part of KIP-985.

Updates JavaDocs for `RangeQuery` and `ReadOnlyKeyValueStore` with regard to ordering guarantees.
Updates Kafka Streams upgrade guide with KIP information.

Reviewer: Matthias J. Sax <matthias@confluent.io>
2023-10-26 17:48:28 -07:00
Levani Kokhreidze 986c1b1f31
KAFKA-15659: Fix shouldInvokeUserDefinedGlobalStateRestoreListener flaky test (#14608)
Trying to fix flakiness for the shouldInvokeUserDefinedGlobalStateRestoreListener test introduced in #14519.

Fixes are:

-Do not use static membership.
-Always close the 2nd KafkaStreams instance.
-Await for the Kafka Streams instance to transition to a RUNNING state before proceeding.
-Added logging for the StateRestoreListener implementation.
-Reduce restore consumer MAX_POLL_RECORDS.

Reviewers: Anna Sophie Blee-Goldman <sophie@responsive.dev>
2023-10-26 14:56:33 -07:00
Matthias J. Sax a6c14003a9
HOTFIX: close iterator to avoid resource leak (#14624)
Reviewers: Hao Li <hli@confluent.io>, Bill Bejeck <bill@confluent.io>
2023-10-26 10:30:39 -07:00
Lucas Brutschy b061ab7701
MINOR: Fix misleading log-line (#14643)
After finishing restoration, we should only log the active tasks. Standby tasks are not part of restoration and it can be confusing to see them show up on this log message.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-10-26 08:31:46 -07:00
Lucas Brutschy d144b7ee38
KAFKA-15326: [10/N] Integrate processing thread (#14193)
- Introduce a new internal config flag to enable processing threads
- If enabled, create a scheduling task manager inside the normal task manager (renamings will be added on top of this), and use it from the stream thread
- All operations inside the task manager that change task state, lock the corresponding tasks if processing threads are enabled.
- Adds a new abstract class AbstractPartitionGroup. We can modify the underlying implementation depending on the synchronization requirements. PartitionGroup is the unsynchronized subclass that is going to be used by the original code path. The processing thread code path uses a trivially synchronized SynchronizedPartitionGroup that uses object monitors. Further down the road, there is the opportunity to implement a weakly synchronized alternative. The details are complex, but since the implementation is essentially a queue + some other things, it should be feasible to implement this lock-free.
- Refactorings in StreamThreadTest: Make all tests use the thread member variable and add tearDown in order avoid thread leaks and simplify debugging. Make the test parameterized on two internal flags: state updater enabled and processing threads enabled. Use JUnit's assume to disable all tests that do not apply.
Enable some integration tests with processing threads enabled.

Reviewer: Bruno Cadonna <bruno@confluent.io>
2023-10-24 10:17:55 +02:00
Mickael Maison 8b9f6d17f2
KAFKA-15093: Add 3.5 Streams upgrade system tests (#14602)
Reviewers: Matthias J. Sax <mjsax@apache.org>
2023-10-23 13:26:50 +02:00
Mickael Maison 9c77c17c4e
KAFKA-15664: Add 3.4 Streams upgrade system tests (#14601)
Reviewers: Luke Chen <showuon@gmail.com>,  Matthias J. Sax <mjsax@apache.org>
2023-10-23 10:33:59 +02:00
Christo Lolov b5ec6e8a0d
KAFKA-14133: Move RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest to Mockito (#14586)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-10-20 16:09:36 +02:00