This constructor was not initializing a field with the constructor
argument, the extra `} {` was ending the constructor body and creating
an instance initializer block that assigned the field to itself.
Reviewers: Matthias J. Sax <matthias@confluent.io>
* Fixes a thread-safety bug in the Kafka Streams Position class
* Adds a multithreaded test to validate the fix and prevent regressions
Reviewers: John Roesler <vvcephei@apache.org>
Replace `StandardCharsets.UTF_8.name()` with `StandardCharsets.UTF_8` to
avoid UnsupportedEncodingException and optimize the related code at the
same time.
Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
Enable KIP-1071 parameter in `StandbyTaskCreationIntegrationTest`.
Required a fix: In `ChangelogTopic.setup`, we actually need to return
both the source-topic (optimized) and the non-source-topic changelog
topics, since otherwise we will not find the partition number later on.
Extended `EmbeddedKafkaCluster` to set the number of standby replicas
dynamically for the group. We need to initialize it to one for the
integration test to go through.
Reviewers: Bill Bejeck <bbejeck@apache.org>
In the first version of the integration of the stream thread with the
new Streams rebalance protocol, the consumer used a dedicated event
queue for Streams/specific background events to request the stream
thread to call the rebalance callbacks. That led to an issue where the
consumer times out when unsubscribing.
This commit gets rid of the dedicated queue and incorporates the
Streams-specific background events into event queue used by the
consumer.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
KIP-1071 creates internal topics broker-side, so this test checks
whether, when KIP-1071 is enabled, basically the same topics are
created.
It also adds a little helper method in `EmbeddedKafkaCluster`, so that
fewer code changes are required to enable KIP-1071. We use that helper
in the already enabled SmokeTestDriverIntegrationTest and revert some of
the changes there (making the cluster `final` again).
Reviewers: Bill Bejeck <bbejeck@apache.org>, PoAn Yang
<payang@apache.org>
Call the StateRestoreListener#onBatchRestored with numRestored and not
the totalRestored when reprocessing state
See: https://issues.apache.org/jira/browse/KAFKA-18962
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Matthias
Sax <mjsax@apache.org>
The consumer adaptations for the new Streams rebalance protocol need to
be integrated into the Streams code. This commit does the following:
- creates an async Kafka consumer
- with a Streams heartbeat request manager
- with a Streams membership manager
- integrates consumer code with the Streams membership manager and the
Streams heartbeat request manager
- processes the events from the consumer network thread (a.k.a.
background thread)
that request the invocation of the "on tasks revoked", "on tasks
assigned", and "on all tasks lost"
callbacks
- executes the callbacks
- sends to the consumer network thread the events signalling the
execution of the callbacks
- adapts SmokeTestDriverIntegrationTest to use the new Streams rebalance
protocol
This commit misses some unit test coverage, but it also unblocks other
work on trunk regarding the new Streams rebalance protocol. The missing
unit tests will be added soon.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
This PR contains the implementation of KafkaAdminClient and
GroupCoordinator for DeleteShareGroupOffsets RPC.
- Added `deleteShareGroupOffsets` to `KafkaAdminClient`
- Added implementation for `handleDeleteShareGroupOffsetsRequest` in
`KafkaApis.scala`
- Added `deleteShareGroupOffsets` to `GroupCoordinator` as well.
internally this makes use of `persister.deleteState` to persist the
changes in share coordinator
Reviewers: Andrew Schofield <aschofield@confluent.io>, Sushant Mahajan <smahajan@confluent.io>
Fixes both KAFKA-16407 and KAFKA-16434.
Summary of existing issues:
- We are ignoring new left record when its previous FK value is null
- We do not unset foreign key join result when FK becomes null
Reviewers: Matthias J. Sax <matthias@confluent.io>
When a row in a FK-join left table is updated, we should send a "delete
subscription with no response" for the old FK to the right hand side, to
avoid getting two responses from the right hand side. Only the "new
subscription" for the new FK should request a response. If two responses
are requested, there is a race condition for which both responses could
be processed in the wrong order, leading to an incorrect join result.
This PR fixes the "delete subscription" case accordingly, to no request
a response.
Reviewers: Matthias J. Sax <matthias@confluent.io>
JIRA: KAFKA-18067
Fix producer client double-closing issue in Kafka Streams.
During StreamThread shutdown, TaskManager closes first, which closes the
producer client. Later, calling `unsubscribe` on the main consumer may
trigger the `onPartitionsLost` callback, attempting to reset
StreamsProducer when EOS is enabled. This causes an already closed
producer to be closed twice while the newly created producer is never
closed.
In detail:
This patch adds a flag to control the producer reset and has a new
method to change this flag, which is only invoked in
`ActiveTaskCreator#close`.
This would guarantee that the disable reset producer will only occur
when StreamThread shuts down.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Matthias Sax <mjsax@apache.org>
Adds the FunctionalInterface annotation to relevant Kafka Streams
classes. While this is not strictly required for Java, it's still best
practice and also useful for better integration with other JVM
languages, for example Clojure, to allow using these interfaces as
lambdas.
Reviewers: Matthias J. Sax <matthias@confluent.io>
This patch is part of KIP-939 [Support Participation in
2PC](https://cwiki.apache.org/confluence/display/KAFKA/KIP-939%3A+Support+Participation+in+2PC)
The kafka-transactions.sh tool will support a new command
--forceTerminateTransaction It has one required argument
--transactionalId that would take the transactional id for the
transaction to be terminated.
The command uses the existing Admin#fenceProducers method to forcefully
abort the transaction associated with the specified transactional ID.
Under the hood, it sends an InitProducerId request to the transaction
coordinator with the given transactional ID and keepPreparedTxn = false
by default. This is aligned with the functionality outlined in the KIP.
We will be creating a new public method in the Admin Client **public
TerminateTransactionResult forceTerminateTransaction(String
transactionalId)**, and re-use the existing fence producer method.
Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan <jolshan@confluent.io>
When implementing
[KIP-1091](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1091%3A+Improved+Kafka+Streams+operator+metrics)
metrics for the Global Stream thread was overlooked. This ticket adds
the Global Thread metrics so they are available via the KIP-1076 process
of adding external Kafka metrics.
The existing integration test has been updated to confirm GlobalThread
metrics are sent via the broker plugin.
Reviewers: Matthias Sax <mjsax@apache.org>
The `window.size.ms` and `window.inner.class.serde` are not a true
KafkaStreams config, and are ignored when set from a KStreams
application. Both belong on the client.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Signed-off-by: PoAn Yang <payang@apache.org>
The generated request data type's constructors take Readable as an input. However, the parse method in the
AbstractRequest takes a ByteBuffer as input. So to create the corresponding request data objects, each individual
concrete Request classes wraps the ByteBuffer into a ByteBufferAccessor.
This is boilerplate code present in all the concrete request classes. This changes AbstractRequest's parse method so that subclasses can simply pass the `Readable` they get directly to request data classes.
The same change is made to the serialize method to maintain symmetry.
Reviewers: Ismael Juma <ismael@juma.me.uk>, José Armando García Sancio
<jsancio@apache.org>, Artem Livshits <alivshits@confluent.io>,
Truc Nguyen <trnguyen@confluent.io>
Fixes two issues:
- only commit TX if no revoked tasks need to be committed
- commit revoked tasks after punctuation triggered
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Anna Sophie Blee-Goldman <sophie@responsive.dev>, Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bill@confluent.io>
Exceptions that are caught during closing a task dirty are swallowed.
The corresponding log messages are on error level which is misleading
since the exceptions do not cause any special handling or crash.
This commit downgrades the log messages of swallowed exceptions to
warnings and explains in the log messages that the exceptions are
swallowed.
Reviewers: Bill Bejeck <bill@confluent.io>
Recently, we found a regression that could have been detected by static
analysis, since a local variable wasn't being passed to a method during
a refactoring, and was left unused. It was fixed in
[7a749b5](7a749b589f),
but almost slipped into 4.0. Unused variables are typically detected by
IDEs, but this is insufficient to prevent these kinds of bugs. This
change enables unused local variable detection in checkstyle for Kafka.
A few notes on the usage:
- There are two situations in which people actually want to have a local
variable but not use it. First, there are `for (Type ignored:
collection)` loops which have to loop `collection.length` number of
times, but that do not use `ignored` in the loop body. These are
typically still easier to read than a classical `for` loop. Second, some
IDEs detect it if a return value of a function such as `File.delete` is
not being used. In this case, people sometimes store the result in an
unused local variable to make ignoring the return value explicit and to
avoid the squiggly lines.
- In Java 22, unsued local variables can be omitted by using a single
underscore `_`. This is supported by checkstyle. In pre-22 versions,
IntelliJ allows such variables to be named `ignored` to suppress the
unused local variable warning. This pattern is often (but not
consistently) used in the Kafka codebase. This is, however, not
supported by checkstyle.
Since we cannot switch to Java 22, yet, and we want to use automated
detection using checkstyle, we have to resort to prefixing the unused
local variables with `@SuppressWarnings("UnusedLocalVariable")`. We have
to apply this in 11 cases across the Kafka codebase. While not being
pretty, I'd argue it's worth it to prevent bugs like the one fixed in
[7a749b5](7a749b589f).
Reviewers: Andrew Schofield <aschofield@confluent.io>, David Arthur
<mumrah@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Bruno
Cadonna <cadonna@apache.org>, Kirk True <ktrue@confluent.io>
Adds `describeStreamsGroup` to Admin API.
This exposes the result of the `DESCRIBE_STREAMS_GROUP` RPC in the Admin
API.
Reviewers: Bill Bejeck <bill@confluent.io>
Implement Admin API extensions beyond list/describe group (delete group,
offset-related APIs).
* adds methods for describing and manipulating offsets, as described in
KIP-1071
* adds corresponding unit tests
These are doing the exact same thing as the corresponding consumer group
counter-parts.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Implement Admin API extensions beyond list/describe group (delete group, offset-related APIs).
* adds methods for describing and manipulating offsets, as described in KIP-1071
* adds corresponding unit tests
These are doing the exact same thing as the corresponding consumer group counter-parts.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
When `transformValues` is used with a `Materialized` instance, but
without a queryable name, a `NullPointerException` is thrown. To
preserve the semantics present in 3.9, we need to avoid materialization
when a queryable name is not present.
Reviewers: Bruno Cadonna <cadonna@apache.org>
This reverts commit e8837465a5.
The commit that is reverted prevents Kafka Streams from re-initializing
its transactional producer. If an exception that fences the
transactional producer occurs, the producer is not re-initialized during
the handling of the exception. That causes an infinite loop of
ProducerFencedExceptions with corresponding rebalances.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, David Jacot
<djacot@confluent.io>
To address the issue of not creating a checkpoint file during the
restoring and closing process, called the
GlobalStateUpdateTask.flushState() method in
GlobalStateUpdateTask.initialize() and GlobalStateUpdateTask.close()
methods. This will flush the state and create a checkpoint file thereby,
avoiding the need to completely restore the entire state.
Reviewers: Alieh Saeedi <asaeedi@confluent.io>, Matthias J. Sax <matthias@confluent.io>
In 3.1 we deprecated the eager rebalancing protocol and marked it for
removal in a later release. We aim to officially drop support and remove
the protocol from Streams in 4.0.
The effect of this PR is that it will no longer be possible to perform a
live upgrade Kafka Streams directly to 4.0 from version 2.3 or below.
Users will have to go through a bridge release between 2.4 - 3.9
instead.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Pull request to implement KIP-1111, aims to add a configuration that
prevents a Kafka Streams application from starting if any of its
internal topics have auto-generated names, thereby enforcing explicit
naming for all internal topics and enhancing the stability of the
application’s topology.
- Repartition Topics:
All repartition topics are created in the
KStreamImpl.createRepartitionedSource(...) static method. This method
either receives a name explicitly provided by the user or null and then
builds the final repartition topic name.
- Changelog Topics and State Store Names:
There are several scenarios where these are created:
In the MaterializedInternal constructor.
During KStream/KStream joins.
During KStream/KTable joins with grace periods.
With key-value buffers are used in suppressions.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Sophie Blee-Goldman <sophie@responsive.dev>
* In this PR, we add various infra classes needed to support the
`deleteShareGroups` functionality via the `kafka-share-groups.sh`
script, as well as the implementation of `kafka-share-groups.sh --delete`.
Reviewers: Andrew Schofield <aschofield@confluent.io>
Cleanup code to avoid rawtype, and add suppressions where necessary.
Change the build to fail on rawtype warning.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
KAFKA-16720 aims to add the support for the AlterShareGroupOffsets AdminClient. Key Changes in the PR:
1. Added handing of alterShareGroupOffsets() in KafkaAdminClient and introduce AlterShareGroupOffsetRequest/AlterShareGroupOffsetResponse/AlterShareGroupOffsetsOptions classes.
2. Corresponding test in KafkaAdminClientTest.
3. Added ALTER_SHARE_GROUP_OFFSETS API (will finish it in next PR and the share coordinator pieces)
Reviewers: poorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
KIP-744 introduced the StreamsMetadata class as part of the implementation. In the KIP, the javadoc for the standbyTopicPartitions states that the method returns the set of source TopicPartition that it represents as a standby. The current javadoc states that it represents the changelog TopicPartition(s). While the partitions of the source and changelog topics will match, the javadoc needs to be updated to reflect the correct behavior.
Note that the deprecated o.a.k.streams.state.StreamsMetadata#standbyTopicPartitions method also describes the set of TopicPartition being source TopicPartition.
Reviewers: Matthias Sax<mjsax@apache.org>
In the streams smoke test, flush records that are appended to the input topics, to advance the stream time so that all suppressed windows are flushed at the end of the test. The records are created with record time equal to current time + 2 days. caf0b67 changed the broker defaults so that records more than one hour in the future are rejected by the broker. This breaks the flush messages. By moving all record time stamps 2 days into the past, the existing logic should work correctly with the new default broker configuration.
A similar thing happens in the relational smoke test, where data is emitted 4 days into the future. To avoid running into retention / compaction, the window retention time is increased for both tests.
Reviewers: Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bill@confluent.io>
While reviewing PR #18287, I wrote some javadocs to help me understand the AbstractMergedSortedCacheStoreIterator. Maybe we could add them to help the next developers getting into it.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
The exception stack trace shown in the the ticket can happen when we are
concurrently closing the producer because of an error and doing a
regular close. This is not a bug in the test, but a real race condition
that can happen.
The sequence is this:
Thread1: Enter PENDING_ERROR
Thread2: Check if state is already ERROR
Thread1: Transition to ERROR
Thread2: Check if state is already PENDING_ERROR
Thread2: Transition to PENDING_SHUTDOWN
One idea to fix this would be to synchronize the sequence performed by
Thread1 using the state lock. However, this would require more changes,
since we cannot use the normal state transition method `setState` while
owning the lock, as it calls user-defined callbacks, which may create
deadlocks. Do avoid adding more synchronization, we can also fix it by
first attempting to transition to PENDING_SHUTDOWN, and _then_ checking
whether another thread is already attempting to shut down (states
PENDING_SHUTDOWN, PENDING_ERROR, ERROR, NOT_RUNNING). Since we never
transition from a shutdown state back to a non-shutdown state.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Once a StreamThread receives its assignment, it will close the startup tasks. But during the closing process, the StandbyTask.closeClean() method will eventually call theStatemanagerUtil.closeStateManager method which needs to lock the state directory, but locking requires the calling thread be the current owner. Since the main thread grabs the lock on startup but moves on without releasing it, we need to update ownership explicitly here in order for the stream thread to close the startup task and begin processing.
Reviewers: Matthias Sax <mjsax@apache.org>, Nick Telford
We want to deprecate an remove the old ProcessorContext. Thus, we need
to refactor Kafka Streams runtime code, to not make calls into the old
ProcessorContext but only into new code path.
Reviewers: Bill Bejeck <bill@confluent.io>
Minor followup to #18195 that I split out into a separate PR since that one was getting a bit long. Should be rebased & reviewed after that one is merged.
Introduces a new class for windowed graph nodes with a grace period defined to improve (slightly) the type safety
Reviewers: Guozhang Wang <guozhang.wang.us@gmail.com>, Almog Gavra <almog@responsive.dev>
Avoiding 'WARN Method #getTimestampedKeyValueStore() should be
used to access a TimestampedKeyValueStore.' running
KTableKTableForeignKeyJoinIntegrationTest.
Reviewers: Bill Bejeck <bill@confluent.io>
RocksDBTimeOrderedKeyValueBuffer is not initialize with serdes provides
via Joined, but always uses serdes from StreamsConfig.
Reviewers: Bill Bejeck <bill@confluent.io>
TransactionAbortedException is a follow up error to a previous error,
and such a previous error would already be handled when
`producer.abortTransaction()` is called. Thus, a
TransactionAbortedException can just be silently swallowed.
Reviewers: Bill Bejeck <bill@confluent.io>
The current log is really helpful, this PR adds a bit more information to that log to help debug some issues. In particular, it is interesting to be able to debug situations that have long intervals between polls. It also includes a reference to how long it has been since it last logged so you don't have to find the previous time it was logged to compute quick per-second ratios.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This PR fixed a race-condition in KafkaStreamsTest, by replacing a non-deterministic
sleep with a CountDownLatch to fully control how long a thread blocks.
Reviewers: David Arthur <mumrah@gmail.com>, Matthias J. Sax <matthias@confluent.io>