Test the `StreamsGroupDescribeRequest` RPC and corresponding responses
for situations where
- `streams.version` not upgraded to 1
- `streams.version` enabled, multiple groups listening to the same
topic.
Reviewers: Lucas Brutschy <lucasbru@apache.org>
In Kafka Streams, configuration classes typically follow a fluent API
pattern to ensure a consistent and intuitive developer experience.
However, the current implementation of
`org.apache.kafka.streams.KafkaStreams$CloseOptions` deviates from this
convention by exposing a public constructor, breaking the uniformity
expected across the API.
To address this inconsistency, we propose introducing a new
`CloseOptions` class that adheres to the fluent API style, replacing the
existing implementation. The new class will retain the existing
`timeout(Duration)` and `leaveGroup(boolean)` methods but will enforce
fluent instantiation and configuration. Given the design shift, we will
not maintain backward compatibility with the current class.
This change aligns with the goal of standardizing configuration objects
across Kafka Streams, offering developers a more cohesive and
predictable API.
Reviewers: Bill Bejeck<bbejeck@apache.org>
Simplify the last known elr update logic. This way can make a more
robust logic.
Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
We can rewrite this class from scala to java and move to server-common
module. To maintain backward compatibility, we should keep the logger
name `state.change.logger`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
[In this PR](https://github.com/apache/kafka/pull/20334), we added some
validation checks for the connect config, such as ensuring that
`plugin.path` cannot be empty.
However, currently, Connect first loads the plugin and then creates the
configuration. Even if `plugin.path` is empty, it still attempts to load
the plugin first, and then throws an exception when creating the
configuration.
The approach should be to first create a configuration to validate that
the config meet the requirements, and then load the plugin only if the
validation passes. This allows for early detection of problems and
avoids unnecessary plugin loading processes.
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Filter out unsubscribed topics during reconciliation.
This eliminates the window where a consumer group assignment could
contain unsubscribed topics when a member unsubscribes from a topic
while it has unrevoked partitions.
We also apply filtering in a few other cases that would arise when
client-side assignors are implemented, since new assignments would no
longer be available immediately. This is important for mixed groups,
since clients on the classic protocol will rejoin if they receive a
topic in their assignment that is no longer in their subscription.
Regex subscriptions have a window where the regex is not resolved and we
cannot know which topics are part of the subscription. We opt to be
conservative and treat unresolved regexes as matching no topics.
The same change is applied to share groups, since the reconciliation
process is similar.
To gauge the performance impact of the change, we add a jmh benchmark.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Lianet Magran <lmagrans@confluent.io>, Sushant Mahajan <smahajan@confluent.io>, Dongnuo Lyu <dlyu@confluent.io>, David Jacot <djacot@confluent.io>
Remove stalling instance in EOSIntegrationTest, since it doesn’t matter
what it thinks what the assignment is but blocks the test with streams
group protocol
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
KIP-188 introduced MetricValueProvider, adding Measurable and Gauge as
its sub interfaces. However, this left a legacy
[issue](https://github.com/apache/kafka/pull/3705#discussion_r140830112):
move the value method from Gauge to the super interface,
MetricValueProvider.
This PR moves the value method from Gauge to MetricValueProvider and
provides a default implementation in Measurable. This unifies the
methods used by Gauge and Measurable to obtain monitoring values.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai
<chia7712@gmail.com>
`tools-api` is already in core module runtime path, so adding it to
`releaseTarGz` causes the resolution conflicts, which will be a fatal
error in gradle 9
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<frankvicky@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
This PR fixes a leak in StreamsMetricImpl not removing a
store-level-metric correctly, and thus leaking objects.
Reviewers: Eduwer Camacaro <eduwerc@gmail.com>, Bill Bejeck
<bbejeck@apache.org>
JIRA: KAFKA-18185
This is a follow-up of #17614 The patch is to remove the
`internal.leave.group.on.close` config.
Reviewers: Sophie Blee-Goldman <ableegoldman@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>, Bill Bejeck <bbejeck@apache.org>
- Move restore time calculation to ChangelogMetadata.
- Introduced a new interface to propagate the calculated value to the
stores to avoid modifications in the public interface.
Reviewers: Matthias J. Sax <matthias@confluent.io>
This change adds the metric ControllerEventManager::AvgIdleRatio which
measures the amount of time the controller spends blocked waiting for
events vs the amount of time spent processing events. A value of 1.0
means that the controller spent the entire interval blocked waiting for
events.
Reviewers: José Armando García Sancio <jsancio@apache.org>, Kevin Wu
<kevin.wu2412@gmail.com>, Alyssa Huang <ahuang@confluent.io>, TengYao
Chi <frankvicky@apache.org>, Reviewers: Chia-Ping Tsai
<chia7712@apache.org>
Update KIP-1147 changes (renaming --property to --formatter-property) in
the ops and streams documentation.
Reviewers: Andrew Schofield <aschofield@confluent.io>
## Summary
Quota test isn't testing anything on the client side, but rather
enforcing server-side quotas, so moving it out of the clients directory
into the core directory.
Reviewers: Lianet Magrans <lmagrans@confluent.io>
Improves the documentation of the clusterId field in AddRaftVoterOptions
and RemoveRaftVoterOptions.
The changes include:
1. Adding Javadoc to both addRaftVoter and removeRaftVoter methods to
explain the behavior of the optional clusterId.
2. Integration tests have been added to verify the correct behavior of
add and remove voter operations with and without clusterId, including
scenarios with inconsistent cluster ids.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Partially addresses KAFKA-15873. When filtering and sorting, we should
be applying the filter before the sort of topics. Order that
unauthorizedForDescribeTopicMetadata is added to not relevant as it is a
HashSet.
Reviewers: TaiJuWu <tjwu1217@gmail.com>, Calvin Liu
<caliu@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Repartition topic records should be purged up to the currently committed
offset once `repartition.purge.interval.ms` duration has passed.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Add count store and output topic; produce 1,000 records across 50 keys
to better exercise concurrency.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Problem: When AsyncConsumer is closing, CoordinatorRequestManager stops
looking for coordinator by returning EMPTY in poll() method when closing
flag is true. This prevents commitAsync() and other
coordinator-dependent operations from completing, causing close() to
hang until timeout.
Solution:
Modified the closing flag check in poll() method of
CommitRequestManager to be more targeted:
- When both coordinators are unknown and the consumer is closing, only
return EMPTY
- When this condition is met, proactively fail all pending commit
requests with CommitFailedException
- This allows coordinator lookup to continue when coordinator is
available during shutdown, while preventing indefinite hangs when
coordinator is unreachable
Reviewers: PoAn Yang <payang@apache.org>, Andrew Schofield
<aschofield@confluent.io>, TengYao Chi <kitingiao@gmail.com>, Kirk True
<kirk@kirktrue.pro>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Lan Ding
<isDing_L@163.com>, TaiJuWu <tjwu1217@gmail.com>, Ken Huang
<s7133700@gmail.com>, KuoChe <kuoche1712003@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
In [KAFKA-19359](https://issues.apache.org/jira/browse/KAFKA-19359), the
commons-beanutils transitive dependency was force bumped in the project
to avoid related CVEs. The commons-validator already has a new release,
which solves this problem:
https://github.com/apache/commons-validator/tags
The workaround could be deleted as part of the version bump.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The `logContext` attribute in `StreamsGroup` and `CoordinatorRuntime` is
not used anymore. This patch removes it.
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Rename org.apache.kafka.server:type=AssignmentsManager and
org.apache.kafka.storage.internals.log.RemoteStorageThreadPool metrics
for the consist, these metrics should be
- `kafka.log.remote:type=...`
- `kafka.server:type=...`
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
With changes to the consumer protocol, rebalance may not necessarily
result in a "stop the world". Thus, the method for calculating pause
time in `ConsumerPerformance#ConsumerPerfRebListener` needs to be
modified.
Stop time is only recorded if `assignedPartitions` is empty.
Reviewers: Andrew Schofield <aschofield@confluent.io>
Some sections are not very clear, and we need to update the
documentation.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Jun Rao
<junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
When toology not configured.
In the streams group heartbeat, we validate the topology set for the
group against the topic metadata, to generate the "configured topology"
which has a specific number of partitions for each topic.
In streams group describe, we use the configured topology to expose this
information to the user. However, we leave the topology information as
null in the streams group describe response, if the topology is not
configured. This triggers an IllegalStateException in the admin client
implementation.
Instead, we should expose the unconfigured topology when the configured
topology is not available, which will still expose useful information.
Reviewers: Matthias J. Sax <matthias@confluent.io>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Fixing flakiness seen on this test, where static consumers could not
join as expected after shutting down previous consumers with the same
instance ID, and logs showed `UnreleasedInstanceIdException`.
I expect the flakiness could happen if a consumer with instanceId1 is
closed but not effectively removed from the group due to leave group
fail/delayed (the leave group request is sent on a best effort, not
retried if fails or times out).
Fix by adding check to ensure the group is empty before attempting to
reuse the instance ID
Reviewers: Matthias J. Sax <matthias@confluent.io>
Just because a controller node sets --no-initial-controllers flag does
not mean it is necessarily running kraft.version=1. The more precise
meaning is that the controller node being formatted does not know what
kraft version the cluster should be in, and therefore it is only safe to
assume kraft.version=0. Only by setting
--standalone,--initial-controllers, or --no-initial-controllers
AND not specifying the controller.quorum.voters static config, is it
known kraft.version > 0.
For example, it is a valid configuration (although confusing) to run a
static quorum defined by controller.quorum.voters but have all the
controllers format with --no-initial-controllers. In this case,
specifying --no-initial-controllers alongside a metadata version that
does not support kraft.version=1 causes formatting to fail, which is
a regression.
Additionally, the formatter should not check the kraft.version against
the release version, since kraft.version does not actually depend on any
release version. It should only check the kraft.version against the
static voters config/format arguments.
This PR also cleans up the integration test framework to match the
semantics of formatting an actual cluster.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Kuan-Po Tseng
<brandboat@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, José
Armando García Sancio <jsancio@apache.org>
The given PR mostly fixes the order of arguments in `assertEquals()` for
the Clients module. Some minor cleanups were included with the same too.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
In the create topic request we send a CreateTopic request in an
Envelope, so we need to also unpack the response correctly
Reviewers: Lucas Brutschy <lucasbru@apache.org>
Remove invalid mentions of default values for group.id,
config.storage.topic, offset.storage.topic, status.storage.topic
Reviewers: Luke Chen <showuon@gmail.com>, Ken Huang <s7133700@gmail.com>
Improve `MetadataVersion.fromVersionString()` to take an
`enableUnstableFeature` flag, and enable `FeatureCommand` and
`StorageTool` to leverage the exception message from
`fromVersionString`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
*What*
In the implementation of KIP-1147 for console tools -
https://github.com/apache/kafka/pull/20479/files#diff-85b87c675a4b933e8e0e05c654d35d60e9cfd36cebe3331af825191b2cc688ee,
we missed adding unit tests for verifying the new
"`--formatter-property`" option.
Thanks to @Yunyung for pointing this out.
PR adds unit tests to both `ConsoleConsumerOptionsTest` and
`ConsoleShareConsumerOptionsTest` to verify the same.
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
During online downgrade, when a static member using the consumer
protocol which is also the last member using the consumer protocol is
replaced by another static member using the classic protocol with the
same instance id, the latter will take the assignment of the former and
an online downgrade will be triggered.
In the current implementation, if the replacing static member has a
different subscription, no rebalance will be triggered when the
downgrade happens. The patch checks whether the static member has
changed subscription and triggers a rebalance when it does.
Reviewers: Sean Quah <squah@confluent.io>, David Jacot
<djacot@confluent.io>
1. Fix doc of `inter.worker.signature.algorithm` config in
`DistributedConfig`.
2. Improve the style of the `inter.worker.verification.algorithms` and
`worker.unsync.backoff.ms` config.
3. `INTER_WORKER_KEY_TTL_MS_MS_DOC` -> `INTER_WORKER_KEY_TTL_MS_DOC`.
Reviewers: Mickael Maison <mickael.maison@gmail.com>