BatchingStateRestoreCallback's default implemeantion of restore() lead
to waraning `FunctionalInterfaceMethodChanged`.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, PoAn Yang
<payang@apache.org>, Ken Huang <s7133700@gmail.com>, TengYao Chi
<frankvicky@apache.org>
AdminCommandFailedException and AdminOperationException are used only in
tools module, so move both into tools module.
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
1. remove org.apache.kafka.raft.OffsetAndEpoch
2. rewrite org.apache.kafka.server.common.OffsetAndEpoch by record
keyword
3. rename OffsetAndEpoch#leaderEpoch to OffsetAndEpoch#epoch
Reviewers: PoAn Yang <payang@apache.org>, Xuan-Zhang Gong
<gongxuanzhangmelt@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
The ShareVersion feature does not make any metadata version changes. As
a result, `SV_1` does not depend on any MV level, and no MV needs to be
defined for the preview of KIP-932.
Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Propose adding a new filter TransactionalIdPattern. This transaction ID pattern filter works as AND with the other transaction filters. Also, it is empowered with Re2j.
KIP: https://cwiki.apache.org/confluence/x/4gm9F
Reviewers: Justine Olshan <jolshan@confluent.io>, Ken Huang
<s7133700@gmail.com>, Kuan-Po Tseng <brandboat@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
For records which are automatically released as a result of closing a
share session normally, the delivery count should not be incremented.
These records were fetched but they were not actually delivered to the
client since the disposition of the delivery records is carried in the
ShareAcknowledge which closes the share session. Any remaining records
were not delivered, only fetched.
This PR releases the delivery count for records when closing a share
session normally.
Co-authored-by: d00791190 <dinglan6@huawei.com>
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
Up till now, the share sessions in the broker were only attempted to
evict when the share session cache was full and a new session was trying
to get registered. With the changes in this PR, whenever a share
consumer gets disconnected from the broker, the corresponding share
session would be evicted from the cache.
Note - `connectAndReceiveWithoutClosingSocket` has been introduced in
`GroupCoordinatorBaseRequestTest`. This method creates a socket
connection, sends the request, receives a response but does not close
the connection. Instead, these sockets are stored in a ListBuffer
`openSockets`, which are closed in tearDown method after each test is
run. Also, all the `connectAndReceive` calls in
`ShareFetchAcknowledgeRequestTest` have been replaced by
`connectAndReceiveWithoutClosingSocket`, because these tests depends
upon the persistence of the share sessions on the broker once
registered. But, with the new code introduced, as soon as the socket
connection is closed, a connection drop is assumed by the broker,
leading to session eviction.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
Test throws `NumberFormatException` and thus still passed as this
exception extends `IllegalArgumentException`. However, the test does
not verify what it is supposed to verify.
Piggybacking some code cleanup to all related files.
Reviewers: PoAn Yang <payang@apache.org>, Ken Huang <s7133700@gmail.com>, Bill Bejeck <bill@confluent.io>
This PR moves the computation of the "client list", which is the same
for all tasks, out of the loop, to avoid unnecessary re-computation.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Enable next system test with KIP-1071.
Also fixes the other KIP-1071 system tests, which now require enabling
the unstable `streams.version` feature.
Reviewers: Bill Bejeck <bbejeck@apache.org>
This PR is the last in series to implement the DeleteShareGroupOffsets
request. This PR includes the changes in ShareGroupCommand which
internally calls the admin api to delete the offsets. Now, any enduser
will be able to delete share group offsets for topics subscribed by a
share group using kafka-share-groups.sh --delete-offsets command.
Reviewers: Andrew Schofield <aschofield@confluent.io>
The test testShareGroupHeartbeatInitializeOnPartitionUpdate was flaky
earlier. The shareGroupStatePartitionMetadataRecord that is created
during heartbeat contains 2 topics to be initialized, but the order in
which they appear in the list is not deterministic. The test is changed
to simply see whether the contents of the record is correct instead of
directly comparing it with an expected record which may contains the
correct topics, but in some different order.
Reviewers: Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>
Adding the replicaId in the id mismatch log message to be able to
troubleshoot which node sent the erroneous VOTE request.
Reviewers: José Armando García Sancio <jsancio@apache.org>
This PR uses the v1 of the ShareVersion feature to enable share groups
for KIP-932.
Previously, there were two potential configs which could be used -
`group.share.enable=true` and including "share" in
`group.coordinator.rebalance.protocols`. After this PR, the first of
these is retained, but the second is not. Instead, the preferred switch
is the ShareVersion feature.
The `group.share.enable` config is temporarily retained for testing and
situations in which it is inconvenient to set the feature, but it should
really not be necessary, especially when we get to AK 4.2. The aim is to
remove this internal config at that point.
No tests should be setting `group.share.enable` any more, because they
can use the feature (which is enabled in test environments by default
because that's how features work). For tests which need to disable share
groups, they now set the share feature to v0. The majority of the code
changes were related to correct initialisation of the metadata cache in
tests now that a feature is used.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
It seems that IQv2EndpointToPartitionsIntegrationTest uses a
non-existent method to create `EmbeddedKafkaCluster`
Reviewers: Luke Chen <showuon@gmail.com>, PoAn Yang <payang@apache.org>,
Ken Huang <s7133700@gmail.com>
Add new StreamsGroupFeature, disabled by default, and add "streams" as
default value to `group.coordinator.rebalance.protocols`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot
<david.jacot@gmail.com>, Lucas Brutschy <lbrutschy@confluent.io>,
Justine Olshan <jolshan@confluent.io>, Andrew Schofield
<aschofield@confluent.io>, Jun Rao <jun@confluent.io>
There will be an update to the PluginMetrics#metricName method: the type
of the tags parameter will be changed
from Map to LinkedHashMap.
This change is necessary because the order of metric tags is important
1. If the tag order is inconsistent, identical metrics may be treated as
distinct ones by the metrics backend
2. KAFKA-18390 is updating metric naming to use LinkedHashMap. For
consistency, we should follow the same approach here.
Reviewers: TengYao Chi <frankvicky@apache.org>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, lllilllilllilili
This PR is a migration of the initial IQ support for KIP-1071 from the
feature branch to trunk. It includes a parameterized integration test
that expects the same results whether using either the classic or new
streams group protocol.
Note that this PR will deliver IQ information in each heartbeat
response. A follow-up PR will change that to be only sending IQ
information when assignments change.
Reviewers Lucas Brutschy <lucasbru@apache.org>
Reviewers: TengYao Chi <frankvicky@apache.org>, PoAn Yang <payang@apache.org>, Lianet Magrans <lmagrans@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
- Add support topicId in `ProduceRequest`/`ProduceResponse`. Topic name
and Topic Id will become `ignorable` following the footstep of
`FetchRequest`/`FetchResponse`
- ReplicaManager still look for `HostedPartition` using `TopicPartition`
and doesn't check topic id. This is an **[OPEN QUESTION]** if we should
address this in this pr or wait for
[KAFKA-16212](https://issues.apache.org/jira/browse/KAFKA-16212) as this
will update `ReplicaManager::getPartition` to use `TopicIdParittion`
once we update the cache. Other option is that we compare provided
`topicId` with `Partition` topic id and return `UNKNOW_TOPIC_ID` or
`UNKNOW_TOPIC_PARTITION` if we can't find partition with matched topic
id.
Reviewers: Jun Rao <jun@confluent.io>, Justine Olshan
<jolshan@confluent.io>
This is part of the client side changes required to enable 2PC for
KIP-939
New KafkaProducer.PreparedTxnState class is going to be defined as
following: ``` static public class PreparedTxnState { public String
toString(); public PreparedTxnState(String serializedState); public
PreparedTxnState(); } ``` The objects of this class can serialize to
/ deserialize from a string value and can be written to / read from a
database. The implementation is going to store producerId and epoch in
the format **producerId:epoch**
Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan
<jolshan@confluent.io>
Upon investigations for the failure of the system test
test_broker_failure, it was found there were situations where the
writing of records to the consumer_offsets topic was taking longer than
5 seconds (default value of offsets.commit.timeout.ms). Since the
persister requests of share partition initialization depends on the
completion of the record committing, due to the timeout, there were no
persister requests actually being sent. This PR increases the timeout
for this config to 20 seconds, as a temporary solution. The fix for this
is being tracked in the JIRA -
https://issues.apache.org/jira/browse/KAFKA-19204
Reviewers: Andrew Schofield <aschofield@confluent.io>
Enables KIP-1071 (`group.protocol=streams`) in the first streams system
test `streams_smoke_test.py`.
All tests using KIP-1071 cannot use `KafkaTest` anymore, since we need
to customize the broker configuration. The corresponding functionality
is added to `BaseStreamsTest`, which all streams tests will have to
extend from now on.
There are some left-overs from ZK in the tests that I copied from
'KafkaTest'. They need to be cleaned up, but this should be done in a
separate PR.
The tests related of OffsetFetch request/response in MessageTest are
incomprehensible. This patch rewrites them in a simpler way.
Reviewers: TengYao Chi <frankvicky@apache.org>
This is a follow-up of this #19433 This PR aims at adding the
`repartition source topics` to the output of `--describe` for streams
groups.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
While working on https://github.com/apache/kafka/pull/19515, I came to
the conclusion that the OffsetFetchRequest is quite messy and overall
too complicated. This patch rationalize the constructors.
OffsetFetchRequest has a single constructor accepting the
OffsetFetchRequestData. This will also simplify adding the topic ids.
All the changes are mechanical, replacing data structures by others.
Reviewers: PoAn Yang <payang@apache.org>, TengYao Chi <frankvicky@apache.org>, Lianet Magran <lmagrans@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
The vector is a synchronized collection, and in the case we don't need
to sync. Also, we can use `Collections.enumeration` to convert
collection to enumeration easily.
Reviewers: PoAn Yang <payang@apache.org>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This is a follow up PR for implementation of DeleteShareGroupOffsets
RPC. This PR adds the ShareGroupStatePartitionMetadata record to
__consumer__offsets topic to make sure the topic is removed from the
initializedTopics list. This PR also removes partitions from the request
and response schemas for DeleteShareGroupState RPC
Reviewers: Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>
Use Java to rewrite `PlaintextConsumerFetchTest` by new test infra and
move it to client-integration-tests module.
Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
This patches moves the static request validations from the
`GroupMetadataManager` to the `GroupCoordinatorService`. We already had
static validation in the service for other requests so it makes sense to
consolidate all the static validations at the same place. Moreover, it
also prevents faulty requests from unnecessarily using group
coordinator's resources.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Andrew Schofield <aschofield@confluent.io>
Commit 732ed06 changed the logic of handling shutdowns, but in parallel
commit 3fae785 had introduced a new unit test for checking how to shut
down, which was broken by the later commit.
Reviewers: David Jacot <djacot@confluent.io>
The remote storage reader thread pool use same count for both maximum
and core size. If users adjust the pool size larger than original value,
it throws `IllegalArgumentException`. Updated both value to fix the
issue.
---------
Signed-off-by: PoAn Yang <payang@apache.org>
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
Commit 369cc56 added a new parameter to newStreamsGroupEpochRecord, but
did not update the test that was added in 732ed06, breaking compilation.
Reviewers: David Jacot <djacot@confluent.io>
* Add MetadataHash field to ConsumerGroupMetadataValue,
ShareGroupMetadataValue, and StreamGroupMetadataValue.
* Add metadataHash field to
GroupCoordinatorRecordHelpers#newConsumerGroupEpochRecord,
GroupCoordinatorRecordHelpers#newShareGroupEpochRecord, and
StreamsCoordinatorRecordHelpers#newStreamsGroupEpochRecord.
* Add deprecated message to ConsumerGroupPartitionMetadataKey and
ConsumerGroupPartitionMetadataValue.
* ShareGroupPartitionMetadataKey / ShareGroupPartitionMetadataValue /
StreamGroupPartitionMetadataKey / StreamGroupPartitionMetadataValue will
be removed in next PR.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, David Jacot <djacot@confluent.io>
---------
Signed-off-by: PoAn Yang <payang@apache.org>
If the streams rebalance protocol is enabled in
StreamsUncaughtExceptionHandlerIntegrationTest, the streams application
does not shut down correctly upon error.
There are two causes for this. First, sometimes, the SHUTDOWN_APPLICATION
code only sent with the leave heartbeat, but that is not handled broker
side. Second, the SHUTDOWN_APPLICATION code wasn't properly handled
client-side at all.
Reviewers: Bruno Cadonna <cadonna@apache.org>, Bill Bejeck
<bill@confluent.io>, PoAn Yang <payang@apache.org>
This PR just resolves an NPE when a topic assigned in a share group is
deleted. The NPE is caused by code which uses the current metadata image
to convert from a topic ID to the topic name. For a deleted topic, there
is no longer any entry in the image. A future PR will properly handle
the topic deletion.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, PoAn Yang <payang@apache.org>