The output from the delete-offsets option of kafka-consumer-groups.sh
can be improved. For example, the column widths are excessive which
looks untidy, and the output messages can be improved.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
The PR fixes the issue when ShareAcknowledgements are piggybacked on
ShareFetch. The current default configuration in clients sets `batch
size` and `max fetch records` as per the `max.poll.records` config,
default 500. Which means all records in a single poll will be fetched
and acknowledged. Also the default configuration for inflight records in
a partition is 200. Which means prior fetch records has to be
acknowledged prior fetching another batch from share partition.
The piggybacked share fetch-acknowledgement calls from KafkaApis are
async and later the response is combined. If respective share fetch
starts waiting in purgatory because all inflight records are currently
full, hence when startOffset is moved as part of acknowledgement, then a
trigger should happen which should try completing any pending share
fetch requests in purgatory. Else the share fetch requests wait in
purgatory for timeout though records are available, which dips the share
fetch performance.
The regular fetch has a single criteria to land requests in purgatory,
which is min bytes criteria, hence any produce in respective topic
partition triggers to check any pending fetch requests. But share fetch
can wait in purgatory because of multiple reasons: 1) Min bytes 2)
Inflight records exhaustion 3) Share partition fetch lock competition.
The trigger already happens for 1 and current PR fixes 2. We will
investigate further if there should be any handling required for 3.
Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield
<aschofield@confluent.io>
In https://github.com/apache/kafka/pull/16578 , we tried to exclude both
`checker-qual` and `error_prone_annotations`, but when excluding
`error_prone_annotations`, the compilation failed. So in the end, we
only excluded `checker-qual` and shipped `error_prone_annotations.jar`
to users. In Kafka v4.0.0, thanks to jdk 8 removal, we upgraded caffeine
to the latest v3.1.8, instead of v2.x.x, and now, we can successfully
pass the compilation without error after excluding
`error_prone_annotations` from `caffeine`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ken Huang <s7133700@gmail.com>
The test is failing once in a while but there is not enough information
in the logs to determine the root cause. Adding more information, and
fixing thread resource leak.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
### About
11 of the test cases in `SharePartitionTest` have failed at least once
in the past 28 days.
https://develocity.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe%2FLondon&tests.container=kafka.server.share.SharePartitionTest
Observing the flakiness, they seem to be caused due to the usage of
`SystemTimer` for various acquisition lock timeout related tests. I have
replaced the usage of `SystemTimer` with `MockTimer` and also improved
the `MockTimer` API with regard to removing the timer task entries that
have already been cancelled.
Also, this has reduced the time taken to run `SharePartitionTest` from
~6 sec to ~1.5 sec
### Testing
The testing has been done with the help of already present unit tests in
Apache Kafka.
Reviewers: Andrew Schofield <aschofield@confluent.io>
### About
This PR removes the limitation in remote storage fetch for share groups
of only performing remote fetch for a single topic partition in a share
fetch request. With this PR, share groups can now fetch multiple remote
storage topic partitions in a single share fetch request.
### Testing
I have followed the [AK
documentation](https://kafka.apache.org/documentation/#tiered_storage_config_ex)
to test my code locally (by adopting `LocalTieredStorage.java`) and
verify with the help of logs that remote storage is happening for
multiple topic partitions in a single share fetch request. Also,
verified it with the help of unit tests.
Reviewers: Jun Rao <junrao@gmail.com>, Apoorv Mittal <apoorvmittal10@gmail.com>
Add a wait for cleaner thread shutdown in `testCleanerThreadShutdown` to
eliminate flakiness. After calling `cache.close()`, the test now uses
`TestUtils.waitForCondition` to poll until the background
“remote-log-index-cleaner” thread has fully exited before asserting that
no cleaner threads remain. This ensures the asynchronous shutdown always
completes before the final assertions.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
The PR do following:
1. Move MetadataVersionIntegrationTest to clients-integration-tests
module
2. rewrite to java from scala
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Handle the new `ShareSessionLimitReachedException` in
`ShareSessionHandler` in the client to reset the ShareSession. Added a
unit test verifying the change.
Reviewers: Andrew Schofield <aschofield@confluent.io>
replace all applicable `.stream().forEach()` in codebase with just
`.forEach()`.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Some client APIs may return `null` values in the map, but this behavior
isn’t documented in the JavaDoc. We should update the JavaDoc to include
these edge cases.
Reviewers: Kirk True <kirk@kirktrue.pro>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, PoAn Yang <payang@apache.org>, Chia-Ping Tsai
<chia7712@gmail.com>
In the return results of the methods beginningOffsets and endOffset, if
timeout == 0, then an empty Map should be returned uniformly instead of
in the form of <TopicPartition, null>
Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet
Magrans <lmagrans@confluent.io>
There is some redundant code that could be removed in `CloseOptions`.
This patch also adds unit tests for CloseOptions.
Reviewers: Ken Huang <s7133700@gmail.com>, PoAn Yang
<payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
* We have a 2 perpetual timer tasks in ShareCoordinatorService to do
internal topic cleanup and snapshot cold partitions respectively.
* There are a few info level logs being printed as part of the
procedures. These are introducing noise and are not absolutely
necessary.
* We also move a debug log to error for the prune job.
* To remedy the situation, this PR changes the log level from info to
debug.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield
<aschofield@confluent.io>
The setter of `maxPollRecords` wrongly checks the field instead of the argument.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>
* Currently in the share group heartbeat flow, if we see a TP subscribed
for the first time, we move that TP to initializing state in GC and let
the GC send a persister request to share group initialize the
aforementioned TP.
* However, if the coordinator runtime request for share group heartbeat
times out (maybe due to restarting/bad broker), the future completes
exceptionally resulting in persiter request to not be sent.
* Now, we are in a bad state since the TP is in initializing state in GC
but not persister initialized. Future heartbeats for the same share
partitions will also not help since we do not allow retrying persister
request for initializing TPs.
* This PR remedies the situation by allowing the same.
* A temporary fix to increase offset commit timeouts in system tests was
added to fix the issue. In this PR, we revert that change as well.
Reviewers: Andrew Schofield <aschofield@confluent.io>
BatchingStateRestoreCallback's default implemeantion of restore() lead
to waraning `FunctionalInterfaceMethodChanged`.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, PoAn Yang
<payang@apache.org>, Ken Huang <s7133700@gmail.com>, TengYao Chi
<frankvicky@apache.org>
AdminCommandFailedException and AdminOperationException are used only in
tools module, so move both into tools module.
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
1. remove org.apache.kafka.raft.OffsetAndEpoch
2. rewrite org.apache.kafka.server.common.OffsetAndEpoch by record
keyword
3. rename OffsetAndEpoch#leaderEpoch to OffsetAndEpoch#epoch
Reviewers: PoAn Yang <payang@apache.org>, Xuan-Zhang Gong
<gongxuanzhangmelt@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
The ShareVersion feature does not make any metadata version changes. As
a result, `SV_1` does not depend on any MV level, and no MV needs to be
defined for the preview of KIP-932.
Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Propose adding a new filter TransactionalIdPattern. This transaction ID pattern filter works as AND with the other transaction filters. Also, it is empowered with Re2j.
KIP: https://cwiki.apache.org/confluence/x/4gm9F
Reviewers: Justine Olshan <jolshan@confluent.io>, Ken Huang
<s7133700@gmail.com>, Kuan-Po Tseng <brandboat@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
For records which are automatically released as a result of closing a
share session normally, the delivery count should not be incremented.
These records were fetched but they were not actually delivered to the
client since the disposition of the delivery records is carried in the
ShareAcknowledge which closes the share session. Any remaining records
were not delivered, only fetched.
This PR releases the delivery count for records when closing a share
session normally.
Co-authored-by: d00791190 <dinglan6@huawei.com>
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
Up till now, the share sessions in the broker were only attempted to
evict when the share session cache was full and a new session was trying
to get registered. With the changes in this PR, whenever a share
consumer gets disconnected from the broker, the corresponding share
session would be evicted from the cache.
Note - `connectAndReceiveWithoutClosingSocket` has been introduced in
`GroupCoordinatorBaseRequestTest`. This method creates a socket
connection, sends the request, receives a response but does not close
the connection. Instead, these sockets are stored in a ListBuffer
`openSockets`, which are closed in tearDown method after each test is
run. Also, all the `connectAndReceive` calls in
`ShareFetchAcknowledgeRequestTest` have been replaced by
`connectAndReceiveWithoutClosingSocket`, because these tests depends
upon the persistence of the share sessions on the broker once
registered. But, with the new code introduced, as soon as the socket
connection is closed, a connection drop is assumed by the broker,
leading to session eviction.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
Test throws `NumberFormatException` and thus still passed as this
exception extends `IllegalArgumentException`. However, the test does
not verify what it is supposed to verify.
Piggybacking some code cleanup to all related files.
Reviewers: PoAn Yang <payang@apache.org>, Ken Huang <s7133700@gmail.com>, Bill Bejeck <bill@confluent.io>
This PR moves the computation of the "client list", which is the same
for all tasks, out of the loop, to avoid unnecessary re-computation.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Enable next system test with KIP-1071.
Also fixes the other KIP-1071 system tests, which now require enabling
the unstable `streams.version` feature.
Reviewers: Bill Bejeck <bbejeck@apache.org>
This PR is the last in series to implement the DeleteShareGroupOffsets
request. This PR includes the changes in ShareGroupCommand which
internally calls the admin api to delete the offsets. Now, any enduser
will be able to delete share group offsets for topics subscribed by a
share group using kafka-share-groups.sh --delete-offsets command.
Reviewers: Andrew Schofield <aschofield@confluent.io>
The test testShareGroupHeartbeatInitializeOnPartitionUpdate was flaky
earlier. The shareGroupStatePartitionMetadataRecord that is created
during heartbeat contains 2 topics to be initialized, but the order in
which they appear in the list is not deterministic. The test is changed
to simply see whether the contents of the record is correct instead of
directly comparing it with an expected record which may contains the
correct topics, but in some different order.
Reviewers: Sushant Mahajan <smahajan@confluent.io>, Andrew Schofield <aschofield@confluent.io>
Adding the replicaId in the id mismatch log message to be able to
troubleshoot which node sent the erroneous VOTE request.
Reviewers: José Armando García Sancio <jsancio@apache.org>
This PR uses the v1 of the ShareVersion feature to enable share groups
for KIP-932.
Previously, there were two potential configs which could be used -
`group.share.enable=true` and including "share" in
`group.coordinator.rebalance.protocols`. After this PR, the first of
these is retained, but the second is not. Instead, the preferred switch
is the ShareVersion feature.
The `group.share.enable` config is temporarily retained for testing and
situations in which it is inconvenient to set the feature, but it should
really not be necessary, especially when we get to AK 4.2. The aim is to
remove this internal config at that point.
No tests should be setting `group.share.enable` any more, because they
can use the feature (which is enabled in test environments by default
because that's how features work). For tests which need to disable share
groups, they now set the share feature to v0. The majority of the code
changes were related to correct initialisation of the metadata cache in
tests now that a feature is used.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
It seems that IQv2EndpointToPartitionsIntegrationTest uses a
non-existent method to create `EmbeddedKafkaCluster`
Reviewers: Luke Chen <showuon@gmail.com>, PoAn Yang <payang@apache.org>,
Ken Huang <s7133700@gmail.com>
Add new StreamsGroupFeature, disabled by default, and add "streams" as
default value to `group.coordinator.rebalance.protocols`.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot
<david.jacot@gmail.com>, Lucas Brutschy <lbrutschy@confluent.io>,
Justine Olshan <jolshan@confluent.io>, Andrew Schofield
<aschofield@confluent.io>, Jun Rao <jun@confluent.io>
There will be an update to the PluginMetrics#metricName method: the type
of the tags parameter will be changed
from Map to LinkedHashMap.
This change is necessary because the order of metric tags is important
1. If the tag order is inconsistent, identical metrics may be treated as
distinct ones by the metrics backend
2. KAFKA-18390 is updating metric naming to use LinkedHashMap. For
consistency, we should follow the same approach here.
Reviewers: TengYao Chi <frankvicky@apache.org>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, lllilllilllilili
This PR is a migration of the initial IQ support for KIP-1071 from the
feature branch to trunk. It includes a parameterized integration test
that expects the same results whether using either the classic or new
streams group protocol.
Note that this PR will deliver IQ information in each heartbeat
response. A follow-up PR will change that to be only sending IQ
information when assignments change.
Reviewers Lucas Brutschy <lucasbru@apache.org>
Reviewers: TengYao Chi <frankvicky@apache.org>, PoAn Yang <payang@apache.org>, Lianet Magrans <lmagrans@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
- Add support topicId in `ProduceRequest`/`ProduceResponse`. Topic name
and Topic Id will become `ignorable` following the footstep of
`FetchRequest`/`FetchResponse`
- ReplicaManager still look for `HostedPartition` using `TopicPartition`
and doesn't check topic id. This is an **[OPEN QUESTION]** if we should
address this in this pr or wait for
[KAFKA-16212](https://issues.apache.org/jira/browse/KAFKA-16212) as this
will update `ReplicaManager::getPartition` to use `TopicIdParittion`
once we update the cache. Other option is that we compare provided
`topicId` with `Partition` topic id and return `UNKNOW_TOPIC_ID` or
`UNKNOW_TOPIC_PARTITION` if we can't find partition with matched topic
id.
Reviewers: Jun Rao <jun@confluent.io>, Justine Olshan
<jolshan@confluent.io>
This is part of the client side changes required to enable 2PC for
KIP-939
New KafkaProducer.PreparedTxnState class is going to be defined as
following: ``` static public class PreparedTxnState { public String
toString(); public PreparedTxnState(String serializedState); public
PreparedTxnState(); } ``` The objects of this class can serialize to
/ deserialize from a string value and can be written to / read from a
database. The implementation is going to store producerId and epoch in
the format **producerId:epoch**
Reviewers: Artem Livshits <alivshits@confluent.io>, Justine Olshan
<jolshan@confluent.io>
Upon investigations for the failure of the system test
test_broker_failure, it was found there were situations where the
writing of records to the consumer_offsets topic was taking longer than
5 seconds (default value of offsets.commit.timeout.ms). Since the
persister requests of share partition initialization depends on the
completion of the record committing, due to the timeout, there were no
persister requests actually being sent. This PR increases the timeout
for this config to 20 seconds, as a temporary solution. The fix for this
is being tracked in the JIRA -
https://issues.apache.org/jira/browse/KAFKA-19204
Reviewers: Andrew Schofield <aschofield@confluent.io>
Enables KIP-1071 (`group.protocol=streams`) in the first streams system
test `streams_smoke_test.py`.
All tests using KIP-1071 cannot use `KafkaTest` anymore, since we need
to customize the broker configuration. The corresponding functionality
is added to `BaseStreamsTest`, which all streams tests will have to
extend from now on.
There are some left-overs from ZK in the tests that I copied from
'KafkaTest'. They need to be cleaned up, but this should be done in a
separate PR.