Task pairs is an optimization that is enabled in the current sticky task
assignor.
The basic idea is that every time we add a task A to a client that has
tasks B, C, we add pairs (A, B) and (A, C) to a global collection of
pairs. When adding a standby task, we then prioritize creating standby
tasks that create new task pairs. If this does not work, we fall back to
the original behavior.
The complexity of this optimization is fairly significant, and its
usefulness is questionable, the HighAvailabilityAssignor does not seem
to have such an optimization, and the absence of this optimization does
not seem to have caused any problems that I know of. I could not find
any what this optimization is actually trying to achieve.
A side effect of it is that we will sometimes avoid “small loops”, such
as
Node A: ActiveTask1, StandbyTask2 Node B: ActiveTask2,
StandbyTask1 Node C: ActiveTask3,
StandbyTask4 Node D: ActiveTask4,
StandbyTask3
So a small loop like this, worst case losing two nodes will cause 2
tasks to go down, so the assignor is preferring
Node A: ActiveTask1, StandbyTask4 Node B: ActiveTask2,
StandbyTask1 Node C: ActiveTask3,
StandbyTask2 Node D: ActiveTask4,
StandbyTask3
Which is a “big loop” assignment, where worst-case losing two nodes will
cause at most 1 task to be unavailable. However, this optimization seems
fairly niche, and also the current implementation does not seem to
implement it in a direct form, but a more relaxed constraint which
usually, does not always avoid small loops. So it remains unclear
whether this is really the intention behind the optimization. The
current unit tests of the StickyTaskAssignor pass even after removing
the optimization.
The pairs optimization has a worst-case quadratic space and time
complexity in the number of tasks, and make a lot of other optimizations
impossible, so I’d suggest we remove it. I don’t think, in its current
form, it is suitable to be implemented in a broker-side assignor. Note,
however, if we identify a useful effect of the code in the future, we
can work on finding an efficient algorithm that can bring the
optimization to our broker-side assignor.
This reduces the runtime of our worst case benchmark by 10x.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
While walking through the source code I confirmed that the broker checks
`replica.fetch.min.bytes` exactly the same way it checks
`fetch.min.bytes`, so this patch updates the wording for both config
keys.
Co-authored-by: yangxuze <xuze_yang@163.com>
Reviewers: Luke Chen <showuon@gmail.com>
Fixup PR Labels / fixup-pr-labels (needs-attention) (push) Has been cancelledDetails
Fixup PR Labels / fixup-pr-labels (triage) (push) Has been cancelledDetails
Docker Image CVE Scanner / scan_jvm (3.7.2) (push) Has been cancelledDetails
Docker Image CVE Scanner / scan_jvm (3.8.1) (push) Has been cancelledDetails
Docker Image CVE Scanner / scan_jvm (3.9.1) (push) Has been cancelledDetails
Docker Image CVE Scanner / scan_jvm (4.0.0) (push) Has been cancelledDetails
Docker Image CVE Scanner / scan_jvm (latest) (push) Has been cancelledDetails
Fixup PR Labels / needs-attention (push) Has been cancelledDetails
Flaky Test Report / Flaky Test Report (push) Has been cancelledDetails
Moving rollback out of lock, if persister returns a completed future for
write state then same data-plane-request-handler thread should not call
purgatory safeTryAndComplete while holding SharePartition's write lock.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Abhinav Dixit
<adixit@confluent.io>
This fixes the flaky
`DescribeStreamsGroupTest.testDescribeMultipleStreamsGroupWithMembersAndVerboseOptions()`,
which sometimes fails due to `ERROR stream-thread Missing source topics:
Source topics customInputTopic2 are missing`
Reviewers: Bill Bejeck <bbejeck@apache.org>
We failed the native image build and test workflow
[here](https://github.com/apache/kafka/actions/runs/16211393417/job/45772104969).
The failed messages are:
```
Exception in thread "main" java.lang.ExceptionInInitializerError at
org.apache.kafka.server.config.AbstractKafkaConfig.<clinit>(AbstractKafkaConfig.java:56)
at
java.base@21.0.2/java.lang.Class.ensureInitialized(DynamicHub.java:601)
at kafka.tools.StorageTool$.$anonfun$execute$1(StorageTool.scala:79) at
scala.Option.flatMap(Option.scala:283) at
kafka.tools.StorageTool$.execute(StorageTool.scala:79) at
kafka.tools.StorageTool$.main(StorageTool.scala:46) at
kafka.docker.KafkaDockerWrapper$.main(KafkaDockerWrapper.scala:57) at
kafka.docker.KafkaDockerWrapper.main(KafkaDockerWrapper.scala) at
java.base@21.0.2/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)
Caused by: org.apache.kafka.common.config.ConfigException: Invalid value
org.apache.kafka.common.security.oauthbearer.DefaultJwtRetriever for
configuration sasl.oauthbearer.jwt.retriever.class: Class
org.apache.kafka.common.security.oauthbearer.DefaultJwtRetriever could
not be found. at
org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:778)
at
org.apache.kafka.common.config.ConfigDef$ConfigKey.<init>(ConfigDef.java:1271)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:155)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:198)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:237)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:399)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:412)
at
org.apache.kafka.common.config.internals.BrokerSecurityConfigs.<clinit>(BrokerSecurityConfigs.java:197)
... 9 more
```
After investigation, I found we have to update the native image configs
to support the new code change as described
[here](https://github.com/apache/kafka/blob/trunk/docker/native/README.md#native-image-reachability-metadata).
This PR fixes this issue and verified that the same workflow for native
image passed
[here](https://github.com/apache/kafka/actions/runs/16215454627/job/45783738496).
The PR for v4.1.0 is https://github.com/apache/kafka/pull/20151 .
Reviewers: TengYao Chi <frankvicky@apache.org>
Now that Kafka support Java 17, this PR makes some changes in connect
module. The changes in this PR are limited to only some files. A future
PR(s) shall follow.
The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()
Sub modules targeted: api, basic-auth-extensions, file, json, mirror,
mirror-client
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
https://issues.apache.org/jira/browse/KAFKA-19485
**Bug :**
There is a bug in `ShareConsumeRequestManager` where we are adding
acknowledgements on initial `ShareSession` epoch even after checking for
it.
Added fix to only include acknowledgements in the request if we have to,
PR also adds the check at another point in the code where we could
potentially be sending such acknowledgements. One of the cases could be
when metadata is refreshed with empty topic IDs after a broker restart.
This means leader information would not be available on the node.
- Consumer subscribed to a partition whose leader was node-0.
- Broker restart happens and node-0 is elected leader again. Broker
starts a new `ShareSession`.
- Background thread sends a fetch request with **non-zero** epoch.
- Broker responds with `SHARE_SESSION_NOT_FOUND`.
- Client updates session epoch to 0 once it receives this error.
- Client updates metadata but receives empty metadata response. (Leader
unavailable)
- Application thread processing the previous fetch, completes and sends
acks to piggyback on next fetch.
- Next fetch will send the piggyback acknowledgements on the fetch for
previously subscribed partitions resulting in error from broker
("`Acknowledge data present on initial epoch`"). (Currently we attempt
to send even if leader is unavailable).
**Fix** : Add a check before sending out acknowledgments if we are on
an initial epoch.
Added unit test covering the above scenario.
Reviewers: Andrew Schofield <aschofield@confluent.io>
"the action will only execute" is incorrect, as the admin still sends
the request. The "if-not-exists" flag is actually used to swallow the
exception
Reviewers: TengYao Chi <frankvicky@apache.org>, Nick Guo
<lansg0504@gmail.com>, Ken Huang <s7133700@gmail.com>
In the old protocol, Kafka Streams used to throw a
`MissingSourceTopicException` when a source topic is missing. In the new
protocol, it doesn’t do that anymore, while only log the status that is
returned from the broker, which contains a status that indicates that a
source topic is missing.
This change:
1. Throws an `MissingSourceTopicException` when source topic is missing
2. Adds unit tests
3. Modifies integration tests to fit both old and new protocols
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Minor tidying up in AlterShareGroupOffsetsHandler based on review
comment
https://github.com/apache/kafka/pull/20049#discussion_r2192904850.
Reviewers: Jimmy Wang <wangzhiwang611@gmail.com>, Lan Ding
<isDing_L@163.com>, TaiJuWu <tjwu1217@gmail.com>, Ken Huang
<s7133700@gmail.com>, Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
## Changes
- The partitions == 0 branch has been moved from **waitForTopic** to
**waitTopicDeletion**.
## Reasons
- Clarify the responsibility of each helper method makes the test code
easier to reason by moving the partitions == 0 logic from
**waitForTopic** into a dedicated method **waitTopicDeletion**.
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TaiJuWu
<tjwu1217@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
- Fix testUncommittedDataNotConsumedFrequentSegmentRolls() and
testUncommittedDataNotConsumed(), which call createLog() but never close
the log when the tests complete.
- Move LogConcurrencyTest to the Storage module and rewrite it in Java.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The current assignor used in KIP-1071 is verbatim the assignor used on
the client-side. The assignor performance was not a big concern on the
client-side, and it seems some additional performance overhead has crept
in during the adaptation to the broker-side interfaces, so we expect it
to be too slow for groups of non-trivial size.
We base ourselves on the share-group parameters for these benchmarks:
- Up to 1000 members - Up to 100 topics - Up to 100
partitions per topic
Note, however, that the parameters influencing the Streams assignment
are different and more complicated compared to regular consumer groups /
share consumer groups. The assignment logic is independent of the number
of topics, but depends on the number of subtopologies. A subtopology may
read from multiple topics. We simplify this relationship by assuming one
topic per subtopology Members may be part of the same process or
separate processes. We introduce a parameter membersPerProcess to tune
two extreme configurations (1, 50).
We define 50% of the subtopologies to be stateful. Stateful
subtopologies get standby replicas assigned, if enabled. For example, if
we have 100 subtopologies with 100 partitions, we get 10,000 active
tasks and 5,000 standby tasks.
Reviewers: Bill Bejeck <bbejeck@apache.org>
Update the documentation to describe how to upgrade the kraft feature
version from 0 to 1.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Alyssa Huang
<ahuang@confluent.io>
The e2e tests currently cover version 2.1.0 and above. Thus, we can
remove `force_use_zk_connection` in
`kafka_acls_cmd_with_optional_security_settings`
In contrast, the `force_use_zk_connection` in
`kafka_topics_cmd_with_optional_security_settings` and
`kafka_configs_cmd_with_optional_security_settings` still needs to be
kept as `kafka-topics.sh` does not support `--bootstrap-server` in 2.1
and 2.2
e2e test result:
```
===========================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id: 2025-07-02--001
run time: 200 minutes 28.399 seconds
tests run: 90
passed: 90
flaky: 0
failed: 0
ignored: 0
===========================================
```
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
* While creating share group init requests in
`GroupMetadataManager.shareGroupHeartbeat`, we check for topics in
`initializing` state and if they are a certain amount of time old, we
issue retry requests for the same.
* The interval for considering initializing topics as old was based of
`offsetsCommitTimeoutMs` and was not configurable.
* In this PR, we remedy the situation by introducing a new config to
supply the value. The default is `30_000` which is a
heuristic based on the fact that the share coordinator `persister`
retries request with exponential backoff, with upper cap of `30_000`
seconds.
* Tests have been updated wherever applicable.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Lan Ding
<isDing_L@163.com>, TaiJuWu <tjwu1217@gmail.com>, Andrew Schofield
<aschofield@confluent.io>
### About
Within `ShareConsumerPerformance.java`, all the share consumers run with
within an executorService object and when we
perform `executorService.submit()`, we do not store this future and
exception would be recovered only when we do a future.get() in this
case. I believe this is a shortcoming
in `ShareConsumerPerformance.java` which needs to be improved.
Reviewers: Andrew Schofield <aschofield@confluent.io>
#5608 introduced a regression where the check for `targetOffset <
log.highWatermark`
to emit a `WARN` log was made incorrectly after truncating the log.
This change moves the check for `targetOffset < log.highWatermark` to
`UnifiedLog#truncateTo` and ensures we emit a `WARN` log on truncation
below the replica's HWM by both the `ReplicaFetcherThread` and
`ReplicaAlterLogDirsThread`
Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai
<chia7712@gmail.com>
Creates GetReplicaLogInfoRequest and GetReplicaLogInfoResponse RPCs
Information returned by these brokers will be used to aid
unclean-recovery by selecting longest logs.
Reviewers: Alyssa Huang <ahuang@confluent.io>, Calvin Liu <caliu@confluent.io>, Colin P. McCabe <cmccabe@apache.org>, TaiJuWu <tjwu1217@gmail.com>
https://issues.apache.org/jira/browse/KAFKA-19390
The AbstractIndex.resize() method does not release the old memory map
for both index and time index files. In some cases, Mixed GC may not
run for a long time, which can cause the broker to crash when the
vm.max_map_count limit is reached.
The root cause is that safeForceUnmap() is not being called on Linux
within resize(), so we have changed the code to unmap old mmap on all
operating systems.
The same problem was reported in
[KAFKA-7442](https://issues.apache.org/jira/browse/KAFKA-7442), but the
PR submitted at that time did not acquire all necessary locks around the
mmap accesses and was closed without fixing the issue.
Reviewers: Jun Rao <junrao@gmail.com>
This PR includes the following fixes:
- Streams CLI used to list and return the description of the first group
which is a bug. With this fix, it returns the descriptions of the groups
specified by the `--group` or `all-groups`. Integration test are added
to verify the fix.
- `timeoutOption` is missing in describe groups. This fix adds and tests
it with short timeout.
- `DescribeStreamsGroupsHandler` used to return an empty group in `DEAD`
state when the group id was not found, but with this fix, it throws
`GroupIdNotFoundException`
This is a very mechanical and obvious change that is making most
accessors in ProcessState constant time O(1), instead of linear time
O(n), by computing the collections and aggregations at insertion time,
instead of every time the value is accessed.
Since the accessors are used in deeply nested loops, this reduces the
runtime of our worst case benchmarks by ~14x.
Reviewers: Bill Bejeck <bbejeck@apache.org>
PlaintextConsumerTest should extend AbstractConsumerTest instead
BaseConsumerTest. Otherwise, those tests will be executed on both
`clients-integration-tests` and `core` (see
https://github.com/apache/kafka/pull/20081/files#r2190749592).
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
When building Streams I get this warning:
```
> Task :streams:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note:
<PATH>/kafka/streams/src/test/java/org/apache/kafka/streams/processor/internals/RecordCollectorTest.java
uses unchecked or unsafe operations.
```
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Use Java to rewrite PlaintextConsumerTest by new test infra and move it
to client-integration-tests module.
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This fixes librdkafka older than the recently released 2.11.0 with
Kerberos authentication and Apache Kafka 4.x.
Even though this is a bug in librdkafka, a key goal of KIP-896 is not to
break the popular client libraries listed in it. Adding back JoinGroup
v0 & v1 is a very small change and worth it from that perspective.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Migrate ControllerMutationQuotaManager to Java implementation and move
to server module, including ClientQuotaManager and associated files.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
When writing HTML, it's recommended to use the <code> element instead of
backticks for inline code formatting.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, TengYao Chi
<frankvicky@apache.org>
Flaky Test Report / Flaky Test Report (push) Has been cancelledDetails
- Metadata doesn't have the full view of topicNames to ids during
rebootstrap of client or when topic has been deleted/recreated. The
solution is to pass down topic id and stop trying to figure it out later
in the logic.
---------
Co-authored-by: Kirk True <kirk@kirktrue.pro>
When the group coordinator is processing a heartbeat from a share
consumer, it must decide whether the recompute the assignment. Part of
this decision hinges on whether the assigned partitions match the
partitions initialised by the share coordinator. However, when the set
of subscribed topics changes, there may be initialised partitions which
are not currently assigned. Topics which are not subscribed should be
omitted from the calculation about whether to recompute the assignment.
Co-authored-by: Sushant Mahajan <smahajan@confluent.io>
Reviewers: Lan Ding <53332773+DL1231@users.noreply.github.com>, Sushant
Mahajan <smahajan@confluent.io>, Apoorv Mittal
<apoorvmittal10@gmail.com>
Finalise the share group SimpleAssignor for heterogeneous subscriptions.
The assignor code is much more accurate about the number of partitions
assigned to each member, and the number of members assigned for each
partition. It eliminates the idea of hash-based assignment because that
has been shown to the unhelpful. The revised code is very much more
effective at assigning evenly as the number of members grows and shrinks
over time.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
Update the outdated Javadocs in Metrics.java. The `MetricName(String
name, String group)` constructor in MetricName.java was removed in
59b918ec2b
Minor typo fixes included.
Reviewers: Ken Huang <s7133700@gmail.com>, TengYao Chi
<kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
* There are instances where share group delete calls in group
coordinator (`onPartitionsDelete`, `deleteShareGroups`) where we lookup
the metadata image to fetch the topic id/partitions/topic name for a
topic name/id. However, there have been
instances where the looked up info was not found due to cluster being
under load or the underlying topic being deleted and information not
propagated correctly.
* To remedy the same, this PR adds checks to determine that topic is
indeed present in the image before the lookups thus preventing NPEs. The
problematic situations are logged.
* New tests have been added for `GroupMetadataManger` and
`GroupCoordinatorService`.
Reviewers: Andrew Schofield <aschofield@confluent.io>
While testing the code in https://github.com/apache/kafka/pull/19820, it
became clear that the error handling problems were due to the underlying
Admin API. This PR fixes the error handling for top-level errors in the
AlterShareGroupOffsets RPC.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Lan Ding
<isDing_L@163.com>, TaiJuWu <tjwu1217@gmail.com>
Estimate the fetch size for remote fetch to avoid to exceed the
`fetch.max.bytes` config. We don't want to query the remoteLogMetadata
during API handling, thus we assume the remote fetch can get
`max.partition.fetch.bytes` size. Tests added.
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
### About
`nextFetchOffset` function in `SharePartition` updates the fetch offsets
without considering batches/offsets which might be undergoing state
transition. This can cause problems in updating to the right fetch
offset.
### Testing
The new code added has been tested with the help of unit tests.
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
issue: https://github.com/apache/kafka/pull/19687/files#r2094574178
Why:
- To improve performance by avoiding redundant temporary collections and
repeated method calls.
- To make the utility more flexible for inputs from both Java and Scala.
What:
- Refactored `createResponseConfig` in `ConfigHelper.scala` by
overloading the method to accept both Java maps and `AbstractConfig`.
- Extracted helper functions to `ConfigHelperUtils` in the server
module.
Reviewers: Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu
<jhenyunghsu@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping
Tsai <chia7712@gmail.com>
Now that Kafka supports Java 17, this PR makes some changes in
jmh-benchmarks module. The changes mostly include:
- Collections.emptyList(), Collections.singletonList() and
Arrays.asList() are replaced with List.of()
- Collections.emptyMap() and Collections.singletonMap() are replaced
with Map.of()
- Collections.singleton() is replaced with Set.of()
Reviewers: Jhen-Yung Hsu <jhenyunghsu@gmail.com>, Ken Huang
<s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
* When a `ShareGroup*` record is replayed in group
metadata manager, there is a call to check if the group exists. If the
group does not exist - we are throwing an exception which is
unnecessary.
* In this PR, we have added check to ignore this exception.
* New test to validate the logic has been added.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Dongnuo Lyu
<139248811+dongnuo123@users.noreply.github.com>
* The CREATE_TOPIC request gets issued only when it is clear that the
topic does not exist in the cluster.
* When the request to describe the topic gets timed-out or any exception
thrown other than UnknownTopicOrPartitionException, then the same gets
re-thrown and the describe/create topic request gets retried in the next
iteration until the initializationRetryMaxTimeoutMs gets breached.
Fixes: https://issues.apache.org/jira/browse/KAFKA-19371
Reviewers: Luke Chen <showuon@gmail.com>, Kamal Chandraprakash
<kamal.chandraprakash@gmail.com>
---------
Co-authored-by: stroller.fu <stroller.fu@zoom.us>
We can remove the explicit counter for open iterators, and just use
size() on the set of open iterators we track anyway.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
It looks like we are checking for properties that are not guaranteed
under at_least_once, for example, exact counting (not allowing for
overcounting).
This change relaxes the validation constraint:
The TAGG topic contains effectively count-by-count results. So for
example, if we have the input without duplication
0 -> 1,2,3 we will get in TAGG 3 -> 1, since 1 key had 3 values.
with duplication:
0 -> 1,1,2,3 we will get in TAGG 4 -> 1, since 1 key had 4 values.
This makes the result difficult to compare. Since we run the smoke test
also with Exactly_Once, I propose to disable validation off TAGG under
ALOS.
Similarly, the topic AVG may overcount or undercount. The test case is
extremely similar to DIF, both performing a join and two streams, the
only difference being the mathematical operation performed, so we can
also disable this validation under ALOS with minimal loss of coverage.
Finally, the change fixes a bug that would throw a NPE when validation
of a windowed stream would fail.
Reviewers: Kirk True <kirk@kirktrue.pro>, Matthias J. Sax
<matthias@confluent.io>
* If a write request with higher state than seen so far for a
specific share partition arrives at the share coordinator, the code will
create a new share snapshot and also update the internal view of the
state epoch.
* For writes with higher leader epoch, the current records are updated
with that value as well.
* The above is not the expected behavior and only initialize RPCs should
set and alter the state epoch and read RPC should set the leader epoch.
* This PR rectifies the behavior.
* Few tests have been removed.
Reviewers: Andrew Schofield <aschofield@confluent.io>
* If we get an `UNKNOWN_TOPIC_OR_PARTITION` response from the
`ShareCoordinator` is could imply a transient issue where the metadata
image is not upto date.
* In this case it makes sense to retry the request to give time for data
to be available.
* In this PR, we are making the required change.
Reviewers: Andrew Schofield <aschofield@confluent.io>
- remove `threadNamePrefix` from `ReplicaManager` constructor
- update `BrokerServer` to use updated constructor
- remove `threadNamePrefix` from `ReplicaFetcherManager`
Reviewers: PoAn Yang <payang@apache.org>, TengYao Chi
<frankvicky@apache.org>
Adds documentation to support the OAuth additions from KIP-768 and
KIP-1139.
The existing documentation is heavily geared toward Kafka's support for
non-production OAuth usage. Since this mode is still supported, it
should not be removed. However, with the addition of the production
OAuth usage, the documentation is less than succinct because it has a
bit of a split personality issue.
StreamProducer may timeout in sendOffsetsToTransaction() or
commitTransaction() call. To distinguish both cases, we should make both
calls in individual try-catch blocks.
Reviewers: Bill Bejeck<bbejeck@apache.org>
Remove FlattenedIterator. It’s no longer used anywhere after
https://github.com/apache/kafka/pull/20037.
Reviewers: TengYao Chi <kitingiao@gmail.com>, Lan Ding
<isDing_L@163.com>, Chia-Ping Tsai <chia7712@gmail.com>