In an earlier PR, the flag --consumer.config in
VerifiableShareConsumer.java was changed to --command-config. This PR
makes the same change when VerifiableShareConsumer is used in
verifiable_share_consumer.py.
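A minimal, hypothetical sketch of the renamed option as it could appear in the service's command construction (the script name, class path, and helper are illustrative, not taken from the actual file):
```
# Illustrative only: the real verifiable_share_consumer.py builds a longer
# command line; the relevant change here is the renamed option.
def start_cmd(bootstrap_servers, config_file):
    cmd = "kafka-run-class.sh org.apache.kafka.tools.VerifiableShareConsumer"
    cmd += " --bootstrap-server %s" % bootstrap_servers
    cmd += " --command-config %s" % config_file  # previously --consumer.config
    return cmd
```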
Reviewers: Andrew Schofield <aschofield@confluent.io>
This patch is the first of a series of patches to remove the old group
coordinator. With the release of Apache Kafka 4.0, the so-called new
group coordinator is now the default and only available option.
This patch updates the system tests so they no longer run with the old
group coordinator, and removes the ability to use it.
Reviewers: Lianet Magrans <lmagrans@confluent.io>
This patch updates all the core system tests to include 4.0.0.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans
<lmagrans@confluent.io>
The upgrade test in question is not supported for AK 3.3.2 due to a
[known issue](https://issues.apache.org/jira/browse/KAFKA-18442).
A previous attempt at solving this left the `metadata.log.dir` empty,
which leads to the following crash log:
```
ERROR Exiting Kafka due to fatal exception (kafka.Kafka$)
org.apache.kafka.common.KafkaException: No `meta.properties` found in (have you run `kafka-storage.sh` to format the directory?)
at kafka.server.BrokerMetadataCheckpoint$.$anonfun$getBrokerMetadataAndOfflineDirs$2(BrokerMetadataCheckpoint.scala:172)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at kafka.server.BrokerMetadataCheckpoint$.getBrokerMetadataAndOfflineDirs(BrokerMetadataCheckpoint.scala:161)
at kafka.server.KafkaRaftServer$.initializeLogDirs(KafkaRaftServer.scala:184)
at kafka.server.KafkaRaftServer.<init>(KafkaRaftServer.scala:61)
at kafka.Kafka$.buildServer(Kafka.scala:79)
at kafka.Kafka$.main(Kafka.scala:87)
at kafka.Kafka.main(Kafka.scala)
```
In 3.1 we deprecated the eager rebalancing protocol and marked it for
removal in a later release. We aim to officially drop support and remove
the protocol from Streams in 4.0.
The effect of this PR is that it will no longer be possible to perform a
live upgrade of Kafka Streams directly to 4.0 from version 2.3 or below.
Users will have to go through a bridge release (any version from 2.4 to
3.9) instead.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Reduces the min ISR (min.insync.replicas) to 1 for the truncation test in order to skip the protection added by KIP-966.
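A minimal sketch of the idea, assuming the test passes topic configs through kafkatest's usual topics dictionary (topic name and replication settings are illustrative):
```
# Hedged example: a topic definition a ducktape test could hand to
# KafkaService so that min.insync.replicas=1 sidesteps the KIP-966 protection.
TOPIC = "test_topic"  # illustrative name

topics = {
    TOPIC: {
        "partitions": 3,
        "replication-factor": 3,
        "configs": {"min.insync.replicas": 1},
    }
}
```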
Reviewers: David Jacot <djacot@confluent.io>, Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
The root cause is 3dba3125e9, which removed the metadata versions older
than 3.3; as a result, this test fails when it uses metadata version 3.2
or 3.1.
Reviewers: David Jacot <djacot@confluent.io>
The root cause is 3dba3125e9, which removed the metadata versions older
than 3.3; as a result, this test fails when it uses metadata version 3.2
or 3.1.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>
This patch adds a test case to test_replication_with_broker_failure in replication_test.py which validates the scenario where a combined-mode broker/controller fails.
Reviewers: David Arthur <mumrah@gmail.com>
This patch marks IBP_4_0_IV3 as production ready for the Apache Kafka 4.0 release. It also introduces IBP_4_1_IV0 as the next development version.
Reviewers: Justine Olshan <jolshan@confluent.io>
This patch cleans up the places that should not use the metadata version (MV) to determine whether ELR is enabled, and marks 4.0IV1 as stable.
Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
The tests which set reassign_from_offset_zero=False have a setup phase that produces records with old timestamps to the topic and waits until they are cleaned up by retention, so that the main phase of the test can run on non-zero offsets. The setup phase did not wait long enough for the cleaning task to kick in, mainly because the scheduled task had not started yet: log.initial.task.delay.ms defaults to 30s. Reducing it to 5s helps stabilize the test. The patch also changes the sleep to 12s to give a bit more headroom.
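A minimal sketch of such an override, assuming the test sets broker properties through kafkatest's server_prop_overrides (the surrounding service setup is illustrative):
```
# Hedged sketch: lower log.initial.task.delay.ms so the scheduled cleanup
# task starts shortly after broker startup instead of after 30s.
from kafkatest.services.kafka import KafkaService

def make_kafka(test_context, num_nodes=3):
    return KafkaService(
        test_context,
        num_nodes=num_nodes,
        zk=None,
        server_prop_overrides=[["log.initial.task.delay.ms", "5000"]],
    )
```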
```
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.12.0
session_id: 2025-02-11--016
run time: 26 minutes 9.451 seconds
tests run: 12
passed: 12
flaky: 0
failed: 0
ignored: 0
================================================================================
```
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This patch renames kraft_upgrade_test.py to upgrade_test.py. This is enough to cover the old upgrade/downgrade tests.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Once a StreamThread receives its assignment, it closes the startup tasks. During the closing process, StandbyTask.closeClean() eventually calls StateManagerUtil.closeStateManager, which needs to lock the state directory; locking requires that the calling thread be the current owner of the lock. Since the main thread grabs the lock on startup but moves on without releasing it, we need to transfer ownership explicitly so that the stream thread can close the startup task and begin processing.
Reviewers: Matthias Sax <mjsax@apache.org>, Nick Telford
A prior commit, related to the move to log4j2, introduced a check on a node's version, but it caused an error:
AttributeError("'ClusterNode' object has no attribute 'version'"). This PR uses the get_version method from version.py, which checks whether the node has a version attribute, preventing the error.
Reviewers: Matthias Sax <mjsax@apache.org>
Due to an issue with handling folders in Kafka version 3.3.2 (see https://github.com/apache/kafka/pull/13130), this end-to-end test requires using a single folder for upgrade/downgrade scenarios involving 3.3.2.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
Related to KAFKA-18343:
Currently, the ps ax output can be truncated, which causes the Kafka process ID to be unavailable. This can be mitigated by replacing ps ax with jcmd (i.e. using java_pids in ducktape), which does not suffer from the truncation problem.
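A minimal sketch of the jcmd-based lookup, assuming ducktape's RemoteAccount.java_pids and a typical service pids() helper:
```
# Hedged example: find Kafka PIDs with jcmd (via ducktape's java_pids)
# instead of parsing `ps ax`, whose output can be truncated.
from ducktape.cluster.remoteaccount import RemoteCommandError

def kafka_pids(node, java_class_name="kafka.Kafka"):
    try:
        return node.account.java_pids(java_class_name)
    except (RemoteCommandError, ValueError):
        return []
```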
Reviewers: Justine Olshan <jolshan@confluent.io>
Adds transaction version 2 (TV2) to some of the system tests and marks TV2 as production ready.
Also fixes the defaultVersion test.
Reviewers: Jun Rao <jun@confluent.io>
This pull request replaces Log4j with Log4j2 across the entire project, including dependencies, configurations, and code. The notable changes are listed below:
1. Introduces Log4j2 in place of Log4j
2. Changes the configuration file format from properties to YAML
3. Adds warnings to notify users who are still using Log4j properties, encouraging them to transition to Log4j2 configurations
Co-authored-by: Lee Dongjin <dongjin@apache.org>
Reviewers: Luke Chen <showuon@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Added ShareConsumeBenchSpec and ShareConsumeBenchWorker, similar to ConsumeBenchSpec/ConsumeBenchWorker. This will help us run trogdor workloads for share consumers as well.
Also added a sample JSON workload running 5 share consumers.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
The org.reflections library has been removed, so the worker's initial logger is only "root". However, the e2e test needs a non-root logger to verify the dynamic logger functionality.
We can add a logger to connect_log4j.properties to fix this e2e test. For example:
log4j.logger.org.apache.kafka.clients.consumer.ConsumerConfig=ERROR
This makes admin/logger return two loggers: org.apache.kafka.clients.consumer.ConsumerConfig and root.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
- Remove some unused Zookeeper code
- Migrate group mode transactions, security rolling upgrade, and throttling tests to using KRaft
- Add KRaft downgrade tests to kraft_upgrade_test.py
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This commit adds AK 3.9 to the system tests on trunk.
Follow-up of #17797
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bruno Cadonna <cadonna@apache.org>
We plan to remove all ZooKeeper-related code in version 4.0. However, some old brokers in the end-to-end tests still require a ZooKeeper service, so we need to run ZooKeeper from a 3.x release instead of the dev branch.
Since version 3.9 is not available in the https://s3-us-west-2.amazonaws.com/kafka-packages repo, we can use version 3.8 for now.
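A minimal sketch, assuming ZookeeperService accepts a version argument and LATEST_3_8 is defined in kafkatest.version:
```
# Hedged example: pin the ZooKeeper service to the 3.8 release rather than
# the dev branch, since 3.9 is not published to the kafka-packages repo.
from kafkatest.services.zookeeper import ZookeeperService
from kafkatest.version import LATEST_3_8

def make_zk(test_context):
    return ZookeeperService(test_context, num_nodes=1, version=LATEST_3_8)
```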
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Added ShareRoundTripWorker.java similar to RoundTripWorker.java. This will start a producer and a share consumer on a single node. The share consumer reads back the messages produced by the producer.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
Migrates existing Connect tests that were using ZooKeeper to KRaft
instead, and cleans up some dead ZK code. In the broker compatibility
tests, versions 2.1-2.3 still need to use ZK.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This is part one of a multi-PR effort to convert the Kafka Streams system tests to KRaft. I decided to break the changes into multiple PRs to reduce the review load.
Reviewers: Matthias Sax <mjsax@apache.org>
This PR removes ZK test parameterizations from ducktape by:
- Removing zk from quorum.all_non_upgrade
- Removing quorum.zk from @matrix and @parametrize annotations
- Changing usages of quorum.all to quorum.all_kraft (see the sketch after this list)
- Deleting message_format_change_test.py
The default metadata_quorum value still needs to be changed to KRaft rather than ZK, but this will be done in a follow-up PR.
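A minimal sketch of the resulting parameterization, assuming a typical ducktape test layout; the test class and method are hypothetical:
```
# Hedged example: KRaft-only parameterization after dropping quorum.zk.
from ducktape.mark import matrix
from ducktape.tests.test import Test
from kafkatest.services.kafka import quorum

class ExampleKRaftTest(Test):
    @matrix(metadata_quorum=quorum.all_kraft)  # formerly quorum.all, which included quorum.zk
    def test_example(self, metadata_quorum):
        pass  # test body omitted
```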
Reviewers: Kirk True <kirk@kirktrue.pro>, Colin P. McCabe <cmccabe@apache.org>
KIP-853 adds support for dynamic KRaft quorums. This means that the quorum topology is
no longer statically determined by the controller.quorum.voters configuration. Instead, it
is contained in the storage directories of each controller and broker.
Users of dynamic quorums must format at least one controller storage directory with either
the --initial-controllers or --standalone flags. If they fail to do this, no quorum can be
established. This PR changes the storage tool to warn about the case where a KIP-853 flag has
not been supplied to format a KIP-853 controller. (Note that broker storage directories
can continue to be formatted without a KIP-853 flag.)
There are cases where we don't want to specify initial voters when formatting a controller. One
example is where we format a single controller with --standalone, and then dynamically add 4
more controllers with no initial topology. In this case, we want the 4 later controllers to grab
the quorum topology from the initial one. To support this case, this PR adds the
--no-initial-controllers flag.
Reviewers: José Armando García Sancio <jsancio@apache.org>, Federico Valeri <fvaleri@redhat.com>
This change fixes a few issues.
KAFKA-17608; KRaft controller crashes when active controller is removed
When a control batch is committed, the quorum controller currently increases the last stable offset but fails to create a snapshot for that offset. This causes an issue if the quorum controller renounces and needs to revert to that offset (which has no snapshot present). Since control batches are no-ops for the quorum controller, it does not need to update its offsets for control records; we now skip the handle-commit logic for control batches.
KAFKA-17604; Describe quorum output missing added voters endpoints
Describe quorum output misses the endpoints of voters that were added via AddRaftVoter. This is due to a bug in LeaderState's updateVoterAndObserverStates, which pulls replica state from the observer states map (which does not include endpoints). The fix is to populate the endpoints from the lastVoterSet passed into the method.
Reviewers: José Armando García Sancio <jsancio@apache.org>, Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@apache.org>