Adds the `metadata_quorum` parameter to the `@matrix(...)` annotation to many existing tests, so that they are run with both zookeeper and remote_kraft nodes.
Reviewers: Randall Hauch <rhauch@gmail.com>, Greg Harris <gharris1727@gmail.com>
KIP-373 added a "token requester" field to the output of kafka-delegation-tokens.sh. The system test was failing since it was not expecting this new field. This patch adds support for this field and improves the error output if we can't parse.
Reviewers: José Armando García Sancio <jsancio@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>
Migrates Streams sustem tests to either use kraft brokers or to use both kraft and zk in a testing matrix.
This skips tests which use various forms of Kafka versioning since those seem to have issues with KRaft at the moment. Running these tests with KRaft will require a followup PR.
Reviewers: Guozhang Wang <guozhang@apache.org>, John Roesler <vvcephei@apache.org>
This patch removes test_kafka_version.py, which contains two tests at the moment. The first test verifies we can start a 0.8.2 cluster. The second verifies we can start a cluster with one node on 0.8.2 and another on the latest. These test are covered in greater depth by upgrade_test.py and downgrade_test.py.
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
Also includes a minor quality-of-life improvement to clarify why some internal REST requests to workers may fail while that worker is still starting up.
Reviewers: Tom Bentley <tbentley@redhat.com>, Luke Chen <showuon@gmail.com>, José Armando García Sancio <jsancio@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
This PR adds support to kafka-features.sh for the --metadata flag, as specified in KIP-778. This
flag makes it possible to upgrade to a new metadata version without consulting a table mapping
version names to short integers. Change --feature to use a key=value format.
FeatureCommandTest.scala: make most tests here true unit tests (that don't start brokers) in order
to improve test run time, and allow us to test more cases. For the integration test part, test both
KRaft and ZK-based clusters. Add support for mocking feature operations in MockAdminClient.java.
upgrade.html: add a section describing how the metadata.version should be upgraded in KRaft
clusters.
Add kraft_upgrade_test.py to test upgrades between KRaft versions.
Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>, José Armando García Sancio <jsancio@gmail.com>
KRaft mode will not support writing messages with an older message format (2.8) since the min supported IBP is 3.0 for KRaft. Testing support for reading older message formats will be covered by https://issues.apache.org/jira/browse/KAFKA-14056.
Reviewers: David Jacot <djacot@confluent.io>
* KAFKA-13930: Add 3.2.0 Streams upgrade system tests
Apache Kafka 3.2.0 was recently released. Now we need
to test upgrades from 3.2 to trunk in our system tests.
Reviewer: Bill Bejeck <bbejeck@apache.org>
I noticed that a system test using a KRaft cluster with 3 brokers but only 1 co-located controller did not force-kill the second and third broker after shutting down the first broker (the one with the controller). The issue was a floating point rounding error. This patch adjusts for the rounding error and also makes the logic work for an even number of controllers. A local run of `tests/kafkatest/sanity_checks/test_bounce.py` succeeded (and I manually increased the cluster size for the 1 co-located controller case and observed the correct kill behavior: the second and third brokers were force-killed as expected).
Reviewers: Luke Chen <showuon@gmail.com>, José Armando García Sancio <jsancio@users.noreply.github.com>, David Jacot <djacot@confluent.io>
Replace ACL_AUTHORIZER attribute with ZK_ACL_AUTHORIZER in system tests. Required after the changes merged with https://github.com/apache/kafka/pull/12190.
Reviewers: David Jacot <djacot@confluent.io>
Apache Kafka 3.2.0 was recently released. Now we need
to test upgrades and compatibility with 3.2 in core system tests.
Reviewer: Jason Gustafson <jason@confluent.io>
Currently the only place we see controller/raft logging in system tests is `server-start-stdout-stderr.log` where they are mixed with all other logs. It is more convenient to send them to `controller.log` as we do for zk tests.
Reviewers: Kvicii <42023367+Kvicii@users.noreply.github.com>, David Jacot <djacot@confluent.io>
It is useful to collect the directory for `__cluster_metadata` in system tests. We use a separate directory from user partitions, so it must be configured separately.
Reviewers: David Arthur <mumrah@gmail.com>
When running our Connect system tests with JDK 10+, we hit the error
AttributeError: 'ClusterNode' object has no attribute 'version'
because util.py attempts to check the version variable for non-Kafka service objects.
Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>
Change `ZookeeperAuthorizerTest` to `AuthorizerTest` and add support for KRaft's `StandardAuthorizer` implementation.
Reviewers: David Jacot <djacot@confluent.io>
This patch fixes some strangeness and inconsistency in the messages written by `TransactionalMessageCopier` to stdout. Here is a sample of two messages.
Progress message:
```
{"consumed":33000,"stage":"ProcessLoop","totalProcessed":33000,"progress":"copier-0","time":"2022/04/24 05:40:31:649","remaining":333}
```
The `transactionalId` is set to the value of the `progress` key.
And a shutdown message:
```
{"consumed":33333,"shutdown_complete":"copier-0","totalProcessed":33333,"time":"2022/04/24 05:40:31:937","remaining":0}
```
The `transactionalId` this time is set to the `shutdown_complete` key and there is no `stage` key.
In this patch, we change the following:
1. Use a separate key for the `transactionalId`.
2. Drop the `progress` and `shutdown_complete` keys.
3. Use `stage=ShutdownComplete` in the shutdown message.
4. Modify `transactional_message_copier.py` system test service accordingly.
Reviewers: David Arthur <mumrah@gmail.com>
The second validation does not verify the second bounce because the verified producer and the verified consumer are stopped in `self.run_validation`. This means that the second `run_validation` just spit out the same information as the first one. Instead, we should just run the validation at the end.
Reviewers: Jason Gustafson <jason@confluent.io>
With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion.
The changes have been tested through the system tests and the existing unit and integration tests.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
KRaft brokers always use the first controller listener, so if there is not also a colocated KRaft controller on the node be sure to only publish one controller listener in `controller.listener.names` even when the inter-controller listener name differs. System tests were failing due to unnecessarily publishing a second entry in `controller.listener.names` for a broker-only config and not also publishing a mapping for it in `listener.security.protocol.map`. Removing the unnecessary entry in `controller.listener.names` solves the problem.
Reviewers: David Jacot <djacot@confluent.io>
Log messages were changed in the AssignorConfiguration (#11490) that are
also used for verification in system test
StreamsCooperativeRebalanceUpgradeTest.test_upgrade_to_cooperative_rebalance.
This commit fixes the test and adds comments to the log messages
that point to the test that needs to be updated in case of
changes to the log messages.
Reviewers: John Roesler <vvcephei@apache.org>, Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>
This patch adds a new system test which exercises the shrining/expansion process of the partition leader. It does so by introducing a network partition which isolates a broker from the other brokers in the cluster but not from KRaft Controller/ZK.
Reviewers: Jason Gustafson <jason@confluent.io>
Replace deprecated exactly_once_beta with exactly_once_v2 in system tests.
Follow up for #10870, found out there are still some system tests using the deprecated exactly_once_beta. This PR updates them.
Reviewers: Bruno Cadonna <cadonna@apache.org>
Clearing under-replicated-partitions helps ensure that partitions do not become unavailable longer than necessary as brokers are rolled. This prevents flakiness due to transaction timeouts.
Reviewers: Luke Chen <showuon@gmail.com>, Ismael Juma <ismael@juma.me.uk>
This patch ensures that the transaction message copier is fully started in `start_node`. Without this, it is possible that `stop_node` is called before the process is started which results in not stopping it at all.
Reviewers: Jason Gustafson <jason@confluent.io>
We increased the default session timeout to 30s in KIP-735:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-735%3A+Increase+default+consumer+session+timeout
Since then, we are observing sporadic system test failures
due to rebalances taking longer than the test timeout.
Rather than increase the test wait times, we can just override
the session timeout to a value more appropriate in the testing
domain.
Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>
The streams static membership test has failed several times due to hitting the Kafka shutdown timeout, but the logs were showing that the shutdown did actually succeed after the 60 second timeout.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Walker Carlson <wcarlson@confluent.io>
Also adjusted the acceptable recovery lag to stabilize Streams tests.
Reviewers: Justine Olshan <jolshan@confluent.io>, Matthias J. Sax <mjsax@apache.org>, John Roesler <vvcephei@apache.org>
As of version 2.2.1 , Kafka Streams uses message headers and
thus requires broker version 0.11.0 or newer.
Reviewers: John Roesler <john@confluent.io>, Ismael Juma <ismael@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>