This PR introduces the new metadata loader and snapshot generator. For the time being, they are
only used by the controller, but a PR for the broker will come soon.
The new metadata loader supports adding and removing publishers dynamically. (In contrast, the old
loader only supported adding a single publisher.) It also passes along more information about each
new image that is published. This information can be found in the LogDeltaManifest and
SnapshotManifest classes.
The new snapshot generator replaces the previous logic for generating snapshots in
QuorumController.java and associated classes. The new generator is intended to be shared between
the broker and the controller, so it is decoupled from both.
There are a few small changes to the old snapshot generator in this PR. Specifically, we move the
batch processing time and batch size metrics out of BrokerMetadataListener.scala and into
BrokerServerMetrics.scala.
Finally, fix a case where we are using 'is' rather than '==' for a numeric comparison in
snapshot_test.py.
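As a reminder of why that matters, here is a minimal illustration of the pitfall (not the actual test code): `is` compares object identity, while `==` compares values, and for integers the two can disagree.
```
# Illustrative only -- not the code in snapshot_test.py.
a = int("1000")
b = int("1000")
print(a is b)   # False: two distinct int objects, even though the values match
print(a == b)   # True: the numeric comparison the test actually needs
```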
Reviewers: David Arthur <mumrah@gmail.com>
PR #12684 introduced a better format for timestamps in log
messages. Unfortunately, we missed that one of the modified
log messages is used by a system test for validation.
This PR adapts the system test to look for the modified
log message.
Reviewers: Divij Vaidya <diviv@amazon.com>, Matthias J. Sax <mjsax@apache.org>
The IBM Semeru JDK uses the OpenJDK security providers instead of the IBM security providers, so test for the OpenJDK classes first where possible, and otherwise test for Semeru in the java.runtime.name system property.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Bruno Cadonna <cadonna@apache.org>
The Streams upgrade system test inserted FK join code for every version of
the StreamsUpgradeTest except for the latest. Also, the original code
never switched on the `test.run_fk_join` flag for the target version of
the upgrade.
The effect was that FK join upgrades were not tested at all, since
no FK join code was executed after the bounce in the system test.
We introduce `extra_properties` in the system tests, which can be used
to pass any property to the upgrade driver and is intended to be reused
by system tests for switching flags on and off (e.g. for the state
restoration code); see the sketch below.
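As a rough sketch of the idea (the helper and base property names below, apart from `test.run_fk_join`, are hypothetical rather than the actual StreamsUpgradeTest code):
```
# Hypothetical sketch: forward arbitrary key=value pairs from the system test
# down to the upgrade driver, which reads them from a plain properties file,
# so future flags can be toggled without further plumbing changes.
extra_properties = {"test.run_fk_join": "true"}

def render_driver_properties(base_properties, extra_properties):
    props = dict(base_properties)
    props.update(extra_properties)
    return "\n".join(f"{k}={v}" for k, v in sorted(props.items()))

print(render_driver_properties({"upgrade.from": "2.8"}, extra_properties))
```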
Reviewers: Alex Sorokoumov <asorokoumov@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
For tests which use the console consumer service, we are currently enabling TRACE logging by default. I have seen some system tests where this produces GBs of logging. A better default is probably DEBUG.
Reviewers: José Armando García Sancio <jsancio@apache.org>
Adds the `metadata_quorum` parameter to the `@matrix(...)` annotation of many existing tests, so that they are run with both zookeeper and remote_kraft quorums.
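A parametrized test then looks roughly like this (the class and test names are placeholders; `quorum.zk` and `quorum.remote_kraft` are the kafkatest identifiers):
```
# Rough sketch of the annotation pattern; the test body is a placeholder.
from ducktape.mark import matrix
from ducktape.tests.test import Test
from kafkatest.services.kafka import quorum


class SomeExistingTest(Test):

    @matrix(metadata_quorum=[quorum.zk, quorum.remote_kraft])
    def test_something(self, metadata_quorum=quorum.zk):
        # ducktape runs the test once per listed value, so the same scenario
        # is exercised against both ZooKeeper and a remote KRaft quorum.
        pass
```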
Reviewers: Randall Hauch <rhauch@gmail.com>, Greg Harris <gharris1727@gmail.com>
The following files are available in https://s3-us-west-2.amazonaws.com/kafka-packages/:
kafka-streams-3.3.1-test.jar
kafka_2.12-3.3.1.tgz
kafka_2.13-3.3.1.tgz
Reviewers: Colin P. McCabe <cmccabe@apache.org>
KIP-373 added a "token requester" field to the output of kafka-delegation-tokens.sh. The system test was failing since it was not expecting this new field. This patch adds support for this field and improves the error output when the tool's output can't be parsed.
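A hedged sketch of the parsing approach (the column names and whitespace-separated layout below are assumptions, not the tool's exact output format): locate fields by header name so a newly added column does not break positional parsing, and fail with a descriptive error otherwise.
```
# Illustrative only: header-driven parsing that tolerates new columns such as
# the KIP-373 token requester field. Column names here are assumptions.
def parse_tokens(output, required=("TOKENID", "OWNER", "REQUESTER")):
    lines = [l for l in output.splitlines() if l.strip()]
    if not lines:
        raise ValueError("Cannot parse delegation token output: no lines found")
    header = lines[0].split()
    missing = [c for c in required if c not in header]
    if missing:
        raise ValueError(f"Cannot parse delegation token output, missing columns "
                         f"{missing}; header was: {lines[0]!r}")
    idx = {c: header.index(c) for c in required}
    # Assumes simple whitespace-separated columns; good enough for a sketch.
    return [{c: row.split()[idx[c]] for c in required} for row in lines[1:]]
```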
Reviewers: José Armando García Sancio <jsancio@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>
This patch removes test_kafka_version.py, which contains two tests at the moment. The first test verifies we can start a 0.8.2 cluster. The second verifies we can start a cluster with one node on 0.8.2 and another on the latest. These tests are covered in greater depth by upgrade_test.py and downgrade_test.py.
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
Also includes a minor quality-of-life improvement to clarify why some internal REST requests to a worker may fail while that worker is still starting up.
Reviewers: Tom Bentley <tbentley@redhat.com>, Luke Chen <showuon@gmail.com>, José Armando García Sancio <jsancio@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
This PR adds support to kafka-features.sh for the --metadata flag, as specified in KIP-778. This
flag makes it possible to upgrade to a new metadata version without consulting a table mapping
version names to short integers. Change --feature to use a key=value format.
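With the key=value form, each --feature argument pairs a feature name with a desired level; a minimal sketch of parsing that form (illustrative only, not the actual FeatureCommand code, and the level shown is just an example):
```
# Illustrative only: split a --feature argument of the form name=level.
def parse_feature(arg):
    name, sep, level = arg.partition("=")
    if not sep or not level.isdigit():
        raise ValueError(f"Expected name=level, got {arg!r}")
    return name, int(level)

print(parse_feature("metadata.version=6"))   # ('metadata.version', 6)
```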
FeatureCommandTest.scala: make most tests here true unit tests (that don't start brokers) in order
to improve test run time, and allow us to test more cases. For the integration test part, test both
KRaft and ZK-based clusters. Add support for mocking feature operations in MockAdminClient.java.
upgrade.html: add a section describing how the metadata.version should be upgraded in KRaft
clusters.
Add kraft_upgrade_test.py to test upgrades between KRaft versions.
Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>, José Armando García Sancio <jsancio@gmail.com>
Migrates Streams system tests to either use KRaft brokers or to use both KRaft and ZK in a testing matrix.
This skips tests which use various forms of Kafka versioning, since those seem to have issues with KRaft at the moment. Running these tests with KRaft will require a follow-up PR.
Reviewers: Guozhang Wang <guozhang@apache.org>, John Roesler <vvcephei@apache.org>
* Description
In this test, when the third proc joins, other rebalance scenarios can sometimes occur, such as a follow-up JoinGroup request arriving before the SyncGroup response is received by one of the procs, in which case the tasks previously assigned to that proc are lost during the new JoinGroup request. This can result in standby tasks being assigned as 3, 1, 2. This PR relaxes the expected assignment of 2, 2, 2 to a range of [1-3].
* Some background from Guozhang:
I talked to @hao Li offline and also inspected the code a bit, and the tl;dr is that I think the code logic is correct (i.e. we do not really have a bug), but we need to relax the test verification a little bit. The general idea behind the subscription info is that:
When a client joins the group, its subscription will try to encode all its current assigned active and standby tasks, which would be used as prev active and standby tasks by the assignor in order to achieve some stickiness.
When a client drops all its active/standby tasks due to errors, it does not actually report all-empty in its subscription; instead it checks its local state directory (you can see that from TaskManager#getTaskOffsetSums, which populates the taskOffsetSum). For an active task, its offset would be "-2", a.k.a. LATEST_OFFSET; for a standby task, its offset is an actual numerical value.
So in this case, proc2, which drops all its active and standby tasks, would still report all tasks that still have some local state, and since it previously owned all six tasks (three as active and three as standby), it would report all six as standbys. When that happens, the resulting assignment, as @hao Li verified, is indeed the uneven one.
So I think the actual "issue" here happens when proc2 is a bit late sending the sync-group request, the previous rebalance has already completed, and a follow-up rebalance has already been triggered; in that case, the resulting uneven assignment is indeed expected. Such a scenario, though not common, is still legitimate, since in practice all kinds of timing skew across instances can happen. So I think we should just relax our verification here, i.e. just make sure that each instance has at least one standby replica at the end, not exactly evenly as "2, 2, 2".
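A sketch of the relaxed verification described above (variable names and values are illustrative, not the exact test code):
```
# Illustrative only: instead of requiring an exact 2, 2, 2 split, accept any
# distribution where each instance holds between 1 and 3 standby tasks and
# the total number of standbys is unchanged.
standby_counts = {"proc1": 3, "proc2": 1, "proc3": 2}   # e.g. the uneven case

assert sum(standby_counts.values()) == 6, "total standby tasks should still be 6"
for proc, count in standby_counts.items():
    assert 1 <= count <= 3, f"{proc} has {count} standby tasks, expected 1-3"
```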
Reviewers: Suhas Satish <ssatish@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
KRaft mode will not support writing messages with an older message format (2.8) since the min supported IBP is 3.0 for KRaft. Testing support for reading older message formats will be covered by https://issues.apache.org/jira/browse/KAFKA-14056.
Reviewers: David Jacot <djacot@confluent.io>
* KAFKA-13930: Add 3.2.0 Streams upgrade system tests
Apache Kafka 3.2.0 was recently released. Now we need
to test upgrades from 3.2 to trunk in our system tests.
Reviewer: Bill Bejeck <bbejeck@apache.org>
I noticed that a system test using a KRaft cluster with 3 brokers but only 1 co-located controller did not force-kill the second and third broker after shutting down the first broker (the one with the controller). The issue was a floating point rounding error. This patch adjusts for the rounding error and also makes the logic work for an even number of controllers. A local run of `tests/kafkatest/sanity_checks/test_bounce.py` succeeded (and I manually increased the cluster size for the 1 co-located controller case and observed the correct kill behavior: the second and third brokers were force-killed as expected).
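For context, the majority check is robust when done with integer arithmetic; a sketch of the intended logic (illustrative only, not the exact test code):
```
# Illustrative only: decide whether force-kill is needed, i.e. whether the
# stopped controllers constitute a majority of the quorum. Integer arithmetic
# avoids rounding surprises and also handles even controller counts.
def majority_of_controllers_stopped(stopped, total):
    return 2 * stopped > total   # strict majority, no floating point involved

print(majority_of_controllers_stopped(1, 1))   # True: the single controller is down
print(majority_of_controllers_stopped(2, 4))   # False: half is not a majority
print(majority_of_controllers_stopped(3, 4))   # True
```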
Reviewers: Luke Chen <showuon@gmail.com>, José Armando García Sancio <jsancio@users.noreply.github.com>, David Jacot <djacot@confluent.io>
Replace ACL_AUTHORIZER attribute with ZK_ACL_AUTHORIZER in system tests. Required after the changes merged with https://github.com/apache/kafka/pull/12190.
Reviewers: David Jacot <djacot@confluent.io>
Apache Kafka 3.2.0 was recently released. Now we need
to test upgrades and compatibility with 3.2 in core system tests.
Reviewer: Jason Gustafson <jason@confluent.io>
Currently the only place we see controller/raft logging in system tests is `server-start-stdout-stderr.log` where they are mixed with all other logs. It is more convenient to send them to `controller.log` as we do for zk tests.
Reviewers: Kvicii <42023367+Kvicii@users.noreply.github.com>, David Jacot <djacot@confluent.io>
It is useful to collect the directory for `__cluster_metadata` in system tests. We use a separate directory from user partitions, so it must be configured separately.
Reviewers: David Arthur <mumrah@gmail.com>
When running our Connect system tests with JDK 10+, we hit the error
AttributeError: 'ClusterNode' object has no attribute 'version'
because util.py attempts to check the version variable for non-Kafka service objects.
Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>
Change `ZookeeperAuthorizerTest` to `AuthorizerTest` and add support for KRaft's `StandardAuthorizer` implementation.
Reviewers: David Jacot <djacot@confluent.io>
This patch fixes some strangeness and inconsistency in the messages written by `TransactionalMessageCopier` to stdout. Here are samples of the two messages.
Progress message:
```
{"consumed":33000,"stage":"ProcessLoop","totalProcessed":33000,"progress":"copier-0","time":"2022/04/24 05:40:31:649","remaining":333}
```
The `transactionalId` appears as the value of the `progress` key.
And a shutdown message:
```
{"consumed":33333,"shutdown_complete":"copier-0","totalProcessed":33333,"time":"2022/04/24 05:40:31:937","remaining":0}
```
This time the `transactionalId` appears as the value of the `shutdown_complete` key, and there is no `stage` key.
In this patch, we change the following:
1. Use a separate key for the `transactionalId`.
2. Drop the `progress` and `shutdown_complete` keys.
3. Use `stage=ShutdownComplete` in the shutdown message.
4. Modify `transactional_message_copier.py` system test service accordingly.
Reviewers: David Arthur <mumrah@gmail.com>
The second validation does not verify the second bounce because the verifiable producer and the verifiable consumer are stopped in `self.run_validation`. This means that the second `run_validation` just spits out the same information as the first one. Instead, we should just run the validation at the end.
Reviewers: Jason Gustafson <jason@confluent.io>
With this change we stop including, by default, the non-production-grade connectors that are meant to be used for demos and quick starts in the CLASSPATH and plugin.path of Connect deployments. The packages of these connectors will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion.
The changes have been tested through the system tests and the existing unit and integration tests.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
KRaft brokers always use the first controller listener, so if there is not also a co-located KRaft controller on the node, be sure to publish only one controller listener in `controller.listener.names`, even when the inter-controller listener name differs. System tests were failing because a broker-only config unnecessarily published a second entry in `controller.listener.names` without also publishing a mapping for it in `listener.security.protocol.map`. Removing the unnecessary entry in `controller.listener.names` solves the problem.
Reviewers: David Jacot <djacot@confluent.io>
#11490 changed log messages in AssignorConfiguration that are also used for
verification in the system test
StreamsCooperativeRebalanceUpgradeTest.test_upgrade_to_cooperative_rebalance.
This commit fixes the test and adds comments to the log messages
that point to the test that needs to be updated in case of
changes to the log messages.
Reviewers: John Roesler <vvcephei@apache.org>, Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>
This patch adds a new system test which exercises the shrink/expansion process performed by the partition leader. It does so by introducing a network partition which isolates a broker from the other brokers in the cluster but not from the KRaft controller/ZK.
Reviewers: Jason Gustafson <jason@confluent.io>
Replace deprecated exactly_once_beta with exactly_once_v2 in system tests.
Follow-up for #10870: there were still some system tests using the deprecated exactly_once_beta, so this PR updates them.
Reviewers: Bruno Cadonna <cadonna@apache.org>
Clearing under-replicated partitions helps ensure that partitions do not remain unavailable longer than necessary as brokers are rolled. This prevents flakiness due to transaction timeouts.
Reviewers: Luke Chen <showuon@gmail.com>, Ismael Juma <ismael@juma.me.uk>
This patch ensures that the transaction message copier is fully started in `start_node`. Without this, it is possible that `stop_node` is called before the process is started which results in not stopping it at all.
Reviewers: Jason Gustafson <jason@confluent.io>
We increased the default session timeout to 30s in KIP-735:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-735%3A+Increase+default+consumer+session+timeout
Since then, we are observing sporadic system test failures
due to rebalances taking longer than the test timeout.
Rather than increase the test wait times, we can just override
the session timeout to a value more appropriate in the testing
domain.
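Concretely, the override amounts to supplying a smaller `session.timeout.ms` (plus a proportional heartbeat interval) in the consumer configuration used by the tests; a sketch with illustrative values, not the exact ones applied:
```
# Illustrative only: shrink the consumer session timeout for the test domain
# instead of stretching every wait in the test to cover the 30s default.
test_consumer_overrides = {
    "session.timeout.ms": 10000,     # well below the 30s KIP-735 default
    "heartbeat.interval.ms": 3000,   # keep the usual ~1/3 ratio
}

properties_file = "\n".join(f"{k}={v}" for k, v in test_consumer_overrides.items())
print(properties_file)
```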
Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>
The Streams static membership test has failed several times due to hitting the Kafka shutdown timeout, but the logs showed that the shutdown did actually succeed after the 60-second timeout.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Walker Carlson <wcarlson@confluent.io>
Also adjusted the acceptable recovery lag to stabilize Streams tests.
Reviewers: Justine Olshan <jolshan@confluent.io>, Matthias J. Sax <mjsax@apache.org>, John Roesler <vvcephei@apache.org>
As of version 2.2.1, Kafka Streams uses message headers and
thus requires broker version 0.11.0 or newer.
Reviewers: John Roesler <john@confluent.io>, Ismael Juma <ismael@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>
This PR implements system tests in ducktape to test the ability of brokers and controllers to generate
and consume snapshots and catch up with the metadata log.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@gmail.com>
These failures were caused by a46b82bea9. Details for each test:
* message_format_change_test: use IBP 2.8 so that we can write in older message
formats.
* compatibility_test_new_broker_test_failures: fix down-conversion path to handle
empty record batches correctly. The record scan in the old code ensured that
empty record batches were never down-converted, which hid this bug.
* upgrade_test: set the IBP 2.8 when message format is < 0.11 to ensure we are
actually writing with the old message format even though the test was passing
without the change.
Verified with ducker that some variants of these tests failed without these changes
and passed with them. Also added a unit test for the down-conversion bug fix.
Reviewers: Jason Gustafson <jason@confluent.io>
Replace the unsupported "describe topic via ZooKeeper" operation with "describe users" to fix the system tests.
For the upgrade_test case where TLS support is not required, use list_acls instead.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Currently, we verify the startup of a Streams client by checking the transition
from REBALANCING to RUNNING and whether the client processed some records
in the EOS system test. However, if the Streams client only
has standby tasks assigned, as can happen when the client is catching
up using warm-up replicas, the client will never process
records within the timeout of the startup verification. Hence, the test
will fail although everything is fine. This commit fixes this by reducing
the time to the next probing rebalance and by increasing the number of
max warm-up replicas. That way, the client's catch-up and the
subsequent processing of records should still complete within the startup
verification timeout of the client.
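The relevant Streams configs are `probing.rebalance.interval.ms` and `max.warmup.replicas`; a sketch of the kind of override involved, with illustrative values rather than the exact ones used:
```
# Illustrative only: make warm-up catch-up finish well within the startup
# verification timeout by probing more often and warming up more tasks at once.
streams_test_overrides = {
    "probing.rebalance.interval.ms": 60000,  # probe sooner than the 10 min default
    "max.warmup.replicas": 6,                # allow more tasks to warm up in parallel
}

for key, value in streams_test_overrides.items():
    print(f"{key}={value}")
```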
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
This patch fixes the ZooKeeperAuthorizerTest for KRaft. The system test was not
configuring/reconfiguring/restarting the remote controller quorum with the correct security settings.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch adds a sanity-check bounce system test for the case where we have 3
co-located KRaft controllers and fixes the system test code so that this case
will pass by starting brokers in parallel by default instead of serially. We
now also send SIGKILL to any running KRaft broker or controller nodes for the
co-located case when a majority of co-located controllers have been stopped --
otherwise they do not shut down, and we spin for the 60-second timeout. Finally,
this patch adds the ability to specify that certain brokers should not be
started when starting the cluster, and then we can start those nodes at a later
time via the add_broker() method call; this is going to be helpful for KRaft
snapshot system testing.
We were not testing the 3 co-located KRaft controller case previously, and it
would not pass because the first Kafka node would never be considered started.
We were starting the Kafka nodes serially, and we decide that a node has
successfully started when it logs a particular message. This message is not
logged until the broker has identified the controller (i.e. the leader of the
KRaft quorum). There cannot be a leader until a majority of the KRaft quorum
has started, so with 3 co-located controllers the first node could never be
considered "started" by the system test.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Implements KIP-745 https://cwiki.apache.org/confluence/display/KAFKA/KIP-745%3A+Connect+API+to+restart+connector+and+tasks to change the Connect REST API so that it restarts a connector and its tasks as a whole.
Testing strategy
- [x] Unit tests added for all possible combinations of onlyFailed and includeTasks
- [x] Integration tests added for all possible combinations of onlyFailed and includeTasks
- [x] System tests for happy path
Reviewers: Randall Hauch <rhauch@gmail.com>, Diego Erdody <erdody@gmail.com>, Konstantine Karantasis <k.karantasis@gmail.com>
Update the ZooKeeper version to v3.6.3. This requires adding dropwizard
as a new dependency.
Also, add Kafka v2.8.0 to the ducktape system test image.
Reviewers: Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
The TestSecurityRollingUpgrade.test_disable_separate_interbroker_listener() system test had a design flaw: it was migrating inter-broker communication from a SASL_SSL listener to an SSL listener in one roll while immediately removing the SASL_SSL listener in that same roll. This requires two rolls because the existing SASL_SSL listener must remain available throughout the first roll so that unrolled brokers can continue to communicate with rolled brokers. This patch adds the second roll to this test and removes the original SASL_SSL listener on that second roll instead of the first one. The test was not failing all the time -- it was flaky.
The TestSecurityRollingUpgrade.test_rolling_upgrade_phase_two() system test was not explicitly identifying the SASL mechanism to enable on a third port when that port was using SASL but the client security protocol was not SASL-based. This resulted in an empty sasl.enabled.mechanisms config for that third port, so when the cluster was rolled to take advantage of the third port for inter-broker communication, rolled brokers could potentially be unable to communicate with other, unrolled brokers (similar to above, this resulted in a flaky test).
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This IT has been failing on trunk recently. Enabling EOS during the integration test
makes it easier to be sure that the test's assumptions are really true during verification
and should make the test more reliable.
I also noticed that in the actual system test file, we are using the deprecated property
name "beta" instead of "v2".
Reviewers: Boyang Chen <boyang@apache.org>
This PR includes adding the NamedTopology to the Subscription/AssignmentInfo, and to the StateDirectory so it can place NamedTopology tasks within the hierarchical structure with task directories under the NamedTopology parent dir.
Reviewers: Walker Carlson <wcarlson@confluent.io>, Guozhang Wang <guozhang@confluent.io>
This patch adds support for running the ZooKeeper-based
kafka.security.authorizer.AclAuthorizer with KRaft clusters. Set the
authorizer.class.name config as well as the zookeeper.connect config while also
setting the typical KRaft configs (node.id, process.roles, etc.), and the
cluster will use KRaft for metadata and ZooKeeper for ACL storage. A system
test that exercises the authorizer is included.
This patch also changes "Raft" to "KRaft" in several system test files. It also
fixes a bug where system test admin clients were unable to connect to a cluster
with broker credentials via the SSL security protocol when the broker was using
that for inter-broker communication and SASL for client communication.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
Implement a striped replica placement algorithm for KRaft. This also
means implementing rack awareness. Previously, KRaft just chose
replicas randomly in a non-rack-aware fashion. Also, allow replicas to
be placed on fenced brokers if there are no other choices. This was
specified in KIP-631 but previously not implemented.
Reviewers: Jun Rao <junrao@gmail.com>
The StreamsNamedRepartitionTopicTest system test did not have the @cluster annotation and was therefore taking up the entire cluster. For example, we see this in the log output:
kafkatest.tests.streams.streams_named_repartition_topic_test.StreamsNamedRepartitionTopicTest.test_upgrade_topology_with_named_repartition_topic is using entire cluster. It's possible this test has no associated cluster metadata.
This PR adds the missing annotation.
Reviewers: Bill Bejeck <bbejeck@apache.org>
Ensure security protocol and sasl mechanism are updated in the cached SecurityConfig during rolling system tests. Also explicitly indicate which SASL mechanisms we wish to expose during the tests.
Reviewers: David Arthur <mumrah@gmail.com>
These were deprecated in Apache Kafka 2.4 (released in December 2019) to be replaced
by `org.apache.kafka.server.authorizer.Authorizer` and `AclAuthorizer`.
As part of KIP-500, we will implement a new `Authorizer` implementation that relies
on a topic (potentially a KRaft topic) instead of `ZooKeeper`, so we should take the chance
to remove related tech debt in 3.0.
Details on the issues affecting the old Authorizer interface can be found in the KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-504+-+Add+new+Java+Authorizer+Interface
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Ron Dagostino <rdagostino@confluent.io>
* Standardize license headers in scala, python, and gradle files.
* Relocate copyright attribution to the NOTICE.
* Add a license header check to `spotless` for scala files.
Reviewers: Ewen Cheslack-Postava <ewencp@apache.org>, Matthias J. Sax <mjsax@apache.org>, A. Sophie Blee-Goldman <ableegoldman@apache.org>
`Self-managed` is also used in the context of Cloud vs on-prem and it can
be confusing.
`KRaft` is a cute combination of `Kafka Raft` and it's pronounced like `craft`
(as in `craftsmanship`).
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jose Sancio <jsancio@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Ron Dagostino <rdagostino@confluent.io>
KIP-500 is not particularly descriptive. I also tweaked the readme text a bit.
Tested that the readme for self-managed still works after these changes.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Ron Dagostino <rdagostino@confluent.io>, Jason Gustafson <jason@confluent.io>
Change the ducktape system tests to support both ZK and raft topic IDs. Clarifies that
the IBP check applies to the ZK code path.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Ron Dagostino <rdagostino@confluent.io>
ZooKeeper-related system tests in zookeeper_security_upgrade_test.py and
zookeeper_tls_test.py broke due to #10199. That patch changed the logic of
SecurityConfig.enabled_sasl_mechanisms() to only add the inter-broker SASL
mechanism when the inter-broker protocol was SASL_{PLAINTEXT,SSL}. The
inter-broker protocol is left to default to PLAINTEXT for the SecurityConfig
instance associated with Zookeeper since that value doesn't apply to ZooKeeper,
so the default inter-broker SASL mechanism of GSSAPI was not being added into
the set returned by enabled_sasl_mechanisms(). This is actually correct --
GSSAPI shouldn't be added since inter-broker communication is a Kafka concept
and doesn't apply to ZooKeeper. GSSAPI should be added when ZooKeeper uses it,
though -- which is the case in these tests. So the prior patch referred to
above uncovered a bug: we were relying on the default inter-broker SASL
mechanism to signal that Kerberos was being used by ZooKeeper even though the
inter-broker protocol has nothing to do with that determination in such cases.
This patch explicitly includes GSSAPI in the list of enabled SASL mechanisms
when SASL is enabled for use by ZooKeeper.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This test was failing when used with a Raft-based metadata quorum but succeeding with a
ZooKeeper-based quorum. This patch increases the consumers' session timeouts to 30 seconds,
which fixes the Raft case and also eliminates flakiness that has historically existed in the
Zookeeper case.
This patch also fixes a minor logging bug in RaftReplicaManager.endMetadataChangeDeferral() that
was discovered during the debugging of this issue, and it adds an extra logging statement in RaftReplicaManager.handleMetadataRecords() when a single metadata batch is applied to mirror
the same logging statement that occurs when deferred metadata changes are applied.
In the Raft system test case the consumer was sometimes receiving a METADATA response with just
1 alive broker, and then when that broker rolled the consumer wouldn't know about any alive nodes.
It would have to wait until the broker returned before it could reconnect, and by that time the group
coordinator on the second broker would have timed out the client and initiated a group rebalance. The
test explicitly checks that no rebalances occur, so the test would fail. It turns out that the reason why
the ZooKeeper configuration wasn't seeing rebalances was just plain luck. The brokers' metadata
caches in the ZooKeeper configuration show 1 alive broker even more frequently than the Raft
configuration does. If we tweak the metadata.max.age.ms value on the consumers we can easily
get the ZooKeeper test to fail, and in fact this system test has historically been flaky for the
ZooKeeper configuration. We can get the test to pass by setting session.timeout.ms=30000 (which
is longer than the roll time of any broker), or we can increase the broker count so that the client
never sees a METADATA response with just a single alive broker and therefore never loses contact
with the cluster for an extended period of time. We have plenty of system tests with 3+ brokers, so
we choose to keep this test with 2 brokers and increase the session timeout.
Reviewers: Ismael Juma <ismael@juma.me.uk>
The KIP-500 early access release will not support creating a partition with a manual
partition assignment that includes a broker that is not currently online. This patch disables
system tests for Raft-based metadata quorums where the test depends on this functionality
to pass.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Removed broker number checks for invalid replication factor when doing the forwarding, in order to reduce false alarms for clients.
Reviewers: Jason Gustafson <jason@confluent.io>
Fix some cases where we were erroneously using the configuration of the inter-broker
listener instead of the controller listener. Add the sasl.mechanism.controller.protocol
configuration key specified by KIP-631. Add some ducktape tests.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, David Arthur <mumrah@gmail.com>, Boyang Chen <boyang@confluent.io>
This patch updates request `listeners` tags to be in line with what the KIP-500 broker/controller support today. We will re-enable these APIs as needed once we have added the support.
I have also updated `ControllerApis` to use `ApiVersionManager` and simplified the envelope handling logic.
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Colin P. McCabe <cmccabe@apache.org>