933 Commits

Author SHA1 Message Date
Ken Huang 2ee7e4d22c
KAFKA-18152 add 0.11, 1.0, 1.1, and 2.0 streams dependencies to dockerfile (#18025)
Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-12-04 19:24:12 +08:00
Peter Lee b295318796
KAFKA-18145 Fix failed e2e ConnectDistributedTest.test_dynamic_logging (#18023)
org.reflections has been removed, so the worker's only initial logger is "root". However, the e2e test needs a non-root logger to verify dynamic logging.

We can add a logger to connect_log4j.properties to fix this e2e. For example:

log4j.logger.org.apache.kafka.clients.consumer.ConsumerConfig=ERROR
This makes admin/logger return two loggers: org.apache.kafka.clients.consumer.ConsumerConfig and root.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-12-04 19:22:45 +08:00
PoAn Yang 1f3f03579c
KAFKA-17979 Change [pytest] to [tool:pytest] in setup.cfg file (#17740)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-29 02:26:58 +08:00
PoAn Yang d1952e8542
KAFKA-18045 Add 0.11, 1.0, 1.1, and 2.0 back to streams_upgrade_test.py (#17876)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-25 21:34:36 +08:00
kevin-wu24 38aca3a045
KAFKA-17917: Convert Kafka core system tests to use KRaft (#17847)
- Remove some unused Zookeeper code

- Migrate group mode transactions, security rolling upgrade, and throttling tests to using KRaft

- Add KRaft downgrade tests to kraft_upgrade_test.py

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-11-21 13:40:49 -08:00
Ken Huang fde6ae1500
KAFKA-18029 remove the `kraft.version=1` from kafka.py (#17838)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-18 00:48:19 +08:00
Ken Huang d8bfbb7d1a
KAFKA-17791: Dockerfile should use requirements.txt for dependencies (#17542)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-11-15 17:52:59 +01:00
TengYao Chi 84fe66827d
KAFKA-18006: Add 3.9.0 to end-to-end test (streams) (#17800)
This commit adds AK 3.9 to the system tests on trunk.
Follow-up of #17797

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bruno Cadonna <cadonna@apache.org>
2024-11-15 14:58:24 +01:00
PoAn Yang ed9cb08dfe
KAFKA-17977 Remove new_consumer from E2E (#17798)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-15 16:26:26 +08:00
TengYao Chi e9cd9c9811
KAFKA-18006 Add 3.9.0 to end-to-end test (core, client) (#17797)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-15 00:24:24 +08:00
Ken Huang 6147a311bf
KAFKA-17888 Upgrade ZooKeeper version from 3.4.9 to 3.5.7 to avoid ZOOKEEPER-3779, which can't run under JDK 11. (#17625)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-14 19:20:51 +08:00
PoAn Yang 1c3ee7fe60
KAFKA-18004 Use version 3.8 to run the ZooKeeper service for end-to-end tests (#17790)
We plan to remove all ZooKeeper-related code in version 4.0. However, some old brokers in the end-to-end tests still require ZooKeeper service, so we need to run the ZooKeeper service using the 3.x release instead of the dev branch.

Since version 3.9 is not available in the https://s3-us-west-2.amazonaws.com/kafka-packages repo, we can use version 3.8 for now.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-13 20:34:53 +08:00
PoAn Yang 440e0b8801
KAFKA-17923 Remove old kafka version from e2e (#17673)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-11 06:22:06 +08:00
ShivsundarR 0181073d49
KAFKA-17933: Added round trip trogdor workload for share consumer. (#17692)
Added ShareRoundTripWorker.java similar to RoundTripWorker.java. This will start a producer and a share consumer on a single node. The share consumer reads back the messages produced by the producer.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-11-07 16:21:14 +05:30
kevin-wu24 5ec9dffa81
KAFKA-17916: removing ZK from connect ducktape tests (#17689)
Migrates existing connect tests that were using Zookeeper to use KRaft
instead, and cleans up some dead ZK code. For broker compatibility tests,
tests for versions 2.1-2.3 still need to use ZK.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-11-05 14:33:17 -08:00
Bill Bejeck 36c131ef4a
KAFKA-17609:[1/4] Changes needed to convert system tests to use KRaft and remove ZK (#17275)
This is part one of a multi-pr effort to convert Kafka Streams system tests to KRaft. I decided to break down the changes into multiple PRs to reduce the review load

Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-05 11:23:33 -05:00
kevin-wu24 c5a31cd6fb
KAFKA-17625: Removing explicit ZK test parameterizations (#17638)
This PR removes ZK test parameterizations from ducktape by:

- Removing zk from quorum.all_non_upgrade
- Removing quorum.zk from @matrix and @parametrize annotations
- Changing usages of quorum.all to quorum.all_kraft
- Deleting message_format_change_test.py

The default metadata_quorum value still needs to be changed to KRaft rather than ZK, but this will be done in a follow-up PR.
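
As a rough illustration of the change described above, the sketch below models the quorum lists as plain Python and expands a test matrix that no longer contains ZK. The constant values and the `expand_matrix` helper are assumptions made for illustration; the real definitions live in the kafkatest quorum module and in ducktape's `@matrix` decorator, neither of which is reproduced here.

```python
from itertools import product

# Assumed stand-ins for the quorum constants used by the system tests.
zk = "ZK"
isolated_kraft = "ISOLATED_KRAFT"
combined_kraft = "COMBINED_KRAFT"

# After the change, parameterizations draw from KRaft-only lists.
all_kraft = [isolated_kraft, combined_kraft]


def expand_matrix(**params):
    """Hypothetical helper: expand keyword lists into one dict per combination."""
    names = list(params)
    return [dict(zip(names, combo)) for combo in product(*params.values())]


# The second parameter name is illustrative only.
cases = expand_matrix(metadata_quorum=all_kraft, use_new_coordinator=[True, False])
for case in cases:
    print(case)
```

Every generated case uses a KRaft quorum, which is the effect of replacing quorum.all with quorum.all_kraft in a parameterization.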

Reviewers: Kirk True <kirk@kirktrue.pro>, Colin P. McCabe <cmccabe@apache.org>
2024-11-04 09:38:04 -08:00
Bill Bejeck 29881782c8
KAFKA-17609 Migrate broker compatibility test from ZK to KRaft (#17603)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-31 04:51:06 +08:00
Josep Prat 5859df9ee0
MINOR: Add Kafka 3.8.1 to system tests (#17629)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-31 01:00:37 +08:00
Bill Bejeck 58dd76817e
KAFKA-17609:[2/4]Convert system tests to kraft part 2 (#17321)
* Part 2 of 4 converting system tests to use KRaft

Reviewers: Matthias Sax <mjsax@apache.org>
2024-10-30 12:31:47 -04:00
Bill Bejeck 358d8775fb
KAFKA-17609:[3/4]Convert system tests to kraft part 3 (#17327)
Part 3 of 4 converting streams system tests to KRaft

Reviewers: Matthias Sax <mjsax@apache.org>
2024-10-30 12:20:58 -04:00
Bill Bejeck 3d2edf8de0
KAFKA-17609:[4/4]Convert system tests to kraft part 4 (#17328)
Part 4 of 4 converting streams system tests to KRaft

Reviewers: Matthias Sax <mjsax@apache.org>
2024-10-30 12:07:16 -04:00
Mahsa Seifikar bed70d4d2e
MINOR: Correct error message in reassign_partitions_test.py (#17632)
Reviewers: Justine Olshan <jolshan@confluent.io>
2024-10-30 08:20:46 -07:00
Yung db25c212ed
KAFKA-17883 Fix jvm error caused by UseParNewGC when running old kafka client in e2e (#17612)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-28 20:23:32 +08:00
Yung 24689dc6ab
KAFKA-17879 test_performance_services.py should use DEV version to run kafka service (#17606)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-28 04:32:06 +08:00
TengYao Chi 553e6b4c6d
KAFKA-17860 Remove log4j-appender module (#17588)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 18:13:30 +08:00
David Jacot a96cc6a24d
MINOR: Fix coordinator logging in system tests (#17585)
Reviewers: Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 02:51:18 +08:00
PoAn Yang 2d896d9130
KAFKA-17614: Remove AclAuthorizer (#17424)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-10-23 17:07:48 +02:00
Dongnuo Lyu 243e8e2830
KAFKA-17272 [2/2]: System tests for protocol migration (#17503)
This patch adds `consumer_protocol_migration_test.py` that tests the upgrade/downgrade paths between the old and new group protocol in KIP-848.

A successful test result can be found [here](https://confluent-open-source-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/dlyu-test-sem-versions/2024-10-21--001/report.html)

Reviewers: David Jacot <djacot@confluent.io>
2024-10-22 07:40:13 -07:00
Colin Patrick McCabe e3751a838c
KAFKA-17794: Add some formatting safeguards for KIP-853 (#17504)
KIP-853 adds support for dynamic KRaft quorums. This means that the quorum topology is
no longer statically determined by the controller.quorum.voters configuration. Instead, it
is contained in the storage directories of each controller and broker.

Users of dynamic quorums must format at least one controller storage directory with either
the --initial-controllers or --standalone flags.  If they fail to do this, no quorum can be
established. This PR changes the storage tool to warn about the case where a KIP-853 flag has
not been supplied to format a KIP-853 controller. (Note that broker storage directories
can continue to be formatted without a KIP-853 flag.)

There are cases where we don't want to specify initial voters when formatting a controller. One
example is where we format a single controller with --standalone, and then dynamically add 4
more controllers with no initial topology. In this case, we want the 4 later controllers to grab
the quorum topology from the initial one. To support this case, this PR adds the
--no-initial-controllers flag.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Federico Valeri <fvaleri@redhat.com>
2024-10-21 10:06:41 -07:00
Justine Olshan 8f1df347f5
MINOR: add psutil to setup.py (#17547)
Reviewers: Ian McDonald <ian_mcdonald@rocketmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-19 11:29:08 +08:00
Ken Huang 05a6898610
KAFKA-17812 upgrade base image of e2e from JDK 11 to JDK 17 (#17520)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-17 20:54:13 +08:00
Ken Huang bb7c083049
KAFKA-17781 add `psutil` to e2e dockerfile and upgrade ducktape version (#17480)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-16 22:38:23 +08:00
PoAn Yang 9bbf0950f9
KAFKA-17387 Remove broker-list in VerifiableConsumer (#17406)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-10 11:41:53 +08:00
Yung c36b993af0
KAFKA-17738 upgrade base image from jdk8 to jdk11 (#17432)
Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-10 09:28:22 +08:00
Yung be24e1d608
KAFKA-17737 E2E tests need to drop Kafka versions prior to 1.0.0 (#17427)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-10 08:57:31 +08:00
xijiu 7592bc3cbe
KAFKA-17655 add example of changing the e2e image name (#17408)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-09 10:22:02 +08:00
Chung, Ming-Yen a6bce450dd
KAFKA-17720 Remove zookeeper_migration_test.py and migration-related functions in kafka.py (#17410)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-09 09:45:33 +08:00
TengYao Chi 2733268409
KAFKA-17624 Remove the E2E uses of accessing ACLs from zk (#17338)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-08 07:10:14 +08:00
Alyssa Huang e27d0dfb17
MINOR: Fix kafkatest advertised listeners (#17294)
Followup for #17146

Reviewers: Bill Bejeck <bbejeck@apache.org>
2024-09-30 08:51:49 -04:00
Alyssa Huang 68b9770506
KAFKA-17608, KAFKA-17604, KAFKA-16963; KRaft controller crashes when active controller is removed (#17146)
This change fixes a few issues.

KAFKA-17608; KRaft controller crashes when active controller is removed
When a control batch is committed, the quorum controller currently increases the last stable offset but fails to create a snapshot for that offset. This causes an issue if the quorum controller renounces and needs to revert to that offset (which has no snapshot present). Since the control batches are no-ops for the quorum controller, it does not need to update its offsets for control records. We skip handle commit logic for control batches.

KAFKA-17604; Describe quorum output missing added voters endpoints
Describe quorum output will miss endpoints of voters which were added via AddRaftVoter. This is due to a bug in LeaderState's updateVoterAndObserverStates which will pull replica state from observer states map (which does not include endpoints). The fix is to populate endpoints from the lastVoterSet passed into the method.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@apache.org>
2024-09-26 13:56:19 -04:00
Eric Chang e146c7c916
KAFKA-17520 Align ducktape version in tests/docker/Dockerfile and tests/setup.py (#17240)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-26 03:44:40 +08:00
TengYao Chi f51fc16c16
KAFKA-17459 Stabilize reassign_partitions_test.py (#17250)
This test expects that each partition can receive the record, so using a non-null key helps distribute the records more randomly.
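
A minimal sketch of why keyed records spread across partitions. It assumes a hash-modulo partitioner and uses `zlib.crc32` as a stand-in hash; the real producer uses a different hash function, and the key format and partition count here are hypothetical.

```python
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count


def partition_for(key: bytes) -> int:
    """Map a non-null key to a partition with hash modulo partition count."""
    # crc32 is only a stand-in hash for this illustration.
    return zlib.crc32(key) % NUM_PARTITIONS


# Many distinct keys, as a test producer might generate them.
keys = [f"key-{i}".encode() for i in range(1000)]
covered = {partition_for(k) for k in keys}
print(sorted(covered))
```

With a null key the partitioner chooses the partition on its own, so a short run can leave some partitions without records, which is the situation the test wants to avoid.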

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-24 17:36:36 +08:00
Bill Bejeck e1f11c6714
MINOR: Need to split the controller bootstrap servers on ',' in list comprehension (#17183)
Kafka Streams system tests were failing with this error:

Failed to parse host name from entry 3001@d for the configuration controller.quorum.voters.  Each entry should be in the form `{id}@{host}:{port}`.

The cause is that in kafka.py line 876, we create a delimited string from a list comprehension, but the input is a string itself, so each character gets appended vs. the bootstrap server string of host:port. To fix this, this PR adds split(',') to controller_quorum_bootstrap_servers. Note that this only applies when dynamicRaftQuorum=False
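
The failure mode can be reproduced in a few lines of plain Python. This is a sketch under assumptions: the function names, the starting node id, and the host names are hypothetical and are not the code in kafka.py.

```python
def build_voters_buggy(bootstrap_servers: str, first_id: int = 3001) -> str:
    # Bug: iterating over a str yields single characters, not host:port entries.
    return ",".join(f"{first_id + i}@{s}" for i, s in enumerate(bootstrap_servers))


def build_voters_fixed(bootstrap_servers: str, first_id: int = 3001) -> str:
    # Fix: split the comma-separated string before iterating.
    servers = bootstrap_servers.split(",")
    return ",".join(f"{first_id + i}@{s}" for i, s in enumerate(servers))


servers = "ducker05:9093,ducker06:9093"  # hypothetical hosts
buggy = build_voters_buggy(servers)
fixed = build_voters_fixed(servers)
print(buggy.split(",")[0])
print(fixed)
```

The first entry of the buggy result is an id followed by a single character, which has the same shape as the `3001@d` entry in the error message above.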

Reviewers: Alyssa Huang <ahuang@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-16 02:26:06 +08:00
Matthias J. Sax 6fd973b4a5
KAFKA-16331: Remove EOSv1 from Kafka Streams system tests (#17108)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Bill Bejeck <bill@confluent.io>
2024-09-10 17:55:03 -07:00
xijiu 0af75c0e41
KAFKA-17458 Add 3.8 to transactions_upgrade_test.py, transactions_mixed_versions_test.py, and kraft_upgrade_test.py (#17084)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-11 02:12:25 +08:00
Ken Huang e311716beb
KAFKA-17492 skip features with minVersion of 0 instead of replacing 0 with 1 when BrokerRegistrationRequest < 4 (#17128)
The 3.8 controller assumes that unknown features have min version = 0, but KAFKA-17011 replaces min=0 with min=1 when BrokerRegistrationRequest < 4. Hence, to support upgrading from 3.8.0 to 3.9, this PR changes the implementation of ApiVersionsResponse (<4) and BrokerRegistrationRequest (<4) to skip features with a supported minVersion of 0 instead of replacing 0 with 1.

Reviewers: Jun Rao <junrao@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-11 01:16:59 +08:00
TengYao Chi 2dc3ee0557
KAFKA-17497 Add e2e for zk migration with old controller (#17131)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-10 12:15:32 +08:00
David Jacot 2ff81f087a
MINOR: Clean up system tests based on new defaults (#17113)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-08 12:28:05 +08:00
Alyssa Huang a9a4a52c9d
KAFKA-16963: Ducktape test for KIP-853 (#17081)
Add a ducktape system test for KIP-853 quorum reconfiguration, including adding and removing voters.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-09-06 13:44:09 -07:00
TengYao Chi f0f37409be
KAFKA-17454 Fix failed transactions_mixed_versions_test.py when running with 3.2 (#17067)
Why does df04887ba5 not fix it?

The fix in df04887ba5 is to NOT collect the log from the path `/mnt/kafka/kafka-operational-logs/debug/xxxx.log` if the task is successful. It does not change the log level. See ducktape b2ad7693f2/ducktape/tests/test.py (L181)

Why does df04887ba5 not see the "sort" error?

df04887ba5 does NOT show the error because there was only one feature (metadata.version), so nothing needed to be sorted and the bug was not triggered. Now we have two features, metadata.version and kraft.version, so the sort is executed and we hit the bug.

Why should we change kafka.log_level to INFO?

The log4j.properties template is controlled by `log_level` (https://github.com/apache/kafka/blob/trunk/tests/kafkatest/services/kafka/templates/log4j.properties#L16), and the bug happens when writing a debug message (e4ca066680/core/src/main/scala/kafka/server/metadata/BrokerMetadataListener.scala (L274)). Hence, changing the log level to INFO avoids triggering the bug.

Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-09-04 15:09:03 +08:00
PoAn Yang 4a3ab89f95
KAFKA-17386 Remove broker-list, threads and num-fetch-threads in ConsumerPerformance (#16983)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-30 22:09:37 +08:00
Colin Patrick McCabe 453cf9c987
KAFKA-17434: Do not test impossible scenarios in upgrade_test.py (#17024)
Because of KIP-902 (Upgrade Zookeeper version to 3.8.2), it is not possible to upgrade from a Kafka version
earlier than 2.4 to a version later than 2.4. Therefore, we should not test these upgrade scenarios
in upgrade_test.py. They do happen to work sometimes, but only in the trivial case where we don't
create topics or make changes during the upgrade (which would reveal the ZK incompatibility).
Instead, we should test only supported scenarios.
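
The rule stated above can be written as a small filter. This is only a sketch: the helper names and the candidate version list are hypothetical, and the actual parameterization in upgrade_test.py is not shown here.

```python
def parse_version(version: str) -> tuple:
    """Turn a version string like '2.3.1' into a comparable (major, minor) tuple."""
    return tuple(int(part) for part in version.split("."))[:2]


def is_supported_upgrade(from_version: str, to_version: str) -> bool:
    # Per the commit message: upgrading from earlier than 2.4 to later than 2.4
    # is not possible, so such pairs should not be tested.
    pivot = (2, 4)
    return not (parse_version(from_version) < pivot and parse_version(to_version) > pivot)


candidates = [("2.1.1", "3.9.0"), ("2.3.1", "2.4.1"), ("2.4.1", "3.9.0")]
supported = [pair for pair in candidates if is_supported_upgrade(*pair)]
print(supported)
```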

Reviewers: José Armando García Sancio <jsancio@gmail.com>
2024-08-29 12:51:42 -07:00
Logan Zhu 464051929d
KAFKA-17388 Remove broker-list from VerifiableProducer (#16958)
2024-08-29 20:02:29 +08:00
David Jacot c977bfdd3c
KAFKA-17413; Re-introduce `group.version` feature flag (#17013)
This patch re-introduces the `group.version` feature flag and gates the new consumer rebalance protocol with it. The `group.version` feature flag is attached to the metadata version `4.0-IV0` and it is marked as production ready. This allows system tests to pick it up directly by default without requiring to set `unstable.feature.versions.enable` in all of them. This is fine because we don't plan to do any incompatible changes before 4.0.

Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-29 01:22:54 -07:00
abhi-ksolves fb19b3f7e7
KAFKA-14262 Deletion of MirrorMaker v1 deprecated classes & tests (#16879)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-28 09:49:56 +08:00
Xuan-Zhang Gong 31f408d6da
KAFKA-17382 cleanup out-of-date configs of config_property (#17000)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-26 21:54:42 +08:00
David Jacot aaf887d3d9
KAFKA-14048; [2/2] Use the new group coordinator by default in 4.0 (#16945)
This patch makes the new group coordinator, introduced as part of KIP-848, the default. This means that any KRaft cluster created from trunk defaults to using the new group coordinator. This includes all the integration tests which do not specify it. This patch also changes the default in system tests.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-26 01:14:26 -07:00
Greg Harris b40b5a24f4
KAFKA-17369: Remove Reflections from logging and update licenses (#16924)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-19 16:27:36 -07:00
Dongnuo Lyu 5dd4d84eec
KAFKA-17272: [1/2] System test framework for consumer protocol migration (#16845)
This patch adds the necessary framework for system tests of consumer protocol upgrade/downgrade paths. The change mainly includes
- adding `ConsumerProtocolConsumerEventHandler` for the consumers using the new protocol.
- some other fixes to consumer_test.py with the new framework which fixes
  - [KAFKA-16576](https://issues.apache.org/jira/browse/KAFKA-16576): fixed by getting `partition_owner` after the group is fully stabilized.
  - [KAFKA-17219](https://issues.apache.org/jira/browse/KAFKA-17219): The first issue is the same as KAFKA-16576. The second issue is fixed by taking `num_rebalances` after the group is fully stabilized.
  - [KAFKA-17295](https://issues.apache.org/jira/browse/KAFKA-17295): Same as KAFKA-17219 second issue. Fixed by taking `num_rebalances` after the group is fully stabilized.

A test result of `tests/kafkatest/tests/client` is [here](https://confluent-open-source-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/trunk/2024-08-13--001.54e3cf70-869c-465c-bd7a-2ec0c26b2f05--1723594100--confluentinc--kip-848-migration-system-test-framework-comment-aug12--2388f23da7/report.html).

Reviewers: David Jacot <djacot@confluent.io>
2024-08-14 06:47:51 -07:00
Colin Patrick McCabe 132e0970fb
KAFKA-17018: update MetadataVersion for the Kafka release 3.9 (#16841)
- Mark 3.9-IV0 as stable. Metadata version 3.9-IV0 should return Fetch version 17.

- Move ELR to 4.0-IV0. Remove 3.9-IV1 since it's no longer needed.

- Create a new 4.0-IV1 MV for KIP-848.

Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-08-12 16:30:43 -07:00
Justine Olshan df04887ba5
MINOR: Reduce log levels for transactions_mixed_versions_test with 3.2 due to bug in that version (#16787)
7496e62434 fixed an error that caused an exception to be thrown on broker startup when debug logs were on. This made it to every version except 3.2. 

The Kraft upgrade tests totally turn off debug logs, but I think we only need to remove them for the broken version.

Note: this bug is also present in 3.1, but there is no logging on startup like in subsequent versions.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <david.jacot@gmail.com>
2024-08-12 09:51:04 -07:00
TengYao Chi da14b5a61d
KAFKA-16390: add `group.coordinator.rebalance.protocols=classic,consumer` to broker configs when system tests need the new coordinator (#16715)
Fix an issue that caused system tests to fail when using AsyncKafkaConsumer.
A configuration option, group.coordinator.rebalance.protocols, was introduced to specify the rebalance protocols used by the group coordinator. By default, the rebalance protocol is set to classic. When the new group coordinator is enabled, the rebalance protocols are set to classic,consumer.
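
A minimal sketch of the behavior described above, assuming a test service that builds its broker overrides from a boolean flag. The function and parameter names are hypothetical; only the property name and its two values come from the commit message.

```python
REBALANCE_PROTOCOLS_KEY = "group.coordinator.rebalance.protocols"


def broker_overrides(use_new_coordinator: bool) -> list:
    """Return [key, value] overrides for the broker config (hypothetical helper)."""
    if not use_new_coordinator:
        # The default is already "classic", so no override is needed.
        return []
    return [[REBALANCE_PROTOCOLS_KEY, "classic,consumer"]]


for flag in (False, True):
    print(flag, broker_overrides(flag))
```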

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Kirk True <kirk@kirktrue.pro>, Justine Olshan <jolshan@confluent.io>
2024-08-02 16:07:45 -07:00
Josep Prat 89214d033e
KAFKA-17214: Add 3.8.0 version to core and client system tests (#16726)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-07-30 19:39:10 +02:00
Josep Prat e299a006c8
KAFKA-17214: Add 3.8.0 version to streams system tests (#16728)
* KAFKA-17214: Add 3.8.0 version to streams system tests

Reviewers: Bill Bejeck <bbejeck@gmail.com>
2024-07-30 19:04:38 +02:00
Josep Prat dcb331c623
MINOR: Add 3.8.0 to system tests (#16714)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-30 09:19:48 +02:00
Colin P. McCabe 0ec520a2af
Bump trunk to 4.0.0-SNAPSHOT
2024-07-29 15:51:54 -07:00
Xuan-Zhang Gong 9a9eb18beb
MINOR: add docs of "ducker-ak down -f" to e2e README (#16560)
Reviewers: Arnav Dadarya <ardada2468@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-18 08:34:35 +08:00
David Arthur 8aee314a46
KAFKA-16667 Avoid stale read in KRaftMigrationDriver (#15918)
When becoming the active KRaftMigrationDriver, there is another race condition similar to KAFKA-16171. This time, the race is due to a stale read from ZK. After writing to /controller and /controller_epoch, it is possible that a read on /migration is not linearized with the writes that were just made. In other words, we get a stale read on /migration. This leads to an inability to sync metadata to ZK due to incorrect zkVersion on the migration ZNode.

The non-linearizability of reads is in fact documented behavior for ZK, so we need to handle it.

To fix the stale read, this patch adds a write to /migration after updating /controller and /controller_epoch. This allows us to learn the correct zkVersion for the migration ZNode before leaving the BECOME_CONTROLLER state.

This patch also adds a check on the current leader epoch when running certain events in KRaftMigrationDriver. Historically, we did not include this check because it is not necessary for correctness. Writes to ZK are gated on the /controller_epoch zkVersion, and RPCs sent to brokers are gated on the controller epoch. However, during a time of rapid failover, there is a lot of processing happening on the controller (i.e., full metadata sync to ZK and full UMRs sent to brokers), so it is best to avoid running events we know will fail.

There is also a small fix in here to improve the logging of ZK operations. The log messages are changed to past tense to reflect the fact that they have already happened by the time the log message is created.

Reviewers: Igor Soarez <soarez@apple.com>
2024-07-15 09:32:06 -04:00
Xuan-Zhang Gong 0ada8fac68
KAFKA-17096 Fix kafka_log4j_appender.py (#16559)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-12 22:35:55 +08:00
Justine Olshan e118811da3
MINOR: Fix transactions_upgrade_test to run transactions during upgrade (#16462)
Move when the upgrade happens so we actually upgrade while transactions are running

Reviewers: David Jacot <djacot@confluent.io>
2024-07-11 10:30:15 -07:00
Igor Soarez 1bec3811ad
KAFKA-17083: Update LATEST_STABLE_METADATA_VERSION in system tests (#16533)
LATEST_PRODUCTION version in MetadataVersion.java was updated in
both #16347 and #16400, but it was left unchanged in the system
tests.

Reviewers: Josep Prat <josep.prat@aiven.io>
2024-07-05 21:29:35 +01:00
David Jacot e7a75805fe
KAFKA-17050: Revert `group.version` (#16482)
This patch partially reverts `group.version` in trunk. I kept the `GroupVersion` class but removed it from `Features` so it is not advertised. I also kept all the changes in the test framework. I removed the logic to require `group.version=1` to enable the new consumer rebalance protocol. The new protocol is enabled based on the static configuration.

For the context, I prefer to revert it in trunk now so we don't forget to revert it in the 3.9 release. I will bring it back for the 4.0 release.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-02 07:21:18 -07:00
Igor Soarez 285698e0cf
MINOR: Add 3.7.1 to system tests (#16483)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-01 19:25:31 +08:00
Justine Olshan a599b89fe0
Revert "KAFKA-16275: Update kraft_upgrade_test.py to support KIP-848’s group protocol config (#16409)" (#16441)
This reverts commit e95e91a.

With the change to include the group.version flag, these tests fail due to trying to set the feature for the old version.

It is unclear if these tests originally worked as intended and given the upgrade is not expected for 3.8, we will just revert from 3.8.

Reviewers: David Jacot <djacot@confluent.io>
2024-06-25 10:15:58 -07:00
vamossagar12 ceec218351
KAFKA-16949: Fixing test_dynamic_logging in system test connect_distributed_test (#15915)
Reviewers: Chris Egerton <chrise@aiven.io>
2024-06-25 12:36:04 -04:00
Luke Chen 191b6476d7
KAFKA-16988: add 1 more node for test_exactly_once_source system test (#16379)
Reviewers: Igor Soarez <soarez@apple.com>
2024-06-18 13:03:55 +01:00
Gaurav Narula 39b514d350
MINOR: use 2 logdirs in ZK migration system tests (#15394)
Zookeeper migration system tests currently override the config to
use only one log directory.

This PR removes the override so that the system tests run with 2 log
directories following the work done as part of KIP-858.

Reviewers: Igor Soarez <soarez@apple.com>, Proven Provenzano <pprovenzano@confluent.io>
2024-06-18 11:49:25 +01:00
Krishna Agarwal bcf781230e
KAFKA-16932: Add documentation for the native docker image (#16338)
This PR contains the following documentation changes for the native docker image:

- in the docker/README.md: How to build, release and promote the native docker image.
- in the tests/README.md: How to run system tests by bringing up kafka in the native mode.
- added docker/native/README.md
- added html changes for the kafka-site
- added native docker image support in the docker compose files examples.

Testing:
- Tested all the docker compose files with both the docker images: jvm and native
- Tested the html changes locally with the kafka-site

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Vedarth Sharma <vesharma@confluent.io>
2024-06-16 14:01:13 +05:30
A. Sophie Blee-Goldman 4333af5c9f
KAFKA-15045: (KIP-924 pt. 25) Rename old internal StickyTaskAssignor to LegacyStickyTaskAssignor (#16322)
To avoid confusion in 3.8/until we fully remove all the old task assignors and internal config, we should rename the old internal assignor classes like the StickyTaskAssignor so that they won't be mixed up with the new version of the assignor (which is also named StickyTaskAssignor)

Reviewers: Bruno Cadonna <cadonna@apache.org>, Josep Prat <josep.prat@aiven.io>
2024-06-13 11:27:50 -07:00
David Jacot 190dd79457
KAFKA-16860; [2/2] Introduce group.version feature flag (#16149)
This patch updates the system tests to correctly enable the new consumer protocol/coordinator in the tests requiring them.

I went with the simplest approach for now. Long term, I think that we should refactor the tests to better handle features and non-production features.

I got a successful run of the consumer system tests with this patch combined with https://github.com/apache/kafka/pull/16120: https://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1717155071--dajac--KAFKA-16860-2--29028ae0dd/2024-05-31--001./2024-05-31--001./report.html.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-05-31 12:49:26 -07:00
Josep Prat 7e81cc5e68
MINOR: Bump trunk to 3.9.0-SNAPSHOT (#16150)
Signed-off-by: Josep Prat <josep.prat@aiven.io>

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-05-31 16:41:44 +02:00
Krishna Agarwal bb6a042e99
KAFKA-16827: Integrate kafka native-image with system tests (#16046)
This PR does the following:

- System tests should bring up Kafka broker in the native mode
- System tests should run on Kafka broker in native mode
- Extract out native build command so that it can be reused.
- Allow system tests to run on Native Kafka broker using Docker mechanism

To run system tests by bringing up Kafka in native mode:
Pass kafka_mode as native in the ducktape globals: --globals '{\"kafka_mode\":\"native\"}'

Running system tests by bringing up kafka in native mode via docker mechanism
_DUCKTAPE_OPTIONS="--globals '{\"kafka_mode\":\"native\"}'" TC_PATHS="tests/kafkatest/tests/"  bash tests/docker/run_tests.sh

To bring up only the ducker nodes for native Kafka:
bash tests/docker/ducker-ak up -m native
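
The quoting in the `--globals` argument is easy to get wrong by hand. As a sketch, the option can be generated with the Python standard library; the helper name is hypothetical, and the result is meant for pasting directly into a shell rather than for nesting inside a double-quoted variable such as `_DUCKTAPE_OPTIONS`, which needs the extra escaping shown above.

```python
import json
import shlex


def ducktape_globals_option(kafka_mode: str) -> str:
    """Build a shell-safe --globals argument (hypothetical helper)."""
    payload = json.dumps({"kafka_mode": kafka_mode})
    # shlex.quote wraps the JSON in single quotes so the shell passes it through unchanged.
    return "--globals " + shlex.quote(payload)


option = ducktape_globals_option("native")
print(option)
```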

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-05-30 22:24:23 +05:30
Lucas Brutschy 1dcdccf736
MINOR: fix streams_broker_compatibility test (#16015)
The system test was broken in ccf4bd5
which failed to import the matrix symbol. The test was failing silently, not
discovering any tests.

Reviewers: Bruno Cadonna <cadonna@apache.org>
2024-05-21 16:07:49 +02:00
Justine Olshan 3e15ab98ec
KAFKA-16992: InvalidRequestException: ADD_PARTITIONS_TO_TXN with version 4 which is not enabled when upgrading from kafka (#15971)
We weren't enabling discoverBrokerVersions to check the supported versions in the AddPartitionsToTxnManager. This means that any verification request (or any AddPartitionsToTxnRequest version) from a newer broker would fail when sending to an older broker.

The bulk of this change is adding additional transactions system tests for old versions.
One test upgrades the cluster completely. This didn't catch the issue but could be useful.

The other test forces a new broker to send a verification request to an older one. Without the discoverBrokerVersions change, all tests between mixed brokers failed. (We introduced a new request version in 3.8 -- which is a separate version from the one that caused the bug for 3.5 -> 3.6) With the addition, the tests all passed.

I also manually ran a test for 3.5 -> 3.6 since the issue there was slightly different and was caused by the unstableLatestVersion flag being enabled. This change should fix this as well. 👍

Reviewers: David Jacot <djacot@confluent.io>
2024-05-17 21:35:28 -07:00
Lianet Magrans a4952572dc
MINOR: Change sys test describe topic parsing to improve extensibility (#15941)
Minor change to how the describe topic output is parsed in system tests, to ensure that the output is preserved, even if only some fields are relevant to the test for now (which is what the test used to do before recent changes)

Initial problem: System tests were parsing the describe topic output in kafka.py assuming all fields would include a value. The describe API was recently changed, breaking this logic, because it included new fields for which there may not be values (ex. LastKnownElr).

Initial fix: The initial fix was to drop all fields from the output except the ones currently used in the test, even though only the fields without values are problematic.

Proposed improvement: A more extensible approach is to drop only the fields that have no values and preserve the rest of the output, which is what the test did before the initial fix mentioned above. This makes it easy to extend the test with more fields as needed, which may follow as the describe API and tests evolve (it only requires adding the fields to the returned value, without changing how the fields object is stripped).
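
A minimal sketch of the "drop only the empty fields" idea, assuming the describe output is a tab-separated line of `Key: value` pairs. The function name `parse_describe_line` and the sample line are illustrative and are not taken from kafka.py.

```python
def parse_describe_line(line):
    """Parse one tab-separated 'Key: value' line, dropping keys with no value."""
    fields = {}
    for chunk in line.strip().split("\t"):
        key, _, value = chunk.partition(":")
        value = value.strip()
        if value:
            # Keep every field that carries a value, not just the ones
            # the test happens to use today.
            fields[key.strip()] = value
    return fields


# Hypothetical sample line; Elr and LastKnownElr have no values here.
sample = "Topic: test\tPartition: 0\tLeader: 1\tReplicas: 1,2\tIsr: 1,2\tElr: \tLastKnownElr: "
parsed = parse_describe_line(sample)
```

With this shape, a test that later needs another field only has to read it from the parsed result; the stripping logic stays unchanged.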

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-05-15 11:09:44 +02:00
Lucas Brutschy 3b43edd7a1
MINOR: Remove dev_version parameter from streams tests (#15874)
In two tests, we are using the current snapshot version as a test parameter
`to_version`, but as the only option. We can hardcode it. This
simplifies testing downstream, since the test parameters do not change
with every version. In particular, some tests downstream are blacklisted
because they do not work with ARM. These lists need to be updated every
time `DEV_VERSION` is bumped.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-05-08 09:43:26 +02:00
Lianet Magrans 636e65aa6b
KAFKA-16465: Fix consumer sys test revocation validation (#15778)
This fixes a consumer system test that was failing for the new protocol. The test expected the eager behaviour of partitions being revoked on every rebalance, and it wrongly applied that expectation to the runs with the new protocol too.
This same situation was previously identified and fixed in other parts of the sys test with #15661.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-04-29 11:56:36 +02:00
Kirk True 21faf874c0
KAFKA-16565: IncrementalAssignmentConsumerEventHandler throws error when attempting to remove a partition that isn't assigned (#15737)
Checks that the TopicPartition is in the assignment before attempting to remove it.

Also added some logging and refactoring.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
2024-04-26 10:21:22 +02:00
Kirk True dcdf812880
KAFKA-16609: Update parse_describe_topic to support new topic describe output (#15799)
The format of the 'describe topic' output was changed as part of KAFKA-15585, which required an update to the parsing logic used by system tests.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-04-25 11:12:42 +02:00
Justine Olshan a7ceacdd35
KAFKA-16571: reassign_partitions_test.bounce_brokers should wait for messages to be sent to every partition (#15739)
Added the check before the reassignment occurs and we start bouncing brokers.
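
The kind of check described above could look roughly like this. It is a sketch under stated assumptions: `wait_until` here is a local stand-in for a polling helper, and `all_partitions_have_messages` and the per-partition counter dictionary are hypothetical names, not the test's actual code.

```python
import time


def wait_until(condition, timeout_sec, backoff_sec=0.1):
    """Poll condition() until it is true or the timeout expires (local stand-in)."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(backoff_sec)
    return condition()


def all_partitions_have_messages(acked_per_partition, num_partitions):
    """True once every partition has at least one acknowledged message."""
    return all(acked_per_partition.get(p, 0) > 0 for p in range(num_partitions))


# Hypothetical counters: partition 2 has not received anything yet.
acked = {0: 3, 1: 1}
ready_before = all_partitions_have_messages(acked, 3)
acked[2] = 1
ready_after = wait_until(lambda: all_partitions_have_messages(acked, 3), timeout_sec=1)
```

Only once the condition holds would the test go on to start the reassignment and bounce brokers.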

Reviewers: David Mao <dmao@confluent.io>, David Jacot <djacot@confluent.io>
2024-04-24 17:30:20 -07:00
vamossagar12 f22ad6645b
KAFKA-16272: Adding new coordinator related changes for connect_distributed.py (#15594)
Summary of the changes:

- Parameterizes the tests to use the new coordinator and pass in the consumer group protocol. This applies to sink connectors only.
- Enhances the sink connector creation code in system tests to accept a new optional parameter for the consumer group protocol to use.
- Sets the consumer group protocol via the consumer.override. config prefix when the new group coordinator is enabled.

Note about testing: There are 288 tests that need to be run, and running them locally takes a long time. I will try to post the test results once I have a full run.

Reviewers: Kirk True <ktrue@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>, Philip Nee <pnee@confluent.io>
2024-04-19 17:29:50 +02:00
Philip Nee b87cd66dab
KAFKA-16579: Revert Consumer Rolling Upgrade (#15753)
Consumer Rolling Upgrade is meant to test the protocol upgrade for the old protocol. Therefore, I am removing old changes.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2024-04-19 11:36:27 +02:00
Lianet Magrans 9ea0503e23
KAFKA-16566: Fix consumer static membership system test with new protocol (#15738)
Updating consumer system test that was failing with the new protocol, related to static membership behaviour. The behaviour regarding static consumers that join with conflicting group instance id is slightly different between the classic and new consumer protocol, so the expectations in the tests needed to be updated.

If static members join with same instance id:

Classic protocol: all members join the group with the same group instance id, and then the first one will eventually fail (receives a HB error with FencedInstanceIdException)

Consumer protocol: new member with an instance Id already in use is not able to join, and first member remains active (new member with same instance Id receives an UnreleasedInstanceIdException in the response to the HB to join the group)

This PR is keeping the single parametrized test that existed before, given that what's being tested and part of the test itself apply to all protocols. This is just updating the expectations that are different, based on the protocol parameter.
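
The protocol-dependent expectations can be summarized in a small lookup like the one below. This is an illustrative sketch only: `expected_conflict_outcome` and the dictionary keys are hypothetical, while the protocol names and exception names come from the description above.

```python
def expected_conflict_outcome(group_protocol):
    """Expected result when a second static member joins with a
    group instance id that is already in use (hypothetical helper)."""
    if group_protocol == "classic":
        # Both members join; the first one is eventually fenced.
        return {
            "second_member_joins": True,
            "member_that_fails": "first",
            "error": "FencedInstanceIdException",
        }
    if group_protocol == "consumer":
        # The second member is rejected; the first one stays active.
        return {
            "second_member_joins": False,
            "member_that_fails": "second",
            "error": "UnreleasedInstanceIdException",
        }
    raise ValueError("unknown group protocol: %s" % group_protocol)


classic = expected_conflict_outcome("classic")
consumer = expected_conflict_outcome("consumer")
```

A single parametrized test can then share its setup and branch only where the outcome differs.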

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Kirk True <ktrue@confluent.io>
2024-04-19 11:34:05 +02:00
Philip Nee dc9fbe453c
KAFKA-16389: ConsumerEventHandler does not support incremental assignment changes causing failure in system test (#15661)
The current AssignmentValidationTest only tests the EAGER assignment protocol and does not support incremental assignment, as used by CooperativeStickyAssignor and the consumer protocol. Therefore, for the ConsumerEventHandler, I subclassed the existing handler and overrode the assigned and revoked event handling methods to permit incremental changes to the current assignment.
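
The difference between the two handlers can be sketched as below. The class and method names are hypothetical and simplified; they are not the actual classes in the system test code. Partitions are modelled as `(topic, partition)` tuples.

```python
class EagerEventHandler:
    """Eager behaviour: each assignment replaces the previous one (hypothetical class)."""

    def __init__(self):
        self.assignment = []

    def handle_partitions_assigned(self, partitions):
        self.assignment = list(partitions)

    def handle_partitions_revoked(self, partitions):
        # Eager rebalances give up everything.
        self.assignment = []


class IncrementalEventHandler(EagerEventHandler):
    """Incremental behaviour: events add to or remove from the current assignment."""

    def handle_partitions_assigned(self, partitions):
        for p in partitions:
            if p not in self.assignment:
                self.assignment.append(p)

    def handle_partitions_revoked(self, partitions):
        for p in partitions:
            # Only remove partitions that are actually assigned.
            if p in self.assignment:
                self.assignment.remove(p)


handler = IncrementalEventHandler()
handler.handle_partitions_assigned([("t", 0), ("t", 1)])
handler.handle_partitions_assigned([("t", 2)])
handler.handle_partitions_revoked([("t", 1), ("t", 9)])
```

After these calls the incremental handler still holds the partitions that were never revoked, whereas an eager handler would have dropped everything on the revoke event.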

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Kirk True <ktrue@confluent.io>
2024-04-10 18:52:05 +02:00
Gaurav Narula bdd85405e3
KAFKA-16293: Test log directory failure in Kraft (#15409)
Enables the log directory failure system test for all KRaft modes in addition to ZK mode.

Reviewers: Luke Chen <showuon@gmail.com>, Igor Soarez <soarez@apple.com>, Proven Provenzano <pprovenzano@confluent.io>
2024-04-06 16:01:25 +08:00
Manikumar Reddy fd9c7d2932
MINOR: Add 3.6.2 to system tests (#15665)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-05 19:36:23 +05:30
Kirk True c7ef80bb6c
KAFKA-16439: Update replication_replica_failure_test.py to support KIP-848’s group protocol config (#15629)
Added a new optional group_protocol parameter to the test methods, then passed that down to the setup_consumer method.

Unfortunately, because the new consumer can only be used with the new coordinator, this required a new @matrix block instead of adding the group_protocol=["classic", "consumer"] to the existing blocks 😢
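
Why a separate block is needed can be seen by expanding the parameter combinations by hand. The sketch below uses plain `itertools.product` rather than the test framework's decorator, and the `use_new_coordinator` parameter name is an assumption made for illustration.

```python
from itertools import product


def expand(**params):
    """Expand lists of parameter values into every combination (cross product)."""
    keys = list(params)
    return [dict(zip(keys, combo)) for combo in product(*params.values())]


def is_valid(case):
    # The new consumer protocol only works with the new coordinator.
    protocol = case.get("group_protocol", "classic")
    return protocol != "consumer" or case["use_new_coordinator"]


# One combined block also generates old coordinator + "consumer", which is invalid.
single_block = expand(use_new_coordinator=[False, True],
                      group_protocol=["classic", "consumer"])
invalid = [c for c in single_block if not is_valid(c)]

# Two blocks: the existing one is left alone, the new one adds group_protocol.
separate_blocks = (expand(use_new_coordinator=[False]) +
                   expand(use_new_coordinator=[True],
                          group_protocol=["classic", "consumer"]))
```

Because a matrix block is a full cross product, the only way to exclude the invalid combination is to keep the new-coordinator cases in their own block.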

Reviewers: Walker Carlson <wcarlson@apache.org>
2024-04-03 12:13:26 -05:00
Kirk True 6bb9caced0
KAFKA-16440: Update security_test.py to support KIP-848’s group protocol config (#15628)
Added a new optional group_protocol parameter to the test methods, then passed that down to the setup_consumer method.

Unfortunately, because the new consumer can only be used with the new coordinator, this required a new @matrix block instead of adding the group_protocol=["classic", "consumer"] to the existing blocks 😢

Reviewers: Walker Carlson <wcarlson@apache.org>
2024-04-03 12:13:14 -05:00
Kirk True 6569a354e6
KAFKA-16438: Update consumer_test.py’s static tests to support KIP-848’s group protocol config (#15627)
Migrated the following tests for the new consumer:

- test_fencing_static_consumer
- test_static_consumer_bounce
- test_static_consumer_persisted_after_rejoin

Reviewers: Walker Carlson <wcarlson@apache.org>
2024-04-03 12:13:03 -05:00