Commit Graph

936 Commits

Author SHA1 Message Date
Ron Dagostino faaef2c2df
MINOR: Support Raft-based metadata quorums in system tests (#10093)
We need to be able to run system tests with Raft-based metadata quorums -- both
co-located brokers and controllers as well as remote controllers -- in addition to the
ZooKepeer-based mode we run today. This PR adds this capability to KafkaService in a
backwards-compatible manner as follows.

If no changes are made to existing system tests then they function as they always do --
they instantiate ZooKeeper, and Kafka will use ZooKeeper. On the other hand, if we want
to use a Raft-based metadata quorum we can do so by introducing a metadata_quorum
argument to the test method and using @matrix to set it to the quorums we want to use for
the various runs of the test. We then also have to skip creating a ZooKeeperService when
the quorum is Raft-based.

This PR does not update any tests -- those will come later after all the KIP-500 code is
merged.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2021-02-11 09:44:17 -08:00
Dániel Urbán 202ff6336f
KAFKA-5235: GetOffsetShell: Support for multiple topics and consumer configuration override (KIP-635) (#9430)
This patch implements KIP-635 which mainly adds support for querying offsets of multiple topics/partitions.

Reviewers: David Jacot <djacot@confluent.io>
2021-02-11 12:06:21 +01:00
John Roesler 1f240ce179 bump to 2.9 development version 2021-02-07 09:25:36 -06:00
Stanislav Vodetskyi 91d6c55da4
MINOR: Upgrade ducktape to version 0.8.1 (#9933)
ducktape 0.8.1 was updated to include the following changes/fixes from 0.7.x branch:
* Junit reporting support
* fix for an issue where unicode characters in exception message would cause test runner to hang on py27.

Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>
2021-01-22 20:23:55 -08:00
Justine Olshan 86b9fdef2b
KAFKA-10869: Gate topic IDs behind IBP 2.8 (KIP-516) (#9814)
Topics processed by the controller and topics newly created will only be given topic IDs if the inter-broker protocol version on the controller is greater than 2.8. This PR also adds a kafka config to specify whether the IBP is greater or equal to 2.8. System tests have been modified to include topic ID checks for upgrade/downgrade tests. This PR also adds a new integration test file for requests/responses that are not gated by IBP (ex: metadata) 

Reviewers: dengziming <dengziming1993@gmail.com>, Lucas Bradstreet <lucas@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2021-01-20 22:32:06 +00:00
John Roesler be88f5a1aa
MINOR: Fix StreamsOptimizedTest (#9911)
We have seen recent system test timeouts associated with this test.
Analysis revealed an excessive amount of time spent searching
for test conditions in the logs.

This change addresses the issue by dropping some unnecessary
checks and using a more efficient log search mechanism.

Reviewers: Bill Bejeck <bbejeck@apache.org>, Guozhang Wang <guozhang@apache.org>
2021-01-19 14:57:34 -06:00
Mickael Maison 966e9dd6a2
MINOR: Updating files with release 2.6.1 (#9844)
Reviewers: Bill Bejeck <bbejeck@gmail.com>, Matthias J. Sax <mjsax@apache.org>
2021-01-14 12:24:18 +00:00
Bill Bejeck bf694b2943
MINOR: Add 2.7.0 release to broker and client compat tests (#9774)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@confluent.io>
2021-01-05 09:45:00 -05:00
Chia-Ping Tsai ac7b5d3389
KAFKA-10893 Increase target_messages_per_sec of ReplicaScaleTest to reduce the run time (#9797)
Reviewers: David Arthur <mumrah@gmail.com>
2021-01-05 00:23:34 +08:00
Bill Bejeck b6891f6729
MINOR: Kafka Streams updates for 2.7.0 release (#9773)
Reviewer: Matthias J. Sax <matthias@confluent.io>
2020-12-22 14:34:59 -08:00
Bill Bejeck 300909d9e6
MINOR: Updating files with latest release 2.7.0 (#9772)
Changes to trunk for the 2.7.0 release. Updating dependencies.gradle, Dockerfile, and vagrant/bash.sh

Reviewers: Matthias J. Sax <mjsax@apache.org>
2020-12-21 11:52:49 -05:00
Chia-Ping Tsai 6e15937feb
KAFKA-10289; Fix failed connect_distributed_test.py (ConnectDistributedTest.test_bounce) (#9673)
In Python 3, `filter` functions return iterators rather than `list` so it can traverse only once. Hence, the following loop will only see "empty" and then validation fails.

```python
        src_messages = self.source.committed_messages() # return iterator
        sink_messages = self.sink.flushed_messages()) # return iterator
        for task in range(num_tasks):
            # only first task can "see" the result. following tasks see empty result
            src_seqnos = [msg['seqno'] for msg in src_messages if msg['task'] == task]
```

Reference: https://portingguide.readthedocs.io/en/latest/iterators.html#new-behavior-of-map-and-filter.

Reviewers: Jason Gustafson <jason@confluent.io>
2020-12-09 13:38:17 -08:00
Chia-Ping Tsai 1cf9ce95ad
MINOR: add "flush=True" to all print in system tests (#9711)
That makes the behavior of print equal to pyhton2.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-12-09 11:19:06 -08:00
Chia-Ping Tsai abb8ff61cc
MINOR: Align the UID inside/outside container (#9652)
Reviewers: Jason Gustafson <jason@confluent.io>
2020-12-03 10:39:58 +08:00
Bruno Cadonna 60139d5b25
MINOR: fix reading SSH output in Streams system tests (#9665)
SSH outputs in system tests originating from paramiko are bytes. However, the logger in the system tests does not accept bytes and instead throws an exception. That means, the bytes returned as SSH output from paramiko need to converted to a type that the logger (or other objects) can process.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-12-01 10:28:04 -08:00
Luke Chen 9412fc1151
MINOR: Update vagrant/tests readme (#9650)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2020-11-28 13:06:48 +08:00
Tom Bentley 91679f247a
KAFKA-10692: Add delegation.token.secret.key, deprecate ...master.key (#9623)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2020-11-19 15:26:25 +00:00
Walker Carlson 5899f5fc4a
KAFKA-9331: Add a streams specific uncaught exception handler (#9487)
This PR introduces a streams specific uncaught exception handler that currently has the option to close the client or the application. If the new handler is set as well as the old handler (java thread handler) will be ignored and an error will be logged.
The application shutdown is achieved through the rebalance protocol.

Reviewers: Bruno Cadonna <cadonna@confluent.io>, Leah Thomas <lthomas@confluent.io>, John Roesler <john@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2020-11-17 22:55:09 -08:00
feyman2016 3e2d1fc8aa
Add system test coverage for group coordinator migration (#9588)
This newly added system test is to verify that with the fix in #9270 , the member.id update caused by static member rejoin would be persisted correctly.

Reviewers: Boyang Chen <boyang@confluent.io>
2020-11-12 19:36:27 -08:00
Gardner Vickers f978d0551b
MINOR: Increase the amount of time available to the `test_verifiable_producer` (#9201)
Increase the amount of time available to the `test_verifiable_producer` test to login and get the process name for the verifiable producer from 5 seconds to 10 seconds.

We were seeing some test failures due to the assertion failing because the verifiable producer would complete before we could login, list the processes, and parse out the producer version. Previously, we were giving this operation 5 seconds to run, this PR bumps it up to 10 seconds. 

I verified locally that this does not flake, but even at 5 seconds I wasn't seeing any flakes. Ultimately we should find a better strategy than racing to query the producer process (as outlined in the existing comments). 

Reviewers: Jason Gustafson <jason@confluent.io>
2020-11-12 13:09:15 -08:00
David Mao ee1aa07036
MINOR: Fix group_mode_transactions_test (#9538)
KIP-431 (#9099) changed the format of console consumer output to `Partition:$PARTITION\t$VALUE` whereas previously the output format was `$VALUE\t$PARTITION`. This PR updates the message verifier to accommodate the updated console consumer output format.
2020-10-31 13:43:13 +01:00
Bruno Cadonna a85b944011
MINOR: Fix verification in StreamsUpgradeTest.test_version_probing_upgrade (#9530)
The system test StreamsUpgradeTest.test_version_probing_upgrade tries to verify the wrong version for version probing.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2020-10-29 16:10:30 -07:00
Manikumar Reddy 36493efa59 MINOR: fix error in quota_test.py system tests
quota_test.py tests are failing with below error.

```
23:24:42 [INFO:2020-10-24 17:54:42,366]: RunnerClient: kafkatest.tests.client.quota_test.QuotaTest.test_quota.quota_type=user.override_quota=False: FAIL: not enough arguments for format string
23:24:42 Traceback (most recent call last):
23:24:42   File "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.6/site-packages/ducktape-0.8.0-py3.6.egg/ducktape/tests/runner_client.py", line 134, in run
23:24:42     data = self.run_test()
23:24:42   File "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.6/site-packages/ducktape-0.8.0-py3.6.egg/ducktape/tests/runner_client.py", line 192, in run_test
23:24:42     return self.test_context.function(self.test)
23:24:42   File "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.6/site-packages/ducktape-0.8.0-py3.6.egg/ducktape/mark/_mark.py", line 429, in wrapper
23:24:42     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
23:24:42   File "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py", line 141, in test_quota
23:24:42     self.quota_config = QuotaConfig(quota_type, override_quota, self.kafka)
23:24:42   File "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py", line 60, in __init__
23:24:42     self.configure_quota(kafka, self.producer_quota, self.consumer_quota, ['users', None])
23:24:42   File "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py", line 83, in configure_quota
23:24:42     (kafka.kafka_configs_cmd_with_optional_security_settings(node, force_use_zk_conection), producer_byte_rate, consumer_byte_rate)
23:24:42 TypeError: not enough arguments for format string
23:24:42
```

ran thee tests locally.

Author: Manikumar Reddy <manikumar.reddy@gmail.com>

Reviewers: David Jacot <djacot@confluent.io>, Ron Dagostino <rndgstn@gmail.com>

Closes #9496 from omkreddy/quota-tests
2020-10-25 14:45:05 +05:30
Nikolay Izhikov c8c1baf4e1 KAFKA-10592: Fix vagrant for a system tests with python3
Fix vagrant for a system tests with a python3.

Author: Nikolay Izhikov <nizhikov@apache.org>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #9480 from nizhikov/KAFKA-10592
2020-10-25 01:09:16 +05:30
Ron Dagostino c7f19aba37
MINOR: fix system tests sending ACLs through ZooKeeper (#9458)
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2020-10-20 13:13:50 +01:00
Ron Dagostino 1636481c5f
MINOR: fix error in quota_test.py system tests (#9443) 2020-10-15 17:08:42 +01:00
Ron Dagostino 147a19036e
MINOR: ACLs for secured cluster system tests (#9378)
This PR adds missing broker ACLs required to create topics and SCRAM credentials when ACLs are enabled for a system test. This PR also adds support for using PLAINTEXT as the inter broker security protocol when using SCRAM from the client in a system test with a secured cluster-- without this it would always be necessary to set both the inter-broker and client mechanisms to a SCRAM mechanism. Also contains some refactoring to make assumptions clearer.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2020-10-09 15:34:53 +01:00
bill 4d3036bb4e Updating trunk versions after cutting branch for 2.7 2020-10-08 07:47:36 -04:00
Nikolay 4e65030e05
KAFKA-10402: Upgrade system tests to python3 (#9196)
For now, Kafka system tests use python2 which is outdated and not supported.
This PR upgrades python to the third version.

Reviewers: Ivan Daschinskiy, Mickael Maison <mickael.maison@gmail.com>, Magnus Edenhill <magnus@edenhill.se>, Guozhang Wang <wangguoz@gmail.com>
2020-10-07 09:41:30 -07:00
Nikolay bc7674fe1b
KAFKA-10505: Fix parsing of generation log string. (#9312)
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
2020-09-23 14:24:02 -07:00
Bruno Cadonna a46c07ec8d
KAFKA-10292: Set min.insync.replicas to 1 of __consumer_offsets (#9286)
The test StreamsBrokerBounceTest.test_all_brokers_bounce() fails on
2.5 because in the last stage of the test there is only one broker
left and the offset commit cannot succeed because the
min.insync.replicas of __consumer_offsets is set to 2 and acks is
set to all. This causes a time out and extends the closing of the
Kafka Streams client to beyond the duration passed to the close
method of the client.

This affects especially the 2.5 branch since there Kafka Streams
commits offsets for each task, i.e., close() needs to wait for the
timeout for each task. In 2.6 and trunk the offset commit is done
per thread, so close() does only need to wait for one time out per
stream thread.

I opened this PR on trunk, since the test could also become
flaky on trunk and we want to avoid diverging system tests across
branches.

A more complete solution would be to improve the test by defining
a better success criteria.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-09-15 11:12:37 -07:00
Ron Dagostino ebd64b5d55
KAFKA-10131: Remove use_zk_connection flag from ducktape (#9274)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2020-09-14 15:56:21 -07:00
Chia-Ping Tsai ee68b999c4
KAFKA-10463: Install `git` explicitly in Dockerfile (#9257)
`openjdk:8` includes `git` by default, but `openjdk:11` does not. Install `git` explicitly to make it easier to
test with newer openjdk versions.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2020-09-14 15:22:00 -07:00
Ron Dagostino e8524ccd8f
KAFKA-10259: KIP-554 Broker-side SCRAM Config API (#9032)
Implement the KIP-554 API to create, describe, and alter SCRAM user configurations via the AdminClient.  Add ducktape tests, and modify JUnit tests to test and use the new API where appropriate.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Rajini Sivaram <rajinisivaram@googlemail.com>
2020-09-04 13:05:01 -07:00
Justine Olshan a027b9a934
MINOR: Fix typo in ducker-ak test example
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2020-08-22 00:30:22 +05:30
Sanjana Kaundinya b7856df21b
MINOR: Include security configs for topic delete in system tests (#9142)
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2020-08-19 13:37:27 +01:00
Andrew Egelhofer f6c26eaa04 MINOR: Use new version of ducktape
ducktape diff: https://github.com/confluentinc/ducktape/compare/v0.7.8...v0.7.9

- bcrypt (a dependency of ducktape) dropped Python2.7 support.
ducktape-0.7.9 now pins bcrypt to a Python2.7-supported version.

Author: Andrew Egelhofer <aegelhofer@confluent.io>

Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #9192 from andrewegel/trunk
2020-08-18 07:11:24 +05:30
Sanjana Kaundinya f7a4fe7c14
MINOR: fix the way total consumed is calculated for verifiable consumer (#9143)
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2020-08-16 11:15:41 +01:00
John Roesler 7159c6ddd0
MINOR: bump 2.5 versions to 2.5.1 (#9165)
Reviewers: Bill Bejeck <bbejeck@apache.org>
2020-08-11 15:18:33 -05:00
Randall Hauch 1112fd4723
KAFKA-10341: Add 2.6.0 to system tests and streams upgrade tests (#9116)
Author: Randall Hauch <rhauch@gmail.com>
Reviewer: Matthias J. Sax <matthias@confluent.io>
2020-08-04 18:04:52 -05:00
Bruno Cadonna ac3a51d013
MINOR: Remove staticmethod tag to be able to use logger of instance (#9086)
A system test failed with the following error: global name 'self' is not defined

The reason was that `self` was accessed to log a message in a static method. This commit makes the method an instance method.

Reviewer: Matthias J. Sax <matthias@confluent.io>
2020-07-27 09:38:14 -07:00
Chia-Ping Tsai 0d5c967073
KAFKA-10300 fix flaky core/group_mode_transactions_test.py (#9059)
the root cause is same to #9026 so I copy the approach of #9026 to resolve core/group_mode_transactions_test.py

Reviewers: Jun Rao <junrao@gmail.com>
2020-07-23 15:59:57 -07:00
Jason Gustafson 67f5b5de77
KAFKA-10274; Consistent timeouts in transactions_test (#9026)
KAFKA-10235 fixed a consistency issue with the transaction timeout and the progress timeout. Since the test case relies on transaction timeouts, we need to wait at last as long as the timeout in order to ensure progress. However, having a low transaction timeout makes the test prone to the issue identified in KAFKA-9802, in which the coordinator timed out the transaction while the producer was awaiting a Produce response.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>,  Boyang Chen <boyang@confluent.io>, Jun Rao <junrao@gmail.com>
2020-07-22 12:06:47 -07:00
Manikumar Reddy c38825ab97 KAFKA-9432:(follow-up) Set `configKeys` to null in `describeConfigs()` to make it backward compatible with older Kafka versions.
- After #8312, older brokers are returning empty configs,  with latest `adminClient.describeConfigs`.  Old brokers  are receiving empty configNames in `AdminManageer.describeConfigs()` method. Older brokers does not handle empty configKeys. Due to this old brokers are filtering all the configs.
- Update ClientCompatibilityTest to verify describe configs
- Add test case to test describe configs with empty configuration Keys

Author: Manikumar Reddy <manikumar.reddy@gmail.com>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #9046 from omkreddy/KAFKA-9432
2020-07-21 17:32:11 +05:30
Greg Harris f4944ee460
KAFKA-10295: Wait for connector recovery in test_bounce (#9043)
Signed-off-by: Greg Harris <gregh@confluent.io>
2020-07-20 08:50:05 -05:00
Greg Harris 5a2a7c6348
KAFKA-10286: Connect system tests should wait for workers to join group (#9040)
Currently, the system tests `connect_distributed_test` and `connect_rest_test` only wait for the REST api to come up.
The startup of the worker includes an asynchronous process for joining the worker group and syncing with other workers.
There are some situations in which this sync takes an unusually long time, and the test continues without all workers up.
This leads to flakey test failures, as worker joins are not given sufficient time to timeout and retry without waiting explicitly.

This changes the `ConnectDistributedTest` to wait for the Joined group message to be printed to the logs before continuing with tests. I've activated this behavior by default, as it's a superset of the checks that were performed by default before.

This log message is present in every version of DistributedHerder that I could find, in slightly different forms, but always with `Joined group` at the beginning of the log message. This change should be safe to backport to any branch.

Signed-off-by: Greg Harris <gregh@confluent.io>
Author: Greg Harris <gregh@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2020-07-20 08:48:02 -05:00
Manikumar Reddy b02fa53419 MINOR: Enable broker/client compatibility tests for 2.5.0 release
- Add missing broker/client compatibility tests for 2.5.0 release

Author: Manikumar Reddy <manikumar.reddy@gmail.com>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #9041 from omkreddy/compat
2020-07-20 18:20:48 +05:30
Jason Gustafson 6d2c7802da
MINOR: Fix flaky system test assertion after static member fencing (#9033)
The test case `OffsetValidationTest.test_fencing_static_consumer` fails periodically due to this error:
```
Traceback (most recent call last):
  File "/home/jenkins/workspace/system-test-kafka_2.6/kafka/venv/lib/python2.7/site-packages/ducktape-0.7.8-py2.7.egg/ducktape/tests/runner_client.py", line 134, in run
    data = self.run_test()
  File "/home/jenkins/workspace/system-test-kafka_2.6/kafka/venv/lib/python2.7/site-packages/ducktape-0.7.8-py2.7.egg/ducktape/tests/runner_client.py", line 192, in run_test
    return self.test_context.function(self.test)
  File "/home/jenkins/workspace/system-test-kafka_2.6/kafka/venv/lib/python2.7/site-packages/ducktape-0.7.8-py2.7.egg/ducktape/mark/_mark.py", line 429, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/jenkins/workspace/system-test-kafka_2.6/kafka/tests/kafkatest/tests/client/consumer_test.py", line 257, in test_fencing_static_consumer
    assert len(consumer.dead_nodes()) == num_conflict_consumers
AssertionError
```
When a consumer stops, there is some latency between when the shutdown is observed by the service and when the node is added to the dead nodes. This patch fixes the problem by giving some time for the assertion to be satisfied.

Reviewers: Boyang Chen <boyang@confluent.io>
2020-07-17 11:27:33 -07:00
vinoth chandar 796fae25c3
KAFKA-10174: Prefer --bootstrap-server for configs command in ducker tests (#8948)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2020-07-16 09:01:46 -07:00
Chia-Ping Tsai 598a0d16fa
KAFKA-10257 system test kafkatest.tests.core.security_rolling_upgrade_test fails (#9021)
security_rolling_upgrade_test may change the security listener and then restart Kafka servers. has_sasl and has_ssl get out-of-date due to cached _security_config. This PR offers a simple fix that we always check the changes of port mapping and then update the sasl/ssl flag.

Reviewers:  Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
2020-07-15 11:33:49 -07:00
Chia-Ping Tsai e099b58df5
KAFKA-10235 Fix flaky transactions_test.py (#8981)
Reducing timeout of transaction to clean up the unstable offsets quicker. IN hard_bounce mode, transactional client is killed ungracefully. Hence, it produces unstable offsets which obstructs TransactionalMessageCopier from receiving position of group.

Reviewers: Jun Rao <junrao@gmail.com>
2020-07-09 09:33:07 -07:00
Chia-Ping Tsai 80cab851ee
KAFKA-10225 Increase default zk timeout for system tests (#8974)
Increase ZK connection and session timeout in system tests to match the defaults.

Reviewers: Jun Rao <junrao@gmail.com>
2020-07-08 13:19:40 -07:00
Chia-Ping Tsai 6953161125
KAFKA-10191 fix flaky StreamsOptimizedTest (#8913)
Call KafkaStreams#cleanUp to reset local state before starting application up the second run.

Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, John Roesler <john@confluent.io>
2020-07-07 12:48:36 -05:00
John Roesler 34f749db30
MINOR: prune the metadata upgrade test matrix (#8971)
Most of the values in the metadata upgrade test matrix are just testing
the upgrade/downgrade path between two previous releases. This is
unnecessary. We run the tests for all supported branches, so what we
should test is the up-/down-gradability of released versions with respect
to the current branch.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-07-06 18:52:51 -05:00
Chia-Ping Tsai 72042f26af
KAFKA-10209: Fix connect_rest_test.py after the introduction of new connector configs (#8944)
There are two new configs introduced by 371f14c3c1 and 1c4eb1a575 so we have to update the expected configs in the connect_rest_test.py system test too.

Reviewer: Konstantine Karantasis <konstantine@confluent.io>
2020-07-03 10:38:42 -07:00
John Roesler 3b2ae7b95a
KAFKA-10173: Use SmokeTest for upgrade system tests (#8938)
Replaces the previous upgrade test's trivial Streams app
with the commonly used SmokeTest, exercising many more
features. Also adjust the test matrix to test upgrading
from each released version since 2.2 to the current branch.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-07-02 18:14:46 -05:00
Chia-Ping Tsai 6094af8974 KAFKA-10214: Fix zookeeper_tls_test.py system test
After 3661f981fff2653aaf1d5ee0b6dde3410b5498db security_config is cached. Hence, the later changes to security flag can't impact the security_config used by later tests.

issue: https://issues.apache.org/jira/browse/KAFKA-10214

Author: Chia-Ping Tsai <chia7712@gmail.com>

Reviewers: Ron Dagostino <rdagostino@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #8949 from chia7712/KAFKA-10214
2020-07-01 17:08:54 +05:30
Bruno Cadonna f3a9ce4a69
MINOR: Do not swallow exception when collecting PIDs (#8914)
During Streams' system tests the PIDs of the Streams
clients are collected. The method the collects the PIDs
swallows any exception that might be thrown by the
ssh_capture() function. Swallowing any exceptions
might make the investigation of failures harder,
because no information about what happened are recorded.

Reviewers: John Roesler <vvcephei@apache.org>
2020-06-30 12:18:23 -05:00
Nikolay 3661f981ff
KAFKA-10180: Fix security_config caching in system tests (#8917)
Reviewers: Jun Rao <junrao@gmail.com>
2020-06-27 09:27:49 -07:00
vinoth chandar 54dbd041bc
KAFKA-10138: Prefer --bootstrap-server for reassign_partitions command in ducktape tests (#8898)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2020-06-19 12:35:49 -07:00
Ego d3d65dd5dd
MINOR: Upgrade ducktape to 0.7.8 (#8879)
Newer version of ducktape that updates some dependencies and adds some features. You can see that diff here:

https://github.com/confluentinc/ducktape/compare/v0.7.7...v0.7.8

Reviewer: Konstantine Karantasis <konstantine@confluent.io>
2020-06-17 21:53:22 -07:00
Nikolay 8b22b81596
KAFKA-9320: Enable TLSv1.3 by default (KIP-573) (#8695)
1. Enables `TLSv1.3` by default with Java 11 or newer.
2. Add unit tests that cover the various TLSv1.2 and TLSv1.3 combinations.
3. Extend `benchmark_test.py` and `replication_test.py` to run with 'TLSv1.2'
or 'TLSv1.3'.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2020-06-02 15:34:43 -07:00
Randall Hauch 19e40788e7
Bump trunk to 2.7.0-SNAPSHOT (#8746) 2020-06-01 21:23:09 -05:00
Jason Gustafson d9fe30dab0
KAFKA-9802; Increase transaction timeout in system tests to reduce flakiness (#8736)
We have been seeing increased flakiness in transaction system tests. I believe the cause might be due to KIP-537, which increased the default zk session timeout from 6s to 18s and the default replica lag timeout from 10s to 30s. In the system test, we use the default transaction timeout of 10s. However, since the system test involves hard failures, the Produce request could be blocking for as long as the max of these two in order to wait for an ISR shrink. Hence this patch increases the timeout to 30s.

Note this patch also includes a minor logging fix in `Partition`. Previously we would see messages like the following:
```
[Broker id=3] Leader output-topic-0 starts at leader epoch 0 from offset 0 with high watermark 0 ISR 3,2,1 addingReplicas  removingReplicas .Previous leader epoch was -1.
```
This patch fixes the log to print as the following:
```
[Broker id=3] Leader output-topic-0 starts at leader epoch 0 from offset 0 with high watermark 0 ISR [3,2,1] addingReplicas []  removingReplicas []. Previous leader epoch was -1.
```

Reviewers: Bob Barrett <bob.barrett@confluent.io>, Ismael Juma <github@juma.me.uk>
2020-05-27 20:54:09 -07:00
Nikolay 2951b6dd99
KAFKA-10050: kafka_log4j_appender.py fixed for JDK11 (#8731)
kafka_log4j_appender.py was broken on JDK11 by befd80b38.
`fix_opts_for_new_jvm` requires `node.version` to be set, we
add the relevant code to the test.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2020-05-27 20:49:57 -07:00
John Roesler 2cff1fab3f
KAFKA-6145: KIP-441: Fix assignor config passthough (#8716)
Also fixes a system test by configuring the HATA to perform a one-shot balanced assignment

Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Bruno Cadonna <bruno@confluent.io>
2020-05-27 13:50:12 -05:00
Magnus Edenhill 4aa4786a81
MINOR: Deploy VerifiableClient in constructor to avoid test timeouts (#8651)
Previous to this fix a plugged-in verifiable client, such as
confluent-kafka-python, would be deployed on the node in the background
worker thread as the client was started. Since this could be time consuming
(e.g., 10+ seconds) and since the main test thread would continue to
operate, it was common for the current test to time out waiting
for e.g. the verifiable producer to produce messages while it was in fact
still deploying.

The fix here is to deploy the verifiable client on the node when
the verifiable client is instantiated, which is thus a blocking
operation on the main test thread, avoiding any test-based timeouts.

Reviewers: Jason Gustafson <jason@confluent.io>
2020-05-21 09:59:32 -07:00
Boyang Chen fad8db67bb
MINOR: add option to rebuild source for system tests (#6656)
Reviewers: Jason Gustafson <jason@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2020-05-13 17:54:43 -07:00
A. Sophie Blee-Goldman 58f7a97314
KAFKA-9821: consolidate Streams rebalance triggering mechanisms (#8596)
Persist followup rebalance in assignment and consolidate rebalance triggering mechanisms

Reviewers: John Roesler <vvcephei@apache.org>
2020-05-12 15:57:18 -05:00
Bruno Cadonna c19a3be198
KAFKA-6145: Set HighAvailabilityTaskAssignor as default in streams_upgrade_test.py (#8613)
Generalize the verification in the upgrade test so that it
does not rely on the task assignor's behavior.

Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
2020-05-07 21:10:44 -05:00
Lucas Bradstreet 4e1d6a3d04 MINOR: add support for kafka 2.4 and 2.5 to downgrade test
The downgrade test does not currently support 2.4 and 2.5. When you enable them, it fails as a result of consumer group static membership. This PR makes the downgrade test work with all of our released versions again.

Author: Lucas Bradstreet <lucas@confluent.io>

Reviewers: Boyang Chen, Gwen Shapira

Closes #8518 from lbradstreet/downgrade-test-2.4-2.5
2020-04-28 18:01:24 -07:00
John Roesler 5bb3415c77
KAFKA-6145: KIP-441: Add TaskAssignor class config (#8541)
* add a config to set the TaskAssignor
* set the default assignor to HighAvailabilityTaskAssignor
* fix broken tests (with some TODOs in the system tests)

Implements: KIP-441
Reviewers: Bruno Cadonna <bruno@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>
2020-04-28 15:57:11 -05:00
John Roesler 88eae49a8d
MINOR: document how to escape json parameters to ducktape tests (#8546)
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2020-04-27 18:22:00 -05:00
Bruno Cadonna 362d199dbe
HOTFIX: Fix broker bounce system tests (#8532)
Reviewers: Boyang Chen <boyang@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2020-04-24 08:49:47 -07:00
Lucas Bradstreet b161358998
MINOR: Downgrade test should wait for ISR rejoin between rolls (#8495)
I added a change to the upgrade test a while back that would make it wait for
ISR rejoin before rolls. This prevents incompatible brokers charging through a
bad roll and disguising a downgrade problem.

We now also check for protocol errors in the broker logs.

Reviewers: Boyang Chen <boyang@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2020-04-23 00:15:43 -07:00
Boyang Chen df41713d64
KAFKA-9779: Add Stream system test for 2.5 release (#8378)
Reviewer: Matthias J. Sax <matthias@confluent.io>
2020-04-15 15:59:03 -07:00
Matthias J. Sax 17f9879261
KAFKA-9832: extend Kafka Streams EOS system test (#8440)
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2020-04-15 13:13:23 -07:00
Rajini Sivaram 8820055744
KAFKA-9797; Fix TestSecurityRollingUpgrade.test_enable_separate_interbroker_listener (#8403)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2020-04-15 13:04:11 +01:00
Ewen Cheslack-Postava cadb3499ff
MINOR: Upgrade ducktape to 0.7.7 (#8487)
This fixes a version pinning issue where a transitive dependency had a
major version upgrade that a dependency did not account for, breaking
the build.

Reviewers: Andrew Egelhofer <aegelhofer@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2020-04-14 16:36:52 -07:00
Boyang Chen ea47a885b1
MINOR: remove stream simple benchmark suite (#8353)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2020-04-14 09:49:03 -07:00
Matthias J. Sax 20e4a74c35
KAFKA-9832: Extend Streams system tests for EOS-beta (#8443)
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2020-04-10 11:55:01 -07:00
Boyang Chen 7f640f13b4
KAFKA-9776: Downgrade TxnCommit API v3 when broker doesn't support (#8375)
Revert the decision for the sendOffsetsToTransaction(groupMetadata) API to fail with old version of brokers for the sake of making the application easier to adapt between versions. This PR silently downgrade the TxnOffsetCommit API when the build version is small than 3.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2020-04-02 21:48:37 -07:00
Ron Dagostino f0ad03069a
MINOR: System test ZooKeeper upgrades (#8384)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2020-04-02 23:23:48 +05:30
Michal T 86a3ebe537
MINOR: Fix typo in version 2.4.1 of kafka folder in Dockerfile (#8393) 2020-04-01 17:56:47 -07:00
Matthias J. Sax 6ad5407350
KAFKA-9719: Streams with EOS-beta should fail fast for older brokers (#8367)
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2020-03-30 15:21:27 -07:00
Bill Bejeck c725c2338b
MINOR: Update dependencies.gradle, Dockerfile, version.py, and bash.sh for 2.4.1 upgrade (#8387)
These files were missed in the 2.4.1 release

Reviewers: Ismael Juma <ismael@confluent.io>
2020-03-30 12:55:35 -04:00
Nikolay befd80b38d
KAFKA-9573: Fix JVM options to run early versions of Kafka on the latest JVMs (#8138)
Startup scripts for the early version of Kafka contain removed JVM options like `-XX:+PrintGCDateStamps` or `-XX:UseParNewGC`. 
When system tests run on JVM that doesn't support these options we should set up
environment variables with correct options.

Reviewers: Guozhang Wang <guozhang@confluent.io>, Ron Dagostino <rdagostino@confluent.io>, Ismael Juma <ismael@juma.me.uk
2020-03-25 10:31:07 -07:00
A. Sophie Blee-Goldman e1cbefef60
HOTFIX: fix log message in version probing system test (#8341)
Reviewer: Matthias J. Sax <matthias@confluent.io>
2020-03-24 21:46:37 -07:00
Rajini Sivaram 6b419933a0
KAFKA-9662: Wait for consumer offset reset in throttle test to avoid losing early messages (#8227) 2020-03-06 14:50:22 -05:00
A. Sophie Blee-Goldman 674360f5b3
KAFKA-6145: Encode task positions in SubscriptionInfo (#8121)
* Replace Prev/Standby task lists with a representation of the current poasition
  of all tasks, where each task is encoded as the sum of the positions of all the
  changelogs in that task.
* Only the protocol change is implemented, not actual positions, and the
  assignor is updated to translate the new protocol back to lists of Prev/Standby
  tasks so that the current assignment protocol still functions without modification.

Implements: KIP-441

Reviewers: John Roesler <vvcephei@apache.org>, Bruno Cadonna <bruno@confluent.io>
2020-03-06 09:19:04 -06:00
A. Sophie Blee-Goldman a1f2ece323
KAFKA-9525: add enforceRebalance method to Consumer API (#8087)
As described in KIP-568.

Waiting on acceptance of the KIP to write the tests, on the off chance something changes. But rest assured unit tests are coming ️

Will also kick off existing Streams system tests which leverage this new API (eg version probing, sometimes broker bounce)

Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2020-02-29 18:44:22 -08:00
Brian Bushree 72a5aa8b07
MINOR: add wait_for_assigned_partitions to console-consumer (#8192)
what/why
the throttling_test was broken by this PR (#7785) since it depends on the consumer having partitions-assigned before starting the producer

this PR provides the ability to wait for partitions to be assigned in the console consumer before considering it started.

caveat
this does not support starting up the JmxTool inside the console-consumer for custom metrics while using this wait_until_partitions_assigned flag since the code assumes one JmxTool running per node.

I think a proper fix for this would be to make JmxTool its own standalone single-node service

alternatives
we could use the EndToEnd test suite which uses the verifiable producer/consumer under the hood but I found that there were more changes necessary to get this working unfortunately (specifically doesn't seem like this test suite plays nicely with the ProducerPerformanceService)

Reviewers: Mathew Wong <mwong@confluent.io>, Bill Bejeck <bbejeck.com>
2020-02-29 19:43:51 -05:00
Matthew Wong 294b62963b
throttle consumer timeout increase (#8188)
The test_throttled_reassignment test fails because the consumer that is used to validate reassignment does not start on time to consume all messages. This does not seem like an issue with the throttling of the reassignment, since increasing the timeout allowed the test to pass multiple consecutive runs locally.

This test seemed to rely on the default JmxTool for the console consumer that was removed in this commit: 179d0d7
The console consumer would check to see if it had partitions assigned to it before beginning to consume. Although the test occasionally failed with the JmxTool, it began to fail much more after the removal.

Error messages of failures followed the below format with varying numbers of missed messages. They are the first messages by the producer.

535 acked message did not make it to the Consumer. They are: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19...plus 515 more. Total Acked: 192792, Total Consumed: 192259. We validated that the first 535 of these missing messages correctly made it into Kafka's data files. This suggests they were lost on their way to the consumer.
In the scope of the test, this error suggests that the test is falling into the race condition described in produce_consume_validate.py, which has the timeout to prevent the consumer from missing initial messages.

This can serve as a temporary fix until the logic of consumer startup is addressed further.

Reviewers: Jason Gustafson <jason@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
2020-02-27 17:46:55 -05:00
Ron Dagostino 9d53ad794d KAFKA-9567: Docs, system tests for ZooKeeper 3.5.7
These changes depend on [KIP-515: Enable ZK client to use the new TLS supported authentication](https://cwiki.apache.org/confluence/display/KAFKA/KIP-515%3A+Enable+ZK+client+to+use+the+new+TLS+supported+authentication), which was only added to 2.5.0. The upgrade to ZooKeeper 3.5.7 was merged to both 2.5.0 and 2.4.1 via https://issues.apache.org/jira/browse/KAFKA-9515, but this change must only be merged to 2.5.0 (it will break the system tests if merged to 2.4.1).

Author: Ron Dagostino <rdagostino@confluent.io>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Andrew Choi <li_andchoi@microsoft.com>

Closes #8132 from rondagostino/KAFKA-9567
2020-02-25 19:59:55 +05:30
Nikolay f364281431
KAFKA-9319: Fix generation of CA certificate for system tests. (#8106)
Newer versions of Java have added checks to ensure that trust anchors are CA certificates and contain proper extensions. This PR adds Basic Constraints extension with the CA field set to true for system tests.

Reviewers: ajini Sivaram <rajinisivaram@googlemail.com>
2020-02-17 09:49:35 +00:00
Boyang Chen 07db26c20f
KAFKA-9417: New Integration Test for KIP-447 (#8000)
This change mainly have 2 components:

1. extend the existing transactions_test.py to also try out new sendTxnOffsets(groupMetadata) API to make sure we are not introducing any regression or compatibility issue
  a. We shrink the time window to 10 seconds for the txn timeout scheduler on broker so that we could trigger expiration earlier than later

2. create a completely new system test class called group_mode_transactions_test which is more complicated than the existing system test, as we are taking rebalance into consideration and using multiple partitions instead of one. For further breakdown:
  a. The message count was done on partition level, instead of global as we need to visualize 
the per partition order throughout the test. For this sake, we extend ConsoleConsumer to print out the data partition as well to help message copier interpret the per partition data.
  b. The progress count includes the time for completing the pending txn offset expiration
  c. More visibility and feature improvements on TransactionMessageCopier to better work under either standalone or group mode.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2020-02-12 12:34:12 -08:00
Ron Dagostino 342f13a838 KAFKA-8843: KIP-515: Zookeeper TLS support
Signed-off-by: Ron Dagostino <rdagostinoconfluent.io>

Author: Ron Dagostino <rdagostino@confluent.io>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #8003 from rondagostino/KAFKA-8843
2020-02-08 21:16:48 +05:30
Guozhang Wang 4090f9a2b0
KAFKA-9113: Clean up task management and state management (#7997)
This PR is collaborated by Guozhang Wang and John Roesler. It is a significant tech debt cleanup on task management and state management, and is broken down by several sub-tasks listed below:

Extract embedded clients (producer and consumer) into RecordCollector from StreamTask.
guozhangwang#2
guozhangwang#5

Consolidate the standby updating and active restoring logic into ChangelogReader and extract out of StreamThread.
guozhangwang#3
guozhangwang#4

Introduce Task state life cycle (created, restoring, running, suspended, closing), and refactor the task operations based on the current state.
guozhangwang#6
guozhangwang#7

Consolidate AssignedTasks into TaskManager and simplify the logic of changelog management and task management (since they are already moved in step 2) and 3)).
guozhangwang#8
guozhangwang#9

Also simplified the StreamThread logic a bit as the embedded clients / changelog restoration logic has been moved into step 1) and 2).
guozhangwang#10

Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Boyang Chen <boyang@confluent.io>
2020-02-04 21:06:39 -08:00
David Arthur 7e776b0462
Bump trunk to 2.6.0-SNAPSHOT (#8026) 2020-02-03 13:04:56 -05:00
vinoth chandar 71c5729a41 KAFKA-6144: Add KeyQueryMetadata APIs to KafkaStreams (#7960)
Deprecate existing metadata query APIs in favor of new
ones that include standby hosts as well as partition
information.

Closes: #7960
Implements: KIP-535
Co-authored-by: Navinder Pal Singh Brar <navinder_brar@yahoo.com>
Reviewed-by: John Roesler <vvcephei@apache.org>
2020-01-15 09:39:02 -06:00
Brian Bushree 422bc1f0fa MINOR: Disable JmxTool in kafkatest console-consumer by default (#7785)
Do not initialize `JmxTool` by default when running console consumer. In order to support this, we remove `has_partitions_assigned` and its only usage in an assertion inside `ProduceConsumeValidateTest`, which did not seem to contribute much to the validation.

Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>
2020-01-09 16:53:36 -08:00
A. Sophie Blee-Goldman 3453e9e2ee HOTFIX: fix system test race condition (#7836)
In some system tests a Streams app is started and then prints a message to stdout, which the system test waits for to confirm the node has successfully been brought up. It then greps for certain log messages in a retriable loop.

But waiting on the Streams app to start/print to stdout does not mean the log file has been created yet, so the grep may return an error. Although this occurs in a retriable loop it is assumed that grep will not fail, and the result is piped to wc and then blindly converted to an int in the python function, which fails since the error message is a string (throws ValueError)

We should catch the ValueError and return a 0 so it can try again rather than immediately crash

Reviewers: Bill Bejeck <bbejeck@gmail.com>, John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
2019-12-31 18:44:31 -08:00
Lucas Bradstreet 8fd7cd6a43 MINOR: upgrade system test should check for ISR rejoin on each roll (#7827)
The upgrade system test correctly rolls by upgrading the broker and 
leaving the IBP, and then rolling again with the latest IBP version.
Unfortunately, this is not sufficient to pick up many problems in our IBP
gating as we charge through the rolls and after the second roll all of
the brokers will rejoin the ISR and the test will be treated as a
success.

This test adds two new checks:
1. We wait for the ISR to stabilize for all partitions. This is best
practice during rolls, and is enough to tell us if a broker hasn't
rejoined after each roll.
2. We check the broker logs for some common protocol errors. This is a
fail safe as it's possible for the test to be successful even if some
protocols are incompatible and the ISR is rejoined.

Reviewers: Nikhil Bhatia <nikhil@confluent.io>, Jason Gustafson <jason@confluent.io>
2019-12-30 11:02:30 -08:00
Bruno Cadonna 1d21cf166a KAFKA-9305: Add version 2.4 to Streams system tests (#7841)
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-12-20 14:21:12 -08:00
Manikumar Reddy b50d925e07
MINOR: Add compatibility tests for 2.4.0 (#7838)
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2019-12-17 19:44:32 +05:30
John Roesler 717ce42a6d
KAFKA-9138: Add system test for relational joins (#7664)
Add a system test to verify the new foreign-key join introduced in KIP-213

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-12-11 09:48:23 -08:00
Randall Hauch 39d68de393
MINOR: Simplify the timeout logic to handle protocol in Connect distributed system tests (#7806) 2019-12-10 12:30:49 -06:00
Randall Hauch ccded348eb
MINOR: Bump system test version from 2.2.1 to 2.2.2 (#7765)
Author: Randall Hauch <rhauch@gmail.com>
Reviewer: Ismael Juma <ismael@confluent.io>
2019-12-06 15:44:56 -06:00
Randall Hauch 9d9139024a
MINOR: Increase the timeout in one of Connect's distributed system tests (#7789)
Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Nigel Liang <nigel@nigelliang.com>, Roesler <john@confluent.io>
2019-12-06 15:33:43 -06:00
David Arthur 91f948763d
Actually run the delete topic command in kafka.py (#7776)
Reviewed-By: Jason Gustafson <jason@confluent.io>
2019-12-04 17:54:37 -05:00
Jason Gustafson e057d61050
MINOR: Fix --enable-autocommit flag in verifiable consumer (#7743)
The --enable-autocommit argument is a flag. It does not take a parameter. This was broken in #7724.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Manikumar Reddy <manikumar.reddy@gmail.com>
2019-11-23 14:18:42 -08:00
David Arthur b15e05d925
KAFKA-9123 Test a large number of replicas (#7621)
Two tests using 50k replicas on 8 brokers:
* Do a rolling restart with clean shutdown, delete topics
* Run produce bench and consumer bench on a subset of topics

Reviewed-By: David Jacot <djacot@confluent.io>, Vikas Singh <vikas@confluent.io>, Jason Gustafson <jason@confluent.io>
2019-11-22 19:32:08 -05:00
Jason Gustafson 9d8ab3a1a2
KAFKA-8509; Add downgrade system test (#7724)
This patch adds a basic downgrade system test. It verifies that producing and consuming continues to work before and after the downgrade.

Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>
2019-11-22 10:09:13 -08:00
Bruno Cadonna 2604d39b9f MINOR: Fix Streams EOS system tests by adding clean-up of state dir (#7693)
Recently, system tests test_rebalance_[simple|complex] failed
repeatedly with a verfication error. The cause was most probably
the missing clean-up of a state directory of one of the processors.

A node is cleaned up when a service on that node is started and when
a test is torn down.

If the clean-up flag clean_node_enabled of a EOS Streams service is
unset, the clean-up of the node is skipped.

The clean-up flag of processor1 in the EOS tests should stay set before
its first start, so that the node is cleaned before the service is started.
Afterwards for the multiple restarts of processor1 the cleans-up flag should
be unset to re-use the local state.

After the multiple restarts are done, the clean-up flag of processor1 should
again be set to trigger node clean-up during the test teardown.

A dirty node can lead to test failures when tests from Streams EOS tests are
scheduled on the same node, because the state store would not start empty
since it reads the local state that was not cleaned up.

Reviewers: Matthias J. Sax <mjsax@apache.org>, Andrew Choi <andchoi@linkedin.com>, Bill Bejeck <bbejeck@gmail.com>
2019-11-21 10:32:31 -05:00
David Arthur cca0225390
Fix missing reference in kafka.py (#7715)
Also fix a default value for a dictionary arg
2019-11-19 15:19:48 -05:00
David Arthur d04699486d
KAFKA-8981 Add rate limiting to NetworkDegradeSpec (#7446)
* Add rate limiting to tc

* Feedback from PR

* Add a sanity test for tc

* Add iperf to vagrant scripts

* Dynamically determine the network interface

* Add some temp code for testing on AWS

* Temp: use hostname instead of external IP

* Temp: more AWS debugging

* More AWS WIP

* More AWS temp

* Lower latency some

* AWS wip

* Trying this again now that ping should work

* Add cluster decorator to tests

* Fix broken import

* Fix device name

* Fix decorator arg

* Remove errant import

* Increase timeouts

* Fix tbf command, relax assertion on latency test

* Fix log line

* Final bit of cleanup

* Newline

* Revert Trogdor retry count

* PR feedback

* More PR feedback

* Feedback from PR

* Remove unused argument
2019-11-18 20:36:00 -05:00
Brian Bushree 9fb22868fe [MINOR] allow additional JVM args in KafkaService (#7297)
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Vikas Singh <vikas@confluent.io>
2019-11-18 10:50:36 -08:00
Colin Patrick McCabe 7f49674439
KAFKA-8746: Kibosh must handle an empty JSON string from Trogdor (#7155)
When Trogdor wants to clear all the faults injected to Kibosh, it sends the empty JSON object {}. However, Kibosh expects {"faults":[]} instead.  Kibosh should handle the empty JSON object, since that's consistent with how Trogdor handles empty JSON fields in general (if they're empty, they can be omitted). We should also have a test for this.

Reviewers: David Arthur <mumrah@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2019-11-15 15:13:32 -08:00
John Roesler cac85601a0
KAFKA-9169: fix standby checkpoint initialization (#7681)
Instead of caching the checkpoint map during StandbyTask
initialization, use the latest checkpoints (which would have
been updated during suspend).

Reviewers: Bill Bejeck <bill@confluent.io>
2019-11-13 22:03:44 -06:00
Colin P. Mccabe 67fd88050f KAFKA-8984: Improve tagged fields documentation
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Vikas Singh <vikas@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #7477 from cmccabe/KAFKA-8984
2019-11-09 10:37:48 +05:30
Jason Gustafson 903d66e2f9 KAFKA-9079: Fix reset logic in transactional message copier
The consumer's `committed` API does not return an entry in the response map for a requested partition if there is no committed offset. The transactional message copier, which is used in the transaction system test, did not account for this. If the first transaction attempted by the copier was randomly aborted, then we would not seek to the beginning as expected, which means we would fail to copy some of the records.

This patch fixes the problem by iterating over the assignment rather than the result of `committed` when resetting offsets. It also adds enables additional logging in the transaction message copier service to make finding problems easier in the future.

Author: Jason Gustafson <jason@confluent.io>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #7653 from hachikuji/fix-transaction-system-test
2019-11-06 15:59:51 +05:30
Bruno Cadonna 065411aa22 KAFKA-9077: Fix reading of metrics of Streams' SimpleBenchmark (#7610)
With KIP-444 the metrics definitions are refactored. Thus, Streams' SimpleBenchmark needs to be updated to correctly access the refactored metrics.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>,  Bill Bejeck <bbejeck@gmail.com>
2019-10-29 16:02:15 -04:00
Boyang Chen 465f810730 KAFKA-8972 (2.4 blocker): correctly release lost partitions during consumer.unsubscribe() (#7441)
Inside onLeavePrepare we would look into the assignment and try to revoke the owned tasks and notify users via RebalanceListener#onPartitionsRevoked, and then clear the assignment.

However, the subscription's assignment is already cleared in this.subscriptions.unsubscribe(); which means user's rebalance listener would never be triggered. In other words, from consumer client's pov nothing is owned after unsubscribe, but from the user caller's pov the partitions are not revoked yet. For callers like Kafka Streams which rely on the rebalance listener to maintain their internal state, this leads to inconsistent state management and failure cases.

Before KIP-429 this issue is hidden away since every time the consumer re-joins the group later, it would still revoke everything anyways regardless of the passed-in parameters of the rebalance listener; with KIP-429 this is easier to reproduce now.

Our fixes are following:

• Inside unsubscribe, first do onLeavePrepare / maybeLeaveGroup and then subscription.unsubscribe. This we we are guaranteed that the streams' tasks are all closed as revoked by then.
• [Optimization] If the generation is reset due to fatal error from join / hb response etc, then we know that all partitions are lost, and we should not trigger onPartitionRevoked, but instead just onPartitionsLost inside onLeavePrepare. This is because we don't want to commit for lost tracks during rebalance which is doomed to fail as we don't have any generation info.

Reviewers: Matthias J. Sax <matthias@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2019-10-29 10:41:25 -07:00
Bill Bejeck c015169aa6
MINOR: Streams upgrade system test cleanup (#7571)
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>,
2019-10-24 10:28:29 -04:00
Konstantine Karantasis fc0963a236 KAFKA-9078: Fix Connect system test after adding MM2 connector classes
MM2 added a few connector classes in Connect's classpath and given that the assertion in the Connect REST system tests need to be adjusted to account for these additions.

This fix makes sure that the loaded Connect plugins are a superset of the expected by the test connectors.

Testing: The change is straightforward. The fix was tested with local system test runs.

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #7578 from kkonstantine/minor-fix-connect-test-after-mm2-classes
2019-10-23 15:46:26 +05:30
Bill Bejeck 6afe05fe89
MINOR: system test clean up (#7552)
Guozhang Wang <wangguoz@gmail.com>, Sophie Blee-Goldman <sophie@confluent.io>,
2019-10-21 10:51:15 -04:00
Bill Bejeck b62f2a1123 KAFKA-8496: System test for KIP-429 upgrades and compatibility (#7529)
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-10-16 22:29:33 -07:00
A. Sophie Blee-Goldman 11a401d51d delete (#7504)
This system test was marked @Ignore around a year and a half ago pending the version probing work, but never turned on again.

These days, it is made redundant by the suite of system tests in streams_upgrade_test, which cover rolling upgrades (including version probing and metadata change).

Reviewers: Boyang Chen <boyang@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
2019-10-14 11:41:44 -04:00
A. Sophie Blee-Goldman cc6525a746 KAFKA-8743: Flaky Test Repartition{WithMerge}OptimizingIntegrationTest (#7472)
All four flavors of the repartition/optimization tests have been reported as flaky and failed in one place or another:
* RepartitionOptimizingIntegrationTest.shouldSendCorrectRecords_OPTIMIZED
* RepartitionOptimizingIntegrationTest.shouldSendCorrectRecords_NO_OPTIMIZATION
* RepartitionWithMergeOptimizingIntegrationTest.shouldSendCorrectRecords_OPTIMIZED
* RepartitionWithMergeOptimizingIntegrationTest.shouldSendCorrectRecords_NO_OPTIMIZATION

They're pretty similar so it makes sense to knock them all out at once. This PR does three things:

* Switch to in-memory stores wherever possible
* Name all operators and update the Topology accordingly (not really a flaky test fix, but had to update the topology names anyway because of the IM stores so figured might as well)
* Port to TopologyTestDriver -- this is the "real" fix, should make a big difference as these repartition tests required multiple roundtrips with the Kafka cluster (while using only the default timeout)

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-10-10 16:23:18 -07:00
A. Sophie Blee-Goldman f9934e7e93 MINOR: remove unused imports in Streams system tests (#7468)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-10-08 14:57:07 -07:00
A. Sophie Blee-Goldman d88f1048da KAFKA-8179: Part 7, cooperative rebalancing in Streams (#7386)
Key improvements with this PR:

* tasks will remain available for IQ during a rebalance (but not during restore)
* continue restoring and processing standby tasks during a rebalance
* continue processing active tasks during rebalance until the RecordQueue is empty*
* only revoked tasks must suspended/closed
* StreamsPartitionAssignor tries to return tasks to their previous consumers within a client
* but do not try to commit, for now (pending KAFKA-7312)


Reviewers: John Roesler <john@confluent.io>, Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-10-07 09:27:09 -07:00
Manikumar Reddy 4c2bd567b1
MINOR: Bump version to 2.5.0-SNAPSHOT (#7455) 2019-10-07 20:04:57 +05:30
A. Sophie Blee-Goldman 3b05dc685b MINOR: just remove leader on trunk like we did on 2.3 (#7447)
Small follow-up to trunk PR #7423

While debugging the 2.3 VP PR we realized we should remove the leader-tracking from the VP system test altogether. We'd already merged the corresponding trunk PR so I made a quick new PR for trunk (also fixes a missed version bump in one of the log messages)

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-10-04 15:58:11 -07:00
Chris Egerton 791d0d61bf KAFKA-8804: Secure internal Connect REST endpoints (#7310)
Implemented KIP-507 to secure the internal Connect REST endpoints that are only for intra-cluster communication. A new V2 of the Connect subprotocol enables this feature, where the leader generates a new session key, shares it with the other workers via the configuration topic, and workers send and validate requests to these internal endpoints using the shared key.

Currently the internal `POST /connectors/<connector>/tasks` endpoint is the only one that is secured.

This change adds unit tests and makes some small alterations to system tests to target the new `sessioned` Connect subprotocol. A new integration test ensures that the endpoint is actually secured (i.e., requests with missing/invalid signatures are rejected with a 400 BAD RESPONSE status).

Author: Chris Egerton <chrise@confluent.io>
Reviewed: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-10-02 17:06:57 -05:00
A. Sophie Blee-Goldman 8da69936a7 KAFKA-8649: Send latest commonly supported version in assignment (#7423)
Instead of sending the leader's version and having older members try to blindly upgrade.

The only other real change here is that we will also set the VERSION_PROBING error code and return early from onAssignment when we are upgrading our used subscription version (not just downgrading it) since this implies the whole group has finished the rolling upgrade and all members should rejoin with the new subscription version.

Also piggy-backing on a fix for a potentially dangerous edge case, where every thread of an instance is assigned the same set of active tasks.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-10-02 08:54:32 -07:00
Rajini Sivaram 0d31272b35
KAFKA-8848; Update system tests to use new AclAuthorizer (#7374)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2019-09-24 10:30:17 +01:00
Randall Hauch ada35d5ff4 Add recent versions of Kafka to the matrix of ConnectDistributedTest (#7024)
Reviewers: Arjun Satish <arjun@confluent.io>, Konstantine Karantasis <k.karantasis@gmail.com>
2019-09-18 10:21:21 -07:00
Vikas Singh 312e4db590 MINOR. implement --expose-ports option in ducker-ak (#7269)
This change adds a command line option to the `ducker-ak up' command to enable exposing ports from docker containers. The exposed ports will be mapped to the ephemeral ports on the host. The option is called `expose-ports' and can take either a single value (like 5005) or a range (like 5005-5009). This port will then exposed from each docker container that ducker-ak sets up.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@users.noreply.github.com>
2019-09-09 07:57:29 -07:00
vinoth chandar ffef0871c2 KAFKA-7149 : Reducing streams assignment data size (#7185)
* Leader instance uses dictionary encoding on the wire to send topic partitions
* Topic names (most expensive component) are mapped to an integer using the dictionary
* Follower instances receive the dictionary, decode topic names back
* Purely an on-the-wire optimization, no in-memory structures changed
* Test case added for version 5 AssignmentInfo

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-09-05 13:50:55 -07:00
Colin Patrick McCabe a225347ff2
KAFKA-8840: Fix bug where ClientCompatibilityFeaturesTest fails when running multiple iterations (#7260)
Fix a bug where ClientCompatibilityFeaturesTest fails when running multiple iterations.

Also, fix a typo in tests/docker/Dockerfile.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2019-08-30 16:07:59 -07:00
Vikas Singh 09ad6b84c5 MINOR. Fix 2.3.0 streams systest dockerfile typo (#7272)
As part of commit 4d1ee26a13 streams
version 2.3.0 test jar was added, but there was a simple typo in the
path that specified the version.

`ducker-ak up` was failing because of that. Fixed that.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-08-29 14:12:08 -07:00
Matthias J. Sax 4d1ee26a13
KAFKA-8594: Add version 2.3 to Streams system tests (#7131)
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bill@confluent.io>
2019-08-21 10:26:57 -07:00
Brian Bushree 6c8f654d5f MINOR: Upgrade ducktape to 0.7.6
Author: Brian Bushree <bbushree@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #7138 from brianbushree/update-ducktape
2019-08-19 17:12:48 -07:00
David Arthur ff9e95cb09 MINOR: Add fetch from follower system test (#7166)
This adds a basic system test that enables rack-aware brokers with the rack-aware replica selector for fetch from followers (KIP-392). The test asserts that the follower was read from at least once and that all the messages that were produced were successfully consumed. 

Reviewers: Jason Gustafson <jason@confluent.io>
2019-08-13 12:33:05 -07:00
Arjun Satish 794637232c KAFKA-8774: Regex can be found anywhere in config value (#7197)
Corrected the AbstractHerder to correctly identify task configs that contain variables for externalized secrets. The original method incorrectly used `matcher.matches()` instead of `matcher.find()`. The former method expects the entire string to match the regex, whereas the second one can find a pattern anywhere within the input string (which fits this use case more correctly).

Added unit tests to cover various cases of a config with externalized secrets, and updated system tests to cover case where config value contains additional characters besides secret that requires regex pattern to be found anywhere in the string (as opposed to complete match).

Author: Arjun Satish <arjun@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-08-13 09:40:12 -05:00
Matthias J. Sax e9a35fe02e
MINOR: Bump system test version from 2.2.0 to 2.2.1 (#6873)
Reviewers: Boyang Chen <boyang@confluent.io>, Bill Bejeck <bill@confluent.io>
2019-08-09 14:33:20 -07:00
Rajini Sivaram de8ce78a90
MINOR: Tolerate limited data loss for upgrade tests with old message format (#7102)
To avoid transient system test failures, tolerate a small amount of data loss due to truncation in upgrade system tests using older message format prior to KIP-101, where data loss was possible.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2019-07-31 16:19:36 +01:00
Brian Bushree e5f7220b23 MINOR: kafkatest - adding whitelist for interbroker sasl configs (#7093) 2019-07-22 09:38:28 +01:00
Ismael Juma d67495d6a7
KAFKA-8634: Update ZooKeeper to 3.5.5 (#6802)
ZooKeeper 3.5.5 is the first stable release in the 3.5.x series. The key new feature
in is TLS support, but there are a few more noteworthy features:

* Dynamic reconfiguration
* Local sessions
* New node types: Container, TTL
* Ability to remove watchers
* Multi-threaded commit processor
* Upgraded to Netty 4.1

See the release notes for more detail:
https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html

In addition to the version bump, we:

* Add `commons-cli` dependency as it's required by `ZooKeeperMain`, but specified as
`provided` in their pom.
* Remove unnecessary `ZooKeeperMainWrapper`, the bug it worked around was fixed
upstream a long time ago.
* Ignore non zero exit in one system test invocation of `ZooKeeperMain`.
`ZooKeeperMainWrapper` always returned `0` and `ZooKeeperService.query` relies
on that for correct behavior.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-07-10 09:45:10 -07:00
Ismael Juma 57903be496
MINOR: Remove zkclient dependency (#7036)
ZkUtils was removed so we don't need this anymore.

Also:
* Fix ZkSecurityMigrator and ReplicaManagerTest not to
reference ZkClient classes.
* Remove references to zkclient in various `log4j.properties`
and `import-control.xml`.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2019-07-05 07:50:32 -07:00
David Arthur 23beeea34b KAFKA-8443; Broker support for fetch from followers (#6832)
Follow on to #6731, this PR adds broker-side support for [KIP-392](https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica) (fetch from followers). 

Changes:
* All brokers will handle FetchRequest regardless of leadership
* Leaders can compute a preferred replica to return to the client
* New ReplicaSelector interface for determining the preferred replica
* Incremental fetches will include partitions with no records if the preferred replica has been computed
* Adds new JMX to expose the current preferred read replica of a partition in the consumer

Two new conditions were added for completing a delayed fetch. They both relate to communicating the high watermark to followers without waiting for a timeout:
* For regular fetches, if the high watermark changes within a single fetch request 
* For incremental fetch sessions, if the follower's high watermark is lower than the leader

A new JMX attribute `preferred-read-replica` was added to the `kafka.consumer:type=consumer-fetch-manager-metrics,client-id=some-consumer,topic=my-topic,partition=0` object. This was added to support the new system test which verifies that the fetch from follower behavior works end-to-end. This attribute could also be useful in the future when debugging problems with the consumer.

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>
2019-07-04 08:18:51 -07:00
Brian Bushree 5287036b38 MINOR: system tests - avoid 'sasl.enabled.mechanisms' in listener overrides (#7018)
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2019-07-03 12:17:17 +01:00
Randall Hauch 6f91096c7d MINOR: Fix version for ConnectDistributed system test, remove 0.9.0.1 compatibility test (#7023)
Connect tests were using String version for KafkaService instead of the expected KafkaVersion object. This broke due to recent changes to KafkaVersion. It turns out that the tests with String version were running compatibility tests against `dev` brokers rather than the older broker versions they were expecting to run against. When version was fixed, tests using 0.9.0.1 brokers started failing since new clients are not compatible with 0.9.0.1 brokers. So this PR fixes version parameter and removes the two tests against 0.9.0.1 brokers.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>
2019-07-02 19:10:41 +01:00
Colin Patrick McCabe 3d2d87abd1
MINOR: Add compatibility tests for 2.3.0 (#6995)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2019-06-28 09:25:08 -07:00
Brian Bushree 357aedeb1b MINOR: Support listener config overrides in system tests (#6981)
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2019-06-27 18:10:43 +01:00
Stanislav Vodetskyi 594d043037 MINOR: Fix failing upgrade test by supporting both security.inter.broker.protocol and inter.broker.listener.name depending on kafka version (#7000)
Reviewers: Brian Bushree <bbushree@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2019-06-27 17:50:17 +01:00
Stanislav Vodetskyi f51d7d3c93 KAFKA-8557: system tests - add support for (optional) interbroker listener with the same security protocol as client listeners (#6938)
Reviewers: Brian Bushree <bbushree@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2019-06-21 17:51:43 +01:00
David Arthur d7a5e31ca2
KAFKA-8519 Add trogdor action to slow down a network (#6912)
This adds a new Trogdor fault spec for inducing network latency on a network device for system testing. It operates very similarly to the existing network partition spec by executing the `tc` linux utility.
2019-06-21 11:30:05 -04:00
Boyang Chen e981b82601 KAFKA-8500; Static member rejoin should always update member.id (#6899)
This PR fixes a bug in static group membership. Previously we limit the `member.id` replacement in JoinGroup to only cases when the group is in Stable. This is error-prone and could potentially allow duplicate consumers reading from the same topic. For example, imagine a case where two unknown members join in the `PrepareRebalance` stage at the same time. 

The PR fixes the following things:

1. Replace `member.id` at any time we see a known static member rejoins group with unknown member.id
2. Immediately fence any ongoing join/sync group callback to early terminate the duplicate member.
3. Clearly handle Dead/Empty cases as exceptional.
4. Return old leader id upon static member leader rejoin to avoid trivial member assignment being triggered.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
2019-06-12 08:41:58 -07:00
Jason Gustafson 2feb44ebc8 MINOR: Fix race condition on shutdown of verifiable producer
We've seen `ReplicaVerificationToolTest.test_replica_lags` fail occasionally due to errors such as the following:
```
RemoteCommandError: ubuntuworker7: Command 'kill -15 2896' returned non-zero exit status 1. Remote error message: bash: line 0: kill: (2896) - No such process
```
The problem seems to be a shutdown race condition when using `max_messages` with the producer. The process may already be gone which will cause the signal to fail.

Author: Jason Gustafson <jason@confluent.io>

Reviewers: Gwen Shapira

Closes #6906 from hachikuji/fix-failing-replicat-verification-test
2019-06-07 16:56:21 -07:00
Jason Gustafson c7c310beff MINOR: Lower producer throughput in flaky upgrade system test
We see the upgrade test failing from time to time. I looked into it and found that the root cause is basically that the test throughput can be too high for the 0.9 producer to make progress. Eventually it reaches a point where it has a huge backlog of timed out requests in the accumulator which all have to be expired. We see a long run of messages like this in the output:

```
{"exception":"class org.apache.kafka.common.errors.TimeoutException","time_ms":1559907386132,"name":"producer_send_error","topic":"test_topic","message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer","value":"335160","key":null}
{"exception":"class org.apache.kafka.common.errors.TimeoutException","time_ms":1559907386132,"name":"producer_send_error","topic":"test_topic","message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer","value":"335163","key":null}
{"exception":"class org.apache.kafka.common.errors.TimeoutException","time_ms":1559907386133,"name":"producer_send_error","topic":"test_topic","message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer","value":"335166","key":null}
{"exception":"class org.apache.kafka.common.errors.TimeoutException","time_ms":1559907386133,"name":"producer_send_error","topic":"test_topic","message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer","value":"335169","key":null}
```
This can continue for a long time (I have observed up to 1 min) and prevents the producer from successfully writing any new data. While it is busy expiring the batches, no data is getting delivered to the consumer, which causes it to eventually raise a timeout.
```
kafka.consumer.ConsumerTimeoutException
at kafka.consumer.NewShinyConsumer.receive(BaseConsumer.scala:50)
at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:109)
at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:47)
at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
```
The fix here is to reduce the throughput, which seems reasonable since the purpose of the test is to verify the upgrade, which does not demand heavy load. Note that I investigated several failing instances of this test going back to 1.0 and saw a similar pattern, so there does not appear to be a regression.

Author: Jason Gustafson <jason@confluent.io>

Reviewers: Gwen Shapira

Closes #6907 from hachikuji/lower-throughput-for-upgrade-test
2019-06-07 16:53:50 -07:00
Lucas Bradstreet 677713baf3 KAFKA-8499: ensure java is in PATH for ducker system tests (#6898)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-06-07 14:23:49 -07:00
Boyang Chen cca05cace4 KAFKA-8331: stream static membership system test (#6877)
As title suggested, we boost 3 stream instances stream job with one minute session timeout, and once the group is stable, doing couple of rolling bounces for the entire cluster. Every rejoin based on restart should have no generation bump on the client side.

Reviewers: Guozhang Wang <wangguoz@gmail.com>,  Bill Bejeck <bbejeck@gmail.com>
2019-06-07 16:52:12 -04:00
Stanislav Kozlovski 58aa04f91e MINOR: Improve Trogdor external command worker docs (#6438)
Reviewers: Colin McCabe <cmccabe@apache.org>, Xi Yang <xi@confluent.io>
2019-06-06 10:04:05 -07:00
Matthias J. Sax ba3dc49437
KAFKA-8155: Add 2.2.0 release to system tests (#6597)
Reviewers: Bill Bejeck <bill@confluent.io>, Boyang Chen <boyang@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Guozhang Wang <guozhang@confuent.io>
2019-06-03 21:09:58 -07:00
Konstantine Karantasis 55d07e717e KAFKA-8473: Adjust Connect system tests for incremental cooperative rebalancing (#6872)
Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-06-03 16:50:03 -05:00
Matthias J. Sax 55bfea1378
KAFKA-8155: Add 2.1.1 release to system tests (#6596)
Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2019-05-30 12:50:30 -07:00
Alex Diachenko 77a9a108ff KAFKA-8418: Wait until REST resources are loaded when starting a Connect Worker. (#6840)
Author: Alex Diachenko <sansanichfb@gmail.com>
Reviewers: Arjun Satish <arjun@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-05-30 14:01:00 -05:00
Alex Diachenko 4838855ea7 MINOR: Fix red herring when ConnectDistributedTest.test_bounce fails. (#6838)
Author: Alex Diachenko <sansanichfb@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-05-29 17:33:24 -05:00
Bill Bejeck f249956390
MINOR: Account for different versions in upgrade (#6835)
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bruno Cadonna <bruno@confluent.io>
2019-05-29 17:02:37 -04:00
Matthias J. Sax d286051a21
MINOR: fix Streams version-probing system test (#6764)
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Boyang Chen <boyang@confluent.io>
2019-05-24 08:08:52 -07:00
Ismael Juma c823f32070 MINOR: Add 2.0, 2.1 and 2.2 to broker and client compat tests
These are important to ensure we don't break compatibility.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Gwen Shapira

Closes #6794 from ijuma/update-version-compat-tests
2019-05-23 13:34:00 -07:00
Konstantine Karantasis c6d083d7fc KAFKA-8417: Remove redundant network definition --net=host when starting testing docker containers (#6797)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-05-23 11:46:10 -07:00
Manikumar Reddy 5ca6a2ee94 MINOR: Use `jps` cmd to find out the pid of TransactionalMessageCopier
Author: Manikumar Reddy <manikumar.reddy@gmail.com>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #6787 from omkreddy/transaction_test
2019-05-22 19:12:55 +05:30
Colin Patrick McCabe 87ff83a82e
MINOR: Bump version to 2.4.0-SNAPSHOT (#6774)
Reviewers: Jason Gustafson <jason@confluent.io>
2019-05-20 12:47:21 -07:00
Jason Gustafson b52170372b
MINOR: Increase security test timeouts for transient failures (#6760)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2019-05-18 16:52:18 -07:00
Boyang Chen 9fa331b811 KAFKA-8225 & KIP-345 part-2: fencing static member instances with conflicting group.instance.id (#6650)
For static members join/rejoin, we encode the current timestamp in the new member.id. The format looks like group.instance.id-timestamp.

During consumer/broker interaction logic (Join, Sync, Heartbeat, Commit), we shall check the whether group.instance.id is known on group. If yes, we shall match the member.id stored on static membership map with the request member.id. If mismatching, this indicates a conflict consumer has used same group.instance.id, and it will receive a fatal exception to shut down.

Right now the only missing part is the system test. Will work on it offline while getting the major logic changes reviewed.

Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-05-18 07:28:36 -07:00
Vahid Hashemian 16ece15fb3 MINOR: Include StickyAssignor in system tests (#5223)
Reviewers: Jason Gustafson <jason@conflent.io>
2019-05-11 11:13:07 -07:00
Magesh Nandakumar 40d5c9fac9 KAFKA-8352 : Fix Connect System test failure 404 Not Found (#6713)
Corrects the system tests to check for either a 404 or a 409 error and sleeping until the Connect REST API becomes available. This corrects a previous change to how REST extensions are initialized (#6651), which added the ability of Connect throwing a 404 if the resources are not yet started. The integration tests were already looking for 409.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-05-10 18:20:39 -05:00
Boyang Chen 0f995ba6be KAFKA-7862 & KIP-345 part-one: Add static membership logic to JoinGroup protocol (#6177)
This is the first diff for the implementation of JoinGroup logic for static membership. The goal of this diff contains:

* Add group.instance.id to be unique identifier for consumer instances, provided by end user;
Modify group coordinator to accept JoinGroupRequest with/without static membership, refactor the logic for readability and code reusability.
* Add client side support for incorporating static membership changes, including new config for group.instance.id, apply stream thread client id by default, and new join group exception handling.
* Increase max session timeout to 30 min for more user flexibility if they are inclined to tolerate partial unavailability than burdening rebalance.
* Unit tests for each module changes, especially on the group coordinator logic. Crossing the possibilities like:
6.1 Dynamic/Static member
6.2 Known/Unknown member id
6.3 Group stable/unstable
6.4 Leader/Follower

The rest of the 345 change will be broken down to 4 separate diffs:

* Avoid kicking out members through rebalance.timeout, only do the kick out through session timeout.
* Changes around LeaveGroup logic, including version bumping, broker logic, client logic, etc.
* Admin client changes to add ability to batch remove static members
* Deprecate group.initial.rebalance.delay

Reviewers: Liquan Pei <liquanpei@gmail.com>, Stanislav Kozlovski <familyguyuser192@windowslive.com>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-04-26 11:44:38 -07:00
Boyang Chen 847957cbea KAFKA-8291 : System test fix (#6637)
As titled, this PR changed the default reset policy to latest accidentally for system tests, which in fact was earliest.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-04-25 14:16:34 -07:00
Ismael Juma 7d9e93ac6d
MINOR: Use https instead of http in links (#6477)
Verified that the https links work.

I didn't update the license header in this PR since that touches
so many files. Will file a separate one for that.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2019-04-22 11:58:25 -07:00
David Arthur 409fabc561 KAFKA-7747; Check for truncation after leader changes [KIP-320] (#6371)
After the client detects a leader change we need to check the offset of the current leader for truncation. These changes were part of KIP-320: https://cwiki.apache.org/confluence/display/KAFKA/KIP-320%3A+Allow+fetchers+to+detect+and+handle+log+truncation.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-04-21 16:24:18 -07:00
Guozhang Wang 4aa2cfe467
MINOR: Tighten up metadata upgrade test (#6531)
Reviewers: Bill Bejeck <bbejeck@gmail.com>
2019-04-05 12:50:42 -07:00
Stanislav Kozlovski 0d55f0f3ec KAFKA-8102: Add an interval-based Trogdor transaction generator (#6444)
This patch adds a TimeIntervalTransactionsGenerator class which enables the Trogdor ProduceBench worker to commit transactions based on a configurable millisecond time interval.

Also, we now handle 409 create task responses in the coordinator command-line client by printing a more informative message

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-03-25 09:58:11 -07:00
Brian Bushree 860e957999 MINOR: list-topics should not require topic param
`kafka.list_topics(...)` should not require a topic parameter

Author: Brian Bushree <bbushree@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6367 from brianbushree/list-topics-no-topic
2019-03-22 11:50:00 -07:00
Stanislav Kozlovski f20f3c1a97 MINOR: Update Trogdor ConnectionStressWorker status at the end of execution (#6445)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-03-15 09:53:21 -07:00
John Roesler 8e97540071 KAFKA-7944: Improve Suppress test coverage (#6382)
* add a normal windowed suppress with short windows and a short grace
period
* improve the smoke test so that it actually verifies the intended
conditions

See https://issues.apache.org/jira/browse/KAFKA-7944

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2019-03-12 09:53:29 -07:00
Rajini Sivaram 460b3a6259
KAFKA-8070: Increase consumer startup timeout in system tests (#6405)
For consumers using SSL, this timeout includes the time to create and copy keystores and truststores and sometime it takes longer than 10s to complete the security setup before starting the consumer process.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2019-03-08 16:57:58 +00:00
Guozhang Wang 057bb35f24 HOTFIX: add igore import to streams_upgrade_test 2019-03-01 11:20:41 -08:00
Guozhang Wang dfae20ecee MINOR: disable Streams system test for broker upgrade/downgrade (#6341)
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-02-28 00:20:44 -08:00
Bill Bejeck 071f62a356 MINOR: refactor topic check to make sure all topics exist by name vs doing a topic count (#6271)
Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-02-20 17:02:30 -08:00
Bill Bejeck be76560011
MINOR: Add all topics created check streams broker bounce test (trunk) (#6243)
The StreamsBrokerBounceTest.test_broker_type_bounce experienced what looked like a transient failure. After looking over this test and failure, it seems like it is vulnerable to timing error that streams will start before the kafka service creates all topics.

Reviewers:  Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>
2019-02-20 12:45:22 -05:00
Matthias J. Sax d2575f03a3
MINOR: Bump version to 2.3.0-SNAPSHOT (#6226)
* MINOR: Bump version to 2.3.0-SNAPSHOT

* Github comment
2019-02-11 14:46:49 -08:00
Colin Patrick McCabe 4be68c58da
KAFKA-7828: Add ExternalCommandWorker to Trogdor (#6219)
Allow the Trogdor agent to execute external commands. The agent communicates with the external commands via stdin, stdout, and stderr.

Based on a patch by Xi Yang <xi@confluent.io>

Reviewers: David Arthur <mumrah@gmail.com>
2019-02-06 16:42:02 -08:00
Konstantine Karantasis 83c435f3ba KAFKA-7834: Extend collected logs in system test services to include heap dumps
* Enable heap dumps on OOM with -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<file.bin> in the major services in system tests
* Collect the heap dump from the predefined location as part of the result logs for each service
* Change Connect service to delete the whole root directory instead of individual expected files
* Tested by running the full suite of system tests

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6158 from kkonstantine/KAFKA-7834
2019-02-04 16:46:03 -08:00
Konstantine Karantasis 0dbb064963 MINOR: Upgrade ducktape to 0.7.5 (#6197)
Reviewed-by: Colin P. McCabe <cmccabe@apache.org>
2019-01-25 11:14:19 -08:00
Colin Patrick McCabe a79d6dcdb6
KAFKA-7793: Improve the Trogdor command line. (#6133)
* Allow the Trogdor agent to be started in "exec mode", where it simply
runs a single task and exits after it is complete.

* For AgentClient and CoordinatorClient, allow the user to pass the path
to a file containing JSON, instead of specifying the JSON object in the
command-line text itself.  This means that we can get rid of the bash
scripts whose only function was to load task specs into a bash string
and run a Trogdor command.

* Print dates and times in a human-readable way, rather than as numbers
of milliseconds.

* When listing tasks or workers, output human-readable tables of
information.

* Allow the user to filter on task ID name, task ID pattern, or task
state.

* Support a --json flag to provide raw JSON output if desired.

Reviewed-by: David Arthur <mumrah@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2019-01-24 09:26:51 -08:00
Kan Li f8e8d62f56 MINOR: ducker-ak: add down -f, avoid using a terminal in ducker test
When using ./ducker-ak test on Jenkins, the script complains that there is no TTY.  To fix this, we should skip passing -t to docker exec.  We do not need a pseudo-TTY to run the tests.  Similarly, we should skip passing -i, since we do not need to keep stdin open.

The down command should have a force option, specified as -f or --force.

Reviewed-by: Colin P. McCabe <cmccabe@apache.org>
2019-01-23 13:39:47 -08:00
Chris Egerton 743607af5a KAFKA-5117: Stop resolving externalized configs in Connect REST API
[KIP-297](https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations#KIP-297:ExternalizingSecretsforConnectConfigurations-PublicInterfaces) introduced the `ConfigProvider` mechanism, which was primarily intended for externalizing secrets provided in connector configurations. However, when querying the Connect REST API for the configuration of a connector or its tasks, those secrets are still exposed. The changes here prevent the Connect REST API from ever exposing resolved configurations in order to address that. rhauch has given a more thorough writeup of the thinking behind this in [KAFKA-5117](https://issues.apache.org/jira/browse/KAFKA-5117)

Tested and verified manually. If these changes are approved unit tests can be added to prevent a regression.

Author: Chris Egerton <chrise@confluent.io>

Reviewers: Robert Yokota <rayokota@gmail.com>, Randall Hauch <rhauch@gmail.com, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6129 from C0urante/hide-provided-connect-configs
2019-01-23 11:00:23 -08:00
Xi Yang cc33511e9a MINOR: Support choosing different JVMs when running integration tests
+ Add a parameter to the ducktap-ak to control the OpenJDK base image.
+ Fix a few issues of using OpenJDK:11 as the base image.

*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*

*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*

Author: Xi Yang <xi@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6071 from yangxi/ducktape-jdk
2019-01-11 15:11:55 -08:00
Bill Bejeck 3746bf2c84 MINOR: Need to have same wait as verify timeout broker upgrade downgrade (#6127)
When I originally refactored the streams_upgrade_test#upgrade_downgrade_brokers test I removed the wait call which would wait for up 24 minutes for the SmokeTestDriver class to publish and verify all of its records.

Since most of the tests run in two minutes or less I set the monitor_log timeout to three minutes. However, the SmokeTestDriver#verify method allows up to six minutes to consume all records before verifying the monitor_log timeout needs to be greater than 6 minutes. I've set the timeout to 8 minutes.

Additionally, the steps needed to update the streams_upgrade_test should be documented as there are several components that need to get updated. So I've documented those steps here on the test as a giant comment.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-01-11 11:35:33 -08:00
Bill Bejeck b1b792d9a8 MINOR: Add 2.1 version metadata upgrade (#6111)
Updated the test_metadata_upgrade test. To enable using the 2.1 version I needed to add config change to the StreamsUpgradeTestJobRunnerService to ensure the ductape passes proper args when starting the StreamsUpgradeTest

For testing, I ran the test_metadata_upgrade test and all versions now pass http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/2019-01-09--001.1547049873--bbejeck--MINOR_add_2_1_version_metadata_upgrade--a450c68/report.html

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-01-09 15:19:00 -08:00
Bill Bejeck 515e680c71 MINOR: Put states in proper order, increase timeout for starting (#6105)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-01-08 13:48:38 -08:00
Jason Gustafson f9a22f42a8 KAFKA-7773; Add end to end system test relying on verifiable consumer (#6070)
This commit creates an EndToEndTest base class which relies on the verifiable consumer. This will ultimately replace ProduceConsumeValidateTest which depends on the console consumer. The advantage is that the verifiable consumer exposes more information to use for validation. It also allows for a nicer shutdown pattern. Rather than relying on the console consumer idle timeout, which requires a minimum wait time, we can halt consumption after we have reached the last acked offsets. This should be more reliable and faster. The downside is that the verifiable consumer only works with the new consumer, so we cannot yet convert the upgrade tests. This commit converts only the replication tests and a flaky security test to use EndToEndTest.
2019-01-08 14:14:51 +00:00
Bill Bejeck 404bdef08d MINOR: Remove sleep calls and ignore annotation from streams upgrade test (#6046)
The StreamsUpgradeTest::test_upgrade_downgrade_brokers used sleep calls in the test which led to flaky test performance and as a result, we placed an @ignore annotation on the test. This PR uses log events instead of the sleep calls hence we can now remove the @ignore setting.

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-01-06 23:03:54 -08:00
John Roesler ef9204dc58 MINOR: improve resilience of Streams test producers (#6028)
Some Streams system tests have failed during the setup phase
due to the producer having retries disabled and getting some
transient error from the broker.

This patch adds a retries parameter to the VerifiableProducer
(default unchanged), and sets retries to 10 for Streams tests.

It also sets acks equal to the number of brokers for Streams tests.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2019-01-04 13:44:15 -08:00
Bill Bejeck 639427a38f MINOR: Increase throughput for VerifiableProducer in test (#6060)
Previous PR #6043 reduced throughput for VerifiableProducer in base class, but the streams_standby_replica_test needs higher throughput for consumer to complete verification in 60 seconds

Update system test and kicked off branch builder with 25 repeats https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2201/

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-12-21 22:32:19 -08:00
Bill Bejeck 40ca7ddeed MINOR: Stabilization fixes broker down test trunk (#6043)
This PR addresses a few issues with this system test flakiness. This PR is a cherry-picked duplicate of #6041 but for the trunk branch, hence I won't repeat the inline comments here.

1. Need to grab the monitor before a given operation to observe logs for signal
2. Relied too much on a timely rebalance and only sent a handful of messages.
I've updated the test and ran it here https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2143/ parameterized for 15 repeats all passed.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-12-19 17:53:09 -08:00
Bill Bejeck da332f2241 MINOR:Start processor inside verify message (#6029)
This PR fixes a flaky system test.

I ran six runs of branch builder, and each run was parameterized to repeat the test 25 times for a total of 150 runs. All test runs passed.

https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2122/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2123/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2124/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2128/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2129/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2130/

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, John Roesler <vvcephei@users.noreply.github.com>
2018-12-13 20:30:04 -08:00
Stanislav Kozlovski 05dc36d548 MINOR: Fix failing ConsumeBenchTest:test_multiple_consumers_specified_group_partitions_should_raise (#6015)
This is the error message we're after:

"You may not specify an explicit partition assignment when using multiple consumers in the same group."

We apparently changed it midway through #5810 and forgot to update the test.

Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2018-12-08 09:27:59 -08:00
Bill Bejeck ab1fb3fdde MINOR: Adding system test for named repartition topics (#5913)
This is a system test for doing a rolling upgrade of a topology with a named repartition topic.

1. An initial Kafka Streams application is started on 3 nodes. The topology has one operation forcing a repartition and the repartition topic is explicitly named.
2. Each node is started and processing of data is validated
3. Then one node is stopped (full stop is verified)
4. A property is set signaling the node to add operations to the topology before the repartition node which forces a renumbering of all operators (except repartition node)
5. Restart the node and confirm processing records
6. Repeat the steps for the other 2 nodes completing the rolling upgrade

I ran two runs of the system test with 25 repeats in each run for a total of 50 test runs.
All test runs passed

Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-12-03 12:37:31 -08:00
Attila Sasvari e7ce0e7e0a KAFKA-4544: Add system tests for delegation token based authentication
This change adds some basic system tests for delegation token based authentication:
- basic delegation token creation
- producing with a delegation token
- consuming with a delegation token
- expiring a delegation token
- producing with an expired delegation token

New files:
- delegation_tokens.py: a wrapper around kafka-delegation-tokens.sh  - executed in container where a secure Broker is running (taking advantage of automatic cleanup)
- delegation_tokens_test.py: basic test to validate the lifecycle of a delegation token

Changes were made in the following file to extend their functionality:
- config_property was updated to be able to configure Kafka brokers with delegation token related settings
- jaas.conf template because a broker needs to support multiple login modules when delegation tokens are used
- consule-consumer and verifiable_producer to override KAFKA_OPTS (to specify custom jaas.conf) and the client properties (to authenticate with delegation token).

Author: Attila Sasvari <asasvari@apache.org>

Reviewers: Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Andras Katona <41361962+akatona84@users.noreply.github.com>, Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #5660 from asasvari/KAFKA-4544
2018-12-03 11:28:36 +05:30
Bill Bejeck dfd545485a MINOR: Add system test for optimization upgrades (#5912)
This is a new system test testing for optimizing an existing topology. This test takes the following steps

1. Start a Kafka Streams application that uses a selectKey then performs 3 groupByKey() operations and 1 join creating four repartition topics
2. Verify all instances start and process data
3. Stop all instances and verify stopped
4. For each stopped instance update the config for TOPOLOGY_OPTIMIZATION to all then restart the instance and verify the instance has started successfully also verifying Kafka Streams reduced the number of repartition topics from 4 to 1
5. Verify that each instance is processing data from the aggregation, reduce, and join operation
Stop all instances and verify the shut down is complete.
6. For testing I ran two passes of the system test with 25 repeats for a total of 50 test runs.

All test runs passed

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-11-27 13:07:34 -08:00
Stanislav Kozlovski 9368743b8f KAFKA-7597: Add transaction support to ProduceBenchWorker (#5885)
KAFKA-7597: Add configurable transaction support to ProduceBenchWorker.  In order to get support for serializing Optional<> types to JSON, add a new library: jackson-datatype-jdk8. Once Jackson 3 comes out, this library will not be needed.

Reviewers: Colin McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
2018-11-27 12:49:53 -08:00
John Roesler 14fbf16875 MINOR: increase system test kafka start timeout (#5934)
The Kafka Streams system tests fail with some regularity due to a timeout starting the broker.

The initial start is quite quick, but many of our tests involve stopping and restarting nodes with data already loaded, and also while processing is ongoing.

Under these conditions, it seems to be normal for the broker to take about 25 seconds to start, which makes the 30 second timeout pretty close for comfort.
I have seen many test failures in which the broker successfully started within a couple of seconds after the tests timed out and already initiated the failure/shut-down sequence.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-11-21 11:49:36 -08:00
Matthias J. Sax 190cbd9fe5
MINOR: fix failing Streams system tests (#5928)
Reviewers: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>
2018-11-19 18:44:45 -08:00
Cyrus Vafadari f52de5f0ad MINOR: Remove unused abstract function in test class (#5888)
The function `setup_producer_and_consumer` is unused in the system test
framework, which incorrectly suggests subclasses should implement
it. It is not required or even referenced by the framework, so
the requirement should be removed.

Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-11-13 09:43:29 -08:00
Stanislav Kozlovski 8259fda695 KAFKA-7514: Add threads to ConsumeBenchWorker (#5864)
Add threads with separate consumers to ConsumeBenchWorker.  Update the Trogdor test scripts and documentation with the new functionality.

Reviewers: Colin McCabe <cmccabe@apache.org>
2018-11-13 08:38:42 -08:00
Magesh Nandakumar 1c5aec6e9d MINOR: Modify Connect service's startup timeout to be passed via the init (#5882)
Currently, the startup timeout is hardcoded to be 60 seconds in Connect's test service. Modifying it to be passable via init. This can safely be backported as well.

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-11-06 13:41:19 -08:00
Randall Hauch f856082cb8 KAFKA-7559: Correct standalone system tests to use the correct external file (#5883)
This fixes the Connect standalone system tests. See branch builder: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2021/

This should be backported to the 2.0 branch, since that's when the tests were first
modified to use the external property file.

Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2018-11-05 22:39:12 -08:00
Gardner Vickers ea518e4d67 KAFKA-7561: Increase stop_timeout_sec to make ConsoleConsumerTest pass (#5853)
Looks like the increased delay happens when connecting to the docker container.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-11-03 10:06:17 -07:00
Stanislav Kozlovski d28c534819 KAFKA-7515: Trogdor - Add Consumer Group Benchmark Specification (#5810)
This ConsumeBenchWorker now supports using consumer groups.  The groups may be either used to store offsets, or as subscriptions.
2018-10-29 10:51:07 -07:00
Randall Hauch 8b1d705404 MINOR: Fix undefined variable in Connect test
Corrects an error in the system tests:
```
07:55:45 [ERROR:2018-10-23 07:55:45,738]: Failed to import kafkatest.tests.connect.connect_test, which may indicate a broken test that cannot be loaded: NameError: name 'EXTERNAL_CONFIGS_FILE' is not defined
```

The constant is defined in the [services/connect.py](https://github.com/apache/kafka/blob/trunk/tests/kafkatest/services/connect.py#L43) file in the `ConnectServiceBase` class, but the problem is in the [tests/connect/connect_test.py](https://github.com/apache/kafka/blob/trunk/tests/kafkatest/tests/connect/connect_test.py#L50) `ConnectStandaloneFileTest`, which does *not* extend the `ConnectServiceBase class`. Suggestions welcome to be able to reuse that variable without duplicating the literal (as in this PR).

System test run with this PR: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2004/

If approved, this should be merged as far back as the `2.0` branch.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5832 from rhauch/fix-connect-externals-tests
2018-10-24 13:16:34 -07:00
Guozhang Wang 2d77746a7b
MINOR: fixes on streams upgrade test (#5754)
1. In test_upgrade_downgrade_brokers, allow duplicates to happen.
2. In test_version_probing_upgrade, grep the generation numbers from brokers at the end, and check if they can ever be synchronized.

Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-10-13 22:39:24 -07:00
Colin Hicks b63cd6370f MINOR: Adjust test params pursuant to KAFKA-4514. (#5777)
PR #2267 Introduced support for Zstandard compression. The relevant test expects values for `num_nodes` and `num_producers` based on the (now-incremented) count of compression types.

Passed the affected, previously-failing test:
`ducker-ak test tests/kafkatest/tests/client/compression_test.py`

Reviewers: Jason Gustafson <jason@confluent.io>
2018-10-10 15:16:54 -07:00
Lee Dongjin 741cb761c5 KAFKA-4514; Add Codec for ZStandard Compression (#2267)
This patch adds support for zstandard compression to Kafka as documented in KIP-110: https://cwiki.apache.org/confluence/display/KAFKA/KIP-110%3A+Add+Codec+for+ZStandard+Compression. 

Reviewers: Ivan Babrou <ibobrik@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2018-10-09 17:13:33 -07:00
Max Zheng 50ec82940d MINOR: Switch to use AWS spot instances
Pricing for m3.xlarge: On-Demand is at $0.266. Reserved is at about $0.16 (40% discount). And Spot is at $0.0627 (76% discount relative to On-Demand, or 60% discount relative to Reserved). Insignificant fluctuation in the past 3 months.

Ran on branch builder and works as expected -- each worker is created using spot instances (https://jenkins.confluent.io/job/system-test-kafka-branch-builder/1982/console)

This can be safely backported to 0.10.2 (tested using https://jenkins.confluent.io/job/system-test-kafka-branch-builder/1983/)

Author: Max Zheng <maxzheng.os@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5707 from maxzheng/minor-switch@trunk
2018-10-05 10:21:25 -07:00
Dong Lin 2b37a041d3 MINOR: Bump version to 2.2.0-SNAPSHOT
Author: Dong Lin <lindong28@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #5744 from lindong28/bump-up-version-2.2.0
2018-10-04 17:30:03 -07:00
Randall Hauch 09f3205e44 MINOR: Increase timeout for starting JMX tool (#5735)
In some tests, the check monitoring the JMX tool log output doesn’t quite wait long enough before failing. Increasing the timeout from 10 to 20 seconds.
2018-10-03 08:56:44 -07:00
Bill Bejeck df00f1a738 MINOR: Remove ignored tests that hang, added new versions for EOS tests (#5666)
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
2018-09-19 15:05:39 -07:00
Rajini Sivaram 5e72f66ab5
MINOR: Increase timeout in log4j system test to avoid transient failures (#5658)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-09-17 19:12:46 +01:00
Bill Bejeck b74e7e407c MINOR: Enable ignored upgrade system tests - trunk (#5605)
Removed ignore annotations from the upgrade tests. This PR includes the following changes for updating the upgrade tests:

* Uploaded new versions 0.10.2.2, 0.11.0.3, 1.0.2, 1.1.1, and 2.0.0 (in the associated scala versions) to kafka-packages
* Update versions in version.py, Dockerfile, base.sh
* Added new versions to StreamsUpgradeTest.test_upgrade_downgrade_brokers including version 2.0.0
* Added new versions StreamsUpgradeTest.test_simple_upgrade_downgrade test excluding version 2.0.0
* Version 2.0.0 is excluded from the streams upgrade/downgrade test as StreamsConfig needs an update for the new version, requiring a KIP. Once the community votes the KIP in, a minor follow-up PR can be pushed to add the 2.0.0 version to the upgrade test.
* Fixed minor bug in kafka-run-class.sh for classpath in upgrade/downgrade tests across versions.
* Follow on PRs for 0.10.2x, 0.11.0x, 1.0.x, 1.1.x, and 2.0.x will be pushed soon with the same updates required for the specific version.

Reviewers: Eno Thereska <eno.thereska@gmail.com>, John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2018-09-13 13:46:47 -07:00
Anna Povzner 1427e84cc7 MINOR: Use LATEST_1_1 instead of V_1_1_0 in quota_test (#5636)
The goal is to only test against the latest bug fix release for each release branch.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-09-10 17:34:48 -07:00
Stanislav Kozlovski d31c9db4af MINOR: Use producer template in verifiable producer service (#5581)
Also remove an unused old consumer template.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-08-30 08:49:01 -07:00
Stanislav Kozlovski 7884258489 MINOR: Make JAAS configurable via template variables in system tests (#5554)
Currently, the only way in system tests to add a new variable to the `jaas.conf` template file is to directly edit the path the config is constructed by adding new keyword arguments.
This wasn't necessarily a big problem, since you'd only need edit the `security_config.py` file as JAAS settings should come from the security settings.

Now, with the addition of [KIP-342](https://cwiki.apache.org/confluence/display/KAFKA/KIP-342%3A+Add+support+for+Custom+SASL+extensions+in+OAuthBearer+authentication), the OAuthBearer JAAS config supports arbitrary values in the form of SASL extensions. This patch exposes a more convenient API to overrides these values in system tests.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-08-23 14:55:40 -07:00
Randall Hauch e577f6d366 KAFKA-7225; Corrected system tests by generating external properties file (#5489)
Fix system tests from earlier #5445 by moving to the `ConnectSystemBase` class the creation & cleanup of a file that can be used as externalized secrets in connector configs. 

Reviewers: Arjun Satish <arjun@confluent.io>, Robert Yokota <rayokota@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>
2018-08-23 14:22:09 -07:00
Robert Yokota 3511e904ef MINOR: Split at first occurrence of '=' in kafka.py props parsing (#5549)
This is a fix to #5226 to account for config properties that have an
equal char in the value.   Otherwise if there is one
equal char in the value the following error occurs:

dictionary update sequence element #XX has length 3; 2 is required

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Ismael Juma <ismael@juma.me.uk>
2018-08-21 23:42:56 -07:00
Manikumar Reddy O 914ffa9dbe KAFKA-7210: Add system test to verify log compaction (#5226)
* Updated TestLogCleaning tool to use Java consumer and rename as LogCompactionTester.
* Enabled the log cleaner in every system test.
* Removed configs from "kafka.properties" with default values and `socket.receive.buffer.bytes`
as the override did not seem necessary.
* Updated `kafka.py` logic to handle duplicates between `kafka.properties` and `server_prop_overrides`.
* Updated Gradle build so that classes from `kafka-clients` test jar can be used in
system tests.

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Ismael Juma <ismael@juma.me.uk>
2018-08-20 03:46:57 -07:00
Arjun Satish 28a1ae4183 MINOR: System test for error handling and writes to DeadLetterQueue
Added a system test which creates a file sink with json converter and attempts to feed it bad records. The bad records should land in the DLQ if it is enabled, and the task should be killed or bad records skipped based on test parameters.

Signed-off-by: Arjun Satish <arjunconfluent.io>

*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*

*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*

Author: Arjun Satish <arjun@confluent.io>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5456 from wicknicks/error-handling-sys-test
2018-08-07 14:44:01 -07:00
Robert Yokota 36a8fec0ab KAFKA-7225: Pretransform validated props
If a property requires validation, it should be pretransformed if it is a variable reference, in order to have a value that will properly pass the validation.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5445 from rayokota/KAFKA-7225-pretransform-validated-props
2018-08-07 13:18:16 -07:00
Arjun Satish cbe86bc463 MINOR: Fix minikdc cleanup in system tests (#5471)
The original way of stopping the minikdc process sometimes misfires because the process arg string is very long, and `ps` is not able to find the correct process. Using the `kill_java_processes` method is more reliable for finding and killing java processes.
2018-08-07 11:08:51 -07:00
Jimmy Casey b282bbe8be Fixed Spelling. (#5432)
Reviewers: Sriharsha Chintalapani <sriharsha@apache.org>
2018-08-06 13:25:18 -07:00
Arjun Satish 0b5fd99ecb MINOR: Add thread dumps if broker node cannot be stopped (#5373)
In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM.

Reviewers: Xavier Léauté <xl+github@xvrl.net>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2018-07-20 11:10:43 -07:00
Ted Yu 82f124ae30 KAFKA-5037: Fix infinite loop if all input topics are unknown at startup
1. At the beginning of assign, we first check that all the non-repartition source topics are included in the metadata. If not, we log an error at the leader and set an error in the Assignment userData bytes, indicating that leader cannot complete assignment and the error code would indicate the root cause of it.

2. Upon receiving the assignment, if the error is not NONE the streams will shutdown itself with a log entry re-stating the root cause interpreted from the error code.

Author: tedyu <yuzhihong@gmail.com>

Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>

Closes #5322 from tedyu/trunk
2018-07-19 15:22:53 -07:00
John Roesler e38e3a66ab MINOR: Fix standby streamTime (#5288)
#5253 broke standby restoration for windowed stores.

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-07-03 07:07:50 -07:00
Anna Povzner 98e546611a MINOR: Use kill_java_processes when killing ConsoleConsumer in system tests (#5297)
Use `kill_java_processes` to stop the console consumer service since it uses jcmd instead of grep to find pids, which is more reliable.
2018-06-29 09:19:22 -07:00
Jon Lee a3db23cb76 KAFKA-6944; Add system tests testing the new throttling behavior using older clients/brokers
Added two additional test cases to quota_test.py, which run between brokers and clients with different throttling behaviors. More specifically,
1. clients with new throttling behavior (i.e., post-KIP-219) and brokers with old throttling behavior (i.e., pre-KIP-219)
2. clients with old throttling behavior and brokers with new throttling behavior

Author: Jon Lee <jonlee@linkedin.com>
Author: Dong Lin <lindong28@gmail.com>

Reviewers: Dong Lin <lindong28@gmail.com>

Closes #5294 from jonlee2/kafka-6944
2018-06-27 16:49:12 -07:00
John Roesler 74502710b3 MINOR: report streams benchmarks separately (#5275)
Specify each benchmark as a separate test so that we can see the results reported independently.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-06-22 17:10:06 -07:00
John Roesler 80553197e9 MINOR: temporarily ignore system test until I can fix it (#5283) 2018-06-22 16:55:39 -07:00
Andy Coates d2b84bd4b0 Kafka_7064 - bug introduced when switching config commands to ConfigResource (#5245)
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Jun Rao <junrao@gmail.com>
2018-06-19 13:47:24 -07:00
Ismael Juma cc4dce94af
KAFKA-2983: Remove Scala consumers and related code (#5230)
- Removed Scala consumers (`SimpleConsumer` and `ZooKeeperConsumerConnector`)
and their tests.
- Removed Scala request/response/message classes.
- Removed any mention of new consumer or new producer in the code
with the exception of MirrorMaker where the new.consumer option was
never deprecated so we have to keep it for now. The non-code
documentation has not been updated either, that will be done
separately.
- Removed a number of tools that only made sense in the context
of the Scala consumers (see upgrade notes).
- Updated some tools that worked with both Scala and Java consumers
so that they only support the latter (see upgrade notes).
- Removed `BaseConsumer` and related classes apart from `BaseRecord`
which is used in `MirrorMakerMessageHandler`. The latter is a pluggable
interface so effectively public API.
- Removed `ZkUtils` methods that were only used by the old consumers.
- Removed `ZkUtils.registerBroker` and `ZKCheckedEphemeral` since
the broker now uses the methods in `KafkaZkClient` and no-one else
should be using that method.
- Updated system tests so that they don't use the Scala consumers except
for multi-version tests.
- Updated LogDirFailureTest so that the consumer offsets topic would
continue to be available after all the failures. This was necessary for it
to work with the Java consumer.
- Some multi-version system tests had not been updated to include
recently released Kafka versions, fixed it.
- Updated findBugs and checkstyle configs not to refer to deleted
classes and packages.

Reviewers: Dong Lin <lindong28@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2018-06-19 07:32:54 -07:00
Magesh Nandakumar fc81acbfd7 KAFKA-7067; Include new connector configs in system test assertion (#5242)
The new connector configs added in  KIP-297 and KIP-298 need to be updated in the connect_rest_test.py so that the expected results match the actual.
2018-06-18 15:37:26 -07:00
Randall Hauch 7a1f555676 KAFKA-7009: Suppress the Reflections log warning messages in system tests
This could be backported to older branches to reduce the extra log warning messages there, too.

Running Connect system tests in this branch builder job: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/1773/

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5151 from rhauch/kafka-7009
2018-06-12 22:56:49 -07:00
Guozhang Wang a20c8e8760
MINOR: Relax Unsupported Version check on BrokerCompatibilityTest (#5170)
In BrokerCompatibilityTest.java, when older versioned broker is used (0.10.1, 0.10.2), LIST_OFFSET is not supported as well. Hence in the verification phase, there is a possibility that consumer hit the UnsupportedVersionException earlier than Streams actually hits it:

org.apache.kafka.common.errors.UnsupportedVersionException: The broker does not support LIST_OFFSETS with version in range [2,3]. The supported range is [0,1].

While the test is waiting for

org.apache.kafka.common.errors.UnsupportedVersionException: Cannot create a v0 FindCoordinator request because we require features supported only in 1 or later.

Both are valid errors to expect (the former is from consumer while the latter is from producer of the streams app).

Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-06-11 15:45:29 -07:00
Matthias J. Sax c140603fa9 MINOR: need to update system test version after version bump (#5156) 2018-06-06 23:29:42 +01:00
Rajini Sivaram a1ca07d316
MINOR: Bump version to 2.1.0-SNAPSHOT (#5153) 2018-06-06 17:24:53 +01:00
Paolo Patierno be8808dd4b KAFKA-5588: Remove deprecated --new-consumer tools option (#5097)
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2018-06-05 20:28:03 -07:00
Matthias J. Sax d166485be1
KAFKA-6054: Add 'version probing' to Kafka Streams rebalance (#4636)
implements KIP-268

Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-05-30 22:39:42 -07:00
Chris Egerton a64ab91238 KAFKA-5540: Deprecate internal converter configs (KIP-174)
Implementation of [KIP-174](https://cwiki.apache.org/confluence/display/KAFKA/KIP-174+-+Deprecate+and+remove+internal+converter+configs+in+WorkerConfig)

Configuration properties 'internal.key.converter' and 'internal.value.converter'
are deprecated, and default to org.apache.kafka.connect.json.JsonConverter.

Warnings are logged if values are specified for either, or if properties that
appear to configure instances of internal converters (i.e., ones prefixed with
either 'internal.key.converter.' or 'internal.value.converter.') are given.

The property 'schemas.enable' is also defaulted to false for internal
JsonConverter instances (both for keys and values) if it isn't specified.

Documentation and code have also been updated with deprecation notices and
annotations, respectively.

Unit tests have been updated in `PluginsTest` to account for the new defaults for `schemas.enable` for internal key/value converters, and to ensure that (for the time being), internal key/value converters are still configurable despite being deprecated.

Author: Chris Egerton <chrise@confluent.io>
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4693 from C0urante/kafka-5540
2018-05-29 16:22:47 -07:00
Ismael Juma 7132a85fc3 KAFKA-6921; Remove old Scala producer and related code
* Removed Scala producers, request classes, kafka.tools.ProducerPerformance, encoders,
tests.
* Updated ConsoleProducer to remove Scala producer support (removed `BaseProducer`
and several options that are not used by the Java producer).
* Updated a few Scala consumer tests to use the new producer (including a minor
refactor of `produceMessages` methods in `TestUtils`).
* Updated `ClientUtils.fetchTopicMetadata` to use `SimpleConsumer` instead of
`SyncProducer`.
* Removed `TestKafkaAppender` as it looks useless and it defined an `Encoder`.
* Minor import clean-ups

No new tests added since behaviour should remain the same after these changes.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Dong Lin <lindong28@gmail.com>

Closes #5045 from ijuma/kafka-6921-remove-old-producer
2018-05-24 17:32:49 -07:00
Guozhang Wang 70a506b983
MINOR: Ignore test_broker_type_bounce_at_start system test (#5055)
test_broker_type_bounce_at_start tries to validate that when the controller is down, the streams client will always fail trying to create the topic; with the current behavior of admin client it is actually not always true: the actual behavior depends on the admin client internals as well as when the controller becomes unavailable during the leader assign partitions phase. I'd suggest at least ignore this test for now until the admin client has more stable (personally I'd even suggest removing this test as its coverage benefits is smaller than its introduced issues to me).

Also adding a few more log4j entries as a result of investigating this issue.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2018-05-21 17:28:40 -07:00
Guozhang Wang cbce95d9a5
MINOR: Reduce required occurrance from 100 to 10 (#5048)
Due to #4644 the consumer connector logs will be much more clean with fewer "broker may not be available" entries. We need to reduce the required frequency from 100 to a smaller number.

I've thought about reducing to just 1, but it may still be transient (i.e. even if broker is starting up you may see a few entries) so I reduced it to 10.

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-05-21 11:50:19 -07:00
Colin Patrick McCabe aa3c38f595 KAFKA-6785: Add Trogdor documentation (#4862) 2018-04-29 15:57:25 +01:00
Colin Patrick McCabe 8577632b3a MINOR: Fix Trogdor tests, partition assignments (#4892) 2018-04-29 15:54:38 +01:00
Bill Bejeck c6fd3d488e MINOR: update VerifiableProducer to send keys if configured and removed StreamsRepeatingKeyProducerService (#4841)
This PR does the following:

* Remove the StreamsRepeatingIntegerKeyProducerService and the associated Java class
* Add a parameter to VerifiableProducer.java to enable sending keys when specified
* Update the corresponding Python file verifiable_producer.py to support the new parameter.

Reviewers: Matthias J Sax <matthias@confluentio>, Guozhang Wang <wangguoz@gmail.com>
2018-04-27 22:12:57 -07:00
Guozhang Wang b2e0812f69
HOTFIX: rename run_test to execute in streams simple benchmark (#4941) 2018-04-27 13:40:24 -07:00
Max Zheng e128f7f3bb MINOR: Pin pip to 9.0.3 as 10 is not compatible with with system pip (#4937)
If not pinned, the following error will happen:
Traceback (most recent call last):
  File "/usr/bin/pip", line 9, in <module>
    from pip import main
ImportError: cannot import name main

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-04-26 21:01:27 -07:00
Bill Bejeck e7f019690a MINOR: Fixes for streams system tests (#4935)
This PR fixes some regressions introduced into streams system tests and sets the upgrade tests to ignore until PR #4636 is merged as it has the fixes for the upgrade tests.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-04-26 10:04:59 -07:00
Ismael Juma c853ef75a1 MINOR: Bump version to 2.0.0-SNAPSHOT (#4804) 2018-04-25 11:19:31 +01:00
Guozhang Wang 1f523d9d72
MINOR: add window store range query in simple benchmark (#4894)
There are a couple minor additions in this PR:

1. Add a new test for window store, to range query upon receiving each record.
2. In the non-windowed state store case, add a get call before the put call.
3. Enable caching by default to be consistent with other Join / Aggregate cases, where caching is enabled by default.

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-04-19 16:14:32 -07:00
Matthias J. Sax cae42215b7
KAFKA-6054: Update Kafka Streams metadata to version 3 (#4880)
- adds Streams upgrade tests for 1.1 release
 - introduces metadata version 3

Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-04-18 09:38:27 +02:00
Guozhang Wang 0dc7f0e66f
KAFKA-6611, PART II: Improve Streams SimpleBenchmark (#4854)
SimpleBenchmark:

1.a Do not rely on manual num.records / bytes collection on atomic integers.
1.b Rely on config files for num.threads, bootstrap.servers, etc.
1.c Add parameters for key skewness and value size.
1.d Refactor the tests for loading phase, adding tumbling-windowed count.
1.e For consumer / consumeproduce, collect metrics on consumer instead.
1.f Force stop the test after 3 minutes, this is based on empirical numbers of 10M records.

Other tests: use config for kafka bootstrap servers.

streams_simple_benchmark.py: only use scale 1 for system test, remove yahoo from benchmark tests.

Note that the JMX based metrics is more accurate than the manually collected metrics. 

Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-04-15 10:15:31 -07:00
Matthias J. Sax 0c0d8363e5
KAFKA-6054: Fix upgrade path from Kafka Streams v0.10.0 (#4779)
Reviewers: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Damian Guy <damian@confluent.io>
2018-04-06 17:00:52 -07:00
Bill Bejeck 29838c1042 HOTFIX: ignoring tests using old versions of Streams until KIP-268 is merged (#4766)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-03-28 17:41:08 -07:00
Bill Bejeck f4a39f9b83 MINOR: Fix flaky standby task test (#4767)
The standby-task test failed due to standby task distribution not be exactly equal. I think this will be the case from time to time, so I've updated test to make sure the standby task assignment count is not zero.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-26 16:22:20 -07:00
Guozhang Wang f2fbfaaccc
KAFKA-6611: PART I, Use JMXTool in SimpleBenchmark (#4650)
1. Use JmxMixin for SimpleBenchmark (will remove the self reporting in #4744), only when loading phase is false (i.e. we are in fact starting the streams app).

2. Reported the full jmx reported metrics in log files, and in the returned data only return the max values: this is because we want to skip the warming up and cooling down periods that will have lower rate numbers, while max represents the actual rate at full speed.

3. Incorporates two other improves to JMXTool: #1241 and #2950

Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Rohan Desai <desai.p.rohan@gmail.com>
2018-03-22 16:46:56 -07:00
Bill Bejeck 286216b56e MINOR: Rolling bounce upgrade fixed broker system test (#4690)
Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-22 16:02:16 -07:00
Guozhang Wang 0f364cd53a
MINOR: Pass a streams config to replace the single state dir (#4714)
This is a general change and is re-requisite to allow streams benchmark test with different streams tests. For the streams benchmark itself I will have a separate PR for switching configs. Details:

1. Create a "streams.properties" file under PERSISTENT_ROOT before all the streams test. For now it will only contain a single config of state.dir pointing to PERSISTENT_ROOT.

2. For all the system test related code, replace the main function parameter of state.dir with propsFilename, then inside the function load the props from the file and apply overrides if necessary.

3. Minor fixes.

Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-03-19 14:17:00 -07:00
Ewen Cheslack-Postava f264bfa296 KAFKA-6676: Ensure Kafka chroot exists in system tests and use chroot on one test with security parameterizations (#4729)
Ensures Kafka chroot exists in ZK when starting KafkaService so commands that use ZK and are executed before the first Kafka broker starts do not fail due to the missing chroot.

Also uses chroot with one test that also has security parameterizations so Kafka's test suite exercises these combinations. Previously no tests were exercising chroots.

Changes were validated using sanity_checks which include the chroot-ed test as well as some non-chroot-ed tests.
2018-03-19 10:37:02 +00:00
Matthias J. Sax 7fe06a8666
MINOR: fix flaky Streams EOS system test (#4728)
Reviewer: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-03-17 18:02:20 -07:00
Chris Egerton b6ae51b108 Allow users to specify 'if-not-exists' when creating topics while testing (#4715)
Author: Chris Egerton <chrise@confluent.io>
2018-03-16 15:30:38 -07:00
John Roesler 7006d0f58b MINOR: Streams system tests fixes/updates (#4689)
Some changes required to get the Streams system tests working via Docker

To test:

TC_PATHS="tests/kafkatest/tests/streams" bash tests/docker/run_tests.sh

That command will take about 3.5 hours, and should pass. Note there are a couple of ignored tests.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bill@confluent.io>
2018-03-15 14:42:43 -07:00
Guozhang Wang e5d6c9a79a
MINOR: Do not start processor for bounce-at-start (#4639)
Only start it after the broker has been shutdown.
2018-03-06 11:19:38 -08:00
Bill Bejeck 8a7d7e7955 MINOR: Add System test for standby task-rebalancing (#4554)
Author: Bill Bejeck <bill@confluent.io>

Reviewers: Damian Guy <damian@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-05 11:06:32 -08:00
Matthias J. Sax 0cd83e997d
MINOR: wait for broker startup for system tests (#4363)
ensure that brokers are registered at ZK before start() returns

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Damian Guy <damian@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-02-23 16:13:57 -08:00
Randall Hauch fc19c3e6f2 KAFKA-6577: Fix Connect system tests and add debug messages
**NOTE: This should be backported to the `1.1` branch, and is currently a blocker for 1.1.**

The `connect_test.py::ConnectStandaloneFileTest.test_file_source_and_sink` system test is failing with the SASL configuration without a sufficient explanation. During the test, the Connect worker fails to start, but the Connect log contains no useful information. There are actual several things compounding to cause the failure and make it difficult to understand the problem.

First, the `tests/kafkatest/tests/connect/templates/connect_standalone.properties` is only adding in the broker's security configuration with the `producer.` and `consumer.` prefixes, but is not adding them with no prefix. The worker uses the AdminClient to connect to the broker to get the Kafka cluster ID and to manage the three internal topics, and the AdminClient is configured via top-level properties. Because the SASL test requires the clients all connect using SASL, the lack of broker security configs means the AdminClient was attempting and failing to connect to the broker. This is corrected by adding the broker's security configuration to the Connect worker configuration file at the top-level. (This was already being done in the `connect_distributed.properties` file.)

Second, the default `request.timeout.ms` for the AdminClient (and the other clients) is 120 seconds, so the AdminClient was retrying for 120 seconds before it would give up and thrown an error. However, the test was only waiting for 60 seconds before determining that the service failed to start. This can be corrected by setting `request.timeout.ms=10000` in the Connect distributed and standalone worker configurations.

Third, the Connect workers were recently changed to lookup the Kafka cluster ID before it started the herder. This is unlike the older uses of the AdminClient to find and manage the internal topics, where failure to connect was not necessarily logged correctly but nevertheless still skipped over, relying upon broker auto-topic creation to create the internal topics. (This may be why the test did not fail prior to the recent change to always require a successful AdminClient connection.) Although the worker never got this far in its startup process, the fact that we missed such an error since the prior releases means that failure to connect with the AdminClient was not being properly reported.

The `ConnectStandaloneFileTest.test_file_source_and_sink` system tests were run locally prior to this fix, and they failed as with the nightlies. Once these fixes were made, the locally run system tests passed.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <me@ewencp.org>

Closes #4610 from rhauch/kafka-6577-trunk
2018-02-22 09:39:59 +00:00
Matthias J. Sax 1715e6eac0
MINOR: Fix Streams-Broker-Compatibility system test (#4594)
fixes error message handling for test consumer client and KafkaStreams instance
updates expected error message
fixes race condition in system test code and avoids starting Streams processor twice

Author: Matthias J. Sax <matthias@confluent.io.>

Reviewer: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-02-21 15:53:46 -08:00
Ewen Cheslack-Postava da379c95e4 MINOR: Cancel port forwarding for HttpMetricsCollector during cleanup
Currently port forwarding is setup for HttpMetricsCollector when the Service's start_node method is called, but not canceled during stop. This hasn't presented a problem so far because we don't have tests that use this *and* restart the service. However, if a test/service does that, it will throw an exception since the port is already bound.

This just does the cleanup when stopping so a subsequent attempt to start again will succeed.

https://jenkins.confluent.io/job/system-test-kafka-branch-builder/1320 is a test run for a Test that uses ProducerPerformanceService, which in turn uses HttpMetricsCollector to validate the change.

Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Apurva Mehta <apurva@confluent.io>

Closes #4604 from ewencp/cleanup-reverse-port-forward
2018-02-21 10:52:27 -08:00
Damian Guy 57059d4022 MINOR: Fix streams broker compatibility test.
Change the string in the test condition to the one that is logged

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #4599 from dguy/broker-compatibility
2018-02-20 17:46:05 +00:00
Konstantine Karantasis f10c0d3863 MINOR: Fix file source task configs in system tests.
Another fall-through of `headers.converter` and `batch.size` properties. Here in `FileStreamSourceConnector` tests

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Damian Guy <damian.guy@gmail.com>

Closes #4590 from kkonstantine/MINOR-Fix-file-source-task-config-in-system-tests
2018-02-20 10:45:50 +00:00
Matthias J. Sax 0b3b6049f0
MINOR: Fix Streams EOS system tests (#4572)
Avoid loosing log/stdout/stderr files on restart
Reenables tests

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-02-16 13:13:13 -08:00
Attila Sasvari 4837d44673 MINOR: Fix typos in kafka.py (#4581) 2018-02-16 08:30:14 -08:00
Guozhang Wang 962bc638f9
MINOR: Add a new system test for resilience (#4560)
* Rolling kill-restart Streams instances with brokers unavailable temporarily, and validate that the streams can still complete the restart process

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-02-12 18:28:18 -08:00
Bill Bejeck 67803384d9 MINOR: adding system tests for how streams functions with broker faiures (#4513)
System test for two cases:

* Starting a multi-node streams application with the broker down initially, broker starts and confirm rebalance completes and streams application still able to process records.

* Multi-node streams app running, broker goes down, stop stream instance(s) confirm after broker comes back remaining streams instance(s) still function.

Reviewers: Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-02-07 17:21:35 -08:00
Damian Guy ca01711c0e
Bump trunk versions to 1.2-SNAPSHOT (#4505) 2018-02-01 11:35:43 +00:00
Konstantine Karantasis 83cc138e0c MINOR: Add async and different sync startup modes in connect service test class
Allow Connect Service in system tests to start asynchronously.

Specifically, allow for three startup conditions:
1. No condition - start async and return immediately.
2. Semi-async - start immediately after plugins have been discovered successfully.
3. Sync - start returns after the worker has completed startup. This is the current mode, but its condition is improved by checking that the port of Connect's REST interface is open, rather than that a log line has appeared in the logs.

An associated system test run has been started here:
https://jenkins.confluent.io/job/system-test-confluent-platform-branch-builder/586/

ewencp rhauch, I'd appreciate your review.

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4423 from kkonstantine/MINOR-Add-async-and-different-sync-startup-modes-in-ConnectService-test-class
2018-01-22 15:00:31 -08:00
Matthias J. Sax 2cefe3f0d0 MINOR: Temporarily disable flaky Streams EOS system tests (#4355)
Reviewers: Bill Bejeck <bill@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2017-12-29 12:44:08 +00:00
Guozhang Wang 7d6f6f7320 MINOR: Fix race condition in Streams EOS system test
We should start the process only within the `with` block, otherwise the bytes parameter would cause a race condition that result in false alarms of system test failures.

Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Ewen Cheslack-Postava <me@ewencp.org>

Closes #4348 from guozhangwang/KMinor-fix-eos-test
2017-12-20 18:44:36 -08:00
Colin P. Mccabe 760d86a970 KAFKA-5849; Add process stop, round trip workload, partitioned test
* Implement process stop faults via SIGSTOP / SIGCONT

* Implement RoundTripWorkload, which both sends messages, and confirms that they are received at least once.

* Allow Trogdor tasks to block until other Trogdor tasks are complete.

* Add CreateTopicsWorker, which can be a building block for a lot of tests.

* Simplify how TaskSpec subclasses in ducktape serialize themselves to JSON.

* Implement some fault injection tests in round_trip_workload_test.py

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #4323 from cmccabe/KAFKA-5849
2017-12-20 21:35:33 +00:00
Bill Bejeck f3b9afe622 MINOR: Broker down for significant amt of time system test
System test where a broker is offline more than the configured timeouts.  In this case:
- Max poll interval set to 45 secs
- Retries set to 2
- Request timeout set to 15 seconds
- Max block ms set to 30 seconds

The broker was taken off-line for 70 seconds or more than double request timeout * num retries

[passing system test results](http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/2017-12-11--001.1513034559--bbejeck--KSTREAMS_1179_broker_down_for_significant_amt_of_time--6ab4802/report.html)

Author: Bill Bejeck <bill@confluent.io>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #4313 from bbejeck/KSTREAMS_1179_broker_down_for_significant_amt_of_time
2017-12-19 15:37:21 -08:00
Rajini Sivaram e5741b90cd MINOR: Increase number of messages in replica verification tool test
Increase the number of messages produced to make the test more reliable. The test failed in a recent build and also fails intermittently when run locally. Since the producer uses acks=0 and the test stops as soon as a lag is observed, the change shouldn't have a big impact on the time taken to run when lag is observed sooner.

Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #4312 from rajinisivaram/MINOR-replicaverification-test
2017-12-11 11:13:57 -08:00
Matthias J. Sax 234ec8a8af KAFKA-4857: Replace StreamsKafkaClient with AdminClient in Kafka Streams
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Bill Bejeck <bbejeck@gmail.com>, Guozhang Wang <wangguoz@gmail.com>

Closes #4242 from mjsax/kafka-4857-admit-client
2017-12-07 16:16:54 -08:00
Bill Bejeck 9204197abf MINOR: Fix broker compatibility tests
Updated the System test `stream_broker_compatibility_test.py` to address system test failures as we have removed explicit broker version checking

- Ignore the `0.8.2.2` and `0.9.0.0` tests because the `NetworkClient` only logs `UnsupportedVersionException`s that occur and will continue to retry connecting.  Once issue https://issues.apache.org/jira/browse/KAFKA-6297 is addressed, we may re-enable these tests.
- Updated existing tests expected error messages
- Updated Streams code in test for to make sure we fail fast for incompatible brokers

Author: Bill Bejeck <bill@confluent.io>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #4286 from bbejeck/MINOR_fix_broker_compatibility_tests
2017-12-03 09:05:18 -08:00
Mikkin 18e34482e6 KAFKA-6284: Fixed system test for Connect REST API
`topics.regex` was added in KAFKA-3073. This change fixes the test that invokes `/validate` to ensure that all the configdefs are returned as expected.

Author: Mikkin <mikkin@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4279 from mikkin/KAFKA-6284
2017-11-30 13:51:19 -08:00
Colin P. Mccabe 58877a0dea KAFKA-6255; Add ProduceBench to Trogdor
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #4245 from cmccabe/KAFKA-6255
2017-11-28 22:09:55 +00:00
Colin P. Mccabe a133e69b45 KAFKA-6247; Install Kibosh on Vagrant and fix release downloads in Docker
Fix an omission where Kibosh was not getting installed on Vagrant
instances running in AWS.

Fix an issue where the Dockerfile was unable to download old Apache
Kafka releases.  See the discussion on KAFKA-6233.

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #4240 from cmccabe/KAFKA-6247
2017-11-21 14:39:14 +00:00
Colin P. Mccabe d9cbc6b1a2 KAFKA-5811; Add Kibosh integration for Trogdor and Ducktape
For ducktape: add Kibosh to the testing Dockerfile.
Create files_unreadable_fault_spec.py.

For trogdor: create FilesUnreadableFaultSpec.java.
Add a unit test of using the Kibosh service.

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #4195 from cmccabe/KAFKA-5811
2017-11-16 17:59:24 +00:00
Ewen Cheslack-Postava 718dda1144 MINOR: Add HttpMetricsReporter for system tests
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #4072 from ewencp/http-metrics
2017-11-09 09:42:46 -08:00
Adem Efe Gencer 86062e9a78 KAFKA-6157; Fix repeated words words in JavaDoc and comments.
Author: Adem Efe Gencer <agencer@linkedin.com>

Reviewers: Jiangjie Qin <becket.qin@gmail.com>

Closes #4170 from efeg/bug/typoFix
2017-11-05 18:00:43 -08:00
Colin P. Mccabe 4fac83ba1f KAFKA-6060; Add workload generation capabilities to Trogdor
Previously, Trogdor only handled "Faults."  Now, Trogdor can handle
"Tasks" which may be either faults, or workloads to execute in the
background.

The Agent and Coordinator have been refactored from a
mutexes-and-condition-variables paradigm into a message passing
paradigm.  No locks are necessary, because only one thread can access
the task state or worker state.  This makes them a lot easier to reason
about.

The MockTime class can now handle mocking deferred message passing
(adding a message to an ExecutorService with a delay).  I added a
MockTimeTest.

MiniTrogdorCluster now starts up Agent and Coordinator classes in
paralle in order to minimize junit test time.

RPC messages now inherit from a common Message.java class.  This class
handles implementing serialization, equals, hashCode, etc.

Remove FaultSet, since it is no longer necessary.

Previously, if CoordinatorClient or AgentClient hit a networking
problem, they would throw an exception.  They now retry several times
before giving up.  Additionally, the REST RPCs to the Coordinator and
Agent have been changed to be idempotent.  If a response is lost, and
the request is resent, no harm will be done.

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #4073 from cmccabe/KAFKA-6060
2017-11-03 09:37:29 +00:00
Matthias J. Sax c7ab3efcbe MINOR: Code cleanup and JavaDoc improvements for clients and Streams
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Bill Bejeck <bill@confluent.io>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #4128 from mjsax/minor-cleanup

minor fix
2017-10-30 13:13:19 -07:00
Xavier Léauté f7f8e11213 MINOR: reset state in cleanup, fixes jmx mixin flakiness
ewencp ijuma

Author: Xavier Léauté <xl+github@xvrl.net>
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4123 from xvrl/fix-jmx-flakiness

(cherry picked from commit 91eb178e95)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2017-10-25 13:34:36 -07:00
Colin P. Mccabe a3c4ab2427 KAFKA-6070; add ipaddress and enum34 dependencies to docker image
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Alex Ayars <alex.ayars@confluent.io>

Closes #4084 from cmccabe/KAFKA-6070
2017-10-23 16:38:39 +01:00
Magnus Edenhill 60c36b0984 MINOR: Fix var typo in verifiable_consumer assertion
Author: Magnus Edenhill <magnus@edenhill.se>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #4098 from edenhill/verfcons_var_fix
2017-10-19 08:42:12 -07:00
Apurva Mehta 90b5ce3f04 KAFKA-6016; Make the reassign partitions system test use the idempotent producer
With these changes, we are ensuring that the partitions being reassigned are from non-zero offsets. We also ensure that every message in the log has producerId and sequence number.

This means that it successfully reproduces https://issues.apache.org/jira/browse/KAFKA-6003.

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #4029 from apurvam/KAFKA-6016-add-idempotent-producer-to-reassign-partitions
2017-10-10 10:32:17 -07:00
Matthias J. Sax 51063441d3 KAFKA-5362; Follow up to Streams EOS system test
- improve tests to get rid of calls to `sleep` in Python
 - fixed some flaky test conditions
 - improve debugging

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #3542 from mjsax/failing-eos-system-tests
2017-10-06 17:48:34 -07:00
Guozhang Wang 6bcbd17d34 Bump up version to 1.1.0-SNAPSHOT 2017-10-04 15:36:19 -07:00
Apurva Mehta dd6347a5df KAFKA-5888; System test to check ordering of messages with transactions and max.in.flight > 1
To check ordering, we augment the existing transactions test to read and write from topics with one partition. Since we are writing monotonically increasing numbers, the topics should always be sorted, making it very easy to check for out of order messages.

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3969 from apurvam/KAFKA-5888-system-test-which-check-ordering
2017-09-28 13:03:07 -07:00
Randall Hauch afaaea8093 KAFKA-5954; Correct Connect REST API system test
Author: Randall Hauch <rhauch@gmail.com>

Reviewers: tedyu <yuzhihong@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #3934 from rhauch/kafka-5954
2017-09-21 16:46:10 +01:00
Ismael Juma 52d7b6763b MINOR: Fix replica_verification_tool.py to handle slight change in output format
The string representation of TopicPartition was changed to be
{topic}-{partitition} consistently in the following commit:

f6f56a645b

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #3890 from ijuma/fix-replica-verification-test
2017-09-18 15:49:44 +01:00
Ismael Juma 439050816b MINOR: Tweak detection of kafka server start-up in system tests
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3834 from ijuma/tweak-system-test-regex-for-detecting-server-start-up
2017-09-12 05:41:46 +01:00
Xavier Léauté 6e40455862 KAFKA-5742; Fix incorrect method name follow-up
Author: Xavier Léauté <xl+github@xvrl.net>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3818 from xvrl/fix-startswith
2017-09-08 19:05:01 +01:00
Paolo Patierno 6546f9af6d MINOR: Improve documentation for running docker system tests
Added some tips for running a single test file, test class and/or test method on the documentation landing page about tests

Author: Paolo Patierno <ppatierno@live.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3577 from ppatierno/minor-tests-doc
2017-09-08 11:21:28 +01:00
Ismael Juma 07a428e0c8 MINOR: Always specify the keystore type in system tests
Also throw an exception if a null keystore type is seen
in `SecurityStore`. This should never happen.

The default keystore type has changed in Java 9 (
http://openjdk.java.net/jeps/229), so we need to
be explicit to have consistent behaviour across
Java versions.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3808 from ijuma/set-jks-explicitly-in-system-tests
2017-09-08 02:29:03 +01:00
Colin P. Mccabe 4065ffb3e1 KAFKA-5777; Add ducktape integration for Trogdor
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3726 from cmccabe/KAFKA-5777
2017-09-07 13:23:03 +01:00
Randall Hauch 75070bdb5d MINOR: Increase timeout of Zookeeper service in system tests
The previous timeout was 10 seconds, but system test failures have occurred when Zookeeper has started after about 11 seconds. Increasing the timeout to 30 seconds, since most of the time this extra time will not be required, and when it is it will prevent a failed system test.

In addition to merging to `trunk`, please backport to the `0.11.x` and `0.10.2.x` branches.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3774 from rhauch/MINOR-Increase-timeout-of-zookeeper-service-in-system-tests
2017-08-31 14:53:44 -07:00
Colin P. Mccabe 949577ca77 KAFKA-5768; Upgrade to ducktape 0.7.1
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3721 from cmccabe/KAFKA-5768
2017-08-29 16:41:19 -07:00
Colin P. Mccabe 14f6ecd915 MINOR: KafkaService should print node hostname on failure
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3715 from cmccabe/kafka_service_print_node_hostname_on_failure
2017-08-25 07:44:42 +01:00
Damian Guy 99eebc8404 HOTFIX: reduce streams benchmark input records to 10 million
We are occasionally hitting some timeouts due to processing not finishing. So rather than failing the build for these reasons it would be better to reduce the runtime.

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3725 from dguy/fix-system-test
2017-08-23 20:53:00 +01:00
Konstantine Karantasis 3b7e104559 MINOR: Verify startup of zookeeper service in system tests
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3707 from kkonstantine/MINOR-Verify-startup-of-zookeeper-service-in-system-tests-by-connecting-to-port
2017-08-23 09:52:21 -07:00
Colin P. Mccabe 57dbe39ae1 KAFKA-5743; Ducktape services should use subdirs of /mnt
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3680 from cmccabe/KAFKA-5743
2017-08-21 18:36:45 +01:00
Eno Thereska 3e22c1c04a KAFKA-5725; More failure testing
Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #3656 from enothereska/minor-add-more-tests
2017-08-18 18:13:02 +01:00
Xavier Léauté 520e651d53 KAFKA-5742: support ZK chroot in system tests
Author: Xavier Léauté <xavier@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3677 from xvrl/support-zk-chroot-in-tests
2017-08-17 15:14:32 -07:00
Eno Thereska 889da45dd0 MINOR: Improve instructions for running system tests with docker
Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3649 from enothereska/minor-docker-docs
2017-08-09 23:35:13 +01:00
Randall Hauch 1a653c813c KAFKA-5704: Corrected Connect distributed startup behavior to allow older brokers to auto-create topics
When a Connect distributed worker starts up talking with broker versions 0.10.1.0 and later, it will use the AdminClient to look for the internal topics and attempt to create them if they are missing. Although the AdminClient was added in 0.11.0.0, the AdminClient uses APIs to create topics that existed in 0.10.1.0 and later. This feature works as expected when Connect uses a broker version 0.10.1.0 or later.

However, when a Connect distributed worker starts up using a broker older than 0.10.1.0, the AdminClient is not able to find the required APIs and thus will throw an UnsupportedVersionException. Unfortunately, this exception is not caught and instead causes the Connect worker to fail even when the topics already exist.

This change handles the UnsupportedVersionException by logging a debug message and doing nothing. The existing producer logic will get information about the topics, which will cause the broker to create them if they don’t exist and broker auto-creation of topics is enabled. This is the same behavior that existed prior to 0.11.0.0, and so this change restores that behavior for brokers older than 0.10.1.0.

This change also adds a system test that verifies Connect works with a variety of brokers and is able to run source and sink connectors. The test verifies that Connect can read from the internal topics when the connectors are restarted.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3641 from rhauch/kafka-5704
2017-08-08 20:20:41 -07:00
Xavier Léauté b8cf976865 MINOR: support retrieving cluster_id in system tests
ewencp would be great to cherry-pick this back into 0.11.x if possible

Author: Xavier Léauté <xavier@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3645 from xvrl/system-test-cluster-id
2017-08-08 19:58:47 -07:00
Paolo Patierno 1d291a4219 KAFKA-5643: Using _DUCKTAPE_OPTIONS has no effect on executing tests
Added handling of _DUCKTAPE_OPTIONS (mainly for enabling debugging)

Author: Paolo Patierno <ppatierno@live.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3578 from ppatierno/kafka-5643
2017-08-06 21:50:43 -07:00
Colin P. Mccabe d637c134c8 KAFKA-5602; ducker-ak: support --custom-ducktape
Support a --custom-ducktape flag which allows developers to install
their own versions of ducktape into Docker images.  This is helpful for
ducktape development.

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Ismael Juma <ismael@juma.me.uk>

Closes #3539 from cmccabe/KAFKA-5602
2017-08-05 09:04:05 +01:00
Ismael Juma db306ec362 MINOR: Support versions with 3 segments in _kafka_jar_versions
The bump from 0.11.1.0-SNAPSHOT to 1.0.0-SNAPSHOT broke a couple
of system tests:

* TestVerifiableProducer.test_simple_run
* KafkaVersionTest.test_multi_version

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #3587 from ijuma/fix-_kafka_jar_versions
2017-08-02 15:50:40 +01:00
Ismael Juma 5effe72390 MINOR: Next release will be 1.0.0
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3580 from ijuma/bump-to-1.0.0-SNAPSHOT
2017-07-26 13:01:41 -07:00
Dong Lin fc93fb4b61 KAFKA-4763; Handle disk failure for JBOD (KIP-112)
Author: Dong Lin <lindong28@gmail.com>

Reviewers: Jiangjie Qin <becket.qin@gmail.com>, Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Onur Karaman <okaraman@linkedin.com>

Closes #2929 from lindong28/KAFKA-4763
2017-07-22 12:35:32 -07:00
Colin P. Mccabe 9ae09e8059 KAFKA-5623: ducktape kafka service: do not assume Service contains num_nodes
…m_nodes

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3557 from cmccabe/KAFKA-5623
2017-07-20 19:34:16 -07:00
Ewen Cheslack-Postava f50af9c31d KAFKA-5608: Add --wait option for JmxTool and use in system tests to avoid race between JmxTool and monitored services
Author: Ewen Cheslack-Postava <me@ewencp.org>
Author: Ewen Cheslack-Postava <ewen@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>

Closes #3547 from ewencp/wait-jmx-metrics
2017-07-19 21:39:51 -07:00
Ewen Cheslack-Postava d663005fdd MINOR: Do not wait for first line of console consumer output since we now have a more reliable test using JMX
Waiting for the first line of output was added in KAFKA-2527 when JmxMixin was originally added as a heuristic to
determine when the process was ready. We've since determined this is not good enough given JmxTool's limitations
and now include a separate, more reliable check before starting JmxTool. This check is also dangerous since a
consumer that is started before data is available in the topic, it won't output anything to stdout and only logs
errors to a separate log file. This means we may have a long delay between starting the process and starting JMX
monitoring.

Since we have a more reliable check for liveness via JMX now (and in cases that need it, partition assignment
metrics via JMX), we should no longer need to wait for the first line of output.

Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Apurva Mehta <apurva@confluent.io>

Closes #3447 from ewencp/dont-wait-first-line-console-consumer
2017-07-17 21:24:45 -07:00
Matthias J. Sax 6ea36bc6fb HOTFIX: disable flaky system tests
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Eno Thereska <eno.thereska@gmail.com>, Damian Guy <damian.guy@gmail.com>

Closes #3497 from mjsax/disable-flaky-system-tests
2017-07-07 09:45:37 +01:00
Eno Thereska cbafc08a5f MINOR: compatibility tests for streams
Update system tests to make use of the newly released 0.11 version.

Add on to https://github.com/apache/kafka/pull/3454

Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #3479 from enothereska/minor-compatibility-tests
2017-07-03 16:04:41 +01:00
Ismael Juma 49ed16daf4 MINOR: Compatibility and upgrade tests for 0.11.0.x
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Eno Thereska <eno.thereska@gmail.com>, Ewen Cheslack-Postava <me@ewencp.org>

Closes #3454 from ijuma/test-upgrades-from-0.11.0.x
2017-06-30 23:36:21 +02:00
Colin P. Mccabe 5536b1237f KAFKA-5484: Refactor kafkatest docker support
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3389 from cmccabe/KAFKA-5484
2017-06-29 13:28:35 -07:00
Ewen Cheslack-Postava e45c767d53 MINOR: Make JmxMixin wait for the monitored process to be listening on the JMX port before launching JmxTool
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3437 from ewencp/wait-jmx-listening
2017-06-26 17:06:38 -07:00
Matthias J. Sax 2265834803 KAFKA-5362: Add Streams EOS system test with repartitioning topic
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3310 from mjsax/kafka-5362-add-eos-system-tests-for-streams-api
2017-06-25 09:34:27 -07:00
Eno Thereska ee5eac715d KAFKA-5487; upgrade and downgrade streams app system test
-Tests for rolling upgrades for a streams app (keeping broker config fixed)
-Tests for rolling upgrades of brokers (keeping streams app config fixed)

Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>

Closes #3411 from enothereska/KAFKA-5487-upgrade-test-streams
2017-06-24 07:48:10 +01:00
Matthias J. Sax ac53979647 MINOR: update AWS test setup guide
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Joseph Rea <jrea@users.noreply.github.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2575 from mjsax/minor-update-system-test-readme
2017-06-22 16:42:55 -07:00
Matthias J. Sax b62cccd078 MINOR: improve test README
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3416 from mjsax/minor-aws
2017-06-22 16:17:27 -07:00
Eno Thereska 55a90938a1 MINOR: add Yahoo benchmark to nightly runs
Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #3289 from enothereska/yahoo-benchmark
2017-06-21 11:46:59 +01:00
Randall Hauch bcf5da0bac KAFKA-5450: Increased timeout of Connect system test utilities
Increased the timeout from 30sec to 60sec. When running the system tests with packaged Kafka, Connect workers can take about 30seconds to start.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3344 from rhauch/KAFKA-5450
2017-06-14 16:23:29 -07:00
Jason Gustafson 005b86ecf3 MINOR: Add random aborts to system test transactional copier service
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3340 from hachikuji/add-random-aborts-to-system-test
2017-06-14 16:20:18 -07:00
Apurva Mehta 0f3f2f56a2 KAFKA-5437; Always send a sig_kill when cleaning the message copier
When the message copier hangs (like when there is a bug in the client), it ignores the sigterm and doesn't shut down. this leaves the cluster in an unclean state causing future tests to fail.

In this patch we always send SIGKILL when cleaning the node if the process isn't already dead. This is consistent with the other services.

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3308 from apurvam/KAFKA-5437-force-kill-message-copier-on-cleanup
2017-06-12 15:57:01 -07:00
Colin P. Mccabe 7d1ef63bec KAFKA-5404; Add more AdminClient checks to ClientCompatibilityTest
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3263 from cmccabe/KAFKA-5404
2017-06-12 22:27:58 +01:00
Matthias J. Sax ba07d828c5 KAFKA-5362: Add EOS system tests for Streams API
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #3201 from mjsax/kafka-5362-add-eos-system-tests-for-streams-api
2017-06-08 14:08:54 -07:00
Apurva Mehta eff79849a0 MINOR: Set log level for producer internals to trace for transactions test
We need this to debug most issues with the transactions system test.

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3261 from apurvam/MINOR-set-log-level-for-producer-to-trace-for-transactions-test
2017-06-08 09:46:15 -07:00
Eno Thereska 5c3d7ca711 MINOR: log4j template should accept log_level
The log_level parameter is used in system tests in kafka.py. However the log4j template accepted that parameter in only one place. This led to a large number of DEBUG lines printed even when the intention was to capture only INFO lines. Led to huge log files. Thanks to ijuma for noticing this.

Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3247 from enothereska/minor-log4j-template-fix
2017-06-07 16:52:10 +01:00
Apurva Mehta 202cb8ea89 KAFKA-5366; Add concurrent reads to transactions system test
This currently fails in multiple ways. One of which is most likely KAFKA-5355, where the concurrent consumer reads duplicates.

During broker bounces, the concurrent consumer misses messages completely. This is another bug.

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3217 from apurvam/KAFKA-5366-add-concurrent-reads-to-transactions-system-test
2017-06-06 17:21:45 -07:00
Ismael Juma b119d04093 KAFKA-5112; Update compatibility system tests to include 0.10.2.1
Also update message format tests now that we have a third message
format.

Finally, set group.initial.rebalance.delay.ms=100.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #2701 from ijuma/update-upgrade-tests-for-0.11
2017-06-06 03:21:43 +01:00
Colin P. Mccabe f389b71570 KAFKA-5374; Set allow auto topic creation to false when requesting node information only
It avoids the need to handle protocol downgrades and it's safe (i.e. it will never cause
the auto creation of topics).

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3220 from ijuma/kafka-5374-admin-client-metadata
2017-06-03 06:26:16 +01:00
Apurva Mehta 1959835d9e KAFKA-5281; System tests for transactions
Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3149 from apurvam/KAFKA-5281-transactions-system-tests
2017-06-01 10:25:29 -07:00
Matthias J. Sax 495836a4a2 KAFKA-4923: Modify compatibility test for Exaclty-Once Semantics in Streams
- add broker compatibility system tests

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Damian Guy, Eno Thereska, Guozhang Wang

Closes #2974 from mjsax/kafka-4923-add-eos-to-streams-add-broker-check-and-system-test
2017-05-21 22:16:18 -07:00
Magnus Edenhill dc10b0ea01 MINOR: Fix race condition in TestVerifiableProducer sanity test
## Fixes race condition in TestVerifiableProducer sanity test:
The test starts a producer, waits for at least 5 acks, and then
logs in to the worker to grep for the producer process to figure
out what version it is running.

The problem was that the producer was set up to produce 1000 messages
at a rate of 1000 msgs/s and then exit. This means it will have a
typical runtime slightly above 1 second.

Logging in to the vagrant instance might take longer than that thus
resulting in the process grep to fail, failing the test.

This commit doesn't really fix the issue - a proper fix would be to tell
the producer to stick around until explicitly killed - but it increases
the chances of the test passing, at the expense of a slightly longer
runtime.

## Improves error reporting when is_version() fails

Author: Magnus Edenhill <magnus@edenhill.se>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2765 from edenhill/trunk
2017-05-21 18:23:12 -07:00
Ewen Cheslack-Postava d190d89dbc HOTFIX: In Connect test with auto topic creation disabled, ensure precreated topic is always used
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3112 from ewencp/hotfix-precreate-topic
2017-05-21 18:04:32 -07:00
Ismael Juma 8b2995c31a MINOR: Bump Kafka version to 0.11.1.0-SNAPSHOT
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3095 from ijuma/bump-kafka-version
2017-05-19 10:20:23 +01:00
Randall Hauch 56623efd73 KAFKA-4667: Connect uses AdminClient to create internal topics when needed (KIP-154)
The backing store for offsets, status, and configs now attempts to use the new AdminClient to look up the internal topics and create them if they don’t yet exist. If the necessary APIs are not available in the connected broker, the stores fall back to the old behavior of relying upon auto-created topics. Kafka Connect requires a minimum of Apache Kafka 0.10.0.1-cp1, and the AdminClient can work with all versions since 0.10.0.0.

All three of Connect’s internal topics are created as compacted topics, and new distributed worker configuration properties control the replication factor for all three topics and the number of partitions for the offsets and status topics; the config topic requires a single partition and does not allow it to be set via configuration. All of these new configuration properties have sensible defaults, meaning users can upgrade without having to change any of the existing configurations. In most situations, existing Connect deployments will have already created the storage topics prior to upgrading.

The replication factor defaults to 3, so anyone running Kafka clusters with fewer nodes than 3 will receive an error unless they explicitly set the replication factor for the three internal topics. This is actually desired behavior, since it signals the users that they should be aware they are not using sufficient replication for production use.

The integration tests use a cluster with a single broker, so they were changed to explicitly specify a replication factor of 1 and a single partition.

The `KafkaAdminClientTest` was refactored to extract a utility for setting up a `KafkaAdminClient` with a `MockClient` for unit tests.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2984 from rhauch/kafka-4667
2017-05-18 16:02:29 -07:00
Ismael Juma bcf447e93e KAFKA-4422; Drop support for Scala 2.10 (KIP-119)
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Ewen Cheslack-Postava <me@ewencp.org>

Closes #2956 from ijuma/kafka-4422-drop-support-scala-2.10
2017-05-11 08:08:11 +01:00
Konstantine Karantasis 8ace736f71 HOTFIX: Increase kafkatest startup wait time on ConnectDistributed service
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Jason Gustafson <jason@confluent.io>

Closes #3006 from kkonstantine/HOTFIX-Align-startup-wait-time-for-ConnectDistributed-service-with-ConnectStandalone-in-kafkatests
2017-05-09 14:02:30 -07:00
Jason Gustafson 4815323be7 MINOR: Replication system tests should cover compressed path
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #2745 from hachikuji/add-replication-testcase-for-compression
2017-04-24 10:17:12 +01:00
Matthias J. Sax 1fbb8cfafa KAFKA-4564; Follow-up to fix test_timeout_on_pre_010_brokers system test failure
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Eno Thereska <eno@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #2897 from mjsax/KAFKA-4564-follow-up
2017-04-23 16:23:48 +01:00
Matthias J. Sax 72be1df598 KAFKA-4564: Add system test for pre-0.10 brokers
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Ismael Juma, Eno Thereska, Matthias J. Sax, Guozhang Wang

Closes #2837 from mjsax/kafka-4564-fail-fast-test-stream-compatibility
2017-04-21 10:53:29 -07:00
Matthias J. Sax 779874fb14 MINOR: improve test stability for Streams broker-compatibility test
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Magnus Edenhill, Eno Thereska, Damian Guy, Guozhang Wang

Closes #2836 from mjsax/minor-broker-comp-test
2017-04-20 13:49:06 -07:00
Ewen Cheslack-Postava 7c436d3889 MINOR: Fix some re-raising of exceptions in system tests
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #2852 from ewencp/minor-re-raise-exceptions
2017-04-19 16:17:53 +01:00
Ben Stopford 0baea2ac13 KIP-101: Alter Replication Protocol to use Leader Epoch rather than High Watermark for Truncation
This PR replaces https://github.com/apache/kafka/pull/2743 (just raising from Confluent repo)

This PR describes the addition of Partition Level Leader Epochs to messages in Kafka as a mechanism for fixing some known issues in the replication protocol. Full details can be found here:

[KIP-101 Reference](https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Epoch+rather+than+High+Watermark+for+Truncation)

*The key elements are*:
- Epochs are stamped on messages as they enter the leader.
- Epochs are tracked in both leader and follower in a new checkpoint file.
- A new API allows followers to retrieve the leader's latest offset for a particular epoch.
- The logic for truncating the log, when a replica becomes a follower, has been moved from Partition into the ReplicaFetcherThread
- When partitions are added to the ReplicaFetcherThread they are added in an initialising state. Initialising partitions request leader epochs and then truncate their logs appropriately.

This test provides a good overview of the workflow `EpochDrivenReplicationProtocolAcceptanceTest.shouldFollowLeaderEpochBasicWorkflow()`

The corrupted log use case is covered by the test
`EpochDrivenReplicationProtocolAcceptanceTest.offsetsShouldNotGoBackwards()`

Remaining work: There is a do list here: https://docs.google.com/document/d/1edmMo70MfHEZH9x38OQfTWsHr7UGTvg-NOxeFhOeRew/edit?usp=sharing

Author: Ben Stopford <benstopford@gmail.com>
Author: Jun Rao <junrao@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>

Closes #2808 from benstopford/kip-101-v2
2017-04-06 14:51:09 -07:00
Eno Thereska 49f80b2360 KAFKA-4916: test streams with brokers failing
Several fixes for handling broker failures:
- default replication value for internal topics is now 3 in test itself (not in streams code, that will require a KIP.
- streams producer waits for acks from all replicas in test itself (not in streams code, that will require a KIP.
- backoff time for streams client to try again after a failure to contact controller.
- fix bug related to state store locks (this helps in multi-threaded scenarios)
- fix related to catching exceptions property for network errors.
- system test for all the above

Author: Eno Thereska <eno@confluent.io>
Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Dan Norwood <norwood@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2719 from enothereska/KAFKA-4916-broker-bounce-test
2017-04-04 18:32:58 -07:00
Apurva Mehta bdf4cba047 KAFKA-4817; Add idempotent producer semantics
This is from the KIP-98 proposal.

The main points of discussion surround the correctness logic, particularly the Log class where incoming entries are validated and duplicates are dropped, and also the producer error handling to ensure that the semantics are sound from the users point of view.

There is some subtlety in the idempotent producer semantics. This patch only guarantees idempotent production upto the point where an error has to be returned to the user. Once we hit a such a non-recoverable error, we can no longer guarantee message ordering nor idempotence without additional logic at the application level.

In particular, if an application wants guaranteed message order without duplicates, then it needs to do the following in the error callback:

1. Close the producer so that no queued batches are sent. This is important for guaranteeing ordering.
2. Read the tail of the log to inspect the last message committed. This is important for avoiding duplicates.

Author: Apurva Mehta <apurva@confluent.io>
Author: hachikuji <jason@confluent.io>
Author: Apurva Mehta <apurva.1618@gmail.com>
Author: Guozhang Wang <wangguoz@gmail.com>
Author: fpj <fpj@apache.org>
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>

Closes #2735 from apurvam/exactly-once-idempotent-producer
2017-04-02 19:41:44 -07:00
Jason Gustafson 93d451ceeb KAFKA-4689; Disable system tests for consumer hard failures
See the JIRA for the full details. Essentially the test assertions depend on receiving reliable events from the consumer processes, but this is not generally possible in the presence of a hard failure (i.e. `kill -9`). Until we solve this problem, the hard failure scenarios will be turned off.

Author: Jason Gustafson <jason@confluent.io>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #2771 from hachikuji/KAFKA-4689
2017-03-30 23:52:03 +01:00
Eno Thereska 84a14fec29 KAFKA-4843: More efficient round-robin scheduler
- Improves streams efficiency by more than 200K requests/second (small 100 byte requests)
- Gets streams efficiency very close to pure consumer (see results in https://jenkins.confluent.io/job/system-test-kafka-branch-builder/746/console)

- Maintains same fairness across tasks
- Schedules all records in the queue in-between poll() calls, not just one per task.

Author: Eno Thereska <eno@confluent.io>
Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Damian Guy, Matthias J. Sax, Guozhang Wang

Closes #2643 from enothereska/minor-schedule-round-robin
2017-03-29 15:00:26 -07:00
Ismael Juma 4ce65d65df KAFKA-4574; Ignore test_zk_security_upgrade until KIP-101 lands
The transient failures make it harder to spot real failures and we can live without what is being tested (adding security to ZK via a rolling upgrade) until KIP-101 lands.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Apurva Mehta <apurva@confluent.io>, Jun Rao <junrao@gmail.com>

Closes #2742 from ijuma/disable-zk-upgrade-test
2017-03-29 02:30:30 +01:00
Jason Gustafson 462767660b HOTFIX: Fix unsafe dependence on class name in VerifiableClientJava
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2736 from hachikuji/hotfix-verifiable-clients
2017-03-24 19:42:55 -07:00
Magnus Edenhill d5cec01f2a MINOR: Pluggable verifiable clients
This adds support for pluggable VerifiableConsumer and VerifiableProducers test client implementations
allowing third-party clients to be used in-place of the Java client in kafkatests.

A new VerifiableClientMixin class is added and the standard Java Verifiable{Producer,Consumer}
classes have been changed to use it.

While third-party client drivers may be implemented with a complete class based on the Mixin, a simpler
alternative which requries no kafkatest class implementation is available through the VerifiableClientApp class that uses ducktape's global param to specify the client app to use (passed to ducktape through the `--globals <json>` command line argument).

Some existing kafkatest clients for reference:
Go: https://github.com/confluentinc/confluent-kafka-go/tree/master/kafkatest
Python: https://github.com/confluentinc/confluent-kafka-python/tree/master/confluent_kafka/kafkatest
C++: https://github.com/edenhill/librdkafka/blob/0.9.2.x/examples/kafkatest_verifiable_client.cpp
C#/.NET: https://github.com/confluentinc/confluent-kafka-dotnet/tree/master/test/Confluent.Kafka.VerifiableClient

This PR also contains documentation for the simplex JSON-based verifiable\* client protocol.

There are also some minor improvements that makes troubleshooting failing client tests easier.

Author: Magnus Edenhill <magnus@edenhill.se>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2048 from edenhill/pluggable_verifiable_clients
2017-03-21 22:24:17 -07:00
Ben Stopford 6b11fb7af8 MINOR: Increase Throttle lower bound assertion in throttling_test
Improves the reliability of this test by decreasing the lower bound (this is to be expected as throttling  takes the first fetch to stabilise and will "over-fetch" for this first request)

Author: Ben Stopford <benstopford@gmail.com>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #2698 from benstopford/throttling-test-fix
2017-03-17 12:39:53 +00:00
Raghav Kumar Gautam dbcbd7920f KAFKA-4467: Run tests on travis-ci using docker
ijuma ewencp cmccabe harshach Please review.
Here is a sample run:
https://travis-ci.org/raghavgautam/kafka/builds/191714520

In this run 214 tests were run and 144 tests passed.

I will open separate jiras for fixing failures.

Author: Raghav Kumar Gautam <raghav@apache.org>

Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2376 from raghavgautam/trunk
2017-03-09 16:24:38 -08:00
Eno Thereska b7378d567f MINOR: Standardised benchmark params for consumer and streams
There were some minor differences in the basic consumer config and streams config that are now rectified. In addition, in AWS environments the socket size makes a big difference to performance and I've tuned it up accordingly. I've also increased the number of records now that perf is higher.

Author: Eno Thereska <eno@confluent.io>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #2634 from enothereska/minor-standardize-params
2017-03-04 20:55:16 -08:00
Colin P. Mccabe 4b1415c109 MINOR: Fix tests/docker/Dockerfile
Fix tests/docker/Dockerfile to put the old Kafka distributions in the
correct spot for tests.  Also, run_tests.sh should exit with an error
code if image rebuilding fails, rather than silently falling back to an
older image.

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2613 from cmccabe/dockerfix
2017-03-03 16:40:25 -08:00
Ismael Juma 2e92f9b2e4 MINOR: Bump version to 0.11.0.0-SNAPSHOT
There won't be a 0.10.3.0.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #2628 from ijuma/bump-version-to-0.11.0.0-SNAPSHOT
2017-03-03 22:23:20 +00:00
Colin P. Mccabe f3fab2e476 KAFKA-4809: docker/run_tests.sh should set up /opt/kafka-dev to be the source directory
…0.x and not 0.8

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2602 from cmccabe/KAFKA-4809
2017-02-28 12:21:46 -08:00
Eno Thereska 9231cc439e KAFKA-4744: Increased timeout for bounce test
Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Ismael Juma, Matthias J. Sax, Guozhang Wang

Closes #2601 from enothereska/KAFKA-4744-bounce
2017-02-27 11:19:09 -08:00
Rajini Sivaram 2a7b18a2ac KAFKA-4779; Fix security upgrade system test to be non-disruptive
The phase_two security upgrade test verifies upgrading inter-broker and client protocols to the same value as well as different values. The second case currently changes inter-broker protocol without first enabling the protocol, disrupting produce/consume until the whole cluster is updated. This commit changes the test to be a non-disruptive upgrade test that enables protocols first (simulating phase one of upgrade).

Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Apurva Mehta <apurva.1618@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #2589 from rajinisivaram/KAFKA-4779
2017-02-24 17:24:07 +00:00
Apurva Mehta c6fcc721e2 MINOR: Increase consumer init timeout in throttling test
The throttling system test sometimes fail because it takes longer than the current 10 second time out for partitions to get assigned to the consumer.

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #2567 from apurvam/increase-timeout-for-partitions-assigned
2017-02-18 06:38:35 -08:00
Eno Thereska b865a8b1dc KAFKA-4716: send create topics to controller in internaltopicmanager
This PR fixes a blocker issue, where the streams client code cannot talk to the controller. It also enables a system test that was previously failing.

This PR is for trunk only. A separate PR with just the fix (but not the tests) will be created for 0.10.2.

Author: Eno Thereska <eno@confluent.io>
Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Damian Guy, Ismael Juma, Matthias J. Sax, Guozhang Wang

Closes #2522 from enothereska/KAFKA-4716-metadata
2017-02-09 14:01:13 -08:00
Eno Thereska f8c11eb0c3 HOTFIX: Add missing default arguments to __init__
This caused the bounce and smoke tests to fail on trunk.

Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #2524 from enothereska/hotfix-tests
2017-02-09 15:13:31 +00:00
Eno Thereska 13a82b48ca KAFKA-4702: Parametrize streams benchmarks to run at scale
Author: Eno Thereska <eno.thereska@gmail.com>
Author: Eno Thereska <eno@confluent.io>
Author: Ubuntu <ubuntu@ip-172-31-22-146.us-west-2.compute.internal>

Reviewers: Matthias J. Sax, Guozhang Wang

Closes #2478 from enothereska/minor-benchmark-args
2017-02-08 13:06:09 -08:00
Eno Thereska e7c869e65d HOTFIX: renamed test so it is picked up by ducktape
Author: Eno Thereska <eno@confluent.io>

Reviewers: Matthias J. Sax, Guozhang Wang

Closes #2517 from enothereska/hotfix-broker-test
2017-02-08 12:50:04 -08:00
Ewen Cheslack-Postava 82e75b9607 MINOR: Fix import for streams broker compatibility test to use new DEV_BRANCH constant
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #2508 from ewencp/minor-streams-compatibility-trunk-dev-branch
2017-02-06 13:53:22 -08:00