KAFKA-10292: Set min.insync.replicas to 1 of __consumer_offsets (#9286)

The test StreamsBrokerBounceTest.test_all_brokers_bounce() fails on
2.5 because only one broker is left in the last stage of the test.
The offset commit cannot succeed there because min.insync.replicas
of __consumer_offsets is set to 2 and acks is set to all. The commit
therefore times out, and closing the Kafka Streams client takes
longer than the duration passed to the client's close method.
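
As a rough illustration of why the commit fails, the following Python
sketch models the broker-side check in a simplified way. The function
name and its parameters are hypothetical and are not part of Kafka or
the system tests; the sketch only assumes that a write with acks=all is
rejected when fewer replicas are in sync than min.insync.replicas
requires.

```python
def can_accept_write(in_sync_replicas: int, min_insync_replicas: int, acks: str) -> bool:
    """Simplified, hypothetical model of the min.insync.replicas check.

    Assumption: the check is only enforced for acks=all; for other acks
    settings the write is accepted regardless of the in-sync replica count.
    """
    if acks != "all":
        return True
    return in_sync_replicas >= min_insync_replicas


# Last stage of the test: a single broker is left, so at most one replica is in sync.
print(can_accept_write(in_sync_replicas=1, min_insync_replicas=2, acks="all"))
print(can_accept_write(in_sync_replicas=1, min_insync_replicas=1, acks="all"))
```

With min.insync.replicas set to 2 the first call models the failing
commit; with the value lowered to 1, as in this change, the second call
models a commit that can succeed on the last remaining broker.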

This especially affects the 2.5 branch, where Kafka Streams commits
offsets per task, i.e., close() has to wait for one timeout per
task. In 2.6 and trunk the offset commit is done per thread, so
close() only needs to wait for one timeout per stream thread.
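
The difference can be made concrete with a small back-of-the-envelope
sketch. The function below is hypothetical, and so are the numbers in
the example (5 tasks on one thread, a 60 second commit timeout); it
assumes that every failing commit blocks for the full timeout and that
commits on one thread happen one after another.

```python
def worst_case_close_wait(tasks_on_thread: int, commit_timeout_s: float,
                          commit_per_task: bool) -> float:
    """Worst-case time one stream thread spends on failing commits during close().

    commit_per_task=True models the 2.5 behavior described above (one
    commit per task); False models 2.6 and trunk (one commit per thread).
    """
    commits = tasks_on_thread if commit_per_task else 1
    return commits * commit_timeout_s


# Hypothetical values, chosen only for illustration.
print(worst_case_close_wait(5, 60.0, commit_per_task=True))   # per-task commits
print(worst_case_close_wait(5, 60.0, commit_per_task=False))  # per-thread commit
```

Under these assumptions the per-task variant waits five times as long
as the per-thread variant, which is why the 2.5 branch is far more
likely to exceed the duration passed to close().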

I opened this PR on trunk, since the test could also become
flaky on trunk and we want to avoid diverging system tests across
branches.

A more complete solution would be to improve the test by defining
better success criteria.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
Bruno Cadonna 2020-09-15 20:12:37 +02:00 committed by GitHub
parent 12d98a3d7a
commit a46c07ec8d
1 changed file with 8 additions and 1 deletion

@@ -275,7 +275,14 @@ class StreamsBrokerBounceTest(Test):
         Start a smoke test client, then kill a few brokers and ensure data is still received
         Record if records are delivered
         """
-        self.setup_system()
+        # Set min.insync.replicas to 1 because in the last stage of the test there is only one broker left.
+        # Otherwise the last offset commit will never succeed and time out and potentially take longer as
+        # duration passed to the close method of the Kafka Streams client.
+        self.topics['__consumer_offsets'] = { 'partitions': 50, 'replication-factor': self.replication,
+                                              'configs': {"min.insync.replicas": 1} }
+        self.setup_system()
         # Sleep to allow test to run for a bit
         time.sleep(120)