mirror of https://github.com/apache/kafka.git

commit d79919d827 (parent e7efee02c1)

broker failure system test broken on replication branch; patched by John Fung; reviewed by Joel Koshy and Jun Rao; KAFKA-306

git-svn-id: https://svn.apache.org/repos/asf/incubator/kafka/branches/0.8@1359812 13f79535-47bb-0310-9956-ffa450edef68
@@ -1,23 +1,34 @@
-This script performs broker failure tests with the following
-setup in a single local machine:
+** Please note that the following commands should be executed
+   after downloading the kafka source code to build all the
+   required binaries:
+   1. <kafka install dir>/ $ ./sbt update
+   2. <kafka install dir>/ $ ./sbt package
+
+   Now you are ready to follow the steps below.
 
-1. A cluster of Kafka source brokers
-2. A cluster of Kafka mirror brokers with embedded consumers in
-   point-to-point mode
-3. An independent ConsoleConsumer in publish/subcribe mode to
+This script performs broker failure tests in an environment with
+Mirrored Source & Target clusters in a single machine:
+
+1. Start a cluster of Kafka source brokers
+2. Start a cluster of Kafka target brokers
+3. Start one or more Mirror Maker to create mirroring between
+   source and target clusters
+4. A producer produces batches of messages to the SOURCE brokers
+   in the background
+5. The Kafka SOURCE, TARGET brokers and Mirror Maker will be
+   terminated in a round-robin fashion and wait for the consumer
+   to catch up.
+6. Repeat step 5 as many times as specified in the script
+7. An independent ConsoleConsumer in publish/subcribe mode to
    consume messages from the SOURCE brokers cluster
-4. An independent ConsoleConsumer in publish/subcribe mode to
-   consume messages from the MIRROR brokers cluster
-5. A producer produces batches of messages to the SOURCE brokers
-6. One of the Kafka SOURCE or MIRROR brokers in the cluster will
-   be randomly terminated and waiting for the consumer to catch up.
-7. Repeat Step 4 & 5 as many times as specified in the script
+8. An independent ConsoleConsumer in publish/subcribe mode to
+   consume messages from the TARGET brokers cluster
 
 Expected results:
 ==================
 There should not be any discrepancies by comparing the unique
 message checksums from the source ConsoleConsumer and the
-mirror ConsoleConsumer.
+target ConsoleConsumer.
 
 Notes:
 ==================
@@ -26,17 +37,36 @@ The number of Kafka SOURCE brokers can be increased as follows:
 2. Make sure that there are corresponding number of prop files:
    $base_dir/config/server_source{1..4}.properties
 
-The number of Kafka MIRROR brokers can be increased as follows:
+The number of Kafka TARGET brokers can be increased as follows:
 1. Update the value of $num_kafka_target_server in this script
 2. Make sure that there are corresponding number of prop files:
    $base_dir/config/server_target{1..3}.properties
 
 Quick Start:
 ==================
-Execute this script as follows:
-<kafka home>/system_test/broker_failure $ bin/run-test.sh
+In the directory <kafka home>/system_test/broker_failure,
+execute this script as following:
+$ bin/run-test.sh -n <num of iterations> -s <servers to bounce>
+
+num of iterations - the number of iterations that the test runs
+
+servers to bounce - the servers to be bounced in a round-robin fashion.
+
+    Values to be entered:
+    1 - source broker
+    2 - mirror maker
+    3 - target broker
+
+    Example:
+    * To bounce only mirror maker and target broker
+      in turns, enter the value 23.
+    * To bounce only mirror maker, enter the value 2.
+    * To run the test without bouncing, enter 0.
+At the end of the test, the received messages checksums in both
+SOURCE & TARGET will be compared. If all checksums are matched,
+the test is PASSED. Otherwise, the test is FAILED.
 
 In the event of failure, by default the brokers and zookeepers
 remain running to make it easier to debug the issue - hit Ctrl-C
 to shut them down.
 
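For instance, a run that bounces the mirror maker and a target broker in turns could be invoked as follows (a sketch based on the usage described in the README above; the iteration count of 5 is an arbitrary illustrative value):

    $ cd <kafka home>/system_test/broker_failure
    $ bin/run-test.sh -n 5 -s 23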
(File diff suppressed because it is too large.)
@@ -13,7 +13,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-log4j.rootLogger=INFO, stdout, kafkaAppender
+log4j.rootLogger=INFO, stdout
 
 # ====================================
 # messages going to kafkaAppender
@@ -27,7 +27,7 @@ log4j.logger.org.apache.zookeeper=INFO, kafkaAppender
 # ====================================
 # (comment out this line to redirect ZK-related messages to kafkaAppender
 # to allow reading both Kafka and ZK debugging messages in a single file)
-#log4j.logger.org.apache.zookeeper=INFO, zookeeperAppender
+log4j.logger.org.apache.zookeeper=INFO, zookeeperAppender
 
 # ====================================
 # stdout
@@ -73,6 +73,7 @@ log4j.additivity.org.apache.zookeeper=false
 
 log4j.logger.kafka.consumer=DEBUG
 log4j.logger.kafka.tools.VerifyConsumerRebalance=DEBUG
+log4j.logger.kafka.tools.ConsumerOffsetChecker=DEBUG
 
 # to print message checksum from ProducerPerformance
 log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG
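As the in-file comment above describes, the org.apache.zookeeper logger line acts as a toggle between two logging layouts. A sketch of the two alternatives, using the appender names as they appear in this diff:

    # separate files: ZK messages go to their own zookeeperAppender
    log4j.logger.org.apache.zookeeper=INFO, zookeeperAppender

    # single file: comment the line out so ZK messages stay with kafkaAppender
    #log4j.logger.org.apache.zookeeper=INFO, zookeeperAppender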
@@ -15,7 +15,8 @@
 # zk connection string
 # comma separated host:port pairs, each corresponding to a zk
 # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"
-broker.list=0:localhost:9081
+#broker.list=0:localhost:9081
+zk.connect=localhost:2182
 
 # timeout in ms for connecting to zookeeper
 zk.connectiontimeout.ms=1000000

@@ -15,7 +15,8 @@
 # zk connection string
 # comma separated host:port pairs, each corresponding to a zk
 # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"
-broker.list=0:localhost:9082
+#broker.list=0:localhost:9082
+zk.connect=localhost:2182
 
 # timeout in ms for connecting to zookeeper
 zk.connectiontimeout.ms=1000000

@@ -15,7 +15,8 @@
 # zk connection string
 # comma separated host:port pairs, each corresponding to a zk
 # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"
-broker.list=0:localhost:9083
+#broker.list=0:localhost:9083
+zk.connect=localhost:2182
 
 # timeout in ms for connecting to zookeeper
 zk.connectiontimeout.ms=1000000
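The same change is applied to all three mirror producer properties files: the static broker.list entry is commented out and a zk.connect entry is added, so the mirroring consumer discovers brokers through ZooKeeper instead of a fixed list. After the patch, each file contains a section along these lines (only the port in the commented-out line differs across the three files):

    # static broker list, disabled in favor of ZooKeeper-based discovery
    #broker.list=0:localhost:9081
    zk.connect=localhost:2182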
@@ -74,8 +74,3 @@ log.default.flush.interval.ms=1000
 # time based topic flasher time rate in ms
 log.default.flush.scheduler.interval.ms=1000
 
-# set sendBufferSize
-send.buffer.size=10000
-
-# set receiveBufferSize
-receive.buffer.size=10000
@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -74,9 +74,3 @@ log.default.flush.interval.ms=1000
 # time based topic flasher time rate in ms
 log.default.flush.scheduler.interval.ms=1000
 
-# set sendBufferSize
-send.buffer.size=500000
-
-# set receiveBufferSize
-receive.buffer.size=500000
-
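The server properties files all receive the same two adjustments: log.file.size drops from 536870912 (512 MB) to 10000000 (roughly 10 MB), presumably so that log segments roll frequently while brokers are being bounced, and the explicit send/receive buffer overrides are removed so the brokers fall back to their built-in defaults. The patched segment settings read:

    # the maximum size of a log segment (kept small so segments roll often during the test)
    log.file.size=10000000

    # the interval between running cleanup on the logs
    log.cleanup.interval.mins=1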
@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -74,9 +74,3 @@ log.default.flush.interval.ms=1000
 # time based topic flasher time rate in ms
 log.default.flush.scheduler.interval.ms=1000
 
-# set sendBufferSize
-send.buffer.size=500000
-
-# set receiveBufferSize
-receive.buffer.size=500000
-
@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -74,9 +74,3 @@ log.default.flush.interval.ms=1000
 # time based topic flasher time rate in ms
 log.default.flush.scheduler.interval.ms=1000
 
-# set sendBufferSize
-send.buffer.size=500000
-
-# set receiveBufferSize
-receive.buffer.size=500000
-
@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -77,9 +77,3 @@ log.default.flush.scheduler.interval.ms=1000
 # topic partition count map
 # topic.partition.count.map=topic1:3, topic2:4
 
-# set sendBufferSize
-send.buffer.size=500000
-
-# set receiveBufferSize
-receive.buffer.size=500000
-
@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -77,9 +77,3 @@ log.default.flush.scheduler.interval.ms=1000
 # topic partition count map
 # topic.partition.count.map=topic1:3, topic2:4
 
-# set sendBufferSize
-send.buffer.size=500000
-
-# set receiveBufferSize
-receive.buffer.size=500000
-
@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -77,9 +77,3 @@ log.default.flush.scheduler.interval.ms=1000
 # topic partition count map
 # topic.partition.count.map=topic1:3, topic2:4
 
-# set sendBufferSize
-send.buffer.size=500000
-
-# set receiveBufferSize
-receive.buffer.size=500000
-