broker failure system test broken on replication branch; patched by John Fung; reviewed by Joel Koshy and Jun Rao; KAFKA-306

git-svn-id: https://svn.apache.org/repos/asf/incubator/kafka/branches/0.8@1359812 13f79535-47bb-0310-9956-ffa450edef68
Jun Rao 2012-07-10 18:05:54 +00:00
parent e7efee02c1
commit d79919d827
13 changed files with 558 additions and 549 deletions


@@ -1,23 +1,34 @@
-This script performs broker failure tests with the following
-setup in a single local machine:
+** Please note that the following commands should be executed
+after downloading the kafka source code to build all the
+required binaries:
+1. <kafka install dir>/ $ ./sbt update
+2. <kafka install dir>/ $ ./sbt package
-1. A cluster of Kafka source brokers
-2. A cluster of Kafka mirror brokers with embedded consumers in
-point-to-point mode
-3. An independent ConsoleConsumer in publish/subcribe mode to
+Now you are ready to follow the steps below.
+This script performs broker failure tests in an environment with
+Mirrored Source & Target clusters in a single machine:
+1. Start a cluster of Kafka source brokers
+2. Start a cluster of Kafka target brokers
+3. Start one or more Mirror Maker instances to mirror data between
+the source and target clusters
+4. A producer produces batches of messages to the SOURCE brokers
+in the background
+5. The Kafka SOURCE, TARGET brokers and Mirror Maker will be
+terminated in a round-robin fashion, waiting for the consumer
+to catch up.
+6. Repeat step 5 as many times as specified in the script
+7. An independent ConsoleConsumer in publish/subscribe mode to
 consume messages from the SOURCE brokers cluster
-4. An independent ConsoleConsumer in publish/subcribe mode to
-consume messages from the MIRROR brokers cluster
-5. A producer produces batches of messages to the SOURCE brokers
-6. One of the Kafka SOURCE or MIRROR brokers in the cluster will
-be randomly terminated and waiting for the consumer to catch up.
-7. Repeat Step 4 & 5 as many times as specified in the script
+8. An independent ConsoleConsumer in publish/subscribe mode to
+consume messages from the TARGET brokers cluster
 Expected results:
 ==================
 There should not be any discrepancies by comparing the unique
 message checksums from the source ConsoleConsumer and the
-mirror ConsoleConsumer.
+target ConsoleConsumer.
 Notes:
 ==================
@@ -26,17 +37,36 @@ The number of Kafka SOURCE brokers can be increased as follows:
 2. Make sure that there is a corresponding number of prop files:
 $base_dir/config/server_source{1..4}.properties
-The number of Kafka MIRROR brokers can be increased as follows:
+The number of Kafka TARGET brokers can be increased as follows:
 1. Update the value of $num_kafka_target_server in this script
 2. Make sure that there is a corresponding number of prop files:
 $base_dir/config/server_target{1..3}.properties
 Quick Start:
 ==================
-Execute this script as follows:
-<kafka home>/system_test/broker_failure $ bin/run-test.sh
+In the directory <kafka home>/system_test/broker_failure,
+execute this script as follows:
+$ bin/run-test.sh -n <num of iterations> -s <servers to bounce>
+num of iterations - the number of iterations that the test runs
+servers to bounce - the servers to be bounced in a round-robin fashion.
+Values to be entered:
+1 - source broker
+2 - mirror maker
+3 - target broker
+Example:
+* To bounce only the mirror maker and a target broker
+in turns, enter the value 23.
+* To bounce only the mirror maker, enter the value 2.
+* To run the test without bouncing, enter 0.
+At the end of the test, the received message checksums in both
+SOURCE & TARGET will be compared. If all checksums match,
+the test is PASSED. Otherwise, the test is FAILED.
+In the event of failure, by default the brokers and zookeepers
+remain running to make it easier to debug the issue - hit Ctrl-C
+to shut them down.
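
For example, a five-iteration run that bounces the mirror maker and a target
broker in turns (the iteration count here is illustrative; the -s value 23
follows the encoding described above):

$ bin/run-test.sh -n 5 -s 23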

File diff suppressed because it is too large


@@ -13,7 +13,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-log4j.rootLogger=INFO, stdout, kafkaAppender
+log4j.rootLogger=INFO, stdout
 # ====================================
 # messages going to kafkaAppender
@@ -27,7 +27,7 @@ log4j.logger.org.apache.zookeeper=INFO, kafkaAppender
 # ====================================
 # (comment out this line to redirect ZK-related messages to kafkaAppender
 # to allow reading both Kafka and ZK debugging messages in a single file)
-#log4j.logger.org.apache.zookeeper=INFO, zookeeperAppender
+log4j.logger.org.apache.zookeeper=INFO, zookeeperAppender
 # ====================================
 # stdout
@@ -73,6 +73,7 @@ log4j.additivity.org.apache.zookeeper=false
 log4j.logger.kafka.consumer=DEBUG
 log4j.logger.kafka.tools.VerifyConsumerRebalance=DEBUG
 log4j.logger.kafka.tools.ConsumerOffsetChecker=DEBUG
+# to print message checksum from ProducerPerformance
+log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG
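
With kafka.perf.ProducerPerformance$ProducerThread logging at DEBUG, each
produced message's checksum ends up in the producer's log. A quick way to
count the checksum lines after a run -- the log file name below is a
placeholder, since run-test.sh chooses the actual file name:

$ grep -ci checksum <producer log file>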


@@ -15,7 +15,8 @@
 # zk connection string
 # comma separated host:port pairs, each corresponding to a zk
 # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"
-broker.list=0:localhost:9081
+#broker.list=0:localhost:9081
+zk.connect=localhost:2182
 # timeout in ms for connecting to zookeeper
 zk.connectiontimeout.ms=1000000
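
The mirror producers now discover brokers via ZooKeeper (zk.connect) instead
of a static broker.list. Before a run, the ZooKeeper instance on port 2182
can be probed with the standard four-letter word; a healthy server answers
"imok":

$ echo ruok | nc localhost 2182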


@@ -15,7 +15,8 @@
 # zk connection string
 # comma separated host:port pairs, each corresponding to a zk
 # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"
-broker.list=0:localhost:9082
+#broker.list=0:localhost:9082
+zk.connect=localhost:2182
 # timeout in ms for connecting to zookeeper
 zk.connectiontimeout.ms=1000000


@@ -15,7 +15,8 @@
 # zk connection string
 # comma separated host:port pairs, each corresponding to a zk
 # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"
-broker.list=0:localhost:9083
+#broker.list=0:localhost:9083
+zk.connect=localhost:2182
 # timeout in ms for connecting to zookeeper
 zk.connectiontimeout.ms=1000000


@@ -74,8 +74,3 @@ log.default.flush.interval.ms=1000
 # time based topic flusher time rate in ms
 log.default.flush.scheduler.interval.ms=1000
-# set sendBufferSize
-send.buffer.size=10000
-# set receiveBufferSize
-receive.buffer.size=10000


@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -74,9 +74,3 @@ log.default.flush.interval.ms=1000
 # time based topic flusher time rate in ms
 log.default.flush.scheduler.interval.ms=1000
-# set sendBufferSize
-send.buffer.size=500000
-# set receiveBufferSize
-receive.buffer.size=500000
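
Shrinking log.file.size from 536870912 (512 MB) to 10000000 (roughly 10 MB)
makes log segments roll frequently under the test load, so segment cleanup
(log.cleanup.interval.mins=1) is actually exercised during a run. To watch
segments roll -- the path below is a placeholder, since the actual directory
comes from log.dir in each server_*.properties:

$ ls -lh <log.dir>/<topic>-0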


@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -74,9 +74,3 @@ log.default.flush.interval.ms=1000
 # time based topic flusher time rate in ms
 log.default.flush.scheduler.interval.ms=1000
-# set sendBufferSize
-send.buffer.size=500000
-# set receiveBufferSize
-receive.buffer.size=500000


@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -74,9 +74,3 @@ log.default.flush.interval.ms=1000
 # time based topic flusher time rate in ms
 log.default.flush.scheduler.interval.ms=1000
-# set sendBufferSize
-send.buffer.size=500000
-# set receiveBufferSize
-receive.buffer.size=500000


@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -77,9 +77,3 @@ log.default.flush.scheduler.interval.ms=1000
 # topic partition count map
 # topic.partition.count.map=topic1:3, topic2:4
-# set sendBufferSize
-send.buffer.size=500000
-# set receiveBufferSize
-receive.buffer.size=500000


@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -77,9 +77,3 @@ log.default.flush.scheduler.interval.ms=1000
 # topic partition count map
 # topic.partition.count.map=topic1:3, topic2:4
-# set sendBufferSize
-send.buffer.size=500000
-# set receiveBufferSize
-receive.buffer.size=500000


@@ -41,7 +41,7 @@ socket.send.buffer=1048576
 socket.receive.buffer=1048576
 # the maximum size of a log segment
-log.file.size=536870912
+log.file.size=10000000
 # the interval between running cleanup on the logs
 log.cleanup.interval.mins=1
@@ -77,9 +77,3 @@ log.default.flush.scheduler.interval.ms=1000
 # topic partition count map
 # topic.partition.count.map=topic1:3, topic2:4
-# set sendBufferSize
-send.buffer.size=500000
-# set receiveBufferSize
-receive.buffer.size=500000