kafka/raft
Jason Gustafson a72f0c1eac
KAFKA-10533; KafkaRaftClient should flush log after appends (#9352)
This patch adds missing flush logic to `KafkaRaftClient`. The initial flushing behavior is simplistic. We guarantee that the leader will not replicate above the last flushed offset and we guarantee that the follower will not fetch data above its own flush point. More sophisticated flush behavior is proposed in KAFKA-10526.

We have also extended the simulation test so that it covers flush behavior. When a node is shut down, all unflushed data is lost. We were able to confirm that the monotonic high watermark invariant fails without the added `flush` calls.

This patch also piggybacks a fix to the `TestRaftServer` implementation. The initial check-in contained a bug which caused `RequestChannel` to fail sending responses because the disabled APIs did not have metrics registered. As a result, it was impossible to elect leaders.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-10-13 08:59:02 -07:00

README.md

Kafka Raft

Kafka Raft is a submodule of Apache Kafka which features a tailored version of the Raft consensus protocol.

Eventually this module will be integrated into the Kafka server. For now, we have a standalone test server which can be used for performance testing. Below we describe the details to set this up.

Run Single Node Quorum

bin/test-raft-server-start.sh config/raft.properties

Run Multi Node Quorum

Create three separate Raft quorum properties files as follows:

cat << EOF >> config/raft-quorum-1.properties

broker.id=1
listeners=PLAINTEXT://localhost:9092
quorum.bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
quorum.bootstrap.voters=1,2,3
log.dirs=/tmp/raft-logs-1
verbose=true

zookeeper.connect=localhost:2181
EOF

cat << EOF >> config/raft-quorum-2.properties

broker.id=2
listeners=PLAINTEXT://localhost:9093
quorum.bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
quorum.bootstrap.voters=1,2,3
log.dirs=/tmp/raft-logs-2
verbose=true

zookeeper.connect=localhost:2181
EOF

cat << EOF >> config/raft-quorum-3.properties

broker.id=3
listeners=PLAINTEXT://localhost:9094
quorum.bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
quorum.bootstrap.voters=1,2,3
log.dirs=/tmp/raft-logs-3
verbose=true

zookeeper.connect=localhost:2181
EOF
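
The three files above differ only in the broker id, the listener port, and the log directory, so they can also be generated in a loop. The following is a minimal sketch under that assumption; the function name `write_quorum_configs` is hypothetical and not part of the Kafka scripts. Unlike the commands above, it overwrites each file instead of appending, so running it twice does not duplicate entries.

```shell
# Sketch: write the three quorum property files in a loop.
# write_quorum_configs is a hypothetical helper, not a Kafka script.
write_quorum_configs() {
  local config_dir="${1:-config}"
  local id port
  mkdir -p "$config_dir"
  for id in 1 2 3; do
    # Node 1 listens on 9092, node 2 on 9093, node 3 on 9094.
    port=$((9091 + id))
    # '>' overwrites the file, so re-running does not append duplicates.
    cat > "$config_dir/raft-quorum-${id}.properties" << EOF
broker.id=${id}
listeners=PLAINTEXT://localhost:${port}
quorum.bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
quorum.bootstrap.voters=1,2,3
log.dirs=/tmp/raft-logs-${id}
verbose=true

zookeeper.connect=localhost:2181
EOF
  done
}
```

From the Kafka root directory, call it as `write_quorum_configs config`.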

Open three separate terminals and run one of the following commands in each:

bin/test-raft-server-start.sh config/raft-quorum-1.properties
bin/test-raft-server-start.sh config/raft-quorum-2.properties
bin/test-raft-server-start.sh config/raft-quorum-3.properties
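
If three terminals are inconvenient, the same commands can be started in the background from a single shell. This is only a sketch: the helper name `start_quorum`, the default output directory `/tmp/raft-quorum-out`, and the pid-file layout are assumptions made for illustration.

```shell
# Sketch: start all three test servers in the background from one shell.
# start_quorum and the output directory layout are hypothetical.
start_quorum() {
  local start_cmd="${1:-bin/test-raft-server-start.sh}"
  local out_dir="${2:-/tmp/raft-quorum-out}"
  local id
  mkdir -p "$out_dir"
  for id in 1 2 3; do
    # Send each node's console output to its own file.
    "$start_cmd" "config/raft-quorum-${id}.properties" \
      > "$out_dir/node-${id}.out" 2>&1 &
    # Record the background process id so the node can be stopped later.
    echo "$!" > "$out_dir/node-${id}.pid"
  done
}
```

Run `start_quorum` from the Kafka root directory. A single node can then be stopped with `kill "$(cat /tmp/raft-quorum-out/node-1.pid)"`.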

This sets up a three-node Raft quorum with node IDs 1, 2, and 3, each using a different endpoint and log directory.

Simulate a distributed state machine

Use a VerifiableProducer to produce monotonically increasing records to the replicated state machine.

./bin/kafka-run-class.sh org.apache.kafka.tools.VerifiableProducer --bootstrap-server http://localhost:9092 \
--topic __cluster_metadata --max-messages 2000 --throughput 1 --producer.config config/producer.properties

Run Performance Test

Make sure to turn off printing by setting verbose=false in the property files config/raft-*.properties, so that console output has minimal impact on performance.
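
One way to flip the setting in every file at once is sketched below. The helper name `disable_verbose` is hypothetical; it rewrites each file through a temporary copy rather than relying on `sed -i`, whose syntax differs between GNU and BSD sed.

```shell
# Sketch: set verbose=false in every property file passed as an argument.
# disable_verbose is a hypothetical helper, not a Kafka script.
disable_verbose() {
  local f
  for f in "$@"; do
    # Skip the literal pattern if the glob matched no files.
    [ -f "$f" ] || continue
    # Replace the whole verbose line, leaving all other settings untouched.
    sed 's/^verbose=.*/verbose=false/' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
}
```

For example, `disable_verbose config/raft-*.properties` updates all of the files created earlier. The servers need to be restarted for the change to take effect.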

Then run the ProducerPerformance module using this example command:

./bin/kafka-producer-perf-test.sh --topic __cluster_metadata --num-records 2000 --throughput -1 --record-size 10 --producer.config config/producer.properties 

Collect the printed throughput and compare it with Kafka's performance.