Introduces v2 of Vote RPC and implements the handling of the new version of the RPC. Many references to "candidate" in the Vote RPC are changed to the more generic "replica". Replicas sending Vote request with PreVote set to true are not candidate. They are instead prospective candidate that are attempting to become candidate. Replicas receiving PreVote requests (vote request with PreVote=true) with an epoch equal to their own will _not_ transition to Unattached state. They will only grant the vote if they have not recently fetched from leader and the request's last epoch and offset are up-to-date with theirs. If a replica receives a PreVote request with an epoch greater than their current epoch, they will transition to Unattached state (setting their epoch to the one from the pre-vote request) and then grant the vote if the request's last epoch and offset are up-to-date with theirs. To avoid a possible ping-pong scenario. For example, there is 3 node quorum, leader node A disconnects from quorum, node B goes into prospective state first before node C, node B sends pre-vote request to node C still in follower state and receives back that node A is leader, node B transitions to follower while node C transitions to prospective after election timeout. If you repeat this interaction, it is possible for such replicas to transition from Follower to Prospective in perpetuity. This issue is resolved by having follower state nodes grant pre-vote requests only if they have successfully fetched from the leader at least once after becoming a follower. This change introduces a new suite called KafkaRaftClientPreVoteTest, for additional KRaft protocol tests with respect to pre-vote. Reviewers: José Armando García Sancio <jsancio@apache.org> |
||
---|---|---|
.. | ||
bin | ||
config | ||
src | ||
.gitignore | ||
README.md |
README.md
KRaft (Kafka Raft)
KRaft (Kafka Raft) is a protocol based on the Raft Consensus Protocol tailored for Apache Kafka.
This is used by Apache Kafka in the KRaft (Kafka Raft Metadata) mode. We also have a standalone test server which can be used for performance testing. We describe the details to set this up below.
Run Single Quorum
bin/test-kraft-server-start.sh --config config/kraft.properties --replica-directory-id b8tRS7h4TJ2Vt43Dp85v2A
Run Multi Node Quorum
Create 3 separate KRaft quorum properties as the following:
cat << EOF >> config/kraft-quorum-1.properties
node.id=1
listeners=PLAINTEXT://localhost:9092
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-1
EOF
cat << EOF >> config/kraft-quorum-2.properties
node.id=2
listeners=PLAINTEXT://localhost:9093
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-2
EOF
cat << EOF >> config/kraft-quorum-3.properties
node.id=3
listeners=PLAINTEXT://localhost:9094
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-3
EOF
Open up 3 separate terminals, and run individual commands:
bin/test-kraft-server-start.sh --config config/kraft-quorum-1.properties --replica-directory-id b8tRS7h4TJ2Vt43Dp85v2A
bin/test-kraft-server-start.sh --config config/kraft-quorum-2.properties --replica-directory-id Nkij_D9XRiYKNb41SiJo7Q
bin/test-kraft-server-start.sh --config config/kraft-quorum-3.properties --replica-directory-id 4-e97nI7eHPYKfEDtW8rtQ
Once a leader is elected, it will begin writing to an internal
__raft_performance_test
topic with a steady workload of random data.
You can control the workload using the --throughput
and --record-size
arguments passed to test-kraft-server-start.sh
.