This PR implements the second part of KIP-996 and KAFKA-16164 (tasks KAFKA-16607, KAFKA-17642, KAFKA-17643, KAFKA-17675) which encompass the response handling of PreVotes, addition of new ProspectiveState, update to metrics, and addition of Raft simulation tests. Voters now transition to ProspectiveState first before CandidateState to prevent unnecessary epoch bumps. Voters in ProspectiveState send PreVotes requests which are Vote requests with PreVote set to true. Follower grants PreVotes if it has not yet fetched successfully from leader. Leader denies all PreVotes. Unattached, Prospective, Candidate, and Resigned will grant PreVotes if the requesting replica's log is at least as long as theirs. Granted PreVotes are not persisted like standard votes. It is possible for a voter to grant several PreVotes in the same epoch. The only state which is allowed to transition directly to CandidateState is ProspectiveState. This happens on reception of majority of granted PreVotes or if at least one voter doesn't support PreVote requests. Prospective will transition to Follower after election loss/timeout if it was already aware of last known leader and the leader's endpoint, or at any point if it discovers the leader. Prospective will transition to Unattached after election loss/timeout if it does not know the leader endpoints. After electionTimeout, Resigned now always transitions to Unattached and increases the epoch. Prospective grants standard votes if it has not already granted a standard vote (no votedKey), has no leaderId, and the recipient's log is current enough Candidate no longer backs off after election timeout. Candidate still backs off after election loss. Reviewers: José Armando García Sancio <jsancio@apache.org> |
||
---|---|---|
.. | ||
bin | ||
config | ||
src | ||
.gitignore | ||
README.md |
README.md
KRaft (Kafka Raft)
KRaft (Kafka Raft) is a protocol based on the Raft Consensus Protocol tailored for Apache Kafka.
This is used by Apache Kafka in the KRaft (Kafka Raft Metadata) mode. We also have a standalone test server which can be used for performance testing. We describe the details to set this up below.
Run Single Quorum
bin/test-kraft-server-start.sh --config config/kraft.properties --replica-directory-id b8tRS7h4TJ2Vt43Dp85v2A
Run Multi Node Quorum
Create 3 separate KRaft quorum properties as the following:
cat << EOF >> config/kraft-quorum-1.properties
node.id=1
listeners=PLAINTEXT://localhost:9092
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-1
EOF
cat << EOF >> config/kraft-quorum-2.properties
node.id=2
listeners=PLAINTEXT://localhost:9093
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-2
EOF
cat << EOF >> config/kraft-quorum-3.properties
node.id=3
listeners=PLAINTEXT://localhost:9094
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-3
EOF
Open up 3 separate terminals, and run individual commands:
bin/test-kraft-server-start.sh --config config/kraft-quorum-1.properties --replica-directory-id b8tRS7h4TJ2Vt43Dp85v2A
bin/test-kraft-server-start.sh --config config/kraft-quorum-2.properties --replica-directory-id Nkij_D9XRiYKNb41SiJo7Q
bin/test-kraft-server-start.sh --config config/kraft-quorum-3.properties --replica-directory-id 4-e97nI7eHPYKfEDtW8rtQ
Once a leader is elected, it will begin writing to an internal
__raft_performance_test
topic with a steady workload of random data.
You can control the workload using the --throughput
and --record-size
arguments passed to test-kraft-server-start.sh
.