kafka/raft
José Armando García Sancio b97a130c08
KAFKA-16538; Enable upgrading kraft version for existing clusters (#19416)
This change implements upgrading the kraft version from 0 to 1 in existing clusters.
Previously, clusters were formatted with either version 0 or version 1, and could not
be moved between them.

The kraft version for the cluster metadata partition is recorded using the
KRaftVersion control record. If there is no KRaftVersion control record
the default kraft version is 0.

The kraft version is upgraded using the UpdateFeatures RPC. These RPCs
are handled by the QuorumController and FeatureControlManager. This
change adds special handling in the FeatureControlManager so that
upgrades to the kraft.version are directed to
RaftClient#upgradeKRaftVersion.

To allow the FeatureControlManager to call
RaftClient#upgradeKRaftVersion is a non-blocking fashion, the kraft
version upgrade uses optimistic locking. The call to
RaftClient#upgradeKRaftVersion does validations of the version change.
If the validations succeeds, it generates the necessary control records
and adds them to the BatchAccumulator.

Before the kraft version can be upgraded to version 1, all of the
brokers and controllers in the cluster need to support kraft version 1.
The check that all brokers support kraft version 1 is done by the
FeatureControlManager. The check that all of the controllers support
kraft version is done by KafkaRaftClient and LeaderState.

When the kraft version is 0, the kraft leader starts by assuming that
all voters do not support kraft version 1. The leader discovers which
voters support kraft version 1 through the UpdateRaftVoter RPC. The
KRaft leader handles UpdateRaftVoter RPCs by storing the updated
information in-memory until the kraft version is upgraded to version 1.
This state is stored in LeaderState and contains the latest directory
id, endpoints and supported kraft version for each voter.

Only when the KRaft leader has received an UpdateRaftVoter RPC from all
of the voters will it allow the upgrade from kraft.version 0 to 1.

Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2025-04-22 16:02:51 -07:00
..
bin KAFKA-18388 test-kraft-server-start.sh should use log4j2.yaml (#18370) 2025-01-07 04:20:06 +08:00
config KAFKA-18388 test-kraft-server-start.sh should use log4j2.yaml (#18370) 2025-01-07 04:20:06 +08:00
src KAFKA-16538; Enable upgrading kraft version for existing clusters (#19416) 2025-04-22 16:02:51 -07:00
.gitignore KAFKA-13429: ignore bin on new modules (#11415) 2021-11-10 14:36:24 -06:00
README.md KAFKA-17728 Add missing config `replica-directory-id` to raft README (#17518) 2024-10-17 11:32:27 +08:00

README.md

KRaft (Kafka Raft)

KRaft (Kafka Raft) is a protocol based on the Raft Consensus Protocol tailored for Apache Kafka.

This is used by Apache Kafka in the KRaft (Kafka Raft Metadata) mode. We also have a standalone test server which can be used for performance testing. We describe the details to set this up below.

Run Single Quorum

bin/test-kraft-server-start.sh --config config/kraft.properties --replica-directory-id b8tRS7h4TJ2Vt43Dp85v2A

Run Multi Node Quorum

Create 3 separate KRaft quorum properties as the following:

cat << EOF >> config/kraft-quorum-1.properties

node.id=1
listeners=PLAINTEXT://localhost:9092
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-1
EOF

cat << EOF >> config/kraft-quorum-2.properties

node.id=2
listeners=PLAINTEXT://localhost:9093
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-2
EOF

cat << EOF >> config/kraft-quorum-3.properties

node.id=3
listeners=PLAINTEXT://localhost:9094
controller.listener.names=PLAINTEXT
controller.quorum.voters=1@localhost:9092,2@localhost:9093,3@localhost:9094
log.dirs=/tmp/kraft-logs-3
EOF

Open up 3 separate terminals, and run individual commands:

bin/test-kraft-server-start.sh --config config/kraft-quorum-1.properties --replica-directory-id b8tRS7h4TJ2Vt43Dp85v2A
bin/test-kraft-server-start.sh --config config/kraft-quorum-2.properties --replica-directory-id Nkij_D9XRiYKNb41SiJo7Q
bin/test-kraft-server-start.sh --config config/kraft-quorum-3.properties --replica-directory-id 4-e97nI7eHPYKfEDtW8rtQ

Once a leader is elected, it will begin writing to an internal __raft_performance_test topic with a steady workload of random data. You can control the workload using the --throughput and --record-size arguments passed to test-kraft-server-start.sh.