KRaft is the operating mode in which Kafka itself replicates metadata and forms a cluster without ZooKeeper. A controller quorum manages a metadata log via Raft consensus, and brokers subscribe to the resulting applied state.
This article walks through the rationale for removing ZooKeeper, the KRaft metadata design, the key properties, deployment and migration, and availability/security design — focusing on the points most likely to appear on the CCAAK exam.
KRaft brings Kafka's control plane in-house: a controller quorum agrees on and replicates the metadata log via Raft. ZooKeeper is no longer required, and cluster formation and metadata updates are completed entirely within the Kafka lifecycle.
In production, the key decisions are: dedicated controller nodes vs. co-located with brokers, sizing an odd-numbered quorum, and storage initialization (kafka-storage.sh). On the exam, property names, startup order, and KRaft-specific concepts (metadata log, snapshots) appear frequently.
| Term | Meaning in KRaft | Exam check points |
|---|---|---|
| Controller Quorum | Group of controllers that agree on and replicate metadata | Odd-sized, voters config, leader concept |
| Metadata Log | Immutable log of metadata changes (managed by controllers) | Relationship between snapshots and replay |
| Snapshot | Compacted point-in-time view of metadata state | Fast bootstrap and recovery procedure |
| process.roles | Assigns broker / controller roles | Whether to co-locate or separate, and gotchas |
| node.id | Unique node ID in KRaft | Difference from the deprecated broker.id |
KRaft logical architecture (high level)
Generating a cluster ID and initializing storage (mandatory steps)
# 1) Generate a cluster ID once (printed as a 22-character base64 string)
$ $KAFKA_HOME/bin/kafka-storage.sh random-uuid
<UUID>
# 2) Format storage on every node with that same ID, using the node's server.properties
$ $KAFKA_HOME/bin/kafka-storage.sh format -t <UUID> -c config/server.properties
In KRaft, the control plane and data plane converge inside Kafka. This eliminates the external ZooKeeper dependency and its operational burden (monitoring, upgrades, scaling), improving consistency and the predictability of metadata-change latency.
Compare them across metadata storage format, consensus algorithm, fault tolerance, and configuration/operational simplicity. It is important to align your mental model with the terminology in the official docs.
| Aspect | ZooKeeper cluster | KRaft (Controller Quorum) |
|---|---|---|
| Control plane | External (ZooKeeper) | In-house in Kafka (controllers) |
| Metadata format | ZNode hierarchy | Immutable log + snapshots |
| Consensus algorithm | Zab | Raft (majority consensus) |
| Dependencies / operations | Requires designing and monitoring a separate cluster | Simplified by single-cluster operation with Kafka |
| Key config | zookeeper.connect | process.roles, controller.quorum.voters etc. |
| Network | Requires opening ZooKeeper ports | Can be isolated via the CONTROLLER listener |
Before / After conceptual diagram
Config comparison example
# Old (ZooKeeper-based)
# zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
# broker.id=1
# New (KRaft)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@node1:9093,2@node2:9093,3@node3:9093
controller.listener.names=CONTROLLER
listeners=PLAINTEXT://node1:9092,CONTROLLER://node1:9093
inter.broker.listener.name=PLAINTEXT
metadata.log.dir=/var/lib/kafka/metadata
log.dirs=/var/lib/kafka/data
Metadata updates such as topic creation and ACL changes are appended to the metadata log by the controller leader and replicated by followers. Snapshots are taken at fixed intervals or thresholds, enabling fast sync on restart and for newly joining nodes.
Brokers subscribe to metadata updates from the controllers and update their in-memory cache, providing the equivalent of ZooKeeper watchers entirely inside Kafka.
| Element | Role | Operational point |
|---|---|---|
| Metadata Record | A single state-change event | Ordering and idempotent application |
| Metadata Log | Ordered, durable log of records | Fault tolerance and replayability |
| Snapshot | Compacted representation of state | Faster recovery, snapshot generation management |
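The relationship between records, the log, and snapshots can be illustrated with a toy model. The Python sketch below is not Kafka's implementation: the `Record` schema, `MetadataState`, and `restore` are hypothetical names chosen for illustration. It only shows why ordered, idempotent application lets a node start from a snapshot and replay the tail of the log to reach the same state as a full replay.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One metadata change, identified by its offset (hypothetical schema)."""
    offset: int
    key: str                # e.g. a topic name
    value: Optional[dict]   # None marks a deletion


@dataclass
class MetadataState:
    """In-memory view built by applying records in offset order."""
    data: dict = field(default_factory=dict)
    last_applied: int = -1

    def apply(self, record: Record) -> None:
        # Ignoring offsets that were already applied keeps replay idempotent.
        if record.offset <= self.last_applied:
            return
        if record.value is None:
            self.data.pop(record.key, None)
        else:
            self.data[record.key] = record.value
        self.last_applied = record.offset


def restore(snapshot: MetadataState, log: list) -> MetadataState:
    """Copy a snapshot, then replay the log on top of it."""
    state = MetadataState(dict(snapshot.data), snapshot.last_applied)
    for record in log:
        state.apply(record)
    return state


log = [
    Record(0, "orders", {"partitions": 3}),
    Record(1, "payments", {"partitions": 6}),
    Record(2, "orders", None),
    Record(3, "audit", {"partitions": 1}),
]
full = restore(MetadataState(), log)          # replay from offset 0
snapshot = restore(MetadataState(), log[:2])  # state as of offset 1
fast = restore(snapshot, log)                 # only offsets 2 and 3 take effect
print(fast.data == full.data)
```

In this model a snapshot is simply the state plus the last applied offset, which is why a restarting or newly joining node can skip everything the snapshot already covers.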
Metadata change flow
Dedicated metadata directory configuration
# Place the metadata log separately on the controller side
metadata.log.dir=/var/lib/kafka/metadata
# Broker data (topic logs) remains in the usual location
log.dirs=/var/lib/kafka/data
The minimum production-ready setup is a 3-node controller quorum. Below is an example with brokers and controllers co-located. Each node uses a unique node.id and address.
It is recommended to isolate the CONTROLLER listener at the network level in production, and apply TLS and authentication.
| Property | Purpose | Example |
|---|---|---|
| process.roles | Role assignment | broker,controller |
| node.id | Unique node ID | 1 |
| controller.quorum.voters | Quorum membership | 1@n1:9093,2@n2:9093,3@n3:9093 |
| controller.listener.names | Listener name used by the controller | CONTROLLER |
| listeners | Listener definitions | PLAINTEXT://n1:9092,CONTROLLER://n1:9093 |
| inter.broker.listener.name | Used for inter-broker traffic | PLAINTEXT |
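Because these properties must agree across nodes, a pre-flight check can catch mistakes before startup. The Python sketch below is a minimal illustration, not a Kafka tool: `parse_voters` and `check_node` are hypothetical helpers, and the rules they encode (a controller's node.id appears in the voter list, the controller listener is defined in listeners, and the quorum size is odd) are a simplified reading of the table above.

```python
def parse_voters(value: str) -> dict:
    """Parse 'id@host:port,...' into {id: (host, port)}."""
    voters = {}
    for entry in value.split(","):
        node_id, _, endpoint = entry.strip().partition("@")
        host, _, port = endpoint.rpartition(":")
        if not node_id.isdigit() or not host or not port.isdigit():
            raise ValueError(f"malformed voter entry: {entry!r}")
        if int(node_id) in voters:
            raise ValueError(f"duplicate node id: {node_id}")
        voters[int(node_id)] = (host, int(port))
    return voters


def check_node(props: dict) -> list:
    """Return a list of human-readable problems for one node's properties."""
    problems = []
    voters = parse_voters(props["controller.quorum.voters"])
    roles = {role.strip() for role in props["process.roles"].split(",")}
    listener_names = {l.split("://")[0].strip() for l in props["listeners"].split(",")}
    controller_listener = props["controller.listener.names"].split(",")[0].strip()

    if len(voters) % 2 == 0:
        problems.append("quorum size is even; an odd size is recommended")
    if "controller" in roles:
        if int(props["node.id"]) not in voters:
            problems.append("node.id is not listed in controller.quorum.voters")
        if controller_listener not in listener_names:
            problems.append("controller listener is missing from listeners")
    return problems


n1 = {
    "process.roles": "broker,controller",
    "node.id": "1",
    "controller.quorum.voters": "1@n1:9093,2@n2:9093,3@n3:9093",
    "controller.listener.names": "CONTROLLER",
    "listeners": "PLAINTEXT://n1:9092,CONTROLLER://n1:9093",
}
print(check_node(n1))
```

Running the same check against every node's file before the first start is cheaper than diagnosing a quorum that never elects a leader.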
Port and listener separation overview
n1: PLAINTEXT  9092 <--> clients/brokers
    CONTROLLER 9093 <--> controller quorum traffic
n2: PLAINTEXT 9092, CONTROLLER 9093
n3: PLAINTEXT 9092, CONTROLLER 9093
server.properties (excerpt for node n1)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@n1:9093,2@n2:9093,3@n3:9093
controller.listener.names=CONTROLLER
listeners=PLAINTEXT://n1:9092,CONTROLLER://n1:9093
inter.broker.listener.name=PLAINTEXT
advertised.listeners=PLAINTEXT://n1:9092
metadata.log.dir=/var/lib/kafka/metadata
log.dirs=/var/lib/kafka/data
For brand-new clusters, building in KRaft is the cleanest option. The sequence is: config → storage init → startup order (controllers → brokers) → verification.
When migrating from an existing ZooKeeper-based cluster, always check the official procedure and supported versions. Mechanisms aimed at zero- or low-downtime migration have been added incrementally, but version compatibility and constraints can change, so rehearse in a staging environment first.
| Step | Purpose | Representative command / focus |
|---|---|---|
| Cluster ID generation | Assign a unique identifier | kafka-storage.sh random-uuid |
| Storage initialization | Prepare the metadata area | kafka-storage.sh format -t <UUID> -c server.properties |
| Startup order | Establish stable consensus | controllers → brokers |
| Health check | Confirm metadata propagation | kafka-topics.sh --create / --describe |
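The steps in the table can be written down as an ordered runbook. The Python sketch below only builds and prints the command lines; it does not execute anything. `bootstrap_plan` is a hypothetical helper, `/opt/kafka` is an assumed install path, and `kafka-server-start.sh` is the standard start script, which the table does not name explicitly. The cluster ID is passed in because it is generated once with `kafka-storage.sh random-uuid` and reused on every node.

```python
import shlex


def bootstrap_plan(kafka_home, cluster_id, config, bootstrap_server, topic="test"):
    """Return (label, argv) pairs in the order they should be run."""
    bin_dir = f"{kafka_home}/bin"
    topics = [f"{bin_dir}/kafka-topics.sh", "--bootstrap-server", bootstrap_server]
    return [
        ("format storage on every node with the same cluster ID",
         [f"{bin_dir}/kafka-storage.sh", "format", "-t", cluster_id, "-c", config]),
        ("start controllers first, then brokers",
         [f"{bin_dir}/kafka-server-start.sh", config]),
        ("create a test topic",
         topics + ["--create", "--topic", topic,
                   "--partitions", "3", "--replication-factor", "3"]),
        ("describe it to confirm metadata propagation",
         topics + ["--describe", "--topic", topic]),
    ]


plan = bootstrap_plan("/opt/kafka", "<UUID>", "config/server.properties", "n1:9092")
for label, argv in plan:
    print(f"# {label}")
    print(shlex.join(argv))
```

Keeping the plan as data makes the ordering explicit and lets the same list drive a dry run or a real rollout.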
Greenfield build flow
Verification commands
# Create a topic
$ $KAFKA_HOME/bin/kafka-topics.sh --bootstrap-server n1:9092 --create --topic test --partitions 3 --replication-factor 3
# Verify metadata from the broker's perspective
$ $KAFKA_HOME/bin/kafka-topics.sh --bootstrap-server n1:9092 --describe --topic test
Availability hinges on an odd-sized quorum. Three or five controllers is typical, balancing latency and fault tolerance. Using dedicated controller nodes adds the benefits of load isolation and fault-domain separation.
For security, make TLS and authentication mandatory on the CONTROLLER listener. For monitoring, surface controller leader elections, replication lag, snapshot events, and metadata-apply latency.
| Quorum size | Tolerated failures | Use case |
|---|---|---|
| 1 | 0 | Lab only (not for production) |
| 3 | 1 | Small production |
| 5 | 2 | Mid-sized clusters with high-availability requirements |
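The figures in the table follow from majority arithmetic. The short Python sketch below reproduces them; the function names are illustrative. It also shows why even sizes are avoided: a fourth voter raises the required majority without raising the number of tolerated failures.

```python
def majority(voters: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """Voters that can be lost while a majority is still reachable."""
    return voters - majority(voters)


for size in (1, 3, 4, 5):
    print(size, majority(size), tolerated_failures(size))
```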
Majority concept (5-node example)
[1][2][3][4][5]
 ^  ^  ^
Requires agreement from 3 nodes (majority). Tolerates up to 2 failures.
Applying TLS to the CONTROLLER listener (conceptual example)
# Actual keystore / truststore configuration should follow your operational standards
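listeners=SSL://n1:9092,CONTROLLER://n1:9093
# Inter-broker traffic must name a listener that exists (there is no PLAINTEXT listener here)
inter.broker.listener.name=SSL
controller.listener.names=CONTROLLER
listener.security.protocol.map=SSL:SSL,CONTROLLER:SSL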
ssl.keystore.location=/path/keystore.jks
ssl.keystore.password=***
ssl.truststore.location=/path/truststore.jks
ssl.truststore.password=***
CCAAK
Question 1
Which of the following is the most appropriate minimal configuration to correctly form a 3-node co-located (broker+controller) cluster in KRaft mode?
Correct answer: A
KRaft does not use ZooKeeper. You list id@host:port entries in controller.quorum.voters, set controller.listener.names to CONTROLLER, and define CONTROLLER:// in listeners. node.id is unique per node. zookeeper.connect and broker.id are not used in KRaft.
Can KRaft and ZooKeeper be used together in the same cluster?
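Not as a steady state. KRaft mode is designed to run without ZooKeeper, and a cluster is not meant to operate on both indefinitely. The one exception is the temporary bridge phase of the official ZooKeeper-to-KRaft migration, during which KRaft controllers coexist with ZooKeeper until the migration is finalized. Build new clusters in KRaft mode, and follow the official migration procedure and supported versions when moving from an existing ZooKeeper-based cluster.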
How many nodes should controller.quorum.voters be sized for?
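Use an odd number. Committing a change requires a majority of voters, so 3 nodes tolerate 1 failure and 5 nodes tolerate 2, while an even size adds a node without adding tolerance (4 voters still tolerate only 1 failure). Use 3 for small clusters and 5 where availability requirements are higher. Going larger than needed only increases latency and operational overhead.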
What happens if you start a node without initializing storage?
Startup fails with a cluster ID mismatch or uninitialized-storage error. Generate a UUID with kafka-storage.sh and format every node with the same cluster ID before starting.
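A cluster ID mismatch can be spotted before startup by comparing what each node was formatted with. The Python sketch below assumes that formatting writes a `meta.properties` file containing a `cluster.id` entry into each configured directory, and that its text has been collected per node; the helper names and the sample IDs are hypothetical.

```python
def read_cluster_id(meta_text: str):
    """Return the cluster.id value from meta.properties text, or None."""
    for line in meta_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "cluster.id":
            return value.strip()
    return None


def check_cluster_ids(meta_by_node: dict):
    """Return (nodes without a cluster.id, whether all found IDs agree)."""
    ids = {node: read_cluster_id(text) for node, text in meta_by_node.items()}
    unformatted = sorted(node for node, cid in ids.items() if cid is None)
    distinct = {cid for cid in ids.values() if cid is not None}
    return unformatted, len(distinct) <= 1


sample = {
    "n1": "# generated by format\ncluster.id=example-id-A\nnode.id=1\n",
    "n2": "cluster.id=example-id-A\nnode.id=2\n",
    "n3": "",  # never formatted
}
print(check_cluster_ids(sample))
```

An empty first element and `True` as the second indicate that every node was formatted with the same ID.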
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.