Kafka Rack Awareness: AZ-Spread Replica Placement (2026)

Rack Awareness is a built-in feature that physically or logically separates brokers and replicas (across racks, availability zones, or data centers) to reduce the chance that a single failure causes an outage or data loss. The two operational keys are: set the correct rack on every broker, and tune replication settings on the assumption that a rack can fail.

This article walks through broker placement, the partition placement algorithm, RF combined with min.insync.replicas, and failure-time behavior, all grounded in stable behavior from the official documentation. We close with the CCAAK-specific points to remember for the exam.

Goals and Prerequisites of Rack Awareness

Kafka's Rack Awareness spreads replicas across different racks as much as possible when topics are created or replicas are reassigned. As a result, even if an entire rack fails or is taken down for planned maintenance, an in-sync replica on another rack can take over leadership and continue serving reads and writes.

The prerequisite is simple: every broker must have a consistent rack label (broker.rack). Without it, the controller cannot use rack information and the risk of replicas piling up on the same rack rises sharply.

Rack Awareness is about replica placement; it is a separate feature from cross-cluster replication and from client-side near-replica reads.
Fault tolerance only materializes when rack placement is combined with RF (replication.factor), min.insync.replicas, and the producer acks setting.
When the number of racks < RF, placing every replica on its own rack is impossible and some replicas will share a rack.
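
As a rough arithmetic check of the last point, the sketch below computes how many replicas of one partition end up on the most crowded rack when replicas are spread as evenly as possible. The function name is illustrative only and is not part of Kafka.

```python
import math

def max_replicas_on_one_rack(rf, racks):
    """Replica count on the most crowded rack when replicas are spread evenly."""
    if rf < 1 or racks < 1:
        raise ValueError("rf and racks must be positive")
    return math.ceil(rf / racks)

for rf, racks in [(3, 3), (3, 2), (5, 3)]:
    crowded = max_replicas_on_one_rack(rf, racks)
    print(f"RF={rf}, racks={racks}: losing the most crowded rack removes at least {crowded} replica(s)")
```

With RF=3 on only 2 racks, for example, one rack necessarily holds 2 replicas, so losing that rack leaves a single replica, which is below min.insync.replicas=2.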

broker.rack Configuration and Safe Rolling Application

broker.rack is a static per-broker setting. Rack and AZ names must be consistent across the cluster, with no typos or accidental duplicates. After setting it, apply with a rolling restart. Replica placement for existing topics does not change automatically, so plan reassignments where needed.

Verify in practice by inspecting broker configuration via the admin API, or by creating a fresh test topic and checking that its replicas are spread across different racks.

Build and review a mapping table from broker ID to physical rack / AZ.
Set broker.rack on every broker and apply with a rolling restart.
Create a test topic (RF ≥ number of racks) and verify replica spread across racks.
To spread existing topics across racks, plan a reassignment (and do not forget to set throttles).
Be cautious when renaming racks; an incorrect change can cause temporary leader imbalance and a spike in re-syncs.
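
The mapping-table review in the first step can be partly automated. The following is a minimal sketch, assuming the table is available as a Python dict from broker ID to rack label; the function name and the sample mapping are hypothetical. It flags brokers without a rack and labels that differ only by case or whitespace, which are the typical typo patterns.

```python
def check_rack_mapping(broker_racks):
    """Return a list of problems found in a broker ID -> rack label mapping.

    A rack value of None or "" means broker.rack is not set for that broker.
    """
    problems = []
    for broker_id, rack in sorted(broker_racks.items()):
        if not rack:
            problems.append(f"broker {broker_id}: broker.rack is not set")
        elif rack != rack.strip():
            problems.append(f"broker {broker_id}: rack label {rack!r} has stray whitespace")

    # Labels that differ only by case or surrounding whitespace are likely typos.
    by_normalized = {}
    for rack in broker_racks.values():
        if rack:
            by_normalized.setdefault(rack.strip().lower(), set()).add(rack)
    for variants in by_normalized.values():
        if len(variants) > 1:
            problems.append(f"inconsistent labels for one rack: {sorted(variants)}")
    return problems

# Hypothetical mapping for illustration only.
mapping = {1: "r1", 2: "r1", 3: "r2", 4: "R2", 5: "r3", 6: None}
for problem in check_rack_mapping(mapping):
    print(problem)
```

An empty result does not prove the labels match the physical layout; it only rules out the most common labeling mistakes before the rolling restart.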

Example broker.rack configuration and verification topic creation

# server.properties on each broker (example)
broker.id=1
broker.rack=r1
# other settings omitted

# On Kubernetes and similar platforms, the value may instead be templated from an environment variable
# KAFKA_BROKER_RACK=r1

# Create a test topic with RF=3 for verification
kafka-topics --bootstrap-server <host:port> \
  --create --topic rack-test --partitions 3 --replication-factor 3

# Check placement (are the replicas spread across multiple racks?)
kafka-topics --bootstrap-server <host:port> --describe --topic rack-test
# Alternatively, read the broker configuration (broker.rack) through the AdminClient API
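
Reading the describe output by eye gets tedious with many partitions. The sketch below shows one possible way to check the spread in Python: it parses the per-partition lines of the describe output and maps each replica to its rack. The sample output and the broker-to-rack mapping are made up for illustration, and the exact output layout may differ between Kafka versions.

```python
import re

def racks_per_partition(describe_output, broker_racks):
    """Map each partition number to the set of racks hosting its replicas."""
    result = {}
    for line in describe_output.splitlines():
        match = re.search(r"Partition:\s*(\d+).*?Replicas:\s*([\d,]+)", line)
        if match:
            partition = int(match.group(1))
            replicas = [int(b) for b in match.group(2).strip(",").split(",")]
            result[partition] = {broker_racks[b] for b in replicas}
    return result

# Hypothetical describe output and rack mapping.
sample = """\
Topic: rack-test\tPartition: 0\tLeader: 1\tReplicas: 1,3,5\tIsr: 1,3,5
Topic: rack-test\tPartition: 1\tLeader: 2\tReplicas: 2,4,6\tIsr: 2,4,6
Topic: rack-test\tPartition: 2\tLeader: 3\tReplicas: 3,1,2\tIsr: 3,1,2
"""
broker_racks = {1: "r1", 2: "r1", 3: "r2", 4: "r2", 5: "r3", 6: "r3"}

for partition, racks in sorted(racks_per_partition(sample, broker_racks).items()):
    status = "OK" if len(racks) == 3 else "replicas share a rack"
    print(partition, sorted(racks), status)
```

In this made-up sample, partition 2 has two replicas on r1, which is the kind of placement a reassignment would need to fix.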

Partition Placement Algorithm and How Fault Isolation Works

At topic creation time, the controller takes broker.rack into account and places replicas on different racks where possible. It also balances per-broker partition and replica counts so that no single rack becomes overloaded.

When a rack fails, the leader is elected from an in-sync replica on another rack. With RF=3, min.insync.replicas=2, and acks=all, the cluster keeps accepting writes even after losing one rack (= one replica).

Rule of thumb for surviving a single-rack failure: RF=3, min.insync.replicas=2, acks=all, unclean.leader.election.enable=false.
When the number of racks < RF, replicas are spread as much as possible but full isolation cannot be guaranteed.
Reassignment tools can also propose rack-aware placements (check the tool's default behavior before relying on it in production).

Example of rack spread (RF=3, 3 racks, 6 brokers)
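
The RF=3, 3-rack, 6-broker case can be illustrated with a small simulation. This is a simplified sketch of the general idea (order brokers so that racks alternate, then assign replicas round-robin while skipping racks that already hold a replica); it is not Kafka's actual assignment code, and the function names and broker IDs are assumptions made for the example.

```python
from itertools import zip_longest

def interleave_by_rack(broker_racks):
    """Order brokers so that consecutive entries come from different racks."""
    by_rack = {}
    for broker, rack in sorted(broker_racks.items()):
        by_rack.setdefault(rack, []).append(broker)
    columns = zip_longest(*(by_rack[r] for r in sorted(by_rack)))
    return [b for column in columns for b in column if b is not None]

def assign(broker_racks, partitions, rf):
    """Return {partition: [replica broker IDs]} with replicas spread over racks."""
    if rf > len(broker_racks):
        raise ValueError("replication factor exceeds the number of brokers")
    ordered = interleave_by_rack(broker_racks)
    n_racks = len(set(broker_racks.values()))
    assignment = {}
    for p in range(partitions):
        replicas, used_racks = [], set()
        i = p  # shift the starting broker per partition to balance load
        while len(replicas) < rf:
            broker = ordered[i % len(ordered)]
            i += 1
            if broker in replicas:
                continue
            rack = broker_racks[broker]
            # Skip a rack that already holds a replica until every rack has one.
            if rack in used_racks and len(used_racks) < n_racks:
                continue
            replicas.append(broker)
            used_racks.add(rack)
        assignment[p] = replicas
    return assignment

brokers = {1: "r1", 2: "r1", 3: "r2", 4: "r2", 5: "r3", 6: "r3"}
for partition, replicas in assign(brokers, partitions=6, rf=3).items():
    print(partition, replicas, [brokers[b] for b in replicas])
```

Each partition ends up with one replica per rack, and the first replica rotates across brokers. When the number of racks is smaller than RF, the same loop falls back to reusing racks once every rack holds a replica.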

Topic Design: Key Points for RF, min.insync.replicas, and acks

To survive rack failures, you need more than just replica spread; you also need an appropriate write-quorum configuration. In particular, treat RF, min.insync.replicas (minISR), and producer acks as a single tuning unit, and understand the trade-off between continuity during a failure and data safety.

For a 3-rack setup, RF=3, minISR=2, and acks=all is the general recommendation. This keeps writes flowing through a single-rack loss. Pair it with unclean.leader.election.enable=false so that an out-of-sync replica is never elected leader, which prevents data loss.

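minISR can be set per topic or as a broker default. Keep it smaller than RF: if minISR equals RF, losing any single replica already blocks acks=all writes. Whenever the in-sync replica count drops below minISR during a failure, acks=all writes fail.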
Without acks=all, writes can appear to succeed during a rack failure but later turn out to have lost data.
When the partition count is high, intra-rack balance across brokers also becomes important to avoid hotspots.

Configuration	Rack-failure tolerance	Write availability during failure	Typical use case / caveats
broker.rack not set + RF=3 + minISR=2 + acks=all	Theoretically possible but actual placement may concentrate on one rack	Depending on placement, writes may fail	First step: get broker.rack set on every broker
broker.rack set + RF=3 + minISR=2 + acks=all	Handles a single-rack failure	Available (while 2 replicas remain healthy)	Standard CCAAK answer; disable unclean leader election
broker.rack set + RF=2 + minISR=2 + acks=all	Weak against single-rack failures (RF=2 fails on one-rack loss)	Not available (quorum not met)	Low cost, but unsuitable where rack-failure tolerance is required
broker.rack set + RF=3 + minISR=1 + acks=1	Replica spread works but data safety is weak	Available (but with consistency risk)	For temporary, latency-first workloads; not recommended for critical data
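
The write-availability column of the table follows from a simple rule, sketched here in Python. The function is a hypothetical model, not a Kafka API, and it assumes every replica was in sync before the rack failed.

```python
def writes_available(replica_racks, min_isr, acks, failed_rack):
    """True if producers can still write to a partition after failed_rack goes down.

    replica_racks: rack label of each replica of the partition.
    Assumes all replicas were in sync before the failure.
    """
    surviving = [rack for rack in replica_racks if rack != failed_rack]
    if not surviving:
        return False  # no replica left that could become leader
    if acks == "all":
        # min.insync.replicas is only enforced for acks=all producers.
        return len(surviving) >= min_isr
    return True  # acks=0 or acks=1 only needs a live leader

scenarios = [
    ("no broker.rack, replicas clustered", ["r1", "r1", "r2"], 2, "all"),
    ("broker.rack set, RF=3, minISR=2",    ["r1", "r2", "r3"], 2, "all"),
    ("broker.rack set, RF=2, minISR=2",    ["r1", "r2"],       2, "all"),
    ("broker.rack set, RF=3, minISR=1",    ["r1", "r2", "r3"], 1, "1"),
]
for name, racks, min_isr, acks in scenarios:
    print(f"{name}: writes after losing r1 = {writes_available(racks, min_isr, acks, 'r1')}")
```

The model only answers whether writes are accepted. It says nothing about durability, which is why the last scenario stays writable yet remains risky for critical data.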

Behavior During Failures and Maintenance, and Practical Patterns

For planned maintenance, avoid rack-wide work and prefer rolling at the broker level. If a full-rack shutdown is unavoidable, review minISR and traffic patterns in advance and, if needed, apply temporary throttles or tune producer back-off.

When reassigning replicas, generate a rack-aware plan and set rebalance throttles appropriately. From a monitoring angle, focus on whether partition leadership is skewed toward a particular rack and whether ISR shrink events are cascading.

Rebalance throttle settings (the reassignment tool's --throttle option, which is applied on the brokers as leader.replication.throttled.rate and follower.replication.throttled.rate)
Run preferred leader election deliberately to balance leaders across racks
Split large partition moves into smaller batches and watch indicators (UnderReplicatedPartitions, ISR churn) as you go
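
Splitting a large move into batches can be scripted. The sketch below assumes the target placement has already been decided and is given as a list of tuples; it emits one reassignment JSON document per batch in the format accepted by the kafka-reassign-partitions tool. The topic name and broker IDs are hypothetical.

```python
import json

def reassignment_batches(moves, batch_size):
    """Yield reassignment JSON documents covering at most batch_size partitions each.

    moves: list of (topic, partition, [replica broker IDs]) tuples.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(moves), batch_size):
        batch = moves[start:start + batch_size]
        yield json.dumps(
            {
                "version": 1,
                "partitions": [
                    {"topic": topic, "partition": partition, "replicas": replicas}
                    for topic, partition, replicas in batch
                ],
            },
            indent=2,
        )

# Hypothetical target placement for five partitions of one topic.
moves = [("orders", p, [1 + p % 2, 3 + p % 2, 5 + p % 2]) for p in range(5)]
for number, document in enumerate(reassignment_batches(moves, batch_size=2), start=1):
    print(f"--- batch {number} ---")
    print(document)
```

Each document would be saved to a file and executed on its own with a throttle, waiting for UnderReplicatedPartitions to return to zero before starting the next batch.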

CCAAK Exam Checklist and Pitfalls

CCAAK tests practical fundamentals: the rack-aware replica spread mechanism, how RF/minISR/acks interact, whether writes succeed during a rack failure, and the rollout order (broker.rack set on every broker). Covering the points below helps you avoid losing easy marks.

Also, client-side near-replica reads are a separate configuration area; do not confuse them with Rack Awareness (which is about replica placement).

Rack spread does not work unless broker.rack is set on every broker.
RF=3, minISR=2, acks=all is the canonical configuration that survives a single-rack failure.
When the number of racks < RF, full isolation is impossible and some replicas may share a rack.
Setting unclean.leader.election.enable=false prevents data loss (at the cost of availability).
Existing topic placements do not change automatically unless you reassign replicas.

Check Your Understanding


Question 1

On a Kafka cluster with brokers evenly placed across 3 racks (r1, r2, r3), you want to keep writes flowing through a full-rack failure while avoiding data loss. Which combination is most appropriate?

A. Set broker.rack on every broker, RF=3, min.insync.replicas=2, producer acks=all, unclean.leader.election.enable=false
B. Set broker.rack on every broker, RF=2, min.insync.replicas=2, producer acks=all
C. Set broker.rack on only some brokers, RF=3, min.insync.replicas=1, producer acks=1
D. broker.rack unset, RF=3, min.insync.replicas=2, producer acks=all (placement left to the tool)

Correct answer: A (the first option)

To plan for rack failures, replicas must be spread across racks (broker.rack on every broker), the quorum conditions RF=3 and minISR=2 must be met, and acks=all must confirm commits. Disable unclean leader election to prevent data loss. RF=2 fails the quorum on a rack loss, and missing or partial rack settings do not guarantee spread.

Frequently Asked Questions

If I change broker.rack later, will existing topic replicas move automatically?

No, they do not move automatically. The rack value is only used in placement calculations for newly created topics or reassignments. To spread existing partitions across racks, you need to plan and execute a reassignment with the reassignment tool or admin API.

What happens if the replication factor is larger than the number of racks?

Kafka spreads replicas across as many racks as possible, but the remaining replicas end up on the same rack as another replica. Full rack isolation is impossible in this case, so depending on requirements consider adding racks, lowering RF, or revising the placement policy.

Can consumers be made to read preferentially from replicas in the same rack?

This is a separate concern from rack-aware replica placement. Kafka offers client- and broker-side settings that prefer fetching from nearby replicas, by combining the consumer's client.rack with the broker's replica selector setting (replica.selector.class). Availability and defaults vary by version, so check the official documentation for the version you are running.


Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
