
Choosing Kafka Replication Factor: Designing Fault Tolerance Around RF=3

2026-04-19
NicheeLab Editorial Team

In production Kafka, setting a topic's Replication Factor (RF) to 3 is the standard practice. It tolerates a single broker failure, limits data-loss risk, and strikes a sensible balance between availability and cost.

However, RF=3 alone is not enough. It has to be designed together with acks=all aligned to min.insync.replicas, unclean.leader.election.enable=false, rack-aware placement, and throttling during reassignment.

Why RF=3 Is the Default

RF determines the number of replicas per partition and directly drives both fault tolerance and cost. RF=3 assumes one broker failure and offers a good balance of write continuity and data protection. Pair it with acks=all and min.insync.replicas=2 to tolerate one replica being down while preventing loss of committed data.

RF=2 looks cost-efficient at first glance, but with min.insync.replicas=2 a single failure immediately stops acks=all writes, and with min.insync.replicas=1 writes continue but data-loss risk rises. RF=1 is not recommended in production.

  • Storage consumption grows to roughly RF× the input volume, and replication adds inter-broker traffic of about (RF-1)× the produce rate.
  • RF=3 + min.insync.replicas=2 + acks=all makes it easy to keep both availability and durability during a single failure.
  • Setting the broker-side default.replication.factor to 3 prevents accidental misconfiguration.
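
The relationship between RF, min.insync.replicas, and the number of replica failures that acks=all writes can survive reduces to a subtraction. The following is a minimal sketch; tolerable_failures is a hypothetical helper name, not a Kafka tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kafka): how many replicas of a partition
# can be lost while acks=all writes are still accepted.
tolerable_failures() {
  local rf=$1 min_isr=$2
  if (( min_isr > rf )); then
    echo "invalid: min.insync.replicas exceeds RF" >&2
    return 1
  fi
  echo $(( rf - min_isr ))
}

tolerable_failures 3 2   # RF=3, minISR=2
tolerable_failures 2 2   # RF=2, minISR=2
tolerable_failures 2 1   # RF=2, minISR=1
```

The RF=2, minISR=1 case also reports one tolerable failure, but only because a write can be acknowledged with a single in-sync copy, which is the data-loss risk described above.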

Example: explicitly set RF=3 at topic creation time

kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic orders \
  --partitions 12 --replication-factor 3

# Set sensible defaults (broker configuration example)
# server.properties
# default.replication.factor=3
# min.insync.replicas can be set at either the topic or the broker level

Write Durability: Aligning acks and min.insync.replicas

acks=all requires commit acknowledgment from the leader and the entire ISR (in-sync replicas). min.insync.replicas (minISR) is the lower bound on ISR size needed to accept a write. For RF=3, we recommend minISR=2: writes continue as long as ISR stays at 2 or above, even when one replica is down.

Setting minISR equal to RF means a single failure stops writes immediately. Setting minISR too low keeps writes flowing but increases data-loss risk during failures. acks=1 has low latency but acknowledges success before followers replicate, so it should be avoided in production.

  • Recommended combo: RF=3, acks=all, min.insync.replicas=2, enable.idempotence=true
  • Write-continuity condition: ISR size >= min.insync.replicas
  • ISR shrinks due to follower lag or failure, so replication bandwidth and disk performance are also design factors.
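
The write-continuity condition can be modeled in a few lines. This is only a sketch of the rule stated above (ISR size >= min.insync.replicas for acks=all requests); can_accept_write is a hypothetical name and the real check happens inside the broker.

```shell
#!/usr/bin/env bash
# Hypothetical model of the check applied to acks=all produce requests.
# Prints "accept" when the ISR is large enough, "reject" otherwise.
can_accept_write() {
  local isr_size=$1 min_isr=$2
  if (( isr_size >= min_isr )); then
    echo "accept"
  else
    echo "reject"   # the producer would see a NotEnoughReplicas-style error
  fi
}

# RF=3, min.insync.replicas=2: walk the ISR down from 3 to 1.
for isr in 3 2 1; do
  echo "ISR=${isr}: $(can_accept_write "$isr" 2)"
done
```

With these inputs the loop accepts writes at ISR sizes 3 and 2 and rejects them at 1, which matches the single-failure tolerance of the recommended combination.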

Example producer configuration (Java/Properties)

bootstrap.servers=broker-1:9092,broker-2:9092,broker-3:9092
acks=all
retries=2147483647
max.in.flight.requests.per.connection=1
enable.idempotence=true
# Tune batch.size / linger.ms with the balance between latency and fault tolerance in mind

Failure Scenarios and Availability: Single Broker and AZ Failures

With RF=3 and minISR=2, acks=all writes continue during a single broker failure as long as ISR stays at 2 or above. Using rack awareness to spread replicas across AZs or racks improves resilience to a single AZ failure.

For data integrity, set unclean.leader.election.enable=false. That prevents out-of-sync (non-ISR) replicas from being promoted to leader, prioritizing consistency over availability. Setting it to true may restore writes faster, but messages that were already acknowledged can be lost once a replica with a shorter log becomes the leader.

  • Assume single-failure: RF=3 + minISR=2 keeps writes flowing; disable unclean leader election.
  • AZ placement: set broker.rack and spread replicas across AZs.
  • Consumers can fetch from followers, but design this according to your consistency requirements (most clients still read from the leader).
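
The failure behavior described in this section can be summarized as a small decision function. This is a simplified sketch under stated assumptions, not Kafka's controller logic: partition_state and its state names are hypothetical, and the unclean branch assumes at least one out-of-sync replica is still alive.

```shell
#!/usr/bin/env bash
# Hypothetical decision sketch, not Kafka's controller logic.
# Args: live ISR replica count, min.insync.replicas, unclean flag (true|false)
partition_state() {
  local live_isr=$1 min_isr=$2 unclean=$3
  if (( live_isr >= min_isr )); then
    echo "writable"          # acks=all writes continue
  elif (( live_isr > 0 )); then
    echo "writes-blocked"    # a leader exists, but acks=all writes are rejected
  elif [[ $unclean == true ]]; then
    echo "unclean-leader"    # an out-of-sync replica may lead; data loss possible
  else
    echo "offline"           # partition waits for an ISR replica to return
  fi
}

partition_state 2 2 false   # one broker down with RF=3
partition_state 1 2 false   # two brokers down
partition_state 0 2 false   # all ISR replicas down, unclean disabled
partition_state 0 2 true    # all ISR replicas down, unclean enabled
```

The last two calls show the tradeoff: disabling unclean leader election keeps the partition offline rather than accepting a leader that may be missing acknowledged data.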

Example of protective configuration per topic

kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --topic orders \
  --add-config min.insync.replicas=2,unclean.leader.election.enable=false

# Verify
echo "describe configs"
kafka-configs.sh --bootstrap-server localhost:9092 \
  --describe --topic orders

Quantifying the RF vs. Bandwidth/Storage Tradeoff

Higher RF increases storage linearly and replication traffic by roughly (RF-1)×. If the produce rate is 50 MB/s with RF=3, replication adds about 50 × (3-1) = 100 MB/s of inter-broker ingress on top of the producer traffic. Design with network and disk headroom in mind.

RF=5 is sometimes adopted for mission-critical workloads, but the cost rises sharply. The practical approach is to start from RF=3 and reassess it against your SLOs and failure assumptions (AZ/rack-level).

  • Rough formula: storage ≈ input volume × RF (+ index/overhead)
  • Rough formula: replication ingress ≈ produce rate × (RF-1)
  • Account for re-replication peaks during maintenance (smooth them with throttling).

RF and minISR      Tolerable failures (writes continue)   Expected data-loss risk                          Storage multiplier
RF=1 (minISR=1)    0                                      High                                             1x
RF=2 (minISR=2)    0                                      Medium (with unclean disabled)                   2x
RF=2 (minISR=1)    1                                      High (writes proceed despite inconsistency)      2x
RF=3 (minISR=2)    1                                      Low                                              3x
RF=5 (minISR=3)    2                                      Low                                              5x

Rough back-of-envelope calculation (shell)

# Input of 50 MB/s with RF=3
IN_MBPS=50
RF=3
REPL_MBPS=$(( IN_MBPS * (RF-1) ))
echo "Replication ingress ~ ${REPL_MBPS} MB/s"

# Rough storage estimate for a topic ingesting 500 GB/day at RF=3 (compression off)
DAILY_GB=500
STORAGE_GB=$(( DAILY_GB * RF ))
echo "Storage per day ~ ${STORAGE_GB} GB (+index/overhead)"
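
The same arithmetic can be extended to a per-broker view. The sketch below assumes partitions and leaders are spread evenly across brokers, which real clusters only approximate; per_broker_write_mbps and the broker counts are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sizing helper; assumes partitions and leaders are spread evenly.
# Args: produce rate (MB/s), RF, broker count
per_broker_write_mbps() {
  local in_mbps=$1 rf=$2 brokers=$3
  local total=$(( in_mbps * rf ))               # leader writes plus follower writes
  echo $(( (total + brokers - 1) / brokers ))   # average per broker, rounded up
}

per_broker_write_mbps 50 3 3   # 150 MB/s of total disk writes over 3 brokers
per_broker_write_mbps 50 3 6   # the same load over 6 brokers
```

Total disk write load is the produce rate times RF because every replica, leader included, persists each message. Adding brokers lowers the per-broker share but not the total.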

Placement Strategy: Rack Awareness and AZ Spread

Once broker.rack is set, Kafka's partition assignment spreads replicas across different racks/AZs as much as possible. With RF=3, the ideal layout places one replica in each of three AZs. Design to keep minISR satisfied even during an AZ failure.

Avoid manual replica placement; configure broker rack metadata correctly and let topic creation handle assignment. For existing topics, use the reassignment tool to relocate in a planned, controlled way.

  • Set broker.rack on each broker (e.g. az-a, az-b, az-c).
  • Let topic creation use the automatic rack-aware assignment by default.
  • For RF=3, a 3-AZ layout is ideal. With only 2 AZs, watch out for cases where an AZ failure can no longer satisfy minISR.
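
The 2-AZ caveat can be checked with a worst-case count. This sketch assumes replicas are spread as evenly as possible across AZs and that the AZ holding the most replicas is the one that fails; survives_az_failure is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical check: assumes replicas are spread as evenly as possible across AZs.
# Args: RF, number of AZs, min.insync.replicas
survives_az_failure() {
  local rf=$1 azs=$2 min_isr=$3
  local worst_az=$(( (rf + azs - 1) / azs ))   # most replicas any one AZ can hold
  local survivors=$(( rf - worst_az ))         # replicas left if that AZ fails
  if (( survivors >= min_isr )); then
    echo "writes continue (replicas left: ${survivors})"
  else
    echo "writes stop (replicas left: ${survivors})"
  fi
}

survives_az_failure 3 3 2   # RF=3 across 3 AZs
survives_az_failure 3 2 2   # RF=3 across 2 AZs
```

With two AZs, some partitions necessarily keep two of their three replicas in one AZ, so losing that AZ leaves a single replica and minISR=2 can no longer be met for those partitions.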

Replica spread in a 3-AZ layout (example for P0)

Broker-1: AZ-a / broker.rack=a / P0 Leader
Broker-2: AZ-b / broker.rack=b / P0 Follower
Broker-3: AZ-c / broker.rack=c / P0 Follower
ISR = {Broker-1, Broker-2, Broker-3}
Placement that satisfies RF=3, min.insync.replicas=2, acks=all

Basic rack-awareness configuration

# server.properties on each broker
broker.rack=az-a   # set az-a / az-b / az-c per broker

# Create the topic (uses automatic rack-aware assignment)
kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic payments --partitions 24 --replication-factor 3

Operational Guardrails: Reassignment, Leader Election, Throttling

Adding or decommissioning brokers, or changing RF, triggers partition reassignment. Throttle the re-replication bandwidth to limit the impact on normal traffic. For long-running operations, consider a maintenance window.

Leader skew degrades latency, so run preferred leader election when needed to rebalance. Keep unclean leader election disabled and prioritize data preservation over availability.

  • Plan reassignments and pair them with throttling (replication quotas, for example the --throttle option of the reassignment tool).
  • RF changes require reassignment — a plain alter cannot do it.
  • Monitor peak bandwidth and ISR stability, and act before you breach minISR.
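
When planning a maintenance window, a rough duration estimate for a throttled reassignment helps. The sketch below assumes the throttle is the only bottleneck and that a single destination broker receives all the data; it ignores ongoing produce traffic. reassign_minutes is a hypothetical helper, and the 500 GiB figure is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical estimate; assumes the throttle is the bottleneck and ignores
# new data arriving while the move is in progress.
# Args: bytes to move, throttle in bytes/sec (the unit --throttle expects)
reassign_minutes() {
  local bytes=$1 throttle=$2
  local secs=$(( (bytes + throttle - 1) / throttle ))   # round up to whole seconds
  echo $(( (secs + 59) / 60 ))                          # round up to whole minutes
}

# Moving 500 GiB at 100 MiB/s (104857600 B/s)
reassign_minutes $(( 500 * 1024 * 1024 * 1024 )) 104857600
```

If the throttle is set below the rate at which the moved partitions receive new data, the reassignment may never catch up, so the throttle needs to stay above the incoming write rate.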

Common operational commands

# Increase the partition count (RF is unchanged)
kafka-topics.sh --bootstrap-server localhost:9092 \
  --alter --topic orders --partitions 36

# Generate and apply a reassignment plan (example)
# 1) Prepare the JSON (target topics/brokers)
# 2) Run the tool with --execute and --throttle
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
  --reassignment-json-file plan.json --execute --throttle 104857600

# Preferred leader election (corrects leader skew)
# --topic must be paired with --partition; use --all-topic-partitions to cover everything
kafka-leader-election.sh --bootstrap-server localhost:9092 \
  --election-type preferred --topic orders --partition 0
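
To act before minISR is breached, it helps to flag partitions whose ISR has shrunk. The sketch below parses lines shaped like kafka-topics.sh --describe output; it assumes each partition line contains "Partition: <n>" and "Isr: <ids>", and the sample input is illustrative rather than captured from a real cluster. under_min_isr is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads describe-style lines on stdin and prints
# partitions whose ISR is smaller than the given minimum.
under_min_isr() {
  local min_isr=$1
  awk -v min="$min_isr" '
    {
      part = ""; isr = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Isr:")       isr  = $(i + 1)
      }
      if (part != "" && isr != "") {
        n = split(isr, ids, ",")          # count broker IDs in the ISR list
        if (n < min) print "partition " part " ISR=" n
      }
    }'
}

# Illustrative input, not captured from a real cluster.
printf '%s\n' \
  "Topic: orders Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3" \
  "Topic: orders Partition: 1 Leader: 2 Replicas: 2,3,1 Isr: 2" |
  under_min_isr 2
```

In practice the describe command's own --under-min-isr-partitions filter or broker metrics are likely a better fit; the parsing here only makes the condition explicit.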

Exam Tips: CCAAK Key Points and Pitfalls

Memorize the standard pattern as a four-piece set: RF=3, acks=all with minISR=2, unclean.leader.election.enable=false, and broker.rack. Practice explaining it against failure scenarios.

RF=2 pitfalls: with minISR=1, a single failure keeps writes flowing but raises data-loss risk; with minISR=2, a single failure halts writes. Both show up frequently as exam options.

Higher RF raises storage and bandwidth costs linearly, so adopting RF=5 hinges on whether your SLOs and AZ/region requirements justify it.

  • Terminology checklist: RF, ISR, acks, min.insync.replicas, unclean.leader.election, rack awareness.
  • Prefer consistency over availability: disable unclean. acks=all + minISR is typically tested as a pair.
  • Cost design: storage ≈ RF×, replication bandwidth ≈ (RF-1)×.

Cheat sheet (with comments)

# Recommended baseline (production)
# RF=3, acks=all, min.insync.replicas=2, enable.idempotence=true
# unclean.leader.election.enable=false, broker.rack=<AZ/Rack>
# Rack-aware spread across 3 AZs; writes continue through a single failure
# Cost: storage 3x, replication bandwidth 2x

Check Your Understanding

CCAAK

Question 1

You want to maximize both write availability and data preservation in a production Kafka cluster during a single broker failure. The cluster has brokers evenly placed across 3 AZs. Which combination is most appropriate?

  A. Topic RF=3, acks=all, min.insync.replicas=2, unclean.leader.election.enable=false, broker.rack set per AZ
  B. Topic RF=2, acks=1, min.insync.replicas=1, unclean.leader.election.enable=true
  C. Topic RF=3, acks=1, min.insync.replicas=1, unclean.leader.election.enable=false
  D. Topic RF=1, acks=all, min.insync.replicas=1, unclean.leader.election.enable=false

Correct answer: A

RF=3 + acks=all + minISR=2 keeps writes flowing during a single broker failure while preserving data. Rack awareness spreads replicas across AZs, and disabling unclean leader election prevents data loss. B and C have weak acks/minISR settings, and D's RF=1 leaves no fault tolerance.

Frequently Asked Questions

When should RF be larger than 3?

Consider RF=5 when you need to tolerate an AZ failure plus an additional node failure within the same region, or when strict compliance requirements demand extra redundancy. Storage and replication bandwidth grow accordingly, so justify the choice with clear SLO and cost rationale.

How do you change the RF of an existing topic?

Changing RF requires partition reassignment. Build a reassignment plan that includes the new replica broker targets and run the re-replication. Configure replication throttling during the operation to limit the impact on normal traffic.
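
A reassignment plan is a JSON file listing the target replica set for each partition. The sketch below generates one that raises a topic to RF=3 using simple round-robin placement; make_plan is a hypothetical helper, the broker IDs are illustrative, and the placement ignores racks and the current assignment, so review the output before applying it.

```shell
#!/usr/bin/env bash
# Hypothetical generator for a reassignment plan that sets RF=3.
# Args: topic, partition count, then broker IDs (at least 3).
# Replicas are assigned round-robin; racks and current placement are ignored.
make_plan() {
  local topic=$1 partitions=$2; shift 2
  local brokers=("$@") n=$#
  if (( n < 3 )); then
    echo "need at least 3 broker IDs" >&2
    return 1
  fi
  local p entries=()
  for (( p = 0; p < partitions; p++ )); do
    local r1=${brokers[p % n]} r2=${brokers[(p + 1) % n]} r3=${brokers[(p + 2) % n]}
    entries+=("{\"topic\":\"${topic}\",\"partition\":${p},\"replicas\":[${r1},${r2},${r3}]}")
  done
  local IFS=,   # join the entries with commas
  echo "{\"version\":1,\"partitions\":[${entries[*]}]}"
}

make_plan orders 2 1 2 3
```

The output can be saved as plan.json and passed to kafka-reassign-partitions.sh with --reassignment-json-file, --execute, and --throttle. The first broker in each replicas list becomes the preferred leader, and running the tool with --verify after completion also clears the throttle.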

How is RF different from cross-cluster replication (e.g. MirrorMaker2)?

RF provides partition redundancy within a single cluster. Cross-cluster replication tools like MirrorMaker2 forward streams to a separate cluster for disaster recovery or regional isolation; they operate at a different design layer with different goals. Combine both to build a layered recovery strategy.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


© 2026 NicheeLab All rights reserved.