Kafka Disaster Recovery: Multi-Region Patterns (2026)

Kafka's in-cluster replication handles single-broker and single-AZ failures well, but a full region outage demands deliberate design. This article lays out a DR strategy starting from RPO/RTO targets and the practical patterns for cross-region replication with MirrorMaker 2 or Cluster Linking.

Confluent CCAAK heavily tests durability parameters, replication mechanisms, client behavior during failover, and offset synchronization. This guide is written for both real operations and exam day — designs and procedures you can defend either way.

Start with RPO/RTO and the Failure Model

Every DR design starts by fixing the failure scope and the SLOs (RPO/RTO). A region outage can stem from power loss or from a network partition that isolates the region. Because cross-cluster replication is asynchronous, an unplanned failover can lose whatever had not yet been replicated, so RPO is never guaranteed to be zero. The smaller the RPO target, the sharper the trade-off between link bandwidth, latency, and cost.

Treat in-cluster durability and cross-cluster durability as separate problems. The first is controlled by min.insync.replicas, acks, and unclean.leader.election.enable. The second is controlled by MirrorMaker 2 or Cluster Linking lag, checkpoint cadence, and the promotion procedure during failure.

Pin down the failure model: AZ failure, region outage, partial network partition (avoid split-brain)
Quantify SLOs: RPO (e.g., ≤ 60s), RTO (e.g., ≤ 15 min)
Non-functional requirements: guaranteed bandwidth, encryption, audit, latency ceiling
Test cadence: automate quarterly DR drills with planned outages
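
As a rough illustration of the bandwidth trade-off, the sketch below estimates how long a replication link needs to drain a backlog while producers keep writing. The function name and all numbers are hypothetical; real catch-up time also depends on compression, batching, and broker load.

```python
def catch_up_seconds(backlog_bytes: float,
                     link_bytes_per_s: float,
                     ingest_bytes_per_s: float) -> float:
    """Time for the mirror to drain a backlog while new data keeps arriving.

    Returns float('inf') when the link cannot outpace ingest.
    """
    headroom = link_bytes_per_s - ingest_bytes_per_s
    if headroom <= 0:
        return float("inf")
    return backlog_bytes / headroom


# Hypothetical inputs: 6 GB backlog, 125 MB/s link, 25 MB/s steady ingest.
seconds = catch_up_seconds(6e9, 125e6, 25e6)
print(f"catch-up takes about {seconds:.0f}s")
```

If ingest ever meets or exceeds link throughput, lag grows without bound and no RPO target can hold, which is why guaranteed bandwidth belongs in the non-functional requirements.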

Core durability settings (broker and producer)

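# server.properties (in-cluster durability)
min.insync.replicas=2
unclean.leader.election.enable=false
replica.lag.time.max.ms=30000
# Match the broker's zone placement within the region
broker.rack=az-a

# Producer settings (write durability)
acks=all
enable.idempotence=true
# Conservative choice; idempotence also preserves ordering with up to 5 in-flight requests
max.in.flight.requests.per.connection=1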
retries=1000000
request.timeout.ms=30000
delivery.timeout.ms=120000
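
To make the interaction between acks and min.insync.replicas concrete, here is a deliberately simplified model of how a partition leader treats a produce request. The function name and return strings are illustrative only, not part of any Kafka API.

```python
def write_outcome(acks: str, isr_size: int, min_insync_replicas: int) -> str:
    """Simplified model of the leader's decision for one produce request."""
    if acks == "all":
        if isr_size < min_insync_replicas:
            # The broker answers with a NotEnoughReplicas error (retriable).
            return "rejected"
        return "acked after all in-sync replicas have the record"
    if acks == "1":
        # min.insync.replicas is not enforced for acks=1.
        return "acked after the leader write only"
    if acks == "0":
        return "not acknowledged (fire and forget)"
    raise ValueError(f"unsupported acks value: {acks!r}")


print(write_outcome("all", isr_size=1, min_insync_replicas=2))
print(write_outcome("1", isr_size=1, min_insync_replicas=2))
```

The second call shows why min.insync.replicas=2 gives no protection unless producers also use acks=all.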

Cross-Region Replication: Options and Comparison

Kafka does not provide native synchronous replication across regions. The standard options are MirrorMaker 2 (built on Kafka Connect) and Confluent's Cluster Linking. Both are asynchronous by design, so the achievable RPO is roughly the replication lag at the moment of failure: network latency plus batching and processing delay.

MirrorMaker 2 is open source, supports flexible topologies, and uses checkpoints to translate consumer group offsets. Cluster Linking is a broker-level link with low mirror-topic lag and simpler operations, but it requires Confluent Platform or Confluent Cloud.

Choose Cluster Linking for operational simplicity; choose MirrorMaker 2 for fine-grained control and a fully open-source stack
Both are asynchronous. Driving RPO toward zero requires aggressive bandwidth, latency, and buffer monitoring
Active-active is hard — duplicates and ordering skew are real. Nail down active-passive procedures first

Approach	Implementation Scope	Offset Sync	Typical Lag
MirrorMaker 2	Connectors running on Kafka Connect	Translatable via checkpoints	Seconds to tens of seconds (load-dependent)
Cluster Linking	Broker-native (Confluent)	Offsets preserved on mirror topics; consumer group offsets synced by the link when enabled	Low latency (per topic)
Storage/snapshot copy	External mechanism (not recommended)	Not possible	Large (minutes to hours)

Big picture of cross-region replication

Minimal MirrorMaker 2 and Cluster Linking configuration

# MirrorMaker 2 (properties file for connect-mirror-maker)
clusters = A, B
A.bootstrap.servers=a1:9092,a2:9092
B.bootstrap.servers=b1:9092,b2:9092
A->B.enabled=true
A->B.topics=orders.*,inventory.*
A->B.emit.checkpoints.enabled=true
replication.policy.class=org.apache.kafka.connect.mirror.IdentityReplicationPolicy
sync.topic.configs.enabled=true
sync.topic.acls.enabled=false

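# Cluster Linking (Confluent CLI example; flags vary by CLI version and deployment,
# so verify them with `confluent kafka link create --help`)
confluent kafka link create dr-link \
  --cluster B \
  --source-cluster A \
  --source-bootstrap-server a1:9092 \
  --link-mode READ_ONLY

# Create the mirror topic (the source topic name is a positional argument)
confluent kafka mirror create orders --link dr-link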
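
The MM2 topic filter and replication policy determine which topics are copied and what they are called on the target. The sketch below approximates that behavior in Python, assuming full-match regex semantics for the topics list; the helper names are hypothetical, and Java and Python regex dialects differ in some details.

```python
import re


def is_replicated(topic: str, patterns: str) -> bool:
    """True when the topic fully matches any comma-separated regex."""
    return any(re.fullmatch(p.strip(), topic) for p in patterns.split(","))


def remote_topic_name(topic: str, source_alias: str, identity: bool) -> str:
    """Topic name on the target cluster.

    The default policy prefixes the source cluster alias ("A.orders");
    IdentityReplicationPolicy keeps the original name ("orders").
    """
    return topic if identity else f"{source_alias}.{topic}"


patterns = "orders.*,inventory.*"
for t in ["orders", "orders.eu", "inventory.stock", "payments"]:
    print(t, is_replicated(t, patterns), remote_topic_name(t, "A", identity=True))
```

Keeping names identical simplifies client switchover, at the cost of losing the prefix that tells you where a record originated.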

Topic Design and Replica Placement

In-region high availability comes from replication.factor and rack awareness. The baseline is RF=3 with min.insync.replicas=2, placing each partition's replicas in distinct AZs. Set broker.rack correctly so partition assignment naturally spreads across zones.

When data preservation is the priority, lock down unclean.leader.election.enable=false. Because you may switch over, keep topic names, schema compatibility, and cleanup policy (delete/compact) consistent across regions. Remember that Cluster Linking mirror topics are read-only until you promote them.

Baseline: RF=3, min.insync.replicas=2, acks=all
Set broker.rack to the AZ name to spread partitions across zones
Pre-align topic configs, schema compatibility, and ACLs across regions
For log-compacted topics, monitor tombstone propagation and lag, and estimate the time-to-consistency window
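
One way to verify the rack-aware baseline is to check that every partition's replicas span enough distinct zones. The sketch below works on plain dictionaries; the broker IDs, assignment, and function name are hypothetical, and in practice the input would come from topic describe output or the Admin API.

```python
from typing import Dict, List


def partitions_missing_az_spread(
    assignment: Dict[int, List[int]],
    broker_rack: Dict[int, str],
    min_distinct_racks: int = 3,
) -> List[int]:
    """Return partitions whose replicas cover fewer racks than required."""
    bad = []
    for partition, replicas in sorted(assignment.items()):
        racks = {broker_rack[b] for b in replicas}
        if len(racks) < min_distinct_racks:
            bad.append(partition)
    return bad


# Hypothetical cluster: brokers 1 and 4 share az-a.
broker_rack = {1: "az-a", 2: "az-b", 3: "az-c", 4: "az-a"}
assignment = {0: [1, 2, 3], 1: [4, 1, 2], 2: [2, 3, 4]}
print(partitions_missing_az_spread(assignment, broker_rack))  # [1]
```

Partition 1 has two replicas in az-a, so a single AZ failure would leave it with one replica and block acks=all writes under min.insync.replicas=2.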

Typical topic creation command

kafka-topics \
  --bootstrap-server a1:9092 \
  --create \
  --topic orders \
  --partitions 12 \
  --replication-factor 3 \
  --config min.insync.replicas=2 \
  --config cleanup.policy=delete

Failover Flow and Client Design

For a planned failover, first stop writes on the primary, then confirm that cross-region replication lag and checkpoints have caught up before promoting the DR side. For unplanned events, translate consumer offsets using the most recent checkpoints and rely on a duplicate-tolerant design — idempotent producers, transactions, and downstream idempotency — to absorb the overlap.

Make bootstrap endpoints region-redundant on the client side, then build switchover hooks around DNS, TLS SNI, and load balancers. Producers should use enable.idempotence with a sensible delivery.timeout.ms; consumers should use static membership (group.instance.id) to suppress mass rebalances.
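
A minimal sketch of region-aware client settings follows. It only builds configuration dictionaries using standard Kafka property names; the endpoint map reuses the example host names from this article, and the helper names are hypothetical.

```python
from typing import Dict

# Example endpoints only; replace with your own per-region bootstrap lists.
REGION_BOOTSTRAP = {
    "primary": "a1:9092,a2:9092",
    "dr": "b1:9092,b2:9092",
}


def producer_config(region: str) -> Dict[str, object]:
    """Durable producer settings pointed at the chosen region."""
    return {
        "bootstrap.servers": REGION_BOOTSTRAP[region],
        "acks": "all",
        "enable.idempotence": True,
        "delivery.timeout.ms": 120000,
    }


def consumer_config(region: str, group: str, instance_id: str) -> Dict[str, object]:
    """Consumer settings with static membership to limit rebalances."""
    return {
        "bootstrap.servers": REGION_BOOTSTRAP[region],
        "group.id": group,
        "group.instance.id": instance_id,
    }


print(producer_config("dr"))
print(consumer_config("dr", "app-g", "app-g-0"))
```

Keeping the region as a single input makes the switchover a configuration change rather than a code change.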

Planned switchover: stop writes → confirm lag threshold → promote DR → flip DNS/config
Unplanned switchover: translate offsets, then tolerate or deduplicate downstream
Explicitly tune client multi-region endpoints and timeouts
Drive automation off monitoring: alert on threshold breaches and execute the runbook in stages
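
The planned-switchover gate can be expressed as a small check that a runbook step calls before promotion. This is a sketch under stated assumptions: the lag values are supplied by your monitoring system, and the function name and 60-second default are illustrative.

```python
from typing import Dict, List


def promotion_blockers(
    writes_stopped: bool,
    lag_ms_by_topic: Dict[str, float],
    max_lag_ms: float = 60_000,
) -> List[str]:
    """Return reasons why the DR side must not be promoted yet."""
    blockers = []
    if not writes_stopped:
        blockers.append("primary writes are still enabled")
    for topic, lag in sorted(lag_ms_by_topic.items()):
        if lag > max_lag_ms:
            blockers.append(f"{topic}: lag {lag:.0f} ms exceeds {max_lag_ms:.0f} ms")
    return blockers


issues = promotion_blockers(True, {"orders": 1_200, "inventory": 95_000})
print(issues or "safe to promote")
```

An empty list means every precondition holds; anything else should halt the automated stage and page an operator.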

Sample failover runbook (planned, MM2 / Cluster Linking)

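# 1) Stop writes on the primary
# 2) Confirm cross-region lag (e.g., 60 seconds or less)
#    - MM2: replication-latency-ms and checkpoint lag
#    - CL: mirror topic lag metrics
# 3) Promote the DR side
#    Cluster Linking (example; exact syntax varies by CLI version).
#    promote is the planned path and completes only once mirroring lag reaches zero;
#    failover skips that check and is meant for unplanned outages.
confluent kafka mirror promote orders --link dr-link

# 4) Repoint clients (DNS / config rollout)
# 5) Resume writes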

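# Unplanned case (conceptual example of MM2 offset translation)
# Assumes MirrorCheckpointConnector was enabled beforehand
# Reset the consumer group (which must be stopped) to the translated offsets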
kafka-consumer-groups \
  --bootstrap-server b1:9092 \
  --group app-g \
  --reset-offsets --topic orders --to-offset <translated-offset> --execute
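
Conceptually, offset translation maps a committed offset on the source cluster to a safe resume position on the target. The sketch below is a simplified, conservative model, not MM2's actual algorithm: it assumes a sorted list of hypothetical (upstream, downstream) sync points and rounds down to the nearest one.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def translate_offset(
    syncs: List[Tuple[int, int]], committed_upstream: int
) -> Optional[int]:
    """Map a source-cluster offset to a target-cluster offset.

    syncs must be sorted by upstream offset. Returns None when no sync
    point at or before the committed offset is known.
    """
    upstream = [u for u, _ in syncs]
    i = bisect_right(upstream, committed_upstream) - 1
    if i < 0:
        return None
    return syncs[i][1]


# Hypothetical sync points for one partition.
syncs = [(100, 40), (200, 140), (300, 240)]
print(translate_offset(syncs, 250))  # 140
```

Resuming at 140 re-reads the records between upstream offsets 200 and 249, which is why the downstream side has to tolerate duplicates after an unplanned failover.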

Failback and Bidirectional Consistency

After running on DR, returning to the primary means treating the DR cluster as the source of truth, re-establishing replication in the reverse direction, and waiting until the delta has drained before flipping traffic back. Active-passive only needs a one-way switchover, which is easy to automate.

Active-active opens the door to concurrent updates on the same key and ordering skew, so you need a dedup-key design and downstream upsert/reconciliation logic. Remember that transactions do not cross cluster boundaries — fall back to unique keys and idempotent processing where needed.

Before failback, confirm DR→Primary lag is essentially zero
Maintain a checklist for re-aligning schemas, ACLs, and topic configs
Pilot active-active on a limited set of topics and workloads first
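
Downstream upsert logic for active-active can be sketched as a keyed store that ignores duplicates and stale updates. Last-writer-wins by timestamp with a region tiebreak is only one possible policy, chosen here for brevity; the class name is hypothetical, and the approach assumes reasonably synchronized clocks.

```python
from typing import Dict, Optional, Tuple


class UpsertStore:
    """Keeps one row per key; later (timestamp, region) pairs win."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[int, str, dict]] = {}

    def apply(self, key: str, timestamp_ms: int, region: str, payload: dict) -> bool:
        """Apply an update; return False when it is a duplicate or stale."""
        current = self._rows.get(key)
        if current is not None and (timestamp_ms, region) <= current[:2]:
            return False
        self._rows[key] = (timestamp_ms, region, payload)
        return True

    def get(self, key: str) -> Optional[dict]:
        row = self._rows.get(key)
        return None if row is None else row[2]


store = UpsertStore()
store.apply("order-1", 100, "A", {"status": "created"})
store.apply("order-1", 100, "A", {"status": "created"})  # replayed duplicate, ignored
store.apply("order-1", 120, "B", {"status": "paid"})     # newer, wins
store.apply("order-1", 110, "A", {"status": "updated"})  # stale, ignored
print(store.get("order-1"))
```

Because applying the same event twice leaves the store unchanged, replays caused by failover or offset translation are harmless.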

Recreating links for failback

# Cluster Linking (create a reverse link, DR -> Primary; flags vary by CLI version)
confluent kafka link create backfill \
  --cluster A \
  --source-cluster B \
  --source-bootstrap-server b1:9092 \
  --link-mode READ_ONLY

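# MirrorMaker 2 (enable the reverse flow)
clusters = A, B
A->B.enabled=true
B->A.enabled=true
# During failback, restrict B->A.topics to the topics being restored.
# With IdentityReplicationPolicy, never mirror the same topic in both directions at once:
# names are unchanged, so MM2 cannot detect the cycle. Disable A->B for those topics first.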

CCAAK Exam Prep and Checklist

The exam tests the boundary between in-cluster durability and cross-region DR, the meaning and side effects of each setting, and offset synchronization plus client behavior during switchover. Lock in three facts: replication is asynchronous so RPO is never zero, unclean leader election should stay off, and you must know how min.insync.replicas interacts with acks.

In production, continuously monitor UnderReplicatedPartitions, ActiveControllerCount, mirror latency, and link health, then wire alerts directly into the automation runbook.

min.insync.replicas is only effective with acks=all. When the ISR falls below the threshold, writes are rejected — protecting your RPO
unclean.leader.election.enable=false prevents data loss (RTO may stretch)
MirrorMaker 2 translates offsets via checkpoints; Cluster Linking offers a simple switchover by promoting mirror topics
Active-active requires explicit handling of duplicates and ordering skew. Start with active-passive
Key metrics: UnderReplicatedPartitions, mirror lag / replication-latency-ms, ActiveControllerCount==1, link state

Prometheus-style alert examples (excerpt)

# Metric names depend on your JMX exporter rules; mm2_replication_latency_ms and the group name are placeholders
groups:
  - name: kafka-dr
    rules:
      - alert: KafkaUnderReplicatedPartitions
        expr: kafka_server_replicamanager_underreplicatedpartitions > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: Under-replicated partitions detected

      # The metric is 1 only on the active controller, so sum it across the cluster
      - alert: ActiveControllerCountNotOne
        expr: sum(kafka_controller_kafkacontroller_activecontrollercount) != 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: Active controller count is not 1

      - alert: MirrorLagHigh
        expr: mm2_replication_latency_ms > 60000
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: MirrorMaker 2 replication latency exceeds 60s

Check Your Understanding

CCAAK

Question 1

You run a two-region setup with MirrorMaker 2 asynchronously replicating A→B. Your RPO is 60 seconds and RTO is 15 minutes. During a planned failover, you want to minimize consumer duplicates while resuming from the correct position. Which procedure is most appropriate?

A. Stop producers on A, confirm that MM2 checkpoint lag is below the threshold, translate offsets on B, then redirect clients to B
B. Leave producers running, flip only DNS to B, and reconcile any delta manually later
C. Lower min.insync.replicas to 1 right before failover to maximize availability
D. Temporarily enable unclean leader election to speed up recovery

Correct answer: A

The canonical planned-switchover sequence is: stop writes → confirm replication lag and checkpoint catch-up → translate offsets on B → redirect connections. B and D carry serious risks of data loss or ordering corruption, and C actively worsens your RPO.

Frequently Asked Questions

Why not use synchronous replication across regions?

Kafka is designed for asynchronous replication across regions. WAN latency and partitions make synchronous replication a non-starter — throughput and availability would both collapse. To shrink RPO, invest in bandwidth, low-latency links, and tight monitoring; for planned switchovers, drain lag close to zero before promoting the DR cluster.
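
A back-of-the-envelope model shows why WAN latency hurts synchronous writes. Assuming each request must wait one full round trip to the remote region before it is acknowledged, the request rate per connection is capped by the round-trip time. The function name and the latency figures are hypothetical.

```python
def max_requests_per_second(round_trip_ms: float, in_flight: int = 1) -> float:
    """Upper bound on acknowledged requests per second for one connection."""
    if round_trip_ms <= 0 or in_flight < 1:
        raise ValueError("round_trip_ms must be > 0 and in_flight >= 1")
    return in_flight * 1000.0 / round_trip_ms


# Hypothetical latencies: 1 ms inside a region, 70 ms between regions.
for rtt in (1, 70):
    print(f"RTT {rtt} ms -> at most {max_requests_per_second(rtt):.0f} requests/s")
```

Batching and more in-flight requests raise the ceiling, but every acknowledged write still pays the cross-region round trip, and a partition between regions would stall writes entirely.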

Are exactly-once semantics guaranteed across regions?

No. Kafka transactions and idempotent producers are scoped to a single cluster. Cross-region replication is asynchronous, so DR designs must absorb duplicates via unique keys and idempotent downstream processing.

Does migrating from ZooKeeper to KRaft affect DR design?

Intra-cluster metadata management changes, but the cross-region DR fundamentals — asynchronous replication via MM2 or Cluster Linking, RPO/RTO planning, and failover procedures — remain the same. Watch for renamed metrics and slightly different operational commands during the migration.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
