
MirrorMaker 2: Fundamentals of Cross-Cluster Kafka Replication

2026-04-19
NicheeLab Editorial Team

Cross-cluster replication underpins DR (disaster recovery), multi-region active-active deployments, and low-downtime migrations. This article covers the core architecture and key configuration points of MirrorMaker 2 (MM2), which runs on Kafka Connect, concisely but without leaving gaps.

On the CCAAK exam, MM2 components, topic naming policy, offset sync, and the role of internal topics come up frequently. Aim for an understanding grounded in the official behavior rather than rote memorization.

Core Concepts and Main Use Cases

MM2 is a Kafka Connect-based replication mechanism that syncs topic data from a source cluster to a target cluster, and optionally topic configs and consumer-group offset information. You typically assign each cluster a short alias, and the ReplicationPolicy determines how target-side topic names are formed.

Typical use cases include one-way replication to a DR site, delivery to another region, staged cluster migration, and building an aggregation cluster. The first step in design is to clarify RPO/RTO requirements, network latency, and the granularity of target topics.

  • DR design prerequisites: lock down RPO/RTO, network bandwidth, and acceptable latency as concrete numbers
  • Scope: select topics with regex and exclude unnecessary internal or test topics
  • Data sovereignty/compliance: confirm whether cross-region transfer is allowed and pick an encryption scheme (SASL_SSL/SSL)
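
The scoping bullet above can be sketched in a few lines of Python. This is only an illustration of include/exclude filtering, assuming each pattern must match the whole topic name; `select_topics` and the sample topic names are hypothetical and not part of MM2.

```python
import re

def select_topics(topics, include, exclude=()):
    """Keep topics that fully match any include regex and no exclude regex."""
    inc = [re.compile(p) for p in include]
    exc = [re.compile(p) for p in exclude]
    return [
        t for t in topics
        if any(p.fullmatch(t) for p in inc)
        and not any(p.fullmatch(t) for p in exc)
    ]

# Hypothetical topic list on the source cluster
cluster_topics = ["orders", "orders-test", "payments", "users",
                  "__consumer_offsets", "_schemas"]

selected = select_topics(
    cluster_topics,
    include=["orders|payments|users"],
    exclude=["^_.*"],
)
print(selected)  # ['orders', 'payments', 'users']
```

Because the whole name must match, `orders-test` is not picked up by `orders|payments|users`. Running a candidate pattern against the real topic list this way before deployment helps catch accidental matches.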

MirrorMaker 2 Architecture

MM2 runs as three connectors on top of Kafka Connect. MirrorSourceConnector reads topic data from the source and writes it to the target; MirrorCheckpointConnector emits consumer-group checkpoints (committed source offsets translated into target offsets); and MirrorHeartbeatConnector emits heartbeats used to confirm that the replication path is alive. Each connector's work is split into tasks that run in parallel across distributed Connect workers.

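By default, DefaultReplicationPolicy names target topics in the form '<source-alias>.<original-topic-name>'. MM2 also creates dedicated internal topics for checkpoints, heartbeats, and offset sync, which are used for maintaining replication consistency and for monitoring. Checkpoints are written to the target cluster, whereas the offset-syncs topic lives on the source cluster by default (recent Kafka releases let you change its location).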

  • Ordering: in-partition order is preserved. End-to-end transactional consistency (EOS across clusters) is not provided
  • Internal topics: heartbeats, checkpoints, and offset-syncs should be created with compaction enabled and a sufficient replication.factor
  • Scalability: increase parallelism via Connect's tasks.max and scale out the I/O path (source consumption and target writes)
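
As a quick reference for the internal topics mentioned above, the sketch below builds the names that Apache Kafka's MM2 uses by default, to the best of our knowledge. The helper `internal_topic_names` is hypothetical; verify the actual names on your clusters, since they depend on the Kafka version and the ReplicationPolicy in use.

```python
def internal_topic_names(source_alias, target_alias):
    """Default MM2 internal topic names for one flow (assumed defaults)."""
    return {
        # Written to the target cluster by MirrorCheckpointConnector
        "checkpoints": f"{source_alias}.checkpoints.internal",
        # Written by MirrorSourceConnector; on the source cluster by default
        "offset_syncs": f"mm2-offset-syncs.{target_alias}.internal",
        # Written by MirrorHeartbeatConnector
        "heartbeats": "heartbeats",
    }

names = internal_topic_names("src", "dst")
for role, topic in names.items():
    print(f"{role}: {topic}")
```

Knowing these names makes it easier to exclude them from replication patterns and to check their cleanup.policy and replication factor.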

MM2 logical topology (one-way replication)

[Figure: Source Cluster (alias: src; topics: orders, ...; __consumer_offsets) → MirrorSourceConnector, MirrorCheckpointConnector, MirrorHeartbeatConnector, offset-syncs → Target Cluster (alias: dst; src.orders, checkpoints, heartbeats). Logical topology of one-way replication (src→dst).]

Minimal Configuration and Core Properties

MM2 defines multiple clusters in a single properties file and enables replication per direction. You select the topics to replicate with regex and reserve a sufficient replication.factor for internal topics and any topics created on the target side. Security settings (e.g., SASL/SSL) are configured per cluster, using the consumer/producer/admin prefixes where the clients need different values.

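In Apache Kafka, topic config sync (sync.topic.configs.enabled) is on by default, whereas group offset sync (sync.group.offsets.enabled) is off by default and must be explicitly enabled. If you change the naming policy, swap out the ReplicationPolicy, but verify carefully that the change aligns with your migration plan.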

  • Direction: make one-way explicit with src->dst.enabled=true (bidirectional setups require loop prevention)
  • Topic selection: use regex in source->target.topics, and exclude topics via source->target.topics.exclude (topics.blacklist in older releases)
  • Internal topics: in practice, set replication.factor for heartbeats/checkpoints/offset-syncs to at least 3

mm2.properties (example of one-way src→dst)

clusters=src,dst

src.bootstrap.servers=SRC_BROKERS:9092
dst.bootstrap.servers=DST_BROKERS:9092

# Enable replication (per direction)
src->dst.enabled=true
# Topics to replicate (regex). Narrow as needed
src->dst.topics=orders|payments|users
# Example exclusion. Recent Kafka releases use topics.exclude; older ones used
# topics.blacklist. Check the documentation for your version.
# src->dst.topics.exclude=^_.*

# Naming policy (the default produces src.<topic> on the target)
replication.policy.class=org.apache.kafka.connect.mirror.DefaultReplicationPolicy

# Config and offset sync
sync.topic.configs.enabled=true
sync.group.offsets.enabled=true
emit.heartbeats.enabled=true

# Parallelism and redundancy
tasks.max=4
replication.factor=3
checkpoints.topic.replication.factor=3
heartbeats.topic.replication.factor=3
offset-syncs.topic.replication.factor=3

# Metadata refresh intervals
refresh.topics.interval.seconds=60
refresh.groups.interval.seconds=60

# Security example (set under the admin/producer/consumer prefixes as needed)
src.consumer.security.protocol=SASL_SSL
src.consumer.sasl.mechanism=PLAIN
src.consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user" password="pass";

dst.producer.security.protocol=SASL_SSL
dst.producer.sasl.mechanism=PLAIN
dst.producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user" password="pass";

# Admin client, if needed
# dst.admin.security.protocol=SASL_SSL
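
A file like the one above is easy to get subtly wrong, for example by enabling a flow whose alias is missing from clusters. The following is a minimal sanity-check sketch, not an MM2 tool: `parse_properties` and `check_flows` are hypothetical helpers, and the parser handles only simple key=value lines (no line continuations or ':' separators).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '='
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def check_flows(props):
    """Return a list of problems in the cluster and flow definitions."""
    clusters = [c.strip() for c in props.get("clusters", "").split(",") if c.strip()]
    problems = []
    for alias in clusters:
        if f"{alias}.bootstrap.servers" not in props:
            problems.append(f"missing {alias}.bootstrap.servers")
    for key, value in props.items():
        if "->" in key and key.endswith(".enabled") and value == "true":
            source, target = key[: -len(".enabled")].split("->", 1)
            for alias in (source, target):
                if alias not in clusters:
                    problems.append(
                        f"flow {source}->{target} uses unknown alias {alias}"
                    )
    return problems

sample = """
clusters=src,dst
src.bootstrap.servers=SRC_BROKERS:9092
dst.bootstrap.servers=DST_BROKERS:9092
src->dst.enabled=true
"""
print(check_flows(parse_properties(sample)))  # []
```

An empty list means the aliases and flows are at least self-consistent; it says nothing about whether the brokers are reachable or the credentials are valid.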

Understanding Topic Naming, Config Sync, and Offset Sync

Naming follows the ReplicationPolicy. With DefaultReplicationPolicy, target topic names take the form '<source-alias>.<original-name>', making collisions and loops easier to avoid. IdentityReplicationPolicy keeps the same topic names, but loop avoidance is hard in bidirectional setups, so careful design is required in production.
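
The prefix convention can be illustrated with a small sketch. It mirrors the idea behind DefaultReplicationPolicy (alias, separator, original name) but is not the actual implementation; the function names are chosen for illustration, and "." is assumed as the separator.

```python
SEPARATOR = "."  # assumed separator between alias and topic name

def format_remote_topic(source_alias, topic):
    """Name a replicated topic takes on the target cluster."""
    return f"{source_alias}{SEPARATOR}{topic}"

def topic_source(topic):
    """Alias prefix of a replicated topic, or None if there is no prefix."""
    prefix, sep, _ = topic.partition(SEPARATOR)
    return prefix if sep else None

def upstream_topic(topic):
    """Topic name with the first alias prefix stripped, or None."""
    _, sep, rest = topic.partition(SEPARATOR)
    return rest if sep else None

print(format_remote_topic("src", "orders"))  # src.orders
print(topic_source("src.orders"))            # src
print(upstream_topic("src.orders"))          # orders
print(topic_source("orders"))                # None
```

One consequence of a purely name-based scheme: a local topic whose own name contains the separator (say, a hypothetical `app.events`) looks like a topic replicated from a cluster called `app`, so it is worth checking existing topic names against your aliases.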

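Offset sync involves two pieces. MirrorSourceConnector records source-to-target offset mappings in the offset-syncs topic, and MirrorCheckpointConnector uses those mappings to translate committed group offsets and emit them as checkpoints. With sync.group.offsets.enabled, the translated offsets are also written to the target's __consumer_offsets for groups that have no active consumers on the target, which makes it easier to resume from the most recent consistent position when failing over with the same group.id. For a reliable migration, the recommended steps are: stop the source-side consumer → wait for checkpoints to propagate to the target → start the same group.id on the target side.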

  • Prerequisite: source-side consumers must commit regularly (auto or explicit commits both work)
  • Order and duplication: in-partition order is preserved, but strict EOS across clusters is not supported. Design for duplication tolerance
  • Topic configs: sync.topic.configs.enabled syncs the main settings. Topic ACLs are governed by a separate setting (sync.topic.acls.enabled); quotas and other environment-specific settings usually need separate management
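
Source and target offsets for the same record generally differ, which is why committed offsets must be translated rather than copied. The sketch below is a simplified model of conservative translation from a list of (source offset, target offset) sync points. It is not MM2's code, the exact algorithm varies by Kafka version, and `translate_offset` and the sample data are hypothetical.

```python
from bisect import bisect_right

def translate_offset(syncs, committed_upstream):
    """Translate a committed source offset into a safe target offset.

    syncs: list of (upstream_offset, downstream_offset), sorted by upstream.
    Returns None when no sync point exists at or before the committed offset.
    """
    upstream_points = [up for up, _ in syncs]
    i = bisect_right(upstream_points, committed_upstream) - 1
    if i < 0:
        return None
    up, down = syncs[i]
    # Exact hit: translate directly. Otherwise move just past the synced
    # record, which may replay some records but never skips any.
    return down if committed_upstream == up else down + 1

# Hypothetical sync points for one partition
syncs = [(0, 0), (100, 40), (200, 90)]

print(translate_offset(syncs, 100))  # 40
print(translate_offset(syncs, 150))  # 41
print(translate_offset(syncs, 250))  # 91
print(translate_offset([], 10))      # None
```

In this model a group committed at source offset 150 resumes at target offset 41, so it re-reads records it may already have processed. That is the practical reason behind the duplication-tolerance advice above.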

Operations, Monitoring, and Tuning

Monitoring centers on JMX metrics for Connect workers and each connector/task, plus the freshness and size of the heartbeats, checkpoints, and offset-syncs topics. Track replication lag through MM2's own metrics (for example, replication-latency-ms and record-age-ms on MirrorSourceConnector) together with consumer-lag visualization.

Effective tuning levers include parallelism (tasks.max), source-side consumer fetch/batch settings, and target-side producer batching (linger.ms, batch.size). Network bandwidth/RTT tends to be the bottleneck, so consider compression (compression.type) for cross-region setups.

  • Keep cleanup.policy=compact on internal topics and reserve a sufficient replication.factor
  • Do not rely on auto topic creation; pre-create topics or explicitly enable config sync as needed
  • For bidirectional setups, always include loop prevention in your design (suppress re-ingestion of messages originating from your own cluster)
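
To make the bandwidth point concrete, here is a back-of-the-envelope sketch for checking whether a cross-region link can carry the replication traffic. All numbers are hypothetical, `link_utilization` is an illustrative helper, and the compression ratio is taken as compressed size divided by uncompressed size.

```python
def link_utilization(ingest_mb_per_s, compression_ratio, link_mbit_per_s):
    """Fraction of the link consumed by replication traffic.

    ingest_mb_per_s: uncompressed produce rate on the source (MB/s)
    compression_ratio: compressed size / uncompressed size (0 < r <= 1)
    link_mbit_per_s: usable link bandwidth (Mbit/s)
    """
    wire_mbit_per_s = ingest_mb_per_s * compression_ratio * 8
    return wire_mbit_per_s / link_mbit_per_s

# Hypothetical: 50 MB/s ingest, 0.4 compression ratio, 500 Mbit/s link
usage = link_utilization(50, 0.4, 500)
print(round(usage, 2))  # 0.32
```

A result well below 1.0 leaves headroom for catching up after an outage; a value near or above 1.0 means replication lag will grow regardless of tasks.max.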

Comparison: Legacy MirrorMaker / MirrorMaker 2 / Cluster Linking

For both operations and exam prep, understanding how MM2 differs from legacy MirrorMaker and where Confluent's Cluster Linking (a product feature) fits in helps avoid confusion. MM2 is Connect-based with strong scalability and observability, and ships with offset/config sync mechanisms. Cluster Linking, by contrast, simplifies management by establishing broker-to-broker links across clusters (a Confluent-provided feature).

  • MM2 is generally available in OSS Kafka. Cluster Linking is a Confluent Platform/Cloud feature
  • On the exam, the 3-connector MM2 structure, default naming, and the role of internal topics are common targets

| Aspect | MirrorMaker (Legacy) | MirrorMaker 2 | Cluster Linking (Confluent) |
|---|---|---|---|
| Foundation | Standalone tool (no Connect dependency) | Kafka Connect-based (Source/Checkpoint/Heartbeat) | Broker-to-broker link (no Connect required) |
| Offset sync | Manual/limited | Sync via checkpoints + offset-syncs | Aligned automatically by the link mechanism |
| Topic config sync | Not supported | Sync key settings via sync.topic.configs.enabled | Settings inherited via the link (per product spec) |
| Naming policy | Same name by default (watch for collisions) | Default is <src>.<topic> (configurable) | Same name by default (per product spec) |
| Operability/monitoring | Limited | Leverages Connect's scaling and monitoring | Uses the product's management/monitoring features |
| Use cases | Simple one-way replication | General DR/migration/active-active | Simpler multi-region setups in Confluent environments |

CCAAK Exam Prep Highlights

Remember that MM2 consists of three connectors (Source/Checkpoint/Heartbeat) and runs on Kafka Connect. Under the default ReplicationPolicy, target topic names are prefixed with the source alias.

Enable sync.group.offsets.enabled and sync.topic.configs.enabled as needed. Run the internal topics (heartbeats, checkpoints, offset-syncs) with compact policy and a sufficient replication.factor. Since cross-cluster EOS is not provided, duplication-tolerant consumer design is a baseline assumption.

  • Be able to reproduce the minimal properties for one-way replication
  • Be able to explain the failover procedure (stop → propagate checkpoints → start)
  • Be able to summarize the differences from legacy MirrorMaker / Cluster Linking in a single line

Check Your Understanding

CCAAK

Question 1

After stopping the source consumer group in MM2, you want to resume from the point of interruption on the target cluster using the same group.id. Which combination of settings and prerequisites is most appropriate?

  1. Enable sync.group.offsets.enabled, wait for checkpoints to propagate, then start the same group.id on the target
  2. Start the target consumer with auto.offset.reset=earliest (no additional configuration needed)
  3. Simply setting replication.policy to IdentityReplicationPolicy will automatically sync offsets too
  4. Enabling idempotence on the producer side achieves strict EOS across clusters, allowing resumption from the point of interruption

Correct answer: 1

MM2 offset sync is based on MirrorCheckpointConnector and offset-syncs. The recommended procedure is to enable sync.group.offsets.enabled, then stop the source side → propagate checkpoints → start the same group.id on the target. auto.offset.reset specifies the initial position when no existing offset is found and is not a substitute for sync. ReplicationPolicy affects naming but is not a requirement for offset sync. Since cross-cluster EOS is not provided, producer idempotence alone does not guarantee resumption from the point of interruption.

Frequently Asked Questions

Is bidirectional replication possible? How do you prevent loops?

Yes, but it requires careful design. The default DefaultReplicationPolicy adds the source alias as a prefix, making it easier to avoid re-ingesting your own data. You prevent loops by explicitly limiting the target topics and applying per-direction filtering (regex).
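
The prefix-based loop check can be sketched as follows. This is a simplified illustration of the idea, assuming "." as the separator between alias and topic name; `is_cycle` here is a hypothetical helper rather than MM2's implementation.

```python
def is_cycle(topic, target_alias, separator="."):
    """True if the topic's alias-prefix chain already contains the target.

    Replicating such a topic to target_alias would send data back to the
    cluster it originally came from.
    """
    while separator in topic:
        prefix, _, topic = topic.partition(separator)
        if prefix == target_alias:
            return True
    return False

# Flow src -> dst: which topics on src are safe to replicate to dst?
print(is_cycle("orders", "dst"))          # False: local topic on src
print(is_cycle("dst.orders", "dst"))      # True: came from dst already
print(is_cycle("eu.dst.orders", "dst"))   # True: passed through dst earlier
print(is_cycle("eu.orders", "dst"))       # False: originated elsewhere
```

With IdentityReplicationPolicy there is no prefix to inspect, so this kind of check has nothing to work with. That is why same-name bidirectional replication needs loop prevention by other means.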

Does MM2 provide Exactly-Once Semantics (EOS) across clusters?

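Not by default. MM2 preserves in-partition order and delivers at least once, so deduplication must be handled by target-side consumers/applications or downstream storage. Recent Apache Kafka releases document an opt-in exactly.once.source.support setting for dedicated MM2 clusters, but translated consumer offsets remain approximate, so consumers should still tolerate reprocessing after a failover.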

Are schemas and ACLs synchronized?

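Schemas are out of scope for MM2 and should be handled by Schema Registry's own replication/mirror feature. Some topic configs are synced via sync.topic.configs.enabled. For ACLs, MM2 has a sync.topic.acls.enabled setting that propagates topic ACLs when the clusters use a compatible authorizer, but principals, credentials, and other security settings are typically managed separately through each environment's security operations.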


