Cross-cluster replication is directly tied to DR (disaster recovery), multi-region active-active, and minimizing downtime during migrations. This article covers the core architecture and key configuration points of MirrorMaker 2 (MM2), which runs on Kafka Connect, without going overboard or leaving gaps.
On the CCAAK exam, MM2 components, topic naming policy, offset sync, and the role of internal topics come up frequently. Aim for an understanding grounded in the official behavior rather than rote memorization.
MM2 is a Kafka Connect-based replication mechanism that syncs topic data from a source cluster to a target cluster, and optionally topic configs and consumer-group offset information. You typically assign each cluster a short alias, and the ReplicationPolicy determines how target-side topic names are formed.
Typical use cases include one-way replication to a DR site, delivery to another region, staged cluster migration, and building an aggregation cluster. The first step in design is to clarify RPO/RTO requirements, network latency, and the granularity of target topics.
MM2 operates as three connector groups on top of Kafka Connect. MirrorSourceConnector reads topic data and writes it to the target, MirrorCheckpointConnector syncs consumer-group checkpoints (mapping between source and target offsets), and MirrorHeartbeatConnector emits heartbeats for health checks. These run as parallel tasks across distributed Connect workers.
By default, DefaultReplicationPolicy names target topics in the form '<source-alias>.<original-topic-name>'. Internally, dedicated topics for checkpoints, heartbeats, and offset sync are created on the target side and are used for maintaining replication consistency and for monitoring.
MM2 logical topology (one-way replication)
MM2 defines multiple clusters in a single properties file and enables replication per direction. You specify target topics with regex and reserve a sufficient replication.factor for internal topics and any topics created on the target side. Security is configured with per-cluster consumer/producer/admin prefixes (e.g., SASL/SSL settings).
Topic config sync and group offset sync must be explicitly enabled. If you change the naming policy, swap out the ReplicationPolicy, but verify carefully that the change aligns with your migration plan.
mm2.properties (example of one-way src→dst)
clusters=src,dst
src.bootstrap.servers=SRC_BROKERS:9092
dst.bootstrap.servers=DST_BROKERS:9092
# 有効化(方向ごと)
src->dst.enabled=true
# 対象トピック(正規表現)。必要に応じて限定する
src->dst.topics=orders|payments|users
# 例: 除外を使う場合(実装/配布によりプロパティ名が異なることがあるため公式ドキュメントを確認)
# src->dst.topics.blacklist=^_.*
# 命名ポリシー(デフォルトは src.<topic> 形式)
replication.policy.class=org.apache.kafka.connect.mirror.DefaultReplicationPolicy
# 設定とオフセット同期
sync.topic.configs.enabled=true
sync.group.offsets.enabled=true
emit.heartbeats.enabled=true
# 並列度と冗長性
tasks.max=4
replication.factor=3
checkpoints.topic.replication.factor=3
heartbeats.topic.replication.factor=3
offset-syncs.topic.replication.factor=3
# メタデータの更新間隔
refresh.topics.interval.seconds=60
refresh.groups.interval.seconds=60
# セキュリティ例(必要に応じて admin/producer/consumer の各接頭辞で設定)
src.consumer.security.protocol=SASL_SSL
src.consumer.sasl.mechanism=PLAIN
src.consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user" password="pass";
dst.producer.security.protocol=SASL_SSL
dst.producer.sasl.mechanism=PLAIN
dst.producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user" password="pass";
# 必要に応じて管理クライアント
# dst.admin.security.protocol=SASL_SSLNaming follows the ReplicationPolicy. With DefaultReplicationPolicy, target topic names take the form '<source-alias>.<original-name>', making collisions and loops easier to avoid. IdentityReplicationPolicy keeps the same topic names, but loop avoidance is hard in bidirectional setups, so careful design is required in production.
Offset sync runs through MirrorCheckpointConnector and offset-syncs. Enabling sync.group.offsets.enabled makes it easier to resume from the most recent consistent position when failing over to the target with the same group.id. For a reliable migration, the recommended steps are: stop the source-side consumer → wait for checkpoints to propagate to the target → start the same group.id on the target side.
Monitoring centers on JMX metrics for Connect workers and each connector/task, plus the latency and size of heartbeats, checkpoints, and offset-syncs topics on the target side. Track replication lag through lag metrics and consumer-lag visualization.
Effective tuning levers include parallelism (tasks.max), source-side consumer fetch/batch settings, and target-side producer batching (linger.ms, batch.size). Network bandwidth/RTT tends to be the bottleneck, so consider compression (compression.type) for cross-region setups.
For both operations and exam prep, understanding how MM2 differs from legacy MirrorMaker and where Confluent's Cluster Linking (a product feature) fits in helps avoid confusion. MM2 is Connect-based with strong scalability and observability, and ships with offset/config sync mechanisms. Cluster Linking, by contrast, simplifies management by establishing broker-to-broker links across clusters (a Confluent-provided feature).
| Aspect | MirrorMaker (Legacy) | MirrorMaker 2 | Cluster Linking (Confluent) |
|---|---|---|---|
| Foundation | Standalone tool (no Connect dependency) | Kafka Connect-based (Source/Checkpoint/Heartbeat) | Broker-to-broker link (no Connect required) |
| Offset sync | Manual/limited | Sync via checkpoints + offset-syncs | Aligned automatically by the link mechanism |
| Topic config sync | Not supported | Sync key settings via sync.topic.configs.enabled | Settings inherited via the link (per product spec) |
| Naming policy | Same name by default (watch for collisions) | Default is <src>.<topic> (configurable) | Same name by default (per product spec) |
| Operability/monitoring | Limited | Leverages Connect's scaling and monitoring | Uses the product's management/monitoring features |
| Use cases | Simple one-way replication | General DR/migration/active-active | Simpler multi-region setups in Confluent environments |
Remember that MM2 consists of three connectors (Source/Checkpoint/Heartbeat) and runs on Kafka Connect. Under the default ReplicationPolicy, target topic names are prefixed with the source alias.
Enable sync.group.offsets.enabled and sync.topic.configs.enabled as needed. Run the internal topics (heartbeats, checkpoints, offset-syncs) with compact policy and a sufficient replication.factor. Since cross-cluster EOS is not provided, duplication-tolerant consumer design is a baseline assumption.
CCAAK
問題 1
After stopping the source consumer group in MM2, you want to resume from the point of interruption on the target cluster using the same group.id. Which combination of settings and prerequisites is most appropriate?
正解: A
MM2 offset sync is based on MirrorCheckpointConnector and offset-syncs. The recommended procedure is to enable sync.group.offsets.enabled, then stop the source side → propagate checkpoints → start the same group.id on the target. auto.offset.reset specifies the initial position when no existing offset is found and is not a substitute for sync. ReplicationPolicy affects naming but is not a requirement for offset sync. Since cross-cluster EOS is not provided, producer idempotence alone does not guarantee resumption from the point of interruption.
Is bidirectional replication possible? How do you prevent loops?
Yes, but it requires careful design. The default DefaultReplicationPolicy adds the source alias as a prefix, making it easier to avoid re-ingesting your own data. You prevent loops by explicitly limiting the target topics and applying per-direction filtering (regex).
Does MM2 provide Exactly-Once Semantics (EOS) across clusters?
No. MM2 guarantees in-partition order and at-least-once delivery. Deduplication must be handled by target-side consumers/applications or downstream storage.
Are schemas and ACLs synchronized?
Schema Registry and ACLs are out of scope for MM2. Some topic configs are synced via sync.topic.configs.enabled, but schemas should be handled by Schema Registry's own replication/mirror feature, and ACLs are typically managed separately through each environment's security operations.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
Kafka Topics & Partitions: Distribution Fundamentals (2026)
How Kafka topics and partitions enable scale — ordering guar...
CCDAK Exam Guide: Confluent Certified Developer (2026)
Complete prep for the CCDAK exam — Producer/Consumer API, St...
CCAAK Exam Guide: Confluent Certified Administrator (2026)
Pass the CCAAK exam — cluster management, partitions, securi...
Kafka Replicas & ISR: Fault Tolerance Explained (2026)
Replica placement, in-sync replicas (ISR), leader election. ...
Kafka Offsets: Commit Modes & Consumer Position (2026)
Offset semantics — auto vs. manual commit, __consumer_offset...