Confluent Replicator: Enterprise Cross-Cluster Sync (2026)

Confluent Replicator is a commercial replication connector that runs on top of Kafka Connect. It is well suited to cross-data-center DR and staged migrations, and it is favored by teams that need stable operations with commercial support.

Following the official documentation, this article walks through the practical design, configuration, and operational points, and highlights the angles most likely to appear on CCAAK (Confluent Certified Administrator for Apache Kafka).

Replicator Overview and Key Use Cases

Replicator continuously copies topics from a source Kafka cluster to a destination cluster. It is implemented as a Kafka Connect source connector, so you inherit Connect's scale-out, fault tolerance, and rebalance behavior out of the box.

Typical use cases include one-way replication to a DR site, staged migration during data-center consolidation, and isolating analytics workloads from production (decoupling read load). Schemas and ACLs are handled by other tools, so the safest design is to keep Replicator focused on stable data-plane replication.

Combines commercial support with a Connect-based operational foundation
Topics can be selected flexibly via regex or a whitelist
Topic renaming enables safe, staged cutovers
Use Kafka's idempotent producer plus retry tuning to minimize duplicates

Use case	Goal	Supporting tools / notes
DR / BCP	Low-RPO copy to a remote site	Plan for network bandwidth, latency, and resync
Staged migration	Running old and new clusters in parallel	Use topic.rename.format for a safe cutover
Analytics isolation	Shield production from read load	Design consumer groups and monitor lag

Sketch of the REST call when creating the connector

POST /connectors
Content-Type: application/json

{
  "name": "replicator-orders",
  "config": {
    "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
    "tasks.max": "4",
    "src.kafka.bootstrap.servers": "src1:9092,src2:9092",
    "dest.kafka.bootstrap.servers": "dst1:9092,dst2:9092",
    "topic.regex": "^(orders|payments).*",
    "topic.rename.format": "${topic}.dr",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "provenance.header.enable": "true"
  }
}

Architecture and Deployment Patterns

Replicator runs on Kafka Connect workers, behaving as a consumer toward the source and a producer toward the destination. Network-wise, the Replicator nodes typically need reachability to both clusters.

Operationally, the common pattern is to place the Connect cluster next to the destination. Even in environments with strict egress restrictions on the source, as long as Replicator can open a read connection to the source, one-way copy is achievable.

Make the Connect cluster's internal topics (configs/offsets/status) redundant
Tune task parallelism based on partition count and network bandwidth
Design for the temporary latency spikes that occur during rebalances

Pattern	Placement	Pros / cautions
Destination-adjacent	Connect placed next to the destination	Easier to minimize write latency
Source-adjacent	Connect placed next to the source	Minimizes read latency; watch egress-path constraints
Neutral zone	Placed in a dedicated subnet	Operationally isolated; mind the round-trip latency impact

Logical layout of Replicator

Connect worker baseline settings (excerpt)

# worker.properties (excerpt)
bootstrap.servers=dst1:9092,dst2:9092
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
config.storage.replication.factor=3
offset.storage.replication.factor=3
status.storage.replication.factor=3
# Idempotent producer to minimize duplicates on the source -> destination write path
producer.enable.idempotence=true
# Security example (worker -> destination)
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="connect" password="secret";

Key Configuration Points and Best Practices

Select topics with regex, and use explicit patterns rooted in your naming convention to keep unintended new topics from leaking in. For staged migrations, use topic.rename.format to alter the destination topic names and plan consumer cutovers deliberately.

ByteArrayConverter is recommended. Avoiding any schema conversion between source and destination guarantees a faithful copy of the raw data.

Manage scope with topic.regex; favor readability and avoid negative lookahead
Use ByteArrayConverter for key/value so data is copied transparently
Start tasks.max at, or slightly below, the total partition count
Design error topics and DLQs (Connect-level error.tolerance, etc.)

Config key	Purpose	Recommendation / notes
topic.regex / topic.whitelist	Select what to replicate	Pair with a clear naming convention
topic.rename.format	Rename topics on the destination	Useful for staged migrations and parallel-run periods
provenance.header.enable	Attach provenance metadata	Useful for troubleshooting

Replicator connector config (example including security)

{
  "name": "replicator-secure",
  "config": {
    "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
    "tasks.max": "6",
    "src.kafka.bootstrap.servers": "src-a:9093,src-b:9093",
    "src.kafka.security.protocol": "SASL_SSL",
    "src.kafka.sasl.mechanism": "PLAIN",
    "src.kafka.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username='src-user' password='src-pass';",
    "dest.kafka.bootstrap.servers": "dst-a:9093,dst-b:9093",
    "dest.kafka.security.protocol": "SASL_SSL",
    "dest.kafka.sasl.mechanism": "PLAIN",
    "dest.kafka.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username='dst-user' password='dst-pass';",
    "topic.regex": "^(orders_|inventory_).*
      
        
        NicheeLab を読み込み中…
      
    
  quot;,
    "topic.rename.format": "${topic}.dr",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "provenance.header.enable": "true"
  }
}

Designing for Availability, Ordering, and Throughput

Kafka guarantees ordering within a partition. Replicator routes the same key to the same partition, so the baseline design is to match the destination's partition count to the source's.

Duplicates can occur from retries during failures. Enabling the idempotent producer and tuning retries and delivery.timeout.ms on the producer side minimizes duplicates in practice.

Align partitions (match partition counts and key distribution between source and destination)
Tune throughput via tasks.max and fetch/produce batch sizes
Network bandwidth and latency map directly to RPO/RTO, so treat them as SLAs

Design lever	Impact	Rule of thumb / notes
tasks.max	Parallelism / throughput	Start near total partition count as the upper bound
linger/batch.size	Send efficiency / latency	Trade-off between latency and bandwidth
replication.factor (destination)	Fault tolerance	3 or higher is safe for a DR destination

Producer tuning on the send side (example of additional connector properties)

{
  "producer.linger.ms": "20",
  "producer.batch.size": "131072",
  "producer.compression.type": "lz4",
  "producer.enable.idempotence": "true",
  "producer.max.in.flight.requests.per.connection": "5"
}

Security and Operational Monitoring

Use SASL/SSL mutual authentication on both source and destination as a baseline, with least-privilege service accounts. Replicator needs read permission on the source and write permission on the destination.

For monitoring, focus on Connect task status, lag (the time and offset gap between source and destination), error rate, and rebalance frequency. Confluent Control Center and JMX metrics make these observable.

ACLs: READ on the source, WRITE/Create on the destination, and DescribeConfigs as needed
Design certificate and secret rotation via Vault or similar
Document the DLQ and retry strategy for errors in a runbook

Area	Recommended setup	Example metrics to watch
Auth / encryption	SASL_SSL plus least-privilege ACLs	Auth failure rate, TLS handshake failures
Observability	Adopt Control Center / JMX	task-running/failed, lag, error-rate
Operational process	Runbook and drills	Failover duration, measured RPO

Example ACL grants (conceptual; adjust to your environment)

# Source side: read (consumer)
kafka-acls --authorizer-properties zookeeper.connect=zk-src:2181 \
  --add --allow-principal User:replicator \
  --operation Read --topic 'orders_*' --group replicator-orders

# Destination side: write and topic create
auth-cli-or-kafka-acls --add --allow-principal User:replicator \
  --operation Write --operation Create --topic 'orders_*.dr'

Replicator vs. MirrorMaker 2 vs. Cluster Linking: When to Choose Which

There are several ways to replicate Kafka. The Apache-standard MirrorMaker 2 (MM2) is open source and broadly used. Confluent's Cluster Linking creates a direct broker-to-broker mirror. Replicator is the right fit when you want a mature, Connect-based operational story plus commercial support.

On the exam, the key skill is being able to articulate the selection rationale for each scenario: network constraints, mixed versions/distributions on both ends, staged cutovers, and operational governance.

Replicator is the safe choice for one-way, staged migration, and operational standardization
Cluster Linking is the strongest pick for low latency, direct mirroring, and minimal overhead
MM2 is preferred when you favor OSS or stock features

Option	Strengths	Cautions
Confluent Replicator	Easy to run on the Connect foundation, commercial support, and trivial topic renaming	Requires running a Connect cluster; processing happens at the application layer (not broker-direct)
MirrorMaker 2 (OSS)	OSS standard, with checkpoints that support consumer-group migration	High design flexibility, but operations and monitoring require more bespoke work
Cluster Linking (Confluent)	Direct broker-to-broker, low overhead, mirror topics	Mind the environment alignment on both ends and the network requirements

Minimal runbook for stopping and switching Replicator (example)

# 1) Watch replication lag for the target topics and confirm it is near zero
# 2) Pause consumers and wait for the final batch
# 3) Pause Replicator (/connectors/{name}/pause)
# 4) Switch consumer endpoints to the new destination topics (e.g. *.dr)
# 5) Decommission the old side in stages (keep monitoring in parallel)

Check Your Understanding

CCAAK

問題 1

You want to migrate topics in stages between two geographically separated Kafka clusters, safely running old and new consumers in parallel while avoiding disruption at cutover. Which design is most appropriate?

Use Replicator with topic.rename.format to rename destination topics (e.g. orders → orders.dr), and gradually migrate consumers to the new topics
Use MM2 to copy __consumer_offsets directly and migrate consumers to the new cluster immediately
Use Cluster Linking to build a bidirectional mirror and write to the same topic name on both clusters simultaneously
Stop the source producers and migrate in one shot via topic export/import

正解: A

For staged migration with parallel operation, the safe pattern is to create renamed topics on the destination and migrate consumer endpoints in stages. Replicator's topic.rename.format fits perfectly. Copying __consumer_offsets directly is not recommended, and bidirectional simultaneous writes via Cluster Linking risks conflicts and design complexity. A one-shot stop-and-migrate approach incurs heavy downtime.

Frequently Asked Questions

Does Replicator also replicate Schema Registry schemas?

No. Replicator focuses on replicating Kafka topic data. The official recommendation is to handle schemas via Schema Registry features such as export/import or Schema Linking.

How do you migrate consumer group offsets?

Replicator itself does not replicate __consumer_offsets. For staged migrations, switch to the renamed topics on the destination and use checkpoint/offset translation tools when needed. Follow the latest Confluent official migration guide for current procedures.

Is exactly-once delivery guaranteed?

Kafka's idempotent producer minimizes duplicates, but full end-to-end exactly-once across failure scenarios is not always guaranteed. The realistic approach is to preserve per-partition ordering and design downstream consumers (idempotent sinks, etc.) to tolerate duplicates.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

Commercial Kafka Replication with Confluent Replicator