Kafka

Commercial Kafka Replication with Confluent Replicator

2026-04-19
NicheeLab Editorial Team

Confluent Replicator is a commercial replication connector that runs on top of Kafka Connect. It is well suited to cross-data-center DR and staged migrations, and it is favored by teams that need stable operations with commercial support.

Following the official documentation, this article walks through the practical design, configuration, and operational points, and highlights the angles most likely to appear on CCAAK (Confluent Certified Administrator for Apache Kafka).

Replicator Overview and Key Use Cases

Replicator continuously copies topics from a source Kafka cluster to a destination cluster. It is implemented as a Kafka Connect source connector, so you inherit Connect's scale-out, fault tolerance, and rebalance behavior out of the box.

Typical use cases include one-way replication to a DR site, staged migration during data-center consolidation, and isolating analytics workloads from production (decoupling read load). Schemas and ACLs are handled by other tools, so the safest design is to keep Replicator focused on stable data-plane replication.

  • Combines commercial support with a Connect-based operational foundation
  • Topics can be selected flexibly via regex or a whitelist
  • Topic renaming enables safe, staged cutovers
  • Use Kafka's idempotent producer plus retry tuning to minimize duplicates
Use caseGoalSupporting tools / notes
DR / BCPLow-RPO copy to a remote sitePlan for network bandwidth, latency, and resync
Staged migrationRunning old and new clusters in parallelUse topic.rename.format for a safe cutover
Analytics isolationShield production from read loadDesign consumer groups and monitor lag

Sketch of the REST call when creating the connector

POST /connectors
Content-Type: application/json

{
  "name": "replicator-orders",
  "config": {
    "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
    "tasks.max": "4",
    "src.kafka.bootstrap.servers": "src1:9092,src2:9092",
    "dest.kafka.bootstrap.servers": "dst1:9092,dst2:9092",
    "topic.regex": "^(orders|payments).*",
    "topic.rename.format": "${topic}.dr",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "provenance.header.enable": "true"
  }
}

Architecture and Deployment Patterns

Replicator runs on Kafka Connect workers, behaving as a consumer toward the source and a producer toward the destination. Network-wise, the Replicator nodes typically need reachability to both clusters.

Operationally, the common pattern is to place the Connect cluster next to the destination. Even in environments with strict egress restrictions on the source, as long as Replicator can open a read connection to the source, one-way copy is achievable.

  • Make the Connect cluster's internal topics (configs/offsets/status) redundant
  • Tune task parallelism based on partition count and network bandwidth
  • Design for the temporary latency spikes that occur during rebalances
PatternPlacementPros / cautions
Destination-adjacentConnect placed next to the destinationEasier to minimize write latency
Source-adjacentConnect placed next to the sourceMinimizes read latency; watch egress-path constraints
Neutral zonePlaced in a dedicated subnetOperationally isolated; mind the round-trip latency impact

Logical layout of Replicator

pull / pushconsume via groupproduceSource KafkaDC-A / ProdDestination KafkaDC-B / DRKafka Connect ClusterReplicator tasks / ReplicatorSourceConnector / Internal topics (configs/offsets/status)

Connect worker baseline settings (excerpt)

# worker.properties (excerpt)
bootstrap.servers=dst1:9092,dst2:9092
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
config.storage.replication.factor=3
offset.storage.replication.factor=3
status.storage.replication.factor=3
# Idempotent producer to minimize duplicates on the source -> destination write path
producer.enable.idempotence=true
# Security example (worker -> destination)
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="connect" password="secret";

Key Configuration Points and Best Practices

Select topics with regex, and use explicit patterns rooted in your naming convention to keep unintended new topics from leaking in. For staged migrations, use topic.rename.format to alter the destination topic names and plan consumer cutovers deliberately.

ByteArrayConverter is recommended. Avoiding any schema conversion between source and destination guarantees a faithful copy of the raw data.

  • Manage scope with topic.regex; favor readability and avoid negative lookahead
  • Use ByteArrayConverter for key/value so data is copied transparently
  • Start tasks.max at, or slightly below, the total partition count
  • Design error topics and DLQs (Connect-level error.tolerance, etc.)
Config keyPurposeRecommendation / notes
topic.regex / topic.whitelistSelect what to replicatePair with a clear naming convention
topic.rename.formatRename topics on the destinationUseful for staged migrations and parallel-run periods
provenance.header.enableAttach provenance metadataUseful for troubleshooting

Replicator connector config (example including security)

{
  "name": "replicator-secure",
  "config": {
    "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
    "tasks.max": "6",
    "src.kafka.bootstrap.servers": "src-a:9093,src-b:9093",
    "src.kafka.security.protocol": "SASL_SSL",
    "src.kafka.sasl.mechanism": "PLAIN",
    "src.kafka.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username='src-user' password='src-pass';",
    "dest.kafka.bootstrap.servers": "dst-a:9093,dst-b:9093",
    "dest.kafka.security.protocol": "SASL_SSL",
    "dest.kafka.sasl.mechanism": "PLAIN",
    "dest.kafka.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username='dst-user' password='dst-pass';",
    "topic.regex": "^(orders_|inventory_).*
NicheeLab を読み込み中…
quot;, "topic.rename.format": "${topic}.dr", "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "provenance.header.enable": "true" } }

Designing for Availability, Ordering, and Throughput

Kafka guarantees ordering within a partition. Replicator routes the same key to the same partition, so the baseline design is to match the destination's partition count to the source's.

Duplicates can occur from retries during failures. Enabling the idempotent producer and tuning retries and delivery.timeout.ms on the producer side minimizes duplicates in practice.

  • Align partitions (match partition counts and key distribution between source and destination)
  • Tune throughput via tasks.max and fetch/produce batch sizes
  • Network bandwidth and latency map directly to RPO/RTO, so treat them as SLAs
Design leverImpactRule of thumb / notes
tasks.maxParallelism / throughputStart near total partition count as the upper bound
linger/batch.sizeSend efficiency / latencyTrade-off between latency and bandwidth
replication.factor (destination)Fault tolerance3 or higher is safe for a DR destination

Producer tuning on the send side (example of additional connector properties)

{
  "producer.linger.ms": "20",
  "producer.batch.size": "131072",
  "producer.compression.type": "lz4",
  "producer.enable.idempotence": "true",
  "producer.max.in.flight.requests.per.connection": "5"
}

Security and Operational Monitoring

Use SASL/SSL mutual authentication on both source and destination as a baseline, with least-privilege service accounts. Replicator needs read permission on the source and write permission on the destination.

For monitoring, focus on Connect task status, lag (the time and offset gap between source and destination), error rate, and rebalance frequency. Confluent Control Center and JMX metrics make these observable.

  • ACLs: READ on the source, WRITE/Create on the destination, and DescribeConfigs as needed
  • Design certificate and secret rotation via Vault or similar
  • Document the DLQ and retry strategy for errors in a runbook
AreaRecommended setupExample metrics to watch
Auth / encryptionSASL_SSL plus least-privilege ACLsAuth failure rate, TLS handshake failures
ObservabilityAdopt Control Center / JMXtask-running/failed, lag, error-rate
Operational processRunbook and drillsFailover duration, measured RPO

Example ACL grants (conceptual; adjust to your environment)

# Source side: read (consumer)
kafka-acls --authorizer-properties zookeeper.connect=zk-src:2181 \
  --add --allow-principal User:replicator \
  --operation Read --topic 'orders_*' --group replicator-orders

# Destination side: write and topic create
auth-cli-or-kafka-acls --add --allow-principal User:replicator \
  --operation Write --operation Create --topic 'orders_*.dr'

Replicator vs. MirrorMaker 2 vs. Cluster Linking: When to Choose Which

There are several ways to replicate Kafka. The Apache-standard MirrorMaker 2 (MM2) is open source and broadly used. Confluent's Cluster Linking creates a direct broker-to-broker mirror. Replicator is the right fit when you want a mature, Connect-based operational story plus commercial support.

On the exam, the key skill is being able to articulate the selection rationale for each scenario: network constraints, mixed versions/distributions on both ends, staged cutovers, and operational governance.

  • Replicator is the safe choice for one-way, staged migration, and operational standardization
  • Cluster Linking is the strongest pick for low latency, direct mirroring, and minimal overhead
  • MM2 is preferred when you favor OSS or stock features
OptionStrengthsCautions
Confluent ReplicatorEasy to run on the Connect foundation, commercial support, and trivial topic renamingRequires running a Connect cluster; processing happens at the application layer (not broker-direct)
MirrorMaker 2 (OSS)OSS standard, with checkpoints that support consumer-group migrationHigh design flexibility, but operations and monitoring require more bespoke work
Cluster Linking (Confluent)Direct broker-to-broker, low overhead, mirror topicsMind the environment alignment on both ends and the network requirements

Minimal runbook for stopping and switching Replicator (example)

# 1) Watch replication lag for the target topics and confirm it is near zero
# 2) Pause consumers and wait for the final batch
# 3) Pause Replicator (/connectors/{name}/pause)
# 4) Switch consumer endpoints to the new destination topics (e.g. *.dr)
# 5) Decommission the old side in stages (keep monitoring in parallel)

Check Your Understanding

CCAAK

問題 1

You want to migrate topics in stages between two geographically separated Kafka clusters, safely running old and new consumers in parallel while avoiding disruption at cutover. Which design is most appropriate?

  1. Use Replicator with topic.rename.format to rename destination topics (e.g. orders → orders.dr), and gradually migrate consumers to the new topics
  2. Use MM2 to copy __consumer_offsets directly and migrate consumers to the new cluster immediately
  3. Use Cluster Linking to build a bidirectional mirror and write to the same topic name on both clusters simultaneously
  4. Stop the source producers and migrate in one shot via topic export/import

正解: A

For staged migration with parallel operation, the safe pattern is to create renamed topics on the destination and migrate consumer endpoints in stages. Replicator's topic.rename.format fits perfectly. Copying __consumer_offsets directly is not recommended, and bidirectional simultaneous writes via Cluster Linking risks conflicts and design complexity. A one-shot stop-and-migrate approach incurs heavy downtime.

Frequently Asked Questions

Does Replicator also replicate Schema Registry schemas?

No. Replicator focuses on replicating Kafka topic data. The official recommendation is to handle schemas via Schema Registry features such as export/import or Schema Linking.

How do you migrate consumer group offsets?

Replicator itself does not replicate __consumer_offsets. For staged migrations, switch to the renamed topics on the destination and use checkpoint/offset translation tools when needed. Follow the latest Confluent official migration guide for current procedures.

Is exactly-once delivery guaranteed?

Kafka's idempotent producer minimizes duplicates, but full end-to-end exactly-once across failure scenarios is not always guaranteed. The realistic approach is to preserve per-partition ordering and design downstream consumers (idempotent sinks, etc.) to tolerate duplicates.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
Kafka

Kafka Topics & Partitions: Distribution Fundamentals (2026)

How Kafka topics and partitions enable scale — ordering guar...

Kafka

CCDAK Exam Guide: Confluent Certified Developer (2026)

Complete prep for the CCDAK exam — Producer/Consumer API, St...

Kafka

CCAAK Exam Guide: Confluent Certified Administrator (2026)

Pass the CCAAK exam — cluster management, partitions, securi...

Kafka

Kafka Replicas & ISR: Fault Tolerance Explained (2026)

Replica placement, in-sync replicas (ISR), leader election. ...

Kafka

Kafka Offsets: Commit Modes & Consumer Position (2026)

Offset semantics — auto vs. manual commit, __consumer_offset...

Browse all Kafka articles (101)
© 2026 NicheeLab All rights reserved.