Kafka Connect Distributed vs Standalone: Choosing a Configuration and Operational Tips

2026-04-19
NicheeLab Editorial Team

Kafka Connect can run connectors in two modes: standalone and distributed. Both use the same connector/task model, but they differ in where configuration and offsets are stored, in how failures are handled, and in how they scale.

This article summarizes how to choose between the two modes and the operational practices that go with each, and reviews points that come up frequently on the CCDAK and CCAAK exams. It sticks to stable concepts described in the official documentation.

Basics and Selection of Distributed Mode and Standalone

Standalone runs connectors and tasks within a single process and keeps connector configuration and source offsets in local files (sink connectors still commit their progress to Kafka as consumer group offsets). There is no failover or rebalancing. It suits one-off migrations, development verification, and single-node batch processing.

Distributed mode forms a single cluster from multiple Connect workers and keeps configuration, offsets, and status in Kafka internal topics. When a worker fails, tasks are reassigned to other workers, enabling scale-out and rolling upgrades. This is the default choice for production operation.

  • Choose distributed mode if you need availability
  • Choose standalone for a quick start on one-off or small-scale jobs
  • If you expect to scale later, start with distributed mode

| Aspect | Standalone | Distributed Mode |
| --- | --- | --- |
| Configuration storage | Local file (properties) | Kafka internal topic (config.storage.topic) |
| Offset storage | Local file (offset.storage.file.filename) | Kafka internal topic (offset.storage.topic) |
| Availability | Single process. Stops on failure | Tasks automatically reassigned on worker failure |
| Scaling | Only by increasing processes. Manual partitioning | Automatic rebalancing by adding workers |
| Applying operational changes | Process restart is the norm | REST-based updates shared across all workers |
| Use cases | PoC, one-off migrations, development | Production always-on, HA, continuous operation |

Kafka Connect: Overview of Distributed Mode and Standalone

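[Figure: Standalone (connect-standalone, a single process running the connector and its task(s), with config/offsets in local files) versus Distributed (a cluster of N connect-distributed workers, e.g. Worker W1 running T1 and Worker W2 running T2, with config/offset/status.storage.topic held as Kafka internal topics)]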

Minimal configuration comparison (standalone vs distributed worker)

# standalone.properties (excerpt)
bootstrap.servers=broker1:9092
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
offset.storage.file.filename=/var/lib/kafka-connect/connect.offsets
offset.flush.interval.ms=10000

# worker.properties (distributed mode, excerpt)
bootstrap.servers=broker1:9092,broker2:9092
group.id=connect-cluster-1
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
config.storage.replication.factor=3
offset.storage.replication.factor=3
status.storage.replication.factor=3
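
As a quick sanity check on the difference between the two files, the following Python sketch parses a properties excerpt and reports which distributed-mode keys are missing. The key list is the set this article treats as required, plus bootstrap.servers. The helper names parse_properties and missing_distributed_keys are hypothetical, and a real worker needs further settings (converters, for instance).

```python
REQUIRED_DISTRIBUTED_KEYS = (
    "bootstrap.servers",
    "group.id",
    "config.storage.topic",
    "offset.storage.topic",
    "status.storage.topic",
)


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_distributed_keys(props: dict) -> list:
    """Return the required distributed-mode keys that are absent or empty."""
    return [k for k in REQUIRED_DISTRIBUTED_KEYS if not props.get(k)]


worker = parse_properties("""
bootstrap.servers=broker1:9092,broker2:9092
group.id=connect-cluster-1
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
""")
standalone = parse_properties("""
bootstrap.servers=broker1:9092
offset.storage.file.filename=/var/lib/kafka-connect/connect.offsets
""")

print(missing_distributed_keys(worker))
print(missing_distributed_keys(standalone))
```

The standalone excerpt fails the check because it has no group.id and no internal topic names, which is exactly what separates the two modes at the configuration level.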

Roles of Workers, Connectors, Tasks, and Internal Topics

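In Connect, a worker is a process, a connector is a job, and a task is a parallel execution unit. The upper limit on the number of tasks is set by the connector configuration (tasks.max). For sinks, effective parallelism is additionally capped by the partition count of the subscribed topics, because sink tasks read as members of one consumer group.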

In distributed mode, configuration (config.storage.topic), offsets (offset.storage.topic), and status (status.storage.topic) are stored in Kafka internal topics. As a result, settings submitted via REST are shared across the entire cluster, making failure recovery and rolling upgrades easier. Internal topics are created with compaction enabled, and a replication factor of at least 3 is recommended in production. On clusters where auto-creation is prohibited, create them in advance with appropriate settings.

  • config.storage.topic: single source of connector configuration (must have exactly one partition)
  • offset.storage.topic: holds source connector progress (compacted); sink connectors track progress as ordinary consumer group offsets
  • status.storage.topic: used for execution-state monitoring; referenced by management tools
  • Pre-creation recommendations: cleanup.policy=compact, RF>=3; the config topic needs exactly 1 partition, while offsets and status are sized to the workload (e.g. offsets=25, status=5)
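
For clusters where auto-creation is disabled, the pre-creation step can be sketched as follows. This Python snippet only builds and prints kafka-topics.sh command lines. The topic names, broker address, and partition counts are the example values from this article, and create_topic_argv is a hypothetical helper.

```python
import shlex

# Partition counts: the config topic must have exactly one partition;
# 25 and 5 are the example values used in this article.
INTERNAL_TOPICS = {
    "connect-configs": 1,
    "connect-offsets": 25,
    "connect-status": 5,
}


def create_topic_argv(topic, partitions, bootstrap="broker1:9092", rf=3):
    """Build the argument list for creating one compacted internal topic."""
    return [
        "kafka-topics.sh",
        "--bootstrap-server", bootstrap,
        "--create",
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(rf),
        "--config", "cleanup.policy=compact",
    ]


for topic, partitions in INTERNAL_TOPICS.items():
    # Print the command instead of running it so it can be reviewed first.
    print(shlex.join(create_topic_argv(topic, partitions)))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; on a real cluster the script name and path depend on the Kafka distribution.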

Representative properties related to internal topics (distributed worker)

bootstrap.servers=broker1:9092,broker2:9092
group.id=connect-cluster-1
config.storage.topic=connect-configs
config.storage.replication.factor=3
offset.storage.topic=connect-offsets
offset.storage.replication.factor=3
status.storage.topic=connect-status
status.storage.replication.factor=3

Fault Tolerance and Recovery Procedures

In distributed mode, when a worker fails, the tasks held by that worker are reassigned to healthy workers in the cluster. Since the connector configuration is in internal topics, execution can continue even after the process changes. Standalone is a single process, so a failure stops tasks and requires manual recovery.

Configuration changes are submitted via the REST API and shared by all workers in the distributed cluster. For recovery, use connector or task restart, and pause/resume as needed.

  • Distributed mode: automatic rebalancing when workers are added/removed
  • Standalone: restart and verification of local offset file integrity are required
  • Configuration changes go through REST in distributed mode; editing local files and restarting is mainly a standalone workflow
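
To make the failover behavior concrete, here is a toy Python model of task reassignment. It uses plain round-robin placement, which is an assumption for illustration only and not Connect's actual rebalancing protocol. The worker names W1/W2 and the function assign_round_robin are hypothetical.

```python
def assign_round_robin(tasks, workers):
    """Spread tasks over live workers in round-robin order (illustration only)."""
    if not workers:
        # Mirrors standalone after a crash: no surviving process, nothing runs.
        raise ValueError("no live workers: tasks cannot run")
    assignment = {w: [] for w in workers}
    for i, task in enumerate(tasks):
        assignment[workers[i % len(workers)]].append(task)
    return assignment


tasks = [f"jdbc-sink-01-{i}" for i in range(4)]

before = assign_round_robin(tasks, ["W1", "W2"])  # healthy two-worker cluster
after = assign_round_robin(tasks, ["W1"])         # W2 has failed

print(before)
print(after)
```

The point of the model is that the task list itself never changes: because it lives in Kafka rather than on a worker, the survivors can pick up every task.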

Operations via the Connect REST API (examples)

# List connectors
curl -s http://connect1:8083/connectors

# Create a connector
curl -s -X POST http://connect1:8083/connectors -H 'Content-Type: application/json' -d '{
  "name": "jdbc-sink-01",
  "config": {"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector", "topics": "orders", "tasks.max": "4"}
}'

# Pause / resume
curl -X PUT http://connect1:8083/connectors/jdbc-sink-01/pause
curl -X PUT http://connect1:8083/connectors/jdbc-sink-01/resume

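# Restart the connector and its tasks, limited to instances in FAILED state (Kafka 3.0+)
# The URL is quoted so the shell does not treat & as a background operator
curl -X POST 'http://connect1:8083/connectors/jdbc-sink-01/restart?includeTasks=true&onlyFailed=true'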
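
The same recovery step can be scripted. The Python sketch below reads a connector's status and restarts only the tasks reported as FAILED, using the status and per-task restart endpoints of the Connect REST API. The host connect1:8083 comes from the examples above; the helper names and the sample status document are hypothetical.

```python
import json
import urllib.request

CONNECT_URL = "http://connect1:8083"  # host taken from the curl examples


def failed_task_ids(status: dict) -> list:
    """Extract the IDs of tasks whose state is FAILED from a status document."""
    return [t["id"] for t in status.get("tasks", []) if t.get("state") == "FAILED"]


def restart_failed_tasks(name: str, base_url: str = CONNECT_URL) -> list:
    """Fetch the connector status and restart each failed task individually."""
    with urllib.request.urlopen(f"{base_url}/connectors/{name}/status") as resp:
        status = json.load(resp)
    failed = failed_task_ids(status)
    for task_id in failed:
        req = urllib.request.Request(
            f"{base_url}/connectors/{name}/tasks/{task_id}/restart",
            method="POST",
        )
        urllib.request.urlopen(req).close()
    return failed


# Hand-written sample in the shape returned by GET /connectors/{name}/status.
sample_status = {
    "name": "jdbc-sink-01",
    "connector": {"state": "RUNNING", "worker_id": "connect1:8083"},
    "tasks": [
        {"id": 0, "state": "RUNNING", "worker_id": "connect1:8083"},
        {"id": 1, "state": "FAILED", "worker_id": "connect2:8083"},
    ],
}
print(failed_task_ids(sample_status))
```

Calling restart_failed_tasks("jdbc-sink-01") requires a reachable Connect worker; the parsing helper can be exercised offline as shown.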

Scaling and Throughput Design

Sink parallelism is in principle limited by the partition count of the target topic, and tasks.max should be set at or below that. Sources depend on the partitionability of the target system. In distributed mode, adding workers raises total processing capacity as long as there are enough tasks to spread across them; a connector never runs more than tasks.max tasks.

To scale with standalone, you must launch multiple separate processes and clearly partition target topics and tables (with separated offset files). In contrast, with distributed mode you submit a single configuration via REST and the cluster automatically allocates tasks.

  • Sink: tasks.max ≤ total partition count
  • Source: design tasks.max to match the sharding unit on the input side
  • Adjust processing capacity by increasing or decreasing the number of workers (distributed mode)
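
The sink rule above reduces to a small calculation. This Python sketch assumes the connector starts tasks.max tasks and that any task beyond the partition count receives no partitions; the function name sink_parallelism is hypothetical.

```python
def sink_parallelism(tasks_max: int, partitions: int) -> tuple:
    """Return (active_tasks, idle_tasks) for a sink reading `partitions` partitions."""
    if tasks_max < 1 or partitions < 1:
        raise ValueError("tasks_max and partitions must be positive")
    active = min(tasks_max, partitions)
    return active, tasks_max - active


# tasks.max=6 against a 4-partition topic: two tasks would sit idle.
print(sink_parallelism(6, 4))

# tasks.max=4 against a 12-partition topic: all four tasks are busy,
# each handling several partitions.
print(sink_parallelism(4, 12))
```

Idle tasks still consume worker resources, which is why tasks.max at or below the partition count is the usual starting point.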

Sink connector configuration example (throughput tuning)

{
  "name": "s3-sink-raw",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "topics": "raw.events",
    "tasks.max": "6",
    "flush.size": "10000",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}
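
One way to submit a configuration like this repeatably is PUT /connectors/{name}/config, which creates the connector if it is missing and updates it otherwise. The Python sketch below only builds the request so it can be inspected before sending. Note that this endpoint takes the flat "config" object rather than the name/config wrapper. The host connect1:8083 and the helper build_put_config_request are assumptions.

```python
import json
import urllib.request


def build_put_config_request(name, config, base_url="http://connect1:8083"):
    """Build (but do not send) a PUT request that creates or updates a connector."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/connectors/{name}/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# The "config" object from the JSON example above, without the outer wrapper.
s3_sink_config = {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "topics": "raw.events",
    "tasks.max": "6",
    "flush.size": "10000",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
}

req = build_put_config_request("s3-sink-raw", s3_sink_config)
print(req.get_method(), req.full_url)
# To actually submit: urllib.request.urlopen(req)
```

Because the call is idempotent, the same script can be rerun after changing tasks.max or flush.size without first checking whether the connector exists.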

Operations: Monitoring, Logging, Security, and Upgrades

For monitoring, combining JMX metrics (per worker/connector/task) with the status topic is practical. For error handling, configure the Dead Letter Queue (DLQ) and log output appropriately.

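For security, configure the broker connection (SASL/SSL) in the worker properties: top-level settings cover the worker's own clients, and the producer.* / consumer.* prefixes cover the clients used by connector tasks. To override these per connector, use producer.override.* / consumer.override.* in the connector configuration, subject to the worker's connector.client.config.override.policy. Upgrades in distributed mode are normally rolling: stop, update, and rejoin workers one at a time.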

  • Collect connect-worker-metrics and connector-task-metrics via JMX
  • Use the DLQ (errors.deadletterqueue.topic.name, sink connectors only) together with error logging (errors.log.enable)
  • Also leverage connector pause/resume during rolling upgrades
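
The DLQ and logging properties named above can be layered onto an existing sink configuration programmatically. This is a minimal Python sketch; with_error_handling is a hypothetical helper, and the topic name connect-dlq is only an example.

```python
def with_error_handling(config: dict, dlq_topic: str) -> dict:
    """Return a copy of a sink connector config with DLQ and error logging enabled."""
    merged = dict(config)  # copy so the caller's dict is left untouched
    merged.update({
        "errors.tolerance": "all",
        "errors.log.enable": "true",
        "errors.deadletterqueue.topic.name": dlq_topic,
        "errors.deadletterqueue.context.headers.enable": "true",
    })
    return merged


base = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "orders",
    "tasks.max": "4",
}
hardened = with_error_handling(base, "connect-dlq")
print(hardened["errors.deadletterqueue.topic.name"])
```

Setting errors.tolerance=all means bad records are skipped rather than failing the task, so the DLQ topic should be monitored; otherwise data loss can go unnoticed.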

Representative operational configuration snippets

# Enable JMX (startup script, etc.)
# Authentication and SSL are disabled here, so restrict this to trusted networks
export KAFKA_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"

# DLQ and error control (connector-side settings; the DLQ applies to sink connectors)
errors.tolerance=all
errors.log.enable=true
errors.deadletterqueue.topic.name=connect-dlq
errors.deadletterqueue.context.headers.enable=true

# Broker connection security example (worker common)
# These top-level settings cover the worker's own clients; repeat them with
# producer. and consumer. prefixes for the clients used by connector tasks
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user" password="pass";

Exam Prep Key Points (CCDAK / CCAAK)

Mode differences and the roles of internal topics, offset storage location, and scaling constraints (the relationship between tasks.max and partition count) are frequently asked. Operations via REST, DLQ and error control, and the feasibility of rolling upgrades are also commonly tested.

Being able to correctly explain the terminology distinctions (worker / connector / task) and the mechanism by which configuration is shared in distributed mode (config.storage.topic) is a good indicator of exam readiness.

  • Only distributed mode uses internal topics (config/offset/status)
  • Standalone keeps source offsets in a local file (offset.storage.file.filename). Availability depends on the single worker
  • Sink parallelism is at most the partition count; sources depend on input-side partitionability
  • Configuration submission, updates, restart, and pause/resume are possible via the REST API
  • In production, replication factor ≥3, and internal topics use compaction

Quick reference of key properties to remember

# Distributed mode required
group.id=connect-cluster-1
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status

# Standalone-specific
offset.storage.file.filename=/var/lib/kafka-connect/connect.offsets

# Parallelism and flush
tasks.max=4
offset.flush.interval.ms=10000

Check with a Question

CCDAK / CCAAK

Question 1

You want to maintain high availability in production while progressively increasing processing capacity in the future. Which Kafka Connect execution mode should you choose, and where is the configuration kept?

  1. Standalone. Configuration is in local files on each node
  2. Standalone. Configuration is stored in ZooKeeper
  3. Distributed mode. Configuration is in a Kafka internal topic (config.storage.topic)
  4. Distributed mode. Configuration is only in each worker's memory

Correct answer: 3

If high availability and scale-out are requirements, distributed mode is recommended. In distributed mode, connector configuration is stored in a Kafka internal topic (config.storage.topic) and shared between workers. Standalone is managed in local files and is not HA.

Frequently Asked Questions

Can standalone mode not be used in production?

It depends on the requirements, but since it is a single process with no failover and configuration and offsets are managed locally, it is unsuitable for general always-on production use from an availability and maintainability standpoint. It is effective for short-term batches, PoCs, and development verification.

What should I watch for when manually creating internal topics?

Always set cleanup.policy=compact, and a replication factor of 3 or more is recommended in production. The config topic must have exactly one partition; for the other two, partition counts depend on workload, with status=5 and offsets=25 as common starting points. If automatic creation is disabled on the cluster, it is safer to create them with proper settings before connecting.

What is the difference between tasks and connectors?

A connector is the job definition (target system, topics, and various settings), while a task is the unit inside a worker that executes that job in parallel. tasks.max determines the upper limit on the number of tasks that can be created, and in distributed mode these tasks are distributed across the workers in the cluster.
