CQRS is a pattern that separately designs writes (commands) and reads (queries) to improve scalability and changeability. Kafka provides high-throughput logs, strict in-partition ordering, compaction, and transactions — all the building blocks you need to assemble CQRS in practice.
This article focuses on the official features most commonly tested on CCDAK — topic compaction, Idempotent/Transactional Producer, Kafka Streams/KTable, processing.guarantee, isolation.level — and walks through concrete design choices, configurations, and pitfalls.
CQRS separates the write model (state-transition logic) from the read model (query-optimized views). On Kafka, the basic approach is to split topics that receive commands (or events) from views optimized for reads (compacted topics or external stores).
A Streams app aggregates commands and materializes them into a KTable, and the query layer reads from that KTable or the view topic. Assume eventual consistency, then use key design and partition design to draw clear consistency boundaries.
| Design Style | Write / Read Model | Consistency / Latency | Complexity |
|---|---|---|---|
| Single CRUD Model | Same schema for both | Strong consistency / low latency, but limited scalability | Low |
| CQRS | Separated (writes normalized, reads optimized) | Eventual consistency / introduces latency | Medium |
| Event Sourcing + CQRS | Rebuild state from events + views | Eventual consistency / flexible but heavy design load | High |
Big picture of CQRS on Kafka
Sample logical names and roles (design notes)
commands.orders # 書き込み用(append-only, retentionベース)
views.orders # 読み取り用(コンパクション, 最新状態)
internal.orders-agg # Streamsのchangelog/repartition(内部トピック)Design the command topic as an immutable append-only log of facts, with the key fixed to the aggregate ID (e.g. orderId). Enable compaction on the read topic so the latest state is easy to retrieve. This lets KTables and external sinks rebuild the latest snapshot efficiently.
Pick the partition count based on write throughput and parallelism. Ordering is guaranteed per partition, so design keys at the granularity where ordering matters. View topics overwrite by key, so values should ideally be a complete snapshot (source of truth). If you must use delta updates, manage keys and schema evolution carefully.
| Topic Type | Key Settings | Purpose / Caveats |
|---|---|---|
| commands.* | cleanup.policy=delete, min.insync.replicas>=2 | Append-only log. Order holds only within a partition. Set retention based on business needs. |
| views.* | cleanup.policy=compact (or compact,delete), min.compaction.lag.ms | Maintains the latest state. Deletes are expressed with tombstones. |
| internal.* | Auto-managed by Streams; replication.factor>=3 recommended | Involved in failure recovery and EOS. Capacity and performance monitoring are essential. |
Topic separation, visualized
Topic creation examples (kafka-topics)
kafka-topics --create --topic commands.orders --partitions 12 --replication-factor 3 \
--config cleanup.policy=delete --config min.insync.replicas=2
kafka-topics --create --topic views.orders --partitions 12 --replication-factor 3 \
--config cleanup.policy=compact --config min.cleanable.dirty.ratio=0.5Producer idempotence suppresses duplicates from retries on the write path. Use it together with acks=all and an appropriate in-flight setting. This mitigates the duplicate keys that at-least-once delivery tends to introduce.
Use transactions to atomically publish across multiple topics (e.g. updates within the same boundary to commands.* and views.*, or committing processed offsets). In Kafka Streams, enabling processing.guarantee=exactly_once_v2 uses transactions and idempotent writes under the hood.
| Delivery Guarantee / Feature | Characteristics | Overhead / Where to Use |
|---|---|---|
| At-least-once | Duplicates are possible | Low / fits most workloads. Requires downstream dedup. |
| Idempotent Producer | Suppresses duplicates within the same session | Medium / robust against key overwrites |
| Transactions(EOS) | Atomic publish across multiple outputs + duplicate suppression | Medium-high / view updates and exactly-once processing |
How Idempotence and Transactions apply
Producer(retries) -> [commands.orders]
| enable.idempotence=true | acks=all
+--> Duplicates suppressed (per partition sequence)
[Transactional boundary]
begin -> send commands + send processed offsets -> commitJava Producer configuration example (Idempotence / Transactions)
Properties p = new Properties();
p.put("bootstrap.servers", "broker:9092");
p.put("acks", "all");
p.put("enable.idempotence", "true");
p.put("max.in.flight.requests.per.connection", "5");
p.put("retries", Integer.toString(Integer.MAX_VALUE));
p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
p.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
// トランザクションを使う場合
p.put("transactional.id", "orders-writer-001");
KafkaProducer<String, byte[]> producer = new KafkaProducer<>(p);
producer.initTransactions();
producer.beginTransaction();
producer.send(new ProducerRecord<>("commands.orders", orderId, payload));
// 必要なら追加のトピックにも送信
producer.commitTransaction();A KTable is a logical table that holds the latest value per key, backed by a changelog topic for fault tolerance. When aggregating command events into the latest state, build a KTable with groupByKey → aggregate or reduce, and write it to views.* as needed.
Enabling exactly_once_v2 makes outputs, internal state updates, and consumer offset commits atomic. For joins and groupBy operations that trigger repartitioning, internal topic configuration and capacity monitoring matter most.
| Type | Primary Use | Caveats |
|---|---|---|
| KStream | Per-event processing, pre-aggregation stage | Order holds only within a partition. Reshuffling is often required. |
| KTable | Holding and querying the latest state | Watch out for changelog size and recovery time. |
| GlobalKTable | Joins against lookup data | Memory and disk footprint grow. |
Materialization with Streams
Kafka Streams example (Exactly-once + KTable output)
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-aggregate-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());
StreamsBuilder b = new StreamsBuilder();
KStream<String, OrderEvent> events = b.stream("commands.orders");
KGroupedStream<String, OrderEvent> g = events.groupByKey();
KTable<String, OrderState> state = g.aggregate(
() -> OrderState.empty(),
(key, ev, agg) -> agg.apply(ev),
Materialized.<String, OrderState, KeyValueStore<Bytes, byte[]>>as("orders-state-store")
);
// ビュー用トピックへ最新状態を出力
state.toStream().to("views.orders", Produced.with(Serdes.String(), orderStateSerde));
KafkaStreams streams = new KafkaStreams(b.build(), props);
streams.start();There are two broad options for reads: 1) pull Streams local state (Interactive Queries) over REST, or 2) sink views.* into an external store and query it. The first option gives minimal latency but requires routing and discovery; the second is easier to operate but adds consistency lag and delay.
With key-based routing you can directly query the task (including hot standbys) that owns a given key. For external sinks, use the official Kafka Connect sink connectors (JDBC/Elasticsearch/Redis, etc.) and enable read_committed to skip uncommitted and aborted transactions.
| Query Approach | Consistency / Latency | Operational Cost | Where It Fits |
|---|---|---|---|
| Interactive Queries | Minimal latency / in-process consistency | Medium (routing / HA) | Low-latency APIs |
| Connect Sink → RDB / search platform | Eventual consistency / latency + retries | Low to medium (managed offerings available) | Complex search / reporting |
| Consume views.* directly | Medium (depends on compaction) | Low (simple consumption) | Simple key-value lookups |
Query path options
Client -> REST(Query Service) -> Streams Instance -> [State Store]
\
-> [views.orders] -> Connect Sink -> [External DB]Kafka Connect Sink (Elasticsearch example)
{
"name": "views-orders-es",
"config": {
"connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
"topics": "views.orders",
"connection.url": "http://es:9200",
"type.name": "_doc",
"key.ignore": false,
"behavior.on.malformed.documents": "ignore",
"write.method": "upsert",
"errors.tolerance": "all",
"consumer.override.isolation.level": "read_committed"
}
}CQRS + Kafka is fundamentally eventually consistent. Ordering is only guaranteed within a partition, so align aggregate boundaries with keys and partitions. Bake materialization lag into your SLOs, and design watermarks or timeouts where needed.
When transactions are in play, only consumers and connectors configured with read_committed see committed data. In compacted views, express logical deletes with tombstones and verify there are no gaps when rebuilding snapshots. Use backward-compatible schema evolution as the default, and maintain Schema Registry compatibility rules.
| Challenge | Mitigation | Related Kafka Features |
|---|---|---|
| Ordering and duplicates | Key design / Idempotence / EOS | Idempotent Producer, Transactions |
| Recovery time | Tune compaction / use standby replicas | KTable changelog, standby replicas |
| Read consistency | read_committed / separate views | Consumer isolation, compacted topics |
Timing of eventual consistency (conceptual)
Consistency-related settings for Consumer / Streams
# Consumerがトランザクション整合を尊重
isolation.level=read_committed
# StreamsのExactly-once
processing.guarantee=exactly_once_v2
# コンパクションを前提にする場合のビュー消費(例)
auto.offset.reset=earliestCCDAK
問題 1
You're implementing CQRS on Kafka for an order service. Commands are written to commands.orders, and reads must return the latest state with low latency. To avoid duplicate inserts and to strengthen view freshness and atomicity, which combination is most appropriate?
正解: A
Compact is the right policy for views, since they must hold the latest state. Streams' exactly_once_v2 provides atomicity across outputs, state updates, and offset commits. Idempotent Producer further suppresses duplicates, and read_committed skips uncommitted data. B, C, and D are all weak on consistency or duplicates.
Is Event Sourcing required? How is it related to CQRS?
It is not required. CQRS is a pattern that separates writes and reads; deciding whether the event log itself is the only source of truth (ES) is a separate choice. With Kafka you can persist either commands or events in the log and build views with Streams. Adopting ES gives you richer replay and history, but it adds design and operational complexity.
Can a single topic serve both commands and views?
Not recommended. Write-optimized (append-only, delete policy) and read-optimized (latest state, compact) topics have different requirements and configurations. It is safer to separate topics by role and let each schema evolve independently for its specific purpose.
Does ordering hold when partitions increase? What should I watch out for when scaling?
Ordering is only guaranteed within a single partition. Fix the key at the granularity where order matters (e.g. orderId) and keep the key function immutable to avoid reshuffling. In Streams, internal repartition topics can grow, so monitor replication.factor and cleanup, and consider enabling standby replicas.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
Kafka Topics & Partitions: Distribution Fundamentals (2026)
How Kafka topics and partitions enable scale — ordering guar...
CCDAK Exam Guide: Confluent Certified Developer (2026)
Complete prep for the CCDAK exam — Producer/Consumer API, St...
CCAAK Exam Guide: Confluent Certified Administrator (2026)
Pass the CCAAK exam — cluster management, partitions, securi...
Kafka Replicas & ISR: Fault Tolerance Explained (2026)
Replica placement, in-sync replicas (ISR), leader election. ...
Kafka Offsets: Commit Modes & Consumer Position (2026)
Offset semantics — auto vs. manual commit, __consumer_offset...