This guide is a concrete study plan for the Apache Kafka and Confluent certifications (CCDAK/CCAAK), showing how to combine official documentation, practice exercises, and hands-on learning.
It emphasizes perspectives that translate directly to real-world work (design and operations decisions, configuration pitfalls), not just exam prep. Always cross-reference the official Kafka documentation, official Confluent documentation, and the certification pages for the latest specs and exam scope.
CCDAK centers on the developer perspective (producers/consumers, serialization, schema management, delivery guarantees, Kafka Streams/ksqlDB basics, Connect basics). CCAAK centers on the administrator perspective (broker/topic configuration, security, ACLs, replication, ISR, rebalancing, monitoring, capacity planning, maintenance).
Questions cover vendor-neutral core Kafka specifications, sometimes intertwined with Confluent products (Schema Registry, ksqlDB, Connect, Control Center, etc.). Use the official Kafka documentation as the primary source for fundamentals and the official Confluent documentation for product-specific topics. Check the Confluent certification pages for the most up-to-date exam scope.
The official Kafka docs are the primary source of truth. Start with the Configuration reference to confirm each setting's meaning, defaults, and interdependencies (e.g., enable.idempotence with acks/retries/max.in.flight). Then study Design/Operations/Security for the underlying design rationale and operations essentials.
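The interdependency mentioned above (enable.idempotence with acks, retries, and max.in.flight.requests.per.connection) can be captured as a small checker. This is a minimal sketch, not part of any Kafka client: the function name `idempotence_violations` is hypothetical, and the fallback values for missing keys are an assumption that mirrors the defaults of recent Java clients. Confirm the current defaults in the Configuration reference.

```python
def idempotence_violations(config: dict) -> list:
    """Return the reasons a producer config conflicts with enable.idempotence=true."""
    if str(config.get("enable.idempotence", "false")).lower() != "true":
        return []  # nothing to check when idempotence is not requested

    problems = []
    # Assumed fallbacks for missing keys: acks=all, retries=2147483647, max in flight=5.
    if str(config.get("acks", "all")) not in ("all", "-1"):
        problems.append("acks must be all (or -1)")
    if int(config.get("retries", 2147483647)) <= 0:
        problems.append("retries must be greater than 0")
    if int(config.get("max.in.flight.requests.per.connection", 5)) > 5:
        problems.append("max.in.flight.requests.per.connection must be <= 5")
    return problems


if __name__ == "__main__":
    risky = {"enable.idempotence": "true", "acks": "1", "retries": "0"}
    for reason in idempotence_violations(risky):
        print(reason)
```

Writing the rule as a function is one way to practice recalling settings as a combination rather than one by one.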
Use the official Confluent docs as the primary source for Schema Registry (compatibility levels, versioning, subject-naming operations that are easy to overlook), Connect (distributed operations, tasks/workers, error handling and DLQ), and ksqlDB (streams/tables, key materialization, processing guarantees).
Using a fixed reading template boosts efficiency. For each topic, take notes on What (what it does), When (when/why to use it), Defaults (default and recommended values), Trade-offs (performance, reliability, cost), and CLI (verification steps).
For the fastest learning path, start with a single local broker and expand to multi-broker as needed. KRaft is now the mainstream mode (Apache Kafka 4.0 removed ZooKeeper mode), but for early learning, a minimal ZooKeeper-based setup on an older image is enough to experience the core concepts (topics/partitions/offsets/replication behavior). For exam prep, it's safest to understand the terminology and roles of both.
Compose setups using Confluent's Docker images are fast to bring up and minimize environment quirks around CLI locations. Watch out for port conflicts and resource exhaustion. For learning, first verify behavior with replication factor 1, then move on to replication/ISR/failover exercises.
Component relationship diagram for local learning
Minimal Docker Compose example (for learning, ZooKeeper-based)
version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - '2181:2181'
  kafka:
    image: confluentinc/cp-kafka:7.6.1
    depends_on:
      - zookeeper
    ports:
      - '9092:9092'
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
# Start: docker compose up -d
# Stop: docker compose down
Working through delivery guarantees, partition design, schema management, and consumer group rebalancing by hand clarifies the rationale behind design decisions. Memorize configurations in combinations rather than individually, and use the CLI to confirm observable metrics (lag, ISR, leader, retention, compaction behavior).
For CCDAK-focused practice, work through key design → ordering → duplicates/loss → transactions → Schema Registry → Streams/ksqlDB in that order. For CCAAK-focused practice, prioritize replication and ISR, rebalancing strategies, ACLs, certificates, log segments/retention, and tools (kafka-topics, kafka-configs, kafka-reassign-partitions, etc.).
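The first two steps of the CCDAK sequence (key design, then ordering) rest on one idea: records with the same key land in the same partition, and ordering holds only within a partition. The sketch below illustrates this with a stand-in hash. It assumes `zlib.crc32` purely for illustration; Kafka's default partitioner hashes the serialized key with murmur2, so the partition numbers here will not match a real cluster. The helper name `partition_for` is hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration only (Kafka's default partitioner uses murmur2).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-1", "shipped"),
]

# Append each event to the log of the partition its key maps to.
partitions = {}
for key, value in events:
    partitions.setdefault(partition_for(key, 3), []).append((key, value))

# All events for one key sit in one partition, in the order they were produced.
order_1_log = [v for k, v in partitions[partition_for("order-1", 3)] if k == "order-1"]
print(order_1_log)
```

Because the partition is derived from the hash modulo the partition count, changing the number of partitions can move a key to a different partition, which is why the partition count is usually decided together with the key design.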
| Semantics | Main settings (Producer/Consumer/Broker) | Strengths | Caveats |
|---|---|---|---|
| At-most-once | Producer: acks=0 or retries=0. Consumer: enable.auto.commit=true (commit before processing) | Minimal latency, maximum throughput | Message loss can occur. Often appears on exams as the option that fails reliability requirements |
| At-least-once | Producer: acks=all, retries>0. Broker: min.insync.replicas>=2 (production). Consumer: enable.auto.commit=false, commitSync after processing | No loss (duplicates allowed). A realistic default | Duplicates can occur, so idempotency or deduplication is required downstream |
| Exactly-once | Producer: enable.idempotence=true, set transactional.id and use begin/commit. Consumer: isolation.level=read_committed. Streams: processing.guarantee=exactly_once_v2 | Processing guarantee with no loss and no duplicates | In exchange for the guarantee, latency and complexity increase. Full E2E including external systems requires transactional support on the sink side |
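The difference between the first two rows of the table comes down to whether the consumer commits before or after processing. The following is a toy simulation under stated assumptions, not Kafka client code: `run_consumer` is a hypothetical function, the "crash" always hits between the two steps, and the restart resumes from the last committed offset.

```python
def run_consumer(records, commit_first, crash_at):
    """Replay records through a consumer that crashes once at index crash_at.

    commit_first=True  -> commit the offset, then process (at-most-once).
    commit_first=False -> process, then commit the offset (at-least-once).
    """
    committed = 0     # offset the group would resume from after a restart
    processed = []    # side effects observed downstream
    crashed = False
    offset = 0
    while offset < len(records):
        # On the crashing pass only the first of the two steps completes.
        first_step_only = offset == crash_at and not crashed
        if commit_first:
            committed = offset + 1
            if not first_step_only:
                processed.append(records[offset])
        else:
            processed.append(records[offset])
            if not first_step_only:
                committed = offset + 1
        if first_step_only:
            crashed = True
            offset = committed  # restart from the committed position
        else:
            offset += 1
    return processed


print(run_consumer(["a", "b", "c"], commit_first=True, crash_at=1))   # ['a', 'c']
print(run_consumer(["a", "b", "c"], commit_first=False, crash_at=1))  # ['a', 'b', 'b', 'c']
```

The first run loses `b` (committed but never processed) and the second run processes `b` twice (processed but never committed), which matches the loss and duplicate caveats in the table.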
Example properties for experimenting with delivery guarantees
# producer.properties
bootstrap.servers=localhost:9092
enable.idempotence=true
acks=all
retries=2147483647
max.in.flight.requests.per.connection=5
transactional.id=txn-app-1
# consumer.properties
bootstrap.servers=localhost:9092
enable.auto.commit=false
isolation.level=read_committed
group.id=g1
# Example run (using the console tools)
# Producer (requires a transaction-capable client API; the console producer cannot issue begin/commit explicitly, so verifying in an application is recommended)
# kafka-console-producer --topic t1 --bootstrap-server localhost:9092 --producer.config producer.properties
# Consumer (commits are controlled by the application)
# kafka-console-consumer --topic t1 --bootstrap-server localhost:9092 --from-beginning --consumer.config consumer.properties
Configuration interdependencies and assumptions about defaults are common sources of mistakes. The following are representative examples that often trip people up on the exam and lead to incidents in production.
Practice the full loop including state verification via the CLI, and train yourself to ask which metric or output backs up each claim — this strengthens you on case-based questions.
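One interdependency worth rehearsing this way is acks together with min.insync.replicas: the broker enforces min.insync.replicas only for producers that send with acks=all. The sketch below models that rule in isolation. The function name `write_accepted` is hypothetical, and `isr_size` is assumed to count the leader as a member of the ISR.

```python
def write_accepted(acks: str, isr_size: int, min_insync_replicas: int) -> bool:
    """Model whether a produce request is accepted for a given ISR size."""
    if acks in ("all", "-1"):
        # With acks=all the write is rejected when the ISR has shrunk
        # below min.insync.replicas.
        return isr_size >= min_insync_replicas
    # acks=0 and acks=1 only need a live leader; min.insync.replicas is not enforced.
    return isr_size >= 1


if __name__ == "__main__":
    # Replication factor 3 with min.insync.replicas=2: one broker may fail.
    for isr in (3, 2, 1):
        print(isr, write_accepted("all", isr, 2))
```

On a real cluster, the ISR size backing this claim can be read from the topic description output, and a rejected write surfaces as a not-enough-replicas error on the producer.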
Representative examples of applying topic/broker settings (CLI)
# Set the minimum ISR (match your availability requirements)
# kafka-configs --bootstrap-server localhost:9092 --alter --topic orders --add-config min.insync.replicas=2
# Retention time (milliseconds)
# kafka-configs --bootstrap-server localhost:9092 --alter --topic logs --add-config retention.ms=604800000
# Switch to compaction
# kafka-configs --bootstrap-server localhost:9092 --alter --topic kv-store --add-config cleanup.policy=compact
# Verify the configuration
# kafka-configs --bootstrap-server localhost:9092 --describe --topic kv-store
With a focused sprint, you can complete a round of study in 4 weeks. Narrow the scope each week, and always cycle through read → run → update notes → quiz. For practice questions, annotate the source (the section of the official docs) and avoid relying on rote memorization.
On exam day, mentally underline the requirement keywords in each question (Are duplicates allowed? Is downtime allowed? Is ordering required per key or globally? What are the operational constraints?) and rigorously use process of elimination. Recall numbers and defaults paired with their trade-offs rather than memorizing them in isolation.
CCDAK / CCAAK
Question 1
You produce order events to a single topic, and the consumer side allows neither duplicates nor loss. Which configuration best meets these requirements using only standard Kafka features?
Correct answer: A
Exactly-once requires producer idempotence and transactions (transactional.id with begin/commit), and the consumer must use read_committed to exclude uncommitted records. acks=all is only part of the consistency story. Compaction is a storage policy, not a processing guarantee. Toggling auto-commit alone cannot prevent duplicates.
Should I learn Kafka with KRaft or ZooKeeper?
For both conceptual learning and exam prep, it's safer to understand the terminology and roles of both. ZooKeeper-based setups are fine for initial local learning, but KRaft has become the mainstream choice, so also study KRaft-specific terms (controller quorum, metadata log, etc.) in the official documentation.
How much do I need to know about Confluent-specific features (Schema Registry, ksqlDB, Connect)?
CCDAK frequently tests Schema Registry compatibility levels, serialization, and the basics of Connect/ksqlDB. CCAAK also emphasizes distributed Connect operations and security. Limit deep dives to chapters that match the exam scope, and focus on terminology and operational mental models for efficiency.
My practice exam scores aren't improving. Where should I start reviewing?
Check whether you're memorizing settings in isolation, and switch to explaining the cause-and-effect of setting sets (acks + min.insync.replicas, batch.size + linger.ms, retention.ms + retention.bytes, etc.). Always go back to the source sections of the official docs and reproduce them with the CLI or small experiments before committing them to memory.
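As one example of explaining a setting pair, the time and size limits for retention can be reasoned about together: a closed segment becomes eligible for deletion when either limit is exceeded. The sketch below is a deliberate simplification (the broker's actual size check works segment by segment and never deletes the active segment), and both function names are hypothetical. A value of -1 is treated as "no limit", matching the documented meaning of -1 for these settings.

```python
from datetime import timedelta


def retention_ms(days: int) -> int:
    """Convert days to the millisecond value expected by retention.ms."""
    return timedelta(days=days) // timedelta(milliseconds=1)


def oldest_segment_deletable(segment_age_ms, partition_bytes,
                             retention_ms_limit, retention_bytes_limit=-1):
    """Simplified model: either the time limit or the size limit triggers deletion."""
    too_old = retention_ms_limit >= 0 and segment_age_ms > retention_ms_limit
    too_big = retention_bytes_limit >= 0 and partition_bytes > retention_bytes_limit
    return too_old or too_big


if __name__ == "__main__":
    week = retention_ms(7)
    print(week)  # 604800000
    # A young segment is still deleted if the partition exceeds the size limit.
    print(oldest_segment_deletable(1_000, 2_000_000, week, 1_000_000))
```

Working the arithmetic once (7 days = 604800000 ms) also makes the value easier to recognize when it appears in a question.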
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.