Kafka Certification Study Guide: Official Docs, Practice, and Hands-On Learning (CCDAK/CCAAK)

2026-04-19
NicheeLab Editorial Team

This guide is a concrete study plan for the Apache Kafka and Confluent certifications (CCDAK/CCAAK), showing how to combine official documentation, practice exercises, and hands-on learning.

It emphasizes perspectives that translate directly to real-world work (design and operations decisions, configuration pitfalls), not just exam prep. Always cross-reference the official Kafka documentation, official Confluent documentation, and the certification pages for the latest specs and exam scope.

Exam Overview and Scope Mapping (CCDAK/CCAAK)

CCDAK centers on the developer perspective (producers/consumers, serialization, schema management, delivery guarantees, Kafka Streams/ksqlDB basics, Connect basics). CCAAK centers on the administrator perspective (broker/topic configuration, security, ACLs, replication, ISR, rebalancing, monitoring, capacity planning, maintenance).

Questions cover vendor-neutral core Kafka specifications, sometimes intertwined with Confluent products (Schema Registry, ksqlDB, Connect, Control Center, etc.). Use the official Kafka documentation as the primary source for fundamentals and the official Confluent documentation for product-specific topics. Check the Confluent certification pages for the most up-to-date exam scope.

  • CCDAK pillars: delivery semantics (at-most/at-least/exactly once), keys and partitions, ordering, serialization and Schema Registry, transactions, Streams/ksqlDB basics, and Connect concepts
  • CCAAK pillars: replication factor/ISR/failover, alignment between min.insync.replicas and acks, rebalancing (range, cooperative-sticky, etc.), security (SSL/SASL/ACL), operations CLI, log segments/retention, and monitoring metrics
  • Prioritize official documentation: Configuration/Design/Operations/Security/Clients on kafka.apache.org, and Schema Registry/Connect/ksqlDB/security/tooling on docs.confluent.io
  • Loop between exam and practice: learn each setting's meaning, defaults, trade-offs, and the CLI commands to verify it, all in one pass

How to Read the Official Documentation, in Priority Order

The official Kafka docs are the primary source of truth. Start with the Configuration reference to confirm each setting's meaning, defaults, and interdependencies (e.g., enable.idempotence with acks/retries/max.in.flight). Then study Design/Operations/Security for the underlying design rationale and operations essentials.

Use the official Confluent docs as the primary source for Schema Registry (compatibility levels, versioning, subject-naming operations that are easy to overlook), Connect (distributed operations, tasks/workers, error handling and DLQ), and ksqlDB (streams/tables, key materialization, processing guarantees).

Settling on a fixed note-taking template makes reading more efficient. For each topic, take notes on What (what it does), When (when/why to use it), Defaults (default and recommended values), Trade-offs (performance, reliability, cost), and CLI (verification steps).

  • Don't memorize settings in isolation — learn them in sets (acks/min.insync.replicas, retention.ms/bytes, batch.size/linger.ms, etc.)
  • For Streams/ksqlDB, tie processing guarantees (processing.guarantee) to how state is stored and restored (RocksDB state stores backed by changelog topics)
  • For Schema Registry, work through examples to confirm the differences between compatibility levels (BACKWARD/FORWARD/FULL/NONE). Subject-naming strategies (TopicNameStrategy, etc.) are also key
  • For security, understand the TLS handshake, SASL variants, and ACL evaluation order with diagrams before touching the configuration
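
To make the compatibility levels concrete, here is a minimal sketch in Python. It assumes a deliberately simplified, Avro-like model in which a schema is just a set of fields that either do or do not declare a default. The names `can_read` and `is_compatible` are hypothetical helpers, not part of any Schema Registry API, and real compatibility checks also cover type changes and transitive variants.

```python
from typing import Dict

# Toy schema model (an assumption for illustration): field name -> has a default value?
Schema = Dict[str, bool]


def can_read(reader: Schema, writer: Schema) -> bool:
    """True if every field the reader expects but the writer lacks has a default."""
    return all(has_default for name, has_default in reader.items() if name not in writer)


def is_compatible(level: str, old: Schema, new: Schema) -> bool:
    """Check a new schema against the previous version under one compatibility level."""
    backward = can_read(reader=new, writer=old)  # consumers on the new schema read old data
    forward = can_read(reader=old, writer=new)   # consumers on the old schema read new data
    return {
        "BACKWARD": backward,
        "FORWARD": forward,
        "FULL": backward and forward,
        "NONE": True,
    }[level]


v1 = {"id": False, "amount": False}
changes = {
    "add optional field": {"id": False, "amount": False, "note": True},
    "add required field": {"id": False, "amount": False, "note": False},
    "drop required field": {"id": False},
}

for label, v2 in changes.items():
    verdicts = {lvl: is_compatible(lvl, v1, v2) for lvl in ("BACKWARD", "FORWARD", "FULL", "NONE")}
    print(f"{label}: {verdicts}")
```

In this toy model, adding a field with a default passes every level, adding a field without a default fails BACKWARD, and dropping a field without a default fails FORWARD. Reproduce the same three changes against a real Schema Registry to confirm how it behaves.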

Building a Hands-On Environment (Locally, Within Safe Bounds)

For the fastest learning path, start with a single local broker and expand to multi-broker as needed. KRaft mode is now the mainstream, and Apache Kafka 4.0 removed ZooKeeper mode, but for early learning a minimal ZooKeeper-based setup on an older release is enough to experience the core concepts (topics/partitions/offsets/replication behavior). For exam prep, it's safest to understand the terminology and roles of both.

Compose setups using Confluent's Docker images are fast to bring up and minimize environment quirks around CLI locations. Watch out for port conflicts and resource exhaustion. For learning, first verify behavior with replication factor 1, then move on to replication/ISR/failover exercises.

  • Day 1: With a single broker, create topics, produce/consume, and confirm the ordering differences between keyed and non-keyed messages
  • Day 2: Increase the replication factor and confirm leader migration and ISR changes by stopping and restarting brokers
  • Day 3: Observe cleaner/retention (time/size) and log segment behavior, and confirm the differences with cleanup.policy=compact
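
Before running the Day 1 exercise, it can help to see why keyed messages keep their per-key order. The sketch below is a simplified model, not the real client. It uses CRC32 as a stand-in hash (Kafka's default partitioner uses murmur2) and plain round-robin for records without a key (newer clients instead stick to one partition per batch). `partition_for_key` and `route` are hypothetical names.

```python
import zlib
from collections import defaultdict
from itertools import cycle

NUM_PARTITIONS = 3


def partition_for_key(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash for illustration only: Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, num_partitions: int = NUM_PARTITIONS) -> dict:
    """events: list of (key or None, value). Returns {partition: [values in append order]}."""
    log = defaultdict(list)
    round_robin = cycle(range(num_partitions))  # simplification for records without a key
    for key, value in events:
        if key is not None:
            partition = partition_for_key(key, num_partitions)
        else:
            partition = next(round_robin)
        log[partition].append(value)
    return dict(log)


keyed = [
    ("order-1", "order-1:created"),
    ("order-2", "order-2:created"),
    ("order-1", "order-1:paid"),
    ("order-1", "order-1:shipped"),
]
unkeyed = [(None, "a"), (None, "b"), (None, "c")]

print("keyed:  ", route(keyed))
print("unkeyed:", route(unkeyed))
```

Every record with the key order-1 lands in one partition in send order, while the records without a key are spread across partitions, so no ordering between them is guaranteed once several consumers read in parallel.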

Component relationship diagram for local learning

[Diagram: App P1, App P2, Kafka Broker, Schema Registry, Consumers, Connect, External Systems, ksql]

Minimal Docker Compose example (for learning, ZooKeeper-based)

version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - '2181:2181'
  kafka:
    image: confluentinc/cp-kafka:7.6.1
    depends_on:
      - zookeeper
    ports:
      - '9092:9092'
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
# Start: docker compose up -d
# Stop: docker compose down

Key Hands-On Challenges and Scoring Points

Running delivery guarantees, partition design, schema management, and consumer group rebalancing by hand clarifies the rationale behind design decisions. Memorize configurations in combinations rather than individually, and use the CLI to confirm observable metrics (lag, ISR, leader, retention, compaction behavior).

For CCDAK-focused practice, work through key design → ordering → duplicates/loss → transactions → Schema Registry → Streams/ksqlDB in that order. For CCAAK-focused practice, prioritize replication and ISR, rebalancing strategies, ACLs, certificates, log segments/retention, and tools (kafka-topics, kafka-configs, kafka-reassign-partitions, etc.).

  • Topic design: be able to explain the alignment between partition count, replication.factor, and min.insync.replicas
  • Delivery guarantees: be able to compare at-most/at-least/exactly once by their settings and behavior
  • Schema compatibility: verify BACKWARD/FORWARD/FULL/NONE with small schema changes
  • Rebalancing: experience the characteristics of cooperative-sticky (minimal movement, reduced disruption) by adding and removing consumers
  • Security: follow the shortest path from PLAINTEXT to TLS, then add ACLs and verify them

Semantics | Main settings (Producer/Consumer/Broker) | Strengths | Caveats
At-most-once | Producer: acks=0 or retries=0. Consumer: enable.auto.commit=true (commit before processing) | Minimal latency, maximum throughput | Message loss can occur. Often appears on exams as the option that fails reliability requirements
At-least-once | Producer: acks=all, retries>0. Broker: min.insync.replicas>=2 (production). Consumer: enable.auto.commit=false, commitSync after processing | No loss (duplicates allowed). A realistic default | Duplicates can occur, so idempotency or deduplication is required downstream
Exactly-once | Producer: enable.idempotence=true, set transactional.id and use begin/commit. Consumer: isolation.level=read_committed. Streams: processing.guarantee=exactly_once_v2 | Processing guarantee with no loss and no duplicates | In exchange for the guarantee, latency and complexity increase. Full E2E including external systems requires transactional support on the sink side
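
The consumer-side difference between at-most-once and at-least-once comes down to whether the offset is committed before or after processing. The following is a small simulation of one crash and restart, assuming a single partition and a single consumer. `consume_with_crash` is a hypothetical helper and does not use any Kafka client.

```python
from typing import List, Optional


def consume_with_crash(records: List[str], commit_first: bool, crash_at: int) -> List[str]:
    """Process records, crash once at offset crash_at, restart, and return what was processed."""
    processed: List[str] = []
    committed = 0  # next offset to read, as stored in the group's committed offsets

    def run(start: int, crash_offset: Optional[int]) -> None:
        nonlocal committed
        for offset in range(start, len(records)):
            if commit_first:
                committed = offset + 1             # commit, then process
                if offset == crash_offset:
                    return                         # crash after commit: the record is lost
                processed.append(records[offset])
            else:
                processed.append(records[offset])  # process, then commit
                if offset == crash_offset:
                    return                         # crash before commit: the record is replayed
                committed = offset + 1

    run(0, crash_at)      # first run crashes at crash_at
    run(committed, None)  # the restart resumes from the last committed offset
    return processed


records = ["a", "b", "c"]
print("at-most-once: ", consume_with_crash(records, commit_first=True, crash_at=1))
print("at-least-once:", consume_with_crash(records, commit_first=False, crash_at=1))
```

Committing first loses the record that was in flight during the crash, while committing after processing replays it, which is why at-least-once pipelines need idempotent or deduplicating consumers downstream.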

Example properties for experimenting with delivery guarantees

# producer.properties
bootstrap.servers=localhost:9092
enable.idempotence=true
acks=all
retries=2147483647
max.in.flight.requests.per.connection=5
transactional.id=txn-app-1

# consumer.properties
bootstrap.servers=localhost:9092
enable.auto.commit=false
isolation.level=read_committed
group.id=g1

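# Example run (when using the console tools)
# Producer: transactions require a transaction-capable client API. The console producer cannot issue explicit begin/commit, so verifying with an application implementation is recommended.
# Consumer: commits are controlled by the application.
# kafka-console-producer --topic t1 --bootstrap-server localhost:9092 --producer.config producer.properties
# kafka-console-consumer --topic t1 --bootstrap-server localhost:9092 --from-beginning --consumer.config consumer.properties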
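
The producer settings above only work as a set. As a study aid, the sketch below checks a properties file for combinations that conflict with idempotence. It is a hypothetical checker (`parse_properties` and `idempotence_conflicts` are illustrative names), and the fallback defaults it assumes (enable.idempotence=true, acks=all) are those of recent client versions. The Kafka client performs its own validation, so treat this only as a way to rehearse the rules.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def idempotence_conflicts(config: dict) -> list:
    """Return human-readable conflicts between idempotence and related producer settings."""
    problems = []
    idempotent = config.get("enable.idempotence", "true").lower() == "true"
    if not idempotent:
        if "transactional.id" in config:
            problems.append("transactional.id requires enable.idempotence=true")
        return problems
    if config.get("acks", "all") not in ("all", "-1"):
        problems.append("acks must be all (-1)")
    if int(config.get("retries", "2147483647")) <= 0:
        problems.append("retries must be greater than 0")
    if int(config.get("max.in.flight.requests.per.connection", "5")) > 5:
        problems.append("max.in.flight.requests.per.connection must be 5 or less")
    return problems


PRODUCER_PROPERTIES = """
# producer.properties
bootstrap.servers=localhost:9092
enable.idempotence=true
acks=all
retries=2147483647
max.in.flight.requests.per.connection=5
transactional.id=txn-app-1
"""

config = parse_properties(PRODUCER_PROPERTIES)
print(idempotence_conflicts(config))                 # the settings above are consistent
print(idempotence_conflicts({**config, "acks": "1"}))
```

Changing one value at a time (acks=1, retries=0, max.in.flight.requests.per.connection=6) and predicting the result before running it is a quick way to memorize the set rather than the individual settings.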

Configuration and Design Pitfalls Checklist (Frequently Tested)

Configuration interdependencies and assumptions about defaults are common sources of mistakes. The following are representative examples that often trip people up on the exam and lead to incidents in production.

Practice the full loop including state verification via the CLI, and train yourself to ask which metric or output backs up each claim — this strengthens you on case-based questions.

  • acks=all and min.insync.replicas come as a pair. Confirm whether produce should fail during broker outages and whether behavior matches requirements
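  • Priority between retention.ms and retention.bytes: whichever limit is reached first triggers deletion (retention.bytes applies per partition). Compaction (cleanup.policy=compact) keeps the latest value per key, but be careful with tombstone handling (a record with a null value marks its key for removal)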
  • Mismatches among message.max.bytes, fetch.max.bytes, and replica.fetch.max.bytes can cause large messages to stall
  • max.in.flight.requests.per.connection and idempotence: with enable.idempotence=true the value must be 5 or less, and ordering is preserved within that bound. Without idempotence, retries with a value above 1 can reorder messages
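  • auto.offset.reset applies only when the group has no committed offset for a partition (for example, the first startup of a new group) or when the committed offset is out of range. Changing it later does not alter existing commits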
  • Use static membership (group.instance.id) to suppress unnecessary rebalances when members stop and come back within session.timeout.ms
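
The acks and min.insync.replicas pairing from the first item can be rehearsed with a tiny decision function. This is a simplified model under stated assumptions: replication factor 3, every live replica counted as in-sync, and no unclean leader election. `produce_outcome` is a hypothetical helper, not broker code.

```python
REPLICATION_FACTOR = 3


def produce_outcome(acks: str, isr_size: int, min_insync_replicas: int) -> str:
    """Model whether a produce request to one partition is accepted."""
    if isr_size < 1:
        return "rejected (no leader)"  # simplification: the partition is offline
    if acks in ("all", "-1") and isr_size < min_insync_replicas:
        # min.insync.replicas is enforced only for acks=all
        return "rejected (NotEnoughReplicas)"
    return "accepted"


for brokers_down in range(REPLICATION_FACTOR + 1):
    isr = REPLICATION_FACTOR - brokers_down  # assumes every live replica is in sync
    for acks in ("all", "1"):
        outcome = produce_outcome(acks, isr, min_insync_replicas=2)
        print(f"brokers down={brokers_down} isr={isr} acks={acks}: {outcome}")
```

With min.insync.replicas=2, one broker can fail without affecting producers, but a second failure makes acks=all writes fail while acks=1 writes are still accepted with weaker durability. Whether that failure is desirable depends on the stated requirement.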

Representative examples of applying topic/broker settings (CLI)

# Set the minimum ISR (match your availability requirements)
# kafka-configs --bootstrap-server localhost:9092 --alter --topic orders --add-config min.insync.replicas=2

# Retention time (milliseconds)
# kafka-configs --bootstrap-server localhost:9092 --alter --topic logs --add-config retention.ms=604800000

# Switch to compaction
# kafka-configs --bootstrap-server localhost:9092 --alter --topic kv-store --add-config cleanup.policy=compact

# Verify the configuration
# kafka-configs --bootstrap-server localhost:9092 --describe --topic kv-store
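
Two details in these commands are easy to check offline: the retention value 604800000 ms is seven days, and compaction keeps only the latest record per key. The sketch below illustrates both with a toy in-memory log. `days_to_ms` and `compact` are hypothetical helpers. Real compaction runs in the background on closed segments and retains tombstones for a configurable period, so the `drop_tombstones` flag only approximates the before and after states.

```python
from datetime import timedelta
from typing import List, Optional, Tuple


def days_to_ms(days: int) -> int:
    """Convert a retention period in days to the milliseconds value used by retention.ms."""
    return int(timedelta(days=days).total_seconds() * 1000)


def compact(log: List[Tuple[str, Optional[str]]], drop_tombstones: bool = False):
    """Keep the last record per key, preserving offsets; a value of None is a tombstone."""
    last_offset = {key: offset for offset, (key, _value) in enumerate(log)}
    kept = []
    for offset, (key, value) in enumerate(log):
        if last_offset[key] != offset:
            continue  # superseded by a later record with the same key
        if value is None and drop_tombstones:
            continue  # tombstone removed after its retention period
        kept.append((offset, key, value))
    return kept


log = [("k1", "a"), ("k2", "b"), ("k1", "c"), ("k2", None), ("k3", "d")]

print(days_to_ms(7))
print(compact(log))
print(compact(log, drop_tombstones=True))
```

Offsets are never renumbered by compaction, which is why the surviving records keep their original positions and consumers may see gaps.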

Study Plan (4-Week Example) and Exam Day Strategy

With a focused sprint, you can complete a round of study in 4 weeks. Narrow the scope each week, and always cycle through read → run → update notes → quiz. For practice questions, annotate the source (the section of the official docs) and avoid relying on rote memorization.

On exam day, mentally underline the requirement keywords in each question (Are duplicates allowed? Is downtime allowed? Is ordering required per key or globally? What are the operational constraints?) and rigorously use process of elimination. Recall numbers and defaults paired with their trade-offs rather than memorizing them in isolation.

  • Week 1: Core concepts and CLI (topics/partitions/offsets, definitions of delivery guarantees, Schema Registry intro)
  • Week 2: Design and configuration (acks/ISR, retention, batching, key design, Connect/ksqlDB concepts)
  • Week 3: Operations (replication/failover, rebalancing, ACL/TLS, monitoring metrics)
  • Week 4: Integrated practice and weak-spot reinforcement (mock exams, log analysis, fault injection and recovery procedures)
  • Right before the exam: review your own cheat sheet (setting sets, CLI commands, common design decisions)

Check with a Sample Question

CCDAK / CCAAK

Question 1

You produce order events to a single topic, and the consumer side allows neither duplicates nor loss. Which configuration best meets these requirements using only standard Kafka features?

  1. On the producer, set enable.idempotence=true, acks=all, and transactional.id and use transactions. On the consumer, set isolation.level=read_committed
  2. On the producer, set acks=1 and retries=0. Leave the consumer at defaults
  3. Set cleanup.policy=compact to enable compaction
  4. Just set enable.auto.commit=false on the consumer to achieve exactly-once

Correct answer: 1

Exactly-once requires producer idempotence and transactions (transactional.id with begin/commit), and the consumer must use read_committed to exclude uncommitted records. acks=all is only part of the consistency story. Compaction is a storage policy, not a processing guarantee. Toggling auto-commit alone cannot prevent duplicates.
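
The effect of isolation.level can be illustrated with a toy log that mixes transactional records and commit/abort markers. This is a simplified model, not the consumer implementation. `visible_records` is a hypothetical function, and the tuple layout (kind, transaction, value) is an assumption made for this sketch. It also models the rule that a read_committed consumer does not read past the first record of a still-open transaction.

```python
from typing import List, Optional, Tuple

Entry = Tuple[str, Optional[str], Optional[str]]  # (kind, transaction id, value)


def visible_records(log: List[Entry], isolation_level: str) -> List[str]:
    """Return the values a consumer would see under the given isolation level."""
    if isolation_level == "read_uncommitted":
        return [value for kind, _txn, value in log if kind == "data"]

    # Outcome of each finished transaction, taken from its marker.
    outcome = {txn: kind for kind, txn, _value in log if kind in ("commit", "abort")}
    result = []
    for kind, txn, value in log:
        if kind != "data":
            continue
        if txn is None:
            result.append(value)   # non-transactional record
        elif txn not in outcome:
            break                  # open transaction: nothing past this point is readable yet
        elif outcome[txn] == "commit":
            result.append(value)   # committed transactional record
        # records from aborted transactions are skipped
    return result


log = [
    ("data", None, "n1"),
    ("data", "t1", "a1"),
    ("data", "t2", "b1"),
    ("commit", "t1", None),
    ("abort", "t2", None),
    ("data", "t3", "c1"),   # t3 is still open
    ("data", None, "n2"),
]

print(visible_records(log, "read_uncommitted"))
print(visible_records(log, "read_committed"))
```

Under read_committed the aborted record b1 never appears, and n2 stays hidden until the open transaction t3 finishes, which is one source of the extra latency that comes with exactly-once.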

Frequently Asked Questions

Should I learn Kafka with KRaft or ZooKeeper?

For both conceptual learning and exam prep, it's safer to understand the terminology and roles of both. ZooKeeper-based setups are fine for initial local learning, but KRaft has become the mainstream choice, so also study KRaft-specific terms (controller quorum, metadata log, etc.) in the official documentation.

How much do I need to know about Confluent-specific features (Schema Registry, ksqlDB, Connect)?

CCDAK frequently tests Schema Registry compatibility levels, serialization, and the basics of Connect/ksqlDB. CCAAK also emphasizes distributed Connect operations and security. Limit deep dives to chapters that match the exam scope, and focus on terminology and operational mental models for efficiency.

My practice exam scores aren't improving. Where should I start reviewing?

Check whether you're memorizing settings in isolation, and switch to explaining the cause-and-effect of setting sets (acks + min.insync.replicas, batch.size + linger.ms, retention.ms + bytes, etc.). Always go back to the source sections of the official docs and reproduce them with the CLI or small experiments before committing them to memory.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

