Kafka certifications split broadly into the application-development-oriented CCDAK and the operations-oriented CCAAK. On top of that, design and operations perspectives from the managed Confluent Cloud environment reinforce understanding of both exams.
This article organizes the exam domains by tying them back to real-world decisions. It sticks to stable features and packages takeaways at the level of CLI commands, configuration values, and design principles.
CCDAK asks whether you can correctly move and process data with Kafka and manage its schemas. CCAAK asks whether you understand the configuration needed to design, secure, and operate clusters while meeting SLAs. Confluent Cloud demands the ability to project that knowledge onto a managed environment.
The foundation shared by both exams covers partitioning and replication, delivery guarantees (at-least-once / at-most-once / exactly-once), consumer groups, schema compatibility, and security (TLS / SASL / ACLs). On Confluent Cloud, those same concepts gain management elements such as RBAC, networking (private connectivity), and quotas.
| Item | CCDAK (Development) | CCAAK (Operations) |
|---|---|---|
| Target role | Application / data engineer | SRE / platform engineer |
| Main topics | Producer / Consumer / Streams, Schema Registry, delivery guarantees | Broker configuration, replication and disaster recovery, security, monitoring and tuning |
| Environment | Client perspective across both self-hosted and Confluent Cloud | Centered on self-hosted clusters, plus Cloud operations perspective (RBAC / networking) |
| Key concepts / settings | acks, idempotence, transactional.id, offset management, compatibility modes | min.insync.replicas, quotas, retention / compaction, SASL/TLS, ACL/RBAC |
| Common points of confusion | Exactly-once needs to be designed across the entire processing pipeline | acks=all is only part of the availability condition; it must be paired with ISR |
Kafka's basic data flow (the unit of granularity tested on the exam)
Topic creation and availability basics (acks and min.insync.replicas)
kafka-topics.sh --create --topic orders --partitions 6 --replication-factor 3 --bootstrap-server broker:9092
# Producer side
acks=all
enable.idempotence=true
retries=2147483647
max.in.flight.requests.per.connection=5
# Broker / topic side (limits data loss as long as the ISR is maintained)
min.insync.replicas=2
At-least-once is the easiest guarantee to implement, but it requires a design that tolerates duplicate processing (idempotent sinks or deduplication). At-most-once is achieved by committing before processing, at the cost of possible loss. Exactly-once hinges on a pipeline design that combines producer idempotence and transactions with consistency on the sink side.
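The duplicate tolerance that at-least-once requires can be sketched as an idempotent sink. This is a minimal illustration in plain Java with no Kafka dependency; `IdempotentSink`, `apply`, and the event IDs are hypothetical names, and the in-memory set stands in for what would need to be a durable store in a real system.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sink that tolerates redelivery by remembering processed event IDs. */
final class IdempotentSink {
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> applied = new ArrayList<>();

    /** Applies the payload once per event ID; returns false for a duplicate delivery. */
    boolean apply(String eventId, String payload) {
        if (!seenIds.add(eventId)) {
            return false; // already processed: skip the side effect
        }
        applied.add(payload);
        return true;
    }

    /** Returns a snapshot of the payloads that were actually applied. */
    List<String> applied() {
        return List.copyOf(applied);
    }

    public static void main(String[] args) {
        IdempotentSink sink = new IdempotentSink();
        sink.apply("order-1", "created");
        sink.apply("order-2", "created");
        sink.apply("order-1", "created"); // redelivered after a retry or rebalance
        System.out.println("applied side effects: " + sink.applied().size());
    }
}
```

The design choice here is to key deduplication on a business-level event ID rather than on the Kafka offset, so the sink stays correct even if records are replayed from a different partition or topic.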
Consumer offsets are stored in Kafka per consumer group. Auto-commit is simple, but commits are driven by an interval rather than by processing completion, so you have to watch the relationship between processing and commit timing, especially right after a rebalance. When high reliability is required, prefer manual commit with an explicit order of poll, process, and commit.
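The effect of commit ordering can be illustrated with a small simulation. This is a simplified model, not Kafka client code: it assumes a single partition, a per-record commit, and exactly one crash, and `CommitOrderDemo` and `run` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of a single-partition consumer that crashes once and restarts. */
final class CommitOrderDemo {

    /**
     * Returns the offsets whose processing completed, in order, across one crash and restart.
     * The crash happens while handling offset crashAt: after the first of the two steps
     * (commit or process, depending on the mode) and before the second.
     */
    static List<Integer> run(int recordCount, int crashAt, boolean commitBeforeProcess) {
        List<Integer> processed = new ArrayList<>();
        int committed = 0;       // next offset to read after a restart
        boolean crashed = false;
        int offset = 0;
        while (offset < recordCount) {
            if (commitBeforeProcess) {
                committed = offset + 1;
                if (!crashed && offset == crashAt) {
                    crashed = true;
                    offset = committed; // restart skips the unprocessed record
                    continue;
                }
                processed.add(offset);
            } else {
                processed.add(offset);
                if (!crashed && offset == crashAt) {
                    crashed = true;
                    offset = committed; // restart replays the uncommitted record
                    continue;
                }
                committed = offset + 1;
            }
            offset++;
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println("process then commit: " + run(3, 1, false));
        System.out.println("commit then process: " + run(3, 1, true));
    }
}
```

Processing before committing replays the record that was in flight at the crash (a duplicate), while committing first skips it (a loss), which is the practical difference between at-least-once and at-most-once.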
Minimum producer EOS and consumer manual-commit setup
# Producer (Java properties)
enable.idempotence=true
acks=all
transactional.id=orders-tx-01
// Consumer (Java sketch; assumes enable.auto.commit=false and a KafkaConsumer<K, V> c that is already subscribed)
while (running) {
    ConsumerRecords<K, V> rs = c.poll(Duration.ofSeconds(1));
    for (ConsumerRecord<K, V> r : rs) { process(r); }
    c.commitSync(); // commit only after the whole batch has been processed (at-least-once)
}
// At-most-once variant (not recommended): call commitSync() before process()
Schema Registry controls schema evolution through compatibility modes. The common operational default is backward compatibility (BACKWARD or BACKWARD_TRANSITIVE), which keeps existing consumers from breaking. If you need bidirectional evolution, consider the FULL family, but expect higher operational overhead.
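To make the backward-compatibility rule concrete, the following sketch models only the field-level part of it: a consumer on the new schema must be able to read data written with the old schema, so any newly added field needs a default. This is a deliberately reduced model (it ignores type changes, aliases, and unions) and is not Schema Registry's actual checker; `BackwardCheck` and its method are hypothetical.

```java
import java.util.Map;
import java.util.Set;

/** Reduced model of a BACKWARD compatibility check on record fields. */
final class BackwardCheck {

    /**
     * @param oldFields field names present in the previous schema
     * @param newFields field name mapped to whether that field declares a default, for the new schema
     * @return true if every field added in the new schema has a default;
     *         fields removed from the old schema are simply ignored by the new reader
     */
    static boolean isBackwardCompatible(Set<String> oldFields, Map<String, Boolean> newFields) {
        for (Map.Entry<String, Boolean> field : newFields.entrySet()) {
            boolean isNewField = !oldFields.contains(field.getKey());
            boolean hasDefault = field.getValue();
            if (isNewField && !hasDefault) {
                return false; // old records carry no value and there is no default to fall back on
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> v1 = Set.of("id");
        System.out.println(isBackwardCompatible(v1, Map.of("id", false, "note", true)));
        System.out.println(isBackwardCompatible(v1, Map.of("id", false, "amount", false)));
    }
}
```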
Subject naming strategies (TopicNameStrategy, RecordNameStrategy, etc.) directly drive multi-event topic design. Use RecordNameStrategy when you want to allow multiple types within a single topic; TopicNameStrategy is simpler when each topic carries one event type.
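How each strategy derives a subject name can be sketched as follows. The formats mirror the documented behavior of the Confluent serializers' strategies, but the class below is a standalone illustration rather than the library code, and the record name `com.example.OrderCreated` is a made-up example.

```java
/** Standalone illustration of how subject names are derived under each naming strategy. */
final class SubjectNames {

    /** TopicNameStrategy: "<topic>-key" or "<topic>-value". */
    static String topicName(String topic, boolean isKey) {
        return topic + (isKey ? "-key" : "-value");
    }

    /** RecordNameStrategy: the fully qualified record name, independent of the topic. */
    static String recordName(String fullyQualifiedRecordName) {
        return fullyQualifiedRecordName;
    }

    /** TopicRecordNameStrategy: "<topic>-<fully qualified record name>". */
    static String topicRecordName(String topic, String fullyQualifiedRecordName) {
        return topic + "-" + fullyQualifiedRecordName;
    }

    public static void main(String[] args) {
        String record = "com.example.OrderCreated"; // hypothetical record name
        System.out.println(topicName("orders", false));
        System.out.println(recordName(record));
        System.out.println(topicRecordName("orders", record));
    }
}
```

Because RecordNameStrategy drops the topic from the subject, compatibility is then enforced per record type across every topic that carries it, which is worth keeping in mind when several teams share one registry.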
Configuring compatibility mode and the basics of schema registration
# Set backward compatibility globally
curl -s -X PUT -H 'Content-Type: application/json' \
--data '{"compatibility": "BACKWARD"}' \
http://schema-registry:8081/config
# Override per subject (FULL_TRANSITIVE for orders-value)
curl -s -X PUT -H 'Content-Type: application/json' \
--data '{"compatibility": "FULL_TRANSITIVE"}' \
http://schema-registry:8081/config/orders-value
# Example: register an Avro schema (value side)
curl -s -X POST -H 'Content-Type: application/vnd.schemaregistry.v1+json' \
--data '{"schema": "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}"}' \
http://schema-registry:8081/subjects/orders-value/versions
Kafka Streams is a library you embed in your application, giving fine-grained control over state management (state stores), repartitioning, and processing guarantees (at_least_once / exactly_once_v2). ksqlDB lets you write stream and table transformations declaratively and runs them continuously as persistent queries.
Repartitioning occurs whenever an operation changes keys, and internal topics are created for it. At scale, do not forget the partition count, retention, and monitoring for these internal topics.
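Why a key change forces repartitioning can be seen from how a key maps to a partition. The sketch below is an illustration only: Kafka's default partitioner hashes the serialized key with murmur2, whereas this stand-in uses `String.hashCode()`, and `PartitionDemo` and the sample keys are hypothetical.

```java
/** Illustrative key-to-partition mapping; not Kafka's actual partitioner. */
final class PartitionDemo {

    /**
     * Maps a key to a partition in [0, numPartitions).
     * Stand-in hash: String.hashCode() instead of murmur2 over the serialized key.
     */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 6;
        // The same record, first keyed by order ID, then re-keyed by customer ID.
        String orderKey = "o-1";
        String customerKey = "c-9";
        System.out.println("keyed by order:    partition " + partitionFor(orderKey, partitions));
        System.out.println("keyed by customer: partition " + partitionFor(customerKey, partitions));
    }
}
```

Since an aggregation needs all records for a key on the same partition, a re-keyed stream has to be written to an internal repartition topic and read back before it can be grouped, which is where the extra topics come from.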
Minimum Kafka Streams topology (skeleton for an aggregation)
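import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.*;
import org.apache.kafka.streams.kstream.*;
// Assumes an Order value class and a matching Serde<Order> named orderSerde defined elsewhere.
Properties p = new Properties();
p.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-agg");
p.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
p.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
StreamsBuilder b = new StreamsBuilder();
// Use explicit serdes so the topology does not depend on default serde settings.
KStream<String, Order> s = b.stream("orders", Consumed.with(Serdes.String(), orderSerde));
KTable<String, Long> t = s.groupByKey(Grouped.with(Serdes.String(), orderSerde)).count();
// Counts are Long values, so write them with a Long serde.
t.toStream().to("orders-count", Produced.with(Serdes.String(), Serdes.Long()));
KafkaStreams app = new KafkaStreams(b.build(), p);
// Close cleanly on shutdown so state stores and pending transactions are flushed.
Runtime.getRuntime().addShutdownHook(new Thread(app::close));
app.start();
On Confluent Cloud, broker placement and patching become the service's responsibility, while the user focuses on logical design (topic count and partition design), access control (RBAC / ACL), and connectivity (internet vs. private connectivity).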
Monitoring uses built-in metrics to visualize latency, throughput, and consumer lag. Quotas and limits (API calls, partition counts, and so on) depend on your plan, so verify them before designing.
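Consumer lag, the metric most often watched for consumers, is simply the gap between a partition's log end offset and the group's committed offset. The sketch below computes it from two offset maps; `LagCalculator` is a hypothetical helper, and treating a partition with no committed offset as starting from 0 is a simplifying assumption (a real consumer would follow its reset policy).

```java
import java.util.HashMap;
import java.util.Map;

/** Computes consumer lag from end offsets and committed offsets, keyed by partition number. */
final class LagCalculator {

    /** Lag per partition = log end offset minus committed offset, never below zero. */
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> endOffsets,
                                              Map<Integer, Long> committedOffsets) {
        Map<Integer, Long> lag = new HashMap<>();
        for (Map.Entry<Integer, Long> end : endOffsets.entrySet()) {
            // Assumption: a partition without a committed offset is counted from offset 0.
            long committed = committedOffsets.getOrDefault(end.getKey(), 0L);
            lag.put(end.getKey(), Math.max(0L, end.getValue() - committed));
        }
        return lag;
    }

    /** Sums the per-partition lag into a single group-level figure. */
    static long totalLag(Map<Integer, Long> lagPerPartition) {
        long total = 0L;
        for (long value : lagPerPartition.values()) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Long> lag = lagPerPartition(Map.of(0, 100L, 1, 50L), Map.of(0, 90L));
        System.out.println("per partition: " + lag + ", total: " + totalLag(lag));
    }
}
```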
Minimum ccloud CLI flow (environment → cluster → topic → ACL)
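# Note: the ccloud CLI has been superseded by the unified confluent CLI; subcommands and flag names may differ there, so check --help for your version.
ccloud environment create dev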
ccloud environment use <env-id>
ccloud kafka cluster create dev-cluster --cloud aws --region ap-northeast-1 --type standard
ccloud kafka cluster use <lkc-id>
ccloud api-key create --resource <lkc-id>
ccloud kafka topic create orders --partitions 6
# ACL example (write access on the topic for the client; read access on the topic and the consumer group)
ccloud kafka acl create --allow --service-account sa-123 --operation WRITE --topic orders
ccloud kafka acl create --allow --service-account sa-123 --operation READ --topic orders
ccloud kafka acl create --allow --service-account sa-123 --operation READ --consumer-group orders-app
First, complete one local cycle of "build → produce → consume → break → fix." Next, layer in Schema Registry and Streams/ksqlDB, and finally repeat the same flow on Confluent Cloud, articulating the differences in words. At every step, aim for the ability to explain each configuration value with a reason.
For mock practice, write at least one of your own questions each on availability, delivery guarantees, schema evolution, and security. Being able to explain the design reasoning in 30 seconds is a good benchmark for passing readiness.
Minimum Docker Compose (skeleton for study)
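services:
  broker:
    image: confluentinc/cp-kafka:7.6.1
    depends_on: [zookeeper]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      # Clients must be able to resolve "broker" (run them inside the Compose network or add a hosts entry).
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092
      # Single-broker settings: internal topics cannot use the default replication factor of 3.
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
    ports: ["9092:9092"]
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  schema-registry:
    image: confluentinc/cp-schema-registry:7.6.1
    depends_on: [broker]
    environment:
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://broker:9092
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
    ports: ["8081:8081"]
CCDAK / CCAAK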
Question 1
You want to prioritize high availability while avoiding data loss on writes. For a topic with replication.factor=3 and min.insync.replicas=2, which producer configuration is the most appropriate?
Correct answer: A
With acks=all, the leader acknowledges a write only after every replica currently in the ISR has received it. Combined with min.insync.replicas=2, a write is rejected rather than accepted by a lone replica, which limits loss when a broker fails while still tolerating one broker being down. enable.idempotence=true additionally prevents duplicate records caused by producer retries. acks=1 acknowledges once only the leader has written, and acks=0 does not wait for any acknowledgment, so both leave room for loss during failures.
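The interaction between acks and min.insync.replicas in this explanation can be written down as a small decision rule. This is a simplified model of broker behavior under stated assumptions (it ignores timeouts and leader elections); `AckRule` and `writeAccepted` are hypothetical names.

```java
/** Simplified model of whether a produce request is accepted for a partition. */
final class AckRule {

    /**
     * @param acks              producer acks setting: "0", "1", or "all" (equivalently "-1")
     * @param inSyncReplicas    current ISR size, including the leader
     * @param minInSyncReplicas topic or broker min.insync.replicas
     */
    static boolean writeAccepted(String acks, int inSyncReplicas, int minInSyncReplicas) {
        if (inSyncReplicas < 1) {
            return false; // no in-sync leader is available
        }
        if ("all".equals(acks) || "-1".equals(acks)) {
            // min.insync.replicas is enforced only for acks=all.
            return inSyncReplicas >= minInSyncReplicas;
        }
        return true; // acks=0 or acks=1: accepted as long as a leader exists
    }

    public static void main(String[] args) {
        // replication.factor=3, min.insync.replicas=2
        System.out.println("all brokers up:   " + writeAccepted("all", 3, 2));
        System.out.println("one broker down:  " + writeAccepted("all", 2, 2));
        System.out.println("two brokers down: " + writeAccepted("all", 1, 2));
        System.out.println("acks=1, two down: " + writeAccepted("1", 1, 2));
    }
}
```

The last case is the one to remember: with acks=1 the write still succeeds on a single surviving replica, which keeps the topic writable but gives up the durability that the question asks for.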
Should I take CCDAK or CCAAK first?
Start with CCDAK if your work centers on application development or data pipeline implementation. Start with CCAAK if you focus on platform operations, security, and SLAs. If you plan to earn both, building a solid sense of correct data flow design through CCDAK first makes the configuration rationale behind CCAAK much easier to internalize.
Can I pass by studying only with Confluent Cloud?
Core concepts are the same whether managed or self-hosted, but CCAAK tests broker configuration and low-level operations, so you also need hands-on time with a local or self-hosted environment. Treat Confluent Cloud's RBAC and networking as a delta to study on top.
How deeply do I need to learn exactly-once?
If you can articulate these four points, you are covered for both the exam and real work: producer idempotence and transactions, Streams' processing.guarantee=exactly_once_v2, why it shines when both source and sink are Kafka, and the fact that external sinks ultimately require a two-phase consistency design.
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.