Kafka scales out well, but getting the initial partition count, broker count, and storage sizing wrong leads to painful after-the-fact reassignments and cost overruns. This guide distills the stable concepts from the official docs into the points CCAAK tests most and a sizing workflow that holds up in production.
Three keywords matter: parallelism (partitions), fault tolerance (replication/brokers), and retention (capacity). The fastest path is to back-solve all three from your workload.
Capacity planning starts by pinning down ingress rate, message size, retention window, and availability/latency SLAs as concrete numbers. Separate averages from peaks and call out the peak factor explicitly (e.g., p95/p99).
Kafka throughput scales with batching efficiency, so fixing target values for producer batch.size and linger.ms, plus consumer fetch settings, as part of your assumptions makes the partition-count math much more stable.
Workload-assumptions template (example)
# unit: MB/s unless noted
workload:
topic: orders
ingress_avg: 50
ingress_peak: 120
msg_size_bytes_avg: 1024
consumers_groups: 3
retention_ms: 604800000 # 7 days
cleanup_policy: delete
availability:
replication_factor: 3
min_insync_replicas: 2
headroom: 0.25 # 25% extraThroughput scales roughly linearly with partition count. A single partition is a serial write to one leader (to preserve ordering), so hardware and batching settings impose a practical ceiling. Start with a small load test, measure effective throughput per partition (Tp), and back-solve the required partition count.
Skewed key distributions create hot partitions and prevent linear scaling. Where possible, add a sharding component to the key space, or tune StickyPartitioner behavior and batch size to even out the load.
Rough partition-count formula and example
# 実測に基づく計算を推奨
# Tp_per_partition_MBps: 1パーティションの実効MB/s(負荷試験で取得)
# required_partitions = ceil(peak_ingress_MBps / Tp_per_partition_MBps)
peak_ingress_MBps=120
Tp_per_partition_MBps=8
required_partitions=$(( (peak_ingress_MBps + Tp_per_partition_MBps - 1) / Tp_per_partition_MBps ))
echo ${required_partitions} # => 15 (例)
# 作成例(RF=3, min.insync.replicas=2)
# kafka-topics.sh --create --topic orders --partitions 16 --replication-factor 3 \
# --config min.insync.replicas=2Replication factor (RF) is the foundation of availability. Each partition has RF replicas, and writes require the leader plus an ISR set of at least min.insync.replicas. For typical availability targets, RF=3 with min.insync.replicas=2 is the starting point.
Provision at least RF brokers and enough to spread replicas across failure domains (racks or AZs). Then add more so you do not exceed the per-broker partition ceiling, disk bandwidth, or CPU limits.
| Strategy | Key settings | Primary use case | Sizing note |
|---|---|---|---|
| Time-based deletion | retention.ms, cleanup.policy=delete | Retain log for a fixed window | Capacity = avg ingress × retention × RF + headroom |
| Size-based deletion | retention.bytes, cleanup.policy=delete | Operations driven by a capacity cap | Capacity = configured size × RF + headroom; oldest data is deleted first |
| Log compaction | cleanup.policy=compact(,delete) | Stream tables that keep the latest value per key | Estimate capacity as total unique-key size plus the delta between compactions |
RF=3 rack-aware cluster placement
Rack identification and topic creation example
# server.properties(各ブローカー)
# broker.rack=rack-a|rack-b|rack-c
# 新規トピック(RF=3, min.insync.replicas=2)
# kafka-topics.sh --create --topic orders --partitions 18 --replication-factor 3 \
# --config min.insync.replicas=2Size capacity by multiplying effective ingress (pre/post compression), retention, and RF, then add index/segment overhead and operational headroom. Official retention behavior is controlled via retention.ms, retention.bytes, and cleanup.policy (delete/compact).
When using log compaction, size primarily off total unique-key footprint and update frequency; if you also enable delete, manage the upper bound with both policies in tandem.
Worked example (delete retention, RF=3, 7 days, 25% headroom)
# 前提
# 実効Ingress = 50 MB/s(圧縮後)
# 保持 = 7日 = 604800 秒
# RF = 3, ヘッドルーム = 25%
# オーバーヘッド係数 = 1.1 (インデックス等)
echo "scale=2; 50*604800*1.1*3*1.25/1024/1024" | bc
# => 約 120.4 TB (クラスター総容量の目安)
# 実機ではディスク単位(例 4TB x 台数)に割り付け、将来増設余地を残すCluster ingress is the producer → leader write path. With RF>1, followers also fetch replicas from the leader, so total receive bytes converge on roughly "client ingress × RF" (with duplicate ACKs and metadata traffic being relatively small). Egress scales with consumer count and read patterns.
HDDs can sustain high throughput on sequential writes, but for mixed workloads (read/write/compression/compaction), SSDs deliver lower jitter and make latency SLAs easier to hit.
Network rule-of-thumb formula
# 例: Ingress=120 MB/s, RF=3, 総Egress=240 MB/s(複数グループ合算)
# 受信Rx ≈ 120 * 3 = 360 MB/s
# 送信Tx ≈ 120 * (3-1) + 240 = 480 MB/s
# 10GbE(実効~1,200 MB/s)なら1本で足りるが、冗長/将来増分を見て2ポート以上を推奨Capacity is not static. Anticipate topic and traffic growth, hold 20-30% headroom at all times, and add brokers or disks as you approach the threshold. After scaling, use partition reassignment to rebalance.
Reassignment puts load on the network and disks. Limit impact by using maintenance windows, throttling, and preferred-leader election.
Reassignment and preferred-leader election example
# reassignment.json(例)
# {"version":1, "partitions":[{"topic":"orders","partition":0,"replicas":[1,2,3]}]}
# 実行
# kafka-reassign-partitions.sh --bootstrap-server <broker> --reassignment-json-file reassignment.json --execute
# 帯域スロットルを適用する場合は broker/topic 設定で replication.throttled.* を活用
# 優先リーダー選出
# kafka-preferred-replica-election.sh --bootstrap-server <broker>CCAAK
問題 1
From a CCAAK perspective, topic orders is configured with RF=3, min.insync.replicas=2, and producer acks=all. Which condition must hold for new writes to continue after two brokers fail?
正解: A
With acks=all, writes succeed only when ISR size is at least min.insync.replicas. With RF=3 and min.insync.replicas=2, two simultaneous failures typically drop ISR below 2 and writes are rejected. The exception is when the failures hit replicas outside the ISR (e.g., replicas already lagging out of ISR) so that ISR stays at 2 or more. B and C are incorrect, and D trades consistency for availability, which conflicts with the original design intent.
How many partitions per broker is safe?
It depends on hardware, Kafka version, and GC strategy. Past several thousand partitions, metadata handling, GC, and cleanup tend to spike, so set a limit based on load testing and keep 20-30% headroom. Controller stability and reassignment duration should also be evaluated.
Should I use retention.ms or retention.bytes?
Use retention.ms as the primary lever when your SLA is time-based, and retention.bytes when cost ceilings are strict; let the other one act as a backstop. If you also use compaction, baseline on unique key size and combine with delete to cap old segments for stability.
What happens when I add partitions later?
Existing key-ordering guarantees do not carry over to new partitions (with unkeyed round-robin/sticky assignment). You can also see momentary latency spikes from data rebalancing and leader election. It is safer to over-provision a bit at the start to absorb future growth.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
Kafka Topics & Partitions: Distribution Fundamentals (2026)
How Kafka topics and partitions enable scale — ordering guar...
CCDAK Exam Guide: Confluent Certified Developer (2026)
Complete prep for the CCDAK exam — Producer/Consumer API, St...
CCAAK Exam Guide: Confluent Certified Administrator (2026)
Pass the CCAAK exam — cluster management, partitions, securi...
Kafka Replicas & ISR: Fault Tolerance Explained (2026)
Replica placement, in-sync replicas (ISR), leader election. ...
Kafka Offsets: Commit Modes & Consumer Position (2026)
Offset semantics — auto vs. manual commit, __consumer_offset...