When you run Kafka across multiple tenants or business units, heavy traffic from one client can easily starve the others. Kafka quotas exist to tame this resource contention and hand each client a predictable share of throughput.
This article walks through how to design, configure, and observe quotas for stable cluster operations, drawing on the official documentation and going deep enough to cover what the CCAAK certification asks about.
Kafka has three primary client quotas: producer_byte_rate (bytes a producer may write per second), consumer_byte_rate (bytes a consumer may fetch per second), and request_percentage (the share of broker request-handler and network thread time a client may use). Quotas are enforced per broker rather than as a cluster-wide total. When a limit is exceeded the broker does not reject the request; it throttles the client by delaying responses, and reports the delay in the throttle_time_ms field of the response.
The limit is smoothed over an observation window so that short bursts are tolerated while the average rate is enforced (controlled on the broker side by quota.window.num and quota.window.size.seconds). Entities can be specified as user, client-id, user+client-id (combination), or default (cluster-wide). The most specific entity wins, and default is applied last.
| Entity | Scope | Typical use | Sample key |
|---|---|---|---|
| user | Authenticated user (SASL Principal) | Total cap per department/team | users:alice |
| client-id | Per application/process | Allocate across apps owned by the same user | clients:etl-writer |
| user+client-id | User x app combination | Tighten only one specific app within a user | users:alice,clients:etl-writer |
| default | Default shared by all clients | Initial cap for new or unclassified clients | users:* or clients:* |
Throttling flow (conceptual)
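To make the delay concrete, here is a minimal Python sketch of how a windowed average can be turned into a throttle delay. It assumes a simplified model: the broker measures bytes over a single window and delays the response just long enough for the average to fall back to the quota. The function name, the 10-second window, and the sample numbers are illustrative assumptions, not Kafka's actual implementation.

```python
def throttle_time_ms(observed_bytes: float, quota_bytes_per_s: float, window_s: float) -> float:
    """Delay (ms) that brings the windowed average rate back down to the quota.

    Simplified model (assumption): the rate is averaged over one window of
    window_s seconds, and the delay stretches that window until
    observed_bytes / (window_s + delay) equals the quota.
    """
    observed_rate = observed_bytes / window_s
    if observed_rate <= quota_bytes_per_s:
        return 0.0  # within quota: respond immediately
    delay_s = (observed_rate - quota_bytes_per_s) / quota_bytes_per_s * window_s
    return delay_s * 1000.0


ONE_MIB = 1024 * 1024
# A short burst that stays under the average is not delayed.
print(throttle_time_ms(5 * ONE_MIB, ONE_MIB, 10.0))
# 15 MiB in a 10 s window against a 1 MiB/s quota needs a 5 s delay.
print(throttle_time_ms(15 * ONE_MIB, ONE_MIB, 10.0))
```

The sketch shows why bursts are tolerated: only the average over the window matters, so the delay grows with how far that average overshoots the quota.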
Representative quota keys (for clients)
# Dynamic quota keys configurable for clients (excerpt)
# - producer_byte_rate: upper limit on bytes produced per second
# - consumer_byte_rate: upper limit on bytes fetched per second
# - request_percentage: share of broker request-processing capacity (relative allocation)
# When any of these is exceeded, the broker adds delay to responses to bring the rate back down.
Quotas can be set at multiple levels simultaneously. At evaluation time the most specific match wins, and if nothing matches the broker falls back to a more general setting. For example, if a quota is configured for the combination user=alice, client-id=etl, that quota is used; otherwise the broker falls back to user=alice, then to client-id=etl, and finally to default.
Using default lets you keep unregistered clients from running wild while selectively raising the cap for critical apps. This gives you a layered approach to enforcement.
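The fallback chain can be sketched as a small lookup in Python. This sketch follows the eight-level order listed in the Apache Kafka documentation, which distinguishes per-user defaults from the global default; the dictionary layout, the `DEFAULT` marker, and the function names are assumptions made for illustration, not how the broker stores quotas.

```python
from typing import Optional

DEFAULT = "<default>"  # stands in for a default entity, written this way for illustration


def precedence(user: str, client_id: str) -> list:
    """(user, client-id) keys in match order; None means that part is not set."""
    return [
        (user, client_id),     # user + client-id
        (user, DEFAULT),       # user + default client-id
        (user, None),          # user only
        (DEFAULT, client_id),  # default user + client-id
        (DEFAULT, DEFAULT),    # default user + default client-id
        (DEFAULT, None),       # default user
        (None, client_id),     # client-id only
        (None, DEFAULT),       # default client-id
    ]


def resolve(quotas: dict, user: str, client_id: str) -> Optional[tuple]:
    """Return (matched_key, limit) for the most specific entry, or None."""
    for key in precedence(user, client_id):
        if key in quotas:
            return key, quotas[key]
    return None  # nothing configured: the client is not throttled


quotas = {
    ("alice", "etl"): 524288,   # tighten one app
    ("alice", None): 2097152,   # cap for the user as a whole
    (None, DEFAULT): 1048576,   # strict default for everyone else
}
print(resolve(quotas, "alice", "etl"))
print(resolve(quotas, "alice", "report"))
print(resolve(quotas, "bob", "etl"))
```

Returning the matched key alongside the limit makes it easy to see which level actually applied to a given client.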
| Design pattern | Benefit | Caveats |
|---|---|---|
| Strict default + per-entity exceptions | Safe even for unregistered clients | Operational overhead grows as exceptions accumulate |
| Allocate per user | Easy to budget by department | Hard to differentiate between apps owned by the same user |
| Combine user+client-id | Fine-grained control | Entries pile up; periodic review is required |
Precedence in pseudocode (match order)
# Pseudocode
if quota.exists(user=u, client=c): use that
elif quota.exists(user=u): use that
elif quota.exists(client=c): use that
else: use default
Client quotas are applied to the cluster as dynamic configurations. Use kafka-configs.sh (with --bootstrap-server) or the Admin API. Changes take effect almost immediately and require no broker restart.
After configuring, verify with describe and use alter/remove as needed. The changes are stored as metadata in the cluster and take effect across all brokers.
| Operation | CLI example | Key point |
|---|---|---|
| Set | kafka-configs.sh --alter --add-config ... | Separate multiple keys with commas |
| Inspect | kafka-configs.sh --describe --entity-type ... | Be explicit about the target entity |
| Delete | kafka-configs.sh --alter --delete-config ... | When unset, falls back to a higher level or default |
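The three operations in the table can be generated from one small helper, which is handy when quota changes are scripted. The following is a minimal Python sketch that only assembles the argument list; the helper name and its parameters are hypothetical, and it uses `--entity-default` (the kafka-configs.sh flag for targeting a default entity) when no entity name is given.

```python
import shlex


def kafka_configs_argv(action, entities, configs=None, bootstrap="localhost:9092"):
    """Build an argument list for kafka-configs.sh (helper name is hypothetical).

    action   -- "alter", "delete", or "describe"
    entities -- list of (entity_type, name); name=None selects --entity-default
    configs  -- dict of key -> value for "alter", list of keys for "delete"
    """
    argv = ["bin/kafka-configs.sh", "--bootstrap-server", bootstrap]
    if action == "alter":
        # Multiple keys are joined with commas, as noted in the table.
        pairs = ",".join(f"{key}={value}" for key, value in configs.items())
        argv += ["--alter", "--add-config", pairs]
    elif action == "delete":
        argv += ["--alter", "--delete-config", ",".join(configs)]
    elif action == "describe":
        argv += ["--describe"]
    else:
        raise ValueError(f"unknown action: {action}")
    for entity_type, name in entities:
        argv += ["--entity-type", entity_type]
        argv += ["--entity-default"] if name is None else ["--entity-name", name]
    return argv


# Strict default for every client-id, then drop one override so it falls back.
print(shlex.join(kafka_configs_argv("alter", [("clients", None)], {"producer_byte_rate": 1048576})))
print(shlex.join(kafka_configs_argv("delete", [("clients", "etl-writer")], ["producer_byte_rate"])))
```

Building an argument list rather than a shell string avoids quoting mistakes; pass it to `subprocess.run` when you are ready to execute it against a real cluster.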
kafka-configs.sh examples (official-docs syntax)
# Per client-id
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
--alter --add-config 'producer_byte_rate=1048576,consumer_byte_rate=1048576' \
--entity-type clients --entity-name etl-writer
# Per user (SASL principal name)
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
--alter --add-config 'producer_byte_rate=2097152' \
--entity-type users --entity-name alice
# user+client-id (combination)
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
--alter --add-config 'consumer_byte_rate=524288' \
--entity-type users --entity-name alice \
--entity-type clients --entity-name etl-writer
# Verify with describe
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
--describe --entity-type clients --entity-name etl-writer
Set limits by accounting for expected message size, QPS, compression ratio, and overhead such as headers. Start by allocating around 80% of expected demand, then adjust step by step while watching throttle_time_ms, processing delay, and latency.
Consumer and producer sides can be asymmetric. For example, with heavy writes but reads concentrated in only a few groups, it is reasonable to keep consumer_byte_rate on the lower side.
| Design item | Guideline | Notes |
|---|---|---|
| producer_byte_rate | Start at ~80% of write peak | Raise if throttling is frequent |
| consumer_byte_rate | Match downstream processing capacity | If lag builds up while fetches are throttled, loosen the limit; if downstream systems are overloaded, tighten it |
| request_percentage | Relative share under contention | Affects the allocation of processing slots, not just bandwidth |
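The sizing guideline can be expressed as a short Python function. This is a rough sketch under stated assumptions: the quota is sized against bytes after compression, `compression_ratio` is compressed size divided by original size, and `overhead_bytes` is a per-record allowance for headers. The function and parameter names are hypothetical.

```python
def initial_producer_quota(avg_record_bytes: int, records_per_s: int,
                           compression_ratio: float = 1.0,
                           overhead_bytes: int = 0,
                           headroom: float = 0.8) -> int:
    """Starting producer_byte_rate: a fraction (headroom) of expected demand.

    compression_ratio -- compressed size / original size (1.0 = no gain)
    overhead_bytes    -- assumed per-record allowance for headers and framing
    """
    wire_bytes_per_record = avg_record_bytes * compression_ratio + overhead_bytes
    demand_bytes_per_s = wire_bytes_per_record * records_per_s
    return round(demand_bytes_per_s * headroom)


# 10 KiB records at 1,500 records/s with no compression gain: 80% of 15,360,000.
print(initial_producer_quota(10 * 1024, 1500))
# Same workload compressed to half its size, plus 100 bytes of overhead per record.
print(initial_producer_quota(10 * 1024, 1500, compression_ratio=0.5, overhead_bytes=100))
```

Treat the result as a starting point only, and adjust it while observing throttle_time_ms and latency.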
Calculation example
# Sending 1,500 records/s averaging 10 KiB each (assuming a similar size after compression)
# 10 * 1024 * 1500 = 15,360,000 bytes/s ≈ 14.6 MiB/s
# Initial limit: roughly 12–13 MiB/s (adjust while observing)
# → producer_byte_rate ≈ 13 * 1024 * 1024 = 13631488
Clients detect throttling via the throttle_time_ms field in the response. Continuously non-zero values strongly suggest the limit is being hit. On the broker side, per-request throttle times and rates are exposed as metrics (for example, ThrottleTimeMs per request type).
Excessive throttling causes producer buffers to fill up and latency to spike, while consumers see longer fetch intervals, which leads to app-side delay and growing lag. Start by checking the target entity's configuration and whether the client is actually falling through to the default quota.
| Symptom | Likely cause | What to check |
|---|---|---|
| Rising throttle_time_ms | Quota exceeded | Limit and precedence for the target entity |
| Growing consumer lag | consumer_byte_rate is set too low | Alignment with the group's processing capacity |
| P99 latency regression | request_percentage too low or contended | Relative allocation under contention |
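One way to separate an occasional burst from a client that is pinned at its limit is to look at how often throttle_time_ms is non-zero over a run of samples. The Python sketch below assumes you already collect those values from responses or client metrics; the function name and the 50% threshold are illustrative assumptions, not recommended values.

```python
def throttle_summary(samples_ms, sustained_ratio=0.5):
    """Summarize throttle_time_ms samples for one client.

    sustained_ratio -- assumed threshold: share of non-zero samples at or
                       above which throttling is treated as sustained
    """
    if not samples_ms:
        return {"nonzero_ratio": 0.0, "max_ms": 0, "sustained": False}
    nonzero = sum(1 for sample in samples_ms if sample > 0)
    ratio = nonzero / len(samples_ms)
    return {
        "nonzero_ratio": ratio,
        "max_ms": max(samples_ms),
        "sustained": ratio >= sustained_ratio,
    }


# One throttled response in four looks like a burst, not a sizing problem.
print(throttle_summary([0, 0, 120, 0]))
# Three in four suggests the quota is too tight for this workload.
print(throttle_summary([80, 120, 0, 200]))
```

A sustained result is the cue to check the symptoms in the table against the entity's configured limit.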
JMX/metric examples (names vary by environment)
# Example: throttle time per request type
# kafka.network:type=RequestMetrics,name=ThrottleTimeMs,request=Produce
# kafka.network:type=RequestMetrics,name=ThrottleTimeMs,request=FetchConsumer
# Plot these as time series on a dashboard and cross-check them against the target entity's configuration
In multi-tenant setups, lock down default and only relax it for critical workloads or nightly batches. Standardize a client.id naming convention (such as team-app-purpose) and review the inventory each quarter to prevent rot.
Throttling for replication traffic lives in a different configuration domain than client quotas. It is used during operations like partition reassignment and should not be conflated with controlling application clients.
| Target | Control mechanism | Common confusion |
|---|---|---|
| Application clients | producer/consumer_byte_rate, request_percentage | User/client precedence |
| Replication | Dedicated replication throttle settings | Not a client quota |
| Unregistered clients | Strict default configuration | Exceptions must be reviewed periodically |
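For the periodic review of exception entries, the stale-entry check can be sketched in a few lines of Python. It assumes you can build a mapping from each quota entry to the date it was last seen in metrics or logs; the function name, the data shape, and the sample entries are hypothetical.

```python
from datetime import date, timedelta


def stale_entries(last_seen, today, max_idle_days=90):
    """Return quota entries not seen for max_idle_days or more, sorted.

    last_seen -- dict mapping (user, client_id) to the date of last activity
    """
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(key for key, seen in last_seen.items() if seen <= cutoff)


today = date(2025, 6, 30)
last_seen = {
    ("alice", "etl-writer"): date(2025, 6, 1),  # active this month
    ("bob", "old-report"): date(2025, 1, 15),   # idle for months
}
print(stale_entries(last_seen, today))
```

The output is a list of candidates for deletion, to be confirmed with the owning team before the override is removed.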
Operational inventory (pseudocode procedure)
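# 1) Collect the client.id values and principals actually in use (metrics/logs)
# 2) Confirm they are protected by default
# 3) Relax limits only for critical workloads, using user+client-id
# 4) Flag entries unused for 90 days or more as deletion candidates
CCAAK
Question 1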
In a Kafka cluster, you want to suppress unclassified new clients while granting only user=analytics with client-id=nightly-batch a high write bandwidth. Which configuration is most appropriate?
Correct answer: A
Suppressing unclassified clients is what default is for. To relax limits only for a specific user x client pair, lean on the rule that the more specific entity (user+client-id) wins. Setting only client-id or only user could ripple to other clients or other users under those scopes.
Does a quota reject requests or just delay them?
Kafka client quotas are essentially soft throttling. When the limit is exceeded the broker delays the response, and the client receives throttle_time_ms back.
Can I tell which entity-level quota is actually in effect?
Use kafka-configs.sh --describe to inspect configurations and watch whether the client-side throttle_time_ms increases. Precedence is applied in the order user+client-id > user > client-id > default.
Can I control replication throttling with client quotas as well?
No. Replication throttling lives in a separate configuration domain. The client-facing producer/consumer_byte_rate and request_percentage do not control it.
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.