Every Vault request performs authentication, policy evaluation, cryptographic work, storage I/O, and audit emission. If you do not know where the bottleneck lives, adding more nodes will not improve throughput.
Building on stable concepts from the official documentation, this article summarizes the angles the Ops exam likes to probe and the design and configuration choices that actually move the needle in the field. Enterprise-only features (Performance Replication, Performance Standbys, Lease Count Quotas, Namespaces, and so on) are called out as such.
A single Vault request roughly flows through front-end TLS termination → authentication/token validation → policy evaluation → secret engine or auth method processing → storage I/O below the barrier → audit log emission. Throughput is capped by the slowest stage.
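To make the "slowest stage" point concrete, here is a minimal Python sketch. The stage names follow the flow above, but the service times and worker counts are made-up illustrative numbers, not measurements, and treating each stage as an independent pool of workers is a simplifying assumption.

```python
# Rough pipeline model: a stage sustains (workers / service_time) requests per second.
# All numbers are hypothetical placeholders; measure your own cluster.
STAGES = {
    # stage name: (service time in seconds, parallel workers)
    "tls_termination": (0.0002, 8),
    "token_validation": (0.0003, 8),
    "policy_evaluation": (0.0001, 8),
    "engine_processing": (0.0020, 8),
    "storage_io": (0.0050, 1),      # assumed: writes serialized behind fsync
    "audit_emission": (0.0010, 1),  # assumed: one synchronous audit device
}


def stage_capacity(service_time_s: float, workers: int) -> float:
    """Maximum sustainable requests per second for one stage."""
    return workers / service_time_s


def bottleneck(stages: dict) -> tuple:
    """Return (stage name, capacity) for the stage with the lowest capacity."""
    capacities = {name: stage_capacity(t, w) for name, (t, w) in stages.items()}
    name = min(capacities, key=capacities.get)
    return name, capacities[name]


if __name__ == "__main__":
    name, rps = bottleneck(STAGES)
    print(f"bottleneck: {name} at about {rps:.0f} req/s")
```

With these placeholder numbers the storage stage sets the ceiling, so adding CPU for the other stages would not raise end-to-end throughput.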
Common rate-limiting factors include: cryptographic work (signing/encryption in Transit and PKI), storage fsync latency (writes to Raft or Consul), audit device I/O (audit is synchronous), external cloud API response times (AWS/GCP/DB dynamic credentials), and network latency (LB or cross-region). OSS standbys forward most operations to the active node, so read scalability is limited.
Storage and replication design define both the cap and the headroom for throughput. Integrated Storage (Raft) has no external dependencies and tends to deliver consistent performance, but fsync latency hits directly. Consul storage is a mature option, but the tuning of the network and Consul cluster directly shapes Vault latency.
Vault Enterprise Performance Replication is the primary lever for reducing read latency globally (writes are still consolidated on the primary). DR Replication is for failover and should not be used for routine read distribution. The recommended voting node count for both Raft and Consul is 3 or 5: an odd count avoids paying for a voter that adds no fault tolerance, and an excessively large quorum raises the cost of every consensus round.
| Approach/Feature | Characteristics | Throughput/Latency Notes | Operational Considerations |
|---|---|---|---|
| Integrated Storage (Raft) | No external dependencies. Consistency via Raft consensus | fsync latency is the typical bottleneck. CPU usually has headroom | 3 or 5 voting nodes recommended. Provide stable local storage |
| Consul Storage | Battle-tested. Depends on the health of the Consul side | Network and Consul write paths drive latency | Place Vault and Consul topologies close together. Avoid stretching across WAN |
| Performance Replication (Enterprise) | Reads available on secondaries. Low latency globally | Effective for horizontal read scaling. Writes still concentrate on the primary | Watch ACL/policy/mount consistency. The write path needs deliberate routing |
| DR Replication (Enterprise) | For disaster recovery. Standby during normal operation | Not intended for steady-state throughput improvement | Drill the failover and failback procedures |
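The voting-node guidance follows from majority-quorum arithmetic, which the short sketch below works through. The function names are illustrative only and are not part of any Vault API.

```python
def quorum_size(voters: int) -> int:
    """Smallest majority of a voter set (the rule Raft uses to commit writes)."""
    if voters < 1:
        raise ValueError("voters must be at least 1")
    return voters // 2 + 1


def failure_tolerance(voters: int) -> int:
    """How many voters can be lost while a majority still remains."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"voters={n} quorum={quorum_size(n)} tolerates={failure_tolerance(n)}")
```

Going from 3 to 4 voters raises the quorum from 2 to 3 without tolerating any additional failure, which is why odd counts are preferred. Every extra voter also adds another acknowledgement to wait for on the write path.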
By default Vault processes inbound requests as fast as it can. When CPU or audit I/O saturates due to bursts or misconfiguration, tail latency degrades. Resource quotas (/sys/quotas) are the built-in control lever: Rate Limit Quotas cap the request rate on a path or mount, and Lease Count Quotas (Enterprise) cap the total number of leases. Scoping quotas per namespace is an Enterprise capability. Both guard against unintended growth.
Outside Vault, design backpressure around it: rate limiting at the reverse proxy, exponential backoff on the client, and the Transit batch API to cut request counts. To avoid renew storms in token/lease renewals, careful TTL design (default/max) and use of periodic tokens are also effective.
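Client-side exponential backoff can be sketched as follows. This is a minimal illustration using the common "full jitter" variant. `RetryableError` is a hypothetical exception standing in for whatever your HTTP client raises on throttling or transient failures, and the default base, cap, and attempt count are arbitrary.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for throttled or transient failures."""


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 10.0,
                  rng=random.random) -> float:
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(func, *, attempts: int = 5, base: float = 0.1,
                      cap: float = 10.0, sleep=time.sleep, rng=random.random):
    """Call func(), retrying on RetryableError with growing, jittered pauses."""
    for attempt in range(attempts):
        try:
            return func()
        except RetryableError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(backoff_delay(attempt, base, cap, rng))
```

The jitter matters as much as the exponential growth: without it, clients that were throttled together retry together and recreate the burst.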
KV v2 adds metadata updates due to versioning, so write-heavy workloads require tuning max_versions and a periodic compaction/deletion plan. For read-heavy use cases, application-side caching and TTL design pay off.
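For the read-heavy case, application-side caching might look like the sketch below. `TTLCache` and its `loader` callback are hypothetical names. In practice the loader would wrap your Vault client's KV read, and the cache TTL should stay short because cached secrets live in process memory and rotations are only picked up after expiry.

```python
import time


class TTLCache:
    """Tiny read-through cache: reload a key only after its entry expires."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)  # hypothetical: one KV read against Vault
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=30)
    reads = []

    def fake_loader(path):
        reads.append(path)  # stands in for a real Vault read
        return {"path": path}

    for _ in range(1000):
        cache.get("secret/data/app", fake_loader)
    print(f"upstream reads: {len(reads)}")
```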
Transit is compute-heavy, so CPU core count and choice of cryptographic algorithm dominate. For signing, elliptic-curve keys (e.g., ECDSA/Ed25519) tend to outperform RSA, and batch_input boosts throughput by amortizing per-request overhead. PKI gets heavier in proportion to the type and size of the CA key. For dynamic secrets (AWS/GCP/DB), the upstream API's rate limit and latency often set the ceiling. TTL choice is a trade-off here: shorter TTLs limit how long an issued credential stays usable, while longer TTLs reduce how often credentials must be issued against the upstream API. Pick based on which constraint binds.
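A minimal sketch of preparing Transit batch requests follows. It only builds the JSON bodies (Transit expects base64-encoded plaintext inside `batch_input`). Sending them is left to your HTTP client. The helper name and the default batch size of 100 are illustrative assumptions, not Vault limits.

```python
import base64
import json


def build_batch_payloads(plaintexts, batch_size: int = 100):
    """Split byte strings into Transit batch_input request bodies (JSON strings)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    payloads = []
    for start in range(0, len(plaintexts), batch_size):
        chunk = plaintexts[start:start + batch_size]
        items = [{"plaintext": base64.b64encode(p).decode("ascii")} for p in chunk]
        payloads.append(json.dumps({"batch_input": items}))
    return payloads


if __name__ == "__main__":
    bodies = build_batch_payloads([b"hello", b"world", b"!"], batch_size=2)
    for body in bodies:
        # Each body would be POSTed to /v1/transit/encrypt/<key name>.
        print(body)
```

Each body replaces what would otherwise be one request per item, so authentication, policy evaluation, and audit emission are paid once per batch rather than once per plaintext.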
The base shape is 1 active + N standbys. OSS standbys forward most operations to the active node. Enterprise Performance Standbys handle read-heavy workloads locally, so you scale by adding more places to read from. Writes still concentrate on the active (primary) node.
For global distribution, the textbook play is to deploy Performance Replication secondaries in each region to bring reads closer. In container deployments, tuning GOMAXPROCS (the Go runtime thread cap) to the effective CPU count along with LB health checks and connection pooling affects effective throughput.
Figure: A typical Enterprise topology and its throughput paths (read distribution)
Control TTLs with default_lease_ttl and max_lease_ttl (globally and via tune on each mount). Too-short TTLs inflate renew traffic; too-long TTLs widen both the security exposure and the blast radius if a token leaks. Periodic tokens are not bound by the max TTL: they stay valid as long as they are renewed within each period, unless an explicit_max_ttl is set on them.
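The effect of TTL length on renew traffic can be estimated with simple arithmetic. The sketch below assumes every client renews once after a fixed fraction of the TTL has elapsed. The two-thirds default and the lease counts are illustrative assumptions, not Vault behavior.

```python
def renewals_per_second(active_leases: int, ttl_seconds: float,
                        renew_fraction: float = 2 / 3) -> float:
    """Steady-state renew requests per second for a population of leases."""
    if ttl_seconds <= 0 or not 0 < renew_fraction <= 1:
        raise ValueError("ttl_seconds must be > 0 and renew_fraction in (0, 1]")
    renew_interval = ttl_seconds * renew_fraction
    return active_leases / renew_interval


if __name__ == "__main__":
    for label, ttl in (("5m", 300), ("1h", 3600), ("24h", 86400)):
        rate = renewals_per_second(50_000, ttl)
        print(f"ttl={label}: about {rate:.1f} renewals/s")
```

Under this model the renew load scales inversely with the TTL, so cutting a TTL from one hour to five minutes multiplies renew traffic by twelve for the same number of leases.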
Auditing is synchronous. Slow audit devices or network file systems cause throughput drops. Choose a fast local disk or system logger and ensure the OS file descriptor limit (ulimit) is generous.
HTTP body size and concurrent connection limits usually depend on LB/reverse proxy settings. Set appropriate limits in front of Vault, and combine them with client-side retry and backoff to achieve a global optimum.
A minimal practical configuration (Raft + TTL tuning + audit + telemetry) plus examples of Enterprise Quotas and Transit batch usage
# vault.hcl (excerpt)
listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 0
  # TLS is terminated by Vault itself (tls_cert_file / tls_key_file are omitted
  # from this excerpt). Adjust accordingly if the LB shares that responsibility.
}
storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-1"
}
api_addr     = "https://vault.example.com:8200"
cluster_addr = "https://10.0.0.10:8201"
# Global TTL defaults. Individual mounts can override them.
default_lease_ttl = "1h"
max_lease_ttl     = "24h"
# Telemetry (Prometheus scrape)
telemetry {
  prometheus_retention_time = "24h"
}
# --- Audit (fast local disk recommended) ---
# Audit devices are enabled at runtime through the CLI/API, not in vault.hcl.
# Avoid log_raw=true in production: it writes secrets to the log unhashed.
# vault audit enable file file_path=/var/log/vault/audit.log
# --- Quotas examples (CLI) ---
# Rate limit (100 requests per second, scoped to the transit/ mount)
# vault write sys/quotas/rate-limit/myrl rate=100 path="transit/"
# Lease count cap (maximum 100,000; Enterprise)
# vault write sys/quotas/lease-count/myleasecount max_leases=100000
# --- Transit: batch_input example (multiple encryptions in one request) ---
# curl --header "X-Vault-Token: $TOKEN" \
#   --request POST \
#   --data '{"batch_input": [{"plaintext":"aGVsbG8="},{"plaintext":"d29ybGQ="}]}' \
#   https://vault.example.com:8200/v1/transit/encrypt/mykey
Ops
Question 1
A globally distributed application performs heavy KV reads, and the round-trip latency to the primary region is the bottleneck. Without changing the write path, which option best reduces read latency in each region? (Assume Enterprise.)
Correct answer: A
Performance Replication serves reads from secondaries, reducing global read latency. DR Replication is for disaster recovery and is not intended for steady-state read distribution. On OSS, adding standbys mostly results in forwarding and is of limited benefit, and extending TTLs only creates a security trade-off without fundamentally addressing round-trip latency.
Does HSM auto-unseal impact steady-state throughput?
Generally no. The HSM is involved in protecting and decrypting the master key at startup/unseal time. Steady-state data encryption and decryption use the in-memory data key.
Will adding more audit devices increase throughput?
Auditing writes synchronously to every enabled device, so simply adding more devices can actually slow things down. Use a single fast device (local SSD or a properly tuned system logger) and ensure I/O waits do not stall the request path.
How many standby nodes should I start with?
Typically start with 2-3 standbys, watch the metrics (/v1/sys/metrics?format=prometheus) for request latency, error rate, and audit I/O waits, then scale incrementally. Enterprise Performance Standbys are especially effective in read-heavy environments.
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.