Vault Integrated Storage (Raft): Recommended Architecture and Key Points for Ops Exams and Production

2026-04-19
NicheeLab Editorial Team

Vault's Integrated Storage is a Raft-based built-in storage engine that delivers HA without any external key-value store. For small to mid-sized and standard deployments, it is typically the first choice.

From an Ops perspective, the most frequently tested topics are odd-node placement, consistent address configuration (api_addr / cluster_addr), snapshot operations, and the quorum recovery flow during failures.

Overview of Raft Integrated Storage and When to Choose It

Integrated Storage (Raft) is a distributed log replication engine embedded directly in the Vault process. The leader serializes writes and replicates them to followers to guarantee consistency. Because it doesn't depend on external systems like Consul, it has fewer moving parts and is straightforward to deploy.

Raft is the right pick when you want to minimize external dependencies, need standard availability (3 or 5 nodes), and can rely on low-latency communication inside a data center. If you already run Consul as a general-purpose KV, the Consul backend is also a candidate, but for a greenfield Vault-only deployment, Raft is operationally simpler.

  • HA requirements: meet quorum with an odd number of nodes (3 or 5)
  • Network: low latency recommended (same region / across AZs)
  • Storage: persistent disks with fsync guarantees and sufficient IOPS
  • External dependencies: no additional KV cluster required (no Consul needed)
  • Exam focus: the difference between api_addr and cluster_addr, the odd-node principle, and the snapshot save/restore procedure

Recommended Topology and Network Design

The baseline is 3 nodes spread across multiple AZs within the same region. Five nodes offer higher availability but raise both cost and latency. The control plane (cluster traffic on 8201/TCP) must be mutually reachable between nodes, while the data plane (API on 8200/TCP) must be reachable from clients and load balancers.

Quorum requires 2 nodes in a 3-node cluster and 3 nodes in a 5-node cluster. Run the same Vault version on every node and make NTP time sync and TLS mandatory. It is critical not to confuse api_addr (used by clients) with cluster_addr (used between Raft peers).

  • Node count: 3 (standard) or 5 (high availability). Two-node setups are discouraged due to quorum loss risk.
  • AZ placement: distributed (e.g., one node per AZ across 3 AZs); the LB performs health checks against the API (8200)
  • Ports: allow 8200 (API) and 8201 (cluster) bidirectionally
  • Disks: persistent block storage; consider reattachment requirements during failures
  • Certificates: include the FQDNs/IPs of api_addr and cluster_addr in the SAN list
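
The quorum figures above follow from the majority rule, floor(N/2) + 1. The following is a minimal sketch of that arithmetic; the function names quorum and tolerated_failures are illustrative and not part of Vault.

```shell
#!/usr/bin/env bash
# Majority needed for a Raft cluster of N voting nodes: floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of voting nodes that can fail while a majority is still available.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 4 5; do
  echo "nodes=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

The loop shows why even node counts are avoided: 2 nodes tolerate no failures, and 4 nodes tolerate only 1, the same as 3.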

3-node Raft cluster (3 AZs, accessed via LB)

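[Figure: clients reach the LB on :8200 (health checks and API); the LB forwards to Vault1 (AZ-a), Vault2 (AZ-b), and Vault3 (AZ-c) on :8200; the Vault nodes replicate among themselves over :8201 (Raft).]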

Representative Configuration and Initialization Flow

Configuration in server.hcl defines api_addr / cluster_addr, the listener (8200/8201, TLS), and storage "raft" (path, node_id, retry_join). Initialize the first node with vault operator init, then have subsequent nodes join via vault operator raft join.

After joining, unseal the node and confirm that all peers are present. It is safest to take snapshots from the leader.

  • First node: init → unseal → minimal policy/auth method setup
  • Additional nodes: raft join → unseal → verify peers
  • Verification: vault operator raft list-peers and sys/health
  • Note: api_addr must be reachable from clients/LBs; cluster_addr must be resolvable between peers

Vault server.hcl (Raft) and example initialization commands

# /etc/vault.d/server.hcl
ui = true
api_addr     = "https://vault-1.example.com:8200"
cluster_addr = "https://vault-1.example.com:8201"

listener "tcp" {
  address         = "0.0.0.0:8200"
  tls_disable     = false
  tls_cert_file   = "/etc/vault.d/certs/vault.crt"
  tls_key_file    = "/etc/vault.d/certs/vault.key"
}

storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-1"
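  # Multiple retry_join blocks are allowed; auto-join is attempted as long as one of the listed leader addresses is reachable
  retry_join {
    leader_api_addr = "https://vault-1.example.com:8200"
    # Add leader_tls_servername, leader_ca_cert_file, etc. as needed
  }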
  retry_join {
    leader_api_addr = "https://vault-2.example.com:8200"
  }
}

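# Initialize (first node only). The redirect to init.txt is for brevity; store unseal keys and the root token securely.
$ export VAULT_ADDR=https://vault-1.example.com:8200
$ export VAULT_CACERT=/etc/vault.d/certs/ca.crt
$ vault operator init -key-shares=5 -key-threshold=3 > init.txt
$ vault operator unseal  # repeat with different key shares until the threshold (3) is met

# Second and subsequent nodes (e.g., vault-2)
# With retry_join configured the node attempts to join on its own; the explicit join is the manual alternative.
$ export VAULT_ADDR=https://vault-2.example.com:8200
$ vault operator raft join https://vault-1.example.com:8200
$ vault operator unseal  # repeat until the threshold is met

# Verify peers (from any node; requires a valid token)
$ vault operator raft list-peers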
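
The peer check can be scripted rather than read by eye. The sketch below counts voters in the table that list-peers prints, assuming the default column layout (Node, Address, State, Voter); the helper names count_voters and check_voters are hypothetical.

```shell
#!/usr/bin/env bash
# Count voting peers in the table printed by `vault operator raft list-peers`.
# Assumes the default table layout: Node, Address, State, Voter.
count_voters() {
  awk '$4 == "true" { n++ } END { print n + 0 }'
}

# Succeed only when the voter count matches the expected cluster size.
check_voters() {
  local expected=$1 actual
  actual=$(count_voters)
  echo "voters=$actual expected=$expected"
  [ "$actual" -eq "$expected" ]
}

# Usage against a live cluster (needs VAULT_ADDR and a valid token):
#   vault operator raft list-peers | check_voters 3
```

If the table layout differs in your Vault version, parsing the -format=json output is the more robust option.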

Availability and Performance Design Considerations

Raft is a strongly consistent leader-based protocol. Writes are accepted by the leader, replicated to a majority of nodes, and then committed. Standbys forward requests to the leader, so the latency between the leader and each peer has a direct impact on throughput and response time.

Snapshots prevent log growth and shorten recovery time on restart or rejoin. Choose high-IOPS, low-latency persistent disks, and size CPU and memory to match the workload (token issuance and cryptographic operations).

  • Maintaining quorum with an odd node count is the top priority (3- or 5-node configurations)
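  • Monitor where the leader is, and decide how the LB treats standbys: by default sys/health returns 200 only on the active node and 429 on an unsealed standby, so either route only to the active node or let the health check accept standbys (for example, sys/health?standbyok=true)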
  • Plan snapshot thresholds and retention counts, and take snapshots during off-peak hours
  • Choose disks in a class that guarantees write durability (fsync)
  • Monitoring: visualize Raft peer count, replication lag, and Prometheus telemetry
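
For leader monitoring and LB health checks, the HTTP status of sys/health is the usual signal. Here is a minimal sketch that maps the default status codes to a node state; the function name health_state and the state labels are illustrative.

```shell
#!/usr/bin/env bash
# Map the HTTP status returned by GET /v1/sys/health (default settings) to a node state.
health_state() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    472) echo "dr-secondary" ;;
    473) echo "performance-standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown" ;;
  esac
}

# Usage against a live node (the URL is an example):
#   code=$(curl -s -o /dev/null -w '%{http_code}' https://vault-1.example.com:8200/v1/sys/health)
#   health_state "$code"
```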

Backup, Rolling Upgrade, and DR

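Back up the cluster using Raft snapshots. Capture them from the leader as a rule and version them in secure storage. Restore into a running, unsealed cluster; when rebuilding after a disaster, initialize and unseal a fresh cluster first, then restore the snapshot into it.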

Perform upgrades one node at a time in a rolling fashion, always preserving quorum. Stop the node, upgrade it, start it back up, confirm it has stabilized, and only then move on to the next. Follow the compatibility notes in the release notes.
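
The "always preserve quorum" rule can be expressed as a simple gate before each node is stopped. This is a sketch under the assumption that you already know the total voter count and how many nodes are currently healthy; can_stop_one is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Return success if one more node can be stopped without losing quorum.
#   $1 = total voting nodes, $2 = nodes currently healthy
can_stop_one() {
  local total=$1 healthy=$2
  local quorum=$(( total / 2 + 1 ))
  [ $(( healthy - 1 )) -ge "$quorum" ]
}

if can_stop_one 3 3; then echo "3/3 healthy: safe to upgrade the next node"; fi
if ! can_stop_one 3 2; then echo "2/3 healthy: wait for the previous node to rejoin"; fi
```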

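If you need to remove a node because of a hardware failure, review the peer list and remove the affected node while the cluster still has quorum, because remove-peer is itself a Raft change that needs a working majority. If quorum has already been lost, recover the surviving nodes first (manual peers.json-based recovery, or a rebuild from a snapshot) before changing membership.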

  • Capture: vault operator raft snapshot save backup.snap
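  • Restore: on a running, unsealed cluster, run vault operator raft snapshot restore backup.snap; for a rebuilt cluster, initialize and unseal it first, and add -force only when the snapshot comes from a different cluster (different unseal keys)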
  • Verification: vault operator raft list-peers / vault operator raft autopilot state
  • Removal: vault operator raft remove-peer <node_id>
  • Rolling: always proceed in an order that keeps a majority of nodes running
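
Snapshot capture is usually paired with a naming and retention scheme. The sketch below assumes timestamped file names of the form vault-YYYYmmddTHHMMSSZ.snap and a hypothetical backup directory; snapshot_name and prune_snapshots are illustrative helpers, not Vault commands.

```shell
#!/usr/bin/env bash
# Build a sortable snapshot file name, e.g. vault-20260419T031500Z.snap.
snapshot_name() {
  echo "vault-$(date -u +%Y%m%dT%H%M%SZ).snap"
}

# Delete the oldest vault-*.snap files in DIR until only KEEP remain.
# Relies on the timestamped names sorting in chronological order.
prune_snapshots() {
  local dir=$1 keep=$2 total excess
  total=$(find "$dir" -maxdepth 1 -type f -name 'vault-*.snap' | wc -l)
  excess=$(( total - keep ))
  [ "$excess" -gt 0 ] || return 0
  find "$dir" -maxdepth 1 -type f -name 'vault-*.snap' | sort | head -n "$excess" |
    while IFS= read -r f; do rm -f -- "$f"; done
}

# Usage on the leader (BACKUP_DIR is a hypothetical path):
#   BACKUP_DIR=/var/backups/vault
#   vault operator raft snapshot save "$BACKUP_DIR/$(snapshot_name)"
#   prune_snapshots "$BACKUP_DIR" 7
```

Because snapshots contain sensitive material, the backup directory should be access-controlled and encrypted at rest.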

Comparison Table and Common Exam Traps

Raft Integrated Storage delivers HA with no external dependencies, but if you want to leverage existing Consul operations or share a general-purpose KV, the Consul backend is also reasonable. The File backend is for single-node use and is unsuitable for HA.

The exam likes to test the odd-node principle, confusion between api_addr and cluster_addr, where to take snapshots, and how to handle quorum loss (peer removal and rejoin).

  • api_addr is for clients; cluster_addr is for peer-to-peer traffic
  • Avoid 2-node setups (a single failure stops writes)
  • As a rule, capture snapshots from the leader
  • Clean up failed nodes with remove-peer, then rebuild or rejoin

| Item | Raft Integrated Storage | Consul Backend | File Backend |
| --- | --- | --- | --- |
| HA support | Yes (built-in Raft for leader election and replication) | Yes (uses Consul sessions and locks) | No (single-node only) |
| External dependencies | None (no extra middleware) | Required (a Consul cluster) | None |
| Operational complexity | Low (few moving parts) | Medium to high (separate Consul operations) | Low (but no HA) |
| Backup | vault operator raft snapshot | consul snapshot, etc. | File copy (with consistency caveats) |
| Best-fit scenario | Vault-only deployments wanting simple, standard HA | Leveraging existing Consul / integrating with a general-purpose KV | Lightweight use for testing or single-node scenarios |

Security and Monitoring Considerations

Enforce TLS on both Raft peer traffic and the API, and include the FQDNs or IPs of api_addr and cluster_addr in the certificate SANs. If TLS verification fails, join and replication become unstable.

Stored data is encrypted by Vault's storage barrier. Snapshots also contain sensitive material, so restrict snapshot permissions tightly and ensure the storage location is encrypted and access-controlled. Combine the health endpoint with telemetry for continuous monitoring of peer count, leader transitions, and replication lag.

  • Make TLS mandatory on the listener, distribute the CA, and ensure SNI matches between peers
  • Enable persistence and forwarding for the audit device (audit logs)
  • Export telemetry and build dashboards on top of it
  • Test certificate rotation procedures ahead of time
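
SAN coverage can be checked before a certificate is deployed or rotated. The following sketch reads the subjectAltName text that openssl prints and looks for an exact DNS entry; san_covers is a hypothetical helper, and wildcard or IP entries are not handled here.

```shell
#!/usr/bin/env bash
# Read subjectAltName text on stdin and succeed if HOST is listed as a DNS entry.
# Exact match only; wildcard entries are not expanded in this sketch.
san_covers() {
  local host=$1
  tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -xF "DNS:$host" >/dev/null
}

# Usage (requires OpenSSL 1.1.1 or later for -ext):
#   openssl x509 -in /etc/vault.d/certs/vault.crt -noout -ext subjectAltName \
#     | san_covers vault-1.example.com && echo "SAN ok"
```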

Check Your Understanding

Ops

Question 1

You want to achieve high availability (HA) on Vault OSS without adding any external middleware. Which configuration is the standard, recommended choice?

  1. Spread 3 Vault nodes across multiple AZs using Integrated Storage (Raft), and access the API (8200) through a load balancer.
  2. Run a single-node Vault with the File backend and rely on hourly snapshots for availability.
  3. Use a 2-node Vault setup with Raft so the nodes fail over between each other.
  4. Use Raft for Vault but close the internal cluster port (8201) and expose only the API (8200).

Correct answer: 1

The recommended way to achieve HA simply is Integrated Storage (Raft) on an odd number of nodes (typically 3). The LB targets the API (8200), while peers must reach each other on 8201. File doesn't support HA, 2-node setups risk losing quorum, and closing 8201 breaks the cluster.

Frequently Asked Questions

What is the difference between api_addr and cluster_addr?

api_addr is the address a node advertises for client traffic, that is, the endpoint clients and load balancers connect to (and are redirected to). cluster_addr is the address Vault nodes use to talk to each other (Raft replication and request forwarding). Both must be reachable and covered by the certificate SANs.

Where should snapshots be taken and restored?

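As a rule, take and restore snapshots on the leader node. Use vault operator raft snapshot save to capture snapshots. To restore, run vault operator raft snapshot restore against a running, unsealed cluster; if you are rebuilding from scratch, initialize and unseal the new cluster first, and use -force only when the snapshot comes from a different cluster.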

How do you handle a node that won't rejoin after a failure?

Check peers with vault operator raft list-peers, and for permanent failures, evict the node from the cluster with remove-peer. Then bring up a fresh node with a clean data directory and rejoin via raft join. Always perform these steps in an order that preserves quorum.
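
The replacement flow above can be written down as a reviewable plan before anyone runs it. This sketch only prints the commands; replacement_plan is a hypothetical helper, and the node ID and address in the example call are placeholders.

```shell
#!/usr/bin/env bash
# Print (do not run) the commands for replacing a permanently failed peer.
#   $1 = node_id of the failed peer, $2 = API address of a healthy node
replacement_plan() {
  local failed_id=$1 leader_addr=$2
  cat <<EOF
# On a healthy node, while the cluster still has quorum:
vault operator raft list-peers
vault operator raft remove-peer $failed_id
# On the replacement node (empty data directory, VAULT_ADDR pointing at itself):
vault operator raft join $leader_addr
vault operator unseal
vault operator raft list-peers
EOF
}

replacement_plan vault-3 https://vault-1.example.com:8200
```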

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

