Vault's integrated storage (Raft) lets you capture the cluster's consistent state as a snapshot. Snapshots can be taken with minimal downtime and are effective for disaster recovery and migrations.
This article focuses on the commands and ordering you actually need in operations, with Ops certification exam tips included along the way.
Vault's integrated storage is replicated across multiple nodes via Raft. Snapshots are created from the leader's applied state, so the resulting backup is consistent. You can issue the command against any node — it is forwarded internally to the leader.
A snapshot contains Vault's entire storage state (secrets, policies, auth backend configuration, namespaces, replication metadata, and so on). Server configuration files (vault.hcl), TLS certificates, and keys are not included. Treat snapshots themselves as sensitive material — store them with access control, encryption, and integrity verification in place.
Restoration requires the original Unseal keys (shards). Even if someone obtains the snapshot, they cannot decrypt the data without those keys. Compatibility across versions is not guaranteed, so restoring to the same version as the source, or to a slightly newer compatible version, is safest.
Basic verification commands (leader check and peer status)
export VAULT_ADDR=https://vault.example.com:8200
export VAULT_TOKEN=s.xxxxx # token with sudo privileges
# Check the leader
vault status | grep -E 'HA Mode|HA Cluster|Active Node Address'
# Check the Raft peers
vault operator raft list-peers
Routine backups follow this order: 1) check peers and health, 2) save the snapshot to a secure temporary location, 3) compute a hash, 4) rotate according to the retention policy, and 5) record the operation in audit logs.
Snapshot creation carves out a consistency point on the leader. In large environments, the I/O can take a while, so verify free space and throughput on the destination volume in advance.
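The free-space check mentioned above can be scripted as a small guard that runs before the snapshot is saved. This is a minimal sketch: the function name `has_free_space` and the idea of passing a required size in kilobytes are assumptions for illustration, not part of Vault.

```shell
# Hypothetical helper: succeed only if DIR has at least REQUIRED_KB kilobytes free.
has_free_space() {
  local dir="$1"
  local required_kb="$2"
  local avail_kb
  # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
  avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR==2 {print $4}')
  # An empty value means df could not inspect the directory.
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$required_kb" ]
}
```

A backup script could call `has_free_space /var/backups/vault 1048576 || exit 1` before running the snapshot command. The 1 GiB threshold here is only a placeholder, so size it from your own previous snapshots.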
Example: taking and verifying a snapshot
# Take the snapshot
SNAP_DIR=/var/backups/vault
SNAP_FILE=${SNAP_DIR}/vault-raft-$(date +%Y%m%d-%H%M%S).snap
mkdir -p "$SNAP_DIR"
vault operator raft snapshot save "$SNAP_FILE"
# Integrity check (record the hash)
sha256sum "$SNAP_FILE" | tee -a "${SNAP_DIR}/SHA256SUMS"
# Record metadata
{
  echo "created_at=$(date -Is)"
  echo "vault_version=$(vault version)"
  # The leader is the server entry whose "leader" field is true
  echo "leader=$(vault operator raft list-peers -format=json | jq -r '.data.config.servers[] | select(.leader) | .address')"
} | tee -a "${SNAP_FILE}.meta"
# Rotation (example: keep 30 generations)
ls -1t "${SNAP_DIR}"/vault-raft-*.snap | tail -n +31 | xargs -r rm -f
Replicate snapshots to at least one offsite location or a different AZ, and ship the hash and metadata with them so that tampering can be detected. On cloud storage, versioning and WORM (object lock) further improve recovery resilience.
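Recording a hash only helps if it is checked again before the snapshot is used. The sketch below assumes the `SHA256SUMS` file has the standard `sha256sum` line format and that it stores the same path you pass in. The function name `verify_snapshot` is hypothetical.

```shell
# Hypothetical helper: verify one snapshot against a SHA256SUMS file.
# Returns 0 on match, 1 on mismatch, 2 if no hash was recorded for the path.
verify_snapshot() {
  local snap="$1"
  local sums="$2"
  local expected actual
  # sha256sum lines look like "<hash>  <path>"; keep the most recent entry for this path.
  expected=$(awk -v f="$snap" '$2 == f {h = $1} END {print h}' "$sums")
  if [ -z "$expected" ]; then
    echo "no recorded hash for $snap" >&2
    return 2
  fi
  actual=$(sha256sum "$snap" | awk '{print $1}')
  [ "$expected" = "$actual" ]
}
```

Running `verify_snapshot "$SNAP_FILE" "${SNAP_DIR}/SHA256SUMS"` right before a restore catches both corruption in transit and a snapshot that was swapped after the hash was recorded. Paths containing spaces are not handled by this sketch.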
Double up encryption in transit and at rest where possible. Combine SSE-KMS and GPG encryption and keep the set of people holding decryption keys small. Logging backup success/failure and hashes to your audit trail also makes compliance reviews easier.
Example: transfer to S3 (SSE-KMS) and GPG encryption
# Upload to S3 (encrypted with a KMS key)
AWS_BUCKET=s3://org-backup-vault
aws s3 cp "$SNAP_FILE" "$AWS_BUCKET/" \
  --sse aws:kms --sse-kms-key-id arn:aws:kms:us-east-1:123456789012:key/xxxx
aws s3 cp "${SNAP_FILE}.meta" "$AWS_BUCKET/"
aws s3 cp "${SNAP_DIR}/SHA256SUMS" "$AWS_BUCKET/"
# Example: encrypt client-side with GPG before sending
gpg --encrypt --recipient [email protected] "$SNAP_FILE"
aws s3 cp "${SNAP_FILE}.gpg" "$AWS_BUCKET/"
Whether you are rebuilding a single node or recovering the entire cluster, the standard approach is to restore the snapshot on one node, unseal it, and then have the other nodes join. Restoring onto a node that still holds old data is error-prone, so empty the target node's Raft data directory first.
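A restore runbook can enforce the empty-directory requirement with a guard instead of relying on memory. This is a minimal sketch. The function name `raft_dir_is_empty` is hypothetical, and it assumes a `find` that supports `-mindepth` and `-quit` (GNU and BSD both do).

```shell
# Hypothetical pre-restore guard: succeed only if DIR exists and contains nothing.
raft_dir_is_empty() {
  local dir="$1"
  [ -d "$dir" ] || return 1
  # find prints the first entry below DIR (if any) and stops immediately.
  [ -z "$(find "$dir" -mindepth 1 -print -quit)" ]
}
```

For example, `raft_dir_is_empty /opt/vault/data || { echo "data directory not empty" >&2; exit 1; }` stops the procedure before anything is started on top of leftover data.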
Restore on the same Vault version as the source, or on a compatible one. After restoration, expired dynamic secrets and leases are cleaned up based on their TTLs. Server configuration (vault.hcl) and TLS certificates are not in the snapshot, so prepare equivalent versions separately.
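Because configuration and TLS material live outside the snapshot, they need their own backup step. One way to sketch it is a tar archive with restrictive permissions. The function name `backup_vault_config` and any paths passed to it are assumptions, so adjust them to your layout.

```shell
# Hypothetical helper: archive the config and TLS files that Raft snapshots do not cover.
# Usage: backup_vault_config ARCHIVE [tar options] FILE...
backup_vault_config() {
  local archive="$1"
  shift
  # The archive may contain private keys, so tighten the umask
  # in a subshell before tar creates the file.
  ( umask 077 && tar -czf "$archive" "$@" )
}
```

A call such as `backup_vault_config /var/backups/vault/config.tar.gz -C /etc/vault.d vault.hcl tls` (illustrative paths) produces an archive that should be encrypted and stored with the same care as the snapshot itself.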
Example: full cluster recovery (assuming systemd)
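# 1) Stop Vault on every node
sudo systemctl stop vault
# 2) Delete the data only on node A, which will be used for recovery
#    (sh -c makes the glob expand as root, even if your user cannot read the directory)
sudo sh -c 'rm -rf /opt/vault/data/*'
# 3) Start node A, then initialize and unseal it with temporary keys
#    (snapshot restore needs an initialized, unsealed Vault and a valid token)
sudo systemctl start vault
export VAULT_ADDR=https://node-a.example.com:8200
vault operator init -key-shares=1 -key-threshold=1
vault operator unseal # enter the temporary unseal key printed by init
export VAULT_TOKEN=s.xxxxx # temporary root token printed by init
# 4) Apply the snapshot (on node A); -force is needed because the temporary keys differ from the snapshot's
vault operator raft snapshot restore -force /var/backups/vault/vault-raft-20240401-000000.snap
# 5) Unseal (using the original Unseal keys)
vault operator unseal
vault operator unseal
vault operator unseal
# 6) Verify (use a token that was valid in the original cluster)
vault status
vault operator raft list-peers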
# 7) After clearing the data on the remaining nodes B and C and starting them, run join on each node
# Run on node B (after exporting B's VAULT_ADDR)
vault operator raft join https://node-a.example.com:8200
vault operator unseal # repeat until the key threshold is met
# Same on node C
vault operator raft join https://node-a.example.com:8200
vault operator unseal # repeat until the key threshold is met
Backup options for Vault include Raft snapshots, filesystem-level copies, and Consul snapshots (when Consul storage is in use). When you are running on Raft integrated storage, Raft snapshots are almost always the safest choice.
When designing operations, decide upfront on snapshot storage location, encryption, retention, separate handling for configuration files and certificates, and recovery drill frequency. That preparation eliminates hesitation when a real incident hits.
| Method | Consistency | Downtime / Impact | Operational Notes |
|---|---|---|---|
| Raft snapshot (vault operator raft snapshot) | Consistent based on the leader's applied state | Effectively none (taken online) | Officially recommended. Manage config/TLS separately. Restore by applying to one node, then having others join. |
| Filesystem copy (stop, then rsync/snapshot) | Consistent when stopped; risky while running | Requires downtime | For small or single-node setups that can be stopped; involves more manual steps |
| Consul snapshot (when using Consul storage) | Depends on Consul's consistency model | Low (can be taken online) | Only valid with the Consul backend; not available on Vault integrated storage |
Raft snapshot capture and restore flow (conceptual diagram)
For comparison: example with the Consul backend (reference only)
# Reference for when Vault uses Consul storage (not usable with Raft integrated storage)
# Consul snapshot
consul snapshot save consul-$(date +%Y%m%d).snap
# Restore
consul snapshot restore consul-20240401.snap
Ops exams frequently ask about the snapshot commands, the restore order, the requirement for Unseal keys, the fact that configuration is not in the snapshot, and how version compatibility is handled. Make sure you can state the recovery order (apply to one node, unseal, then join) without hesitation.
Typical errors include restoring without clearing the data directory (existing logs make it fail), leader forwarding failures during network partitions, and version-compatibility errors. Automating the restore procedure in staging beforehand minimizes hesitation when production is on the line.
Commands for troubleshooting
# Check whether forwarding to the leader is working
vault status
# Review peers and voter status
vault operator raft list-peers -format=json | jq .
# Snapshot sizes and recomputed hashes
ls -lh /var/backups/vault/*.snap
sha256sum /var/backups/vault/*.snap
# Autopilot state (useful for judging stability)
vault operator raft autopilot state
Question 1 (Ops)
You are running Vault on integrated storage (Raft). A disk failure has wiped out the data directories on all nodes, but the most recent Raft snapshot is safely stored. Which recovery procedure is most appropriate?
Correct answer: A
The correct recovery is to apply the snapshot to one node, unseal it, and have the other nodes join. Copying directly into data_dir or starting all nodes simultaneously produces inconsistencies. The snapshot does not contain configuration, so vault.hcl must be provided separately. Overwriting another cluster is both discouraged and dangerous.
Is it safe to take a snapshot while writes are happening?
Yes. The request is forwarded to the leader, and a consistent state is carved out based on the applied commit point. No downtime is typically required, but you should monitor for latency increases.
Do snapshots include server configuration or TLS certificates?
No. Back up vault.hcl, TLS certificates, and keys separately, and prepare an equivalent configuration during recovery. Snapshots only cover storage state (secrets, policies, etc.).
Can I restore to a different version of Vault?
Compatibility is limited. As a rule, restore to the same major version as the source, or to an equivalent or newer compatible version. If there is a large version gap, validate in staging beforehand and perform a staged upgrade if needed.
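If each snapshot has a sidecar `.meta` file containing a `vault_version=` line, the version check can be automated before a restore. The sketch below is deliberately conservative: it only accepts a target on the same major.minor release line, which is a stricter rule than the guidance above and is an assumption of this example. Both function names are hypothetical.

```shell
# Hypothetical helper: print the "major.minor" version recorded in a .meta file.
# Accepts lines such as "vault_version= Vault v1.15.2 (...), built ...".
meta_major_minor() {
  sed -n 's/^vault_version=.*v\([0-9][0-9]*\.[0-9][0-9]*\)\..*/\1/p' "$1" | head -n 1
}

# Succeed when the snapshot and the target version (e.g. "1.15.4")
# share the same major.minor release line.
same_release_line() {
  local meta="$1"
  local target="$2"
  # ${target%.*} strips the patch component: "1.15.4" becomes "1.15".
  [ "$(meta_major_minor "$meta")" = "${target%.*}" ]
}
```

A restore script could refuse to continue when `same_release_line "$SNAP_FILE.meta" 1.15.4` fails, forcing a deliberate decision (and a staging test) whenever the versions differ.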
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.