Vault Performance Replication: Practical Guide to Load Distribution and Horizontal Scaling

2026-04-19
NicheeLab Editorial Team

If Vault throughput is hitting a ceiling per region, or cross-region latency has become a bottleneck, Performance Replication is the strongest option to consider.

This article covers design decisions, setup, operations, monitoring, and troubleshooting. It is intended both for exam preparation (aligned with the HashiCorp Security Automation exam scope) and for real-world practice.

Goals and Prerequisites of Performance Replication

Performance Replication is a scale-out feature in Vault Enterprise that processes reads locally on secondaries in each region while applying writes centrally on the primary. This shortens latency and stabilizes throughput.

Three prerequisites shape the design: 1) Replication is asynchronous, so a very short replication lag can occur. 2) Authentication and tokens are local to each cluster, and there is no global session sharing. 3) Failover during incidents is handled by DR Replication, while Performance Replication targets scale and latency optimization; the two roles are intentionally separate.

For storage, both Consul and Integrated Storage (Raft) are available, but Integrated Storage is increasingly recommended for operational simplicity and unified management.

  • Reads are processed locally on each Performance secondary
  • Writes are applied to the primary as a rule (write-forwarding from secondaries is supported)
  • Authentication, tokens, and leases are independent per cluster
  • Disaster recovery is the role of DR Replication (separate the use cases)

Topology and Data Flow (Read Distribution and Write Consistency)

A typical layout is one Performance primary plus multiple Performance secondaries per region. Each cluster is internally HA, and external access goes through a per-region load balancer.

Client reads land on the nearest secondary, and writes are forwarded via the secondary to the primary, returning after the primary commits. Data is then replicated asynchronously to each secondary. Because of the short replication lag, strict local read-after-write consistency is not guaranteed (in most cases this is practically fine, but stricter requirements call for mitigation in your design).

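For networking, each secondary must be able to reach the primary's cluster address (TCP 8201/TLS by default) for replication traffic, and the primary's API address (8200/TLS) to unwrap the activation token when it joins. The API port (8200/TLS) is also what clients and load balancers use.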
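As a rough pre-flight check for the ports described above, the following sketch probes a peer's API and cluster ports. It assumes a netcat build that supports `-z` (connect without sending data) and `-w` (timeout); the function name `check_vault_ports` is illustrative and not part of Vault.

```shell
# Sketch: verify that a peer cluster's API and cluster ports accept TCP connections.
# Assumes netcat with -z/-w support. A successful TCP connect says nothing about
# TLS validity; it only rules out routing and firewall problems.

check_vault_ports() {
  local host="$1"
  local api_port="${2:-8200}"      # Vault API port (default used in this article)
  local cluster_port="${3:-8201}"  # Vault cluster/replication port
  local failed=0
  local port

  for port in "$api_port" "$cluster_port"; do
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
      echo "ok   ${host}:${port}"
    else
      echo "FAIL ${host}:${port}"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_vault_ports vault-primary.example.com` from a node in the secondary cluster prints one line per port and returns non-zero if either port is unreachable.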

  • Each cluster is configured for HA (Active + Standby)
  • LB health checks use /v1/sys/health (add perfstandbyok=true to also accept performance standbys)
  • Inter-cluster traffic assumes mutual TLS (distribute and verify CAs thoroughly)
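To make the health check above easier to reason about, here is a minimal sketch that maps the HTTP status returned by /v1/sys/health to a readable label. The codes are Vault's documented defaults; the function names `vault_health_label` and `vault_node_health` are hypothetical helpers, not Vault commands.

```shell
# Sketch: translate the HTTP status from /v1/sys/health into a label a script can act on.
# These are Vault's default codes; query parameters such as standbycode can change them.

vault_health_label() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    472) echo "dr-secondary" ;;
    473) echo "performance-standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown" ;;
  esac
}

# Hypothetical wrapper: fetch the status code for one node and print its label.
# $1 is the node's base URL, e.g. https://vault-eu.example.com:8200
vault_node_health() {
  local code
  code="$(curl -sS -o /dev/null -w '%{http_code}' "$1/v1/sys/health")" || code="000"
  vault_health_label "$code"
}
```

A load balancer normally only needs the raw status code, so a helper like this is mainly useful for ad hoc checks and monitoring scripts.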

Conceptual diagram of global Performance Replication

[Figure: clients in each region (US and APAC shown) reach their local cluster through a load balancer in front of active/standby nodes. The primary cluster replicates to the EU and APAC secondaries over 8201/TLS.]

Mode Comparison and Design Decisions (Performance / DR / Single Cluster)

Balancing scale, availability, and operational cost starts with correctly understanding each replication mode's role. Remembering that Performance is for latency reduction and read throughput, while DR is for disaster recovery, will keep you on track in the exam.

The comparison table below is useful during early requirement gathering.

  • If you have strict RTO/RPO requirements, combine with DR Replication
  • Global read optimization is centered on Performance Replication
  • Tokens and leases are independent per cluster (not shared globally)

Perspective | Performance Replication | DR Replication | Single Cluster
Primary purpose | Read distribution, latency reduction, horizontal scale | Disaster recovery and primary replacement | Simple operations
Reads | Processed locally on each secondary | DR side is normally idle (activated upon failover) | Single site only
Writes | Applied centrally on the primary (forwarding from secondaries supported) | Not allowed on DR side (allowed after promotion) | Handled at a single site
Consistency | Asynchronous replication (short lag) | Not in effect until promotion | Local consistency only
Failover | Out of scope; assumes DR is also used | Explicitly supported (via promotion) | Vulnerable to server failures
Auth / tokens | Independent per cluster | After promotion, valid on the new primary | Single management plane

Minimal Setup Procedure (CLI-Focused)

We assume both primary and secondary are already running Vault Enterprise, initialized, unsealed, and TLS-configured. The example uses Integrated Storage (Raft) for storage.

Secondary registration uses a one-time activation token, after which a full sync runs in the background.

  • Enable Performance Replication on the primary
  • Issue a token for the secondary (managed by identifier)
  • Enable and join from the secondary
  • Verify sync and status
  • Update LB health checks (decide whether to allow standbys)

CLI example: from enabling to status check

# 1) Enable Performance Replication on the primary
vault write -f sys/replication/performance/primary/enable

# 2) Generate a one-time activation token for the secondary (with an identifier)
vault write sys/replication/performance/primary/secondary-token id="sec-eu-001"
# Note the wrapping_token value from the output (single-use)

# 3) Enable Performance Replication on the secondary (using the primary-issued token)
#    Caution: this discards any data already stored on the secondary cluster
vault write sys/replication/performance/secondary/enable token="s.xxxxx..."

# 4) Check status (verifying on both sides is reassuring)
vault read sys/replication/status
vault read sys/replication/performance/status

# 5) (Optional) Take an Integrated Storage snapshot (for offload / verification)
#   Operationally, regular snapshots and secure storage are recommended
vault operator raft snapshot save /tmp/vault_raft.snap
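The manual steps above can also be scripted. The following is a sketch only: it assumes the operator supplies `PRIMARY_ADDR`, `PRIMARY_TOKEN`, `SECONDARY_ADDR`, and `SECONDARY_TOKEN` (hypothetical variable names), that both tokens are allowed to write to the sys/replication paths, and that the function name `join_performance_secondary` is illustrative.

```shell
# Sketch: issue an activation token on the primary and consume it on the secondary.
# Assumptions: PRIMARY_ADDR / SECONDARY_ADDR and the *_TOKEN variables are set by
# the operator. Warning: enabling a cluster as a secondary discards its existing data.

join_performance_secondary() {
  local secondary_id="$1"
  local activation

  # -field=wrapping_token prints only the wrapped activation token.
  activation="$(VAULT_ADDR="$PRIMARY_ADDR" VAULT_TOKEN="$PRIMARY_TOKEN" \
    vault write -field=wrapping_token \
    sys/replication/performance/primary/secondary-token id="$secondary_id")" || return 1

  [ -n "$activation" ] || { echo "no activation token returned" >&2; return 1; }

  # The activation token is single-use, so it is passed straight to the secondary.
  VAULT_ADDR="$SECONDARY_ADDR" VAULT_TOKEN="$SECONDARY_TOKEN" \
    vault write sys/replication/performance/secondary/enable token="$activation"
}
```

Calling `join_performance_secondary sec-eu-001` performs steps 2 and 3 in one go. Keeping the token in a shell variable avoids copying it by hand, although it is still visible in the process arguments of the second command.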

Operational Essentials (LB, Auth, Write Forwarding, Backups)

LB design: each region's LB prefers the active node and, when standbys should also serve traffic, uses a health check with perfstandbyok=true. To redirect traffic to another region during a cluster failure, the standard pattern is to combine Performance with DR rather than rely on Performance alone.

Authentication and tokens: operated independently per cluster. Users and apps log in at the nearest secondary, and the resulting token is only valid within that cluster. Avoid designs that assume a globally common token.
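One way to keep tokens scoped to the cluster that issued them is to resolve the address and the token location from the same region key. This is a sketch under assumptions: the region keys, the host name vault-ap.example.com, and both function names are hypothetical.

```shell
# Sketch: resolve the Vault address for a region so login and later requests hit the same cluster.
# Host names are illustrative; vault-ap.example.com is hypothetical.

vault_addr_for_region() {
  case "$1" in
    us) echo "https://vault-primary.example.com:8200" ;;
    eu) echo "https://vault-eu.example.com:8200" ;;
    ap) echo "https://vault-ap.example.com:8200" ;;
    *)  echo "unknown region: $1" >&2; return 1 ;;
  esac
}

# Keep one token file per cluster so a token is never sent to a cluster that did not issue it.
token_file_for_region() {
  echo "${HOME}/.vault-token-$1"
}
```

A client would export `VAULT_ADDR="$(vault_addr_for_region eu)"`, log in against that address, and store the resulting token at the path printed by `token_file_for_region eu`.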

Write forwarding: write requests to a secondary are forwarded to the primary and respond after the primary commits. Even after a successful response, the change can take a moment to appear on the secondary, so account for this in your API client retries and consistency requirements.
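A client that needs to read its own write from a secondary can poll briefly instead of assuming the value is already there. The helper below is a generic sketch: `wait_for_value` is a hypothetical name, and the command it runs is whatever read the client would normally perform.

```shell
# Sketch: poll a read until it returns the value that was just written, or give up.
# Usage: wait_for_value <expected> <attempts> <delay-seconds> <command...>

wait_for_value() {
  local expected="$1" attempts="$2" delay="$3"
  shift 3
  local i=1 actual

  while [ "$i" -le "$attempts" ]; do
    # Run the caller's read command and compare its output with the expected value.
    actual="$("$@" 2>/dev/null)"
    if [ "$actual" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

For example, after writing `version=42` to a KV path, `wait_for_value 42 5 1 vault kv get -field=version secret/app/config` retries the read up to five times, one second apart (the path and field are placeholders). Keep the attempt count small: a long wait usually points to a replication problem rather than normal lag.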

  • LB health check: /v1/sys/health?perfstandbyok=true
  • Ports: API 8200/TLS, Cluster 8201/TLS (mutual connectivity)
  • Design tokens/leases to be scoped within each cluster
  • Backups are still required separately (regular snapshots)

Health check example (curl) and snapshot

# LB health check (treat standbys and performance standbys as healthy)
curl -sS "https://vault-eu.example.com:8200/v1/sys/health?standbycode=200&perfstandbyok=true" -o /dev/null -w "%{http_code}\n"

# Save a Raft snapshot (requires authentication)
vault login s.xxxxx...
vault operator raft snapshot save "/backups/vault-$(date +%F).snap"

Exam Points and Troubleshooting

Exam tips: Performance is for scale, DR is for failover. Tokens/leases are per cluster. Secondary writes are forwarded and applied at the primary. Lock in these three points first.

Issues like "initial sync isn't progressing" or "lag won't go away" are most often caused by networking, TLS, or time sync. Confirm basic connectivity, certificate chains, and NTP before digging deeper.

  • Inspection command: vault read sys/replication/status
  • Connectivity: does 8201/TLS reach bidirectionally between clusters?
  • Certificates: does the SAN include the cluster address?
  • Time sync: NTP drift affects TLS validation and TTLs
  • Replication lag: monitor the correlation between load (WAL/queue) and bandwidth
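For the replication lag item above, one rough indicator is the gap between the WAL index on the primary and the index the secondary has received. The sketch below assumes the primary's status output exposes `last_wal` and the secondary's exposes `last_remote_wal`; confirm the field names against the output of your Vault version. `wal_lag` and `lag_within` are hypothetical helper names.

```shell
# Sketch: estimate replication lag as the gap between WAL indexes.
# Assumed inputs (verify field names on your version):
#   primary:   vault read -field=last_wal sys/replication/performance/status
#   secondary: vault read -field=last_remote_wal sys/replication/performance/status
# The result is a count of WAL entries, not a duration.

wal_lag() {
  local primary_wal="$1" secondary_wal="$2"
  echo $((${primary_wal} - ${secondary_wal}))
}

# Hypothetical threshold check: succeed while the lag stays at or below the limit.
# Usage: lag_within <limit> <primary_wal> <secondary_wal>
lag_within() {
  local limit="$1" lag
  lag="$(wal_lag "$2" "$3")"
  [ "$lag" -le "$limit" ]
}
```

A gap that stays small under load is normal for asynchronous replication. A gap that keeps growing is the signal to look at bandwidth and write volume.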

Checking basic status and health

# Replication state
vault read sys/replication/status
vault read sys/replication/performance/status

# Health checks (on primary and secondary)
curl -sS https://vault-primary.example.com:8200/v1/sys/health | jq .
curl -sS "https://vault-eu.example.com:8200/v1/sys/health?perfstandbyok=true" | jq .
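When a new secondary is still completing its initial sync, the status check can be wrapped in a short polling loop. This sketch assumes that the `state` field reads `stream-wals` once the secondary is following the primary normally, and that values such as `merkle-sync` mean synchronization is still in progress. The function names are hypothetical.

```shell
# Sketch: wait until a secondary reports that it is streaming WALs from the primary.
# Assumption: state is "stream-wals" in steady state; other values mean sync is ongoing.
# Run against the secondary (VAULT_ADDR and VAULT_TOKEN set for that cluster).

replication_state() {
  vault read -field=state sys/replication/performance/status
}

# Usage: wait_for_streaming [attempts] [delay-seconds]
wait_for_streaming() {
  local attempts="${1:-30}" delay="${2:-10}" i=1 state

  while [ "$i" -le "$attempts" ]; do
    state="$(replication_state 2>/dev/null)"
    echo "attempt ${i}: state=${state:-unavailable}"
    [ "$state" = "stream-wals" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

With the defaults, `wait_for_streaming` checks every 10 seconds for up to 5 minutes. If it times out, work through the connectivity, certificate, and time sync checks listed earlier.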

Check with a Sample Question

Question 1

Which statement about Vault Enterprise Performance Replication is most accurate?

  1. A token acquired on a secondary is valid on all clusters, enabling global authentication
  2. Writes to a secondary are forwarded to the primary and respond after being committed on the primary
  3. Performance Replication is primarily for disaster recovery (automatic failover)
  4. Reads are always handled by the primary, and secondaries only stand by

Correct answer: 2

Performance Replication processes reads locally on each secondary and forwards writes from a secondary to the primary, where they are applied. Tokens/leases are independent per cluster, and DR scenarios are the responsibility of DR Replication.

Frequently Asked Questions

Can I send write requests to a secondary?

Yes. The secondary forwards writes to the primary, and the response returns after the primary commits. Be aware that an immediate local read on the secondary may not yet reflect the write because of a very short replication lag.

Are tokens and leases replicated?

No. With Performance Replication, authentication, tokens, and leases are local to each cluster. Users and applications log in to the nearest cluster, and the issued token is only valid within that same cluster.

Can Performance Replication alone handle failover?

Not recommended. Business continuity should be designed with DR Replication. Performance Replication is primarily for scale and latency optimization, and separating the two roles is the best practice.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
