Vault Telemetry: Prometheus & StatsD (2026)

Vault can expose internal counters, latencies, and other signals as telemetry. The two common consumers are Prometheus (pull) and StatsD (push). This article covers when to use each, common configuration pitfalls, and HA considerations.

For exam prep, the frequent topics are the main telemetry stanza options, the format selection for /v1/sys/metrics, typical Prometheus scrape configurations, and the port/protocol assumptions for StatsD.

Goals and Context: The Vault Telemetry Big Picture

Vault enables observability exports via the telemetry stanza in the server config. Prometheus-compatible output is available over HTTP at /v1/sys/metrics?format=prometheus, and StatsD pushes metrics over UDP.

This article assumes OSS Vault and focuses on stable options and common operational patterns. In environments with version differences, always verify the configurable keys against the documentation for the Vault version you are running.

Prometheus: pull model. The Prometheus server periodically scrapes Vault's HTTP endpoint.
StatsD: push model. Vault sends metrics over UDP to a local or remote StatsD daemon.
In HA configurations, watch for metric labels and double-counting between active and standby nodes.

Minimal connectivity check (fetching in Prometheus format)

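# A token is required unless unauthenticated metrics access is enabled on the listener
curl -s --cacert /etc/ssl/certs/ca.pem \
  -H "X-Vault-Token: $VAULT_TOKEN" \
  "https://vault.example.com:8200/v1/sys/metrics?format=prometheus" | head -50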
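
The same check can be scripted. The following is a minimal Python sketch that only builds the request, assuming the caller supplies a token and that the endpoint requires one. The helper name build_metrics_request, the host, and the token value are illustrative and not part of Vault.

```python
import urllib.parse
import urllib.request


def build_metrics_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for Vault's metrics endpoint in Prometheus format."""
    query = urllib.parse.urlencode({"format": "prometheus"})
    url = f"{base_url.rstrip('/')}/v1/sys/metrics?{query}"
    # X-Vault-Token is the header Vault reads the client token from.
    return urllib.request.Request(url, headers={"X-Vault-Token": token})


if __name__ == "__main__":
    # Hypothetical host and token, for illustration only.
    req = build_metrics_request("https://vault.example.com:8200", "hypothetical-token")
    print(req.get_method(), req.full_url)
```

To actually send it, the request could be passed to urllib.request.urlopen together with an SSL context created from your CA file, which mirrors the --cacert option in the curl command.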

Architecture: Pull vs. Push and the HA Flow

Prometheus uses pull while StatsD uses push, so the network direction, reachability requirements, label/tag representation, and deduplication strategies all differ. When monitoring a Vault cluster, deciding up front how active and standby nodes are handled, together with the TLS and firewall design, reduces later rework.

For Prometheus, ensure scrape reachability (firewalls/ACLs) and append format=prometheus to /v1/sys/metrics.
For StatsD, open UDP 8125 (the common default). UDP provides no retransmission, so a lost packet is a lost metric; factor in network quality.

Item	Prometheus (Pull)	StatsD (Push)	Operational impact
Collection direction	Monitoring side pulls	Vault pushes	The reachability requirement flips
Protocol	HTTP(S)	UDP 8125, etc.	Account for UDP loss and reordering
Metadata	Labels (key=value)	Tags only via extensions (DogStatsD, etc.)	Controlling high cardinality is critical
Duplication under HA	Suppressed by controlling scrape targets	Suppressed by controlling the sender	Requires explicit active/standby operational design
Dashboards	High affinity with Grafana, etc.	Depends on the aggregator (Graphite, Datadog, etc.)	Aligns with your visualization platform

Data flow (conceptual diagram)

Typical HA target example (static targets shown; with Consul service discovery, the static_configs block would be replaced)

# Sketch of a Prometheus job that scrapes every Vault node (excerpt)
scrape_configs:
- job_name: 'vault'
  metrics_path: /v1/sys/metrics
  params:
    format: ['prometheus']
  scheme: https
  tls_config:
    ca_file: /etc/prometheus/ca.pem
  static_configs:
  - targets: ['vault-1.example.com:8200','vault-2.example.com:8200','vault-3.example.com:8200']

Vault telemetry Configuration (Minimal Setup and Safety Tips)

On the Vault side, the telemetry stanza selects which outputs are enabled and where metrics are sent. For a Prometheus-only integration, a safe practical starting point is to leave the HTTP endpoint as the only output, set the retention time (prometheus_retention_time), and turn off hostname prefixing (disable_hostname).

If you also use StatsD, add statsd_address. If your observability platform (Datadog, Graphite, etc.) is already established, this keeps migration cost low.

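disable_hostname=true stops Vault from prefixing gauge names with the local hostname, so metric names stay identical across nodes and the number of distinct series stays bounded, which stabilizes dashboards.
prometheus_retention_time controls how long Vault keeps Prometheus metrics in memory. Setting it to 0 disables Prometheus output; setting it too large drives up memory usage.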
Because StatsD is UDP, design with no expectation of retransmission on packet loss or buffer overflows.

vault.hcl excerpt: enabling Prometheus and StatsD simultaneously

telemetry {
  # Retention time for Prometheus-format metrics (e.g., 24h)
  prometheus_retention_time = "24h"

  # Keep the hostname out of metric names to limit series cardinality
  disable_hostname = true

  # StatsD output (enable only if needed)
  # The common default port is 8125/udp
  statsd_address = "127.0.0.1:8125"
}

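# Apply (systemd example). If a reload does not pick up telemetry changes
# on your Vault version, restart the service instead.
# sudo systemctl reload vault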

Prometheus Integration: Scrape Config and Verification

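From Prometheus, scrape /v1/sys/metrics with format=prometheus appended. Always connect over TLS and perform at least CA verification rather than disabling certificate checks. Unless unauthenticated metrics access has been enabled on the Vault listener, the scrape must also present a Vault token whose policy allows reading sys/metrics. In HA environments, the common approach is to target every node or resolve them dynamically via service discovery.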

To dampen double-counting and failover jitter, stabilizing the stats via recording rules is a practical approach (e.g., widen the rate window slightly).
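
To illustrate why a wider window smooths failover effects, the following Python sketch computes a per-second rate from counter samples and treats any decrease as a counter reset, as happens when a node restarts. It is a simplified stand-in for a rate-style calculation, not Prometheus' actual algorithm (which also extrapolates to the window boundaries). The function name and sample data are hypothetical.

```python
from typing import Sequence, Tuple


def windowed_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Per-second increase over (timestamp_seconds, counter_value) samples.

    A drop in the counter is treated as a reset to zero, so the value
    observed after the drop counts in full as new increase.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Hypothetical samples: the counter resets between t=15 and t=30.
EXAMPLE_SAMPLES = [(0.0, 100.0), (15.0, 130.0), (30.0, 10.0), (45.0, 40.0)]
```

With more samples in the window, a single reset contributes a smaller share of the total, which is the intuition behind widening the rate window in a recording rule.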

Set metrics_path to /v1/sys/metrics and specify format=prometheus in params.
Do not forget TLS settings (ca_file/server_name). For self-signed certificates, distribute the CA.
For initial verification, use curl to confirm the formatted output (status 200, metrics starting with # HELP/# TYPE).

Prometheus scrape_configs (minimal example)

scrape_configs:
- job_name: 'vault'
  scheme: https
  metrics_path: /v1/sys/metrics
  params:
    format: ['prometheus']
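  tls_config:
    ca_file: /etc/prometheus/ca.pem
    # Set server_name so the certificate's CN/SAN is verified against this name
    server_name: vault.example.com
  # Vault requires a token for this endpoint unless unauthenticated metrics
  # access is enabled on the listener; the file path below is an assumption.
  authorization:
    credentials_file: /etc/prometheus/vault-token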
  static_configs:
  - targets: ['vault.example.com:8200']

StatsD Integration: Send Configuration and Receiver Considerations

When you set statsd_address in telemetry, Vault sends internal metrics over UDP. Receivers include Graphite, Datadog, and Telegraf. If you rely on tag extensions like DogStatsD, choose an agent that supports them.

Because UDP assumes some loss, account for network congestion and restart spikes by stabilizing alert thresholds with hysteresis or compound conditions.

Allow 8125/udp through the firewall (unnecessary for local receivers).
Consider sampling for high-frequency metrics (depends on receiver capabilities).
If using DogStatsD, configure Vault's dogstatsd_addr (and optionally dogstatsd_tags) instead of statsd_address, and enable the matching listener on the receiver side.
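
To make the push model concrete, the following Python sketch encodes one metric in the plain StatsD line format and sends it to a throwaway UDP listener on the loopback interface. It does not involve Vault: the function names are hypothetical and the metric name is used only as an example value.

```python
import socket


def format_statsd(name: str, value: float, metric_type: str) -> bytes:
    """Encode one metric in the plain StatsD line format: name:value|type."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_and_receive_once(payload: bytes, timeout: float = 2.0) -> bytes:
    """Send one datagram to a temporary local UDP listener and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        receiver.settimeout(timeout)
        # sendto returns once the datagram is handed to the kernel; there is
        # no acknowledgement and no retransmission if it is dropped later.
        sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(1024)
        return data


if __name__ == "__main__":
    line = format_statsd("vault.core.handle_request", 12, "ms")
    print(send_and_receive_once(line))
```

On loopback the datagram normally arrives, but the sender gets no confirmation either way, which is why drops have to be detected on the receiver side.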

StatsD ingestion (e.g., Graphite via Telegraf)

# Example Telegraf statsd input (telegraf.conf excerpt)
[[inputs.statsd]]
  service_address = ":8125"
  delete_gauges = false
  delete_counters = false
  # Adjust metric_separator / templates as needed

# On the Vault side (vault.hcl shown earlier), set statsd_address to 127.0.0.1:8125

Operational Checks and Key Exam Points

Operationally, monitor metrics for availability, accuracy, and latency. For Prometheus, watch scrape success rate and metric-count drift; for StatsD, watch receive rate and signs of drops (receiver logs/internal metrics). This catches real problems early.

On the exam, the format selection for /v1/sys/metrics, the main telemetry keys, and the design differences between pull and push are reliable scoring opportunities.

Use curl to verify /v1/sys/metrics?format=prometheus (TLS, response code, headers).
Watch Prometheus' scrape_duration_seconds and scrape_samples_scraped.
For StatsD, check receiver metrics and logs for drop or buffer warnings.

Troubleshooting tips

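# 1) Does the endpoint respond in Prometheus format?
curl -v --cacert /etc/ssl/certs/ca.pem -H "X-Vault-Token: $VAULT_TOKEN" \
  "https://vault.example.com:8200/v1/sys/metrics?format=prometheus" | head -20

# 2) Port reachability (StatsD/UDP)
sudo tcpdump -ni any udp port 8125 -vv -c 5

# 3) Limiting series cardinality (re-check the config)
# After applying disable_hostname = true in the telemetry stanza, compare the time series on your dashboards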

Check Your Understanding

Ops

Question 1

Which recommended setting most directly contributes to operational stability when collecting Vault telemetry with Prometheus?

A. Set telemetry.disable_hostname to true to keep hostnames out of metric names and cap series cardinality
B. Set telemetry.prometheus_retention_time to 0 (no retention)
C. Set Prometheus' metrics_path to /v1/sys/health
D. Avoid using StatsD and Prometheus at the same time

Correct answer: A

disable_hostname=true keeps the local hostname out of metric names, which reduces the number of distinct series, lowers TSDB load, and stabilizes dashboards. Setting prometheus_retention_time to 0 disables Prometheus output, so metrics go missing. /v1/sys/health is for health checks, not metrics. StatsD and Prometheus can be used together and combined as requirements dictate.

Frequently Asked Questions

Does Vault's /v1/sys/metrics always return data in Prometheus format?

No. The default output is JSON. The format switches to Prometheus based on the format=prometheus query parameter or the Accept header. From Prometheus, the reliable approach is to set metrics_path and pass format=prometheus via params.

Which nodes should be scraped in an HA configuration?

Generally, include every node as a target and design labels and aggregation on the Prometheus side to avoid duplicate series. The standard practice is to auto-register nodes via service discovery while ensuring TLS and network reachability.

Does enabling StatsD and Prometheus at the same time increase load?

There is some overhead because the export and emission paths both run, but with appropriate sampling, retention window tuning, and reliable networking, running both is practical in most environments. They can coexist depending on your observability requirements.


Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
