dbt

Putting dbt run_results.json to Work: Logs, Metrics, and Exam-Ready Analysis

2026-04-19
NicheeLab Editorial Team

dbt generates a run_results.json on every invocation, recording each node's status, execution time, and adapter response. It is not just a log — it is a reliable data source for quantifying pipeline performance and quality.

This article walks through everything from understanding the structure of run_results.json to ingesting it into a data platform and monitoring SLOs, all grounded in the stable spec from the official documentation. Analytics Engineer exam angles are woven in throughout.

run_results.json: Structure and Role

run_results.json is the artifact that summarizes a dbt run. In dbt Core it is typically written to the target directory, and in dbt Cloud it is available as an artifact for each job run. Each entry represents the result for a single executed node, and the top-level metadata includes the dbt version, generation timestamp, invocation_id, and more.

Representative fields look like this: metadata (dbt_schema_version, dbt_version, generated_at, invocation_id, etc.); results (an array with unique_id, status, execution_time, timing, adapter_response, message, thread_id, etc.); and elapsed_time (the total elapsed time of the run). The content of adapter_response is adapter-dependent, but values such as rows_affected may be included.

  • Location: dbt Core writes to target/run_results.json; dbt Cloud exposes it via the Artifact API for each run
  • Per-node status values: success, error, skipped (tests may also produce fail)
  • invocation_id uniquely identifies a single run
  • execution_time is per-node seconds; elapsed_time is the seconds for the entire run
ArtifactMain contentsUpdate timing
run_results.jsonEach node's result, duration, and adapter responseAfter each command run (run, test, seed, etc.)
manifest.jsonProject dependencies, node definitions, resolved macrosAt compile / run time
catalog.jsonObject column profile (descriptions, types, etc.)When the catalog command is run

Artifact generation flow (overview)

dbt CLI/Cloud
   |
   v
[Compile & Execute]
   |
   +--> target/manifest.json
   |
   +--> target/run_results.json  <-- run results
   |
   +--> target/logs/ ...

Minimal example of run_results.json (excerpt)

{
  "metadata": {
    "dbt_version": "1.x.x",
    "generated_at": "2024-04-01T12:34:56Z",
    "invocation_id": "e1c2..."
  },
  "results": [
    {
      "unique_id": "model.my_project.dim_customers",
      "status": "success",
      "execution_time": 12.34,
      "adapter_response": {"rows_affected": 1234},
      "timing": [{"name": "execute", "started_at": "...", "completed_at": "..."}],
      "message": null
    }
  ],
  "elapsed_time": 45.67
}

Reading the Execution Metrics: duration, status, rows_affected

results[].execution_time is the pure execution seconds for a node and is highly effective for identifying bottlenecks. status takes success, error, skipped (and fail for tests), letting you tell apart retries and selector-driven effects. adapter_response is adapter-dependent — Snowflake and Databricks, for example, may return rows_affected and execution codes.

In practice you group records by invocation_id to treat one run as a unit, then visualize per-node P95 execution time, success rate, and row-count change (rows_affected). Use that to surface nodes that are consistently slow or whose success rate is unstable, and prioritize your improvements accordingly.

  • Latency detection: continuously monitor P95/P99 of execution_time
  • Success rate: compute the share of success statuses per invocation_id
  • Volume signal: use rows_affected as an approximate volume metric (mind the adapter dependency)
  • Skip semantics: skipped reflects selector conditions and cache effects
StatusMeaningTreatment in monitoring
successNode finished normallyCounted in both numerator and denominator of success rate
errorFailed with a runtime errorCritical incident; alert required
skippedNot run due to selector or dependenciesTypically excluded from the denominator (adjust by policy)

Execution timeline of a single node (conceptual)

[compile]----[execute================]----[post]
            ^ start                 ^ end
execution_time = end - start

Python: extract the top 5 slowest nodes (parsing run_results.json directly)

import json, pathlib
p = pathlib.Path('target/run_results.json')
with p.open() as f:
    data = json.load(f)
rows = []
for r in data.get('results', []):
    rows.append({
        'unique_id': r.get('unique_id'),
        'status': r.get('status'),
        'sec': float(r.get('execution_time', 0.0)),
        'rows_affected': (r.get('adapter_response') or {}).get('rows_affected')
    })
slow = sorted([x for x in rows if x['status']=='success'], key=lambda x: x['sec'], reverse=True)[:5]
for i, x in enumerate(slow, 1):
    print(f"{i}. {x['unique_id']}  {x['sec']:.2f}s  rows_affected={x['rows_affected']}")

Quality Monitoring: Handling and Visualizing Test Results

dbt test results are also recorded in run_results.json. Test unique_ids are generally in the form test.<package>...., and the status uses success / fail / error. fail indicates an assertion violation, while error indicates a runtime failure.

Counting failures precisely depends on the test type and the adapter implementation. run_results.json is sufficient for pass/fail per test, but details such as violation row counts are best confirmed by combining the test SQL and logs.

  • Test identification: extract entries whose unique_id contains test.
  • Failure type: separate fail (assertion violation) from error (execution failure)
  • Stable operations: isolate tests with unstable external dependencies (time functions, non-determinism) into separate monitoring
Test typeExecution unitIdentification in run_results
Generic testsAssertions expanded at the model/column levelunique_id in the form test.<pkg>.<name>.<hash>
Singular testsAssertions written as arbitrary SQLunique_id in the form test.<pkg>.<file_name>...

Relationship between a model and its tests (conceptual)

model.my_project.dim_customers
  |
  +-- test.my_project.not_null_dim_customers_id (success)
  +-- test.my_project.unique_dim_customers_id    (fail)

Python: compute the test pass rate

import json
with open('target/run_results.json') as f:
    data = json.load(f)
results = [r for r in data.get('results', []) if 'test.' in (r.get('unique_id') or '')]
passed = sum(1 for r in results if r.get('status')=='success')
failed = sum(1 for r in results if r.get('status') in ('fail','error'))
rate = (passed / (passed + failed)) * 100 if (passed+failed)>0 else None
print({'tests': len(results), 'passed': passed, 'failed': failed, 'pass_rate_pct': rate})

Storage and Query Design: Ingest, Normalize, and Analyze in the Warehouse

Because run_results.json is one file per invocation, store it in object storage and then ingest it into the warehouse keyed by invocation_id and generated_at. A solid approach is to keep the full JSON in a single column at the RAW layer first, and then FLATTEN/EXPLODE the results array to build wide analysis tables.

For the primary key, the combination of invocation_id and unique_id is convenient in practice and protects against double-loading. Because row counts and types are adapter-dependent, design numeric columns to be nullable when you cast them.

  • Ingestion unit: 1 run = 1 JSON (identified by invocation_id)
  • Primary key: (invocation_id, unique_id)
  • Normalization: expand results and break out status, execution_time, rows_affected as columns
  • Partitioning: partition by generated_at (timestamp)
LayerPurposeMain columns / format
RAW (JSON)Preserve original data; reprocessableartifact VARIANT/STRING, load_ts
STG (expanded)Row-expand the results arrayinvocation_id, unique_id, status, execution_time, adapter_response
DM (analysis)Dashboards and SLO metricsPer-node P95, success rate, rows_affected, etc.

Ingestion pipeline (Snowflake example)

[Object Storage]
      |
  COPY INTO (RAW VARIANT)
      |
   FLATTEN(results)
      |
  ANALYTICS TABLES (SLO / visualization)

Snowflake: JSON expansion and normalization (conceptual example)

create or replace table raw_run_results(artifact variant);
-- Ingest with COPY INTO from a stage into artifact (omitted)

-- Expand (one row per entry in the results array)
create or replace view stg_run_results as
select
  artifact:metadata:invocation_id::string as invocation_id,
  artifact:metadata:generated_at::timestamp_tz as generated_at,
  r.value:unique_id::string as unique_id,
  r.value:status::string as status,
  r.value:execution_time::float as execution_time,
  r.value:adapter_response as adapter_response
from raw_run_results,
     lateral flatten(input => artifact:results) r;

-- Analytics table (example: extract rows_affected, nullable)
create or replace view dm_run_results as
select
  invocation_id,
  unique_id,
  status,
  execution_time,
  try_to_number(adapter_response:rows_affected) as rows_affected,
  generated_at
from stg_run_results;

Observability and SLO Design: Success Rate, P95 Execution Time, Regression Detection

The foundation of your SLOs is the per-invocation_id success rate and the per-node P95 execution time. Because run_results.json reliably tells you how long each node took, it lets you quantify the health of scheduled runs. For workloads with seasonality, managing P95 separately by day of week and time of day helps you avoid false positives.

For regression detection, monitor moving medians and P95 over the last N runs, alerting when thresholds are exceeded. Display tests with frequent failures on a separate dashboard from the model success rate so you can cleanly distinguish root causes — SQL errors vs. data-quality failures.

  • Candidate SLOs: run success rate >= 99.x%; P95 execution time of key models <= target
  • Alerting: alert immediately on error; alert on P95 breaches sustained for n consecutive runs
  • Handling seasonality: maintain separate baselines by day of week and time of day
MetricExample definitionCaveats
Success ratesuccess / (success + error)skipped is usually excluded from the denominator
P95 execution time95th percentile of execution_timeMeasure DB execution time, not scheduler latency
Row count changeTrend of rows_affectedAdapter-dependent; tolerate NULLs

SLO monitoring flow (conceptual)

run_results.json -> RAW -> STG -> DM
        |                 |
        v                 v
   Success rate        P95 execution time
        \______________/
                |
             Alerting

Databricks (PySpark): compute P95 execution time

from pyspark.sql import functions as F
from pyspark.sql import types as T

# Load JSON (adjust file placement to your environment)
df_raw = spark.read.json('/mnt/artifacts/run_results/*.json')

# Explode results
results = df_raw.select(
    F.col('metadata.invocation_id').alias('invocation_id'),
    F.explode_outer('results').alias('r')
)

fact = results.select(
    'invocation_id',
    F.col('r.unique_id').alias('unique_id'),
    F.col('r.status').alias('status'),
    F.col('r.execution_time').cast('double').alias('execution_time')
)

p95 = (fact
  .where(F.col('status')=='success')
  .groupBy('unique_id')
  .agg(F.expr('percentile_approx(execution_time, 0.95)').alias('p95_sec')))

p95.orderBy(F.desc('p95_sec')).show(20, truncate=False)

Exam Hot Spots and Pitfalls

The Analytics Engineer exam tends to test your understanding of how the different artifacts differ in role and which metrics to use for performance diagnosis. To identify slow models the correct answer is to use execution_time from run_results.json — manifest.json is primarily for understanding structure and dependencies.

A common pitfall: ephemeral models are not executed standalone — they are inlined into parent models, so they do not appear as independent results in run_results.json. seed entries only appear when dbt seed is run. And remember that rows_affected is adapter-dependent and is not always populated.

  • Use execution_time (run_results.json) to identify slow models
  • Use manifest.json to visualize dependencies or evaluate selectors
  • ephemeral does not appear as an independent result
  • seed results are stored in run_results.json when dbt seed is run
Common misconceptionCorrect understandingHow to address it
You can see execution time in manifest.jsonExecution time is recorded in run_results.jsonReference run_results.json for latency analysis
ephemeral also produces a result row every timeIt is inlined into the parent and has no independent resultWatch for source SQL bloat and monitor the parent model instead
rows_affected is always availableIt is adapter-dependent and can be NULLDesign schemas and visuals to be NULL-safe

Node types and how they appear in run_results

model (materialized)  -> appears
seed                  -> appears when seed is run
snapshot              -> appears
test                  -> appears (success/fail/error)
ephemeral             -> does not appear independently

Bash: save and ship run_results.json after a run (conceptual)

set -euo pipefail
run_ts=$(date -u +%Y%m%dT%H%M%SZ)
dbt run --select state:modified || true
# Save target/run_results.json
mkdir -p artifacts/$run_ts
cp target/run_results.json artifacts/$run_ts/
# Example: upload to AWS S3 (adjust to your environment)
# aws s3 cp artifacts/$run_ts/run_results.json s3://my-bucket/dbt/run_results/$run_ts.json

Check Your Knowledge

Analytics Engineer

問題 1

You want to identify the model that took the longest during a dbt run. Which data source and field combination is most appropriate?

  1. Compare execution_time in results[] of run_results.json per unique_id
  2. Compare the length of nodes[].compiled_sql in manifest.json
  3. Grep the log files and extract the longest line
  4. Pick the model with the most columns in catalog.json

正解: A

Execution time is recorded as execution_time in each result entry of run_results.json. manifest.json and catalog.json describe structure and metadata, so they are not appropriate for comparing execution time.

Frequently Asked Questions

Where can I find run_results.json?

In dbt Core it is generated at target/run_results.json under your project. In dbt Cloud you can download it from the Artifacts of each job run.

Are ephemeral models recorded in run_results.json?

Ephemeral models are not executed independently; they are inlined into their parent models, so they do not appear as independent result entries.

Is rows_affected always available?

No. The content of adapter_response is adapter-dependent. Many adapters populate rows_affected, but it can be NULL, so handle it as nullable when you analyze the data.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.