dbt generates a run_results.json on every invocation, recording each node's status, execution time, and adapter response. It is not just a log — it is a reliable data source for quantifying pipeline performance and quality.
This article walks through everything from understanding the structure of run_results.json to ingesting it into a data platform and monitoring SLOs, all grounded in the stable spec from the official documentation. Analytics Engineer exam angles are woven in throughout.
run_results.json is the artifact that summarizes a dbt run. In dbt Core it is typically written to the target directory, and in dbt Cloud it is available as an artifact for each job run. Each entry represents the result for a single executed node, and the top-level metadata includes the dbt version, generation timestamp, invocation_id, and more.
Representative fields look like this: metadata (dbt_schema_version, dbt_version, generated_at, invocation_id, etc.); results (an array with unique_id, status, execution_time, timing, adapter_response, message, thread_id, etc.); and elapsed_time (the total elapsed time of the run). The content of adapter_response is adapter-dependent, but values such as rows_affected may be included.
| Artifact | Main contents | Update timing |
|---|---|---|
| run_results.json | Each node's result, duration, and adapter response | After each command run (run, test, seed, etc.) |
| manifest.json | Project dependencies, node definitions, resolved macros | At compile / run time |
| catalog.json | Object column profile (descriptions, types, etc.) | When the catalog command is run |
Artifact generation flow (overview)
dbt CLI/Cloud
|
v
[Compile & Execute]
|
+--> target/manifest.json
|
+--> target/run_results.json <-- run results
|
+--> target/logs/ ...Minimal example of run_results.json (excerpt)
{
"metadata": {
"dbt_version": "1.x.x",
"generated_at": "2024-04-01T12:34:56Z",
"invocation_id": "e1c2..."
},
"results": [
{
"unique_id": "model.my_project.dim_customers",
"status": "success",
"execution_time": 12.34,
"adapter_response": {"rows_affected": 1234},
"timing": [{"name": "execute", "started_at": "...", "completed_at": "..."}],
"message": null
}
],
"elapsed_time": 45.67
}results[].execution_time is the pure execution seconds for a node and is highly effective for identifying bottlenecks. status takes success, error, skipped (and fail for tests), letting you tell apart retries and selector-driven effects. adapter_response is adapter-dependent — Snowflake and Databricks, for example, may return rows_affected and execution codes.
In practice you group records by invocation_id to treat one run as a unit, then visualize per-node P95 execution time, success rate, and row-count change (rows_affected). Use that to surface nodes that are consistently slow or whose success rate is unstable, and prioritize your improvements accordingly.
| Status | Meaning | Treatment in monitoring |
|---|---|---|
| success | Node finished normally | Counted in both numerator and denominator of success rate |
| error | Failed with a runtime error | Critical incident; alert required |
| skipped | Not run due to selector or dependencies | Typically excluded from the denominator (adjust by policy) |
Execution timeline of a single node (conceptual)
[compile]----[execute================]----[post]
^ start ^ end
execution_time = end - startPython: extract the top 5 slowest nodes (parsing run_results.json directly)
import json, pathlib
p = pathlib.Path('target/run_results.json')
with p.open() as f:
data = json.load(f)
rows = []
for r in data.get('results', []):
rows.append({
'unique_id': r.get('unique_id'),
'status': r.get('status'),
'sec': float(r.get('execution_time', 0.0)),
'rows_affected': (r.get('adapter_response') or {}).get('rows_affected')
})
slow = sorted([x for x in rows if x['status']=='success'], key=lambda x: x['sec'], reverse=True)[:5]
for i, x in enumerate(slow, 1):
print(f"{i}. {x['unique_id']} {x['sec']:.2f}s rows_affected={x['rows_affected']}")dbt test results are also recorded in run_results.json. Test unique_ids are generally in the form test.<package>...., and the status uses success / fail / error. fail indicates an assertion violation, while error indicates a runtime failure.
Counting failures precisely depends on the test type and the adapter implementation. run_results.json is sufficient for pass/fail per test, but details such as violation row counts are best confirmed by combining the test SQL and logs.
| Test type | Execution unit | Identification in run_results |
|---|---|---|
| Generic tests | Assertions expanded at the model/column level | unique_id in the form test.<pkg>.<name>.<hash> |
| Singular tests | Assertions written as arbitrary SQL | unique_id in the form test.<pkg>.<file_name>... |
Relationship between a model and its tests (conceptual)
model.my_project.dim_customers
|
+-- test.my_project.not_null_dim_customers_id (success)
+-- test.my_project.unique_dim_customers_id (fail)Python: compute the test pass rate
import json
with open('target/run_results.json') as f:
data = json.load(f)
results = [r for r in data.get('results', []) if 'test.' in (r.get('unique_id') or '')]
passed = sum(1 for r in results if r.get('status')=='success')
failed = sum(1 for r in results if r.get('status') in ('fail','error'))
rate = (passed / (passed + failed)) * 100 if (passed+failed)>0 else None
print({'tests': len(results), 'passed': passed, 'failed': failed, 'pass_rate_pct': rate})Because run_results.json is one file per invocation, store it in object storage and then ingest it into the warehouse keyed by invocation_id and generated_at. A solid approach is to keep the full JSON in a single column at the RAW layer first, and then FLATTEN/EXPLODE the results array to build wide analysis tables.
For the primary key, the combination of invocation_id and unique_id is convenient in practice and protects against double-loading. Because row counts and types are adapter-dependent, design numeric columns to be nullable when you cast them.
| Layer | Purpose | Main columns / format |
|---|---|---|
| RAW (JSON) | Preserve original data; reprocessable | artifact VARIANT/STRING, load_ts |
| STG (expanded) | Row-expand the results array | invocation_id, unique_id, status, execution_time, adapter_response |
| DM (analysis) | Dashboards and SLO metrics | Per-node P95, success rate, rows_affected, etc. |
Ingestion pipeline (Snowflake example)
[Object Storage]
|
COPY INTO (RAW VARIANT)
|
FLATTEN(results)
|
ANALYTICS TABLES (SLO / visualization)Snowflake: JSON expansion and normalization (conceptual example)
create or replace table raw_run_results(artifact variant);
-- Ingest with COPY INTO from a stage into artifact (omitted)
-- Expand (one row per entry in the results array)
create or replace view stg_run_results as
select
artifact:metadata:invocation_id::string as invocation_id,
artifact:metadata:generated_at::timestamp_tz as generated_at,
r.value:unique_id::string as unique_id,
r.value:status::string as status,
r.value:execution_time::float as execution_time,
r.value:adapter_response as adapter_response
from raw_run_results,
lateral flatten(input => artifact:results) r;
-- Analytics table (example: extract rows_affected, nullable)
create or replace view dm_run_results as
select
invocation_id,
unique_id,
status,
execution_time,
try_to_number(adapter_response:rows_affected) as rows_affected,
generated_at
from stg_run_results;The foundation of your SLOs is the per-invocation_id success rate and the per-node P95 execution time. Because run_results.json reliably tells you how long each node took, it lets you quantify the health of scheduled runs. For workloads with seasonality, managing P95 separately by day of week and time of day helps you avoid false positives.
For regression detection, monitor moving medians and P95 over the last N runs, alerting when thresholds are exceeded. Display tests with frequent failures on a separate dashboard from the model success rate so you can cleanly distinguish root causes — SQL errors vs. data-quality failures.
| Metric | Example definition | Caveats |
|---|---|---|
| Success rate | success / (success + error) | skipped is usually excluded from the denominator |
| P95 execution time | 95th percentile of execution_time | Measure DB execution time, not scheduler latency |
| Row count change | Trend of rows_affected | Adapter-dependent; tolerate NULLs |
SLO monitoring flow (conceptual)
run_results.json -> RAW -> STG -> DM
| |
v v
Success rate P95 execution time
\______________/
|
AlertingDatabricks (PySpark): compute P95 execution time
from pyspark.sql import functions as F
from pyspark.sql import types as T
# Load JSON (adjust file placement to your environment)
df_raw = spark.read.json('/mnt/artifacts/run_results/*.json')
# Explode results
results = df_raw.select(
F.col('metadata.invocation_id').alias('invocation_id'),
F.explode_outer('results').alias('r')
)
fact = results.select(
'invocation_id',
F.col('r.unique_id').alias('unique_id'),
F.col('r.status').alias('status'),
F.col('r.execution_time').cast('double').alias('execution_time')
)
p95 = (fact
.where(F.col('status')=='success')
.groupBy('unique_id')
.agg(F.expr('percentile_approx(execution_time, 0.95)').alias('p95_sec')))
p95.orderBy(F.desc('p95_sec')).show(20, truncate=False)The Analytics Engineer exam tends to test your understanding of how the different artifacts differ in role and which metrics to use for performance diagnosis. To identify slow models the correct answer is to use execution_time from run_results.json — manifest.json is primarily for understanding structure and dependencies.
A common pitfall: ephemeral models are not executed standalone — they are inlined into parent models, so they do not appear as independent results in run_results.json. seed entries only appear when dbt seed is run. And remember that rows_affected is adapter-dependent and is not always populated.
| Common misconception | Correct understanding | How to address it |
|---|---|---|
| You can see execution time in manifest.json | Execution time is recorded in run_results.json | Reference run_results.json for latency analysis |
| ephemeral also produces a result row every time | It is inlined into the parent and has no independent result | Watch for source SQL bloat and monitor the parent model instead |
| rows_affected is always available | It is adapter-dependent and can be NULL | Design schemas and visuals to be NULL-safe |
Node types and how they appear in run_results
model (materialized) -> appears
seed -> appears when seed is run
snapshot -> appears
test -> appears (success/fail/error)
ephemeral -> does not appear independentlyBash: save and ship run_results.json after a run (conceptual)
set -euo pipefail
run_ts=$(date -u +%Y%m%dT%H%M%SZ)
dbt run --select state:modified || true
# Save target/run_results.json
mkdir -p artifacts/$run_ts
cp target/run_results.json artifacts/$run_ts/
# Example: upload to AWS S3 (adjust to your environment)
# aws s3 cp artifacts/$run_ts/run_results.json s3://my-bucket/dbt/run_results/$run_ts.jsonAnalytics Engineer
問題 1
You want to identify the model that took the longest during a dbt run. Which data source and field combination is most appropriate?
正解: A
Execution time is recorded as execution_time in each result entry of run_results.json. manifest.json and catalog.json describe structure and metadata, so they are not appropriate for comparing execution time.
Where can I find run_results.json?
In dbt Core it is generated at target/run_results.json under your project. In dbt Cloud you can download it from the Artifacts of each job run.
Are ephemeral models recorded in run_results.json?
Ephemeral models are not executed independently; they are inlined into their parent models, so they do not appear as independent result entries.
Is rows_affected always available?
No. The content of adapter_response is adapter-dependent. Many adapters populate rows_affected, but it can be NULL, so handle it as nullable when you analyze the data.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...