dbt CI/CD Overview: PR to Production (2026)

This article is grounded in a 'stable' operational design that works equally well for dbt Cloud and dbt Core + GitHub Actions, and turns the decision points from PR to merge to production into concrete guidance.

It covers the keywords most commonly asked on the exam — Slim CI, state artifacts, defer, run/test/build, environment isolation — alongside practical configuration examples so you can lock them in quickly.

CI/CD Foundations and the Big Picture

Design dbt CI/CD as two distinct concerns: a CI that quickly validates only the scope impacted by a change, and a CD that safely promotes changes to production. The basic pattern is to auto-trigger CI on PR creation and validate only the diff with state:modified and defer. After merge, the production job runs on a schedule or event and executes run/test against the stable resources.

On the database/lakehouse side, enforce isolation by environment and schema. Write CI outputs to a per-developer or per-branch temporary schema, and grant write permission on the production schema (or catalog) to the production environment only.

Three-tier environments: development (dev) / staging (stg) / production (prod)
Slim CI: use state:modified+ and defer to run only the diff and its dependencies
Separate jobs: PR CI job, staging integration job, production job
Artifact management: persist manifest.json and run_results.json and reuse them as the CI state
Permissions and isolation: separate roles/schemas/catalogs per environment (works for both Snowflake and Databricks)

Environment	Typical responsibilities	Key dbt commands
dev (per-person / per-PR)	Diff validation and experimentation	dbt build --select state:modified+ --defer --state path/to/prod_artifacts
stg (integration)	Integration checks across multiple PRs and load testing	dbt build --selector stg_selector
prod (production)	Producing and publishing finalized data	dbt run && dbt test (or simply dbt build)

High-level flow from PR to production

Dev (branch)
   |  open PR
   v
[CI Job] --- build(test) on temp schema
   |  status check green?
   v
merge to main
   |
   v
[CD Job (prod)] -- run/test on prod schema
   |
   v
Publish docs / exposures
   |
   v
Consumers (BI/ML)

selectors.yml example (prioritize the diff)

selectors:
  - name: pr_changed
    definition:
      union:
        - method: state
          value: modified
          children: true
        - method: tag
          value: always_ci

From PR Open to CI Run (The Slim CI Pattern)

When a PR is opened, CI fires and builds only the models changed on that branch and their downstream (+) dependencies. The trick is to use defer to redirect references to the existing stable version (usually the prod artifacts) so unchanged parts are not recomputed.

CI uses the same adapter configuration as production (Snowflake/Databricks, etc.) but routes output to a branch-specific temporary schema. On success it reports a status check back to the Git platform, where reviewers verify the tests passed and review the result summary.

state:modified+ runs the minimum set of changes and dependencies
defer delegates unchanged upstream references to the production artifacts
Per-PR temporary schema (encode the branch name or short hash in the naming rule)
Reflect critical dbt test failures in the status check (Quality Gate)

Approach	Setup difficulty	PR integration / status	Slim CI (state/defer)
dbt Cloud CI	Low (UI-centric)	Standard integration, easy	Built-in
GitHub Actions + dbt Core	Medium (YAML configuration)	Can integrate with Checks	Supported (manual configuration)
Databricks Jobs + dbt	Medium-high (requires job design)	PR integration requires extra work	Supported (controlled via arguments)

CI internal flow (conceptual)

N/A

Slim CI implementation example on GitHub Actions

name: dbt CI (PR)
on:
  pull_request:
    branches: [ main ]
jobs:
  slim-ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dbt-core + adapter
        run: |
          pip install dbt-core dbt-snowflake  # or dbt-databricks
      - name: Restore prod artifacts (state)
        uses: actions/download-artifact@v4
        with:
          name: prod_artifacts
          path: ./.state
      - name: dbt deps
        run: dbt deps
      - name: dbt build (Slim CI)
        env:
          DBT_USER: ${{ secrets.SNOWFLAKE_USER }}
          DBT_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
          DBT_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          DBT_SCHEMA: pr_${{ github.event.number }}
        run: |
          dbt build --select state:modified+ \
                    --defer --state ./.state \
                    --target ci
      - name: Upload CI artifacts
        uses: actions/upload-artifact@v4
        with:
          name: ci_artifacts
          path: target/*.json

Post-Merge CD and Production Promotion

Once a PR merges to main, the production job kicks off. CI used defer to reference the existing prod artifacts, but the production job writes to the real prod connection and schema. Combining schedule-driven and event-driven triggers is realistic in practice (event-driven for high-frequency models, schedule-driven for heavy aggregations).

Whether to cleanly separate run and test in production or fold them into a single build is a team-level decision. What matters is to avoid leaving partially successful states behind on failure, and to reliably persist artifacts so the next run can use them as state.

It is also valid to scope production runs to 'changes plus their dependencies' rather than full refresh
Automate docs generation and exposures updates at the end of the job
Persist artifacts (manifest, run_results) long-term and reuse them as state next time
Apply grants reliably via a dedicated macro or as a post-step

Step	Purpose	Example
seed	Load reference master data	Idempotently load a small CSV
snapshot	History management (SCD2)	Upsert using a change-detection key
build(run+test)	Generate the artifacts and validate	Fail fast on critical test failures
docs	Metadata refresh	dbt docs generate / apply exposures

A simple CD execution order

N/A

Production job command example (with separated stages)

dbt deps
# 必要に応じて seed / snapshot
# dbt seed --full-refresh
# dbt snapshot
# 変更中心の実行（十分に安定運用ならフル実行も可）
dbt run --select state:modified+
dbt test --select state:modified+
# アーティファクト保存（ワークフロー側で target/*.json を保管）

Database-Side Isolation Strategy (Schema / Catalog Design)

CI safety hinges on schema (or catalog) isolation. Create a temporary schema per PR and clean it up when CI completes. On Snowflake, split ROLE/WAREHOUSE/SCHEMA; on Databricks Unity Catalog, cleanly separate catalog.schema.

Encode the branch or PR number into the schema name to avoid collisions and invalid characters. Overriding dbt's generate_schema_name macro lets you make safe naming a team-wide rule.

Snowflake: grant a least-privilege ROLE only the rights to create and write to a temp schema
Databricks: isolate a development catalog and disallow writes to the prod catalog
Cleanup: drop old PR schemas by TTL via a scheduled job
Naming: keep it concise, e.g. ci_pr123_userx or ci_b_<short_sha>

Platform	Isolation unit	Caveats
Snowflake	Database / Schema / Role / Warehouse	Restrict DROP/CREATE on the prod role; give CI a dedicated warehouse
Databricks (Unity Catalog)	Catalog / Schema / Permissions	Make the prod catalog read-only; CI writes only to the dev catalog
Others (Postgres, BigQuery, etc.)	Schema / Dataset	Automate cost/quota controls and cleanup

Isolation diagram (conceptual)

N/A

Example schema-naming macro (sanitizes the branch name)

{% macro generate_schema_name(custom_schema_name, node) -%}
  {%- set raw = env_var('GIT_BRANCH', 'pr') | lower -%}
  {%- set safe = re.sub('[^a-z0-9_]', '_', raw) -%}
  {%- set prefix = 'ci_' ~ safe[0:20] -%}
  {{ return(prefix) }}
{%- endmacro %}

Tests, Quality Gates, and Artifact Management

A quality gate is the mechanism that blocks a PR on critical dbt test failures. General tests like unique/not_null/accepted_values work well in CI; reserve heavy checks (large row counts, statistical comparisons) for staging to strike a good balance.

Always persist artifacts (manifest.json / run_results.json) and reference them as the state for the next CI run. This is what makes Slim CI possible and lets you skip recomputing unchanged areas. Update documentation (catalog.json) and exposures at the same time to keep visibility intact for downstream consumers.

Run lightweight tests in CI; route heavy / regression tests to stg or prod
Store run_results.json alongside the status summary
Use manifest.json as the --state for CI (long-term retention recommended)
Make CI success mandatory via branch protection (Required status checks)

Test category	When to run	On failure
Generic (unique / not_null)	CI (PR)	Block the PR
Referential / functional	CI or stg	Block on critical issues; warning is acceptable for minor ones
Regression / statistical comparison	stg/prod	Operational call, subject to review

Artifact flow (conceptual)

N/A

schema.yml test example (excerpt)

version: 2
models:
  - name: fct_orders
    tests:
      - unique:
          column_name: order_id
      - not_null:
          column_name: order_id
      - relationships:
          to: ref('dim_customer')
          field: customer_id

Rollback on Failure and Operational Essentials

When production fails, first revert the most recent commit and re-run. If you are worried about half-baked state from a failed table update, a blue/green pattern (build into a next schema, then swap views) is an effective way to swap the visible layer safely.

Separate permissions and destructive changes, and confine drops and full-refresh to a maintenance window. Run snapshots in production only, and normally skip them in CI.

Rollback basics: git revert plus re-running the production job
Swap the visible layer: use create or replace view to switch current to next instantly
Apply grants reliably via a dedicated macro (decoupled from data regeneration)
Apply destructive changes deliberately (use contracts to guarantee forward compatibility)

Scenario	Detection / signal	Response
Critical test failure	CI status red	Push a fix commit or close the PR
Partial failure in the production job	Failed nodes shown in run_results	Retry, or revert then re-run
Schema compatibility broken	contracts / downstream errors	Buy time with a view swap, then ship a backward-compatible fix

Blue/green swap (conceptual)

N/A

Simple view-swap script example (SQL)

-- next スキーマで build 済みを想定
-- 可視スキーマのビューを next へ切替
create or replace view prod.visible.fct_orders as
select * from prod_next.fct_orders;

Check Your Understanding

Analytics Engineer

問題 1

You are designing a PR-based Slim CI. You want unchanged upstream models to reference the stable production versions while validating only the changed models and their dependencies. Which dbt command is most appropriate?

dbt build --select state:modified+ --defer --state path/to/prod_artifacts
dbt run --full-refresh
dbt test --select tag:ci_only --no-defer
dbt build --select +state:modified --no-write-json

正解: A

The canonical Slim CI pattern targets the diff and its downstream with state:modified+ and uses defer to delegate references to unchanged dependencies to the existing (typically prod) artifacts. --state specifies the location of the manifest.json (and friends) to reference.

Frequently Asked Questions

Where should I store the state artifacts used by Slim CI?

Save the manifest.json and run_results.json produced under the production job's target directory to durable storage when the job finishes, then download them at CI time and reference them with --state. dbt Cloud makes it easy to reference artifacts across environments; with GitHub Actions and similar tools, use upload/download-artifact or external storage such as S3, GCS, or DBFS.

How should I design permissions so CI runs safely on Snowflake or Databricks?

Provision a dedicated CI role (or service principal) and limit write targets to development/CI schemas (or catalogs). Disallow CREATE/DROP on production schemas and grant only SELECT. Isolate warehouses and clusters for CI as well to keep cost and blast radius under control.

Should CI run snapshots or heavy seeds?

Generally, no. CI exists to give immediate feedback on the validity and compatibility of changes; snapshots and large seeds belong in production or staging batch jobs. If you must run them in CI, swap in a lightweight sample or use CI-specific tags to exclude or scope them.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

dbt CI/CD End-to-End: From Pull Request to Production