This article is grounded in a 'stable' operational design that works equally well for dbt Cloud and dbt Core + GitHub Actions, and turns the decision points from PR to merge to production into concrete guidance.
It covers the keywords most commonly asked on the exam — Slim CI, state artifacts, defer, run/test/build, environment isolation — alongside practical configuration examples so you can lock them in quickly.
Design dbt CI/CD as two distinct concerns: a CI that quickly validates only the scope impacted by a change, and a CD that safely promotes changes to production. The basic pattern is to auto-trigger CI on PR creation and validate only the diff with state:modified and defer. After merge, the production job runs on a schedule or event and executes run/test against the stable resources.
On the database/lakehouse side, enforce isolation by environment and schema. Write CI outputs to a per-developer or per-branch temporary schema, and grant write permission on the production schema (or catalog) to the production environment only.
| Environment | Typical responsibilities | Key dbt commands |
|---|---|---|
| dev (per-person / per-PR) | Diff validation and experimentation | dbt build --select state:modified+ --defer --state path/to/prod_artifacts |
| stg (integration) | Integration checks across multiple PRs and load testing | dbt build --selector stg_selector |
| prod (production) | Producing and publishing finalized data | dbt run && dbt test (or simply dbt build) |
High-level flow from PR to production
Dev (branch)
| open PR
v
[CI Job] --- build(test) on temp schema
| status check green?
v
merge to main
|
v
[CD Job (prod)] -- run/test on prod schema
|
v
Publish docs / exposures
|
v
Consumers (BI/ML)selectors.yml example (prioritize the diff)
selectors:
- name: pr_changed
definition:
union:
- method: state
value: modified
children: true
- method: tag
value: always_ci
When a PR is opened, CI fires and builds only the models changed on that branch and their downstream (+) dependencies. The trick is to use defer to redirect references to the existing stable version (usually the prod artifacts) so unchanged parts are not recomputed.
CI uses the same adapter configuration as production (Snowflake/Databricks, etc.) but routes output to a branch-specific temporary schema. On success it reports a status check back to the Git platform, where reviewers verify the tests passed and review the result summary.
| Approach | Setup difficulty | PR integration / status | Slim CI (state/defer) |
|---|---|---|---|
| dbt Cloud CI | Low (UI-centric) | Standard integration, easy | Built-in |
| GitHub Actions + dbt Core | Medium (YAML configuration) | Can integrate with Checks | Supported (manual configuration) |
| Databricks Jobs + dbt | Medium-high (requires job design) | PR integration requires extra work | Supported (controlled via arguments) |
CI internal flow (conceptual)
N/ASlim CI implementation example on GitHub Actions
name: dbt CI (PR)
on:
pull_request:
branches: [ main ]
jobs:
slim-ci:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install dbt-core + adapter
run: |
pip install dbt-core dbt-snowflake # or dbt-databricks
- name: Restore prod artifacts (state)
uses: actions/download-artifact@v4
with:
name: prod_artifacts
path: ./.state
- name: dbt deps
run: dbt deps
- name: dbt build (Slim CI)
env:
DBT_USER: ${{ secrets.SNOWFLAKE_USER }}
DBT_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
DBT_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
DBT_SCHEMA: pr_${{ github.event.number }}
run: |
dbt build --select state:modified+ \
--defer --state ./.state \
--target ci
- name: Upload CI artifacts
uses: actions/upload-artifact@v4
with:
name: ci_artifacts
path: target/*.jsonOnce a PR merges to main, the production job kicks off. CI used defer to reference the existing prod artifacts, but the production job writes to the real prod connection and schema. Combining schedule-driven and event-driven triggers is realistic in practice (event-driven for high-frequency models, schedule-driven for heavy aggregations).
Whether to cleanly separate run and test in production or fold them into a single build is a team-level decision. What matters is to avoid leaving partially successful states behind on failure, and to reliably persist artifacts so the next run can use them as state.
| Step | Purpose | Example |
|---|---|---|
| seed | Load reference master data | Idempotently load a small CSV |
| snapshot | History management (SCD2) | Upsert using a change-detection key |
| build(run+test) | Generate the artifacts and validate | Fail fast on critical test failures |
| docs | Metadata refresh | dbt docs generate / apply exposures |
A simple CD execution order
N/AProduction job command example (with separated stages)
dbt deps
# 必要に応じて seed / snapshot
# dbt seed --full-refresh
# dbt snapshot
# 変更中心の実行(十分に安定運用ならフル実行も可)
dbt run --select state:modified+
dbt test --select state:modified+
# アーティファクト保存(ワークフロー側で target/*.json を保管)CI safety hinges on schema (or catalog) isolation. Create a temporary schema per PR and clean it up when CI completes. On Snowflake, split ROLE/WAREHOUSE/SCHEMA; on Databricks Unity Catalog, cleanly separate catalog.schema.
Encode the branch or PR number into the schema name to avoid collisions and invalid characters. Overriding dbt's generate_schema_name macro lets you make safe naming a team-wide rule.
| Platform | Isolation unit | Caveats |
|---|---|---|
| Snowflake | Database / Schema / Role / Warehouse | Restrict DROP/CREATE on the prod role; give CI a dedicated warehouse |
| Databricks (Unity Catalog) | Catalog / Schema / Permissions | Make the prod catalog read-only; CI writes only to the dev catalog |
| Others (Postgres, BigQuery, etc.) | Schema / Dataset | Automate cost/quota controls and cleanup |
Isolation diagram (conceptual)
N/AExample schema-naming macro (sanitizes the branch name)
{% macro generate_schema_name(custom_schema_name, node) -%}
{%- set raw = env_var('GIT_BRANCH', 'pr') | lower -%}
{%- set safe = re.sub('[^a-z0-9_]', '_', raw) -%}
{%- set prefix = 'ci_' ~ safe[0:20] -%}
{{ return(prefix) }}
{%- endmacro %}
A quality gate is the mechanism that blocks a PR on critical dbt test failures. General tests like unique/not_null/accepted_values work well in CI; reserve heavy checks (large row counts, statistical comparisons) for staging to strike a good balance.
Always persist artifacts (manifest.json / run_results.json) and reference them as the state for the next CI run. This is what makes Slim CI possible and lets you skip recomputing unchanged areas. Update documentation (catalog.json) and exposures at the same time to keep visibility intact for downstream consumers.
| Test category | When to run | On failure |
|---|---|---|
| Generic (unique / not_null) | CI (PR) | Block the PR |
| Referential / functional | CI or stg | Block on critical issues; warning is acceptable for minor ones |
| Regression / statistical comparison | stg/prod | Operational call, subject to review |
Artifact flow (conceptual)
N/Aschema.yml test example (excerpt)
version: 2
models:
- name: fct_orders
tests:
- unique:
column_name: order_id
- not_null:
column_name: order_id
- relationships:
to: ref('dim_customer')
field: customer_id
When production fails, first revert the most recent commit and re-run. If you are worried about half-baked state from a failed table update, a blue/green pattern (build into a next schema, then swap views) is an effective way to swap the visible layer safely.
Separate permissions and destructive changes, and confine drops and full-refresh to a maintenance window. Run snapshots in production only, and normally skip them in CI.
| Scenario | Detection / signal | Response |
|---|---|---|
| Critical test failure | CI status red | Push a fix commit or close the PR |
| Partial failure in the production job | Failed nodes shown in run_results | Retry, or revert then re-run |
| Schema compatibility broken | contracts / downstream errors | Buy time with a view swap, then ship a backward-compatible fix |
Blue/green swap (conceptual)
N/ASimple view-swap script example (SQL)
-- next スキーマで build 済みを想定
-- 可視スキーマのビューを next へ切替
create or replace view prod.visible.fct_orders as
select * from prod_next.fct_orders;
Analytics Engineer
問題 1
You are designing a PR-based Slim CI. You want unchanged upstream models to reference the stable production versions while validating only the changed models and their dependencies. Which dbt command is most appropriate?
正解: A
The canonical Slim CI pattern targets the diff and its downstream with state:modified+ and uses defer to delegate references to unchanged dependencies to the existing (typically prod) artifacts. --state specifies the location of the manifest.json (and friends) to reference.
Where should I store the state artifacts used by Slim CI?
Save the manifest.json and run_results.json produced under the production job's target directory to durable storage when the job finishes, then download them at CI time and reference them with --state. dbt Cloud makes it easy to reference artifacts across environments; with GitHub Actions and similar tools, use upload/download-artifact or external storage such as S3, GCS, or DBFS.
How should I design permissions so CI runs safely on Snowflake or Databricks?
Provision a dedicated CI role (or service principal) and limit write targets to development/CI schemas (or catalogs). Disallow CREATE/DROP on production schemas and grant only SELECT. Isolate warehouses and clusters for CI as well to keep cost and blast radius under control.
Should CI run snapshots or heavy seeds?
Generally, no. CI exists to give immediate feedback on the validity and compatibility of changes; snapshots and large seeds belong in production or staging batch jobs. If you must run them in CI, swap in a lightweight sample or use CI-specific tags to exclude or scope them.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...