This is a practical guide to implementing dbt Slim CI with state:modified+, so you can quickly validate only the models that changed in a pull request (PR) plus their downstream dependencies.
We cover state selection, --defer, and artifact handling — all common Analytics Engineer exam topics — along with the real-world pitfalls you will hit.
Slim CI is a technique that compares against the last stable state (the production artifacts) and runs only the changed targets in CI, dramatically speeding up review. In dbt you use the state:modified selector, and adding + includes the downstream dependencies so impact-scope validation stays focused and fast.
The core of Slim CI is an artifact that represents the previous state — typically the production manifest.json. CI reads it via the --state option and diffs it against the local changes. At run time you combine this with --defer so that references to unselected nodes are delegated to production. That way you skip rebuilding unchanged upstreams in CI while still letting tests run correctly.
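As a minimal sketch of how the pieces fit together, the helper below assembles the dbt invocation as an argument list. The function name, the `./state` directory, and the `ci` target name are illustrative assumptions; only the dbt flags themselves come from the text above.

```python
import shlex


def slim_ci_command(state_dir: str = "./state", target: str = "ci") -> list[str]:
    """Assemble a Slim CI invocation as an argv list (hypothetical helper)."""
    return [
        "dbt", "build",
        "--select", "state:modified+",  # changed nodes plus their descendants
        "--defer",                      # unselected refs resolve via the state manifest
        "--state", state_dir,           # directory holding the production manifest.json
        "--target", target,             # assumed name of an isolated CI target
    ]


if __name__ == "__main__":
    # Print the command in a copy-pasteable form.
    print(shlex.join(slim_ci_command()))
```

Building the command as a list rather than a string keeps it safe to pass to a process runner without shell quoting issues.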
| Execution mode | Scope | Main benefits | Main risks / caveats |
|---|---|---|---|
| Full run (dbt build) | All resources | Fewest regression misses | More time and cost; impractical to run per PR |
| Impact-scoped run (A+) | Node A and its descendants | Guarantees downstream validation | Manually deciding what changed leads to misses or over-selection |
| Slim CI (state:modified+ --defer) | Only the diff plus its downstreams | Fastest PR validation; runs the diff accurately while referencing production | Requires reliable production artifacts; misconfiguration can drop selections |
Slim CI data flow (conceptual)

PR branch                    CI runner                          Production (last good state)
    |  change (SQL/model)        |                                  |
    |--------------------------->|  1. Fetch prod manifest.json     |
    |                            |<---------------------------------|
    |                            |  2. dbt build -s state:modified+ --defer --state=./state
    |                            |     ├─ Select changed nodes
    |                            |     ├─ Run/test downstream too
    |                            |     └─ Delegate unselected refs to prod
    |                            |                                  |
    |<---------------------------|  3. Result (pass/fail + artifacts)

state:modified compares the past artifact pointed to by --state (typically manifest.json) against the current project definition and selects nodes that differ. Differences include changes to SQL or schema files, additions, and path or config changes. The trailing + tells dbt to include the descendants (downstreams) of the selected nodes.
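To make the comparison concrete, here is a deliberately simplified sketch of the idea behind state:modified. It treats a node as changed when it is new or when its file checksum differs between two manifests. This is an assumption-laden model, not dbt's actual implementation, which also weighs configs, relation names, macros, and contracts. The node IDs and checksums are made up.

```python
def modified_nodes(prod_manifest: dict, current_manifest: dict) -> set[str]:
    """Return unique_ids that are new or whose checksum differs (simplified)."""
    prod_nodes = prod_manifest.get("nodes", {})
    changed: set[str] = set()
    for unique_id, node in current_manifest.get("nodes", {}).items():
        previous = prod_nodes.get(unique_id)
        if previous is None:
            changed.add(unique_id)  # node did not exist in the previous state
        elif previous["checksum"]["checksum"] != node["checksum"]["checksum"]:
            changed.add(unique_id)  # file contents changed
    return changed


if __name__ == "__main__":
    prod = {"nodes": {
        "model.shop.orders": {"checksum": {"checksum": "aaa"}},
        "model.shop.customers": {"checksum": {"checksum": "bbb"}},
    }}
    current = {"nodes": {
        "model.shop.orders": {"checksum": {"checksum": "zzz"}},     # edited
        "model.shop.customers": {"checksum": {"checksum": "bbb"}},  # untouched
        "model.shop.refunds": {"checksum": {"checksum": "ccc"}},    # new
    }}
    print(sorted(modified_nodes(prod, current)))
```

If both arguments are the same manifest, the result is always empty, which is why the artifact passed to --state must come from production rather than from the PR branch.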
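Slim CI pairs this with --defer. With --defer, dbt resolves ref() calls to unselected nodes using the relations recorded in the --state manifest (when those nodes have not been built in the CI target), so unchanged upstreams are not rebuilt in CI. source() calls are unaffected; they always resolve to the locations declared in the project. Because upstream references point at production relations, tests on the selected nodes run against production data.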
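The resolution rule can be pictured with a small conceptual model. The sketch below is not dbt code: the function, the `ci_schema` naming scheme, and the sample relation names are hypothetical, and it assumes the state manifest exposes a `relation_name` per node.

```python
def resolve_ref(
    unique_id: str,
    selected: set[str],
    built_in_ci: set[str],
    ci_schema: str,
    prod_manifest: dict,
) -> str:
    """Decide which relation a ref() should point at under --defer (conceptual)."""
    prod_node = prod_manifest["nodes"].get(unique_id)
    # Selected nodes, nodes already present in CI, and nodes unknown to
    # production all resolve to the CI target.
    if unique_id in selected or unique_id in built_in_ci or prod_node is None:
        model_name = unique_id.split(".")[-1]
        return f"{ci_schema}.{model_name}"  # hypothetical CI naming scheme
    # Everything else is deferred to the relation recorded in production.
    return prod_node["relation_name"]


if __name__ == "__main__":
    prod = {"nodes": {
        "model.shop.stg_orders": {"relation_name": "analytics.prod.stg_orders"},
        "model.shop.orders": {"relation_name": "analytics.prod.orders"},
    }}
    selected = {"model.shop.orders"}
    print(resolve_ref("model.shop.stg_orders", selected, set(), "ci_pr_123", prod))
    print(resolve_ref("model.shop.orders", selected, set(), "ci_pr_123", prod))
```

In this model the unchanged staging model is read from production while the changed model is built and read in the CI schema.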
| Selector | Meaning | Primary use |
|---|---|---|
| state:modified | Nodes that changed since the previous state | Minimal PR-diff runs |
| state:modified+ | Changed nodes plus their downstreams | Recommended diff run that covers the blast radius |
| state:new | Nodes that did not exist previously | Validating only new additions or post-release checks |
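The effect of the trailing + can be sketched as a graph walk. The example assumes a `child_map` shaped like the one in manifest.json (each unique_id mapped to its direct children). The traversal is an illustration of the concept rather than dbt's own selector code, and the node names are invented.

```python
from collections import deque


def with_descendants(selected: set[str], child_map: dict[str, list[str]]) -> set[str]:
    """Expand a selection to include every downstream node (breadth-first)."""
    seen = set(selected)
    queue = deque(selected)
    while queue:
        current = queue.popleft()
        for child in child_map.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


if __name__ == "__main__":
    child_map = {
        "model.shop.stg_orders": ["model.shop.orders"],
        "model.shop.orders": ["model.shop.revenue", "test.shop.not_null_orders_id"],
        "model.shop.revenue": [],
        "model.shop.customers": [],
    }
    # state:modified would yield only the first set; state:modified+ the second.
    modified = {"model.shop.orders"}
    print(sorted(modified))
    print(sorted(with_descendants(modified, child_map)))
```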
From the successful production pipeline, save manifest.json (and run_results.json if needed) as a build artifact, then fetch it in PR CI. CI passes that artifact to --state and enables --defer to run only the diff.
Inject connection credentials for Snowflake, Databricks, etc. via repository secrets. Tune parallelism and caching to fit your environment.
GitHub Actions (excerpt)

name: pr-slim-ci
on:
  pull_request:
    branches: [ main ]
jobs:
  slim-ci:
    runs-on: ubuntu-latest
    # Job-level env so that every dbt step can load the profile
    env:
      DBT_PROFILES_DIR: ./.profiles # adjust as needed
      SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
      SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dbt
        run: |
          pip install --upgrade pip
          pip install dbt-core dbt-snowflake # swap to match your adapter
      - name: Download prod artifacts
        # By default this action only sees artifacts uploaded in the same
        # workflow run; pulling from the production workflow also requires
        # the run-id and github-token inputs.
        uses: actions/download-artifact@v4
        with:
          name: prod-target-artifacts
          path: state/
      - name: Show planned selection
        run: |
          dbt --version
          dbt deps
          dbt ls -s 'state:modified+' --state state --target ci | tee selection.txt
      - name: Slim CI build
        run: |
          dbt build -s 'state:modified+' \
            --defer --state state \
            --indirect-selection=eager \
            --target ci
      - name: Upload CI artifacts
        uses: actions/upload-artifact@v4
        with:
          name: pr-ci-target
          path: target/

The biggest pitfall is passing the wrong artifact. If you hand --state the manifest.json the PR branch just generated, no diff will be detected and changes can slip through unverified. Always use the last successful production artifact.
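One way to guard against this mistake is a small pre-flight check before dbt runs. The sketch below is a hypothetical helper, not part of dbt. It only verifies that the state directory is not the project's own target directory and that it actually contains a manifest.json. It cannot prove the file came from production.

```python
from pathlib import Path


def validate_state_dir(state_dir: str, target_dir: str = "target") -> Path:
    """Fail fast if --state would point at the PR's own build output."""
    state = Path(state_dir).resolve()
    target = Path(target_dir).resolve()
    if state == target:
        raise ValueError(
            f"--state points at {state}, which is this run's target directory; "
            "use the production artifact instead"
        )
    manifest = state / "manifest.json"
    if not manifest.is_file():
        raise FileNotFoundError(f"no manifest.json found in {state}")
    return manifest


if __name__ == "__main__":
    try:
        print(f"Using state manifest: {validate_state_dir('state')}")
    except (ValueError, FileNotFoundError) as exc:
        print(f"State check failed: {exc}")
```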
Macro and package changes can affect models. State comparison tracks the macros each model depends on, so a changed macro usually marks the models that call it as modified. It is still safer to inspect the dbt ls output during review to confirm that no models or tests are missed.
| Symptom | Cause | Fix |
|---|---|---|
| Diff is empty | --state is mistakenly pointing to PR-generated artifacts | Swap in the production manifest.json and re-check with dbt ls |
| Tests are missing | Indirect selection is too conservative (for example, cautious or buildable) | Use --indirect-selection=eager (the default) or extend the selection with + |
| Runs are slow | Missing --defer or over-selecting nodes | Enable --defer and narrow the selection to state:modified+ |
manifest.json contains metadata such as model definitions and relation names. It generally does not include PII, but it does contain internal information like schema and table names, so control visibility and design repository access carefully.
Common storage locations include CI/CD build artifacts, object storage, or pulling from the previous dbt Cloud run. Define the artifact lifecycle (retention period, rotation) and make sure the most recent successful version is always reachable.
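A retention policy is easier to enforce when CI checks how old the fetched artifact is. The following sketch reads the `generated_at` timestamp from the manifest's `metadata` block and computes its age. The seven-day threshold and the function names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def manifest_age(manifest: dict, now: Optional[datetime] = None) -> timedelta:
    """Return how long ago the manifest was generated."""
    raw = manifest["metadata"]["generated_at"]
    # Normalise a trailing "Z" so older Python versions can parse it.
    generated_at = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if generated_at.tzinfo is None:
        generated_at = generated_at.replace(tzinfo=timezone.utc)
    return (now or datetime.now(timezone.utc)) - generated_at


def is_stale(manifest: dict, max_age: timedelta = timedelta(days=7),
             now: Optional[datetime] = None) -> bool:
    """True when the artifact is older than the (assumed) retention window."""
    return manifest_age(manifest, now) > max_age


if __name__ == "__main__":
    sample = {"metadata": {"generated_at": "2024-01-01T00:00:00.000000Z"}}
    reference = datetime(2024, 1, 3, tzinfo=timezone.utc)
    print(manifest_age(sample, reference))
    print(is_stale(sample, now=reference))
```

A stale artifact is not necessarily wrong, but it widens the diff and can hide changes merged since it was produced.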
Exam questions often ask you to pick the best way to run only changed models and their downstreams in CI. Lock in state:modified+ and --defer as the key combination, and be ready to state that the artifact behind --state is the last successful production state.
Also nail down differences between selectors, when to use build vs run, and how sources, seeds, and snapshots are handled.
Analytics Engineer
Question 1
A PR modifies one intermediate model. In CI you want to validate only the changed model and its downstreams, while referencing production data for unchanged upstreams. Which dbt command is most appropriate? (Production manifest.json is in ./state.)
Correct answer: A
The Slim CI baseline is: pick the diff and its downstreams with state:modified+, delegate unselected references to production with --defer, and provide the comparison artifact via --state. A is correct. B has wrong ordering and syntax, C only targets new nodes, and D lacks --state and selects by tag, which does not meet the requirement.
If a macro changes, does state:modified+ pick up the affected models?
In most cases macro changes are surfaced as model diffs. However, behavior can vary with project layout and dependency resolution, so it is safer to run dbt ls -s state:modified+ --state ./state in CI alongside the build to confirm that the selected set is reasonable.
What happens when source definitions (column types or tests in schema.yml) change?
Changes to source metadata can affect downstream tests and models. state:modified+ is designed to pick up the node and its descendants, but to avoid misses we recommend logging the selected set in the PR so reviewers can sanity-check it.
Does using --defer overwrite production?
--defer only delegates resolution of unselected nodes to the production artifacts; it never modifies production objects. Only the selected nodes run against the CI target (for example, a separate schema). Make sure your target is fully isolated (dedicated schema or database).
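Isolation can be checked mechanically before the build. The sketch below compares the CI target's database and schema against every (database, schema) pair recorded in the production manifest and refuses to continue on a collision. The helper names and sample values are hypothetical, and it assumes each manifest node carries `database` and `schema` fields.

```python
def production_locations(prod_manifest: dict) -> set[tuple[str, str]]:
    """Collect lower-cased (database, schema) pairs used by production nodes."""
    return {
        ((node.get("database") or "").lower(), (node.get("schema") or "").lower())
        for node in prod_manifest.get("nodes", {}).values()
    }


def assert_isolated(ci_database: str, ci_schema: str, prod_manifest: dict) -> None:
    """Raise if the CI target would write into a location production uses."""
    location = (ci_database.lower(), ci_schema.lower())
    if location in production_locations(prod_manifest):
        raise ValueError(
            f"CI target {ci_database}.{ci_schema} overlaps a production schema"
        )


if __name__ == "__main__":
    prod = {"nodes": {
        "model.shop.orders": {"database": "ANALYTICS", "schema": "PROD"},
    }}
    assert_isolated("ANALYTICS", "CI_PR_123", prod)  # passes silently
    try:
        assert_isolated("analytics", "prod", prod)
    except ValueError as exc:
        print(exc)
```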
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.