
dbt Slim CI in Practice: Run Only the Diff Safely with state:modified+

2026-04-19
NicheeLab Editorial Team

This is a practical guide to implementing dbt Slim CI with state:modified+, so you can quickly validate only the models that changed in a pull request (PR) plus their downstream dependencies.

We cover state selection, --defer, and artifact handling, all common Analytics Engineer exam topics, along with the real-world pitfalls you are likely to hit.

What Slim CI Is For and How It Works

Slim CI compares the PR against the last stable state (the production artifacts) and runs only what changed, which dramatically speeds up review. In dbt you use the state:modified selector; appending + adds the downstream dependencies, so validation covers the blast radius of the change without running the whole project.

The core of Slim CI is an artifact that represents the previous state — typically the production manifest.json. CI reads it via the --state option and diffs it against the local changes. At run time you combine this with --defer so that references to unselected nodes are delegated to production. That way you skip rebuilding unchanged upstreams in CI while still letting tests run correctly.

  • state:modified+ selects changed nodes and their downstream descendants
  • --state points to the comparison source (typically the production manifest.json)
  • --defer delegates reference resolution for unselected nodes to the production artifacts
  • dbt build runs models, snapshots, seeds, and tests in one shot

| Execution mode | Scope | Main benefits | Main risks / caveats |
| --- | --- | --- | --- |
| Full run (dbt build) | All resources | Fewest regression misses | More time and cost; impractical to run per PR |
| Impact-scoped run (A+) | Node A and its descendants | Guarantees downstream validation | Manually deciding what changed leads to misses or over-selection |
| Slim CI (state:modified+ --defer) | Only the diff plus its downstreams | Fastest PR validation; runs the diff accurately while referencing production | Requires reliable production artifacts; misconfiguration can drop selections |

Slim CI data flow (conceptual)

PR branch                   CI runner                       Production (last good state)
    |  change (SQL/model)        |                                 |
    |--------------------------->| 1. Fetch prod manifest.json     |
    |                            |<---------------------------------|
    |                            | 2. dbt build -s state:modified+ --defer --state=./state
    |                            |    ├─ Select changed nodes
    |                            |    ├─ Run/test downstream too
    |                            |    └─ Delegate unselected refs to prod
    |                            |                                 |
    |<---------------------------| 3. Result (pass/fail + artifacts)

How state:modified+ Works and Related Options

state:modified compares the past artifact pointed to by --state (typically manifest.json) against the current project definition and selects nodes that differ. Differences include changes to SQL or schema files, additions, and path or config changes. The trailing + tells dbt to include the descendants (downstreams) of the selected nodes.
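To make the selection logic concrete, here is a minimal Python sketch of what state:modified+ does conceptually. It assumes two already-parsed manifest.json dictionaries and treats a node as modified when it is new or its file checksum differs. dbt's real comparison also looks at configs, macros, and other properties, so treat this as an approximation rather than dbt's implementation. The function name modified_plus and the toy project shop are illustrative only.

```python
from collections import deque


def modified_plus(prod_manifest: dict, pr_manifest: dict) -> set:
    """Approximate state:modified+ from two parsed manifest.json dicts."""
    prod_nodes = prod_manifest.get("nodes", {})
    pr_nodes = pr_manifest.get("nodes", {})

    def checksum(node: dict):
        return node.get("checksum", {}).get("checksum")

    # Simplification: a node is "modified" if it is new or its file checksum
    # changed. dbt itself also compares configs, macros, and more.
    modified = {
        uid
        for uid, node in pr_nodes.items()
        if uid not in prod_nodes or checksum(node) != checksum(prod_nodes[uid])
    }

    # The trailing "+": walk child_map breadth-first to collect descendants.
    child_map = pr_manifest.get("child_map", {})
    selected = set(modified)
    queue = deque(modified)
    while queue:
        for child in child_map.get(queue.popleft(), []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Toy manifests: stg_orders -> int_orders -> fct_orders; only int_orders changed.
prod = {
    "nodes": {
        "model.shop.stg_orders": {"checksum": {"checksum": "a1"}},
        "model.shop.int_orders": {"checksum": {"checksum": "b1"}},
        "model.shop.fct_orders": {"checksum": {"checksum": "c1"}},
    }
}
pr = {
    "nodes": {
        "model.shop.stg_orders": {"checksum": {"checksum": "a1"}},
        "model.shop.int_orders": {"checksum": {"checksum": "b2"}},
        "model.shop.fct_orders": {"checksum": {"checksum": "c1"}},
    },
    "child_map": {
        "model.shop.stg_orders": ["model.shop.int_orders"],
        "model.shop.int_orders": ["model.shop.fct_orders"],
        "model.shop.fct_orders": [],
    },
}
print(sorted(modified_plus(prod, pr)))
# prints ['model.shop.fct_orders', 'model.shop.int_orders']
```

The unchanged upstream stg_orders is left out of the result, which is exactly the gap that --defer fills at run time.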

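Slim CI pairs this with --defer. With --defer, a ref() to a node that is not selected (and does not already exist in the CI target) resolves to the relation recorded in the --state manifest, so unchanged upstreams do not need to be rebuilt in CI. source() calls are unaffected and resolve to the declared source as usual. This also lets tests on the selected nodes run against production data for everything upstream.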
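The resolution rule can be sketched in a few lines of Python. This is a simplified model, not dbt's code: it assumes the production manifest stores each node's relation_name, and it builds the CI relation name by naive string formatting. The helper name resolve_ref, the existing_in_ci set, and the schema names are hypothetical.

```python
def resolve_ref(unique_id, selected, existing_in_ci, prod_manifest, ci_schema):
    """Return the relation a ref() would point at under --defer (simplified)."""
    name = unique_id.split(".")[-1]
    prod_node = prod_manifest.get("nodes", {}).get(unique_id)

    # Selected nodes are built in this run, so they live in the CI schema.
    # An unselected node that already exists in the CI schema is used as-is,
    # and a node unknown to production has nowhere else to resolve to.
    if unique_id in selected or unique_id in existing_in_ci or prod_node is None:
        return f"{ci_schema}.{name}"

    # Otherwise defer to the relation recorded in the production manifest.
    return prod_node["relation_name"]


prod_manifest = {
    "nodes": {
        "model.shop.stg_orders": {"relation_name": "analytics.prod.stg_orders"},
        "model.shop.int_orders": {"relation_name": "analytics.prod.int_orders"},
    }
}
selected = {"model.shop.int_orders"}

for uid in ("model.shop.stg_orders", "model.shop.int_orders"):
    print(uid, "->", resolve_ref(uid, selected, set(), prod_manifest, "analytics.ci_pr"))
```

In this toy run the unselected stg_orders resolves to the production relation, while the modified int_orders resolves to the CI schema.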

  • Use dbt ls -s state:modified+ to preview the selected set before running
  • New nodes are already part of state:modified; use state:new when you want only the additions (for example, dbt build -s state:new+ --state ./state)
  • --indirect-selection=eager is dbt's default; setting it explicitly protects against a stricter mode (cautious or buildable) configured elsewhere in the project or environment
  • Artifact comparison is based on file content and dependency information; it is not a simple timestamp comparison

| Selector | Meaning | Primary use |
| --- | --- | --- |
| state:modified | New nodes plus existing nodes that changed since the previous state | Minimal PR-diff runs |
| state:modified+ | Changed nodes plus their downstreams | Recommended diff run that covers the blast radius |
| state:new | Nodes that did not exist previously | Validating only new additions or post-release checks |

Implementation (GitHub Actions + dbt Core example)

From the successful production pipeline, save manifest.json (and run_results.json if needed) as a build artifact, then fetch it in PR CI. CI passes that artifact to --state and enables --defer to run only the diff.

Inject connection credentials for Snowflake, Databricks, etc. via repository secrets. Tune parallelism and caching to fit your environment.

  • On a successful main-branch run, upload the contents of target/ as an artifact (at least manifest.json)
  • In PR CI, download that artifact and reference it via --state
  • Run dbt build -s state:modified+ --defer --state state --indirect-selection=eager
  • Optionally log the selection with dbt ls so reviewers can verify what is being run
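Before running the diff, it can help to fail fast when the downloaded artifact is missing or empty. The following is a minimal sketch of such a guard, assuming the artifact lands in a directory like state/ and that manifest.json carries the usual metadata.dbt_version and metadata.generated_at fields. The function names load_state_manifest and describe are hypothetical helpers, not part of dbt.

```python
import json
from pathlib import Path


def load_state_manifest(state_dir: str) -> dict:
    """Load <state_dir>/manifest.json and fail loudly if it is unusable."""
    path = Path(state_dir) / "manifest.json"
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; did the artifact download succeed?")
    manifest = json.loads(path.read_text(encoding="utf-8"))
    if not manifest.get("nodes"):
        raise ValueError(f"{path} contains no nodes; refusing to diff against it")
    return manifest


def describe(manifest: dict) -> str:
    """One-line summary for the CI log so reviewers can see what was compared."""
    meta = manifest.get("metadata", {})
    return (
        f"dbt {meta.get('dbt_version', '?')}, "
        f"generated {meta.get('generated_at', '?')}, "
        f"{len(manifest['nodes'])} nodes"
    )
```

A CI step could call print(describe(load_state_manifest("state"))) just before dbt ls, so a bad download stops the job instead of silently producing an empty or misleading selection.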

GitHub Actions (excerpt)

name: pr-slim-ci
on:
  pull_request:
    branches: [ main ]
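jobs:
  slim-ci:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      actions: read  # needed to download artifacts from another workflow run
    env:  # job-level so that both dbt ls and dbt build can load the profile
      DBT_PROFILES_DIR: ./.profiles  # adjust as needed
      SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
      SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dbt
        run: |
          pip install --upgrade pip
          pip install dbt-core dbt-snowflake  # swap to match your adapter

      # download-artifact@v4 only sees the current run by default, so look up
      # the last successful production run first. "prod-deploy.yml" is a
      # placeholder for your production workflow file.
      - name: Find last successful prod run
        id: prod
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          run_id=$(gh run list --workflow prod-deploy.yml --branch main \
            --status success --limit 1 --json databaseId --jq '.[0].databaseId')
          echo "run_id=$run_id" >> "$GITHUB_OUTPUT"

      - name: Download prod artifacts
        uses: actions/download-artifact@v4
        with:
          name: prod-target-artifacts
          path: state/
          run-id: ${{ steps.prod.outputs.run_id }}
          github-token: ${{ github.token }}

      - name: Show planned selection
        run: |
          dbt --version
          dbt deps
          dbt ls -s 'state:modified+' --state state --target ci | tee selection.txt

      - name: Slim CI build
        run: |
          dbt build -s 'state:modified+' \
            --defer --state state \
            --indirect-selection=eager \
            --target ci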

      - name: Upload CI artifacts
        uses: actions/upload-artifact@v4
        with:
          name: pr-ci-target
          path: target/

Common Pitfalls and Verification Points

The biggest pitfall is passing the wrong artifact. If you hand --state the manifest.json the PR branch just generated, no diff will be detected and changes can slip through unverified. Always use the last successful production artifact.

Macro and package changes can affect models. State comparison considers dependencies, so changes usually propagate, but it is safer to inspect dbt ls output during review to confirm no tests are missed.
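As a rough way to see why macro edits propagate to models, the sketch below compares the macros section of two parsed manifests and lists the nodes whose depends_on.macros includes a changed macro. It only follows direct dependencies and compares raw macro_sql text, so it is a review aid under those assumptions, not a reimplementation of dbt's state comparison. The function name and the toy identifiers are hypothetical.

```python
def nodes_hit_by_macro_changes(prod_manifest: dict, pr_manifest: dict) -> set:
    """Return node ids that directly depend on a macro whose SQL changed."""
    prod_macros = prod_manifest.get("macros", {})
    pr_macros = pr_manifest.get("macros", {})

    # A macro counts as changed if it is new or its macro_sql text differs.
    changed = {
        uid
        for uid, macro in pr_macros.items()
        if uid not in prod_macros
        or macro.get("macro_sql") != prod_macros[uid].get("macro_sql")
    }

    # Keep only nodes that list at least one changed macro as a dependency.
    return {
        uid
        for uid, node in pr_manifest.get("nodes", {}).items()
        if changed & set(node.get("depends_on", {}).get("macros", []))
    }


prod_state = {
    "macros": {"macro.shop.cents_to_dollars": {"macro_sql": "{{ col }} / 100"}},
}
pr_state = {
    "macros": {"macro.shop.cents_to_dollars": {"macro_sql": "round({{ col }} / 100, 2)"}},
    "nodes": {
        "model.shop.fct_orders": {"depends_on": {"macros": ["macro.shop.cents_to_dollars"]}},
        "model.shop.stg_orders": {"depends_on": {"macros": []}},
    },
}
print(sorted(nodes_hit_by_macro_changes(prod_state, pr_state)))
# prints ['model.shop.fct_orders']
```

Comparing this list with the dbt ls output for state:modified+ is one way to spot a model that depends on an edited macro but is missing from the selection.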

  • --state must point to the production manifest.json — never to PR-generated artifacts
  • Forgetting --defer makes refs to unselected upstreams resolve to the CI schema, where those relations do not exist, so the build fails; working around that by also building upstreams blows up time and cost
  • Source-definition or contract changes require downstream re-validation; confirm via dbt ls that state:modified+ catches them
  • Seed and snapshot changes are included by build, but watch for selection gaps
  • Keep --indirect-selection=eager (the default) set explicitly so that a stricter mode does not silently drop related tests

| Symptom | Cause | Fix |
| --- | --- | --- |
| Diff is empty | --state is mistakenly pointing to PR-generated artifacts | Swap in the production manifest.json and re-check with dbt ls |
| Tests are missing | Indirect selection is set to a stricter mode such as cautious | Set --indirect-selection=eager or extend the selection with + |
| Runs are slow | Upstreams are rebuilt in CI instead of deferred, or too many nodes are selected | Enable --defer and narrow the selection to state:modified+ |

Storing Production Artifacts Securely

manifest.json contains metadata such as model definitions and relation names. It generally does not include PII, but it does contain internal information like schema and table names, so control visibility and design repository access carefully.
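To judge how sensitive a given manifest is before deciding where to store it, you can list the database and schema names it exposes. The sketch below is a minimal example that assumes the nodes and sources sections of a parsed manifest.json carry database and schema fields; the function name exposed_namespaces and the sample values are hypothetical.

```python
def exposed_namespaces(manifest: dict) -> list:
    """List the distinct database.schema pairs that appear in a manifest."""
    pairs = set()
    for section in ("nodes", "sources"):
        for node in manifest.get(section, {}).values():
            database = node.get("database") or ""
            schema = node.get("schema") or ""
            if schema:
                # Drop the leading dot when no database is recorded.
                pairs.add(f"{database}.{schema}".lstrip("."))
    return sorted(pairs)


sample_manifest = {
    "nodes": {
        "model.shop.fct_orders": {"database": "analytics", "schema": "marts"},
        "model.shop.stg_orders": {"database": "analytics", "schema": "staging"},
    },
    "sources": {
        "source.shop.raw.orders": {"database": "raw", "schema": "shop"},
    },
}
print(exposed_namespaces(sample_manifest))
# prints ['analytics.marts', 'analytics.staging', 'raw.shop']
```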

Common storage locations include CI/CD build artifacts, object storage, or pulling from the previous dbt Cloud run. Define the artifact lifecycle (retention period, rotation) and make sure the most recent successful version is always reachable.

  • Store in least-privilege storage and read-only from PRs
  • Fix artifact names and paths to prevent mix-ups
  • Tune retention to your release cycle (for example, 30-90 days)
  • If you use dbt Cloud, consider the Artifacts API for the last successful run

Exam Checklist (Analytics Engineer)

Exam questions often ask you to pick the best way to run only changed models and their downstreams in CI. Lock in state:modified+ and --defer as the key combination, and be ready to state that the artifact behind --state is the last successful production state.

Also nail down differences between selectors, when to use build vs run, and how sources, seeds, and snapshots are handled.

  • Can explain the difference between state:modified and state:new
  • Can explain what + means (descendant selection)
  • Can explain --defer (resolves ref() calls to unselected nodes against production)
  • dbt build runs models, seeds, snapshots, and tests together
  • Can verify selection results with dbt ls

Check Your Understanding

Analytics Engineer

Question 1

A PR modifies one intermediate model. In CI you want to validate only the changed model and its downstreams, while referencing production data for unchanged upstreams. Which dbt command is most appropriate? (Production manifest.json is in ./state.)

  A. dbt build -s state:modified+ --defer --state ./state
  B. dbt run --select +state:modified --state ./state
  C. dbt test -s state:new+ --defer --state ./state
  D. dbt build -m tag:ci --defer

Correct answer: A

The Slim CI baseline is: pick the diff and its downstreams with state:modified+, delegate unselected references to production with --defer, and provide the comparison artifact via --state. A is correct. B puts the + in front, which selects upstream ancestors instead of downstreams; it also uses run, so tests are skipped, and omits --defer. C only targets new nodes and only runs tests. D selects by tag rather than by diff and gives --defer no --state to resolve against, so it does not meet the requirement.

Frequently Asked Questions

If a macro changes, does state:modified+ pick up the affected models?

In most cases macro changes are surfaced as model diffs. However, behavior can vary with project layout and dependency resolution, so it is safer to run dbt ls -s state:modified+ --state <path> in CI alongside the build to confirm that the selected set is reasonable.

What happens when source definitions (column types or tests in schema.yml) change?

Changes to source metadata can affect downstream tests and models. state:modified+ is designed to pick up the node and its descendants, but to avoid misses we recommend logging the selected set in the PR so reviewers can sanity-check it.
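One lightweight way to log the selected set is to summarize the output of dbt ls --output json by resource type. The sketch below assumes each JSON line includes a resource_type key and skips anything that is not a JSON object, such as log lines. The function name summarize_selection and the sample lines are illustrative.

```python
import json
from collections import Counter


def summarize_selection(lines) -> Counter:
    """Count selected resources by type from `dbt ls --output json` lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip log lines and blanks
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that only looks like JSON
        counts[record.get("resource_type", "unknown")] += 1
    return counts


sample_lines = [
    "12:00:01  Running with dbt=1.8.0",
    '{"name": "int_orders", "resource_type": "model"}',
    '{"name": "not_null_int_orders_id", "resource_type": "test"}',
    '{"name": "fct_orders", "resource_type": "model"}',
    "",
]
for resource_type, count in sorted(summarize_selection(sample_lines).items()):
    print(f"{resource_type}: {count}")
```

In CI you could redirect dbt ls -s 'state:modified+' --state state --output json to a file, feed that file's lines to the function, and post the counts as a PR comment or job summary.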

Does using --defer overwrite production?

--defer only delegates resolution of unselected nodes to the production artifacts; it never modifies production objects. Only the selected nodes run against the CI target (for example, a separate schema). Make sure your target is fully isolated (dedicated schema or database).

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

