Mastering dbt Node Selection for Practice and Exams: tag / path / + / state:

2026-04-19
NicheeLab Editorial Team

Selectors in dbt Core/Cloud are the core mechanism for narrowing target nodes to the minimum set, so you can run pipelines faster and more safely.

This article walks through tag, path, + (as in model+), and state: — the selectors most commonly tested on the exam and used in practice — along with the points people frequently misunderstand.

Selector Basics and the Big Picture

dbt commands (run/build/test/seed, etc.) specify target nodes with --select and --exclude. There are several selection methods; the main ones are tag:, path:, state:, and resource_type: (model, test, seed, source, etc.), plus the + graph operator for lineage expansion. Here + means "include parents (upstream) / children (downstream)," and model+ is common shorthand for forms like my_model+ (the model and everything downstream) or +my_model (the model and everything upstream).

Each selector is useful on its own, but combining them lets you safely accelerate CI, run staged releases, and execute nightly batches. state: in particular runs only the diff against a past artifact (manifest.json), so the benefit grows with the scale of your operations.

  • Major selectors: tag:, path:, state:, resource_type: (resource_type:model, resource_type:test, resource_type:seed, resource_type:snapshot, etc.), and source:
  • Meaning of +: +target includes upstream, target+ includes downstream, +target+ includes both
  • state: selectors assume --state points at the reference directory (the target folder containing manifest.json)

Selector / Feature | Main Use | Typical Example | Advantage
tag: | Logical grouping (nightly / heavy / by team, etc.) | dbt build --select tag:nightly | Flexible grouping regardless of code layout
path: | Directory / file-level selection | dbt run --select path:models/staging/ | Clear selection aligned with physical layout
+ (lineage expansion) | Safely re-run including upstream/downstream | dbt build --select +fact_orders | No missing dependencies
state: | Diff execution (modified/new) | dbt build --select state:modified+ --state path/to/previous/target | Easy to accelerate even in huge repos

Conceptual diagram of + (lineage expansion)

Upstream                 Target                 Downstream
   A  ---->  B  ---->  C  ---->  D
            ^
            |__ Selected: B

+B      =>  A, B         (B and all upstream)
B+      =>       B, C, D (B and all downstream)
+B+     =>  A, B, C, D   (both upstream and downstream)
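
The expansion rules in the diagram can be sketched in a few lines of Python. This is only a toy model of the idea, not how dbt implements it: the EDGES list and the expand helper are hypothetical names invented for illustration, and depth limits such as 2+model are ignored.

```python
from collections import defaultdict

# Toy DAG from the diagram: A -> B -> C -> D, stored as (parent, child) pairs.
EDGES = [("A", "B"), ("B", "C"), ("C", "D")]


def _walk(start, neighbours):
    """Return every node reachable from start by following neighbours."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def expand(selector, edges=EDGES):
    """Resolve 'name', '+name', 'name+' or '+name+' against the toy DAG."""
    children, parents = defaultdict(set), defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
        parents[child].add(parent)
    name = selector.strip("+")
    selected = {name}
    if selector.startswith("+"):   # leading + pulls in all ancestors
        selected |= _walk(name, parents)
    if selector.endswith("+"):     # trailing + pulls in all descendants
        selected |= _walk(name, children)
    return selected


print(sorted(expand("+B")))   # ['A', 'B']
print(sorted(expand("B+")))   # ['B', 'C', 'D']
print(sorted(expand("+B+")))  # ['A', 'B', 'C', 'D']
```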

Basic usage examples

# Selector basics
# Select by tag
dbt build --select tag:nightly

# Select by path (relative to project root)
dbt run --select path:models/marts/

# Lineage expansion (upstream/downstream)
dbt build --select +dim_customers
dbt build --select fact_orders+
dbt build --select +stg_payments+

# state: (diff)
dbt build --select state:modified+ --state path/to/previous/target --defer

Logical Grouping with tag:

tag: applies to models, snapshots, seeds, and (if needed) tests to provide logical grouping. Since it doesn't depend on physical paths, you can safely run subsets aligned with your organization or operational policy (nightly, heavy, finance, etc.).

dbt build also automatically runs the tests attached to the selected models. If you want to tag the tests themselves, set tags on the individual test definitions in the schema/properties YAML.

  • Standardize naming across the team (e.g. frequency, SLA, data domain)
  • tag: lets you run subsets without switching environments or clusters
  • Checking selector coverage in CI is effective for catching missing tags
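
One way to check tag coverage in CI is to scan the manifest for models that carry no tags at all. The sketch below is a minimal illustration under stated assumptions: untagged_models is a hypothetical helper, and SAMPLE_MANIFEST is a heavily trimmed stand-in for target/manifest.json that keeps only the name, resource_type, and tags keys of each node.

```python
def untagged_models(manifest):
    """Return the names of models in a manifest-like dict that have no tags."""
    return sorted(
        node["name"]
        for node in manifest["nodes"].values()
        if node["resource_type"] == "model" and not node.get("tags")
    )


# Hypothetical, trimmed stand-in for the contents of target/manifest.json.
SAMPLE_MANIFEST = {
    "nodes": {
        "model.shop.fact_orders": {
            "name": "fact_orders",
            "resource_type": "model",
            "tags": ["nightly", "heavy"],
        },
        "model.shop.stg_payments": {
            "name": "stg_payments",
            "resource_type": "model",
            "tags": [],
        },
        "seed.shop.countries": {
            "name": "countries",
            "resource_type": "seed",
            "tags": [],
        },
    }
}

print(untagged_models(SAMPLE_MANIFEST))  # ['stg_payments']
```

In a real pipeline you would load the dict with json.load from the manifest file and fail the job when the returned list is not empty.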

Example: applying tags and running

-- models/marts/fact_orders.sql
{{ config(tags=["nightly", "heavy"]) }}
select ...

# Example of tagging tests in schema.yml
# (newer dbt versions also accept data_tests: in place of tests:)
models:
  - name: fact_orders
    columns:
      - name: order_id
        tests:
          - not_null:
              config:
                tags: ["heavy"]

# Build by tag
dbt build --select tag:nightly

# Exclude a specific tag
dbt build --select tag:nightly --exclude tag:heavy
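
Conceptually, --exclude is a set difference: first collect everything the --select criteria match, then remove whatever the --exclude criteria match. The following sketch models that for tags only; select_by_tag and the NODE_TAGS mapping are hypothetical names used for illustration, not part of dbt.

```python
def select_by_tag(node_tags, include, exclude=None):
    """Keep nodes tagged with include, then drop nodes tagged with exclude."""
    selected = {name for name, tags in node_tags.items() if include in tags}
    if exclude is not None:
        selected -= {name for name, tags in node_tags.items() if exclude in tags}
    return selected


# Hypothetical project: node name -> tags.
NODE_TAGS = {
    "fact_orders": ["nightly", "heavy"],
    "dim_customers": ["nightly"],
    "stg_payments": [],
}

# Mirrors: dbt build --select tag:nightly --exclude tag:heavy
print(sorted(select_by_tag(NODE_TAGS, "nightly", exclude="heavy")))  # ['dim_customers']
```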

Physical-Layout Selection with path:

path: specifies nodes by their path relative to the project root. Pointing at a directory targets the nodes underneath; pointing at a file targets only that node. It's simple and powerful when your model layout policy (staging/marts, etc.) is clear.

To target nodes from installed packages, prefer the package: selector. Installed packages live under dbt_packages/ by default (not packages/), and path: selection against that folder is not something to rely on. Moving or renaming files changes what path: selects, so review those impacts whenever you restructure.

  • The cleaner the directory structure, the more effective this is
  • package: is also an option for cross-package references (choose based on use case)
  • Even on Windows, consistently using forward slashes (/) reduces confusion

path: examples

# Only staging
dbt run --select path:models/staging/

# Pinpoint a single file
dbt run --select path:models/marts/fact_orders.sql

# Nodes from an installed package (e.g. jaws_shop)
dbt build --select package:jaws_shop
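
The directory-versus-file behavior of path: can be pictured as a comparison on path components rather than a plain string prefix. The sketch below is an approximation under that assumption: matches_path is a hypothetical helper, and glob patterns, which dbt also accepts, are left out.

```python
from pathlib import PurePosixPath


def matches_path(selector, file_path):
    """True if file_path equals the selector path or sits anywhere beneath it."""
    target = PurePosixPath(selector)      # a trailing slash is normalized away
    candidate = PurePosixPath(file_path)
    return candidate == target or target in candidate.parents


# A directory selector matches every file underneath it.
print(matches_path("models/staging/", "models/staging/stg_payments.sql"))      # True
# Matching is per path component, so a sibling with a similar name is not hit.
print(matches_path("models/staging/", "models/staging_old/stg_payments.sql"))  # False
# A file selector matches only that file.
print(matches_path("models/marts/fact_orders.sql", "models/marts/fact_orders.sql"))  # True
```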

Safely Including Upstream/Downstream with + (model+)

+ is the operator for including upstream/downstream lineage in the selection. +target includes parents (upstream), target+ includes children (downstream), and +target+ includes both. It's an essential technique for avoiding missed dependencies during builds and refactors.

Combined with dbt build, the tests attached to the selected models run as part of the same invocation. If the impact range gets too wide, use --exclude to dial it back.

  • Include upstream: +stg_payments
  • Include downstream: fact_orders+
  • Include both: +dim_customers+
  • Prevent over-selection: control with --exclude tag:heavy and similar

Examples using +

# Verify all the way down to affected downstream models
dbt build --select stg_payments+

# Recalculate including upstream prep
dbt build --select +fact_orders

# Both upstream and downstream (watch out: this can get large)
dbt build --select +dim_customers+ --exclude tag:heavy

Run Only the Diff, Fast and Safely, with state:

state: compares a past artifact (typically target/manifest.json) with the current repository state and selects only the nodes that became modified or new. --state pointing at the reference directory is required. In practice, the standard pattern is to pass production build output to --state and use --defer to delegate unchanged parts to that existing output.

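Common patterns are state:modified+ (changed nodes plus their downstream) and state:new (only newly added nodes). Note that state:modified already includes new nodes in addition to changed existing ones. File moves that change a node's fqn and important config diffs (such as materialized) count as changes; a renamed model gets a new unique ID, so it shows up as new.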

  • Prerequisite: --state path/to/target (manifest.json must exist there)
  • Use --defer to delegate unchanged-node references to existing output (safe diff builds)
  • Change detection can be triggered by file moves, renames, and some config changes
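
The comparison behind state: can be approximated as a diff between two manifests keyed by unique ID. This is a deliberately simplified sketch: diff_nodes is a hypothetical helper, the two dicts are trimmed stand-ins for manifest.json, and only the file checksum is compared, whereas dbt also looks at configs, fqn, macros, and more.

```python
def diff_nodes(previous, current):
    """Return (new, modified) sets of unique IDs, loosely mirroring
    state:new and state:modified (which also covers new nodes)."""
    prev_nodes, curr_nodes = previous["nodes"], current["nodes"]
    new = {uid for uid in curr_nodes if uid not in prev_nodes}
    changed = {
        uid
        for uid, node in curr_nodes.items()
        if uid in prev_nodes and node["checksum"] != prev_nodes[uid]["checksum"]
    }
    return new, new | changed


# Hypothetical, trimmed manifests: production (previous) vs. working copy (current).
PREVIOUS = {"nodes": {
    "model.shop.stg_orders": {"checksum": {"name": "sha256", "checksum": "aaa"}},
    "model.shop.stg_payments": {"checksum": {"name": "sha256", "checksum": "bbb"}},
}}
CURRENT = {"nodes": {
    "model.shop.stg_orders": {"checksum": {"name": "sha256", "checksum": "aaa"}},
    "model.shop.stg_payments": {"checksum": {"name": "sha256", "checksum": "ccc"}},
    "model.shop.dim_new": {"checksum": {"name": "sha256", "checksum": "ddd"}},
}}

new, modified = diff_nodes(PREVIOUS, CURRENT)
print(sorted(new))       # ['model.shop.dim_new']
print(sorted(modified))  # ['model.shop.dim_new', 'model.shop.stg_payments']
```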

Practical state: examples

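# Build only changed nodes and their downstream
# (--state expects a local directory, so fetch the production manifest.json first;
#  the bucket and local paths here are examples)
aws s3 cp s3://prod-artifacts/jaffle/target/manifest.json ./target-prod/manifest.json
dbt build --select state:modified+ \
          --state ./target-prod \
          --defer

# Verify only new nodes (smoke test for new features, etc.)
dbt build --select state:new --state ./target-prod --defer

# Test only changed nodes (delegate models to existing output)
dbt test --select state:modified --state ./target-prod --defer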

Practical Scenarios and Exam Pitfalls

In practice, you combine physical layout (path:), logical grouping (tag:), lineage (+), and diff (state:) to target a "minimum sufficient" scope. The Analytics Engineer exam tests this mindset and the precision of the commands.

Common pitfalls include inconsistent tagging, missing the impact of file moves on path:, overly wide selection from +, and forgetting --state/--defer when using state:. To err on the safe side, prefer build with +, and clamp down on over-selection with --exclude.

  • Nightly batch: base on tag:nightly, excluding heavy processing with tag:heavy
  • Verification stage: narrow to path:models/staging/, expanding downstream only via staging+
  • Production diff: switch fast and safely with state:modified+ and --defer
  • Migration period: use + while gradually replacing with --exclude path:models/legacy/

Combination examples

# Nightly job (logical + exclusion)
dbt build --select tag:nightly --exclude tag:heavy

# Staging verification (physical + downstream)
dbt build --select path:models/staging/+

# Production diff (state: + downstream + defer)
# --state must be a local directory holding the downloaded production manifest.json
dbt build --select state:modified+ --state ./target-prod --defer

# Verify impact while excluding legacy
dbt build --select +fact_orders --exclude path:models/legacy/

Check Your Understanding

Analytics Engineer

Question 1

You want to reference a manifest.json already produced in production and safely verify only the changed models and their downstream. Which command is most appropriate?

  A. dbt build --select state:modified+ --state path/to/prod/target --defer
  B. dbt build --select +state:modified --defer
  C. dbt run --select state:new --defer
  D. dbt build --select path:models/ --state path/to/prod/target

Correct answer: A

Diff selection requires --state, and state:modified+ is the right choice to also include downstream of changed nodes. Adding --defer safely delegates unchanged nodes to existing output. B lacks --state, and +state:modified expands upstream rather than downstream. C lacks --state and only targets new nodes. D is not a diff selection.

Frequently Asked Questions

What is the difference between state:modified and state:new?

state:modified targets new nodes plus existing nodes whose content or important configuration has changed, while state:new targets only newly added nodes. Both require --state to point at the directory containing the manifest.json artifact you are comparing against.

When should I use path: vs fqn:?

path: is based on the physical file or directory path, while fqn: is based on the logical fully qualified name (project name, subdirectory, model name). If your directory layout is clean, path: is intuitive; if you want to preserve the logical hierarchy, fqn: is a viable option.
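
To make the contrast concrete, the sketch below derives an fqn-style list from a model's file path. It rests on assumptions: fqn_for is a hypothetical helper, jaffle_shop is an example project name, the model root is taken to be models/, and extras such as model versions are ignored.

```python
from pathlib import PurePosixPath


def fqn_for(project_name, file_path, model_root="models"):
    """Build [project, subdirectories..., model name] from a model file path."""
    rel = PurePosixPath(file_path).relative_to(model_root)
    return [project_name, *rel.parent.parts, rel.stem]


# path: sees the physical location; fqn: sees the logical hierarchy.
print(fqn_for("jaffle_shop", "models/marts/finance/fact_orders.sql"))
# ['jaffle_shop', 'marts', 'finance', 'fact_orders']
```

The path: selector would address this model as models/marts/finance/fact_orders.sql, while an fqn: selector works against the dotted form of the list above.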

Do tags also work on tests?

If you attach tags to the tests themselves, you can select them directly with tag:. Even without tags on tests, dbt build will run the tests tied to selected models. To group tests explicitly, set tags on the tests block in your schema/properties YAML.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


© 2026 NicheeLab All rights reserved.