dbt

dbt Troubleshooting: Common Errors and How to Fix Them

2026-04-19
NicheeLab Editorial Team

dbt spans multiple layers: SQL transformations, Jinja, adapter-mediated connections, selection rules, and tests. Errors look different in each layer, and without proper isolation you can burn a lot of time chasing the wrong cause.

This article organizes the failure patterns that show up most often in the field, layer by layer, and offers reproducible diagnostic steps and reliable fixes. It also covers the command differences and incremental requirements that the Analytics Engineer exam likes to test.

Isolating profile and connection errors

Start by verifying that profiles.yml and the adapter line up. Use dbt debug to validate the profile name in dbt_project.yml, the target in profiles.yml, the authentication method, and whether the adapter plugin is installed.

Errors caused by network or privilege issues are typical at the database connection layer. Use the vocabulary of the error message — authentication failure, missing schema, insufficient role — to localize the problem, then fall back to a minimal-configuration connection test.

  • Runtime Error: Could not find profile named 'xxx' → the profile name in dbt_project.yml does not match the key in ~/.dbt/profiles.yml
  • Database Error: Authentication failed / invalid credentials → re-check environment variables and key-file references for credentials
  • Adapter Error: Missing required dependency → the adapter is not installed. Example: pip install dbt-bigquery / dbt-snowflake
  • Database Error: schema does not exist → schema creation privilege is required upfront, or add CREATE SCHEMA to an on-run-start hook in dbt
Command / filePurpose
dbt debugHealth check for adapter, authentication, and profile
dbt --versionVerify adapter and dbt-core compatibility and installation
profiles.ymlDefines connection target, credentials, and target environment
dbt_project.ymlConfirms the profile key matches and reviews default model settings

Initial response flow for connection issues

noyesyesnodbt run faileddbt debug OK?Verify adapter installationpip install dbt-xxxAuth / privilege error?Check credentials / role / schemaMove to SQL / model layercontinue isolationInitial response flow for connection issues

Minimal profiles.yml example (Snowflake, using environment variables)

snowflake_profile:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: ANALYST
      database: ANALYTICS
      warehouse: TRANSFORMING
      schema: DEV
      threads: 8
      client_session_keep_alive: true

Failures from selection rules and command differences

dbt's node selection involves ref() dependencies, tags, and paths. No nodes selected means the selector resolved to an empty set. Visualize the target set with dbt ls before running the production command to dramatically cut down on incidents.

The difference between run and build is a frequent exam topic. build runs models, tests, snapshots, seeds, and exposures together in dependency order. build is the default path in CI, but you need to understand its behavior around unintended seed overwrites and halts on test failures.

  • No nodes selected → recheck the selector: dbt ls -s <selector>
  • Found duplicate model name → package collision. Resolve with models: +schema / +alias, or prefix with the package name
  • State-based selection returns zero rows → the state path is wrong when using dbt run -s state:modified --state path/to/artifacts
CommandTarget nodesCommon optionsPrimary failure causes
dbt runmodels only-s/--select, -m (synonym), --deferNo nodes selected, missing dependencies, compilation failure
dbt buildmodels + tests + snapshots + seeds--fail-fast, --select, --excludeHalts on test failure, seed overwrites, order-dependent errors
dbt testgeneric/singular tests--select, --store-failuresQuality errors such as NULL or duplicate detection
dbt seedCSV → table--full-refresh, --showColumn type mismatches, quoting configuration, watch for overwrites
dbt source freshnesssources--select, --output jsonMax latency exceeded, insufficient metadata privileges

Example of target visualization and safe selection

# First confirm the targets
/dbt ls -s tag:daily +
# Only the changed nodes (state:modified)
/dbt ls -s state:modified --state ./target

# Real run (include downstream dependencies)
/dbt build -s tag:daily+ --fail-fast

# Alias to avoid package collisions (example from dbt_project.yml)
models:
  my_pkg:
    +alias: pkg__{{ model.name }}

Incremental model failures and safe reruns

Incremental looks convenient but produces missing rows or duplicates when unique_key or the is_incremental() condition is ambiguous. The merge strategy effectively requires a unique_key. The key is a condition that does not break in CI or production runs that mix full refreshes.

Schema changes behave differently based on the on_schema_change setting and whether the adapter supports it. Unsupported combinations raise a Database Error, so make the sync policy explicit and initialize with --full-refresh when necessary.

  • merge without unique_key → runtime error or duplicate rows
  • Nondeterministic boundary timestamp in is_incremental() → missing rows or duplicates on rerun
  • on_schema_change unsupported / mismatched → ALTER / REPLACE fails
  • No partition column specified (on warehouses that support it) → excessive scans and higher costs
SymptomTypical cause / fix
Database Error: MERGE requires a unique matchSet config(unique_key=...). Redesign the key to guarantee uniqueness.
Intermittent missing rowsThe WHERE in is_incremental() is too loose. Revise the boundary to >= last timestamp + 1 unit.
Fails after adding a columnSet the on_schema_change sync policy to sync_all_columns or similar. Fall back to --full-refresh if unsupported.
Runs succeed but duplicates growunique_key needs to be composite. Use coalesce or normalization to guarantee uniqueness.

Robust incremental model example

{{ config(
    materialized='incremental',
    unique_key='id',
    on_schema_change='sync_all_columns'
) }}

with src as (
  select * from {{ ref('stg_events') }}
)

select
  id,
  event_ts,
  payload
from src
{% if is_incremental() %}
where event_ts > (
  select coalesce(max(event_ts), '1970-01-01') from {{ this }}
)
{% endif %}

Privilege, schema, and quoting pitfalls

If the target schema does not exist or the role cannot create it, the model fails before it is even created. In production, grant explicit CREATE / USAGE privileges to the deployment role, or automate CREATE SCHEMA in an on-run-start hook for the first run only.

Identifier case sensitivity and reserved words behave differently per adapter. Standardize quoting for identifier and schema via the dbt quoting setting, and pin the name explicitly with alias — it is the safe approach.

  • schema does not exist → create the schema in advance, or create it via run-operation on the first run
  • permission denied → grant read and create privileges separately. At minimum: USAGE / CREATE / SELECT / INSERT / UPDATE / DELETE
  • Case sensitivity drift → stabilize by setting quoting.identifier to true and pinning with alias
PrivilegeDescription
USAGE (DB/SCHEMA)Required to reference objects. Without it you cannot even check existence.
CREATE (SCHEMA/TABLE/VIEW)Required at creation time. A common cause of materialization failures.
SELECT/INSERT/UPDATE/DELETERequired for reading and writing models. Incremental models also need update privileges.

Stabilizing quoting and schema (profiles.yml / model config)

# profiles.yml example (always quote identifiers)
quoting:
  database: false
  schema: true
  identifier: true

# Pin the alias on the model side (top of models/*.sql)
{{ config(alias='fact_events', schema='prod') }}

Failures originating from Jinja, macros, and packages

A Compilation Error is a failure during the Jinja interpretation stage. The cause is usually an undefined variable or macro, or a type mismatch — and dbt parse catches these early. Harden vars by providing defaults.

Missing package-sourced macros are typically caused by not running dbt deps or by a version mismatch. Revisit the pins in packages.yml and the dispatch rules used when names collide.

  • Compilation Error: 'my_var' is undefined → use var('my_var', 'default')
  • Macro not found → check packages.yml and run dbt deps
  • adapter.dispatch resolution failure → confirm that macro naming and package_namespace line up
Error messageDirection of the fix
'var_x' is undefinedDefine in vars, or replace with var('var_x', default)
Macro 'pkg.my_macro' not foundAdd pkg to packages.yml, run dbt deps, and pin the version
Could not render {{ ref('x') }}Fix the referenced resource name, path, or package prefix

Example: vars defaults and package management

# Safe vars reference inside a model
select * from {{ ref('stg_users') }} where country = '{{ var('country', 'JP') }}'

# packages.yml (example)
packages:
  - package: dbt-labs/dbt_utils
    version: ">=1.0.0,<2.0.0"

# Verification
/dbt deps
/dbt parse

Handling failures around tests, source freshness, and operations

When a test fails, first confirm whether it is detecting per the quality spec. Decide whether to absorb periods where NULLs or duplicates are tolerated, or the handling of exceptions, on the model side or via conditions on the test side.

source freshness fails when the max latency is exceeded. Align the scheduler with the ingest cycle, and pipe the JSON output into your monitoring stack for effective operations.

  • Identify failing rows with dbt test --store-failures
  • Hook dbt source freshness --output json into monitoring
  • Seeds easily drift due to type and quoting settings. Reconcile with --full-refresh.
Failure scenarioDiagnostic angle
unique / accepted_values test failureDrift between model logic and business spec. Revisit the test with a WHERE clause.
freshness latency exceededIngest platform latency, or a misconfigured lag threshold
CI failure due to seed driftMissed recognition of CSV updates. Standardize differences in type, NULL representation, and quoting.

Practical snippets for tests, freshness, and CI

# tests/schema.yml (example)
models:
  - name: dim_users
    tests:
      - unique:
          column_name: user_id
      - not_null:
          column_name: user_id

sources:
  - name: app
    tables:
      - name: events
        freshness:
          warn_after: {count: 12, period: hour}
          error_after: {count: 24, period: hour}
          loaded_at_field: event_ts

# In CI, run everything together with build
git diff --name-only | xargs -I{} echo {}
/dbt build --select state:modified+ --state ./target --fail-fast --warn-error
/dbt source freshness --select source:app.events --output json

Check Your Understanding

Analytics Engineer

問題 1

An incremental model runs with the merge strategy, and dbt run is producing growing duplicate rows. Which fix best guarantees a stable rerun?

  1. Set unique_key in config, and tighten the WHERE in is_incremental() to a strict boundary greater than the last ingested timestamp
  2. Turn dbt test off and rerun
  3. Run dbt seed first, then run
  4. Increase threads to raise the concurrency

正解: A

The merge strategy needs a key for uniquely matching rows; missing unique_key or a fuzzy time boundary is what causes duplicates. The correct fix is to declare unique_key explicitly and adjust the is_incremental() condition to be deterministic on reruns. Disabling tests or bumping concurrency does not address the root cause.

Frequently Asked Questions

What is the fastest way to pinpoint the SQL that caused a dbt run to fail?

First run dbt build/run with --fail-fast, then open the compiled artifact at target/run/<package>/<model>.sql and execute it directly on the warehouse. Use dbt debug in parallel to rule out connection issues.

How do you prevent recurring No nodes selected errors?

Make it standard practice to confirm targets with dbt ls -s <selector> before running for real. When combining tag, path, and state selectors, be explicit about + (dependencies) and exclude, and standardize the selectors across the repository.

How do you handle the first deployment in an environment without schema creation privileges?

Have an administrator create the schema in advance and grant USAGE / CREATE (table/view) / DML privileges to the execution role. Alternatively, for the first run only, use an admin role with CREATE SCHEMA in an on-run-start hook, then switch to the normal role for ongoing operations.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.