dbt

Mastering dbt on_schema_change: Tracking Schema Changes and Safe Operations

2026-04-19
NicheeLab Editorial Team

The longevity of your modeling depends on its tolerance to schema changes. dbt has an important setting called on_schema_change that controls the behavior when columns are added or removed from incremental models.

This article explains the meaning and selection of each option, caveats for major data warehouses, and exam-prone topics from a hands-on perspective.

on_schema_change Basics and Scope

on_schema_change is a model setting that determines the behavior when the column set of a dbt incremental model differs from that of its target table. The detection targets are column additions and removals. Type changes are generally out of scope, and you must handle DDL separately when needed.

It only applies to incremental materializations. It has no effect on table, view, or ephemeral. Detection is done by comparing the actual columns of the target relation against the column list of the compiled model.

  • Primarily used in combination with the merge incremental strategy
  • Default is ignore (skip and continue)
  • When running full_refresh, consistency is achieved through recreation regardless of on_schema_change

Minimal example: specifying on_schema_change inside a model

{{
  config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='id',
    on_schema_change='append_new_columns'
  )
}}

select
  id,
  event_time,
  value,
  -- new columns expected to be added here in the future
  from {{ ref('stg_events') }}

Option Meanings and Differences: Which to Choose When

The main on_schema_change options are ignore, fail, append_new_columns, and sync_all_columns. The differences boil down to whether new column ingestion and existing column removal are performed automatically, or whether the build is stopped upon detection.

If you want to err on the side of safety, choose fail. To prioritize continuity and accept only new columns, choose append_new_columns. To always treat the model definition as the source of truth, sync_all_columns is the standard in practice.

  • The default is ignore. On exams, this is often asked as a pitfall when not explicitly specified
  • If you need automatic column deletion, use sync_all_columns. However, watch for warehouse constraints
  • In environments with change review processes, fail is commonly used while applying DDL on a planned basis
OptionNew column handlingRemoved column handlingType change handling
ignoreIgnored (DML uses only existing columns)Ignored (remains in target)Out of scope (mostly unsupported)
failStops with error on detectionStops with error on detectionOut of scope (type diffs handled separately)
append_new_columnsAdded via ALTER TABLEIgnored (not dropped)Out of scope (handled separately)
sync_all_columnsAddedDropped (warehouse-dependent)Out of scope (handled separately)

Specifying as a project default (dbt_project.yml)

models:
  marts:
    +materialized: incremental
    +incremental_strategy: merge
    +on_schema_change: sync_all_columns

Behavior Differences and Caveats Across Major Data Warehouses

Snowflake's ALTER TABLE ADD/DROP COLUMN is stable, making both append_new_columns and sync_all_columns easy to use in practice. Column deletion assumes there are no dependent objects.

BigQuery supports ADD/DROP COLUMN. on_schema_change works when the incremental strategy is merge, but when using insert_overwrite it may not work as expected because no table recreation is involved. Use full_refresh together as needed.

Databricks (Delta Lake) is strong with column additions, while column deletion depends on the runtime and table characteristics. Pre-verification is recommended when deleting with sync_all_columns. All work well with the merge strategy.

  • Snowflake: easy to apply all 4 options in practice. Permissions and dependency checks are key for deletion
  • BigQuery: works with merge. May differ from expectations with insert_overwrite, so consider full_refresh
  • Databricks: additions are stable. Deletion is environment-dependent, so phased application and a verification plan are needed

Tip for switching behavior per target (using environment variables)

models:
  +on_schema_change: '{{ env_var("DBT_ON_SCHEMA_CHANGE", "ignore") }}'

# Example: override at runtime
# DBT_ON_SCHEMA_CHANGE=sync_all_columns dbt run -s dim_customer

Compatibility with Incremental Strategies: merge / delete+insert / insert_overwrite

on_schema_change takes effect when the strategy runs DML while keeping the table definition. The typical case is the merge strategy. With ignore, DML is generated by excluding columns that do not exist in the target, so the build succeeds but new columns are not ingested.

With insert_overwrite (especially on BigQuery), since it completes by replacing partition regions, the column addition/removal logic of on_schema_change may not apply as expected. To reliably reflect schema changes, it is safer to use full_refresh together.

  • merge: the main arena for on_schema_change. unique_key is required
  • delete+insert: column additions apply, but watch for warehouse-dependent differences
  • insert_overwrite: automatic schema diff adjustment is limited. Include full_refresh in the plan

Schema diff handling flow during incremental builds

dbt run startFetch existing table → Compute column diffsNo diffPrepare to execute MERGE/INSERTDiff presenton_schema_change branchingignoreDo nothingfailStop with errorappend_new_columnsOnly add new columnssync_all_columnsAdd and remove columnsCompute column diffs and execute DML after applying ALTER/skip per on_schema_change

Typical model with the merge strategy

{{
  config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='order_id',
    on_schema_change='sync_all_columns'
  )
}}
select
  order_id,
  customer_id,
  total_amount,
  updated_at
from {{ ref('stg_orders') }}

Operational Patterns: Safe Application, Rollback, and Auditing

For phased introduction of new columns, default to append_new_columns and switch to sync_all_columns after several cycles to contain the impact. For column deletion, enable sync after impact analysis and identification of dependent objects.

For recovery from failures, first stop with fail, verify the table backup, recreate with --full-refresh if needed, and then return to incremental. Since changes remain in dbt logs, use run_results.json and invocation_id for auditing.

  • Start small and tighten in stages (ignore -> append -> sync)
  • Perform deletions during planned downtime and rehearse beforehand in the verification environment
  • Systematize log storage so that DDL traces remain in run_results.json

Representative operational commands

# Standard incremental build
 dbt run -s fact_sales

# Force rebuild to reliably apply schema
 dbt run --full-refresh -s fact_sales

# Debug for troubleshooting on failure
 dbt debug && dbt ls -s fact_sales && dbt compile -s fact_sales

Exam Prep: Frequently Asked Topics and Pitfalls

The Analytics Engineer exam targets the default value of on_schema_change, differences between options, when full_refresh is needed, and applicability per strategy. The fact that type changes are out of scope and it has no effect on views are also check points.

When using model contracts (contract enforced: true), the build fails on column mismatches. on_schema_change handles automatic DDL application, and contract prohibits schema deviation. Combining them creates strong governance through clear role separation.

  • The default is ignore. Without explicit specification, new columns are not ingested
  • Type changes are out of scope. Plan separately for ALTER TYPE equivalents
  • Using contract together can also fail unapproved column additions

Example of model contracts (fails on column deviation)

models:
  - name: dim_customer
    config:
      contract:
        enforced: true
      materialized: incremental
      incremental_strategy: merge
      on_schema_change: fail
    columns:
      - name: customer_id
        data_type: string
      - name: customer_name
        data_type: string

Check Your Understanding

Analytics Engineer

問題 1

A new column has been added upstream in an incremental model (strategy=merge). You want to ingest the new column without halting operations, but you do not want existing column deletion yet. Which on_schema_change setting is appropriate?

  1. append_new_columns
  2. sync_all_columns
  3. ignore
  4. fail

正解: A

append_new_columns only adds new columns via ALTER TABLE without performing deletion. The build continues, and new columns on past rows are typically filled with NULL. sync_all_columns may also perform deletion and does not match the requirement, ignore does not ingest new columns, and fail stops on detection.

Frequently Asked Questions

Does on_schema_change automatically apply type changes (e.g., STRING to INT)?

No. It targets column additions and removals. Type changes have significant warehouse-specific constraints, and dbt does not convert them automatically. Plan for ALTER COLUMN or full_refresh separately.

Does on_schema_change apply to view or table materializations?

No. on_schema_change is a setting for incremental materializations. Views always reflect the latest schema through recreation, and tables are handled via recreation or full_refresh.

Will the build fail if I change the SQL to reference new columns with the default ignore setting?

Typically not. dbt generates DML by excluding columns that do not exist in the target. However, new columns will not be ingested. To ingest them, use append_new_columns or sync_all_columns.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.