on_schema_change: Handle Column Drift (2026)

The longevity of your modeling depends on its tolerance to schema changes. dbt has an important setting called on_schema_change that controls the behavior when columns are added or removed from incremental models.

This article explains the meaning and selection of each option, caveats for major data warehouses, and exam-prone topics from a hands-on perspective.

on_schema_change Basics and Scope

on_schema_change is a model setting that determines the behavior when the column set of a dbt incremental model differs from that of its target table. The detection targets are column additions and removals. Type changes are generally out of scope, and you must handle DDL separately when needed.

It only applies to incremental materializations. It has no effect on table, view, or ephemeral. Detection is done by comparing the actual columns of the target relation against the column list of the compiled model.

Primarily used in combination with the merge incremental strategy
Default is ignore (skip and continue)
When running full_refresh, consistency is achieved through recreation regardless of on_schema_change

Minimal example: specifying on_schema_change inside a model

{{
  config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='id',
    on_schema_change='append_new_columns'
  )
}}

select
  id,
  event_time,
  value,
  -- new columns expected to be added here in the future
  from {{ ref('stg_events') }}

Option Meanings and Differences: Which to Choose When

The main on_schema_change options are ignore, fail, append_new_columns, and sync_all_columns. The differences boil down to whether new column ingestion and existing column removal are performed automatically, or whether the build is stopped upon detection.

If you want to err on the side of safety, choose fail. To prioritize continuity and accept only new columns, choose append_new_columns. To always treat the model definition as the source of truth, sync_all_columns is the standard in practice.

The default is ignore. On exams, this is often asked as a pitfall when not explicitly specified
If you need automatic column deletion, use sync_all_columns. However, watch for warehouse constraints
In environments with change review processes, fail is commonly used while applying DDL on a planned basis

Option	New column handling	Removed column handling	Type change handling
ignore	Ignored (DML uses only existing columns)	Ignored (remains in target)	Out of scope (mostly unsupported)
fail	Stops with error on detection	Stops with error on detection	Out of scope (type diffs handled separately)
append_new_columns	Added via ALTER TABLE	Ignored (not dropped)	Out of scope (handled separately)
sync_all_columns	Added	Dropped (warehouse-dependent)	Out of scope (handled separately)

Specifying as a project default (dbt_project.yml)

models:
  marts:
    +materialized: incremental
    +incremental_strategy: merge
    +on_schema_change: sync_all_columns

Behavior Differences and Caveats Across Major Data Warehouses

Snowflake's ALTER TABLE ADD/DROP COLUMN is stable, making both append_new_columns and sync_all_columns easy to use in practice. Column deletion assumes there are no dependent objects.

BigQuery supports ADD/DROP COLUMN. on_schema_change works when the incremental strategy is merge, but when using insert_overwrite it may not work as expected because no table recreation is involved. Use full_refresh together as needed.

Databricks (Delta Lake) is strong with column additions, while column deletion depends on the runtime and table characteristics. Pre-verification is recommended when deleting with sync_all_columns. All work well with the merge strategy.

Snowflake: easy to apply all 4 options in practice. Permissions and dependency checks are key for deletion
BigQuery: works with merge. May differ from expectations with insert_overwrite, so consider full_refresh
Databricks: additions are stable. Deletion is environment-dependent, so phased application and a verification plan are needed

Tip for switching behavior per target (using environment variables)

models:
  +on_schema_change: '{{ env_var("DBT_ON_SCHEMA_CHANGE", "ignore") }}'

# Example: override at runtime
# DBT_ON_SCHEMA_CHANGE=sync_all_columns dbt run -s dim_customer

Compatibility with Incremental Strategies: merge / delete+insert / insert_overwrite

on_schema_change takes effect when the strategy runs DML while keeping the table definition. The typical case is the merge strategy. With ignore, DML is generated by excluding columns that do not exist in the target, so the build succeeds but new columns are not ingested.

With insert_overwrite (especially on BigQuery), since it completes by replacing partition regions, the column addition/removal logic of on_schema_change may not apply as expected. To reliably reflect schema changes, it is safer to use full_refresh together.

merge: the main arena for on_schema_change. unique_key is required
delete+insert: column additions apply, but watch for warehouse-dependent differences
insert_overwrite: automatic schema diff adjustment is limited. Include full_refresh in the plan

Schema diff handling flow during incremental builds

Typical model with the merge strategy

{{
  config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='order_id',
    on_schema_change='sync_all_columns'
  )
}}
select
  order_id,
  customer_id,
  total_amount,
  updated_at
from {{ ref('stg_orders') }}

Operational Patterns: Safe Application, Rollback, and Auditing

For phased introduction of new columns, default to append_new_columns and switch to sync_all_columns after several cycles to contain the impact. For column deletion, enable sync after impact analysis and identification of dependent objects.

For recovery from failures, first stop with fail, verify the table backup, recreate with --full-refresh if needed, and then return to incremental. Since changes remain in dbt logs, use run_results.json and invocation_id for auditing.

Start small and tighten in stages (ignore -> append -> sync)
Perform deletions during planned downtime and rehearse beforehand in the verification environment
Systematize log storage so that DDL traces remain in run_results.json

Representative operational commands

# Standard incremental build
 dbt run -s fact_sales

# Force rebuild to reliably apply schema
 dbt run --full-refresh -s fact_sales

# Debug for troubleshooting on failure
 dbt debug && dbt ls -s fact_sales && dbt compile -s fact_sales

Exam Prep: Frequently Asked Topics and Pitfalls

The Analytics Engineer exam targets the default value of on_schema_change, differences between options, when full_refresh is needed, and applicability per strategy. The fact that type changes are out of scope and it has no effect on views are also check points.

When using model contracts (contract enforced: true), the build fails on column mismatches. on_schema_change handles automatic DDL application, and contract prohibits schema deviation. Combining them creates strong governance through clear role separation.

The default is ignore. Without explicit specification, new columns are not ingested
Type changes are out of scope. Plan separately for ALTER TYPE equivalents
Using contract together can also fail unapproved column additions

Example of model contracts (fails on column deviation)

models:
  - name: dim_customer
    config:
      contract:
        enforced: true
      materialized: incremental
      incremental_strategy: merge
      on_schema_change: fail
    columns:
      - name: customer_id
        data_type: string
      - name: customer_name
        data_type: string

Check Your Understanding

Analytics Engineer

問題 1

A new column has been added upstream in an incremental model (strategy=merge). You want to ingest the new column without halting operations, but you do not want existing column deletion yet. Which on_schema_change setting is appropriate?

append_new_columns
sync_all_columns
ignore
fail

正解: A

append_new_columns only adds new columns via ALTER TABLE without performing deletion. The build continues, and new columns on past rows are typically filled with NULL. sync_all_columns may also perform deletion and does not match the requirement, ignore does not ingest new columns, and fail stops on detection.

Frequently Asked Questions

Does on_schema_change automatically apply type changes (e.g., STRING to INT)?

No. It targets column additions and removals. Type changes have significant warehouse-specific constraints, and dbt does not convert them automatically. Plan for ALTER COLUMN or full_refresh separately.

Does on_schema_change apply to view or table materializations?

No. on_schema_change is a setting for incremental materializations. Views always reflect the latest schema through recreation, and tables are handled via recreation or full_refresh.

Will the build fail if I change the SQL to reference new columns with the default ignore setting?

Typically not. dbt generates DML by excluding columns that do not exist in the target. However, new columns will not be ingested. To ingest them, use append_new_columns or sync_all_columns.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

Mastering dbt on_schema_change: Tracking Schema Changes and Safe Operations