Incremental delete+insert Strategy (2026)

delete+insert is an incremental strategy that first deletes the rows matching the target keys and then inserts the new rows to keep the table consistent. It works reliably in environments where MERGE is not optimal or has limited support.

This article follows the assumptions in the official documentation (dbt docs: Incremental models and each adapter's strategies) and focuses on basic design and operational patterns that are not easily affected by version differences, providing perspectives that help both daily work and the Analytics Engineer exam.

Key Points of the delete+insert Strategy and the Rebuild Mindset

delete+insert identifies target rows by their unique key, deletes the old rows, and inserts the new ones, producing an incremental upsert that preserves eventual consistency. It is useful when a single-statement upsert like MERGE is unavailable or operationally undesirable.

As a rebuild approach, it lets you partially recreate data by explicitly stating the target key range, and it makes rollback easier — when problems occur you can just delete the affected keys and re-insert. Transaction isolation and locking behavior are adapter-dependent, so consider them together with job parallelism and physical design (clustering, partitioning).

Prerequisite: materialized=incremental with a unique_key specified
Control delta extraction with is_incremental()
Recovery on failure: re-delete then re-insert the same keys — highly reversible
Where it fits: historical Redshift implementations, environments with MERGE restrictions, pipelines that emphasize partitioned rebuilds

Strategy	Requirements / Dependencies	Strengths and Fit
delete+insert	unique_key required. Needs DELETE/INSERT support.	Easy to rerun. More I/O. Key selection is critical.
merge	Adapter features like MERGE DML / Delta MERGE.	Fast and consistent upsert. Not available where the feature is unsupported.
insert_overwrite	Explicit control over partitions and clustering.	Partition-level replacement is fast, but the design requirements are strict.

delete+insert Processing Flow (Rebuild-Oriented)

Minimal dbt Incremental Model (delete+insert)

with source as (
  select * from {{ ref('stg_orders') }}
  {% if is_incremental() %}
    where updated_at >= (select coalesce(max(updated_at), '1900-01-01') from {{ this }})
  {% endif %}
),

new_rows as (
  select
    order_id as id,
    customer_id,
    status,
    updated_at
  from source
)

select * from new_rows

-- model config (at top of file, or via config block)
-- {{ config(
--   materialized='incremental',
--   incremental_strategy='delete+insert',
--   unique_key='id'
-- ) }}

Requirements and Prerequisites: unique_key and Delta Control

delete+insert cannot work without a unique_key. Duplicate keys create ambiguity about which rows to delete and become a breeding ground for mistaken deletes and duplicate inserts. The key should be logically unique (composite is fine), and in practice it should be materialized on the target side as an index or cluster key.

Delta extraction is controlled with is_incremental(). Narrow the range by updated_at or a load boundary column and produce the key set that matches the delete range. Account for late-arriving timestamps by widening the window slightly or by clarifying your rerun policy.

unique_key can be single or composite; NULL-free design is strongly preferred
Practical delta conditions include update timestamps, incremental flags, or hash-change detection
Combine unique and not_null dbt tests for validation
Design clustering / sort / indexes on the target to make key lookups fast

Item	Recommendation	Reason
unique_key	Logically unique, no NULLs	To correctly determine the DELETE scope
Delta column	updated_at + a safety window	Absorbs late arrivals and stabilizes reruns
Constraints / physical design	Clustering / partitioning / indexes	Balances DELETE target lookup with INSERT performance

Mapping Delta Extraction to the Key Set

Composite Key with a Look-Back Window Example

{{ config(
  materialized='incremental',
  incremental_strategy='delete+insert',
  unique_key=['id','effective_date']
) }}

with src as (
  select * from {{ ref('stg_line_items') }}
  {% if is_incremental() %}
    where updated_at >= dateadd(day, -1, (select coalesce(max(updated_at),'1900-01-01') from {{ this }}))
  {% endif %}
)
select id, effective_date, qty, price, updated_at from src

Implementation Patterns: Splitting Model Definition and Configuration

Manage configuration with a config block at the top of the model file, or in the models section of dbt_project.yml. When you need to switch strategies between environments (prod vs. dev), branching on vars or target.name makes maintenance easier.

On some adapters MERGE becomes the default, but explicitly specifying delete+insert forces that strategy to be used. Check support status and default behavior in the adapter's official documentation.

Specify the strategy and keys inside the model and enforce consistency with tests
Switching incremental_strategy per environment makes validation easier
Use dbt run-operation as a supplementary tool for maintenance scripts

Configuration location	When to use	Caveats
In-model config	Express model-specific requirements directly	Too many settings make cross-cutting governance hard
dbt_project.yml	Apply patterns in bulk	Override exception models explicitly
Env vars / vars	Switch strategy per environment	Avoid letting the branching get too complex

Strategy Switching by Environment

Default Strategy and Overrides in dbt_project.yml

models:
  project_name:
    +materialized: incremental
    +incremental_strategy: delete+insert
    +unique_key: id
    marts:
      orders:
        +incremental_strategy: delete+insert  # explicit
        +unique_key: [id, effective_date]

Performance and Operational Pitfalls

delete+insert tends to increase I/O. To speed up locating delete targets, combine platform-specific optimizations on the target table: clustering (Snowflake), cluster/sort keys (Redshift), clustering/partitioning (BigQuery), and Z-ORDER or clustering (Databricks Delta).

Be mindful of the locking impact of bulk DELETE and the storage growth from time travel / retention. Operational tricks such as staggering schedules to avoid concurrent runs over the same key range, or sharding the key range per job, go a long way.

Use key-based clustering and partitioning to shrink the DELETE scan range
Make inserts efficient with batched COPY/INSERT (adapter-dependent)
When parallelizing jobs, avoid contention on the same keys (mind pessimistic vs. optimistic locking)
Estimate the cost impact of retention and auto-clustering ahead of time

Optimization	Expected effect	Caveats
Clustering / partition design	Reduces scan for both DELETE and INSERT	Over-partitioning causes small files and overhead
Batch size tuning	Fewer commits and higher throughput	Very long transactions have a higher failure risk
Key-range sharding	Better concurrency	Requires careful key-range management to avoid gaps

Minimize Scan with Key-Aligned Clustering

Clustering / Partitioning Design Examples (Conceptual)

-- Snowflake example (conceptual)
-- create or replace table T cluster by (id);
-- BigQuery example (conceptual)
-- partition by date(updated_at) cluster by id
-- Databricks Delta example (conceptual)
-- optimize T zorder by (id)

Validation and Rerun: A Reversible Rebuild Workflow

A key strength of delete+insert is that failures are easy to recover from by re-deleting and re-inserting the affected keys. Manage the timing of downstream references through model dependencies (ref) and run tests (unique, not_null, functional dedup) at the end of the pipeline to catch inconsistencies early.

A full refresh (--full-refresh) is the last resort for a complete rebuild. First try to identify the affected keys and recover with a partial rerun (--select model_name), and if needed, temporarily widen the rerun window to restore consistency.

Reduce failure granularity to per-key and recover with partial reruns
Layer validation tests across unique, not_null, and functional duplicate checks
In production, watch for read consistency and control cutover timing
Estimate cost and lock time for a full refresh up front

Failure scenario	Behavior	Recommended response
INSERT interrupted	Some keys were not inserted	Rerun the same increment; the upfront DELETE absorbs any duplicates
DELETE interrupted	Old rows remain	Resolved on the next run via another DELETE then INSERT
Missed delta	Lost updates / missing rows	Widen the window or identify affected keys and rerun

Reversible Flow on Failure

Choosing Between Partial Rerun and Full Refresh

# Rerun only the affected model
# dbt run --select my_incremental_model

# Full refresh (last resort)
# dbt run --full-refresh --select my_incremental_model

Exam Prep: Points Targeted on the Analytics Engineer Exam

The exam tests your understanding of when each incremental strategy applies, the fact that unique_key is required for delete+insert, how to use is_incremental(), the meaning of --full-refresh, and conservative judgment about adapter differences.

It is also important to organize the differences between MERGE and delete+insert, along with the prerequisites of insert_overwrite (partition replacement), as a structured comparison.

delete+insert requires unique_key; harden it with delta extraction and tests
MERGE is a single-statement upsert, but its availability is environment-dependent — verify before standardizing on it
insert_overwrite is efficient for partition-level rebuilds
Know when to use a rerun vs. a full refresh on failure

Theme	Keywords	Trap points
Choosing an incremental strategy	unique_key / MERGE / overwrite	delete+insert is impossible without a unique_key
Delta control	is_incremental() / boundary column	Direction of inequality and handling of late arrivals
Rerun	--full-refresh / partial selection	The cost of overusing full refresh

Learning Map Across Strategies

Frequently Tested Configuration Snippet

{{ config(
  materialized='incremental',
  incremental_strategy='delete+insert',
  unique_key='id'
) }}
{% if is_incremental() %}
  -- don't forget the delta condition
{% endif %}

Check Your Understanding

Analytics Engineer

問題 1

Which is the most appropriate primary reason to choose the delete+insert strategy in a dbt incremental model?

Because MERGE support or operational constraints push you toward safe, row-level rebuilds (delete then insert)
Because you want to fully recompute the table every time to maximize aggregation accuracy
Because you want partitions to be created automatically and only the latest partition replaced
Because you want to upsert an event log that has no unique_key as-is

正解: A

delete+insert assumes a unique_key, deletes the matching rows, and then inserts the new ones. It provides a stable upsert even when MERGE is unavailable or undesirable, and reruns are easy. Full recomputation (B) describes full-refresh, partition replacement (C) is insert_overwrite, and skipping unique_key (D) is incompatible.

Frequently Asked Questions

Can delete+insert be used with columns where unique_key may be NULL?

Not recommended. When NULL values are present, the rows targeted for deletion cannot be uniquely identified, leading to duplicates or stale rows. Normalize with coalesce in the model or use a composite key to eliminate NULLs, and enforce this with dbt's unique / not_null tests.

What is the advantage of choosing delete+insert even in environments that support MERGE?

It fits when you value simple reruns (re-delete the affected keys then re-insert) and high observability of the logic. That said, MERGE is often more efficient in most environments, so weigh the I/O and locking impact before choosing.

When should you choose insert_overwrite vs. delete+insert?

If your partition design is clear and the replacement scope can be defined at the partition level, insert_overwrite is faster. If partitioning is hard to design or you need fine-grained, key-level rebuilds, consider delete+insert.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

dbt delete+insert Strategy: A Complete Guide to the Rebuild Approach