delete+insert is an incremental strategy that first deletes the rows matching the target keys and then inserts the new rows to keep the table consistent. It works reliably in environments where MERGE is not optimal or has limited support.
This article follows the assumptions in the official documentation (dbt docs: Incremental models and each adapter's strategies) and focuses on basic design and operational patterns that are not easily affected by version differences, providing perspectives that help both daily work and the Analytics Engineer exam.
delete+insert identifies target rows by their unique key, deletes the old rows, and inserts the new ones, producing an incremental upsert that preserves eventual consistency. It is useful when a single-statement upsert like MERGE is unavailable or operationally undesirable.
As a rebuild approach, it lets you partially recreate data by explicitly stating the target key range, and it makes rollback easier — when problems occur you can just delete the affected keys and re-insert. Transaction isolation and locking behavior are adapter-dependent, so consider them together with job parallelism and physical design (clustering, partitioning).
| Strategy | Requirements / Dependencies | Strengths and Fit |
|---|---|---|
| delete+insert | unique_key required. Needs DELETE/INSERT support. | Easy to rerun. More I/O. Key selection is critical. |
| merge | Adapter features like MERGE DML / Delta MERGE. | Fast and consistent upsert. Not available where the feature is unsupported. |
| insert_overwrite | Explicit control over partitions and clustering. | Partition-level replacement is fast, but the design requirements are strict. |
delete+insert Processing Flow (Rebuild-Oriented)
Minimal dbt Incremental Model (delete+insert)
with source as (
select * from {{ ref('stg_orders') }}
{% if is_incremental() %}
where updated_at >= (select coalesce(max(updated_at), '1900-01-01') from {{ this }})
{% endif %}
),
new_rows as (
select
order_id as id,
customer_id,
status,
updated_at
from source
)
select * from new_rows
-- model config (at top of file, or via config block)
-- {{ config(
-- materialized='incremental',
-- incremental_strategy='delete+insert',
-- unique_key='id'
-- ) }}
delete+insert cannot work without a unique_key. Duplicate keys create ambiguity about which rows to delete and become a breeding ground for mistaken deletes and duplicate inserts. The key should be logically unique (composite is fine), and in practice it should be materialized on the target side as an index or cluster key.
Delta extraction is controlled with is_incremental(). Narrow the range by updated_at or a load boundary column and produce the key set that matches the delete range. Account for late-arriving timestamps by widening the window slightly or by clarifying your rerun policy.
| Item | Recommendation | Reason |
|---|---|---|
| unique_key | Logically unique, no NULLs | To correctly determine the DELETE scope |
| Delta column | updated_at + a safety window | Absorbs late arrivals and stabilizes reruns |
| Constraints / physical design | Clustering / partitioning / indexes | Balances DELETE target lookup with INSERT performance |
Mapping Delta Extraction to the Key Set
Composite Key with a Look-Back Window Example
{{ config(
materialized='incremental',
incremental_strategy='delete+insert',
unique_key=['id','effective_date']
) }}
with src as (
select * from {{ ref('stg_line_items') }}
{% if is_incremental() %}
where updated_at >= dateadd(day, -1, (select coalesce(max(updated_at),'1900-01-01') from {{ this }}))
{% endif %}
)
select id, effective_date, qty, price, updated_at from src
Manage configuration with a config block at the top of the model file, or in the models section of dbt_project.yml. When you need to switch strategies between environments (prod vs. dev), branching on vars or target.name makes maintenance easier.
On some adapters MERGE becomes the default, but explicitly specifying delete+insert forces that strategy to be used. Check support status and default behavior in the adapter's official documentation.
| Configuration location | When to use | Caveats |
|---|---|---|
| In-model config | Express model-specific requirements directly | Too many settings make cross-cutting governance hard |
| dbt_project.yml | Apply patterns in bulk | Override exception models explicitly |
| Env vars / vars | Switch strategy per environment | Avoid letting the branching get too complex |
Strategy Switching by Environment
Default Strategy and Overrides in dbt_project.yml
models:
project_name:
+materialized: incremental
+incremental_strategy: delete+insert
+unique_key: id
marts:
orders:
+incremental_strategy: delete+insert # explicit
+unique_key: [id, effective_date]
delete+insert tends to increase I/O. To speed up locating delete targets, combine platform-specific optimizations on the target table: clustering (Snowflake), cluster/sort keys (Redshift), clustering/partitioning (BigQuery), and Z-ORDER or clustering (Databricks Delta).
Be mindful of the locking impact of bulk DELETE and the storage growth from time travel / retention. Operational tricks such as staggering schedules to avoid concurrent runs over the same key range, or sharding the key range per job, go a long way.
| Optimization | Expected effect | Caveats |
|---|---|---|
| Clustering / partition design | Reduces scan for both DELETE and INSERT | Over-partitioning causes small files and overhead |
| Batch size tuning | Fewer commits and higher throughput | Very long transactions have a higher failure risk |
| Key-range sharding | Better concurrency | Requires careful key-range management to avoid gaps |
Minimize Scan with Key-Aligned Clustering
Clustering / Partitioning Design Examples (Conceptual)
-- Snowflake example (conceptual)
-- create or replace table T cluster by (id);
-- BigQuery example (conceptual)
-- partition by date(updated_at) cluster by id
-- Databricks Delta example (conceptual)
-- optimize T zorder by (id)
A key strength of delete+insert is that failures are easy to recover from by re-deleting and re-inserting the affected keys. Manage the timing of downstream references through model dependencies (ref) and run tests (unique, not_null, functional dedup) at the end of the pipeline to catch inconsistencies early.
A full refresh (--full-refresh) is the last resort for a complete rebuild. First try to identify the affected keys and recover with a partial rerun (--select model_name), and if needed, temporarily widen the rerun window to restore consistency.
| Failure scenario | Behavior | Recommended response |
|---|---|---|
| INSERT interrupted | Some keys were not inserted | Rerun the same increment; the upfront DELETE absorbs any duplicates |
| DELETE interrupted | Old rows remain | Resolved on the next run via another DELETE then INSERT |
| Missed delta | Lost updates / missing rows | Widen the window or identify affected keys and rerun |
Reversible Flow on Failure
Choosing Between Partial Rerun and Full Refresh
# Rerun only the affected model
# dbt run --select my_incremental_model
# Full refresh (last resort)
# dbt run --full-refresh --select my_incremental_model
The exam tests your understanding of when each incremental strategy applies, the fact that unique_key is required for delete+insert, how to use is_incremental(), the meaning of --full-refresh, and conservative judgment about adapter differences.
It is also important to organize the differences between MERGE and delete+insert, along with the prerequisites of insert_overwrite (partition replacement), as a structured comparison.
| Theme | Keywords | Trap points |
|---|---|---|
| Choosing an incremental strategy | unique_key / MERGE / overwrite | delete+insert is impossible without a unique_key |
| Delta control | is_incremental() / boundary column | Direction of inequality and handling of late arrivals |
| Rerun | --full-refresh / partial selection | The cost of overusing full refresh |
Learning Map Across Strategies
Frequently Tested Configuration Snippet
{{ config(
materialized='incremental',
incremental_strategy='delete+insert',
unique_key='id'
) }}
{% if is_incremental() %}
-- don't forget the delta condition
{% endif %}
Analytics Engineer
問題 1
Which is the most appropriate primary reason to choose the delete+insert strategy in a dbt incremental model?
正解: A
delete+insert assumes a unique_key, deletes the matching rows, and then inserts the new ones. It provides a stable upsert even when MERGE is unavailable or undesirable, and reruns are easy. Full recomputation (B) describes full-refresh, partition replacement (C) is insert_overwrite, and skipping unique_key (D) is incompatible.
Can delete+insert be used with columns where unique_key may be NULL?
Not recommended. When NULL values are present, the rows targeted for deletion cannot be uniquely identified, leading to duplicates or stale rows. Normalize with coalesce in the model or use a composite key to eliminate NULLs, and enforce this with dbt's unique / not_null tests.
What is the advantage of choosing delete+insert even in environments that support MERGE?
It fits when you value simple reruns (re-delete the affected keys then re-insert) and high observability of the logic. That said, MERGE is often more efficient in most environments, so weigh the I/O and locking impact before choosing.
When should you choose insert_overwrite vs. delete+insert?
If your partition design is clear and the replacement scope can be defined at the partition level, insert_overwrite is faster. If partitioning is hard to design or you need fine-grained, key-level rebuilds, consider delete+insert.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...