dbt

dbt Materializations: Choosing view / table / incremental / ephemeral for Work and the Exam

2026-04-19
NicheeLab Editorial Team

dbt's materialization controls how a model is persisted as a database object and when it gets refreshed. Pick the wrong one and you pay for it in cost, latency, and availability.

This article walks through view / table / incremental / ephemeral — how they differ and when to use each — from both the practical engineering side and the Analytics Engineer exam perspective.

dbt Materialization Basics and How to Choose

dbt lets you switch the materialized type per model. That changes the kind of database object created (or not created), the refresh granularity, and the cost/permission profile. Start by understanding the four types, then choose based on your requirements: recompute cost, latency, downstream dependencies, and permissions.

As a rough guide: use view / ephemeral for lightweight exploration and intermediate steps, table for stable serving tables, and incremental for fact tables and huge tables that need continuous incremental updates. That said, partitioning, clustering, and supported strategies (merge / append / insert_overwrite) are adapter-specific, so check your data platform's support matrix (the dbt docs and each adapter's documentation are the source of truth).

  • Small, frequently queried, cheap to recompute → view
  • Stable serving copy, used like a snapshot → table
  • Huge, expensive to fully recompute, apply diffs by key or timestamp → incremental
  • Short-lived intermediate shaping, don't want to add a database object → ephemeral
MaterializationDB objectRefresh modelCost / performance profile
viewVIEWRecomputed on every queryZero storage; compute cost on every read
tableTABLEFully rebuilt on every dbt run (default)Uses storage; reads are fast and stable
incrementalTABLEApplies only the delta (append / merge / insert_overwrite, etc.)Avoids full recompute for major efficiency gains
ephemeralNone (inlined as a CTE)Embedded into the downstream query at compile timeNo extra DB object; speed depends on the optimizer

Conceptual view of the four materializations and their dependencies

sourcestaging_eephemeral (CTE)model_finaltable/incconsumersviewad-hoc queriesephemeral expands into model_final's SQL as a CTE; view is computed on read; table/incremental are created/updated at run time

Project defaults plus per-model overrides

# dbt_project.yml (excerpt)
models:
  my_project:
    +materialized: view    # set project default to view
    marts:
      +materialized: table # override to table for this subpath

# models/stg_orders.sql (per-model override)
{{ config(materialized='ephemeral') }}
select ...

# models/fct_orders.sql
{{ config(materialized='incremental', unique_key='order_id', incremental_strategy='merge') }}
select ... -- typically scope rows with is_incremental()

view: Lightweight and Flexible, Computed on Read

view creates a logical view via CREATE VIEW. It uses no storage and queries flow through to the underlying tables on read. A good fit for early development and small intermediate aggregations.

The catch: stacking heavy aggregations across multiple views means upstream gets recomputed on every downstream query, slowing things down. If you need consistent serving performance, materialize as table instead. Some adapters offer options like late binding views (deferring dependency resolution), which gives different tolerance to schema evolution errors. Check the adapter docs as needed.

  • Ideal for exploration and small intermediate steps
  • Permissions flow through: granting on the view alone may not be enough — the user often needs select on the underlying tables too
  • Promote to table when downstream needs consistent performance

Minimal view model

{{ config(materialized='view') }}
select
  o.id,
  o.created_at,
  c.country
from {{ ref('raw_orders') }} as o
left join {{ ref('dim_customers') }} as c on o.customer_id = c.id

table: Rebuilt for Stable Serving, Fast Reads

table is built with CREATE TABLE AS SELECT on each dbt run and rebuilt every time the model runs (the default). It suits workloads that prioritize read performance and downstream stability.

If the cost of full rebuilds becomes non-trivial, consider moving to incremental. Partition/cluster settings are adapter-specific (BigQuery's partition_by/cluster_by, Snowflake clustering keys, etc.). When available they let you balance read cost against rebuild cost.

  • Fits final serving tables where stable performance for scheduled distribution or BI matters
  • Full rebuilds are simple but watch the cost and runtime
  • Consider incremental for large volumes or high update frequency

table model (cluster settings are adapter-specific)

{{ config(materialized='table') }}
select * from {{ ref('int_orders_enriched') }}

incremental: Strategies and Design Guidance for Applying Only the Delta

incremental creates the table on the first run, then applies only the delta on subsequent runs. The delta strategy (incremental_strategy) varies by adapter and typically includes append, merge (upsert by key), and insert_overwrite (partition-level replacement). For merge, specifying unique_key is critical.

Filter the delta with the is_incremental() macro, gating on updated_at or a load watermark. To trigger a full recompute, run dbt run --full-refresh. When schema evolution happens (e.g., new columns), always check the on_schema_change setting (ignore, fail, append_new_columns, sync_all_columns, etc.; adapter support varies).

  • unique_key is required for merge. It's not needed for append, but you'll have to manage duplicates separately
  • Inside is_incremental(), gate with conditions like updated_at >= max processed timestamp
  • Pair with partitioning/clustering to optimize I/O (adapter-dependent)
  • Use --full-refresh to safely rebuild from scratch — bake it into your operations runbook

Typical merge-strategy pattern (with key-based updates)

{{ config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='merge'
) }}

with src as (
  select * from {{ ref('stg_orders') }}
  {% if is_incremental() %}
    where updated_at >= (
      select coalesce(max(updated_at), '1900-01-01') from {{ this }}
    )
  {% endif %}
)

select * from src

ephemeral: Intermediate Models Inlined as CTEs

ephemeral creates no database object — it expands inline as a common table expression (CTE) in the downstream model's SQL. Useful when you want intermediate steps split into separate files without adding objects to the database.

The catch: cramming heavy processing into many ephemerals leads to bloated single queries that are hard for the optimizer to handle. You also can't reference them from external tools or grant on them. To inspect behavior, dbt compile and look at the generated SQL — it's the fastest path.

  • Fits short-lived cleaning, filtering, and normalization steps
  • Cannot be referenced externally, granted on, or indexed
  • Push heavy processing into table or view to spread the load

ephemeral model and its consumer

-- models/_int_orders_ephemeral.sql
{{ config(materialized='ephemeral') }}
select *
from {{ ref('stg_orders_raw') }}
where is_valid = true

-- models/fct_orders.sql (the consumer — ephemeral expands here as a CTE)
{{ config(materialized='table') }}
with cleaned as (
  select * from {{ ref('_int_orders_ephemeral') }}
)
select * from cleaned

Operational Design and the Analytics Engineer Exam Angle

The exam frequently asks which materialization fits which requirement, the role of unique_key in delta strategies, what --full-refresh actually does, and the characteristics of ephemeral. In practice, the safe pattern is to start with view/ephemeral during early development, then promote to table/incremental once things stabilize.

To switch materialization by environment (dev/prod), branch config on a variable or target.name. Plan promotion and demotion deliberately, with an eye on dependency fan-out, recompute cost, and the permission model.

  • Set defaults at the project/folder level; override exceptions at the model level
  • Design large models for incremental from day one — pin down the key and the watermark
  • Check adapter support differences for on_schema_change when columns are added
  • Run commands: dbt run, dbt run --full-refresh, and partial runs via selectors (e.g., dbt run -s tag:incremental)

Example: switching materialization by environment

{{ config(
    materialized= (target.name == 'prod') and 'incremental' or 'view',
    unique_key= (target.name == 'prod') and 'id' or none
) }}
select * from {{ ref('stg_items') }}

Check Your Understanding

Analytics Engineer

問題 1

You need to update a huge fact table daily. The rows to update can be identified by updated_at, and existing rows must be overwritten. Which design best minimizes runtime and cost while preventing duplicates?

  1. Make the model incremental with incremental_strategy=merge and unique_key set; inside is_incremental(), scope the delta by updated_at
  2. Make the model table and fully rebuild every run; control the delta on the application side
  3. Make the model view and compute on every read; let downstream tools handle the updated_at filter
  4. Make every intermediate step ephemeral and ship the final layer as ephemeral too

正解: A

When you need delta updates plus overwriting existing rows, incremental with the merge strategy and a unique_key is the standard answer. Filtering the delta by updated_at inside is_incremental() keeps runtime and cost down while preventing duplicates. table fully rebuilds every run (inefficient), view pays compute on every read and makes it hard to keep performance consistent, and ephemeral never produces a final serving object.

Frequently Asked Questions

How do I switch between view in development and incremental in production?

Branch on target.name (the environment name) or vars inside config. For example: {{ config(materialized=(target.name=='prod') and 'incremental' or 'view', unique_key=(target.name=='prod') and 'id' or none) }} produces incremental in prod and view elsewhere.

What happens when a column is added to an incremental model?

Behavior depends on on_schema_change and adapter support. Options include ignore, fail, append_new_columns, and sync_all_columns, with availability varying by adapter. To apply changes reliably, use --full-refresh, or set append_new_columns / sync_all_columns when supported.

When should I choose ephemeral vs. view?

ephemeral inlines intermediate steps as CTEs and adds no database object. It fits light processing that does not need direct downstream reference or grants. view exposes the model directly and suits shared logic or access control. Use table or incremental when you need heavy processing or stable performance.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.