Table Materialization: Full Refresh Patterns (2026)

dbt's table materialization is appealing because of its simplicity: it rebuilds the entire table on every run. On the other hand, cost and locking behavior change depending on data volume and the per-platform replacement strategy, so you need to understand it correctly to design well.

This article systematically covers rebuild patterns, performance characteristics, selection criteria, and operational tips — all grounded in the official behavior Analytics Engineer candidates need to know.

table Materialization Basics: What Happens Every Run

table rebuilds the entire table every time the model is executed. Most adapters either issue CREATE OR REPLACE TABLE or create a temporary table and then atomically rename it, fully replacing the existing table. This way, downstream consumers always see a consistent snapshot.

It is not incremental, so no delta updates happen. Schema changes are applied on rebuild, and dependency resolution follows dbt's DAG. Documentation and grants are applied via post-hooks, grants, or persist_docs settings, as long as the adapter supports them.

Rebuild is the default: every row is regenerated
Replacement is adapter-specific: CREATE OR REPLACE, or create tmp then rename
Downstream gets a consistent view easily, but costs grow for large tables
Schema changes tend to stay in sync naturally thanks to the per-run rebuild

Materialization	Update method	Primary use cases
table	Full rebuild every run	Stable snapshots, aggregates, intermediate artifacts
view	Evaluated on every query	Lightweight logical layer, validation, small reference data
incremental	Insert/merge deltas only	Large data, low-cost iterative updates
ephemeral	No physical entity (inlined as CTE)	Small preprocessing and reusable logic

High-level flow of a table build

Basic table model configuration example (models/orders_rollup.sql)

{{ config(
    materialized = 'table',
    tags = ['daily'],
    persist_docs = {'relation': true, 'columns': true}
) }}

with base as (
  select * from {{ ref('stg_orders') }}
)
select customer_id,
       date_trunc(day, order_ts) as order_date,
       count(*) as order_cnt,
       sum(amount) as order_amount
from base
group by 1,2

Rebuild Semantics and Per-Adapter Replacement Strategies

Even on normal runs, table behaves equivalently to a full refresh. In other words, whether --full-refresh changes behavior or not is mostly an incremental concern; table always rebuilds.

Replacement safety varies by adapter. Where CREATE OR REPLACE is available, replacement is a single statement; where it is not, the pattern is create-tmp then rename, or drop then create, executed inside a transaction. As a result, failure modes such as the old table remaining or a leftover tmp table can occur.

table always does a full build. on_schema_change is for incremental and normally does not apply to table
Snowflake, BigQuery, and Databricks (Delta) can typically use CREATE OR REPLACE
Redshift/Postgres family typically uses tmp then rename, or drop then create
Cleaning up leftover artifacts (tmp tables) after failures is an operational checkpoint

Adapter example	Replacement pattern	Atomicity / transactional considerations
BigQuery	CREATE OR REPLACE TABLE AS SELECT	Single-statement replacement. Grants/metadata are applied after the statement
Snowflake	CREATE OR REPLACE / temporary table then RENAME	The rename is atomic. On failure, the old table typically remains
Redshift/Postgres	Create tmp then RENAME, or DROP then CREATE	Consistency is guaranteed inside a transaction
Databricks (Delta)	CREATE OR REPLACE TABLE AS SELECT	Metadata updates are fast. Optimization is handled separately via OPTIMIZE/VACUUM

Simplified sequence of the tmp-then-rename strategy

Roughly what the compiled SQL looks like (varies by adapter)

-- Example: when CREATE OR REPLACE is supported
create or replace table analytics.orders_rollup as
select ...;
-- Example: when using tmp then rename
create table analytics.orders_rollup__dbt_tmp as select ...;
alter table analytics.orders_rollup__dbt_tmp rename to analytics.orders_rollup;

Performance and Cost: What Helps and What Does Not

Because table does a full rebuild, the volume scanned from input tables and the aggregation work are the main cost and time drivers. The fundamentals of optimization are upstream pre-aggregation, dropping unnecessary columns, and applying filters early. On the physical side, partition/cluster/distribution settings supported by the adapter influence both read and write paths.

However, because it is table, the model SQL must return a complete result set. A design that simply reads only recent data via WHERE is acceptable only when it still satisfies the completeness requirement (for example, when the upstream already holds an accumulated snapshot).

Suppress scan cost via column pruning, pre-aggregation, and early filtering
Use partition_by/cluster_by on BigQuery, CLUSTER BY on Snowflake, and ZORDER/OPTIMIZE on Databricks where appropriate
Schedule runs in the gaps between data arrival and other workloads
Track metrics: bytes scanned, shuffle volume, DML step duration

Lever	How to control (examples)	Caveats
Column/row reduction	Minimize selected columns and apply where early in upstream models	Logic must preserve completeness of the result
Partition/cluster	config(partition_by=..., cluster_by=...)	Watch out for sort/shuffle load during writes
Compute resources	Snowflake: warehouse size; BigQuery: slots / flat-rate	Concentrate compute over a short window to shorten build time
Table maintenance	Databricks: OPTIMIZE/VACUUM; Snowflake: maintenance	Run maintenance outside the build window

Bird's-eye view of performance bottlenecks

Example partitioning/clustering configuration on BigQuery

{{ config(
    materialized = 'table',
    partition_by = { 'field': 'order_date', 'data_type': 'date' },
    cluster_by = ['customer_id']
) }}
select ... from {{ ref('stg_orders') }}

Should You Pick table? Design Criteria vs. incremental

table is simple and robust, but build time and cost grow as data volume grows. It is a strong choice when daily-to-hourly recomputation is acceptable and the cost of a full rebuild stays within a realistic range. Conversely, if data arrives heavily and partial updates can preserve correctness sufficiently, incremental is the better fit.

Also, during phases of frequent schema changes or iterative logic experimentation, table's rebuild makes change propagation easier. Once things stabilize, a hybrid that moves only the hot path to incremental is a practical answer.

Pick table when the target is mid-sized and recomputation cost is justifiable
Consider incremental when latency tolerance is low and data volume is high
When schema changes are frequent, table is easier to manage
Migrate incrementally based on the balance between SLA and cost

Condition	Favors table	Favors incremental
Data volume / arrival rate	Small to medium, daily/hourly	Large, near-real-time
Frequency of schema change	High (synced by per-run rebuild)	Low to medium (delta management is stable)
SLA / latency requirement	Minutes to hours acceptable	Short latency required
Cost optimization	Prioritize operational simplicity	Prioritize compute savings

Simplified decision flow

Example of switching materialization via environment or variable

{% set mat = var('mat_override', 'table') %}
{{ config(materialized = mat) }}
select ... from {{ ref('stg_orders') }}

-- Example run: dbt run --vars "{mat_override: incremental}"

Grants, Docs, Tests: The Operational Fundamentals of Running table

Because table exists as a physical relation, you can configure grants, comments, and column-description persistence. As long as the adapter supports grants and persist_docs, they are automatically re-applied after a rebuild.

Constraints and indexes are adapter-dependent. Where supported, use the constraints or indexes settings, or run DDL via post-hook. Define schema tests in .yml and validate them alongside the model via dbt build.

grants: essential to keep permissions intact after a rebuild
persist_docs: automatically applies table and column comments
Check the adapter's support coverage before using constraints/indexes
Use dbt build to run model creation and tests together

Feature	Example configuration	Notes
Grants	config(grants={'select': ['analyst_role']})	Requires the adapter to support grants
Documentation	persist_docs={'relation': true, 'columns': true}	Effective on platforms where comments can be applied
Constraints / Indexes	constraints / indexes / post-hook	Support depends on the adapter and version

Unified operation via dbt build

Example combining the model with schema tests and grants

{{ config(
    materialized='table',
    grants={'select': ['ANALYST_ROLE']},
    persist_docs={'relation': true, 'columns': true}
) }}
select ...

# models/schema.yml
version: 2
models:
  - name: orders_rollup
    columns:
      - name: customer_id
        tests: [not_null]
      - name: order_date
        tests: [not_null]
      - name: order_amount
        tests: [not_null]

Operations, Failure Response, and Parallelism: Practical Points for Stable Operation

Rebuilds put heavy load on compute, so schedule them in windows that do not overlap with other jobs and allocate warehouses or slots appropriately. Parallel execution is useful within the bounds of dependency resolution, but a flood of concurrent replacements in the same schema can cause lock contention.

Leftover artifacts or half-applied states on failure depend on the replacement strategy. With the tmp-then-rename approach, failing before the rename typically leaves the old table intact and a tmp behind; failing after the rename leaves the new table valid. With CREATE OR REPLACE, the old table usually remains unless the replacement completes.

Place the schedule in a window that does not overlap with other jobs
Use compute scale-up/out for short, focused runs
Monitor and auto-clean leftover tmp tables on failure
Use dbt build with selectors to minimize blast radius on retry

Scenario	Expected state	Response
Failure after tmp creation	Old table survives; tmp is left behind	Automate tmp cleanup
Post-hook failure after rename	New table is valid; hook is unapplied	Retry and idempotency for hooks
CREATE OR REPLACE failure	Old table remains valid	Investigate the root cause and rerun

Timeline of failures and leftover artifacts

Operational examples for selectors and parallelism

# Target only modified models
$ dbt build --select state:modified+

# Cross-dependency parallelism (job parallelism depends on the platform)
$ dbt build --threads 8

Check with a Practice Question

Analytics Engineer

問題 1

Which statement about dbt's table materialization is most accurate? Assume BigQuery.

table writes only deltas, and only does a full rebuild when --full-refresh is passed
table just updates a view every run and does not create a physical table
table does a full rebuild even on normal runs, and on BigQuery it is typically replaced using CREATE OR REPLACE TABLE
table requires on_schema_change; without it, new columns are not added

正解: C

table always performs a full rebuild, and on BigQuery replacement via CREATE OR REPLACE TABLE is the common pattern. Delta updates are incremental's behavior, and on_schema_change is primarily an incremental setting.

Frequently Asked Questions

How are column additions or deletions reflected when using table?

Because table rebuilds the table from scratch every run, the schema of your model SQL becomes the new table definition as-is. If the adapter supports comment propagation, persist_docs also refreshes the descriptions.

How do you control cost when using table on large data sets?

Push pre-aggregation, column pruning, and early filtering upstream, and apply proper partitioning/clustering design. Splitting run windows and scaling compute up briefly to shorten build time also helps. Keep the logic that returns a complete result set, but design away unnecessary recomputation.

Can you safely switch between table and incremental?

Yes. During migration, do a full rebuild with --full-refresh first, then switch to incremental, clarifying keys, on_schema_change, and merge conditions. Conversely, temporarily reverting a complex incremental back to table for validation is also a common pattern.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

The Complete Guide to dbt table Materialization: Rebuild Semantics and Performance for Real-World Design