dbt's table materialization is appealing because of its simplicity: it rebuilds the entire table on every run. On the other hand, cost and locking behavior change depending on data volume and the per-platform replacement strategy, so you need to understand it correctly to design well.
This article systematically covers rebuild patterns, performance characteristics, selection criteria, and operational tips — all grounded in the official behavior Analytics Engineer candidates need to know.
table rebuilds the entire table every time the model is executed. Most adapters either issue CREATE OR REPLACE TABLE or create a temporary table and then atomically rename it, fully replacing the existing table. This way, downstream consumers always see a consistent snapshot.
It is not incremental, so no delta updates happen. Schema changes are applied on rebuild, and dependency resolution follows dbt's DAG. Documentation and grants are applied via post-hooks, grants, or persist_docs settings, as long as the adapter supports them.
| Materialization | Update method | Primary use cases |
|---|---|---|
| table | Full rebuild every run | Stable snapshots, aggregates, intermediate artifacts |
| view | Evaluated on every query | Lightweight logical layer, validation, small reference data |
| incremental | Insert/merge deltas only | Large data, low-cost iterative updates |
| ephemeral | No physical entity (inlined as CTE) | Small preprocessing and reusable logic |
High-level flow of a table build
Basic table model configuration example (models/orders_rollup.sql)
{{ config(
materialized = 'table',
tags = ['daily'],
persist_docs = {'relation': true, 'columns': true}
) }}
with base as (
select * from {{ ref('stg_orders') }}
)
select customer_id,
date_trunc(day, order_ts) as order_date,
count(*) as order_cnt,
sum(amount) as order_amount
from base
group by 1,2Even on normal runs, table behaves equivalently to a full refresh. In other words, whether --full-refresh changes behavior or not is mostly an incremental concern; table always rebuilds.
Replacement safety varies by adapter. Where CREATE OR REPLACE is available, replacement is a single statement; where it is not, the pattern is create-tmp then rename, or drop then create, executed inside a transaction. As a result, failure modes such as the old table remaining or a leftover tmp table can occur.
| Adapter example | Replacement pattern | Atomicity / transactional considerations |
|---|---|---|
| BigQuery | CREATE OR REPLACE TABLE AS SELECT | Single-statement replacement. Grants/metadata are applied after the statement |
| Snowflake | CREATE OR REPLACE / temporary table then RENAME | The rename is atomic. On failure, the old table typically remains |
| Redshift/Postgres | Create tmp then RENAME, or DROP then CREATE | Consistency is guaranteed inside a transaction |
| Databricks (Delta) | CREATE OR REPLACE TABLE AS SELECT | Metadata updates are fast. Optimization is handled separately via OPTIMIZE/VACUUM |
Simplified sequence of the tmp-then-rename strategy
Roughly what the compiled SQL looks like (varies by adapter)
-- Example: when CREATE OR REPLACE is supported
create or replace table analytics.orders_rollup as
select ...;
-- Example: when using tmp then rename
create table analytics.orders_rollup__dbt_tmp as select ...;
alter table analytics.orders_rollup__dbt_tmp rename to analytics.orders_rollup;Because table does a full rebuild, the volume scanned from input tables and the aggregation work are the main cost and time drivers. The fundamentals of optimization are upstream pre-aggregation, dropping unnecessary columns, and applying filters early. On the physical side, partition/cluster/distribution settings supported by the adapter influence both read and write paths.
However, because it is table, the model SQL must return a complete result set. A design that simply reads only recent data via WHERE is acceptable only when it still satisfies the completeness requirement (for example, when the upstream already holds an accumulated snapshot).
| Lever | How to control (examples) | Caveats |
|---|---|---|
| Column/row reduction | Minimize selected columns and apply where early in upstream models | Logic must preserve completeness of the result |
| Partition/cluster | config(partition_by=..., cluster_by=...) | Watch out for sort/shuffle load during writes |
| Compute resources | Snowflake: warehouse size; BigQuery: slots / flat-rate | Concentrate compute over a short window to shorten build time |
| Table maintenance | Databricks: OPTIMIZE/VACUUM; Snowflake: maintenance | Run maintenance outside the build window |
Bird's-eye view of performance bottlenecks
Example partitioning/clustering configuration on BigQuery
{{ config(
materialized = 'table',
partition_by = { 'field': 'order_date', 'data_type': 'date' },
cluster_by = ['customer_id']
) }}
select ... from {{ ref('stg_orders') }}table is simple and robust, but build time and cost grow as data volume grows. It is a strong choice when daily-to-hourly recomputation is acceptable and the cost of a full rebuild stays within a realistic range. Conversely, if data arrives heavily and partial updates can preserve correctness sufficiently, incremental is the better fit.
Also, during phases of frequent schema changes or iterative logic experimentation, table's rebuild makes change propagation easier. Once things stabilize, a hybrid that moves only the hot path to incremental is a practical answer.
| Condition | Favors table | Favors incremental |
|---|---|---|
| Data volume / arrival rate | Small to medium, daily/hourly | Large, near-real-time |
| Frequency of schema change | High (synced by per-run rebuild) | Low to medium (delta management is stable) |
| SLA / latency requirement | Minutes to hours acceptable | Short latency required |
| Cost optimization | Prioritize operational simplicity | Prioritize compute savings |
Simplified decision flow
Example of switching materialization via environment or variable
{% set mat = var('mat_override', 'table') %}
{{ config(materialized = mat) }}
select ... from {{ ref('stg_orders') }}
-- Example run: dbt run --vars "{mat_override: incremental}"Because table exists as a physical relation, you can configure grants, comments, and column-description persistence. As long as the adapter supports grants and persist_docs, they are automatically re-applied after a rebuild.
Constraints and indexes are adapter-dependent. Where supported, use the constraints or indexes settings, or run DDL via post-hook. Define schema tests in .yml and validate them alongside the model via dbt build.
| Feature | Example configuration | Notes |
|---|---|---|
| Grants | config(grants={'select': ['analyst_role']}) | Requires the adapter to support grants |
| Documentation | persist_docs={'relation': true, 'columns': true} | Effective on platforms where comments can be applied |
| Constraints / Indexes | constraints / indexes / post-hook | Support depends on the adapter and version |
Unified operation via dbt build
Example combining the model with schema tests and grants
{{ config(
materialized='table',
grants={'select': ['ANALYST_ROLE']},
persist_docs={'relation': true, 'columns': true}
) }}
select ...
# models/schema.yml
version: 2
models:
- name: orders_rollup
columns:
- name: customer_id
tests: [not_null]
- name: order_date
tests: [not_null]
- name: order_amount
tests: [not_null]Rebuilds put heavy load on compute, so schedule them in windows that do not overlap with other jobs and allocate warehouses or slots appropriately. Parallel execution is useful within the bounds of dependency resolution, but a flood of concurrent replacements in the same schema can cause lock contention.
Leftover artifacts or half-applied states on failure depend on the replacement strategy. With the tmp-then-rename approach, failing before the rename typically leaves the old table intact and a tmp behind; failing after the rename leaves the new table valid. With CREATE OR REPLACE, the old table usually remains unless the replacement completes.
| Scenario | Expected state | Response |
|---|---|---|
| Failure after tmp creation | Old table survives; tmp is left behind | Automate tmp cleanup |
| Post-hook failure after rename | New table is valid; hook is unapplied | Retry and idempotency for hooks |
| CREATE OR REPLACE failure | Old table remains valid | Investigate the root cause and rerun |
Timeline of failures and leftover artifacts
Operational examples for selectors and parallelism
# Target only modified models
$ dbt build --select state:modified+
# Cross-dependency parallelism (job parallelism depends on the platform)
$ dbt build --threads 8Analytics Engineer
問題 1
Which statement about dbt's table materialization is most accurate? Assume BigQuery.
正解: C
table always performs a full rebuild, and on BigQuery replacement via CREATE OR REPLACE TABLE is the common pattern. Delta updates are incremental's behavior, and on_schema_change is primarily an incremental setting.
How are column additions or deletions reflected when using table?
Because table rebuilds the table from scratch every run, the schema of your model SQL becomes the new table definition as-is. If the adapter supports comment propagation, persist_docs also refreshes the descriptions.
How do you control cost when using table on large data sets?
Push pre-aggregation, column pruning, and early filtering upstream, and apply proper partitioning/clustering design. Splitting run windows and scaling compute up briefly to shorten build time also helps. Keep the logic that returns a complete result set, but design away unnecessary recomputation.
Can you safely switch between table and incremental?
Yes. During migration, do a full rebuild with --full-refresh first, then switch to incremental, clarifying keys, on_schema_change, and merge conditions. Conversely, temporarily reverting a complex incremental back to table for validation is also a common pattern.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...