Have you ever discovered that the column names, order, or types in a data model silently drifted? dbt Data Contracts (model contracts) prevent exactly that. Enable a contract on a model, and the compiled SQL explicitly pins down the columns and types, catching unexpected columns or type drift early.
This article walks through how to enable contracts, the concrete behavior after compilation, how they combine with incremental models, how they differ from tests and DB constraints, and the points most often asked on the exam — all based on stable behavior aligned with the official docs.
Data Contracts let you declare a dbt model's output schema (column names, order, data types) and enforce it at runtime. When enforced, the compiled SQL selects only the declared columns and applies explicit type casts as needed. This stops upstream-driven surprises like new columns appearing or types changing at the model boundary.
From the Analytics Engineer exam perspective, frequent topics include how to enable a contract, the requirement to declare columns, the adapter-dependent nature of type specification, and how to handle schema changes on incremental models (on_schema_change and full refresh).
Contracts are enabled per model. Set contract.enforced to true in the YAML or model SQL config, and declare every column under columns. When enforcing the contract, you must specify each column's data_type using the right type for your adapter (e.g. BigQuery, Snowflake, Postgres/Redshift).
Column renames or aliases require that the SQL output column name matches columns.name in YAML. Undeclared columns will not appear in the output, and declared columns that are not present in the SQL will cause a runtime error.
Minimal contract implementation (YAML + model SQL)
# schema.yml
models:
- name: orders_mart
config:
contract:
enforced: true
columns:
- name: order_id
data_type: INT64 # e.g. BigQuery
- name: customer_id
data_type: INT64
- name: order_total
data_type: NUMERIC
- name: order_ts
data_type: TIMESTAMP
---
-- models/orders_mart.sql
{{ config(materialized='table', contract={'enforced': true}) }}
with src as (
select * from {{ ref('stg_orders') }}
)
select
order_id,
customer_id,
order_total,
order_ts
from srcWhen the contract is enforced, dbt wraps the original query in an internal CTE and the outer query selects only the declared columns in the declared order and type. The outer SELECT applies implicit/explicit casts and drops undeclared columns. Declared columns that are missing from the inner query fail at the database as undefined column references.
Adapters differ in detail, but the common theme is: the outer SELECT pins columns, order, and types. Even for views, column types depend on the SELECT list, so the cast fixes the types.
What enforcement looks like at compile time (simplified SQL)
WITH model__dbt_internal AS (
-- Original query (may be affected by upstream changes)
SELECT * FROM stg_orders
)
SELECT
CAST(order_id AS INT64) AS order_id,
CAST(customer_id AS INT64) AS customer_id,
CAST(order_total AS NUMERIC) AS order_total,
CAST(order_ts AS TIMESTAMP) AS order_ts
FROM model__dbt_internalContracts also work on incremental materializations. However, physical schema changes to existing tables are a separate problem. Contracts logically pin output columns and types, but actual column additions or type changes to the underlying table must be combined with on_schema_change or a full refresh.
Here are the typical operational patterns. When the change is large or involves type changes, a full refresh is the safest path.
Even though all three are about 'quality', contracts, tests, and DB constraints serve different roles. Contracts pin the schema boundary, tests verify data properties, and DB constraints enforce storage-level consistency. Knowing when to use each makes design intent clear.
| Item | Purpose / Target | Failure Timing | Typical Configuration / Example |
|---|---|---|---|
| Data Contracts | Pin output schema (columns, order, type) | At model execution (detected by the compiled SELECT) | config: contract.enforced=true plus data_type on YAML columns |
| dbt tests | Data properties like NOT NULL, UNIQUE, relationships | dbt test (or the test phase of dbt build) | tests: not_null, unique, relationships, etc. |
| DB constraints | Storage consistency such as NOT NULL, PK, FK, CHECK | At DML/DDL time (depends on adapter implementation) | Constraint support is adapter-dependent (e.g. NOT NULL, etc.) |
To avoid common implementation pitfalls, run through this pre-shipping checklist.
Analytics Engineer
問題 1
You want to pin the column names, order, and types of a model delivered downstream so that new upstream columns do not break the downstream schema, and you want missing columns or type mismatches to fail at model execution. Which approach is most appropriate?
正解: A
Enforcing the contract makes the compiled outer SELECT emit only the declared columns and types. Undeclared columns are dropped, and missing columns or type mismatches fail at runtime. Tests verify content; they do not pin the schema.
What can contracts be applied to? Do they work on sources or snapshots?
Contracts apply to dbt models. Sources are declarations of external relations and are not subject to contracts. Snapshots are a history-tracking mechanism and are not the primary target of contracts either.
How do you specify data_type in a multi-adapter environment?
data_type is adapter-specific. When environments differ, use env vars, Jinja conditionals, or macros for type aliases to emit the right type name per adapter (e.g. INT64/STRING, NUMBER/VARCHAR).
What happens when a new column is added upstream?
If the contract is enforced, the new column will not appear in the output (the outer SELECT drops it). To bring it in, update the contract by adding the column to YAML with a data_type, and consider on_schema_change for incremental models or a full refresh if needed.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...