dbt's strength is that it applies software engineering principles to data transformation. In particular, contracts (fixed schemas), ownership and boundaries (groups + access), and compatible evolution (versions) form the core of governance.
This article explains the role and interaction of each feature, common design patterns, and migration cautions, focusing on stable concepts grounded in the official documentation.
To keep improving data products without destabilizing downstream consumers, you need to fix interfaces, clarify team boundaries, and manage breaking changes. dbt expresses this with contracts (fixing the model's output schema), groups + access (declaring ownership and reference boundaries), and versions (compatible evolution).
These directly affect build success/failure and reference permissions, preventing regressions before they happen. From an exam perspective, the precise distinction between terms and the judgment of which setting to use when are key targets.
| Feature | Primary Purpose | Declaration Location (Typical) | Handling of Breaking Changes |
|---|---|---|---|
| contracts | Fix the model output schema | Model config (contract.enforced) plus column names and data types in schema.yml | Do not make breaking changes in place; publish them as a new version |
| groups + access | Declare ownership boundaries and reference policy | groups resource definition + group/access per model | private: refs only from the same group; protected (default): refs from anywhere in the same project; public: refs from any group, package, or project |
| versions | Compatible evolution and parallel operation | Model versions definition + latest_version | For breaking changes, add a new v and migrate in stages |
Minimal configuration overview (skeleton)
# Project skeleton (excerpt)
# Consolidating definitions into files such as models/schema.yml keeps them easy to read
models:
  - name: customers
    group: mart
    access: public
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: string
      - name: country
        data_type: string
    versions:
      - v: 1
        defined_in: customers_v1  # file name only, without path or .sql extension
      - v: 2
        defined_in: customers_v2
    latest_version: 2

groups:
  - name: mart
    owner:
      name: Analytics Mart
      email: mart-owner@example.com  # placeholder address
A contract fixes the column names and data types of a model's output. Set contract.enforced to true under config and declare name and data_type for every column under columns. If the SELECT returns columns or types that do not match the declaration, the build fails: dbt compares the query's output with the contract before it materializes the model, so the mismatch is caught on every adapter that supports contracts. Only the handling of additional database constraints differs by adapter.
In practice, apply contracts to the facade layer (e.g., the mart layer) that creates a trusted downstream interface. When breaking changes (column deletion, type changes) are needed, the safe approach is to combine versions and provide a new version.
Minimal contracts example (schema.yml)
models:
  - name: orders
    group: mart
    access: public
    config:
      materialized: table
      contract:
        enforced: true
    columns:
      - name: order_id
        data_type: string
      - name: order_date
        data_type: date
      - name: amount
        data_type: numeric
-- models/marts/orders.sql
-- The build fails if the selected columns and types do not match the contract above
select
    cast(order_id as string) as order_id,
    cast(order_date as date) as order_date,
    cast(amount as numeric) as amount
from {{ ref('stg_orders') }}
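A group declares which team owns a set of resources and is defined together with owner information. Assign each model to a group and set its exposure level with access: private (refs allowed only from the same group), protected (the default; refs allowed from anywhere in the same project), or public (refs allowed from any group, package, or project). dbt validates these rules when it parses the project, so a forbidden ref fails before any SQL runs.
Although the implementation is simple, the effect is large: it curbs "chaotic references" that cut across groups and projects. As a basic policy, use public for models that are meant to be exposed and private for internal-only models. Because protected is the default, a model without an explicit access value can still be referenced by other groups in the same project, so set private explicitly wherever that is not intended.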
Example of groups and access definitions
# groups.yml
groups:
  - name: staging
    owner:
      name: Data Platform
      email: data-platform@example.com  # placeholder address
  - name: finance
    owner:
      name: Finance Analytics
      email: finance-analytics@example.com  # placeholder address
# models/finance/schema.yml
models:
  - name: fct_revenue
    group: finance
    access: private  # Forbid refs from outside finance
    config:
      contract:
        enforced: true
    columns:
      - name: revenue
        data_type: numeric
# Selection on the consuming side (CI, etc.)
# Build only the finance group
dbt build --select group:finance
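The basic policy described above (private for internal models, public only where exposure is intended) can also be set once per folder instead of model by model. The following is a minimal sketch, assuming a project named `my_project` (a hypothetical name) and a dbt version that accepts `access` as a config in dbt_project.yml, which to my knowledge means 1.7 or later.

```yaml
# dbt_project.yml (excerpt)
# `my_project` is a hypothetical project name; replace it with your own.
models:
  my_project:
    finance:
      +group: finance     # every model under models/finance/ belongs to the finance group
      +access: private    # internal-only unless a model overrides it
```

A model that is meant to be exposed would then override the folder default with `access: public` in its own properties file.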
Versions let you serve several versions of a model side by side. List the v values under versions, set latest_version, and optionally point a version at its SQL file with defined_in (the default file name is <model_name>_v<v>). ref('model') resolves to the latest version by default, and you can pin a specific version with ref('model', version=1).
For breaking changes such as column deletion or type changes, add a new v. Keep the old version for a certain period and decommission it after downstream migration is complete. Combining with contracts keeps each version's interface clear.
Definition and reference example for a versioned model
# models/marts/schema.yml
models:
  - name: customers
    group: mart
    access: public
    config:
      contract:
        enforced: true
    versions:
      - v: 1
        defined_in: customers_v1  # file name only, without path or .sql extension
      - v: 2
        defined_in: customers_v2
    latest_version: 2
    columns:
      - name: customer_id
        data_type: string
-- Reference side
-- Use the latest version
select * from {{ ref('customers') }}
-- Pin to v1 explicitly
select * from {{ ref('customers', version=1) }}
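When a new version differs from the old one by a dropped column, the difference can be declared per version instead of repeating the whole column list. The sketch below assumes the customers model carries the customer_id and country columns used elsewhere in this article and that v2 removes country. The include/exclude form is the one dbt provides for version-level column overrides, but treat the file as an illustration, not a drop-in config.

```yaml
# models/marts/schema.yml (illustrative sketch)
models:
  - name: customers
    latest_version: 1            # keep v1 as latest until consumers have migrated
    config:
      contract:
        enforced: true
    columns:                     # shared definition, inherited by every version
      - name: customer_id
        data_type: string
      - name: country
        data_type: string
    versions:
      - v: 1                     # uses the shared columns unchanged
      - v: 2
        columns:
          - include: all
            exclude: [country]   # v2 drops the column from its contract
```

With no defined_in, dbt looks for customers_v1.sql and customers_v2.sql. Because latest_version is still 1, an unpinned ref('customers') keeps resolving to v1 while early adopters pin version=2.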
In operations, it is realistic to enforce contract.enforced and versioning only on public interfaces and keep private internal models simple. References across group boundaries should, in principle, be limited to public models.
Proceed with migration in this order: "add new v -> phased downstream migration (provide a temporary v1/v2 bridge if needed) -> switch latest -> decommission old v." Separating tests by group and by public model in CI makes failure scope easier to identify.
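The "decommission old v" step can be announced in the project itself. A minimal sketch, assuming the customers model from the earlier examples and a hypothetical retirement date: dbt's deprecation_date property records when a version is scheduled to go away, and dbt emits warnings about deprecated models and the refs that still point at them. It does not remove anything on its own.

```yaml
# models/marts/schema.yml (illustrative sketch)
models:
  - name: customers
    latest_version: 2
    versions:
      - v: 1
        deprecation_date: 2026-06-30   # hypothetical date agreed with downstream owners
      - v: 2
```

Deleting customers_v1.sql and its versions entry remains a manual step once downstream migration is complete.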
Design flow (evolution of public models)
Example CI selectors during migration
# Verify old and new versions in parallel (public models only)
# Build public models that were modified or sit downstream of a modification;
# state:modified needs --state pointing at the artifacts to compare against
dbt build --select "access:public,state:modified+" --state target/production --exclude group:archive
# Limit the old version to smoke tests
dbt test --select customers.v1 --store-failures
On the exam, questions commonly ask you to map terms (a contract fixes the schema, group/access defines reference boundaries, a version absorbs breaking changes) and to judge which one to use in which situation. Make sure you understand concrete YAML, ref resolution rules, and when errors occur.
In practice, "selection and concentration" is effective: be strict only on the public surface. First, set up contracts and versions on public models and fix boundaries with groups and access. Codifying migration rules and communication channels (owner) lowers the social cost of change.
Command snippets for self-check
# Health check for contracts and tests on public models
dbt build --select access:public
# Understand change impact (state comparison)
dbt ls --select state:modified+ --state target/production
# Take stock of group owners (along with docs generation)
dbt docs generate && dbt docs serve
Analytics Engineer
Question 1
For the public model customers provided by the mart group, you want to remove the column country that downstream (another group) references. Which is the most appropriate action?
Correct answer: A
Column deletion is a classic breaking change. Creating a new v with a versioned model, updating contracts, switching latest, and providing a parallel period for staged migration is safe and aligns with dbt's recommended design.
Are contracts enforced the same way on every adapter?
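dbt checks column names and data types against the contract on every adapter that supports contracts, before the model is built. What differs is constraints: whether a given constraint type can be created, and whether the database actually enforces it, depends on the platform. On Snowflake and BigQuery, for example, not_null is enforced, while primary key constraints are accepted as metadata but not enforced. Build once in your actual environment and confirm that a deliberate mismatch fails as expected.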
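To make the adapter-dependent part concrete, here is a minimal sketch that adds a constraint to the orders contract used earlier in this article. The column list is taken from that example; the choice of not_null on order_id is an assumption for illustration.

```yaml
# models/marts/schema.yml (illustrative sketch)
models:
  - name: orders
    config:
      materialized: table
      contract:
        enforced: true            # constraints are applied only when the contract is enforced
    columns:
      - name: order_id
        data_type: string
        constraints:
          - type: not_null        # creation and enforcement depend on the platform
      - name: order_date
        data_type: date
      - name: amount
        data_type: numeric
```

The name and data_type checks behave the same everywhere contracts are supported. The constraints entry is the part worth verifying against your own warehouse.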
Are dbt groups linked to Git teams or permissions?
Groups are metadata inside the dbt project and do not directly link to Git permissions. Record owner information (name, email, etc.) in groups and supplement change flow and review processes with repository-side operations such as CODEOWNERS.
Are versions required on every model?
They are not required. Versions are most effective on public models referenced from outside the group and on models with long-lived downstream dependencies. For private internal models, introduce versions only when needed, balancing the operational cost.
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.