dbt

dbt constraints: NOT NULL, PRIMARY KEY practice and exam prep

2026-04-19
NicheeLab Editorial Team

Some dbt adapters let you attach data warehouse constraints (NOT NULL, PRIMARY KEY, UNIQUE, CHECK, etc.) when models are built. This blocks invalid data from being written at runtime.

However, if you misjudge how constraints, dbt tests (generic tests), and model contracts divide responsibilities, you can unintentionally crash production jobs — or weaken your quality guarantees in the other direction. This article covers how to declare constraints, the differences between adapters, migration playbooks, and exam-relevant takeaways.

Start by separating the roles of constraints and tests

The dbt constraints feature creates database constraints during CREATE/ALTER TABLE when the adapter (Snowflake, Databricks, etc.) supports it. The result: at load time (INSERT/MERGE), invalid data cannot be written.

dbt tests aggregate or scan data after loading to verify quality. Constraints are "prevention", tests are "detection". Model contracts (contracts.enforced) enforce schema consistency (column names and types) — they are separate from value-level constraints.

  • Constraints: the warehouse rejects writes at runtime (NOT NULL, CHECK; PK/UNIQUE behavior varies by adapter)
  • Tests: dbt detects issues downstream (not_null, unique, relationships, etc.)
  • Model contracts: build-time rejection of column name/type drift (value validity is handled separately)
Aspectdbt testsConstraintsModel contracts
Where definedschema.yml (generic tests)Model config or schema.yml (when adapter supports it)Model config (contracts.enforced: true)
When it runsAfter the buildAt DML time, by the warehouseAt build time, validating DDL vs. query
Failure behaviorTest fails (job behavior depends on config)The write errors out and abortsThe build errors out (column name/type mismatch)
Primary purposeContinuous quality detectionBlock invalid data at runtimeEnforce the schema contract

How constraints, tests, and contracts flow inside dbt

DDLDMLcheckSELECTdbt buildCREATE/ALTERwith constraintsWarehouse TableConstraint Enforcementdbt tests (generic)How constraints, tests, and contracts flow inside dbt

How to declare NOT NULL / PRIMARY KEY in dbt

When the adapter supports it, you can declare constraints in the model's config or schema.yml. The declarations are applied during DDL generation. Behavior for existing tables depends on the adapter: in some cases ALTER TABLE attaches the constraint, in others a rebuild (--full-refresh) is required.

The actual enforcement of PRIMARY KEY and UNIQUE differs by warehouse. For example, Snowflake enforces NOT NULL, but PRIMARY KEY/UNIQUE are often informational (not enforced). Databricks (Delta Lake) can enforce NOT NULL and CHECK, but native PRIMARY KEY support is limited.

  • Column-level NOT NULL is a strong runtime validation
  • PRIMARY KEY effectively expresses "unique + not null" intent. On Snowflake it is informational in many cases
  • UNIQUE follows the adapter implementation (informational on Snowflake; Delta may substitute CHECK)

Declaration example (model config and schema.yml)

-- models/customers.sql
{{ config(
  materialized='table',
  constraints=[
    { 'type': 'primary_key', 'columns': ['customer_id'] },
    { 'type': 'not_null',    'columns': ['customer_id', 'email'] },
    { 'type': 'unique',      'columns': ['email'] }
  ],
  contracts={'enforced': true}
) }}
select
  customer_id::number as customer_id,
  email::string       as email,
  created_at::timestamp as created_at
from {{ ref('stg_customers') }};

# schema.yml (also keep tests as a safety net)
models:
  - name: customers
    columns:
      - name: customer_id
        tests:
          - not_null
          - unique
        constraints:
          - type: not_null
          - type: primary_key
      - name: email
        tests:
          - not_null
        constraints:
          - type: not_null

Adapter support differences (Snowflake / Databricks)

Snowflake: NOT NULL is enforced by the engine. PRIMARY KEY and UNIQUE are informational, so they do not necessarily guarantee query optimization or runtime duplicate rejection. In practice, combine NOT NULL constraints with dbt's unique test, or with deduplication logic in the model itself.

Databricks (Delta Lake): NOT NULL and CHECK constraints can be enforced. PRIMARY KEY support is limited, so uniqueness is typically guaranteed by model-side deduplication or by ensuring the merge key is unique.

  • Snowflake: NOT NULL is enforced. Remember that PK/UNIQUE are informational.
  • Databricks Delta: NOT NULL and CHECK are effective. PK is limited.
  • In both cases, use dbt generic tests alongside (for early detection and observability).

CHECK as a stand-in for uniqueness in Delta Lake

-- Databricks Delta example: validate email format or range
{{ config(
  materialized='table',
  constraints=[
    { 'type': 'not_null', 'columns': ['customer_id', 'email'] },
    { 'type': 'check',    'expression': "email LIKE '%@%._%'" }
  ]
) }}
select * from {{ ref('stg_customers') }};

How constraints interact with incremental models

Incremental models often perform MERGE using unique_key, and a NOT NULL constraint on that column acts as a safety net. Because adding or changing a constraint may force a table rebuild on some adapters, plan --full-refresh runs deliberately.

Constraints are usually attached at table creation. Whether an ALTER can add one to a live table — and whether dbt emits that ALTER automatically — depends on the adapter implementation. For change management, validate in staging first, and roll out to production starting with the lowest-risk constraints (NOT NULL).

  • unique_key handles duplicates via update logic. PK/UNIQUE constraint effectiveness varies by adapter.
  • Use --full-refresh to apply constraint definitions reliably (plan for downtime).
  • Pair this with on_schema_change to make your column-addition and sync strategy explicit.

Example config for an incremental model

{{ config(
  materialized='incremental',
  unique_key='customer_id',
  on_schema_change='sync_all_columns',
  constraints=[ { 'type': 'not_null', 'columns': ['customer_id'] } ]
) }}

with src as (
  select * from {{ ref('stg_customers') }}
)
select * from src
{% if is_incremental() %}
  -- incremental filter (e.g. by updated_at)
  where updated_at > (select coalesce(max(updated_at), '1970-01-01') from {{ this }})
{% endif %}

Best practices for migration and operations

If you want to add NOT NULL to an existing table, first finish backfilling or correcting NULLs and confirm dbt's not_null test is green, then attach the constraint. For production, the safe sequence is: apply to staging, then run --full-refresh in production during a low-traffic window.

Expressing intent with PRIMARY KEY or UNIQUE is useful, but in environments where enforcement is limited (e.g. Snowflake), guarantee uniqueness through model-side deduplication (window functions, QUALIFY, etc.) combined with tests.

  • Roll out in low-risk order (tests → NOT NULL → CHECK → PK/UNIQUE declarations).
  • Make tests mandatory in dbt Cloud/CI and visualize the diff before and after adding a constraint.
  • Prepare recovery steps in advance (DROP the constraint, rebuild, etc.).

Deduplication example (keep only unique rows)

with base as (
  select *,
         row_number() over (partition by customer_id order by updated_at desc) as rn
  from {{ ref('stg_customers') }}
)
select * from base where rn = 1

Points the Analytics Engineer exam tends to ask

Being able to verbalize the division of labor among constraints, tests, and contracts is critical. In particular, the adapter differences — Snowflake's PK/UNIQUE being informational, Snowflake enforcing NOT NULL, Delta enforcing NOT NULL/CHECK strongly — show up often.

Beware the trap that "to guarantee uniqueness, just declare PK". In real implementations, model-side deduplication plus tests is usually the most reproducible approach.

  • Constraints prevent, tests detect, contracts enforce schema consistency.
  • Snowflake: NOT NULL is enforced; PK/UNIQUE are informational.
  • Databricks Delta: NOT NULL/CHECK are strong; PK is limited. Guarantee uniqueness via model logic and tests.

Check your understanding

Analytics Engineer

問題 1

You are targeting Snowflake with dbt. You want to disallow NULLs on customer_id in the customers model, block invalid rows at write time, and also preserve quality observability. Which is the most appropriate approach?

  1. Declare a NOT NULL constraint on customer_id in the model and also keep the not_null test in schema.yml
  2. Only set the not_null test in schema.yml and do not use a constraint
  3. Just declare a PRIMARY KEY constraint — duplicates and NULLs will be blocked at runtime
  4. Just enable a model contract (contracts.enforced: true) — NULLs will be blocked as well

正解: A

On Snowflake, NOT NULL is enforced by the engine and blocks NULL writes at runtime. Keeping dbt's not_null test for observability is the most practical setup. PK alone does not guarantee that NULLs or duplicates are blocked at runtime, and contracts enforce column name/type consistency — NULL value checks are handled separately.

Frequently Asked Questions

Should I prioritize dbt constraints or dbt tests?

They serve different roles. Constraints block writes at runtime (prevention), while tests detect issues after the build runs (observability). Use both whenever possible — for NOT NULL in particular, we recommend layering a constraint and a test together.

In Snowflake, will declaring PRIMARY KEY or UNIQUE reject duplicates?

In Snowflake, PRIMARY KEY and UNIQUE are largely informational and do not necessarily reject duplicates at runtime. The common approach is to enforce uniqueness through model-side deduplication logic combined with dbt's unique test.

If I want to add a constraint later, do I have to drop the table?

It depends on the adapter implementation. Some adapters can add constraints via ALTER, but other changes require --full-refresh. Validate in staging first and plan a maintenance window for production.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.