dbt's generic tests are the first step toward declaratively guaranteeing data quality from YAML. This article focuses on the four most common tests — unique / not_null / accepted_values / relationships — and explains their behavior and when to use each one from a practical standpoint.
From an exam perspective, expect questions on each test's parameters, NULL handling, referential integrity definitions, where filters, and severity settings. We stick to the stable, documented behavior so the patterns are reproducible.
dbt's generic tests are validation queries generated by SQL macros against models (tables/views) or their columns. You declare the tests in schema.yml (any YAML file works) and run dbt test — if the query returns failing rows the test fails, and if it returns none it passes.
The four canonical tests are not_null (no missing values), unique (uniqueness), accepted_values (whitelist of allowed values), and relationships (referential integrity). You can scope target rows with where and control failure severity (error/warn) via severity.
| Test | Primary Use Case | What a Failure Indicates | Practical Notes |
|---|---|---|---|
| not_null | Forbid missing values on required keys/attributes | A NULL exists | Design what NULL means. For derived columns, consider whether aggregation logic should compensate |
| unique | Uniqueness of natural keys / surrogate keys | Duplicate values, or includes NULLs | dbt's unique counts NULLs as failures. Pair with explicit not_null to make root-cause analysis easier |
| accepted_values | Whitelist of status codes / category values | A value outside the list, or NULL (when not allowed) | Use quote: false for numerics and booleans. To allow NULL, include null in values |
| relationships | Foreign-key integrity (fact -> dimension) | A referenced key does not exist in the target | Watch join key types/trimming/case. For large fact tables, scope with where to manage cost |
Conceptual flow of a test definition
Minimal schema.yml example (all four tests)
version: 2
models:
- name: dim_customers
columns:
- name: customer_id
tests:
- not_null
- unique
- name: status
tests:
- accepted_values:
values: ['active', 'inactive']
quote: true
- name: fct_orders
columns:
- name: customer_id
tests:
- relationships:
to: ref('dim_customers')
field: customer_id
- not_null
not_null verifies that the target column contains no NULLs. The basic rule is to apply it to primary keys, foreign keys, and operationally required attributes.
NULLs in derived columns (e.g., normalization flags or aggregation outputs) sometimes mean "undetermined", and blindly applying not_null can break operations. Sometimes it's safer to allow NULL and have the dashboard layer fill defaults.
Example: applying where to not_null
version: 2
models:
- name: dim_products
columns:
- name: product_sku
tests:
- not_null:
where: "is_active = true"dbt's unique treats both duplicate values and NULLs as failures (under the hood it combines a NULL-row extraction with a duplicate extraction). Technically, unique on its own already subsumes not_null.
That said, to make failure causes easier to pinpoint, the best practice is to declare both not_null and unique on primary key columns. For composite uniqueness, generate a surrogate key or consider a custom test (or a macro from a proven package).
Pairing unique with not_null (primary-key scenario)
version: 2
models:
- name: dim_dates
columns:
- name: date_key
tests:
- not_null
- uniqueaccepted_values verifies that a column's values belong to a predefined set. It's useful for business codes, statuses, and flag columns. The clearest convention is quote: false for numerics and booleans, quote: true for strings.
To allow NULL, include null in the YAML values array (unquoted null). Case-handling depends on the database's collation, so for stability, normalize the column on the model side before testing if needed.
accepted_values examples (NULL allowed, numeric type)
version: 2
models:
- name: dim_status
columns:
- name: status
tests:
- accepted_values:
values: ['active', 'inactive', null]
quote: true
- name: importance_level
tests:
- accepted_values:
values: [1, 2, 3]
quote: false
where: "is_deprecated = false"relationships verifies that the fact table's foreign key exists as the dimension table's primary key. Use to to specify the referenced model and field to specify the referenced column. When defined inside a columns section, column_name implicitly refers to the current column.
On huge fact tables this test can get expensive, so it's pragmatic to scope it with where (e.g., the last N periods). Key-type mismatches, leading/trailing whitespace, and case differences are classic failure causes.
Reference relationship between fact and dimension tables
relationships YAML example (inside a column definition)
version: 2
models:
- name: fct_orders
columns:
- name: customer_id
tests:
- not_null
- relationships:
to: ref('dim_customers')
field: customer_id
where: "order_date >= dateadd('day', -30, current_date)"In production, the small-start approach (begin with key primary keys and major dimensions) stabilizes things, and codifying a failure-response runbook (contacts, rollback options, tolerance levels) helps even more. In CI, run dbt test per PR and gradually promote severity from warn to error.
On the exam, expect questions on each test's required parameters, NULL handling, the meaning of where / severity / quote, and the reference specification for relationships (to and field). The fact that unique also detects NULL as a failure is a frequent target.
Selective execution and severity tuning (sample commands)
# Run only tests under models changed recently
# (Adjust selectors based on your version and adapter)
dbt test -s state:modified+ type:generic
# Run only unique tests
dbt test -s test_type:unique
# Demote selected tests to warnings (in YAML)
version: 2
models:
- name: fct_orders
tests:
- dbt_utils.expression_is_true:
expression: "total_amount >= 0"
config:
severity: warnAnalytics Engineer
問題 1
You want fct_orders.status to allow 'pending' or 'completed' as well as NULL. Which accepted_values definition is most appropriate?
正解: A
To allow NULL in dbt's accepted_values, include YAML null (unquoted) in values. B uses the string 'null', which is not NULL. C leaves pending/completed unquoted, so they don't match as strings. D's allow_nulls is not a built-in parameter.
Do I need both unique and not_null?
dbt's unique test treats NULLs as failures too, but for clearer error messages and easier root-cause analysis, both practical experience and the exam recommend pairing not_null with unique on primary key columns.
Can relationships validate composite keys (2+ columns)?
The built-in relationships test assumes a single column. For composite keys, the robust options are to generate a surrogate key inside the model and apply relationships to it, or to author a custom test.
How does accepted_values handle case differences?
Comparison behavior depends on the database's collation. For deterministic results, normalize on the model side using lower() or similar, and align accepted_values to lowercase as well.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...