dbt Tests: Generic, Singular, Source Freshness (2026)

dbt tests come in two main flavors: generic tests, declared reusably in YAML, and singular tests, written in SQL that returns each offending row.

On the exam, knowing where each test is defined, how it runs, and how failures are handled is a steady source of points. In practice, the key is to set a minimum quality bar while staying conscious of cost.

Test Fundamentals and the Big Picture

Under the hood, dbt tests run a SELECT that returns offending rows. If one or more rows come back, the test fails; if zero rows, it passes. Generic tests are declared in YAML and can be applied to many models or columns, while singular tests live under tests/ as SQL files.

Tests are usually executed with dbt test after the models are built. In CI, it is common to use dbt build so model construction and tests run as a single workflow. Failure behavior can be set to error or warn via severity.

Generic tests: declared in YAML, reusable; representative examples include not_null, unique, accepted_values, and relationships
Singular tests: defined as tests/*.sql; write SQL that extracts offending rows under custom conditions
Success condition: zero offending rows (as a rule)
Failure handling: severity=error fails the job; severity=warn keeps it as a warning only
Artifacts: execution results are recorded in run_results.json; offending rows can be persisted to the warehouse via store_failures when needed

Where tests sit inside a dbt project

Basic commands: running models and tests

# Build models and run tests together
$ dbt build -s +marts.customer --fail-fast

# Tests only, filtered by tag
$ dbt test -s tag:data_quality

# Point directly at a singular test file
$ dbt test -s tests/no_future_booking.sql

Defining and Using Generic (schema) Tests

Generic tests are declared in models/*.yml. The flagship built-ins are not_null, unique, accepted_values, and relationships. Because they are written declaratively, they offer strong maintainability and reuse, letting you quickly lock in a quality floor for your schema.

Custom generic tests can be defined as macros. The macro name becomes the test name, and you can call it from YAML passing columns or models as arguments.

Where it lives: schema.yml (the YAML for the target models, sources, etc.)
Scope: whole model or column-level
Failure output: SQL that extracts offending rows is generated automatically
Configuration examples: config.severity, tags, store_failures, and more

Aspect	Generic tests	Singular tests	How to remember it on the exam
Where defined	YAML (schema.yml)	tests/*.sql	Declarative (YAML) vs. imperative (SQL)
Purpose	Shared integrity and constraint checks	Business-specific or composite-condition checks	When in doubt, start with a generic test
Reusability	High; easily applied to many columns	Lower; SQL written case by case	Generic tests deliver DRY
Readability	High; requirements are listed in YAML	Depends on the SQL; review cost grows	The exam emphasizes declarative definitions
Maintenance	Easy to follow along when models are renamed	Dependent SQL must be updated too	Generic tests are more change-resilient
Execution performance	Mostly simple conditions, relatively light	Joins and aggregations tend to make them heavier	Prefer generic tests for large datasets

Generic test definitions in YAML and a custom generic test example

# models/schema.yml
version: 2
models:
  - name: customers
    columns:
      - name: customer_id
        tests:
          - not_null
          - unique
      - name: status
        tests:
          - accepted_values:
              values: ["active", "inactive"]
  - name: orders
    columns:
      - name: customer_id
        tests:
          - relationships:
              to: ref('customers')
              field: customer_id
          - not_null:
              config:
                severity: warn

# macros/tests/not_future_date.sql (custom generic test)
{% test not_future_date(model, column_name) %}
select *
from {{ model }}
where {{ column_name }} > current_date
{% endtest %}

# Usage (models/payments.yml)
version: 2
models:
  - name: payments
    columns:
      - name: paid_at
        tests:
          - not_future_date

Writing Singular (data) Tests and Design Pointers

Singular tests are SQL files placed in the tests/ directory. Any rows returned by the query count as failures. They are well suited to business-specific rules and complex integrity checks that span multiple models.

You can use ref and source through Jinja, so references stay safely in sync with your models. When you rely on heavy joins or window functions, narrow the target period and lean on clustering keys (index-like structures) to keep cost in check.

Rule of thumb: write a SELECT that returns only offending rows
Naming: clarify intent with patterns like tests/<feature>__<check>.sql
Performance: narrow time ranges and target columns, and minimize the select clause
Review: spell out expected counts and boundary conditions (>= vs. >) in comments

Singular test examples: forbidding duplicate orders and future dates

-- tests/no_duplicate_orders.sql
with dup as (
  select order_id, count(*) as c
  from {{ ref('orders') }}
  group by 1
  having count(*) > 1
)
select * from dup

-- tests/no_future_bookings.sql
select *
from {{ ref('bookings') }}
where booking_date > current_date

Execution, Selection, and Reading the Results (Artifacts)

dbt test can run just the targeted tests following selection rules. Filtering by tag, model name, or path is handy in practice. In CI, selecting only the changed delta is also effective.

Results appear in standard output, and details are also persisted to target/run_results.json. When store_failures is enabled, offending rows are stored in warehouse tables using adapter-specific naming so they can be investigated later.

Selection examples: dbt test -s model_name, -s path:tests/critical, -s tag:data_quality
Dependency selection: -s +model includes upstreams; @model selects neighbors
Speed-up: --fail-fast stops at the first failure
Artifacts: run_results.json (per-test status and execution_time) and manifest.json (dependency graph)

Concrete examples of selection and storing failing rows

# Run only the critical-tagged checks
$ dbt test -s tag:critical --fail-fast

# Only changed models and their tests (e.g., when using state:modified)
$ dbt test -s state:modified+

# Persist failing rows via project configuration
# dbt_project.yml example
tests:
  +store_failures: true
  +severity: error

Design Guidelines and Anti-Patterns

Aim for the most confidence at the lowest cost. Instead of slapping tests on every column of every table, prioritize the metrics, keys, and foreign references that drive the business. Use generic tests to lock in a floor, and fill gaps in domain-specific conditions with singular tests.

Anti-patterns include constantly running wide full-scan singular tests against aggregate models, or duplicating the same validation logic across both generic and singular tests. Bake frequency and scheduling into your design so test runtimes do not erode your SLAs.

Prioritization: primary keys, foreign key relationships, dimension accepted values
Run frequency: always for critical checks; split heavy validation to daily or weekly cadences
Avoid duplication: consolidate identical logic on one side only
Observability: dashboard the time series of failure counts to spot trends

Configuration example that keeps failures as warnings (phased rollout)

# models/core.yml
version: 2
models:
  - name: customers
    columns:
      - name: email
        tests:
          - unique:
              config:
                severity: warn  # Monitor as a warning initially
          - not_null:
              config:
                severity: error # Keep required checks strict as errors

Exam Prep: Frequently Tested Points on the Analytics Engineer Exam

The exam frequently asks about the differences between generic and singular tests, their definitions in YAML/SQL, the meaning of relationships and accepted_values, the behavior of severity and store_failures, and when to use dbt build versus dbt test. Memorize the behavior as written in the official documentation.

Dependency resolution via ref/source, and basic operations to narrow the target with selection flags, are also directly tied to your score. Of the selection syntax (+/@/tag/path/state), make sure you understand at least tag, path, and the meaning of a simple +.

Generic tests live in YAML; singular tests live in tests/*.sql
Failure condition: one or more offending rows are returned
relationships checks referential integrity (child to parent existence)
accepted_values detects values outside a fixed list
severity=warn lets the job continue; error ends in failure
When store_failures is enabled, offending rows are stored in tables with adapter-specific names

Minimal snippet to memorize

# Minimal relationships example
tests:
  - relationships:
      to: ref('dim_customers')
      field: customer_id

# Minimal accepted_values example
tests:
  - accepted_values:
      values: ["A", "B"]

Check Your Understanding

Analytics Engineer

問題 1

You want to verify with a dbt generic test that the customer_id in the orders table always exists in the customer_id of the customers table. Which definition is most appropriate?

Define a relationships test on customer_id in models/orders.yml, specifying to: ref('customers') and field: customer_id
Create tests/relationships_orders_customers.sql and write SQL that returns unmatched rows using LEFT JOIN
It is enough to just add unique and not_null to models/customers.yml
Use accepted_values on customer_id in models/orders.yml and enumerate every customer_id under values

正解: A

For referential integrity, the official and reusable way is to declare the generic test relationships on the child (orders) column, specifying the parent model with to (ref('customers')) and the parent's key column with field (customer_id). The other choices fall short in either feasibility or maintainability.

Frequently Asked Questions

What is the difference between dbt build and dbt test? Which one should I use?

dbt build runs model builds (run), tests, snapshots, and more in dependency order in a single command. It is convenient for CI and production validation. When you want to run only tests against existing models locally, use dbt test.

Is it really safe to let severity=warn ignore failures?

warn does not fail the job, but the failure is still recorded in run_results.json. It is useful during initial rollout or right after a migration, but we recommend eventually promoting business-critical checks to error.

What happens when store_failures is enabled, and how does it affect cost?

Failing rows are stored in dedicated tables in the warehouse, which is useful for investigation and visualization. On the other hand, additional table creation and writes increase storage and query cost. Limiting this to important tests or temporary operations is realistic.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

The Big Picture of dbt Tests: Choosing Between Generic and Singular Tests