dbt

The Big Picture of dbt Tests: Choosing Between Generic and Singular Tests

2026-04-19
NicheeLab Editorial Team

dbt tests come in two main flavors: generic tests, declared reusably in YAML, and singular tests, written in SQL that returns each offending row.

On the exam, knowing where each test is defined, how it runs, and how failures are handled is a steady source of points. In practice, the key is to set a minimum quality bar while staying conscious of cost.

Test Fundamentals and the Big Picture

Under the hood, dbt tests run a SELECT that returns offending rows. If one or more rows come back, the test fails; if zero rows, it passes. Generic tests are declared in YAML and can be applied to many models or columns, while singular tests live under tests/ as SQL files.

Tests are usually executed with dbt test after the models are built. In CI, it is common to use dbt build so model construction and tests run as a single workflow. Failure behavior can be set to error or warn via severity.

  • Generic tests: declared in YAML, reusable; representative examples include not_null, unique, accepted_values, and relationships
  • Singular tests: defined as tests/*.sql; write SQL that extracts offending rows under custom conditions
  • Success condition: zero offending rows (as a rule)
  • Failure handling: severity=error fails the job; severity=warn keeps it as a warning only
  • Artifacts: execution results are recorded in run_results.json; offending rows can be persisted to the warehouse via store_failures when needed

Where tests sit inside a dbt project

sources (raw)staging modelsmarts modelsgeneric testsnot_null/unique/...generic testsaccepted_values...generic testsrelationships...singular tests (tests/*.sql)Arbitrary business validation SQL

Basic commands: running models and tests

# Build models and run tests together
$ dbt build -s +marts.customer --fail-fast

# Tests only, filtered by tag
$ dbt test -s tag:data_quality

# Point directly at a singular test file
$ dbt test -s tests/no_future_booking.sql

Defining and Using Generic (schema) Tests

Generic tests are declared in models/*.yml. The flagship built-ins are not_null, unique, accepted_values, and relationships. Because they are written declaratively, they offer strong maintainability and reuse, letting you quickly lock in a quality floor for your schema.

Custom generic tests can be defined as macros. The macro name becomes the test name, and you can call it from YAML passing columns or models as arguments.

  • Where it lives: schema.yml (the YAML for the target models, sources, etc.)
  • Scope: whole model or column-level
  • Failure output: SQL that extracts offending rows is generated automatically
  • Configuration examples: config.severity, tags, store_failures, and more
AspectGeneric testsSingular testsHow to remember it on the exam
Where definedYAML (schema.yml)tests/*.sqlDeclarative (YAML) vs. imperative (SQL)
PurposeShared integrity and constraint checksBusiness-specific or composite-condition checksWhen in doubt, start with a generic test
ReusabilityHigh; easily applied to many columnsLower; SQL written case by caseGeneric tests deliver DRY
ReadabilityHigh; requirements are listed in YAMLDepends on the SQL; review cost growsThe exam emphasizes declarative definitions
MaintenanceEasy to follow along when models are renamedDependent SQL must be updated tooGeneric tests are more change-resilient
Execution performanceMostly simple conditions, relatively lightJoins and aggregations tend to make them heavierPrefer generic tests for large datasets

Generic test definitions in YAML and a custom generic test example

# models/schema.yml
version: 2
models:
  - name: customers
    columns:
      - name: customer_id
        tests:
          - not_null
          - unique
      - name: status
        tests:
          - accepted_values:
              values: ["active", "inactive"]
  - name: orders
    columns:
      - name: customer_id
        tests:
          - relationships:
              to: ref('customers')
              field: customer_id
          - not_null:
              config:
                severity: warn

# macros/tests/not_future_date.sql (custom generic test)
{% test not_future_date(model, column_name) %}
select *
from {{ model }}
where {{ column_name }} > current_date
{% endtest %}

# Usage (models/payments.yml)
version: 2
models:
  - name: payments
    columns:
      - name: paid_at
        tests:
          - not_future_date

Writing Singular (data) Tests and Design Pointers

Singular tests are SQL files placed in the tests/ directory. Any rows returned by the query count as failures. They are well suited to business-specific rules and complex integrity checks that span multiple models.

You can use ref and source through Jinja, so references stay safely in sync with your models. When you rely on heavy joins or window functions, narrow the target period and lean on clustering keys (index-like structures) to keep cost in check.

  • Rule of thumb: write a SELECT that returns only offending rows
  • Naming: clarify intent with patterns like tests/<feature>__<check>.sql
  • Performance: narrow time ranges and target columns, and minimize the select clause
  • Review: spell out expected counts and boundary conditions (>= vs. >) in comments

Singular test examples: forbidding duplicate orders and future dates

-- tests/no_duplicate_orders.sql
with dup as (
  select order_id, count(*) as c
  from {{ ref('orders') }}
  group by 1
  having count(*) > 1
)
select * from dup

-- tests/no_future_bookings.sql
select *
from {{ ref('bookings') }}
where booking_date > current_date

Execution, Selection, and Reading the Results (Artifacts)

dbt test can run just the targeted tests following selection rules. Filtering by tag, model name, or path is handy in practice. In CI, selecting only the changed delta is also effective.

Results appear in standard output, and details are also persisted to target/run_results.json. When store_failures is enabled, offending rows are stored in warehouse tables using adapter-specific naming so they can be investigated later.

  • Selection examples: dbt test -s model_name, -s path:tests/critical, -s tag:data_quality
  • Dependency selection: -s +model includes upstreams; @model selects neighbors
  • Speed-up: --fail-fast stops at the first failure
  • Artifacts: run_results.json (per-test status and execution_time) and manifest.json (dependency graph)

Concrete examples of selection and storing failing rows

# Run only the critical-tagged checks
$ dbt test -s tag:critical --fail-fast

# Only changed models and their tests (e.g., when using state:modified)
$ dbt test -s state:modified+

# Persist failing rows via project configuration
# dbt_project.yml example
tests:
  +store_failures: true
  +severity: error

Design Guidelines and Anti-Patterns

Aim for the most confidence at the lowest cost. Instead of slapping tests on every column of every table, prioritize the metrics, keys, and foreign references that drive the business. Use generic tests to lock in a floor, and fill gaps in domain-specific conditions with singular tests.

Anti-patterns include constantly running wide full-scan singular tests against aggregate models, or duplicating the same validation logic across both generic and singular tests. Bake frequency and scheduling into your design so test runtimes do not erode your SLAs.

  • Prioritization: primary keys, foreign key relationships, dimension accepted values
  • Run frequency: always for critical checks; split heavy validation to daily or weekly cadences
  • Avoid duplication: consolidate identical logic on one side only
  • Observability: dashboard the time series of failure counts to spot trends

Configuration example that keeps failures as warnings (phased rollout)

# models/core.yml
version: 2
models:
  - name: customers
    columns:
      - name: email
        tests:
          - unique:
              config:
                severity: warn  # Monitor as a warning initially
          - not_null:
              config:
                severity: error # Keep required checks strict as errors

Exam Prep: Frequently Tested Points on the Analytics Engineer Exam

The exam frequently asks about the differences between generic and singular tests, their definitions in YAML/SQL, the meaning of relationships and accepted_values, the behavior of severity and store_failures, and when to use dbt build versus dbt test. Memorize the behavior as written in the official documentation.

Dependency resolution via ref/source, and basic operations to narrow the target with selection flags, are also directly tied to your score. Of the selection syntax (+/@/tag/path/state), make sure you understand at least tag, path, and the meaning of a simple +.

  • Generic tests live in YAML; singular tests live in tests/*.sql
  • Failure condition: one or more offending rows are returned
  • relationships checks referential integrity (child to parent existence)
  • accepted_values detects values outside a fixed list
  • severity=warn lets the job continue; error ends in failure
  • When store_failures is enabled, offending rows are stored in tables with adapter-specific names

Minimal snippet to memorize

# Minimal relationships example
tests:
  - relationships:
      to: ref('dim_customers')
      field: customer_id

# Minimal accepted_values example
tests:
  - accepted_values:
      values: ["A", "B"]

Check Your Understanding

Analytics Engineer

問題 1

You want to verify with a dbt generic test that the customer_id in the orders table always exists in the customer_id of the customers table. Which definition is most appropriate?

  1. Define a relationships test on customer_id in models/orders.yml, specifying to: ref('customers') and field: customer_id
  2. Create tests/relationships_orders_customers.sql and write SQL that returns unmatched rows using LEFT JOIN
  3. It is enough to just add unique and not_null to models/customers.yml
  4. Use accepted_values on customer_id in models/orders.yml and enumerate every customer_id under values

正解: A

For referential integrity, the official and reusable way is to declare the generic test relationships on the child (orders) column, specifying the parent model with to (ref('customers')) and the parent's key column with field (customer_id). The other choices fall short in either feasibility or maintainability.

Frequently Asked Questions

What is the difference between dbt build and dbt test? Which one should I use?

dbt build runs model builds (run), tests, snapshots, and more in dependency order in a single command. It is convenient for CI and production validation. When you want to run only tests against existing models locally, use dbt test.

Is it really safe to let severity=warn ignore failures?

warn does not fail the job, but the failure is still recorded in run_results.json. It is useful during initial rollout or right after a migration, but we recommend eventually promoting business-critical checks to error.

What happens when store_failures is enabled, and how does it affect cost?

Failing rows are stored in dedicated tables in the warehouse, which is useful for investigation and visualization. On the other hand, additional table creation and writes increase storage and query cost. Limiting this to important tests or temporary operations is realistic.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.