dbt

dbt Singular Tests Deep Dive: Precise Data Quality with Custom SQL Validations

2026-04-19
NicheeLab Editorial Team

dbt tests fall into two broad categories: generic tests and singular tests. Generic tests are parametric templates designed for reuse, while singular tests are one-off validations expressed as a single SQL query that explicitly returns the failing rows.

This article walks through the design, implementation, and operation of singular tests, with an eye on what the Analytics Engineer exam emphasizes, so you can lock in the key points quickly.

Singular Test Basics and How They Differ from Generic Tests

A dbt singular test is a single SQL file placed under tests/. The rule is that the SQL must return the failing rows: at execution time, zero rows means pass, and one or more rows means fail. It is ideal for requirements that are hard to express with generic tests, such as complex joins, conditional logic, or temporal consistency.

Generic tests are reusable macro-based validations — not_null, unique, and relationships are typical examples. Singular tests are better suited to one-off, case-specific logic and offer a high degree of freedom with WHERE clauses, window functions, and joins.

  • Pass condition: query returns zero rows
  • Fail condition: query returns one or more rows
  • Location: under the project's tests/ directory (any subdirectory is fine)
  • References: use {{ ref() }} and {{ source() }} to reference models and sources
  • Configuration: a config block can set severity, tags, store_failures, and more
AspectGeneric testSingular testNative DB constraints
DefinitionMacro + YAML arguments, designed for reuseOne-off SQL that returns failing rowsDDL constraints (UNIQUE, NOT NULL, etc.)
ExpressivenessStrong for common patternsAny SQL (joins, window functions, etc.)Limited to constraint types
ReusabilityHigh (applies to the same pattern)Low (case-by-case)Medium (general per table)
Cross-model validationLimited (relationships, etc.)Free (can join multiple models)Generally not supported (single-table focus)
Execution and visibilityExecuted in aggregate via dbt testExecuted directly via dbt testDetected and rejected automatically by the DWH
Use casesNOT NULL, UNIQUE, referential integrityTemporal overlap, conditional uniqueness, invariant violationsKey uniqueness and foreign keys (where supported)

Minimal singular test example (verify that no negative amounts exist)

{{ config(severity='error', tags=['dq','critical']) }}

-- tests/no_negative_amounts.sql
with invalid as (
  select order_id, amount
  from {{ ref('orders') }}
  where amount < 0
)
select * from invalid

File Layout, Naming, and How Compilation Works

Singular tests live under tests/. A naming scheme such as <target>__<rule>.sql, which makes both the target and the violated rule obvious, reads well in practice. For the exam, the key thing to remember is that any *.sql file under tests/ becomes a test.

You can use Jinja to call ref/source, variables, and macros. Extract duplicated logic into macros/ and call them from the test SQL — that keeps maintenance simple.

  • Naming example: customers__email_unique_conditionally.sql
  • Extract shared logic into macros/
  • Do not write DML/DDL (validation queries only)
  • Avoid non-deterministic functions and current-time dependencies for stable execution

Layout example (tests and macros)

# Project layout (excerpt)
models/
  mart/
    orders.sql
macros/
  test_helpers.sql
tests/
  orders__no_negative_amounts.sql

-- macros/test_helpers.sql
{% macro flag_negative_amounts(model_name) %}
select order_id, amount
from {{ ref(model_name) }}
where amount < 0
{% endmacro %}

-- tests/orders__no_negative_amounts.sql
{{ config(severity='error', tags=['dq']) }}
{{ flag_negative_amounts('orders') }}

Representative Use Cases: Validations Only Custom SQL Can Express

Singular tests shine for conditional uniqueness (e.g., customer email is unique only within rows where status = 'active'), temporal overlap (SCD2 effective-period overlaps), and cross-model business rules (e.g., orders are not canceled after they ship).

The example below is a singular test that detects effective-period overlaps. If any row is returned, an overlap exists and the test fails.

  • Conditional uniqueness: filter first, then detect duplicate keys
  • Temporal consistency: detect overlaps or gaps in effective periods (SCD2)
  • Cross-model consistency: time-slice alignment between facts and dimensions

Detect period overlaps (no overlapping SCD2 effective periods)

{{ config(severity='error', tags=['dq','temporal']) }}

with scoped as (
  select surrogate_key, valid_from, valid_to
  from {{ ref('dim_customer_scd2') }}
),
ordered as (
  select
    surrogate_key,
    valid_from,
    coalesce(valid_to, '9999-12-31') as valid_to
  from scoped
),
overlaps as (
  select a.surrogate_key, a.valid_from as a_from, a.valid_to as a_to,
         b.valid_from as b_from, b.valid_to as b_to
  from ordered a
  join ordered b
    on a.surrogate_key = b.surrogate_key
   and a.valid_from < b.valid_to
   and b.valid_from < a.valid_to
   and (a.valid_from, a.valid_to) <> (b.valid_from, b.valid_to)
)
select * from overlaps

Execution, Selectors, and Handling Results

The basic command is dbt test. Use selectors to run tests individually. For singular tests in particular, selecting by file path or tag is the most practical and reliable approach. If you need to persist failing rows, set store_failures: true on the test.

Selector examples: path:tests/orders__no_negative_amounts.sql, tag:dq, and so on. In CI, combining --select state:modified+ with path selectors is an effective way to run only the tests impacted by changes.

  • Run a single test: dbt test -s path:tests/orders__no_negative_amounts.sql
  • Run by tag: dbt test -s tag:dq
  • Persist failing rows: config(store_failures=true)
  • Treat as warning instead of failure: config(severity='warn')

dbt test execution flow (singular)

CLI(dbt test)
   |
   v
Compile Jinja → SQL
   |
   v
Run query (returns failing rows)
   |
   |-- zero rows --> PASS
   |
   `-- >0 rows  --> FAIL
                 |
                 `-- store_failures=true ? persist failing rows : no persist

CLI and config examples (tag selection, warning severity, persisting failing rows)

# Run a single file
$ dbt test -s path:tests/orders__no_negative_amounts.sql

# Run by tag
$ dbt test -s tag:dq

# Config block at the top of the test file
{{ config(
  tags=['dq','nightly'],
  severity='warn',
  store_failures=true
) }}

Best Practices and Performance

Singular tests give you a lot of freedom, which also makes them prone to heavy SQL. Scope the target as narrowly as possible, return only the columns you need, and be mindful of join conditions and indexes (clustering or sort keys). For duplicate detection, narrow down candidate keys with an aggregation first and only return detail columns for the candidates — that keeps the query light.

Tests should be deterministic. Do not depend on the current time or random values, and avoid functions or hints that are sensitive to environmental differences whenever possible.

  • Return only the minimum columns needed (keys and a reason column for debugging)
  • Use a two-stage pattern: aggregate to narrow down first, then fetch details
  • Avoid joining large fact tables together; let dimensions guide the join
  • Avoid non-deterministic functions and UDF dependencies (they hurt portability)
  • Move long-running tests to a nightly run and keep the scope narrow on PRs

Lightweight duplicate-detection pattern (narrow down candidates first)

{{ config(severity='error', tags=['dq','perf']) }}

with active as (
  select customer_id, email
  from {{ ref('dim_customer') }}
  where status = 'active'
),
candidates as (
  select email
  from active
  group by email
  having count(*) > 1
),
details as (
  select a.customer_id, a.email
  from active a
  join candidates c using(email)
)
select * from details

Exam-Prep Checklist (Analytics Engineer)

Three points come up frequently on the exam: singular tests are "SQL that returns failing rows," every *.sql file under tests/ becomes a test, and config lets you set severity, tags, and store_failures. Being able to articulate when to use generic versus singular tests gives you an edge.

Lock in the path: and tag: selectors so you can pick the right option for pinpoint execution.

  • Singular rule: zero rows = pass, >0 rows = fail
  • Lives under tests/ and can use {{ ref() }} and {{ source() }}
  • config(severity, tags, store_failures) can be declared at the top of the test
  • Write complex logic that generic tests cannot express as a singular test
  • Selectors: choose between path:tests/... and tag:...

Conditional uniqueness (active users have unique email)

{{ config(severity='error', tags=['dq','exam']) }}

with active as (
  select user_id, email
  from {{ ref('users') }}
  where is_active = true
),
dups as (
  select email
  from active
  group by email
  having count(*) > 1
)
select a.*
from active a
join dups d using(email)

Check Your Understanding

Analytics Engineer

問題 1

Which statement about dbt singular tests is most accurate?

  1. The test SQL is written to return the failing rows, and the test passes when the result is zero rows
  2. The test SQL must always return count(*)
  3. Singular tests can only be defined in YAML — no SQL file is needed
  4. Singular tests cannot use {{ ref() }}

正解: A

Singular tests are designed as one-off SQL that returns failing rows: zero rows pass, one or more fail. Returning count(*) is not required, YAML-only definition is not allowed, and {{ ref() }} and {{ source() }} work as usual.

Frequently Asked Questions

What happens when store_failures=true is enabled?

The failing rows detected by the test are persisted as a table or view in the data warehouse, where they can be used for downstream investigation or dashboard visualization. Naming and storage location follow dbt's defaults.

What should I consider first when generic tests are not enough?

Break down the requirements and check whether they can be expressed as combinations of existing generic tests (unique, not_null, relationships, etc.) with filters. If composite conditions, temporal consistency, or cross-model validation are still hard to express, implement them as singular tests.

Can I reference ephemeral models or seeds from a singular test?

Yes. {{ ref() }} and {{ source() }} work as usual. Ephemeral models are inlined as subqueries, and seeds are referenced as physical tables or views.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.