dbt

dbt Metrics in Practice: Designing Definitions That Do Not Break

2026-04-19
NicheeLab Editorial Team

A metric is not the final output — it is the contract decisions are made against. Once the definition location or calculation drifts, BI numbers diverge from tool to tool. This article walks through the dbt options for metric definition, covering both model-based aggregation and unified definitions via the Semantic Layer.

Official dbt features shift across versions. In particular, the legacy metrics resource is on the deprecation path, and the dbt Cloud Semantic Layer (built on MetricFlow) is now the recommended option. For exam prep, anchor on the durable concepts — grain, tests, lineage, governance — while staying aware of where the old and new approaches each fit.

Where to Define Metrics: Models vs. the Semantic Layer

There are essentially two places to define metrics in dbt. The first is to build pre-aggregated mart models. The second is to declare measures and metrics in the Semantic Layer and let downstream tools call them. The former runs on any DWH, while the latter centralizes the definition and maximizes reuse across connected tools.

In practice, first normalize your fact and dimension tables, align on time grain, and agree on filter conditions. Only then decide which of the two approaches to pick.

  • Model approach: build out fct_ / dim_ tables and ship aggregated views as dbt models
  • Semantic Layer approach: declare measures, dimensions, and metrics in YAML and reuse them across tools
  • Common to both: grain, default filters, null/duplicate handling, and automated tests
  • Prioritize platform independence with the model approach; prioritize org-wide centralization with the Semantic Layer

Metric lineage and consumption flow

Sources (RAW)Staging (stg)Marts (dim/fct)A) Aggregated models (dbt SQL)B) Semantic LayerBI / Notebooks / AppsMetric lineage and consumption flow

Expressing Metrics as Models: A Stable, Portable Approach

Build out fact tables in the mart layer and serve metrics like AOV or CVR as views or tables. It runs on any DWH and stays inside dbt Core, which makes it the mainstream answer the exam often tests.

The keys are consistent grain management and reuse of aggregation logic. Rather than building three tables (daily, weekly, monthly), keep one table at the smallest grain (daily) and let downstream consumers roll it up.

  • Make the grain explicit: include date_day, customer_id, and similar fields in the primary key
  • Apply default filters uniformly in the view layer rather than carrying them as columns
  • Resolve duplicate rows and late-arriving data in the stg layer (surrogate keys, window functions to pick the latest row, etc.)
  • Tests: always include unique, not_null, and accepted_values
  • Use exposures to surface dashboard dependencies and put them under monitoring

Defining Metrics in the Semantic Layer: Maximize Reuse and Consistency

In the dbt Cloud Semantic Layer (built on MetricFlow), you declare entities, dimensions, and measures in semantic_models, then compose metrics on top. When BI tools and notebooks go through this layer, metric values stay consistent across tools.

Specs and keywords can change between versions, so check the official documentation before implementing. Conceptually: separate the measure (the aggregation definition) from the metric (the business-facing definition), and make time grain, dimensions, and default filters explicit.

  • A measure defines the aggregation method (sum, count, avg, etc.) and the underlying expression
  • A metric references measures and declares ratio or derived calculations when needed
  • Putting time_granularity and default filters on the metric prevents misuse downstream
  • Connected BI tools hit the same API, which guarantees consistent numbers
ApproachWhere it lives / how it is managedMain strengthsWatch out for
Hand-written aggregated models (SQL)dbt Core models in the marts layerDWH-independent, simple, easy to operateGrain and filter sprawl can creep in. Per-BI deltas need to be managed.
Legacy metrics resource (deprecated)YAML metrics definitions (old spec)Briefly offered a simple definition path in dbt CoreNow deprecated; new work should migrate to the Semantic Layer
Semantic Layer (MetricFlow)semantic_models and metrics (dbt Cloud)Centralized definitions, cross-tool reuse, metric governanceCloud-dependent; you have to track spec updates

Semantic Layer example: semantic_models and metrics YAML

semantic_models:
  - name: orders
    model: ref('fct_orders')
    entities:
      - name: order
        type: primary
        expr: order_id
      - name: customer
        type: foreign
        expr: customer_id
    dimensions:
      - name: order_date
        type: time
        type_params:
          time_granularity: day
      - name: channel
        type: categorical
    measures:
      - name: orders
        agg: count
      - name: gross_revenue
        agg: sum
        expr: gross_amount

metrics:
  - name: total_revenue
    type: simple
    label: Total Revenue
    type_params:
      measure: gross_revenue
  - name: aov
    type: ratio
    label: Average Order Value
    type_params:
      numerator: gross_revenue
      denominator: orders
    filter: channel != 'internal'
    time_granularity: day

Grain, Tests, Governance: Building Metrics That Do Not Break

Unbreakable metrics come down to three things: an explicit grain, guaranteed input data quality, and documentation. First, confirm that the date plus entity combination is actually unique, then design for nulls and late-arriving data.

Place tests at both stg and marts. At stg, verify schema consistency and deduplication. At marts, verify key uniqueness, join completeness, and accepted values. Even when you adopt the Semantic Layer, you do not skip tests on the underlying models.

  • Grain: hold the smallest grain and roll up as needed
  • Filters: define business-agreed default filters; document exclusion conditions
  • Nulls / late-arriving data: decide default handling and update strategy (snapshot / incremental)
  • Tests: make unique, not_null, relationships, and accepted_values mandatory
  • Docs: publish the rationale for each metric, the numerator/denominator, and any exceptions

Consuming Metrics: From BI, Notebooks, and Apps

With the model approach, BI tools read the aggregated views directly. With the Semantic Layer approach, you pull metrics by name through a supported connector or API. Either way, design permissions and caching together to stabilize performance.

Register dashboards as exposures in your dbt project to surface the impact of upstream changes. Auto-run data validation against the relevant exposures before each release to catch breaking metric changes early.

  • BI connection: views/tables for the model approach; connector/API for the Semantic Layer
  • Permissions: grant reads via least-privilege roles; check the Cloud-side permissions when going through the Semantic Layer
  • Cache: do not shrink the daily refresh window too aggressively for time-series metrics
  • Exposures: track impact through type=dashboard dependencies
  • SLA: document refresh cadence and latency tolerance directly on the dashboard

Exam Prep Essentials: Sidestepping the Trick Questions

The Analytics Engineer exam targets principles around definition consistency and data quality. Because the fine print on specific products can change, train yourself to decide based on conceptual priorities.

Common trick scenarios: per-BI calculated field sprawl, and ratio errors caused by grain mismatches. Always confirm that the numerator and denominator are calculated with the same filters and the same grain.

  • A metric is a business-definition contract; centralizing where it lives is the top priority
  • Ratios over mismatched grains produce errors; align grain first
  • Tests and docs are perennial exam topics; use examples and accepted_values
  • Manage downstream consumers via exposures to trace the impact of changes
  • For the Semantic Layer, conceptual fluency matters: be able to articulate the roles of measure vs. metric

Check Your Understanding

Analytics Engineer

問題 1

You want to provide AOV (average order value) consistently across the whole organization. Multiple DWHs and multiple BI tools are running in parallel. Which approach best balances reuse and governance?

  1. Define measures and metrics in the dbt Cloud Semantic Layer and read them from each BI tool through a supported connector
  2. Define AOV as a calculated field in every BI tool and publicize the shared definition through docs
  3. Build separate daily, weekly, and monthly AOV tables in the marts layer and let each BI tool pick the one it needs
  4. Pre-calculate AOV in upstream ETL, write it back to the RAW layer, and have every tool read from there

正解: A

Centralizing the definition in the Semantic Layer is what most reliably guarantees cross-tool consistency and reuse. BI-side calculations, grain sprawl, and baking values into RAW all become maintenance traps and breed definition drift.

Frequently Asked Questions

Should I still use the legacy dbt metrics resource?

It is on the deprecation path. For new work, define metrics in the dbt Cloud Semantic Layer (MetricFlow). When migrating an existing project, map measures to metrics, document default grain and filters, and roll the change out incrementally. Check the latest dbt official documentation for current guidance.

Our DWHs and BI tools are mixed. Which approach is safer?

If portability is the top priority, the model-based approach (aggregated views in dbt SQL) is solid. If you want to maximize org-wide consistency and reuse, the Semantic Layer is the stronger choice. Either way, you need quality tests on the underlying stg/marts layers and a consistent grain.

How should I think about performance?

For the model approach, use materializations (table/incremental) and pre-compute aggregates. For the Semantic Layer, lean on connector-side optimizations plus DWH statistics and partitioning. Either way, constrain the date window you recompute and pre-normalize expensive dimension joins.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.