A metric is not the final output — it is the contract decisions are made against. Once the definition location or calculation drifts, BI numbers diverge from tool to tool. This article walks through the dbt options for metric definition, covering both model-based aggregation and unified definitions via the Semantic Layer.
Official dbt features shift across versions. In particular, the legacy metrics resource is on the deprecation path, and the dbt Cloud Semantic Layer (built on MetricFlow) is now the recommended option. For exam prep, anchor on the durable concepts — grain, tests, lineage, governance — while staying aware of where the old and new approaches each fit.
There are essentially two places to define metrics in dbt. The first is to build pre-aggregated mart models. The second is to declare measures and metrics in the Semantic Layer and let downstream tools call them. The former runs on any DWH, while the latter centralizes the definition and maximizes reuse across connected tools.
In practice, first normalize your fact and dimension tables, align on time grain, and agree on filter conditions. Only then decide which of the two approaches to pick.
Metric lineage and consumption flow
Build out fact tables in the mart layer and serve metrics like AOV or CVR as views or tables. It runs on any DWH and stays inside dbt Core, which makes it the mainstream answer the exam often tests.
The keys are consistent grain management and reuse of aggregation logic. Rather than building three tables (daily, weekly, monthly), keep one table at the smallest grain (daily) and let downstream consumers roll it up.
In the dbt Cloud Semantic Layer (built on MetricFlow), you declare entities, dimensions, and measures in semantic_models, then compose metrics on top. When BI tools and notebooks go through this layer, metric values stay consistent across tools.
Specs and keywords can change between versions, so check the official documentation before implementing. Conceptually: separate the measure (the aggregation definition) from the metric (the business-facing definition), and make time grain, dimensions, and default filters explicit.
| Approach | Where it lives / how it is managed | Main strengths | Watch out for |
|---|---|---|---|
| Hand-written aggregated models (SQL) | dbt Core models in the marts layer | DWH-independent, simple, easy to operate | Grain and filter sprawl can creep in. Per-BI deltas need to be managed. |
| Legacy metrics resource (deprecated) | YAML metrics definitions (old spec) | Briefly offered a simple definition path in dbt Core | Now deprecated; new work should migrate to the Semantic Layer |
| Semantic Layer (MetricFlow) | semantic_models and metrics (dbt Cloud) | Centralized definitions, cross-tool reuse, metric governance | Cloud-dependent; you have to track spec updates |
Semantic Layer example: semantic_models and metrics YAML
semantic_models:
- name: orders
model: ref('fct_orders')
entities:
- name: order
type: primary
expr: order_id
- name: customer
type: foreign
expr: customer_id
dimensions:
- name: order_date
type: time
type_params:
time_granularity: day
- name: channel
type: categorical
measures:
- name: orders
agg: count
- name: gross_revenue
agg: sum
expr: gross_amount
metrics:
- name: total_revenue
type: simple
label: Total Revenue
type_params:
measure: gross_revenue
- name: aov
type: ratio
label: Average Order Value
type_params:
numerator: gross_revenue
denominator: orders
filter: channel != 'internal'
time_granularity: dayUnbreakable metrics come down to three things: an explicit grain, guaranteed input data quality, and documentation. First, confirm that the date plus entity combination is actually unique, then design for nulls and late-arriving data.
Place tests at both stg and marts. At stg, verify schema consistency and deduplication. At marts, verify key uniqueness, join completeness, and accepted values. Even when you adopt the Semantic Layer, you do not skip tests on the underlying models.
With the model approach, BI tools read the aggregated views directly. With the Semantic Layer approach, you pull metrics by name through a supported connector or API. Either way, design permissions and caching together to stabilize performance.
Register dashboards as exposures in your dbt project to surface the impact of upstream changes. Auto-run data validation against the relevant exposures before each release to catch breaking metric changes early.
The Analytics Engineer exam targets principles around definition consistency and data quality. Because the fine print on specific products can change, train yourself to decide based on conceptual priorities.
Common trick scenarios: per-BI calculated field sprawl, and ratio errors caused by grain mismatches. Always confirm that the numerator and denominator are calculated with the same filters and the same grain.
Analytics Engineer
問題 1
You want to provide AOV (average order value) consistently across the whole organization. Multiple DWHs and multiple BI tools are running in parallel. Which approach best balances reuse and governance?
正解: A
Centralizing the definition in the Semantic Layer is what most reliably guarantees cross-tool consistency and reuse. BI-side calculations, grain sprawl, and baking values into RAW all become maintenance traps and breed definition drift.
Should I still use the legacy dbt metrics resource?
It is on the deprecation path. For new work, define metrics in the dbt Cloud Semantic Layer (MetricFlow). When migrating an existing project, map measures to metrics, document default grain and filters, and roll the change out incrementally. Check the latest dbt official documentation for current guidance.
Our DWHs and BI tools are mixed. Which approach is safer?
If portability is the top priority, the model-based approach (aggregated views in dbt SQL) is solid. If you want to maximize org-wide consistency and reuse, the Semantic Layer is the stronger choice. Either way, you need quality tests on the underlying stg/marts layers and a consistent grain.
How should I think about performance?
For the model approach, use materializations (table/incremental) and pre-compute aggregates. For the Semantic Layer, lean on connector-side optimizations plus DWH statistics and partitioning. Either way, constrain the date window you recompute and pre-normalize expensive dimension joins.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...