Even for the same "revenue" or "active users," decisions waver when departments and tools each aggregate differently. The dbt Semantic Layer eliminates this drift by centralizing metric definitions as code so they can be reused across the entire data platform.
This article walks through the design principles for "unified metrics" that come up in both Analytics Engineer practice and exams, with YAML examples and tips on change management and testing — grounded in the stable behavior described in the official documentation.
When each BI tool defines revenue or filter conditions differently, you end up with multiple answers to the same question. Teams get stuck reconciling numbers, and leadership delays decisions. The root cause is metrics being defined separately across scattered places like visualization tools and spreadsheets.
The dbt Semantic Layer lets you declare metrics in YAML and tightly couple them to your transformed data models. Definitions are version-controlled and ride your review, test, and deploy pipelines. At runtime they compile to SQL and are evaluated on warehouses like Snowflake, BigQuery, Databricks, or Redshift (execution follows each DWH's permission model — see the respective official docs for details).
From a certification standpoint, the key topics are ensuring metric consistency, managing dependencies with data models, and automating tests and documentation. The Semantic Layer lets you bake all of these into your project's standard operating practice.
The Semantic Layer adds "meaning" on top of your transformed fact/dimension models. You define semantic_model (entities, dimensions, measures) in YAML and use them as the building blocks for metrics (business indicators). The definitions compile to SQL, and the aggregations run on the warehouse.
The concepts that are stable in production use are below. For syntax and detailed options, see the official documentation (docs.getdbt.com). The API integration and query-serving mechanics are provided as dbt Cloud features.
The foundation is star-schema thinking. For a fact model like fct_orders, prepare time dimensions for order date and customer dimensions in the marts layer, then define the semantic_model on top. Joins between models are based on the keys declared in entities.
Visualization tools and notebooks fetch metrics through the normalized SQL the Semantic Layer generates. Differences in time granularity and filters are passed as query arguments, so the same definition produces multiple views reliably.
Data flow centered on the Semantic Layer
Below is an example semantic_model targeting the orders fact table fct_orders, along with metrics defined on top of it. Adjust the actual field names and options to fit your project. The YAML is version-controlled and subject to review and testing.
A simple metric defines total revenue and a derived metric defines average order value. The default time granularity is day, and a shared filter excludes canceled orders.
YAML example for semantic_model and metrics
semantic_models:
- name: orders
model: ref('fct_orders')
defaults:
agg_time_dimension: order_date
entities:
- name: order_id
type: primary
- name: customer_id
type: foreign
measures:
- name: order_revenue
agg: sum
expr: amount
- name: orders
agg: count_distinct
expr: order_id
dimensions:
- name: order_date
type: time
type_params:
time_granularity: day
- name: order_status
type: categorical
metrics:
- name: total_revenue
type: simple
label: Total Revenue
type_params:
measure: order_revenue
filter:
- {expr: "order_status != 'canceled'"}
- name: order_count
type: simple
label: Order Count
type_params:
measure: orders
filter:
- {expr: "order_status != 'canceled'"}
- name: average_order_value
type: derived
label: Average Order Value
type_params:
expr: total_revenue / order_count
metrics: [total_revenue, order_count]Unifying metrics is pointless if operations let them drift. In dbt — just like models and sources — you review Semantic Layer definitions in pull requests, test them in CI, and deploy them per environment. Execution happens on the DWH and rides the existing permission and audit framework.
For backward-incompatible changes (such as filter or expression changes), branch the metric into versions and switch consumers over in phases. Use documentation and the catalog to visualize references and assess the blast radius in advance.
Where you place metrics has a huge impact on reusability and change blast radius. Considering day-to-day operations and audit, having definitions consolidated as code in one place and executed consistently on the DWH is the easiest to manage.
| Aspect | In-BI Tool Definitions | SQL Views (DWH) | dbt Semantic Layer |
|---|---|---|---|
| Consistency | Tends to diverge across tools | Unifiable per view, but tends to proliferate by use case | A YAML definition can serve as the single source of truth |
| Reusability | Confined to the tool | Reusable within the same DWH | Designed from the ground up for cross-tool reuse |
| Change Impact Visibility | Difficult (manual) | Requires analyzing dependent views | Dependencies surfaced in the catalog |
| Testing & CI | Limited | Mostly per-query verification | Integrates definitions into dbt tests/CI |
| Governance | Depends on each tool's settings | Depends on DWH permissions and view management | Honors DWH permissions while governing through code |
Analytics Engineer
問題 1
You want to reuse the same "average order value (AOV)" across multiple BI tools and notebooks and review definition changes through pull requests. Which approach is most appropriate?
正解: A
Declaring metrics in the dbt Semantic Layer (semantic_model/metrics) centralizes definitions as code and enables review, testing, and version control. Per-BI-tool definitions and CSV distribution cause definition drift, and a fixed column in the DWH lacks flexibility and auditability.
Is dbt Cloud required to use the Semantic Layer?
The APIs and integrations that surface and serve metrics are provided by dbt Cloud. The definition files themselves (semantic_model/metrics) live in your repository and integrate with the normal dbt development, review, and documentation workflow. Execution happens on the data warehouse, so permissions and auditing follow each DWH's mechanisms. Check docs.getdbt.com for the latest requirements.
Which data warehouses does it work with?
dbt supports the major cloud data warehouses (Snowflake, BigQuery, Databricks, Redshift, and so on), and the Semantic Layer runs as SQL on top of them. Available features and requirements may vary by adapter and version, so consult the official compatibility matrix.
Where should SCDs (Slowly Changing Dimensions) be handled?
Handle SCDs in the marts layer model design, making the treatment of current vs. historical records explicit at the table level. In the Semantic Layer, use the time dimension and flag (e.g., is_current) defined on that model as dimensions/filters to keep the point-in-time view consistent.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...