Analytics Engineers who deliver continuous business value operate a SQL-centric transformation layer with engineering discipline, shipping data products that can stand up to real decision-making. dbt has become the standard tool for the role, and it is increasingly the benchmark for reproducibility and hiring evaluation.
Anchored on the stable features documented in the official docs, this article makes concrete where day-to-day practice and exam scope overlap. A sample question is included at the end.
With the spread of cloud DWHs, data transformation and modeling are now managed with the same rigor as application code. dbt unifies model dependencies, tests, documentation, and lineage around SQL, letting you achieve both reproducible analytics and operational reliability.
Hiring teams want more than someone who can build tables: they want people who can design end-to-end for change-resistant modeling, automated testing, and visible documentation and ownership. That directly maps to whether you can describe your deliverables in terms of the official dbt concepts (models, sources, tests, snapshots, documentation, exposures).
| Role | Primary Responsibilities | Core Skills / Tools |
|---|---|---|
| Data Engineer | Ingestion, preprocessing, and pipeline operation (batch / streaming) | Python/Scala, Spark, orchestration, IaC |
| Analytics Engineer (dbt) | In-DWH transformation and modeling, testing and documentation, lineage management | SQL, dbt Core/Cloud, Git/CI, DWHs (Snowflake, Databricks, etc.) |
| BI Analyst | Visualization, requirements gathering, metric design, decision support | SQL, dashboard tools, experiment design |
The stable core of dbt covers model dependency management (ref), source definitions (source), tests (generic and singular), snapshots (SCD management), materializations (view/table/incremental/ephemeral), and documentation and lineage (docs generate, Graph). This is exactly the area where the exam and real-world work overlap.
Running dbt build in CI is the canonical workflow: it evaluates models, seeds, snapshots, and tests in dependency order. Techniques such as running only the impacted slice, and deployment strategies built around role and schema isolation, should be designed around your platform's characteristics.
A dbt-centric transformation layer (simplified lineage)
Incremental models are the foundational strategy for large datasets. On DWHs that support MERGE (such as Snowflake and Databricks), the merge strategy with a unique_key is the stable default. Use is_incremental() to scope the rows you load, and keep identity correct via an update timestamp or business key.
Tests are declared in schema YAML as generic tests (not_null, unique, relationships, accepted_values, and so on) and run automatically via dbt build. Source freshness is declared on sources with loaded_at_field plus thresholds (warn_after/error_after) and monitored via scheduled runs.
A minimal example: incremental model plus test and freshness definitions
-- models/marts/fct_orders.sql
{{ config(materialized='incremental', unique_key='order_id', on_schema_change='sync') }}
with src as (
select
o.order_id,
o.customer_id,
o.status,
o.total_amount,
o.updated_at
from {{ source('app', 'orders') }} o
)
select *
from src
{% if is_incremental() %}
where updated_at > (
select coalesce(max(updated_at), '1900-01-01') from {{ this }}
)
{% endif %}
-- models/marts/schema.yml
version: 2
models:
- name: fct_orders
description: 受注のファクトテーブル。order_idで一意。
columns:
- name: order_id
tests:
- not_null
- unique
- name: customer_id
tests:
- relationships:
to: ref('dim_customers')
field: customer_id
sources:
- name: app
schema: raw
tables:
- name: orders
loaded_at_field: updated_at
freshness:
warn_after: {count: 60, period: minute}
error_after: {count: 120, period: minute}
-- exposuresでBI依存を明示(変更時に影響範囲が見える)
exposures:
- name: sales_dashboard
type: dashboard
owner:
name: Analytics Team
email: [email protected]
depends_on:
- ref('fct_orders')Via adapters, dbt compiles to the optimal SQL for each DWH. Snowflake and Databricks both support the incremental merge strategy, letting you use MERGE on a unique_key. In production, the stable pattern is to design a permission and schema strategy (per-developer schemas, dedicated production schemas) and deploy safely through roles.
Performance and cost swing significantly with your choice of materialization, clustering and file optimizations (platform features), query parallelism, and how often jobs are scheduled. On the dbt side, materializing only the intermediate layers you really need as tables, and keeping staging mostly as views, makes it easier to balance build time and storage.
Your market value is judged by how much reproducible, automated operation you can design. Putting a complete dbt project in a repo and explaining the layer structure, naming conventions, test policy, and deployment strategy in the README makes a much stronger case.
In hiring, what separates candidates is not just writing queries, but the ability to build a system where changes are spec'd, impact is visible, and anomalies are detected automatically.
The exam tests whether you can correctly choose between dbt's core concepts. Stable features that show up often include materializations, tests, snapshots, source freshness, documentation and exposures, dependencies (ref/source), and execution order (build/test/run).
A natural learning order is: model layering and test basics → correctness of incremental updates → snapshots and history → documentation and exposures → integrating with CI.
Analytics Engineer
問題 1
A dashboard depends on a specific mart model. You want it surfaced in lineage and want CI to catch impact when things change. Which dbt configuration fits best?
正解: A
Exposures are the official way to register BI tools and other downstream consumers in dbt's lineage, with depends_on tying them to the models. Running dbt build in CI evaluates the dependencies and tests, surfacing the impact of any change. Freshness is useful for detecting source lag but does not express dashboard dependencies. Ephemeral inlines an intermediate view and is unsuitable for managing dashboards.
Should I learn dbt Core or dbt Cloud?
You can learn the core concepts (models, tests, snapshots, sources, exposures, materializations, ref/source, build) entirely with dbt Core. Cloud adds operational features (scheduler, UI, secrets management), but for the exam and for building strong fundamentals, focusing on Core is more than enough.
Is Python required?
The core of the Analytics Engineer role is SQL and operating dbt. Python is useful for ingestion and external API integrations, but you can master the main dbt surface area (SQL models, tests, documentation, incremental, snapshots) without it.
How do I decide between incremental models and snapshots?
Incremental models efficiently keep a table at its latest state using a unique key and an update timestamp. Snapshots are for when you need to retain record history (SCD). A simple rule of thumb: use incremental for current-state metrics, and snapshots for history tracking and change auditing.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
dbt Models: SQL-Defined Transformation Units (2026)
Model fundamentals — SELECT-based definitions, naming, refs,...
dbt Analytics Engineering Exam: Complete Guide (2026)
Pass the AE Certification — scope, weighting, sample questio...
dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)
Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...
dbt Project Structure: models/seeds/macros Layout (2026)
Recommended dbt project layout — models, seeds, macros, snap...
dbt_project.yml Explained: Every Config (2026)
Every dbt_project.yml setting that matters — paths, vars, ma...