dbt Naming Conventions: stg_, int_, fct_, dim

In dbt, the file name becomes the model name and is exposed everywhere — ref, source, selectors, and documentation. That means naming directly affects development speed and operational quality.

This article follows widely accepted dbt best practices and organizes naming recipes around the stg / int / dim / fct layers, covering both the practical conventions you need day-to-day and the angles most often tested on the Analytics Engineer exam.

Core Naming Principles

Use lower_snake_case as the baseline for dbt model names and start with a short prefix that signals the layer. Choose concrete names from which the grain is obvious, and keep abbreviations to the minimum your team has explicitly agreed on.

The Analytics Engineer exam often probes the difference in roles between stg_{source}_{table} and dim_/fct_, as well as how easy it is to narrow scope with selectors.

lower_snake_case (ASCII letters/digits and underscores only)
Express the layer with a prefix: stg_, int_, dim_, fct_, rpt_
Singular vs plural follows the grain (dim_customer, fct_orders, etc.)
Normalize external system names to short forms (salesforce → sf, google_analytics → ga)
Forbidden: embedding dates or environments in names, meaningless abbreviations, camelCase, spaces

Good vs. bad naming examples

# Good
stg_sf_account.sql
int_customer_unified.sql
dim_customer.sql
fct_order_line.sql
rpt_sales_summary.sql

# Bad
StageSalesforceAccount.sql      -- uppercase / camel
staging_accounts.sql            -- layer name too long / inconsistent
customer_v2_new_new.sql         -- meaningless version suffixes
ordersFact.sql                  -- camel + inconsistent prefix

Per-Layer Naming and Visualizing Dependencies

Lock in the role and grain of each layer and make them obvious through prefixes. The baseline split is: stg maps 1:1 to source tables, int handles joins and pre-processing in the middle, and dim/fct are the distribution-ready models in the mart layer.

Prefixes pay off when you operate via selectors. Expressions like dbt build -s stg_* or +fct_* turn naming itself into selection power.

stg handles only column renaming/type casts/light filtering
int joins multiple stg models, deduplicates, and reshapes grain
dim manages dimension unique keys and attributes; fct holds event/transaction facts
rpt is final aggregation for distribution (use only when needed)

Layer	Recommended prefix	Example	Grain / Role
Staging	stg_	stg_sf_account	1:1 with source; normalizes column names and types
Intermediate	int_	int_customer_unified	Joins, deduplication, key cleanup
Dimension	dim_	dim_customer	Unique key; one row per entity
Fact	fct_	fct_order_line	Event/transaction; one row per fact
Report/Mart	rpt_	rpt_sales_daily	Final aggregation; metric distribution

How dependencies map to folders

models/
  staging/
    sf/
      stg_sf_account.sql
      stg_sf_contact.sql
  intermediate/
    int_customer_unified.sql
  marts/
    core/
      dim_customer.sql
      fct_order_line.sql
    reporting/
      rpt_sales_daily.sql

Flow:
source('sf','account') --> stg_sf_account --> int_customer_unified --> dim_customer
                                                        \
                                                         --> fct_order_line --> rpt_sales_daily

Selector examples (where naming pays off on the CLI)

# Build only stg models
 dbt build -s stg_*

# Build all fct models and everything upstream (+ means upstream/downstream)
 dbt build -s +fct_*

# Only customer-related models (bundle by shared name fragments)
 dbt build -s dim_customer int_customer_* stg_*_customer*

Folder Layout and Aligning YAML Definitions

Splitting folders into two levels — layer and domain — makes it easy to drop a schema.yml per scope. Keep YAML name values identical to file names (i.e. model names), and write descriptions concisely for BI consumers.

Assume default project quoting (unspecified = unquoted for most adapters) and stick to lowercase model names. Some warehouses uppercase the physical identifier, but the logical name (the dbt model name) does not change.

Place schema.yml close to its models, e.g. models/staging/sf/schema.yml
Grouping models in the same directory under one YAML is easier to maintain
description and tests must not contradict the naming and grain

Example: dbt_project.yml and schema.yml

# dbt_project.yml (excerpt)
name: my_dbt
models:
  my_dbt:
    staging:
      +materialized: view
    intermediate:
      +materialized: table
    marts:
      core:
        +materialized: table

# models/staging/sf/schema.yml (excerpt)
version: 2
models:
  - name: stg_sf_account
    description: "Normalized view of Salesforce Account (1 row = account)"
    columns:
      - name: account_id
        tests: [not_null, unique]
      - name: account_name
        tests: [not_null]
  - name: stg_sf_contact
    description: "Normalized view of Salesforce Contact (1 row = person)"
    columns:
      - name: contact_id
        tests: [not_null, unique]

Naming Sources, Seeds, and Snapshots

Give each source a stable, organization-wide system name and align table-side names with the actual physical names. Bake the source name into stg model names so you can trace lineage at a glance.

For seeds, use a straightforward business name when the seed is for distribution, or a consistent convention like a map_ prefix for auxiliary mappings. For snapshots, the logical name = the snapshot block name; make it explicit enough that the target table is obvious.

Source names should be short and consistent (sf, erp, ga, etc.)
Use stg_{source}_{table} to guarantee the 1:1 mapping
Seeds use names that read like their purpose, e.g. map_country_codes
Snapshots use names that imply the target and key, e.g. snap_customer_scd2

Source, seed, and snapshot definitions

# models/staging/sf/sources.yml (excerpt)
version: 2
sources:
  - name: sf
    schema: raw_salesforce
    tables:
      - name: account
      - name: contact

# seeds/proj/map_country_codes.csv -> table map_country_codes
# dbt_project.yml (excerpt)
seeds:
  +schema: reference
  proj:
    +quote_columns: false

# snapshots/snap_customer_scd2.sql (excerpt)
{% snapshot snap_customer_scd2 %}
  {{ config(
      target_schema='history',
      unique_key='customer_id',
      strategy='check',
      check_cols=['email','status']
  ) }}
  select * from {{ ref('dim_customer') }}
{% endsnapshot %}

Naming Tests, Macros, and Exposures

Generic tests (unique, not_null, etc.) get consistent dbt-generated names when declared in YAML. Put custom logic in singular tests (SQL in the tests directory) and name them like test_{target}_{rule}.

Group macros by purpose with prefixes: util_ for type conversion and cleaning, biz_ for business rules, etc., so the role is obvious. Name exposures per BI dashboard as exposure_{domain}_{name} and keep them aligned with the model names listed under depends_on.

Singular test: tests/test_dim_customer_scd.sql, etc.
Macro: macros/util_clean_email.sql, etc.
Aligning exposure names with the dashboard's persistent ID makes tracing easier

Test and macro examples

# models/marts/core/schema.yml (excerpt)
version: 2
models:
  - name: dim_customer
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns: [customer_id]
    columns:
      - name: email
        tests: [not_null]

# tests/test_dim_customer_email_format.sql (singular)
select email from {{ ref('dim_customer') }}
where email !~ '^[^@]+@[^@]+\.[^@]+
      
        
        NicheeLab を読み込み中…
      
    
  #x27;

# macros/util_clean_email.sql
{% macro util_clean_email(col) %}
    lower(trim({{ col }}))
{% endmacro %}

# models/staging/sf/stg_sf_contact.sql (usage)
select
  contact_id,
  {{ util_clean_email('email') }} as email
from {{ source('sf','contact') }}

Anti-Patterns and Exceptional Alias Usage

Anti-patterns include meaningless version suffixes, names that are too abstract, and layer mismatches (e.g. heavy joins inside stg). When naming contradicts grain and responsibility, documentation drifts away from reality.

Use alias only when you must align the physical warehouse table name for legacy compatibility. Keep the logical name (file name) on-convention and swap only the physical name via alias — that contains the blast radius.

Forbidden: customer_final_v3_final.sql and other history-in-the-name patterns
Forbidden: putting stg_ models inside marts, or aggregating inside stg
Use alias sparingly; never as a default

Alias and quoting examples

# models/marts/core/dim_customer.sql
{{ config(alias='DIM_CUSTOMER_LEGACY') }}
select * from {{ ref('int_customer_unified') }}

# dbt_project.yml (excerpt: use this on Snowflake etc. to avoid uppercasing)
quoting:
  database: false
  schema: false
  identifier: true   # quote the physical name to prevent uppercasing

# Note: behavior depends on the warehouse/adapter defaults. Agree as a team before enabling.

Check Your Understanding

Analytics Engineer

問題 1

You want to normalize Salesforce Account and Contact 1:1, build a unified customer view, and ultimately distribute a customer dimension. Which naming combination is appropriate?

stg_sf_account, stg_sf_contact → int_customer_unified → dim_customer
stg_sf_account, int_sf_contact → dim_customer_unified → fct_customer
dim_sf_account, stg_sf_contact → int_unified_customer → rpt_customer
sf_account_stg, contact_stg → customer_int → customer_dim

正解: A

Keep stg 1:1 with the source, do integration and deduplication in int, and distribute through dim_. Option A is the only combination where prefixes and grain are mutually consistent.

Frequently Asked Questions

Should dim and fct models be named singular or plural?

Use singular for dim models whose grain is one row per entity (dim_customer). For fct models (one row per event/transaction) either singular or plural is fine, but stay consistent across the project. Names that reveal the grain (e.g. fct_order_line) are preferred.

Can I add v2 to model names when running versioned models in parallel?

Not recommended as a permanent practice. Use names like int_customer_unified_v2 only during a migration window, document the sunset plan in an issue/PR, and expose a compatibility name via an alias or view. Do not keep version suffixes long-term.

The warehouse uppercases identifiers automatically. Does that affect dbt naming conventions?

The logical name (the model name used in ref) is unaffected. If you want to prevent uppercasing of the physical name, set quoting.identifier to true in your project. Either way, keep file names in lower_snake_case across the project.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

dbt Naming Conventions: Designing Consistent Model Names