Defining Logical Boundaries with dbt Model Groups

2026-04-19
NicheeLab Editorial Team

Model groups give resources inside a dbt project (models, seeds, snapshots, tests, and so on) a logical 'belonging.' Combined with the model access modifier (private/protected/public), they let dbt statically verify where a model can be referenced from, so unintended dependencies are caught in CI.

This article assumes the model governance features of dbt Core (groups and access) and compactly summarizes how to draw domain boundaries, the implementation steps, migration, and the points most often asked on exams. We focus on stable behavior backed by the official documentation.

Fundamentals and Value of Model Groups

A group is first-class metadata that expresses a logical boundary inside a project (e.g., finance, marketing, sales). By assigning a group to a resource and combining it with an access level, you let dbt verify whether a reference is allowed before anything runs. This makes explicit which team owns a public API and where the implementation details live.

From an exam perspective, remember: groups define the boundary; access controls references that cross it. Two points are easy to miss: the difference between private, protected, and public, and the fact that private does not behave as expected when no group is set.

  • A group represents a boundary at the domain or team level
  • access defines visibility of references (private < protected < public)
  • dbt detects boundary violations during ref resolution and raises an error
  • If you use private, the practical standard is to assign a group to every model

Access level | Reference scope                                 | Typical use
private      | Within the same group only                      | Implementation details and intermediate models (breaking changes allowed)
protected    | Within the same package (across groups allowed) | Internal stable API (reused within a domain)
public       | Across packages allowed                         | Org-wide stable API (long-term compatibility prioritized)

Group boundaries and reference permissions (→ allowed, x forbidden)

Package: analytics

 [Group: finance]             [Group: marketing]
   fin_base_customers (private)   mkt_enriched_orders (protected)
             ^                              ^
             |                              |
             | (allowed)                     | (allowed within package)
   fin_kpi (protected)  <------------------- mkt_rollups (protected)
             ^
             |
 external_pkg.model  x  (forbidden because not public)

Legend:
 - private: referenceable only within the same group
 - protected: referenceable within the same package, not from other packages
 - public: referenceable from other packages too
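
To make the rule concrete, here is a minimal Python sketch of the check the table and diagram describe. It is not dbt's implementation: the `Model` dataclass and the `can_reference` function are hypothetical names, and treating a missing group as never matching a private target is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a dbt model's governance metadata."""
    name: str
    package: str
    group: Optional[str]
    access: str = "protected"  # dbt's default when no access is configured


def can_reference(referrer: Model, target: Model) -> bool:
    """Return True if `referrer` may ref `target` under the rules above."""
    if target.access == "public":
        return True
    if target.access == "protected":
        return referrer.package == target.package
    if target.access == "private":
        # Sketch assumption: a target without a group matches nobody.
        return target.group is not None and referrer.group == target.group
    raise ValueError(f"unknown access level: {target.access!r}")


# The models from the diagram above.
fin_base = Model("fin_base_customers", "analytics", "finance", "private")
fin_kpi = Model("fin_kpi", "analytics", "finance", "protected")
mkt_rollups = Model("mkt_rollups", "analytics", "marketing", "protected")
external = Model("model", "external_pkg", None)

print(can_reference(fin_kpi, fin_base))      # True: same group
print(can_reference(mkt_rollups, fin_base))  # False: different group
print(can_reference(mkt_rollups, fin_kpi))   # True: same package
print(can_reference(external, fin_kpi))      # False: different package
```

The check depends only on the target's access level, which is why promoting a model is a decision made by its owning group rather than by its consumers.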

Minimal group definition and model-side configuration (YAML/SQL)

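# groups/finance.yml (path relative to a model path such as models/; dbt reads properties files only from configured resource paths)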
version: 2
groups:
  - name: finance
    owner:
      name: Finance Analytics
      email: [email protected]
    description: Finance domain analytics assets

-- models/finance/fin_base_customers.sql
-- This model is internal to the finance group (private)
{{
  config(
    group='finance',
    access='private'
  )
}}
select * from {{ ref('stg_customers') }}

Logical Boundary Design Patterns

The recommended approach is to slice groups by domain. Treat layers (staging/intermediate/mart) and groups (finance/marketing, etc.) as orthogonal concepts. Express the layer through directory, naming, and tags; express the group as 'the domain it belongs to.'

Splitting by database schema and splitting by group serve different purposes. Schemas determine where physical objects live and how warehouse permissions are managed; groups define logical dependency boundaries. Regardless of the underlying platform (Snowflake, Databricks, etc.), groups behave consistently inside dbt.

  • Define groups by domain (finance, marketing, ops, etc.)
  • Use directory/naming to express layer and granularity (e.g., stg_, int_, dim_/fct_)
  • As a rule, limit inter-group references to protected or public surfaces
  • Stable APIs go public, semi-public within a domain goes protected, implementation details go private

Default the group/access per directory in dbt_project.yml

# dbt_project.yml (example)
models:
  analytics:
    finance:
      +group: finance
      +access: private   # project convention: start private (dbt's built-in default is protected); promote individually only when exposure is needed
    marketing:
      +group: marketing
      +access: private
    marts:
      +materialized: table

access Validation and Error Cases

When dbt resolves a ref, it inspects the target resource's access and group to decide whether the reference is allowed. Disallowed references error out immediately at parse/build time, failing before execution. This lets CI prevent boundary violations from sneaking in.

A caveat: if you put private on a model with no group set, the 'same-group only' constraint becomes ineffective. If you use private, standardize on assigning a group to every model without exception.

  • private: ref from outside the same group errors
  • protected: ref from outside the same package errors (within the same package is OK)
  • public: cross-package ref is also OK
  • macros are not subject to access control (different code-reuse mechanism)
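
Because private only expresses a boundary when groups are assigned, one possible lint is to list models that have no group. The sketch below assumes the manifest.json layout in which each entry under `nodes` carries `resource_type` and `group` keys; `models_missing_group` is a hypothetical helper, and the small dictionary stands in for a real manifest.

```python
def models_missing_group(manifest: dict) -> list:
    """Return the unique_ids of models that have no group assigned."""
    return sorted(
        unique_id
        for unique_id, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model" and not node.get("group")
    )


# Stand-in for the parsed contents of target/manifest.json (assumed layout).
sample_manifest = {
    "nodes": {
        "model.analytics.fin_base_customers": {
            "resource_type": "model", "group": "finance", "access": "private",
        },
        "model.analytics.stg_customers": {
            "resource_type": "model", "group": None, "access": "protected",
        },
        "seed.analytics.country_codes": {
            "resource_type": "seed", "group": None,
        },
    }
}

print(models_missing_group(sample_manifest))
# ['model.analytics.stg_customers']
```

In a real project you would pass in the result of `json.load` on `target/manifest.json` and fail the CI job when the returned list is not empty.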

Boundary violation example (marketing → finance private reference)

-- models/marketing/mkt_rollups.sql
{{
  config(group='marketing', access='protected')
}}

-- Not allowed: referencing a private model in the finance group
select *
from {{ ref('fin_base_customers') }}

-- Illustrative error (a sketch; dbt's actual wording differs, and it is raised before any SQL runs):
-- Model 'fin_base_customers' is private to group 'finance' and cannot be referenced from group 'marketing'.

Adoption Steps for an Existing Project

Stage the migration to keep it safe. First, define the groups and assign one to every model, letting CI catch any gaps. Next, set access to protected uniformly as the default. Finally, demote internal implementation models to private and keep only the surfaces that need exposure as protected or public.

For inventorying dependencies, the dbt DAG and manifest.json analysis are effective. References that cross boundaries should be re-pointed to a public surface (protected/public), or you should reconsider which domain the model belongs to.

  • Step 1: define groups in groups/*.yml and assign a group to every model
  • Step 2: set +access: protected as the package default
  • Step 3: drop internal implementation to private; only the surfaces that need exposure stay protected/public
  • Step 4: detect and fix boundary violations in CI (require build on PR)
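
The dependency inventory mentioned above can be sketched as a short script over manifest.json. This is a hedged example, not an official tool: it assumes each model node exposes `group`, `access`, and `depends_on.nodes`, and `cross_group_refs` is a hypothetical helper name. It reports every model-to-model reference that crosses a group boundary, together with the target's current access level.

```python
def cross_group_refs(manifest: dict) -> list:
    """List (referrer, target, target_access) for refs that cross groups."""
    nodes = manifest.get("nodes", {})
    findings = []
    for unique_id, node in nodes.items():
        if node.get("resource_type") != "model":
            continue
        for parent_id in node.get("depends_on", {}).get("nodes", []):
            parent = nodes.get(parent_id)
            # Skip sources and other parents that are not models.
            if parent is None or parent.get("resource_type") != "model":
                continue
            if parent.get("group") != node.get("group"):
                access = parent.get("access", "protected")
                findings.append((unique_id, parent_id, access))
    return sorted(findings)


# Stand-in for the parsed contents of target/manifest.json (assumed layout).
sample_manifest = {
    "nodes": {
        "model.analytics.fin_base_customers": {
            "resource_type": "model", "group": "finance", "access": "private",
            "depends_on": {"nodes": ["source.analytics.raw.customers"]},
        },
        "model.analytics.fin_kpi": {
            "resource_type": "model", "group": "finance", "access": "protected",
            "depends_on": {"nodes": ["model.analytics.fin_base_customers"]},
        },
        "model.analytics.mkt_rollups": {
            "resource_type": "model", "group": "marketing", "access": "protected",
            "depends_on": {"nodes": ["model.analytics.fin_kpi"]},
        },
    }
}

for referrer, target, access in cross_group_refs(sample_manifest):
    print(f"{referrer} -> {target} ({access})")
# model.analytics.mkt_rollups -> model.analytics.fin_kpi (protected)
```

Running this before Step 3 shows which targets must stay protected or public and which can safely be demoted to private.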

Example of uniform defaults and staged promotion

# dbt_project.yml (package default)
models:
  analytics:
    +access: protected

-- Internal implementation model (private), set in its .sql file
{{ config(group='finance', access='private') }}

-- Public-surface model (protected; promote to public if needed), set in its .sql file
{{ config(group='finance', access='protected') }}

Practical Notes for Multi-Team and Multi-Package Setups

Models that may be referenced from other packages should be public. Operate protected as an agreed-upon API within the same package, and require explicit public when references cross package boundaries.

You can also assign a group to seeds and snapshots; access, however, applies only to models. Manage physical-layer permissions (Snowflake, Databricks, etc.) separately on the warehouse side, and use them alongside dbt's logical boundaries.

  • Allow cross-package references only via public
  • protected is a semi-public API within the same package
  • Change management: public prioritizes backward compatibility; private prioritizes flexibility
  • Record an owner and description for each group and surface them in the documentation (dbt docs)
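
Since public models carry the backward-compatibility burden, it can help to review the published surface per group. The following sketch assumes model nodes in manifest.json expose `name`, `group`, and `access`; `public_surface` is a hypothetical helper, and `mkt_summary` is an invented model name used only for illustration.

```python
from collections import defaultdict


def public_surface(manifest: dict) -> dict:
    """Map each group name to the sorted names of its public models."""
    surface = defaultdict(list)
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") == "model" and node.get("access") == "public":
            surface[node.get("group") or "(no group)"].append(node["name"])
    return {group: sorted(names) for group, names in sorted(surface.items())}


# Stand-in for the parsed contents of target/manifest.json (assumed layout).
sample_manifest = {
    "nodes": {
        "model.analytics.fin_published_kpi": {
            "resource_type": "model", "name": "fin_published_kpi",
            "group": "finance", "access": "public",
        },
        "model.analytics.fin_kpi": {
            "resource_type": "model", "name": "fin_kpi",
            "group": "finance", "access": "protected",
        },
        "model.analytics.mkt_summary": {  # hypothetical model
            "resource_type": "model", "name": "mkt_summary",
            "group": "marketing", "access": "public",
        },
    }
}

print(public_surface(sample_manifest))
# {'finance': ['fin_published_kpi'], 'marketing': ['mkt_summary']}
```

A listing like this gives each group owner a short, reviewable inventory of the models other packages are allowed to depend on.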

Cross-package reference (only public is allowed)

-- package_b/models/use_finance.sql (referenced from package_b)
select * from {{ ref('analytics', 'fin_published_kpi') }}

-- The target is configured public on the analytics package side
-- models/finance/fin_published_kpi.sql
{{ config(group='finance', access='public') }}

Key Points for the Analytics Engineer Exam

Groups define the boundary; access controls references. Make sure to memorize the order of increasing openness: private is restricted to the same group, protected to the same package, and public crosses packages.

Two easy mistakes: putting private on a model with no group fails to express the expected boundary (same-group only), and macros are not subject to access control.

  • Terminology: group = domain, access = visibility
  • Frequently asked: difference between private and protected
  • Practice: assign a group to every model, then manage public surface via access
  • Detect boundary violations in CI (fails at build time)

Recap of key points (commented model configs)

-- Representative patterns
-- internal/private (same group only)
{{ config(group='marketing', access='private') }}
-- domain API/protected (referenceable within the same package)
{{ config(group='marketing', access='protected') }}
-- org-wide/public (also from other packages)
{{ config(group='marketing', access='public') }}

Check with a Question

Analytics Engineer

Question 1

Within the same package you have finance and marketing groups, and you want to prevent the marketing side from referencing the finance group's internal implementation model. Which configuration is appropriate?

  1. Set group='finance' and access='private' on the finance internal implementation model
  2. Set access='protected' on the finance internal implementation model
  3. Set access='private' on the marketing-side model
  4. Apply the same tag to both models

Correct answer: 1

private is referenceable only within the same group. Setting the finance side to group='finance' and access='private' makes refs from marketing (a different group) error out at build time. protected is referenceable within the same package, so it does not meet the requirement.

Frequently Asked Questions

What happens if I do not define a group?

If you use private without assigning a group, the 'same-group only' boundary cannot be expressed. In practice, we recommend standardizing on assigning a group to every model.

Does access also apply to macros?

No. access is a reference control for models, which are resolved via ref. Macros are a separate reuse mechanism and are not subject to access control.

How do I allow references across packages?

Set the referenced model to access='public'. protected is only valid within the same package and cannot be referenced from other packages. Be sure to also design backward compatibility and change-notification practices.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

