dbt

Jinja Basics: The Minimum Syntax You Need for dbt

2026-04-19
NicheeLab Editorial Team

dbt uses Jinja2 to templatize SQL and produces the final SQL at compile time. In other words, Jinja is a set of rules expanded before execution. Understanding it correctly keeps model definitions concise and reusable.

This article organizes the minimum Jinja syntax and the dbt-specific context (ref, source, var, env_var, config, macros) that you need for both the Analytics Engineer exam and day-to-day operations, along with the points that often go wrong.

Where Jinja Runs: dbt's Evaluation Timing

dbt evaluates Jinja before running a model and produces the final SQL. Dependency resolution (ref/source) and macro expansion happen at this stage. Jinja itself does not run in the database; it is processed by dbt (locally or in the runtime environment).

Evaluation falls roughly into two phases: parse/compile time, and runtime macro invocation. run_query, which needs a database connection, can only be called safely at runtime. ref, source, var, env_var, and config are routinely used in model and macro templates.

  • Jinja expands into SQL at compile time (it is not interpreted on the database)
  • ref handles both dependency graph wiring and schema/table name resolution
  • Use run_query only at runtime (e.g., inside macros); avoid it in parse-only contexts
PhaseWhat you can doRepresentative APIs/syntaxCaveats
Parse / CompileTemplate expansion and dependency resolutionref, source, var, env_var, configNo database queries allowed
Runtime (inside macros)Auxiliary query executionrun_queryLimit its use; don't mix it into the model's main SQL generation
DB executionExecution of the final SQLPure SQLJinja has already been expanded by this point

How Jinja is evaluated and executed in dbt

Developer SQL + Jinjadbt compileJinja evaluationFinal SQLDependencies resolvedExecuted in the databaseHow Jinja is evaluated and executed in dbt

Minimal example: ref and Jinja expansion

-- models/orders_enriched.sql
select
  o.id,
  o.customer_id,
  c.segment
from {{ ref('stg_orders') }} as o
left join {{ ref('dim_customers') }} as c
  on o.customer_id = c.customer_id;

Basic Syntax: The Minimum Set of Variables, Expressions, and Filters

You only need to remember three Jinja delimiters. {{ }} outputs a value (expression/variable), {% %} is for control structures (if, for, set, etc.), and {# #} is for comments. In dbt these mix together inside SQL.

Filters transform values with pipe syntax. default, lower, upper, join, and replace come up most often. Pay attention to quoting when handling numbers and strings, and especially keep safety and readability in mind when generating identifiers such as column names.

  • Value output: {{ expression }}
  • Control structures: {% if %}, {% for %}, {% set %}
  • Comments: {# comment #}
  • Common filters: |default('x'), |lower, |upper, |join(', ')
DelimiterUseExample
{{ ... }}Output a value{{ var('days', 7) }}
{% ... %}Control structure{% if target.name == 'prod' %} ... {% endif %}
{# ... #}Comment{# build only for daily partition #}

Mini diagram: roles of the delimiters

{{ }}OutputRenders a value{% %}ControlShapes the flow{# #}CommentNo outputDelimiter roles: syntax → role → character

Filter and set example

{% set cols = ['id', 'customer_id', 'order_ts'] %}
select {{ cols | map('lower') | join(', ') }}
from {{ ref('stg_orders') }};

Control Structures: Shape SQL with if/for and Iteration

if is useful when you want to toggle columns or conditions based on environment or target. for, which iterates over arrays and maps, is great for generating select columns or CASE expressions in bulk. Keep templating within the bounds of readability and extract long logic into macros.

Watch out for comma separators and trailing commas inside loops. Either separate safely with the join filter, or control them using loop.last.

  • Branch on target.name for environment-dependent logic
  • Stabilize loop separators with join
  • Extract long logic into macros
SyntaxUseWhat to watch for on the exam
if/elif/elseBranch on environment or flagsSwitching via target/var
forBulk-generate columns or conditionsAvoid trailing commas with join or loop.last
setHold an intermediate valueConsider extracting into a macro before it gets complex

How a column list is built with for

Input['id','name','email']for + join(', ')Outputid, name, emailBuilding a column list with for + join(', ')

Example using for and loop.last

{% set dims = ['country', 'city', 'zip'] %}
select
  customer_id,
  {% for d in dims %}
  {{ d }}{% if not loop.last %},{% endif %}
  {% endfor %}
from {{ ref('dim_customers') }};

dbt-Specific Functions: When to Use ref, source, var, env_var, and config

ref wires up model-to-model dependencies while resolving the proper schema and identifier. source is for referencing external tables and must match the declaration in catalog.yml. var retrieves variables passed via dbt_project.yml or the command line, while env_var reads OS environment variables. config is used for per-model settings such as materialization and tags.

On the Analytics Engineer exam, the ref-vs-source distinction, the precedence of var/env_var, and the scope of config (model-level vs project-level) come up frequently.

  • ref guarantees both dependency resolution and portability of schema/identifier
  • source is for external tables and must be declared in the catalog
  • var is a dbt variable; env_var is an OS environment variable
  • config controls per-model settings (materialized, tags, etc.)
Function / syntaxPrimary useWhere it can be usedExam notes
ref('model')Reference a model and resolve dependenciesModel SQL / macrosAutomatically wires up inter-model dependencies
source('src','table')Reference an external tableModel SQL / macrosMust match the definition in sources.yml
var('key', default)Retrieve a variableModel SQL / macrosMake it safe with a default for the undefined case
env_var('NAME')Retrieve an environment variableModel SQL / macrosDepends on the runtime environment
config(...)Model configurationModel SQL (at the top) / macrosIn-model config applies only to that model

How ref and source relate

source: raw.orderssources.ymlstg_ordersfct_ordersref('fct_orders')source is declared in sources.yml; downstream models reference it with ref('fct_orders')

Examples of ref, source, var, env_var, and config

-- models/fct_orders.sql
{{ config(materialized='table', tags=['finance']) }}

with src as (
  select * from {{ source('raw', 'orders') }}
), stg as (
  select * from {{ ref('stg_orders') }} where order_date >= {{ var('from_date', "'1970-01-01'") }}
)
select * from stg
where {{ 'true' if env_var('ALLOW_ALL', 'false') == 'true' else 'total_amount > 0' }};

Creating, Reusing, and Packaging Macros

Extract recurring SQL fragments into macros under the macros directory and pass arguments from the caller. Macros are Jinja templates that are expanded at call time. To share them across multiple projects, package them and install via packages.yml.

Use run_query only when you need an auxiliary query from inside a macro. Making the main SQL generation of a model depend on a run_query result can produce surprising behavior with compilation, caching, and parallel execution, so avoid it as a design pattern.

  • Create macros as .sql files under macros/
  • Call them with {{ my_macro(arg=...) }}
  • Share via packages and avoid name collisions with namespaces
TargetLocation / definitionCall / application
Macromacros/*.sql{{ my_macro(arg='x') }}
PackageReferenced via packages.yml{{ package_name.macro_name(...) }}
Model configconfig(materialized='...'){{ config(...) }} or dbt_project.yml

Minimal project structure

project/models/stg_orders.sqlfct_orders.sqlmacros/utils.sqlpackages.ymldbt_project.ymlMinimal project structure

Macro definition and call

-- macros/utils.sql
{% macro coalesce_zero(expr) %}
coalesce({{ expr }}, 0)
{% endmacro %}

-- models/fct_orders.sql(一部)
select {{ coalesce_zero('total_amount') }} as total_amount
from {{ ref('stg_orders') }};

Pitfalls and Verification: Dependencies, Identifiers, and Safe Templates

Schema and table names hardcoded without ref/source break across environments and renames. Always go through ref/source so the adapter quotes identifiers properly. When generating identifiers or literals dynamically, watch the consistency of string quoting.

To validate Jinja logic with tests, combine dbt test with unit tests (when available). The fastest way to confirm whether a template expanded as intended is to inspect the generated SQL with dbt compile.

  • Don't hardcode schema or table names; rely on ref/source for portability
  • Don't confuse string quoting with identifier quoting
  • Eyeball the generated SQL with dbt compile
PitfallSymptomFix
Hardcoded referencesBreaks when switching environmentsUse ref/source
Trailing commasSQL syntax errorControl with join or loop.last
Unsafe dependency on run_queryNon-deterministic and slowMove metadata to vars or source control

Safe vs. unsafe references

良い: {{ ref('stg_orders') }} -> コンパイル時に解決
悪い: raw_schema.stg_orders -> 環境で破綻

Limit run_query to auxiliary use inside macros

-- macros/get_max_order_date.sql
{% macro get_max_order_date() %}
  {% set res = run_query("select max(order_date) as d from " ~ ref('stg_orders')) %}
  {% if res is not none and res.columns and res.rows %}
    {{ return(res.rows[0][0]) }}
  {% else %}
    {{ return("'1970-01-01'") }}
  {% endif %}
{% endmacro %}

-- 悪手: モデル本体の where に直接 run_query 結果を依存させない

Check with a Question

Analytics Engineer

問題 1

As an Analytics Engineer, you need to reference the right table name per environment while correctly wiring up dependencies between models. Which is the best approach?

  1. Use from {{ ref('stg_customers') }} and let dbt handle dependencies and identifier resolution
  2. Hardcode the full path as from prod.analytics.stg_customers
  3. Variableize only the schema: from {{ var('schema') }}.stg_customers
  4. Reference via environment variable: from {{ env_var('DBT_SCHEMA') }}.stg_customers

正解: A

ref builds the dependency graph and resolves identifiers per adapter at the same time, and is robust to environment differences. Hardcoding schemas/tables or stitching things together with hand-rolled variables is inferior in portability and dependency resolution.

Frequently Asked Questions

When should I use var versus env_var?

var refers to template variables managed in the dbt project or on the command line, while env_var reads values from the runtime (OS) environment. For reproducibility, manage values that affect application behavior with var, and pass credentials or secrets through environment variables (env_var).

Which takes precedence: config in the model or in dbt_project.yml?

The config defined inside the model wins for that specific model. Project-level settings are defaults, and per-model config overrides them.

What is the safe way to build a column list with a loop?

Use the join filter, or control the trailing comma with loop.last. For example: select {{ cols | join(', ') }}. This prevents SQL syntax errors.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
dbt

dbt Models: SQL-Defined Transformation Units (2026)

Model fundamentals — SELECT-based definitions, naming, refs,...

dbt

dbt Analytics Engineering Exam: Complete Guide (2026)

Pass the AE Certification — scope, weighting, sample questio...

dbt

dbt Cloud vs dbt Core: Feature & Cost Comparison (2026)

Honest comparison of dbt Cloud vs. dbt Core — IDE, scheduler...

dbt

dbt Project Structure: models/seeds/macros Layout (2026)

Recommended dbt project layout — models, seeds, macros, snap...

dbt

dbt_project.yml Explained: Every Config (2026)

Every dbt_project.yml setting that matters — paths, vars, ma...

Browse all dbt articles (101)
© 2026 NicheeLab All rights reserved.