In dbt, the Jinja context exposes a read-only object called graph that lets you reference nodes (models, sources, seeds, snapshots, tests, etc.) and dependencies across the project. This makes it possible to implement bulk configuration based on tags, as well as macros for inspecting and visualizing the DAG.
At the same time, graph does not create new dependencies. The edges of the DAG are created by ref and source. Because this distinction is a frequent topic on the Analytics Engineer exam, make sure you understand the correct usage and safe implementation patterns.
graph is the DAG representation of the entire project that becomes available after dbt parses your code. It can be referenced from Jinja in a read-only fashion, exposing each node's kind (resource_type), name, tags, configuration values, dependencies (depends_on), and more. It is usable in standard Jinja contexts such as model SQL, macros, snapshots, and tests.
The important point is that graph does not create dependencies. The DAG edges are produced by ref and source, including the ref and source calls that tests make internally. graph is strictly "look but don't touch." Once you internalize this, graph becomes a powerful tool for project-wide health checks and metadata-driven guardrails.
| Context / function | Main purpose | Creates dependencies? | Where it's available |
|---|---|---|---|
| graph | Inspecting DAG information | No | Models / macros / tests etc. |
| ref | Model reference and dependency creation | Yes | Models / macros |
| source | Source reference and dependency creation | Yes | Models / macros |
| this | Identifier of the current node | No | Models / tests etc. |
| target | Connection profile information | No | Everywhere |
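To make the "No" and "Yes" rows concrete, here is a minimal sketch of a model that both reads graph and calls ref. The file path, the project name my_project, and the model names are hypothetical. The execute check follows the guidance in dbt's documentation, because graph is not fully populated while dbt is still parsing.

```jinja
{# models/marts/fct_orders.sql (hypothetical file) #}
{% if execute %}
  {# Read-only lookup: this does NOT make stg_orders a parent of this model #}
  {% set stg = graph.nodes.get('model.my_project.stg_orders') %}
  {% if stg %}
    {{ log('stg_orders is materialized as ' ~ stg.config.materialized, info=True) }}
  {% endif %}
{% endif %}

select *
from {{ ref('stg_orders') }}  {# this ref call is what creates the DAG edge #}
```

If the ref call were removed, the graph lookup alone would leave the model with no parent, and dbt would be free to build it before stg_orders.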
Figure: Relationship between the DAG and graph references (conceptual)
Example: a macro that uses graph to list downstream models (for run-operation)
```jinja
{% macro show_children(model_name) %}
  {# Look up the node for the requested model. The variable is named
     target_node so it does not shadow dbt's built-in target context. #}
  {% set target_node = (graph.nodes.values()
      | selectattr('resource_type', 'equalto', 'model')
      | selectattr('name', 'equalto', model_name)
      | list | first) %}
  {% if not target_node %}
    {{ exceptions.raise_compiler_error('Model not found: ' ~ model_name) }}
  {% endif %}

  {# A child is any model whose depends_on.nodes contains this unique_id #}
  {% set children = [] %}
  {% for n in graph.nodes.values() if n.resource_type == 'model' %}
    {% if target_node.unique_id in n.depends_on.nodes %}
      {% do children.append(n.name) %}
    {% endif %}
  {% endfor %}
  {{ log('children(' ~ model_name ~ '): ' ~ (children | join(', ')), info=True) }}
{% endmacro %}

{# Usage:
   dbt run-operation show_children --args '{"model_name": "stg_orders"}'
#}
```

graph.nodes is a dictionary (unique_id → node), and values() gives you an iterable view of the nodes. It holds models, seeds, snapshots, and tests; sources are stored separately in graph.sources. In practice, you first narrow by resource_type (model, seed, snapshot, test, etc.), then further restrict by name, package_name, tags, and config (materialized, schema, post-hook, etc.) to land on the target set.
Dependencies are stored on node.depends_on.nodes as an array of unique_ids. To get parents, look at the node's own depends_on. To get children, scan all nodes and pick up the ones that include this node's unique_id in their depends_on.nodes.
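The parent lookup described above can be sketched as a small run-operation macro. The macro name show_parents is hypothetical, and the output is a list of unique_ids, which can include source entries as well as models.

```jinja
{% macro show_parents(model_name) %}
  {% if execute %}
    {% set matches = graph.nodes.values()
        | selectattr('resource_type', 'equalto', 'model')
        | selectattr('name', 'equalto', model_name)
        | list %}
    {% if matches | length == 0 %}
      {{ exceptions.raise_compiler_error('Model not found: ' ~ model_name) }}
    {% endif %}
    {# Parents are read straight off the node's own depends_on.nodes #}
    {% set parents = matches[0].depends_on.nodes %}
    {{ log('parents(' ~ model_name ~ '): ' ~ (parents | join(', ')), info=True) }}
  {% endif %}
{% endmacro %}
```

Unlike the child lookup, this needs no scan beyond finding the node itself, which is why parent queries are the cheaper of the two.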
Combined with Jinja filters such as selectattr, map, rejectattr, unique, and list, you can manipulate node sets declaratively. For example, tag-based control, aggregation of models under a specific folder, and statistics by materialization can all be expressed concisely.
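As one illustration of the filter-based style, the sketch below counts models per materialization using only selectattr, map, unique, and list. The macro name count_by_materialization is hypothetical, and it assumes every model node exposes config.materialized.

```jinja
{% macro count_by_materialization() %}
  {% if execute %}
    {# Narrow to models once, then reuse the list #}
    {% set models = graph.nodes.values()
        | selectattr('resource_type', 'equalto', 'model')
        | list %}
    {# Distinct materializations present in the project #}
    {% set kinds = models | map(attribute='config.materialized') | unique | list %}
    {% for kind in kinds %}
      {% set n = models
          | selectattr('config.materialized', 'equalto', kind)
          | list | length %}
      {{ log(kind ~ ': ' ~ n, info=True) }}
    {% endfor %}
  {% endif %}
{% endmacro %}
```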
For large projects, the trick is to minimize the target set before aggregating or inspecting. Macros that unconditionally scan all of graph.nodes.values() in every model can noticeably slow down parsing and compilation.
Rule checks powered by graph are very effective in practice. Examples include failing the build when a mart-tagged model lacks the required tests, or detecting cases where a staging-layer model is wired directly to a mart-layer model.
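A guardrail of the first kind might look like the sketch below. It assumes "required tests" means at least one test of any type, and that the tag is literally named mart; the macro name assert_mart_models_tested is hypothetical. A stricter rule, such as requiring specific tests, would need extra checks on each test node.

```jinja
{% macro assert_mart_models_tested() %}
  {% if execute %}
    {# Collect the unique_ids of every node that at least one test depends on #}
    {% set tested = [] %}
    {% for t in graph.nodes.values() | selectattr('resource_type', 'equalto', 'test') %}
      {% do tested.extend(t.depends_on.nodes) %}
    {% endfor %}

    {# Find mart-tagged models that no test points at #}
    {% set untested = [] %}
    {% for m in graph.nodes.values() | selectattr('resource_type', 'equalto', 'model') %}
      {% if 'mart' in m.tags and m.unique_id not in tested %}
        {% do untested.append(m.name) %}
      {% endif %}
    {% endfor %}

    {% if untested | length > 0 %}
      {{ exceptions.raise_compiler_error('mart models without tests: ' ~ (untested | join(', '))) }}
    {% endif %}
    {{ log('All mart-tagged models have at least one test.', info=True) }}
  {% endif %}
{% endmacro %}
```

Running dbt run-operation assert_mart_models_tested as an early CI step makes the job fail before any model is built.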
Summaries such as model counts, test counts, and distribution by tag can also be emitted reliably via run-operation and saved as CI artifacts for an operational dashboard.
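A summary macro along these lines could be as small as the following sketch. The macro name project_summary and the shape of the output dictionary are assumptions; how the logged line is captured and stored depends on the CI system.

```jinja
{% macro project_summary() %}
  {% if execute %}
    {% set nodes = graph.nodes.values() | list %}
    {% set n_models = nodes | selectattr('resource_type', 'equalto', 'model') | list | length %}
    {% set n_tests = nodes | selectattr('resource_type', 'equalto', 'test') | list | length %}

    {# Count how many models carry each tag #}
    {% set tag_counts = {} %}
    {% for node in nodes | selectattr('resource_type', 'equalto', 'model') %}
      {% for tag in node.tags %}
        {% do tag_counts.update({tag: tag_counts.get(tag, 0) + 1}) %}
      {% endfor %}
    {% endfor %}

    {# Emit one JSON line that a CI step can pick out of the log #}
    {{ log(tojson({'models': n_models, 'tests': n_tests, 'tags': tag_counts}), info=True) }}
  {% endif %}
{% endmacro %}
```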
graph is static information about the project as interpreted by dbt. It does not contain runtime information such as the actual database state or row counts. To query the real objects inside a schema, use adapter methods such as adapter.get_relation, and do not confuse their results with graph.
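The split between the two can be sketched in a single macro: graph says where a model is supposed to be built, and the adapter says whether that object exists in the warehouse. The macro name check_relation_exists is hypothetical, and the sketch assumes the node exposes database, schema, and alias.

```jinja
{% macro check_relation_exists(model_name) %}
  {% if execute %}
    {% set node = graph.nodes.values()
        | selectattr('resource_type', 'equalto', 'model')
        | selectattr('name', 'equalto', model_name)
        | list | first %}
    {% if not node %}
      {{ exceptions.raise_compiler_error('Model not found: ' ~ model_name) }}
    {% endif %}

    {# Static side: the location comes from the project definition in graph #}
    {% set relation = adapter.get_relation(
        database=node.database,
        schema=node.schema,
        identifier=node.alias) %}

    {# Runtime side: get_relation returns none when the object does not exist #}
    {% if relation is none %}
      {{ log(model_name ~ ' is defined in the project but has not been built', info=True) }}
    {% else %}
      {{ log(model_name ~ ' exists as ' ~ relation, info=True) }}
    {% endif %}
  {% endif %}
{% endmacro %}
```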
Dependencies are created only via ref and source. Even if you make a decision by looking at graph, the DAG itself does not change, so to influence execution order or parallelism you must either use ref/source or control it via selection syntax (+, @, state:modified, etc.).
On the exam, you are frequently asked to distinguish between mechanisms that do and do not create dependencies, the positioning of graph, and how to choose between common Jinja context variables (this, target, var, env_var, etc.). Be ready to articulate in one sentence that graph is for DAG introspection and does not influence execution order.
Beyond that, having a solid grasp of parent-child relationships via depends_on and the basics of set operations using resource_type/tag will help you handle scenario-style questions as well.
Analytics Engineer
Question 1
You want to use dbt's Jinja to reference DAG information and check whether models tagged 'mart' have the required tests configured. Which is the correct approach?
Correct answer: A
graph is read-only and is used for DAG introspection. Creating dependencies is the responsibility of ref/source. Enumerating database objects or reconstructing the DAG from environment variables is not part of dbt's official dependency-management mechanism.
Can graph be referenced from a model's SQL body?
Yes. It is available in the general Jinja context. That said, doing heavy full scans inside every model slows down parsing and compilation, so it is best practice to factor the logic out into a run-operation macro.
Which node attributes are safe to rely on?
Basic attributes such as name, resource_type, package_name, tags, config, and depends_on are practically stable. For unique identification, unique_id is the safest choice. Avoid over-relying on detailed internal structures that may change in the future.
Is there a built-in shortcut for getting downstream nodes?
There is no direct shortcut from Jinja, so you traverse using depends_on. In practice, combining this with CLI selection syntax keeps analysis cost low while producing the same result: dbt ls --select model_name+ lists a model and everything downstream of it, while +model_name would list its upstream parents instead.
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.