Snowflake centers on the VARIANT type to store semi-structured data — JSON, Parquet, Avro, ORC, XML — natively inside relational tables and query it directly with SQL. The three operation patterns that show up most often on exams are: path access via colon notation (:), array unnesting via LATERAL FLATTEN, and construction via OBJECT_CONSTRUCT/ARRAY_AGG.
VARIANT is a general-purpose type that can store any semi-structured data — JSON, objects, arrays, scalar values, and more. A single value can hold up to 16MB, and Snowflake stores it internally in a compressed columnar format.
| Semi-structured Type | What it Stores | Use Case |
|---|---|---|
| VARIANT | Any semi-structured data (JSON / object / array / scalar) | The most general-purpose; the landing zone for JSON loads |
| OBJECT | A collection of key-value pairs | JSON objects with a known structure |
| ARRAY | An ordered array of values | Lists, tags, and array data |
-- Create a table with a VARIANT column
CREATE OR REPLACE TABLE raw.events (
event_id INT AUTOINCREMENT,
received_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(),
payload VARIANT
);
-- Direct INSERT of JSON data
INSERT INTO raw.events (payload)
SELECT PARSE_JSON('{
"user_id": 42,
"action": "purchase",
"items": [
{"sku": "A100", "qty": 2, "price": 1500},
{"sku": "B200", "qty": 1, "price": 3200}
],
"metadata": {"source": "web", "version": "2.1"}
}');You access keys inside VARIANT data with colon notation (:). Use dot notation (.) for nested objects and bracket notation ([]) for array elements.
-- Accessing top-level keys
SELECT
payload:user_id::INT AS user_id,
payload:action::STRING AS action
FROM raw.events;
-- Accessing nested objects
SELECT
payload:metadata.source::STRING AS source,
payload:metadata.version::STRING AS api_version
FROM raw.events;
-- Accessing array elements (0-indexed)
SELECT
payload:items[0].sku::STRING AS first_sku,
payload:items[0].price::NUMBER AS first_price
FROM raw.events;Without an explicit cast using :: (the cast operator), the result stays as VARIANT. You must cast whenever you use the value in a WHERE clause or a JOIN.
LATERAL FLATTEN is the function that expands arrays and objects inside a VARIANT into rows. It is the standard way to convert JSON array elements into a relational, one-row-per-element form.
-- Expand each item in the array into its own row
SELECT
e.event_id,
e.payload:user_id::INT AS user_id,
f.index AS item_index,
f.value:sku::STRING AS sku,
f.value:qty::INT AS quantity,
f.value:price::NUMBER AS price
FROM raw.events e,
LATERAL FLATTEN(INPUT => e.payload:items) f;| Column | Description |
|---|---|
| SEQ | Sequence number of the input record |
| KEY | Object key name (NULL for arrays) |
| PATH | Path string to the element |
| INDEX | Array index (0-based) |
| VALUE | The expanded value (VARIANT) |
| THIS | The entire source array/object being expanded |
-- Two-level unnesting (orders -> items)
SELECT
o.value:order_id::INT AS order_id,
i.value:product::STRING AS product,
i.value:amount::NUMBER AS amount
FROM raw.batch_data d,
LATERAL FLATTEN(INPUT => d.payload:orders) o,
LATERAL FLATTEN(INPUT => o.value:items) i;OBJECT_CONSTRUCT builds a VARIANT JSON object out of relational columns. Use it for data unloads (exporting as JSON) and for generating payloads to send to APIs.
-- Build a JSON object from relational data
SELECT OBJECT_CONSTRUCT(
'order_id', o.order_id,
'customer', o.customer_name,
'total', o.total_amount,
'order_date', o.order_date::STRING
) AS order_json
FROM analytics.orders o
WHERE o.order_date = '2026-03-27';
-- OBJECT_CONSTRUCT_KEEP_NULL (keeps NULL keys)
SELECT OBJECT_CONSTRUCT_KEEP_NULL(
'name', e.name,
'email', e.email,
'phone', e.phone
) AS employee_json
FROM hr.employees e;ARRAY_AGG aggregates values from multiple rows into a single ARRAY. Combined with GROUP BY, it is useful for converting relational data into a denormalized JSON structure.
-- Aggregate per-customer order IDs into an array
SELECT
customer_id,
ARRAY_AGG(DISTINCT order_id) WITHIN GROUP (ORDER BY order_date)
AS order_ids,
ARRAY_SIZE(ARRAY_AGG(DISTINCT order_id)) AS order_count
FROM analytics.orders
GROUP BY customer_id;
-- Combining OBJECT_CONSTRUCT and ARRAY_AGG
SELECT OBJECT_CONSTRUCT(
'customer_id', customer_id,
'orders', ARRAY_AGG(
OBJECT_CONSTRUCT(
'order_id', order_id,
'amount', total_amount
)
)
) AS customer_orders_json
FROM analytics.orders
GROUP BY customer_id;When loading JSON files, a very common pattern is using the STRIP_OUTER_ARRAY option to break the outermost array into individual rows.
-- Define a JSON File Format
CREATE OR REPLACE FILE FORMAT json_format
TYPE = JSON
STRIP_OUTER_ARRAY = TRUE
ALLOW_DUPLICATE = FALSE;
-- Load from a stage into a VARIANT column
COPY INTO raw.events (payload)
FROM @data_stage/events/
FILE_FORMAT = json_format;
-- Load Parquet files (auto schema detection)
COPY INTO raw.parquet_data
FROM @data_stage/parquet/
FILE_FORMAT = (TYPE = PARQUET)
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;| Function | Use Case | Example |
|---|---|---|
| PARSE_JSON | Convert string -> VARIANT | PARSE_JSON('{"key":"val"}') |
| TO_JSON | Convert VARIANT -> JSON string | TO_JSON(payload) |
| ARRAY_SIZE | Get the number of elements in an array | ARRAY_SIZE(payload:items) |
| ARRAY_CONTAINS | Check whether a value exists in an array | ARRAY_CONTAINS('A100'::VARIANT, payload:skus) |
| OBJECT_KEYS | Get an object's keys as an array | OBJECT_KEYS(payload:metadata) |
| GET_PATH / GET | Get a VARIANT element by path string | GET_PATH(payload, 'metadata.source') |
| TYPEOF | Check the internal type of a VARIANT value | TYPEOF(payload:user_id) |
Semi-structured Data
問題 1
You want to expand the JSON array payload:items stored in a VARIANT column payload and retrieve the sku and price of each element. Which SQL is correct?
正解: A
Use LATERAL FLATTEN to expand an array inside a VARIANT into rows. You access each expanded element via f.value, and then pick keys with colon notation like f.value:sku. payload:items.sku is a property access on the array as a whole and won't produce the expected result. SPLIT is a string-splitting function and cannot be used to expand a VARIANT array.
Is there a size limit for data stored in a VARIANT column?
A single VARIANT value is capped at 16MB. There are also size constraints on the micro-partitions that contain VARIANT columns, so in production you should aim to keep each record within a few MB. Rather than storing a giant JSON document as-is, it is better — both for query performance and for cost — to normalize the data with LATERAL FLATTEN and split it into separate tables.
Do queries against VARIANT columns perform as well as queries against relational columns?
Snowflake stores VARIANT data internally in a columnar format alongside metadata (type information and path information), so frequently accessed keys can be retrieved quickly. However, evaluating semi-structured paths via colon notation (:) or bracket notation ([]) does carry more overhead than scanning a relational column. The best practice is to pre-extract keys you frequently filter, JOIN, or aggregate on into relational columns using a VIEW or CTAS.
How do you unnest deeply nested arrays with FLATTEN?
LATERAL FLATTEN unnests a single level at a time, so for multi-level nesting you chain multiple LATERAL FLATTEN calls. For a 3-level structure like orders[].items[].tags[], you would write FROM raw, LATERAL FLATTEN(raw.data:orders) o, LATERAL FLATTEN(o.value:items) i, LATERAL FLATTEN(i.value:tags) t. Each FLATTEN result is passed to the next FLATTEN via its value column.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
Snowflake Certifications: All 11 Exams Explained (2026)
Every SnowPro certification — Associate, Core, Specialty, Ad...
Snowflake Exam Difficulty Ranking: All 11 Certs Compared (2026)
All 11 SnowPro exams ranked by difficulty with study-time es...
Snowflake Study Guide: Fastest Pass Route by Exam (2026)
How to pass SnowPro certifications efficiently — official ma...
SnowPro Core (COF-C03): Complete Exam Guide (2026)
Pass the SnowPro Core exam — six domains, scope, sample ques...
SnowPro Associate Platform (SOL-C01): Complete Guide (2026)
The entry-level SnowPro Associate exam — scope, weighting, s...