Snowflake

What is Snowpipe? Automated Data Ingestion in Snowflake

2026-03-21
更新: 2026-03-27
NicheeLab Editorial Team

Snowpipe is a serverless data ingestion service that continuously and automatically loads files arriving in cloud storage into Snowflake tables. When a new file lands in S3, Azure Blob Storage, or GCS, an event notification triggers Snowpipe and the data is reflected in the target table within minutes. It is a frequently tested topic in the Data Loading / Data Transformation domain of the SnowPro Core exam.

Snowpipe Architecture

Snowpipe runs on serverless compute in the Cloud Services Layer. It does not consume your Virtual Warehouse and is billed in Serverless Credits based on the volume of data loaded. As a result, you do not need to keep a Warehouse running at all times.

Cloud Storage (S3 / Blob / GCS)
  │  File arrives
  ▼
Event Notification (SQS / EventGrid / Pub/Sub)
  │  Notification message
  ▼
Snowpipe (Cloud Services Layer - Serverless)
  │  COPY INTO executes
  ▼
Target Table (Storage Layer)

Auto-ingest: Automatic Ingestion via SQS Notifications

With auto-ingest on AWS, S3 event notifications (SQS) are wired directly into Snowpipe. Every time a file arrives in the S3 bucket, an SQS message is sent to Snowpipe's queue and the load kicks off automatically. The same pattern is available on Azure (EventGrid) and GCP (Pub/Sub).

-- Create an auto-ingest enabled pipe
CREATE OR REPLACE PIPE sales_pipe
  AUTO_INGEST = TRUE
AS
COPY INTO analytics.raw_sales
FROM @sales_stage/daily/
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"')
ON_ERROR = 'SKIP_FILE';

Specifying AUTO_INGEST = TRUE tells Snowflake to internally generate an SQS queue ARN. Binding that ARN to the S3 bucket's event notification configuration completes the file-arrival-triggered pipeline.

Triggering via the REST API

Instead of using auto-ingest, you can also trigger loads by calling the Snowpipe REST API from an external orchestration tool such as Airflow or AWS Lambda. POST a list of file paths to the insertFiles endpoint to load only the files you specify.

-- Pipe for REST API use (AUTO_INGEST = FALSE)
CREATE OR REPLACE PIPE batch_pipe
AS
COPY INTO analytics.raw_events
FROM @event_stage/
FILE_FORMAT = (TYPE = PARQUET);

-- Check pipe status
SELECT SYSTEM$PIPE_STATUS('batch_pipe');

REST API triggering is useful when you want to integrate Snowpipe with an existing workflow management tool, or when you need precise external control over when loads happen.

Monitoring Loads with COPY_HISTORY

Check Snowpipe load results with COPY_HISTORY. The INFORMATION_SCHEMA version is near real-time and retains the last 14 days; the ACCOUNT_USAGE version retains up to 1 year of history.

-- Check the last 24 hours of load history
SELECT
  file_name,
  stage_location,
  status,
  row_count,
  error_count,
  first_error_message,
  last_load_time
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'ANALYTICS.RAW_SALES',
  START_TIME => DATEADD(HOURS, -24, CURRENT_TIMESTAMP())
))
ORDER BY last_load_time DESC;
MethodRetentionLatencyUse case
INFORMATION_SCHEMA.COPY_HISTORY14 daysReal-timeRecent load status checks and incident response
ACCOUNT_USAGE.COPY_HISTORY1 yearUp to 45 min delayLong-term load trend analysis
SYSTEM$PIPE_STATUS-Real-timePipe runtime status and number of pending files

Snowpipe vs. Snowpipe Streaming

Snowpipe Streaming, which became GA in 2023, is a row-level real-time ingestion feature. Because it bypasses files entirely, latency drops to sub-second levels.

AspectSnowpipeSnowpipe Streaming
Data unitFiles (CSV / Parquet / JSON, etc.)Rows
TriggerSQS notifications / REST APIIngest SDK / Kafka Connector
LatencyMinutesSub-second to a few seconds
Cost modelServerless Credits (per file)Serverless Credits (per row / byte)
Use caseContinuous ingestion of batch filesReal-time ingestion of IoT sensor data, clickstreams, etc.

When to Use COPY INTO vs. Snowpipe

COPY INTO is a bulk load command that runs on a Warehouse. The fundamental differences from Snowpipe lie in the execution platform and what triggers the load.

AspectCOPY INTO (manual)Snowpipe
Execution platformVirtual WarehouseServerless
Trigger methodManual SQL / scheduled TaskAutomatic (notification-triggered) / REST API
Best fitInitial bulk loads and reprocessingContinuous file arrivals
Cost structureBased on Warehouse uptimeBased on Serverless Credits

Operational Best Practices

  • Optimize file size: Snowpipe works with small files, but compressed files of roughly 100-250 MB are considered the most cost-efficient
  • ON_ERROR option: In production, setting SKIP_FILE quarantines failing files so you can deal with them individually later — a more stable design
  • Monitor cost via PIPE_USAGE_HISTORY: Use the ACCOUNT_USAGE.PIPE_USAGE_HISTORY view to track credit consumption per pipe
  • Ensure idempotency: Include a timestamp or UUID in every file name so duplicates are reliably eliminated within the 14-day metadata retention window

Check Your Understanding

Data Loading

問題 1

You want to load files into a Snowflake table automatically whenever new CSV files arrive in an S3 bucket. Which configuration is required in the Snowpipe pipe definition?

  1. Specify AUTO_INGEST = TRUE and bind the S3 event notification (SQS) to the Snowpipe queue ARN
  2. Specify SCHEDULE = 'USING CRON 0/5 * * * *' to poll for files
  3. Specify WAREHOUSE_SIZE = 'LARGE' to secure Snowpipe processing capacity
  4. Specify AUTO_RESUME = TRUE to auto-start the pipe's Warehouse

正解: A

Snowpipe auto-ingest requires AUTO_INGEST = TRUE in the pipe definition, with the generated SQS queue ARN configured as the S3 bucket's event notification. Snowpipe runs serverlessly, so specifying a Warehouse size is unnecessary. SCHEDULE is Task syntax, and AUTO_RESUME is a Warehouse configuration parameter.

Frequently Asked Questions

What is the difference between Snowpipe and Snowpipe Streaming?

Snowpipe auto-ingests files (CSV, Parquet, JSON, etc.) from cloud storage, triggered by SQS / EventGrid / Pub/Sub notifications. Snowpipe Streaming, on the other hand, skips files entirely and inserts row-level data directly into tables at sub-second latency via the Kafka Connector or the Snowflake Ingest SDK. Use Snowpipe for file-based ingestion and Snowpipe Streaming for real-time row streams.

Does Snowpipe auto-ingest cause duplicate loads?

Snowpipe internally tracks metadata for loaded files and prevents duplicate loads of the same file. However, that metadata is retained for only 14 days by default, so a file with the same name resubmitted after 14 days may be loaded again. The best practice is to configure FILE_FORMAT options like SKIP_HEADER properly and to add a timestamp suffix to every file name.

How do you detect and handle Snowpipe load failures?

Use the COPY_HISTORY table function (INFORMATION_SCHEMA) or the ACCOUNT_USAGE.COPY_HISTORY view to inspect load history and failure reasons. You can also pre-validate per-file errors by running COPY INTO with VALIDATION_MODE='RETURN_ERRORS'. The standard production pattern is to set ON_ERROR='SKIP_FILE' on the Snowpipe to quarantine failing files, and wire it into a notification pipeline so alerts fire automatically.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる
Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


Related articles
Snowflake

Snowflake Certifications: All 11 Exams Explained (2026)

Every SnowPro certification — Associate, Core, Specialty, Ad...

Snowflake

Snowflake Exam Difficulty Ranking: All 11 Certs Compared (2026)

All 11 SnowPro exams ranked by difficulty with study-time es...

Snowflake

Snowflake Study Guide: Fastest Pass Route by Exam (2026)

How to pass SnowPro certifications efficiently — official ma...

Snowflake

SnowPro Core (COF-C03): Complete Exam Guide (2026)

Pass the SnowPro Core exam — six domains, scope, sample ques...

Snowflake

SnowPro Associate Platform (SOL-C01): Complete Guide (2026)

The entry-level SnowPro Associate exam — scope, weighting, s...

Browse all Snowflake articles (103)
© 2026 NicheeLab All rights reserved.