Databricks SQL: SQL Warehouses, Dashboards, Alerts (2026)

Databricks SQL is a SQL analytics environment designed for BI analysts and SQL users. It bundles SQL warehouses (dedicated compute), a SQL editor, dashboards, and alerts into a single product, delivering a SQL analytics experience on the Lakehouse that rivals traditional data warehouses such as Snowflake, BigQuery, and Redshift. This article walks through SQL warehouse types, the Photon engine, dashboards, query federation, Query Profile, Unity Catalog integration, and exam-relevant points.

Databricks SQL Overview

Databricks SQL is the integrated environment for running SQL analytics on the Databricks Lakehouse Platform. Separate from the notebook environment where data engineers build ETL, it is made up of the following SQL-focused components.

SQL Warehouse: Dedicated compute resources for running SQL queries (Classic / Pro / Serverless)
SQL Editor: A web-based editor for writing, running, saving, and sharing queries. Includes autocomplete and a schema browser
Dashboards: Visualize query results and build interactive reports to share with stakeholders
Alerts: Automatically notify via email, Slack, or webhook when query results meet defined conditions
Query History: Inspect past query performance, runtime, and scanned data volume

Comparing the 3 SQL Warehouse Types

Databricks SQL offers three types of SQL warehouses. Pick the right one based on workload characteristics, startup speed, cost, and required features.

Characteristic	Classic	Pro	Serverless
Startup time	5-10 min	5-10 min	Seconds
Photon engine	Enabled	Enabled	Enabled
Query federation	Not supported	Supported	Supported
Predictive Optimization	Not supported	Supported	Supported
Infrastructure management	Customer-managed	Customer-managed	Databricks-managed
Autoscaling	Manual configuration	Manual configuration	Fully automatic
DBU rate	Low	Medium	High
Recommended use case	Basic SQL queries	Advanced analytics, federation	Ad-hoc queries, dashboards

As of 2026, Databricks recommends Serverless SQL Warehouse. It starts in seconds, drastically reducing wait time for ad-hoc queries, and bills nothing while idle, making it cost-efficient for intermittent usage patterns.

The Photon Engine

Photon is Databricks' next-generation vectorized query execution engine, written in C++. It is enabled by default on every SQL warehouse and speeds up queries transparently — no SQL or code changes required.

Vectorized execution: Processes data in batches (columnar format) instead of row by row, maximizing the use of CPU SIMD instructions
Native C++ implementation: Avoids JVM GC overhead and optimizes memory layout
Delta Lake optimization: Specifically optimized for decoding, filtering, and aggregating Parquet files
Full Spark compatibility: Existing Spark SQL queries run as-is with identical results

Benchmarks report 2-8x speedups over the legacy Spark engine, with the largest gains on large-volume scans, filters, aggregations, and joins.

Building and Sharing Dashboards

Databricks SQL dashboards place SQL query results into visualizations (charts and tables) to compose interactive reports. The 2026 refresh added AI-assisted automatic dashboard generation.

Visualizations: Bar, line, pie, scatter, heatmap, counter, table, and many other chart types
Filter widgets: Dropdowns, date ranges, and text-input filters let viewers narrow down the data dynamically
Query parameters: Use {{param_name}} placeholders to bind dynamic values into SQL queries. Provides safe parameterization that prevents SQL injection
Scheduled refresh: Re-run queries automatically at intervals from 1 minute to 24 hours to keep dashboards current
Sharing: Share via URL, embed with an iframe, or export to PDF. View permissions integrate with Unity Catalog to also control data access

Configuring Alerts

SQL alerts run a specified SQL query on a schedule and automatically send a notification when results meet a threshold condition. They are used for data quality monitoring, KPI threshold alerts, and anomaly detection.

-- アラート用クエリの例: 過去1時間のAPIエラー率が5%を超えたら通知
SELECT
  COUNT(CASE WHEN status_code >= 500 THEN 1 END) * 100.0
    / NULLIF(COUNT(*), 0) AS error_rate_pct
FROM prod_catalog.logs.api_requests
WHERE request_timestamp > current_timestamp() - INTERVAL 1 HOUR

-- アラート条件: error_rate_pct > 5
-- 通知先: Slackチャンネル #data-alerts
-- 評価間隔: 15分ごと

Alert evaluation results are tracked in three states (TRIGGERED / OK / UNKNOWN), and you can configure a notification to fire when the alert recovers from TRIGGERED back to OK as well.

Query Federation (Lakehouse Federation)

Query federation (Lakehouse Federation) lets you query data in external databases (PostgreSQL, MySQL, SQL Server, Snowflake, BigQuery, and others) directly from Databricks SQL. You can query across heterogeneous data sources without moving data or building ETL pipelines.

-- 1. 外部データベースへの接続を作成
CREATE CONNECTION pg_production
TYPE POSTGRESQL
OPTIONS (
  host 'prod-db.example.com',
  port '5432',
  user secret('db_scope', 'pg_user'),
  password secret('db_scope', 'pg_password')
);

-- 2. 外部カタログの登録
CREATE FOREIGN CATALOG pg_catalog
USING CONNECTION pg_production
OPTIONS (database 'production');

-- 3. 外部テーブルとDelta Lakeテーブルの横断クエリ
SELECT c.customer_name, o.order_total, o.order_date
FROM prod_catalog.sales.customers c
JOIN pg_catalog.public.orders o
  ON c.customer_id = o.customer_id
WHERE o.order_date >= '2026-01-01'
ORDER BY o.order_total DESC;

Federation is available on Pro SQL Warehouse and Serverless SQL Warehouse. External table connection details are managed centrally in Unity Catalog, and access permissions are controlled with GRANT statements.

Query Profile

Query Profile visualizes the execution plan of a SQL query along with each operator's processing time, scanned row count, and shuffle volume. Use it to find query bottlenecks and to guide performance tuning.

Operator tree: Displays the execution plan (Scan → Filter → Aggregate → Sort, etc.) as a tree
Time breakdown: Color-codes each operator's processing time so you can spot the most expensive operation
Scan statistics: Shows the files, rows, and bytes scanned so you can verify that predicate pushdown is working
Spill information: Identifies where memory pressure caused a spill to disk

The DAA exam covers reading Query Profile and asks you to identify which operator is the bottleneck and whether predicate pushdown is being applied.

Unity Catalog Integration

Databricks SQL is fully integrated with Unity Catalog. Every piece of data accessed from a SQL warehouse is managed under Unity Catalog's three-level namespace (catalog.schema.table).

Access control: Manage permissions on tables, views, and functions with GRANT / REVOKE statements. Dashboard viewers need SELECT permission on the underlying tables referenced by the SQL queries
Data lineage: Data flow between tables is tracked automatically from SQL query execution. Dashboard data-source dependencies are also visualized
Column masking and row filters: Unity Catalog's column masking functions and row filter policies let you display different data to different users from the same table
Audit logs: Audit logs of queries run on SQL warehouses are recorded in System Tables, helping you meet compliance requirements

Exam Coverage (DEA / DAA)

Databricks SQL appears on multiple certification exams. The depth of coverage differs by exam, so focus on the points that matter for the exam you are taking.

Data Engineer Associate (DEA) coverage

Differences between the 3 SQL warehouse types (Classic / Pro / Serverless) and how to choose
Basics of the Photon engine (C++ implementation, vectorization, automatic application)
Basic use cases and setup steps for dashboards and alerts
Unity Catalog integration (referencing tables in the three-level namespace)

Data Analyst Associate (DAA) coverage

Writing queries in the SQL editor and parameterized query syntax
Dashboard creation workflow (adding visualizations, configuring filters, sharing)
Reading Query Profile (identifying bottlenecks, interpreting scan statistics)
Configuring alert conditions and notification destinations
Query federation concepts (CONNECTION / FOREIGN CATALOG)

Sample Question

DEA - SQL Warehouse

問題 1

A data analyst team frequently runs ad-hoc queries and wants to minimize the wait time before queries execute while also cutting costs during idle periods. Which SQL warehouse type is the best fit?

Run a Classic SQL Warehouse (2X-Large) 24/7 with Auto-stop disabled
Use a Pro SQL Warehouse at the smallest size with Auto-stop set to 5 minutes
Use a Serverless SQL Warehouse and let it auto-stop when idle
Run SQL queries on an All-Purpose Cluster

正解: C

Serverless SQL Warehouse starts in seconds, minimizing wait time for ad-hoc queries. When idle, the warehouse returns to the shared resource pool that Databricks manages automatically, so there are no charges during downtime. Classic/Pro take 5-10 minutes to start, creating a dilemma where enabling Auto-stop makes restart times painful. Option A is expensive because the warehouse runs 24/7, and Option D — an All-Purpose Cluster — is not as optimized for SQL analytics as a SQL warehouse. The DEA exam includes questions about choosing the right SQL warehouse type.

Databricks SQL Best Practices

Sizing the warehouse: Use 2X-Small to Small for small ad-hoc queries, and Medium or larger for production dashboards with many concurrent users
Leverage Auto-stop: Save money by stopping warehouses after an idle period. Serverless manages this automatically — no Auto-stop configuration needed
Use the query cache: Repeated identical queries return cached results, which is highly effective for dashboards viewed by multiple users concurrently
Use parameterized queries: Bind dashboard filters with {{param}} query parameters to safely inject values and prevent SQL injection
Combine Delta Lake with Liquid Clustering: Apply Liquid Clustering to columns that are frequently filtered to dramatically reduce SQL warehouse scan volume — speeding up queries and lowering cost at the same time

Frequently Asked Questions

How is Databricks SQL different from Spark SQL?

Spark SQL is Apache Spark's distributed SQL execution engine, used to run SQL from notebooks and jobs. Databricks SQL is a complete SQL-focused analytics environment that bundles SQL warehouses (dedicated compute), the SQL editor, dashboards, and alerts. It is built for BI analysts and SQL users, with the Photon engine delivering fast query execution. Think of Spark SQL as the data engineer's tool and Databricks SQL as the data analyst's product.

How much does Serverless SQL Warehouse cost compared with Pro?

Serverless SQL Warehouse charges roughly 1.4-1.7x the DBU rate of Pro, but it starts in seconds (Pro takes minutes) and incurs no charges while idle. For ad-hoc queries or intermittent workloads, Serverless often has the lower total cost; for always-on, high-throughput dashboards Pro can be more cost-effective. Tune the warehouse Auto-stop setting (default 15 minutes) at the same time.

Which exam domains cover Databricks SQL?

The Data Engineer Associate (DEA) exam tests the comparison of SQL warehouse types (Classic/Pro/Serverless), the use cases for dashboards and alerts, and Unity Catalog integration. The Data Analyst Associate (DAA) exam goes deeper into practical SQL syntax, dashboard creation workflows, parameterized queries, and reading Query Profile. The basics of the Photon engine can appear on both DEA and DAA.

Try Databricks SQL questions

Check your grasp of SQL warehouses, dashboards, and federation

Try free questions →

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

Databricks SQL Complete Guide: Serverless SQL Warehouse