Databricks SQL is a SQL analytics environment designed for BI analysts and SQL users. It bundles SQL warehouses (dedicated compute), a SQL editor, dashboards, and alerts into a single product, delivering a SQL analytics experience on the Lakehouse that rivals traditional data warehouses such as Snowflake, BigQuery, and Redshift. This article walks through SQL warehouse types, the Photon engine, dashboards, query federation, Query Profile, Unity Catalog integration, and exam-relevant points.
Databricks SQL is the integrated environment for running SQL analytics on the Databricks Lakehouse Platform. Separate from the notebook environment where data engineers build ETL, it is made up of the following SQL-focused components.
Databricks SQL offers three types of SQL warehouses. Pick the right one based on workload characteristics, startup speed, cost, and required features.
| Characteristic | Classic | Pro | Serverless |
|---|---|---|---|
| Startup time | 5-10 min | 5-10 min | Seconds |
| Photon engine | Enabled | Enabled | Enabled |
| Query federation | Not supported | Supported | Supported |
| Predictive Optimization | Not supported | Supported | Supported |
| Infrastructure management | Customer-managed | Customer-managed | Databricks-managed |
| Autoscaling | Manual configuration | Manual configuration | Fully automatic |
| DBU rate | Low | Medium | High |
| Recommended use case | Basic SQL queries | Advanced analytics, federation | Ad-hoc queries, dashboards |
As of 2026, Databricks recommends Serverless SQL Warehouse. It starts in seconds, drastically reducing wait time for ad-hoc queries, and bills nothing while idle, making it cost-efficient for intermittent usage patterns.
Photon is Databricks' next-generation vectorized query execution engine, written in C++. It is enabled by default on every SQL warehouse and speeds up queries transparently — no SQL or code changes required.
Benchmarks report 2-8x speedups over the legacy Spark engine, with the largest gains on large-volume scans, filters, aggregations, and joins.
Databricks SQL dashboards place SQL query results into visualizations (charts and tables) to compose interactive reports. The 2026 refresh added AI-assisted automatic dashboard generation.
SQL alerts run a specified SQL query on a schedule and automatically send a notification when results meet a threshold condition. They are used for data quality monitoring, KPI threshold alerts, and anomaly detection.
-- アラート用クエリの例: 過去1時間のAPIエラー率が5%を超えたら通知
SELECT
COUNT(CASE WHEN status_code >= 500 THEN 1 END) * 100.0
/ NULLIF(COUNT(*), 0) AS error_rate_pct
FROM prod_catalog.logs.api_requests
WHERE request_timestamp > current_timestamp() - INTERVAL 1 HOUR
-- アラート条件: error_rate_pct > 5
-- 通知先: Slackチャンネル #data-alerts
-- 評価間隔: 15分ごとAlert evaluation results are tracked in three states (TRIGGERED / OK / UNKNOWN), and you can configure a notification to fire when the alert recovers from TRIGGERED back to OK as well.
Query federation (Lakehouse Federation) lets you query data in external databases (PostgreSQL, MySQL, SQL Server, Snowflake, BigQuery, and others) directly from Databricks SQL. You can query across heterogeneous data sources without moving data or building ETL pipelines.
-- 1. 外部データベースへの接続を作成
CREATE CONNECTION pg_production
TYPE POSTGRESQL
OPTIONS (
host 'prod-db.example.com',
port '5432',
user secret('db_scope', 'pg_user'),
password secret('db_scope', 'pg_password')
);
-- 2. 外部カタログの登録
CREATE FOREIGN CATALOG pg_catalog
USING CONNECTION pg_production
OPTIONS (database 'production');
-- 3. 外部テーブルとDelta Lakeテーブルの横断クエリ
SELECT c.customer_name, o.order_total, o.order_date
FROM prod_catalog.sales.customers c
JOIN pg_catalog.public.orders o
ON c.customer_id = o.customer_id
WHERE o.order_date >= '2026-01-01'
ORDER BY o.order_total DESC;Federation is available on Pro SQL Warehouse and Serverless SQL Warehouse. External table connection details are managed centrally in Unity Catalog, and access permissions are controlled with GRANT statements.
Query Profile visualizes the execution plan of a SQL query along with each operator's processing time, scanned row count, and shuffle volume. Use it to find query bottlenecks and to guide performance tuning.
The DAA exam covers reading Query Profile and asks you to identify which operator is the bottleneck and whether predicate pushdown is being applied.
Databricks SQL is fully integrated with Unity Catalog. Every piece of data accessed from a SQL warehouse is managed under Unity Catalog's three-level namespace (catalog.schema.table).
Databricks SQL appears on multiple certification exams. The depth of coverage differs by exam, so focus on the points that matter for the exam you are taking.
DEA - SQL Warehouse
問題 1
A data analyst team frequently runs ad-hoc queries and wants to minimize the wait time before queries execute while also cutting costs during idle periods. Which SQL warehouse type is the best fit?
正解: C
Serverless SQL Warehouse starts in seconds, minimizing wait time for ad-hoc queries. When idle, the warehouse returns to the shared resource pool that Databricks manages automatically, so there are no charges during downtime. Classic/Pro take 5-10 minutes to start, creating a dilemma where enabling Auto-stop makes restart times painful. Option A is expensive because the warehouse runs 24/7, and Option D — an All-Purpose Cluster — is not as optimized for SQL analytics as a SQL warehouse. The DEA exam includes questions about choosing the right SQL warehouse type.
How is Databricks SQL different from Spark SQL?
Spark SQL is Apache Spark's distributed SQL execution engine, used to run SQL from notebooks and jobs. Databricks SQL is a complete SQL-focused analytics environment that bundles SQL warehouses (dedicated compute), the SQL editor, dashboards, and alerts. It is built for BI analysts and SQL users, with the Photon engine delivering fast query execution. Think of Spark SQL as the data engineer's tool and Databricks SQL as the data analyst's product.
How much does Serverless SQL Warehouse cost compared with Pro?
Serverless SQL Warehouse charges roughly 1.4-1.7x the DBU rate of Pro, but it starts in seconds (Pro takes minutes) and incurs no charges while idle. For ad-hoc queries or intermittent workloads, Serverless often has the lower total cost; for always-on, high-throughput dashboards Pro can be more cost-effective. Tune the warehouse Auto-stop setting (default 15 minutes) at the same time.
Which exam domains cover Databricks SQL?
The Data Engineer Associate (DEA) exam tests the comparison of SQL warehouse types (Classic/Pro/Serverless), the use cases for dashboards and alerts, and Unity Catalog integration. The Data Analyst Associate (DAA) exam goes deeper into practical SQL syntax, dashboard creation workflows, parameterized queries, and reading Query Profile. The basics of the Photon engine can appear on both DEA and DAA.
Try Databricks SQL questions
Check your grasp of SQL warehouses, dashboards, and federation
Try free questions →Related Articles
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
Databricks Certifications: All 7 Exams, Difficulty & Study Plan (2026)
Complete guide to all 7 Databricks certifications — Data Eng...
Databricks Exam Difficulty Ranking: All 7 Certs Compared (2026)
Every Databricks certification ranked by difficulty, with st...
Databricks Study Guide: Fastest Pass Route & Time Estimates (2026)
How to pass Databricks certifications efficiently. Official ...
Databricks Data Engineer Associate: Complete Guide (2026)
Domain-by-domain breakdown of the Databricks Certified Data ...
Databricks Data Engineer Professional: Complete Guide (2026)
Tactics for the Databricks Certified Data Engineer Professio...