Unity Catalog is the account-level unified data governance layer provided by Databricks. It delivers centralized access control, audit logging, and data lineage across multiple workspaces for every data asset, including tables, views, functions, ML models, and files (Volumes). It unifies metadata management that used to be fragmented per workspace in the legacy Hive Metastore by introducing a three-level namespace of catalog.schema.table, and enables SQL-standard GRANT/REVOKE-based permission management.
This article covers Unity Catalog end-to-end: hierarchy, permission model, External Locations, Delta Sharing integration, and how it differs from Hive Metastore. It also highlights the points tested on the Data Engineer Associate (DEA) exam, focusing on the Data Governance domain (17% of the exam).
Unity Catalog uses a four-level object model topped by the metastore. Every data asset sits in this tree, and permissions are inherited from parents down to children.
Metastore (one per account, per region)
├── Catalog: production
│ ├── Schema: sales
│ │ ├── Table: orders
│ │ ├── Table: customers
│ │ ├── View: daily_summary
│ │ └── Function: calc_tax()
│ ├── Schema: marketing
│ │ ├── Table: campaigns
│ │ └── Volume: raw_files
│ └── Schema: information_schema (auto-generated)
├── Catalog: development
│ └── Schema: sandbox
│ └── Table: test_orders
├── External Location: s3://data-lake/external/
└── Storage Credential: aws_s3_role
When you reference a data asset, always use the three-level namespace catalog.schema.object. This makes it possible to uniquely identify which environment, which domain, and which table is being referenced just from the name.
-- Access via the three-level namespace
SELECT * FROM production.sales.orders;
-- ^^^^^^^^^^ ^^^^^ ^^^^^^
-- Catalog Schema Table
-- Set a default Catalog/Schema
USE CATALOG production;
USE SCHEMA sales;
-- After the settings above, you can use short names
SELECT * FROM orders;

| Level | Description | Example Design Pattern |
|---|---|---|
| Metastore | The top-level container for Unity Catalog. Create one per region and attach multiple workspaces to it | One created for ap-northeast-1 (Tokyo) |
| Catalog | Top-level logical grouping of data assets. Separated by environment, department, or project | production / development / staging, finance / marketing |
| Schema | Grouping for tables and views. Organized by domain or data layer | sales / hr / logs, bronze / silver / gold |
| Table / View / Function / Volume | The actual data assets. Tables are either Managed or External | orders (Managed), external_logs (External), calc_tax() (UDF) |
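The way short names expand under USE CATALOG / USE SCHEMA can be sketched in a few lines of Python. This is only an illustrative model of the rule, not how Databricks resolves identifiers internally; `resolve_name` is a hypothetical helper, and it ignores backtick-quoted identifiers that contain dots.

```python
def resolve_name(name, current_catalog=None, current_schema=None):
    """Expand a one-, two-, or three-part name to catalog.schema.object."""
    parts = name.split(".")
    if len(parts) == 3:
        return name  # already fully qualified
    if len(parts) == 2 and current_catalog:
        return f"{current_catalog}.{name}"  # schema.object
    if len(parts) == 1 and current_catalog and current_schema:
        return f"{current_catalog}.{current_schema}.{name}"
    raise ValueError(f"cannot resolve {name!r} with the current defaults")


# Mirrors: USE CATALOG production; USE SCHEMA sales;
print(resolve_name("orders", "production", "sales"))
print(resolve_name("marketing.campaigns", "production", "sales"))
```

A two-part name only borrows the default Catalog, which is why the second call lands in production.marketing rather than production.sales.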
To store data assets in Unity Catalog, you first create a Catalog and a Schema. Catalogs can be created by the Metastore Admin (or any user holding the CREATE CATALOG privilege), and Schemas can be created by the Catalog owner (or any user holding the CREATE SCHEMA privilege).
-- Create a Catalog
CREATE CATALOG IF NOT EXISTS production
COMMENT 'Production environment catalog';
-- Create a Schema
CREATE SCHEMA IF NOT EXISTS production.sales
COMMENT 'Sales domain tables';
-- Create a Managed Table (no LOCATION -> stored in Unity Catalog-managed storage)
CREATE TABLE production.sales.orders (
order_id BIGINT GENERATED ALWAYS AS IDENTITY,
customer_id BIGINT NOT NULL,
amount DECIMAL(12,2) NOT NULL,
order_date DATE NOT NULL,
status STRING DEFAULT 'pending'
)
COMMENT 'Customer order records'
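-- Column DEFAULT values on Delta tables require the allowColumnDefaults table feature
TBLPROPERTIES ('quality' = 'gold', 'delta.feature.allowColumnDefaults' = 'supported');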
-- Create an External Table (with LOCATION -> references data in external storage)
CREATE TABLE production.sales.external_logs (
log_id BIGINT,
message STRING,
timestamp TIMESTAMP
)
LOCATION 's3://data-lake/external/sales/logs/';
Unity Catalog enforces access control with SQL-standard GRANT/REVOKE syntax. Privileges are granted on Securable Objects (Catalog, Schema, Table, View, Function, Volume, External Location, Storage Credential) and attached to users or groups (principals).
| Privilege | Applies To | Effect |
|---|---|---|
| USAGE | Catalog, Schema | Required to access objects nested inside the Catalog or Schema. Current Unity Catalog documentation names these privileges USE CATALOG and USE SCHEMA |
| SELECT | Table, View | Read data (run SELECT statements) |
| MODIFY | Table | Run INSERT / UPDATE / DELETE / MERGE |
| CREATE TABLE | Schema | Create new tables in the Schema |
| CREATE SCHEMA | Catalog | Create new Schemas in the Catalog |
| CREATE CATALOG | Metastore | Create new Catalogs in the metastore |
| ALL PRIVILEGES | All | Grants every privilege on the target object at once |
| CREATE EXTERNAL LOCATION | Storage Credential | Create an External Location using a Storage Credential |
-- Step 1: Open access to the Catalog (open the hallway)
GRANT USAGE ON CATALOG production TO analysts;
-- Step 2: Open USAGE on the Schema
GRANT USAGE ON SCHEMA production.sales TO analysts;
-- Step 3: Grant SELECT on tables (let them into the room)
GRANT SELECT ON SCHEMA production.sales TO analysts;
-- ^ SELECT applies to every table under the Schema
-- Restrict to a specific table
GRANT SELECT ON TABLE production.sales.orders TO analysts;
-- Grant data modification privilege
GRANT MODIFY ON TABLE production.sales.orders TO etl_service;
-- Privilege to create tables in a Schema
GRANT CREATE TABLE ON SCHEMA production.sales TO data_engineers;
-- Inspect privileges
SHOW GRANTS ON SCHEMA production.sales;
SHOW GRANTS TO analysts;
-- Revoke a privilege
REVOKE SELECT ON SCHEMA production.sales FROM analysts;
Unity Catalog privileges flow from parent objects down to children. For example, granting SELECT on a Catalog applies SELECT to every Schema and table under that Catalog. However, holding SELECT never implies USAGE. Unless USAGE is explicitly granted on both the parent Catalog and the Schema, having SELECT on the underlying table is not enough to access the data.
Privilege inheritance:
GRANT SELECT ON CATALOG production TO analysts;
-> SELECT inherited by every Schema and Table under production
But USAGE is still required:
Catalog: production -> USAGE required ✓
Schema: sales -> USAGE required ✓
Table: orders -> readable with SELECT ✓
When USAGE is missing:
Catalog: production -> no USAGE ✗
Schema: sales -> USAGE granted
Table: orders -> SELECT granted -> but inaccessible ✗
Tables registered in Unity Catalog are categorized as Managed Tables or External Tables based on where the data is stored. The DROP behavior is fundamentally different, so the choice matters at design time.
| Comparison | Managed Table | External Table |
|---|---|---|
| Data location | Unity Catalog managed storage (the storage root of the metastore/catalog/schema) | An external path you specify (S3, ADLS, GCS) |
| How to create | CREATE TABLE ... (no LOCATION) | CREATE TABLE ... LOCATION 's3://...' |
| Data on DROP | Both metadata and data files are deleted | Only metadata is deleted; data files remain in external storage |
| Lifecycle management | Fully managed by Unity Catalog | You manage the lifecycle of the data files |
| Recommended use cases | Data that lives entirely inside Databricks; greenfield projects | Integration with existing data lakes; sharing data with other platforms |
Exam favorite: "What happens to the data files after DROP TABLE?" -> Managed Table deletes the data too; External Table only deletes the metadata.
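The DROP difference can be pictured with a toy model in Python. This is a minimal sketch under stated assumptions, not Databricks code: `TableEntry`, `drop_table`, and the `managed-storage/orders` path are hypothetical, a dict stands in for the metastore, a set stands in for cloud storage, and any restore window is ignored.

```python
from dataclasses import dataclass


@dataclass
class TableEntry:
    location: str  # where the data files live
    managed: bool  # True when the table was created without a LOCATION clause


def drop_table(metastore, storage, name):
    """Remove the metadata entry; delete data files only for Managed Tables."""
    entry = metastore.pop(name)
    if entry.managed:
        storage.discard(entry.location)


metastore = {
    "production.sales.orders": TableEntry("managed-storage/orders", managed=True),
    "production.sales.external_logs": TableEntry(
        "s3://data-lake/external/sales/logs/", managed=False
    ),
}
storage = {entry.location for entry in metastore.values()}

drop_table(metastore, storage, "production.sales.orders")
drop_table(metastore, storage, "production.sales.external_logs")

print(metastore)  # both metadata entries are gone
print(storage)    # only the external path is left
```

After both drops the metadata is gone in either case, while only the External Table's files are still present in storage.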
To create an External Table, you must register the credentials Unity Catalog uses to access external storage. The mechanism has two layers: Storage Credential and External Location.
[Storage Credential] Register cloud credentials
| (IAM role / Service Principal / Service Account)
v
[External Location] Bind a credential to an allowed path
url: s3://bucket/path/ "With these credentials, access to this path is OK"
|
v
[External Table] Create a table with LOCATION pointing at the external path
production.sales.ext_logs
LOCATION 's3://bucket/path/logs/'
-- 1. Create a Storage Credential (requires Metastore Admin)
CREATE STORAGE CREDENTIAL aws_s3_credential
WITH (
AWS_IAM_ROLE = 'arn:aws:iam::123456789012:role/unity-catalog-role'
);
-- 2. Create an External Location
CREATE EXTERNAL LOCATION s3_data_lake
URL 's3://my-data-lake/production/'
WITH (STORAGE CREDENTIAL aws_s3_credential)
COMMENT 'Production data lake on S3';
-- 3. Grant privileges on the External Location
GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION s3_data_lake
TO data_engineers;
-- 4. Create the External Table
CREATE TABLE production.sales.external_events (
event_id BIGINT,
event_type STRING,
payload STRING,
created_at TIMESTAMP
)
LOCATION 's3://my-data-lake/production/events/';
A Storage Credential can be reused across multiple External Locations. For example, a single IAM role can back two External Locations (s3://bucket/sales/ and s3://bucket/marketing/) so that different teams get distinct, well-bounded access ranges.
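The idea that an External Location defines an allowed path range can be sketched as a prefix check. The Python below is an illustrative model only: `covering_location` and the names `sales_loc` and `marketing_loc` are hypothetical, and the real validation performed by Unity Catalog may differ in detail.

```python
def covering_location(path, external_locations):
    """Return the name of the External Location whose URL contains `path`, or None."""

    def normalize(url):
        # A trailing slash stops s3://bucket/sales/ from matching s3://bucket/salesforce/
        return url.rstrip("/") + "/"

    target = normalize(path)
    matches = [
        (name, normalize(url))
        for name, url in external_locations.items()
        if target.startswith(normalize(url))
    ]
    if not matches:
        return None
    # Prefer the longest (most specific) matching URL
    return max(matches, key=lambda match: len(match[1]))[0]


locations = {
    "sales_loc": "s3://bucket/sales/",
    "marketing_loc": "s3://bucket/marketing/",
}

print(covering_location("s3://bucket/sales/logs/", locations))
print(covering_location("s3://bucket/finance/reports/", locations))
```

The first path falls inside the sales range, while the second is covered by no External Location, so a table could not be created there in this model.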
Unity Catalog was designed to solve the governance pain points of the legacy Hive Metastore. If you are planning a migration, you need to understand the following differences.
| Comparison | Hive Metastore | Unity Catalog |
|---|---|---|
| Management scope | Per-workspace (each workspace has its own isolated metastore) | Per-account (shared across multiple workspaces) |
| Namespace | Two-level (schema.table / database.table) | Three-level (catalog.schema.table) |
| Permission model | Table ACLs (access control lists per table/view) | SQL-standard GRANT/REVOKE with hierarchical inheritance |
| Audit logs | Relies on cluster logs (hard to audit uniformly) | Unified audit logs via System Tables |
| Data lineage | Manual (no built-in feature) | Automatic table- and column-level lineage |
| Table formats | Delta / Parquet / CSV / JSON / ORC / Avro | Managed Tables are Delta only; External Tables support multiple formats |
| File management | DBFS (no governance) | Volumes (with permission management and auditing) |
| Cross-workspace sharing | Not possible (metadata fragmented across workspaces) | Automatically shared across workspaces attached to the same metastore |
Delta Sharing is an open protocol for securely sharing data across organizational boundaries. Unity Catalog acts as a Delta Sharing provider and can share data with other organizations' Databricks environments as well as non-Databricks environments (Spark, Pandas, Power BI, Tableau, and more).
-- Create a Share
CREATE SHARE customer_analytics;
-- Add a table to the Share
ALTER SHARE customer_analytics
ADD TABLE production.sales.orders;
-- Create a Recipient
CREATE RECIPIENT partner_company
USING ID 'partner-sharing-identifier';
-- Grant the Recipient access to the Share
GRANT SELECT ON SHARE customer_analytics TO RECIPIENT partner_company;
Volumes let you govern non-table files (CSV, JSON, images, model artifacts, configuration files, and so on) under the Unity Catalog permission model. The legacy DBFS (Databricks File System) had no governance and could not track who accessed which file. Because Volumes are created under a Catalog/Schema, you can control privileges with GRANT/REVOKE and access is recorded in the audit logs.
-- Create a Managed Volume (stored in Unity Catalog-managed storage)
CREATE VOLUME production.raw.landing_files;
-- Create an External Volume (references an external path)
CREATE EXTERNAL VOLUME production.raw.s3_landing
LOCATION 's3://my-bucket/landing/';
-- List files
LIST '/Volumes/production/raw/landing_files/';
-- Read files from SQL
SELECT * FROM csv.`/Volumes/production/raw/landing_files/2026-03/data.csv`;
-- Read files in a Volume from Python
-- df = spark.read.csv("/Volumes/production/raw/landing_files/2026-03/data.csv")
Unity Catalog automatically parses the Spark queries run in notebooks and jobs, and records the data flow (lineage) between tables and between columns. No manual setup or activation is required; it works automatically wherever Unity Catalog is enabled.
Audit logs are written to the system.access.audit table (System Tables). You can query who did what to which object and when directly in SQL.
-- Audit events in the last 24 hours
SELECT
event_time,
user_identity.email AS user_email,
action_name,
request_params.full_name_arg AS object_name
FROM system.access.audit
WHERE event_time > current_timestamp() - INTERVAL 24 HOURS
AND action_name IN ('getTable', 'createTable', 'grantPermission')
ORDER BY event_time DESC;
On the Data Engineer Associate (DEA) exam, the Data Governance domain accounts for about 17% of the questions. Unity Catalog is the core topic in that domain. Make sure you nail the items below.
| Topic | What to Remember |
|---|---|
| Three-level namespace | catalog.schema.table structure. The metastore is not part of the namespace |
| USAGE privilege | SELECT does not work without USAGE on both the Catalog and the Schema. Parent is the hallway, child is the room |
| Managed vs External | DROP behavior: Managed -> data is deleted, External -> only metadata is deleted. Distinguished by the presence of a LOCATION clause |
| Storage Credential → External Location | Two-layer authentication. Credential = cloud credentials, Location = allowed path range |
| Volumes | Successor to DBFS. GRANT/REVOKE and audit logs apply to files too |
| Data lineage | No activation needed (recorded automatically). Two flavors: table-level and column-level |
| Delta Sharing | Open protocol. No data copy required. Recipients do not have to be Databricks users |
| Privilege inheritance | Privileges granted on a Catalog are inherited by every Schema and Table below it (USAGE excluded) |
Data Governance / Unity Catalog
Question 1
The analyst team (group name: analysts) tried to SELECT from production.sales.orders but got a 'PERMISSION DENIED' error. The administrator has already run GRANT SELECT ON SCHEMA production.sales TO analysts. Which additional SQL statement is required to resolve the issue?
Answer: A
In the Unity Catalog permission model, you need USAGE on both the Catalog and the Schema to access objects under them. Granting SELECT on the Schema is not enough; without USAGE on the parent Catalog, you cannot "walk down the hallway" to the data. You need to add GRANT USAGE ON CATALOG production TO analysts, and USAGE on the production.sales schema must also be in place, because SELECT ON SCHEMA does not grant it. Option B grants too much privilege and violates least privilege. Option C (READ) is not a valid Unity Catalog privilege name, and Option D (BROWSE) only exposes object metadata, so it does not allow reading the table's data.
Data Governance / External Location
Question 2
A data engineer wants to create an External Table that references existing data on S3. What is the correct order of steps in Unity Catalog?
Answer: C
Creating an External Table requires a Storage Credential and an External Location to exist first. The correct order is (1) Storage Credential (register cloud credentials), (2) External Location (define the allowed path range using that credential), (3) External Table (use a LOCATION clause to create the table). The External Location references the Storage Credential, so the Credential must exist first. The External Table's LOCATION must fall inside the External Location's URL range, so the Location must exist before the Table.
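The ordering argument above is a dependency chain, so it can be derived mechanically. The following Python sketch uses the standard-library `graphlib` module (Python 3.9+); the `dependencies` mapping is just a restatement of the explanation, not something read from Unity Catalog.

```python
from graphlib import TopologicalSorter

# Each object maps to the set of objects that must exist before it
dependencies = {
    "External Table": {"External Location"},
    "External Location": {"Storage Credential"},
    "Storage Credential": set(),
}

# static_order() yields every node after all of its prerequisites
order = list(TopologicalSorter(dependencies).static_order())
print(" -> ".join(order))
```

Because each object has exactly one prerequisite, only one valid creation order exists: Storage Credential, then External Location, then External Table.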
Data Governance / Table Types
Question 3
What happens to the data files when you run DROP TABLE production.sales.orders on a Unity Catalog Managed Table?
Answer: A
Because Unity Catalog fully manages the data lifecycle of a Managed Table, DROP TABLE deletes both the metadata and the data files. For an External Table, by contrast, only the metadata is deleted and the data files remain in external storage. Option B's "30-day retention" is not a real mechanism. Option C describes the External Table behavior. Option D's UNDROP TABLE does exist in Unity Catalog, but it is not that the data files themselves "remain" — Unity Catalog's internal mechanism simply allows restoration for a limited window. On the exam, make sure you can instantly answer the Managed vs. External difference in terms of whether the data files get deleted.
What is the difference between Unity Catalog and Hive Metastore?
Hive Metastore manages metadata per workspace and cannot enforce permissions or auditing across workspaces. Unity Catalog operates at the account level and provides unified access control, audit logs, and data lineage across multiple workspaces. The permission model also differs: Hive Metastore only offers table/view-level ACLs, while Unity Catalog uses SQL-standard GRANT/REVOKE for consistent control across the entire Catalog/Schema/Table/View/Function/Volume hierarchy.
Is Unity Catalog free to use?
The core Unity Catalog features (three-level namespace, GRANT/REVOKE permissions, Managed/External Tables, Volumes) are available on every Databricks edition, including Standard. However, advanced features such as Attribute-Based Access Control (ABAC), column-level automatic lineage, and Lakehouse Monitoring integration require Premium or higher. Check the official Databricks documentation for the latest edition-by-edition feature matrix.
What is the difference between USAGE and SELECT in Unity Catalog?
USAGE is the right to traverse into an object, granted on Catalogs and Schemas. SELECT is the right to read data from a table or view. For example, to SELECT from production.sales.orders, you need SELECT on the table plus USAGE on the production catalog and USAGE on the sales schema. Think of USAGE as the key to the hallway and SELECT as the key to the room. You cannot access the data with just one of them.
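The hallway-and-room rule can be written down as a small check. This Python sketch models only the rules stated in this article (USAGE must be granted explicitly on the Catalog and on the Schema, while SELECT may be granted on the table or inherited from a parent). `can_select` and `has_inherited` are hypothetical helpers, and owners and admins are ignored.

```python
def has_inherited(grants, principal, privilege, securable):
    """True if the privilege is granted on the securable or on any of its parents."""
    parts = securable.split(".")
    return any(
        (principal, privilege, ".".join(parts[:depth])) in grants
        for depth in range(len(parts), 0, -1)
    )


def can_select(grants, principal, table):
    """Apply the article's rule: USAGE on Catalog and Schema, plus SELECT."""
    catalog, schema, _ = table.split(".")
    return (
        (principal, "USAGE", catalog) in grants
        and (principal, "USAGE", f"{catalog}.{schema}") in grants
        and has_inherited(grants, principal, "SELECT", table)
    )


grants = {("analysts", "SELECT", "production.sales")}
print(can_select(grants, "analysts", "production.sales.orders"))  # SELECT only

grants.add(("analysts", "USAGE", "production"))
grants.add(("analysts", "USAGE", "production.sales"))
print(can_select(grants, "analysts", "production.sales.orders"))  # hallway + room
```

The first call fails even though SELECT is inherited from the Schema, because neither USAGE grant is present; the second succeeds once both are added.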
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.