Unity Catalog: Databricks Governance for Tables, Models, and Files Across Workspaces
Group COO & CISO
Operational excellence, governance, and information security. Aligns technology, risk, and business outcomes in complex IT environments
Unity Catalog is Databricks' unified governance layer for data and AI assets — tables, views, volumes (files), models, functions, and external locations — across every workspace in an account. It replaced the legacy Hive metastore as the recommended catalog in 2023 and by 2026 is the only catalog Databricks invests in for new features. If you are running Databricks today and have not migrated, your governance, audit, and lineage story is two years behind.
This article explains the three-level namespace, the privilege model, the governance primitives that matter under regulations like GDPR, NIS2, and DORA, and the operational patterns that hold up across customer audits.
Why a New Catalog Was Needed in the First Place
The Hive metastore was workspace-scoped, two-level (database.table), and offered only coarse table-level access controls. In an enterprise running ten workspaces, sharing a curated dataset meant copying it ten times or building bespoke ETL between metastores. There was no native way to govern ML models, files, or external locations the same way as tables. Audit logs were fragmented, lineage was missing, and PII handling required bolt-on tooling.
Unity Catalog addresses all of these with one metastore per region per account, a three-level namespace, and consistent governance over every asset type.
The Three-Level Namespace
Every object in Unity Catalog has a fully qualified name: catalog.schema.object.
- Catalog: the top container; usually one per environment (e.g., dev, staging, prod) or per business domain
- Schema: equivalent to a database in Hive, groups related objects (e.g., sales, marketing)
- Object: table, view, materialised view, volume, function, model, or registered model version
-- Browse the namespace
SHOW CATALOGS;
SHOW SCHEMAS IN prod;
SHOW TABLES IN prod.sales;
-- Fully qualified reference
SELECT * FROM prod.sales.orders;
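The three-level namespace replaces the legacy two-level database.table pattern. In a Unity Catalog-enabled workspace, existing Hive tables appear under the hive_metastore catalog (hive_metastore.database.table), so existing references continue to work for backwards compatibility, but new tables should always be created in Unity Catalog catalogs.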
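To make the naming rule concrete, here is a minimal Python sketch that splits a reference into its catalog, schema, and object parts. The helper name `parse_qualified_name` is hypothetical, and it assumes two-part legacy references belong to `hive_metastore` (in practice that depends on the workspace's default catalog). It does not handle backtick-quoted identifiers that contain dots.

```python
from typing import NamedTuple


class QualifiedName(NamedTuple):
    catalog: str
    schema: str
    obj: str


def parse_qualified_name(name: str, default_catalog: str = "hive_metastore") -> QualifiedName:
    """Split 'catalog.schema.object' (or legacy 'schema.object') into parts."""
    parts = name.split(".")
    if len(parts) == 3:
        return QualifiedName(*parts)
    if len(parts) == 2:
        # Assumption: a two-level reference is a legacy Hive table.
        return QualifiedName(default_catalog, *parts)
    raise ValueError(f"expected 2 or 3 dot-separated parts, got {name!r}")


print(parse_qualified_name("prod.sales.orders"))
print(parse_qualified_name("sales.orders"))
```

A helper like this can be useful in migration scripts that need to spot which references in existing code still point at the legacy catalog.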
Securables, Principals, and Privileges
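Unity Catalog grants are SQL-standard GRANT / REVOKE statements. Privileges flow down the hierarchy: a privilege granted on a catalog is inherited by all current and future schemas and objects below it. There is no DENY, and an inherited privilege cannot be revoked on a single child object, so grant at the narrowest level that fits and revoke on the object where the grant was made.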
-- Catalog-level grants
GRANT USE CATALOG ON CATALOG prod TO `data-platform`;
GRANT CREATE SCHEMA ON CATALOG prod TO `data-platform`;
-- Schema-level grants
GRANT USE SCHEMA, SELECT ON SCHEMA prod.sales TO `analyst-team`;
-- Table-level grants
GRANT SELECT ON TABLE prod.sales.orders TO `analyst-team`;
GRANT MODIFY ON TABLE prod.sales.orders TO `etl-service-principal`;
-- Volume (files) grant
GRANT READ VOLUME, WRITE VOLUME ON VOLUME prod.raw.landing TO `ingest-service`;
-- Model grant (registered models are a subtype of the FUNCTION securable, hence ON FUNCTION)
GRANT EXECUTE ON FUNCTION prod.ml.fraud_score TO `prod-app-sp`;
Privileges that matter most in practice:
| Privilege | Applies to | What it allows |
|---|---|---|
| USE CATALOG / USE SCHEMA | Catalog / schema | Prerequisite for any operation on objects inside the catalog or schema; grants no data access by itself |
| SELECT | Table / view | Read access to rows |
| MODIFY | Table | INSERT, UPDATE, DELETE, MERGE |
| CREATE TABLE / SCHEMA | Schema / catalog | Create new objects |
| READ VOLUME / WRITE VOLUME | Volume | File-level access for non-tabular data |
| EXECUTE | Function / model | Invoke a UDF or call a model |
| BROWSE | Catalog / schema | See metadata without data access — useful for data discovery |
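The interaction between inheritance and the USE CATALOG / USE SCHEMA prerequisites is easier to see in a toy model. The sketch below is plain Python, not the Unity Catalog engine: the function names and the sample grants are hypothetical, and it deliberately ignores ownership, ALL PRIVILEGES, and group membership.

```python
# Toy model of downward privilege inheritance; not the Unity Catalog engine.
Grants = set[tuple[str, str, str]]  # (principal, privilege, securable)


def ancestors(securable: str) -> list[str]:
    """'prod.sales.orders' -> ['prod.sales.orders', 'prod.sales', 'prod']."""
    parts = securable.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


def holds(grants: Grants, principal: str, privilege: str, securable: str) -> bool:
    """True if the privilege is granted on the securable or any ancestor."""
    return any((principal, privilege, s) in grants for s in ancestors(securable))


def can_select(grants: Grants, principal: str, table: str) -> bool:
    """SELECT on a table also needs USE CATALOG and USE SCHEMA on its parents."""
    catalog, schema, _ = table.split(".")
    return (
        holds(grants, principal, "USE CATALOG", catalog)
        and holds(grants, principal, "USE SCHEMA", f"{catalog}.{schema}")
        and holds(grants, principal, "SELECT", table)
    )


grants: Grants = {
    ("analyst-team", "USE CATALOG", "prod"),
    ("analyst-team", "USE SCHEMA", "prod.sales"),
    ("analyst-team", "SELECT", "prod.sales"),  # schema-level, inherited by its tables
}
print(can_select(grants, "analyst-team", "prod.sales.orders"))  # True
print(can_select(grants, "analyst-team", "prod.hr.salaries"))   # False
```

The second call fails because nothing grants USE SCHEMA on `prod.hr`, even though the principal can enter the catalog. The same gap is a common cause of "table not found" style errors after a grant that looked sufficient.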
Row Filters and Column Masks
For regulated workloads — healthcare, financial services, anything with PII — table-level grants are not enough. Unity Catalog supports row filters and column masks defined as SQL functions, applied at query time.
-- Mask the email column for non-privileged users
CREATE OR REPLACE FUNCTION prod.sec.email_mask(email STRING)
RETURN CASE
  WHEN is_member('pii-readers') THEN email
  ELSE regexp_replace(email, '(^.).+(@.+$)', '$1***$2')
END;

ALTER TABLE prod.sales.customers
  ALTER COLUMN email
  SET MASK prod.sec.email_mask;

-- Row filter: only show rows for a user's region
CREATE OR REPLACE FUNCTION prod.sec.region_filter(region STRING)
RETURN region IN (
  SELECT region FROM prod.sec.user_regions WHERE user_email = current_user()
);

ALTER TABLE prod.sales.orders
  SET ROW FILTER prod.sec.region_filter ON (region);
This is significantly less brittle than the row-access-policy patterns common on legacy data warehouses. The functions are first-class SQL, version-controlled in source, and auditable in lineage.
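Because mask functions are ordinary code, their edge cases can be checked before they reach production. The following Python sketch mirrors the regular expression from the email mask above. It is an approximation: Python's `re` stands in for the Java regex engine behind `regexp_replace`, and the `is_pii_reader` flag is a hypothetical stand-in for the group-membership check.

```python
import re

# Same pattern as the SQL mask: first character, the rest of the local part, then the domain.
_EMAIL_PATTERN = re.compile(r"(^.).+(@.+$)")


def email_mask(email: str, is_pii_reader: bool) -> str:
    """Return the email unchanged for privileged readers, masked otherwise."""
    if is_pii_reader:
        return email
    return _EMAIL_PATTERN.sub(r"\1***\2", email)


print(email_mask("alice@example.com", is_pii_reader=True))   # alice@example.com
print(email_mask("alice@example.com", is_pii_reader=False))  # a***@example.com
print(email_mask("a@example.com", is_pii_reader=False))      # a@example.com
```

The last call shows a gap worth knowing about: with a single-character local part the pattern does not match, so the value passes through unmasked. If that matters for your data, a stricter mask (for example, always replacing the whole local part) may be a safer choice.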
Lineage, Audit, and System Tables
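Unity Catalog automatically captures column-level lineage across SQL and notebook code. The lineage graph shows which downstream tables depend on a source column, which queries last touched it, and which users ran them. For incident response ("who accessed the table holding this PII column in the last 90 days?"), a first answer is one query against the system tables. Note that getTable events record table-level access, so treat the result as a list of candidate users rather than proof that a specific column was read: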
SELECT user_identity.email, request_params.full_name_arg AS table_name, event_time
FROM system.access.audit
WHERE service_name = 'unityCatalog'
AND action_name = 'getTable'
AND request_params.full_name_arg = 'prod.sales.customers'
AND event_time > current_date() - INTERVAL 90 DAYS
ORDER BY event_time DESC;
The system.access.audit, system.access.table_lineage, system.access.column_lineage, and system.billing.usage tables are populated automatically once the corresponding system schemas are enabled, and are queryable via SQL. For organisations under DORA or NIS2, this is a meaningful uplift over the bolted-together audit pipelines that Hive metastore deployments needed.
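For repeatable incident-response runbooks, the audit query can be wrapped in a small helper so the table name is passed as a bound parameter rather than pasted into the SQL. This is a sketch under assumptions: the function name `build_audit_query` is hypothetical, and running the result relies on a `spark` session with named parameter markers (available in recent Spark and Databricks runtimes).

```python
def build_audit_query(table_name: str, days: int = 90) -> tuple[str, dict[str, str]]:
    """Return (sql, args) listing getTable audit events for one table."""
    if len(table_name.split(".")) != 3:
        raise ValueError("table_name must be catalog.schema.table")
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    sql = f"""
        SELECT user_identity.email, request_params.full_name_arg AS table_name, event_time
        FROM system.access.audit
        WHERE service_name = 'unityCatalog'
          AND action_name = 'getTable'
          AND request_params.full_name_arg = :table_name
          AND event_time > current_date() - INTERVAL {days} DAYS
        ORDER BY event_time DESC
    """
    return sql, {"table_name": table_name}


sql, args = build_audit_query("prod.sales.customers", days=30)
print(sql)
print(args)
# On a cluster with a `spark` session (assumed): spark.sql(sql, args=args).show()
```

Only the validated integer `days` is interpolated into the string; the table name travels separately in `args`, which keeps the helper safe to call with user-supplied input.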
Governing ML Models and Vector Indexes
Unity Catalog governs models the same way as tables. A model registered to UC has a fully qualified name (catalog.schema.model), versioning, ACLs, and lineage that links the training tables, the model, and the inference queries that use it.
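# MLflow registration into Unity Catalog
# Assumes clf is a fitted scikit-learn estimator and X_train its training features.
import mlflow

mlflow.set_registry_uri("databricks-uc")

with mlflow.start_run():
    mlflow.sklearn.log_model(
        sk_model=clf,
        artifact_path="model",
        # Unity Catalog requires a model signature; MLflow infers one from this example.
        input_example=X_train[:5],
        registered_model_name="prod.ml.fraud_score",
    )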
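Once a model is registered, consumers refer to it by a `models:/` URI built from the three-level name plus either a version number or an alias (Unity Catalog uses aliases rather than the older registry stages). The helper below is a hypothetical convenience function, and the alias name `champion` is only an illustration.

```python
from typing import Optional


def model_uri(name: str, version: Optional[int] = None, alias: Optional[str] = None) -> str:
    """Build an MLflow model URI for a Unity Catalog model (catalog.schema.model)."""
    if len(name.split(".")) != 3:
        raise ValueError("Unity Catalog model names have three parts")
    if (version is None) == (alias is None):
        raise ValueError("pass exactly one of version or alias")
    if version is not None:
        return f"models:/{name}/{version}"
    return f"models:/{name}@{alias}"


print(model_uri("prod.ml.fraud_score", version=1))        # models:/prod.ml.fraud_score/1
print(model_uri("prod.ml.fraud_score", alias="champion"))  # models:/prod.ml.fraud_score@champion
# With MLflow configured for Unity Catalog (assumed):
#   mlflow.pyfunc.load_model(model_uri("prod.ml.fraud_score", alias="champion"))
```

Loading by alias lets the inference code stay unchanged while the alias is moved to a newer version, and the caller still needs EXECUTE on the model for the load to succeed.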
Mosaic AI Vector Search indexes are also UC-governed: the index is a UC object, and access to query the index follows the same GRANT model. This matters for AI workloads under regulation — the auditor wants the same lineage and access trail for an LLM-powered application as for a SQL report.
External Locations and Cross-Cloud Sharing
External locations register cloud storage paths (S3 buckets, ADLS containers, GCS buckets) with Unity Catalog, so file access can be governed alongside table access. Combined with Delta Sharing — Databricks' open protocol for sharing live data without copying — UC enables both internal cross-workspace sharing and external sharing with partners on different clouds, even non-Databricks recipients.
Migration from Hive Metastore
The legacy hive_metastore catalog still exists in every workspace for backwards compatibility, but it does not get new features. Migration involves three phases: enable Unity Catalog at the account level, create the new catalog hierarchy, and migrate the tables. SYNC SCHEMA suits external tables, which are upgraded in place without copying data; managed tables stored in the DBFS root have to be copied, typically with DEEP CLONE or CREATE TABLE AS SELECT. Plan 1-3 quarters for a meaningful estate, run both catalogs in parallel during cutover, and treat the migration as the moment to right-size your namespace design rather than lifting and shifting the old database structure.
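For the table-migration phase, a dry run per schema is a low-risk way to see what SYNC would do before anything changes. The sketch below only generates the statements; the function name, the schema list, and the target catalog `prod` are illustrative assumptions, and each statement would still need to be run in a Databricks SQL context.

```python
from typing import Iterable, List


def sync_statements(schemas: Iterable[str], target_catalog: str, dry_run: bool = True) -> List[str]:
    """Generate one SYNC SCHEMA statement per legacy Hive schema."""
    statements = []
    for schema in schemas:
        stmt = f"SYNC SCHEMA {target_catalog}.{schema} FROM hive_metastore.{schema}"
        if dry_run:
            # DRY RUN reports what would be upgraded without changing anything.
            stmt += " DRY RUN"
        statements.append(stmt)
    return statements


for stmt in sync_statements(["sales", "marketing"], target_catalog="prod"):
    print(stmt)
```

Reviewing the dry-run output schema by schema also gives a natural checkpoint for deciding which legacy schemas should be renamed or merged instead of carried over as they are.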
How Opsio Helps
Opsio designs and rolls out Unity Catalog as part of Databricks implementation services engagements, with particular focus on the governance and audit posture demanded by NIS2, GDPR, DORA, and ISO 27001. We pair UC rollout with broader cloud security services for identity federation, network isolation, and key management, and with data analytics engagements when the goal is governed delivery end to end, from raw ingestion to executive dashboards.
About the Author

Group COO & CISO
Fredrik is the Group Chief Operating Officer and Chief Information Security Officer at Opsio. He focuses on operational excellence, governance, and information security, working closely with delivery and leadership teams to align technology, risk, and business outcomes in complex IT environments. He leads Opsio's security practice including SOC services, penetration testing, and compliance frameworks.
Editorial standards: This article was written by a certified practitioner and peer-reviewed by our engineering team. We update content quarterly to ensure technical accuracy. Opsio maintains editorial independence — we recommend solutions based on technical merit, not commercial relationships.