azure-purview data-governance

How can I limit Microsoft Purview Unity Catalog scans to a specific schema and subset of tables?


I'm using Microsoft Purview with the Azure Databricks (Unity Catalog) connector and need to scan only a specific schema and only tables that contain "bronze" in their names (e.g., bronze_sales, bronze_customers).

However, during scan setup:

The scope selection UI only allows selecting at the catalog level (e.g., dev)

There's no schema-level filtering

Custom scan rule sets don’t support table filters for Unity Catalog scans

Table/Folder Filters are not visible during scan rule creation for this source type

This is critical for both cost control and governance — scanning full catalogs with hundreds of schemas and thousands of tables isn’t feasible.


Solution

  • Scoped scans are supported only at the catalog level. To minimize scan volume, you may have to split the catalog and restructure it around your requirements. See the known limitations: https://learn.microsoft.com/en-us/purview/register-scan-azure-databricks-unity-catalog?tabs=MI#known-limitations

    For governance, you can use automation or a script to look for the tables that match your requirements, but this still will not limit what the Unity Catalog scan covers.

    For tracking, you can try lineage; see "Introducing Lineage Tracking for Azure Databricks Unity Catalog in Microsoft Purview":

    https://learn.microsoft.com/en-us/purview/register-scan-azure-databricks-unity-catalog?tabs=MI#lineage
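The automation suggestion above could look something like the following. This is only a minimal sketch, not a Purview or Databricks API: `TableRef`, `select_tables`, and the sample inventory are hypothetical names made up for illustration. It assumes you have already exported a list of (catalog, schema, table) names, for example from Unity Catalog's `information_schema.tables` view, and it only picks out the tables you care about for governance follow-up. It does not change what the Purview scan itself covers.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, NamedTuple


class TableRef(NamedTuple):
    """Hypothetical record for one Unity Catalog table (not a Purview type)."""
    catalog: str
    schema: str
    table: str


def select_tables(
    tables: Iterable[TableRef],
    catalog: str,
    schema: str,
    pattern: str = "*bronze*",
) -> List[TableRef]:
    """Keep tables in the given catalog and schema whose name matches pattern.

    Matching is case-insensitive and uses shell-style wildcards.
    """
    wanted = pattern.lower()
    return [
        t
        for t in tables
        if t.catalog == catalog
        and t.schema == schema
        and fnmatchcase(t.table.lower(), wanted)
    ]


# Hypothetical inventory; in practice this would come from your own export.
inventory = [
    TableRef("dev", "sales", "bronze_sales"),
    TableRef("dev", "sales", "bronze_customers"),
    TableRef("dev", "sales", "silver_sales"),
    TableRef("dev", "finance", "bronze_ledger"),
]

selected = select_tables(inventory, catalog="dev", schema="sales")
print([t.table for t in selected])
```

The resulting list could then drive whatever governance step you need on just those tables, while the scan scope stays at the catalog level as described in the known limitations.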