[SOLVED] Azure ML and Data exfiltration prevention

I'm trying to understand how data exfiltration prevention (DEP) works in Azure ML.

The Microsoft-recommended architecture states that I must use a service endpoint along with a service endpoint policy to prevent ML compute subnets from gaining access to storage accounts that are not on the allow list (https://learn.microsoft.com/en-us/azure/machine-learning/how-to-network-isolation-planning#recommended-architecture-with-data-exfiltration-prevention).

Some other examples I found on the web don't use service endpoints and instead prefer private endpoints (PEs) for storage accounts. Does using PEs alone prevent data exfiltration? I'm not sure, because from what I've seen so far, it's possible to add any storage account as a datastore through the ML workspace as long as you have the appropriate access rights for the storage account.

So I'm a bit confused and would appreciate it if someone could shed some light on this.

Solution

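Using private endpoints alone may not prevent data exfiltration, although it does reduce the attack surface and the chances of exfiltration. A private endpoint secures access to the specific storage account it is attached to; it does not by itself stop compute in the subnet from reaching other storage accounts, which is the gap the service endpoint policy in the recommended architecture is meant to close. It is recommended to use a combination of Azure Virtual Network, Azure Private Link, and Azure Policy to secure your Azure Machine Learning resources.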
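
To make the allow-list idea concrete, here is a minimal conceptual sketch in Python of the kind of check a service endpoint policy performs: outbound storage traffic is permitted only if the target storage account falls under an approved scope (a subscription, a resource group, or a single account). This is not the Azure SDK or the real policy engine. The names is_allowed and ALLOWED_SCOPES and all resource IDs are hypothetical and exist only for illustration.

```python
# Conceptual model only: NOT an Azure API. All names and IDs are made up.

def _normalize(resource_id: str) -> str:
    """Lower-case a resource ID and drop any trailing slash for comparison."""
    return resource_id.strip().rstrip("/").lower()


def is_allowed(target_id: str, allowed_scopes: list[str]) -> bool:
    """Return True if target_id equals, or is nested under, an allowed scope."""
    target = _normalize(target_id)
    for scope in allowed_scopes:
        prefix = _normalize(scope)
        # Match on a path-segment boundary so "store" does not match "store2".
        if target == prefix or target.startswith(prefix + "/"):
            return True
    return False


SUB = "/subscriptions/00000000-0000-0000-0000-000000000000"

# Hypothetical allow list: only the workspace's own storage account.
ALLOWED_SCOPES = [
    SUB + "/resourceGroups/ml-rg/providers/Microsoft.Storage/storageAccounts/mlstore",
]

APPROVED = SUB + "/resourceGroups/ml-rg/providers/Microsoft.Storage/storageAccounts/mlstore"
OTHER = SUB + "/resourceGroups/other-rg/providers/Microsoft.Storage/storageAccounts/personalstore"

print(is_allowed(APPROVED, ALLOWED_SCOPES))  # True
print(is_allowed(OTHER, ALLOWED_SCOPES))     # False
```

In this model a private endpoint would correspond to a private route to APPROVED; nothing in it says anything about OTHER. The allow-list check is the separate piece that blocks the second account even when the user has access rights to it.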