azure, azure-blob-storage, azure-storage, azure-private-link, azure-private-dns

Access an Azure storage account from a pipeline agent in the same region with access restrictions enabled


We are using an Azure storage account for our cloud services. The storage account is part of a virtual network: access is restricted to selected networks, and our VNet is on the allowed list. This works beautifully in our services.

The problem arises when we try to copy data to this storage account from an Azure pipeline. Within the pipeline, we temporarily add a firewall rule to the storage account to allow traffic from the pipeline agent's IP address, then copy the data (via AzCopy), and finally remove the firewall rule. This works fine on a private agent outside Azure. However, we also use private agents hosted in Azure, and when the agent runs in Azure, the connection to the storage account uses private Azure IP addresses, so the firewall rule doesn't work. This is specified in the docs:

Services deployed in the same region as the storage account use private Azure IP addresses for communication. Thus, you cannot restrict access to specific Azure services based on their public outbound IP address range.
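Roughly, the temporary allow-rule step in the pipeline looks like the sketch below (Azure CLI plus AzCopy; the resource group, account, and container names are placeholders, and AzCopy authentication is omitted):

    # Determine the agent's public IP (any IP echo service works; this one is just an example)
    AGENT_IP=$(curl -s https://api.ipify.org)

    # Temporarily allow the agent's IP on the storage account firewall
    az storage account network-rule add \
      --resource-group my-rg \
      --account-name mystorageaccount \
      --ip-address "$AGENT_IP"

    # Copy the data (SAS token / azcopy login not shown)
    azcopy copy "./artifacts/*" \
      "https://mystorageaccount.blob.core.windows.net/mycontainer" --recursive

    # Remove the temporary rule again
    az storage account network-rule remove \
      --resource-group my-rg \
      --account-name mystorageaccount \
      --ip-address "$AGENT_IP"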

Is there any way to force external routing? It seems really silly to me that with the current configuration we are unable to connect to the storage account from within Azure, yet we ARE able to connect from a private agent (or any other PC) outside of Azure.

I've already tried playing with the routing preference setting in the "Firewalls and virtual networks" section of the storage account and using the -internetrouting endpoint, but this makes no difference.
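For completeness, the routing-preference change amounts to roughly the following (Azure CLI; account and resource group names are placeholders), which in this case did not change how in-region agents reach the account:

    # Switch to internet routing and publish the route-specific endpoint
    # (mystorageaccount-internetrouting.blob.core.windows.net)
    az storage account update \
      --resource-group my-rg \
      --name mystorageaccount \
      --routing-choice InternetRouting \
      --publish-internet-endpoints true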


Solution

  • The solution is to use private endpoints. I am dealing with the same issue, and after considerable research I found that private endpoints allow access over internal IPs between the remote VNet where your agents are located and the VNet where your storage is located. I have tested this and provided details below.

    After testing, I found that the way this works is by creating a private DNS zone for the storage account and a virtual network link that lets the VMs in the remote VNet where the agents sit resolve the storage account hostname to a private IP instead of a public IP. Additionally, a NIC is created in the remote VNet which provides a route to the private IP of the storage account.

    So that NIC sits in the same VNet as the agents, providing a route for private IP connections, the DNS link resolves the storage hostname to private IPs, and it all just works. The agents rely on the private DNS zone rather than public DNS to resolve the storage hostname, so they communicate with the storage account over private IPs.

    Edit: I have set up a private endpoint and confirmed that it works as expected; a minimal sketch of the pieces involved follows after the link below. There are some caveats to doing this with Terraform, which I've outlined in a related azurerm provider GitHub issue:

    https://github.com/hashicorp/terraform-provider-azurerm/issues/2977#issuecomment-1011183736
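
    As a rough illustration of the resources involved (shown with the Azure CLI for brevity; the resource group, VNet, subnet, and account names are placeholders, and the Terraform equivalents are discussed in the issue above):

        # Private DNS zone for blob storage, linked to the VNet where the agents run
        az network private-dns zone create \
          --resource-group my-rg \
          --name "privatelink.blob.core.windows.net"

        az network private-dns link vnet create \
          --resource-group my-rg \
          --zone-name "privatelink.blob.core.windows.net" \
          --name agents-vnet-link \
          --virtual-network agents-vnet \
          --registration-enabled false

        # Private endpoint for the storage account's blob service in the agents' subnet;
        # this is what creates the NIC with a private IP in that VNet
        STORAGE_ID=$(az storage account show -g storage-rg -n mystorageaccount --query id -o tsv)

        az network private-endpoint create \
          --resource-group my-rg \
          --name storage-pe \
          --vnet-name agents-vnet \
          --subnet agents-subnet \
          --private-connection-resource-id "$STORAGE_ID" \
          --group-id blob \
          --connection-name storage-pe-connection

        # Register the endpoint's private IP in the private DNS zone so the agents
        # resolve mystorageaccount.blob.core.windows.net to that private IP
        az network private-endpoint dns-zone-group create \
          --resource-group my-rg \
          --endpoint-name storage-pe \
          --name default \
          --private-dns-zone "privatelink.blob.core.windows.net" \
          --zone-name blob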