azure, docker, azure-pipelines

Azure Pipelines docker push fails with firewall error when accessing Azure Container Registry


I'm going back to basics, trying to build and push a Docker image to Azure Container Registry (ACR). I have a simple pipeline using the Docker@2 task, created from the template in the Azure DevOps portal. I'm using Microsoft-hosted agents.

- task: Docker@2
  displayName: Build and push an image to container registry
  inputs:
    command: buildAndPush
    repository: $(imageRepository)
    dockerfile: $(dockerfilePath)
    containerRegistry: $(dockerRegistryServiceConnection)
    tags: |
      $(tag)

This comes straight from the Docker push template offered by the pipeline creation workflow in Azure DevOps.

The pipeline creation process set up the service connection in Azure DevOps. I can see the connection it made in the ACR configuration in the Azure Portal, where it's listed as a Contributor.

The issue I'm having with this (and with the rest of my Docker or Azure CLI related pipelines) is that it throws a firewall error:

denied: {"errors":[{"code":"DENIED","message":"client with IP \u0027xx.xx.xx.xx\u0027 is not allowed access.

My networking settings for the ACR are set as follows:

[Screenshot: ACR networking settings]

This may be incorrect, but I would assume that a Microsoft-hosted agent should be recognized as a "trusted Microsoft service", should it not?

Is there a configuration that I'm missing?

I'm aware that Microsoft hosted agents change their IP every week and there's a published IP list. But do we seriously have to poll that list every week and add it to the firewall?

As an aside, I looked up the IP from the error in that published list, and as far as I could see it wasn't in there. (I may very well have been looking at the wrong list, so I won't rule that out.)

I've read that agents can be pulled from somewhere external to the ACR's geographical region and that could be part of the problem and why sometimes it works and sometimes it does not.

I made a hacky attempt to add the agent's IP to the firewall at the beginning of the pipeline with the Azure CLI and remove it at the end, but I ran into the same blocker: I can't change the firewall from an agent that apparently isn't allowed through it.

I started on a solution using my own self-hosted agents deployed to an AKS cluster, but I'm running into a brick wall: AKS 1.19 doesn't allow Docker-in-Docker, so building Docker images there was a no-go.

I looked into setting up BuildKit in that AKS cluster so I wouldn't need a Docker daemon, but that also seems like a convoluted process for simply building some Docker images.

The last option I can see is to set up an Azure virtual machine scale set that can run Docker to perform these tasks, but that's a cost I'd like to avoid at this point, when Microsoft-hosted agents fulfil all my current needs except for this firewall issue.


Solution

  • Based on your description, I set up an ACR that allows public network access from all networks. In this configuration, the Docker@2 step of the pipeline running on a Microsoft-hosted agent successfully built and pushed the image to my ACR.

    However, when I configured ACR to allow access from Selected networks only, without whitelisting any Microsoft-hosted agent IPs, the pipeline failed with a 403 error.
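
    To confirm which mode the registry is in before changing anything, the current firewall configuration can be read from a pipeline step. This is only a sketch: the registry name xxxpremium and the service connection ARMSvcCnnSub0 are the placeholder values used in the sample pipeline further down, and the step assumes that service connection is allowed to read the registry resource.

    ```yaml
    steps:
    - task: AzureCLI@2
      displayName: Show ACR network rules
      inputs:
        azureSubscription: 'ARMSvcCnnSub0' # placeholder Azure Resource Manager service connection
        scriptType: 'pscore'
        scriptLocation: 'inlineScript'
        inlineScript: |
          # Default action for traffic that matches no rule (Allow or Deny)
          az acr show --name xxxpremium --query "networkRuleSet.defaultAction" --output tsv
          # IP and virtual network rules currently configured on the registry
          az acr network-rule list --name xxxpremium
    ```

    A default action of Deny with no rule covering the agent's address would be consistent with the "client with IP ... is not allowed access" error from the question.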

    As a workaround, I added AzureCLI@2 tasks that add the pipeline agent's IP to the ACR network rules before the Docker@2 step and remove it afterwards. This resolved the issue. Here's a sample pipeline for reference. Make sure that az acr network-rule add runs in the same agent job as the Docker@2 step, because each job can run on a different agent with a different public IP.

    trigger: none # Disable CI triggers; run the pipeline manually
    
    pool:
      vmImage: ubuntu-latest
    
    variables:
      myACR: xxxpremium
      imageRepository: test/repo/helloworld
      dockerfilePath: $(System.DefaultWorkingDirectory)/Ubuntu/HelloWorld/Dockerfile
      dockerRegistryServiceConnection: ACRSvcCnnPremium
      tag: $(Build.BuildId)
    
    steps:
    - task: AzureCLI@2
      displayName: Whitelist Microsoft-hosted agent IP
      inputs:
        azureSubscription: 'ARMSvcCnnSub0'
        scriptType: 'pscore'
        scriptLocation: 'inlineScript'
        inlineScript: |
          $publicIP = Invoke-RestMethod -Uri "https://api64.ipify.org"
          Write-Host "The public IP of the pipeline agent machine is $publicIP"
          
          az acr network-rule add -n $(myACR) --ip-address $publicIP/32
          Start-Sleep 120 # Pause the agent job to check the rule in Azure Portal
    
    - task: Docker@2
      displayName: Build and push an image to container registry
      inputs:
        containerRegistry: '$(dockerRegistryServiceConnection)'
        repository: '$(imageRepository)'
        command: 'buildAndPush'
        Dockerfile: '$(dockerfilePath)'
        tags: '$(tag)'
    
    - task: AzureCLI@2
      displayName: Remove Microsoft-hosted agent IP
      condition: always() # Always remove the ACR network rule, even if previous steps were cancelled or failed
      inputs:
        azureSubscription: 'ARMSvcCnnSub0'
        scriptType: 'pscore'
        scriptLocation: 'inlineScript'
        inlineScript: |
          $publicIP = Invoke-RestMethod -Uri "https://api64.ipify.org"
          Write-Host "The public IP of the pipeline agent machine is $publicIP"
          
          az acr network-rule remove -n $(myACR) --ip-address $publicIP
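
    One possible refinement, shown here only as a sketch, is to look up the public IP once and hand it to the cleanup step through a pipeline variable, so that the removal targets the same address that was added and does not depend on a second call to the lookup service. The variable name agentIP is hypothetical; the other names are the placeholders from the pipeline above. The 120-second pause is left out, and the Docker@2 step stays as it is.

    ```yaml
    variables:
      myACR: xxxpremium
      agentIP: '' # hypothetical variable; empty until the first step sets it

    steps:
    - task: AzureCLI@2
      displayName: Allow Microsoft-hosted agent IP
      inputs:
        azureSubscription: 'ARMSvcCnnSub0'
        scriptType: 'pscore'
        scriptLocation: 'inlineScript'
        inlineScript: |
          $publicIP = Invoke-RestMethod -Uri "https://api64.ipify.org"
          # Logging command: expose the IP to later steps of this job as $(agentIP)
          Write-Host "##vso[task.setvariable variable=agentIP]$publicIP"
          az acr network-rule add -n $(myACR) --ip-address "$publicIP/32"

    # ... Docker@2 step from the pipeline above goes here ...

    - task: AzureCLI@2
      displayName: Remove Microsoft-hosted agent IP
      condition: always()
      inputs:
        azureSubscription: 'ARMSvcCnnSub0'
        scriptType: 'pscore'
        scriptLocation: 'inlineScript'
        inlineScript: |
          # $(agentIP) is expanded by the pipeline before the script runs
          $ip = "$(agentIP)"
          if ($ip) {
            # Same address form as the removal step in the pipeline above
            az acr network-rule remove -n $(myACR) --ip-address $ip
          } else {
            Write-Host "No agent IP was recorded; nothing to remove."
          }
    ```

    Declaring agentIP with an empty default means the cleanup step still runs cleanly if the first step fails before the variable is set.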
    
