
What is preventing AKS from pulling images from ACR?


I'm using Bicep to create an AKS cluster and connect it to an existing ACR:

param clusterName string = 'MyTestCluster'

param location string = resourceGroup().location

param acrName string = 'mytestacr'

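param sshRSAPublicKey string

// dnsPrefix is referenced below but was not declared in the original snippet;
// defaulting it to the cluster name is an assumption.
param dnsPrefix string = clusterName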

resource aksCluster 'Microsoft.ContainerService/managedClusters@2023-10-01' = {
  name: clusterName
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  sku: {
    name: 'Base'
    tier: 'Standard'
  }
  properties: {
    dnsPrefix: dnsPrefix
    agentPoolProfiles: [
      {
        name: 'agentpool'
        osDiskSizeGB: 30
        count: 1
        vmSize: 'standard_d2s_v3'
        osType: 'Linux'
        mode: 'System'
      }
    ]
    linuxProfile: {
      adminUsername: 'azureuser'
      ssh: {
        publicKeys: [
          {
            keyData: sshRSAPublicKey
          }
        ]
      }
    }
    networkProfile: {
      loadBalancerSku: 'standard'
      networkPlugin: 'azure'
      networkPluginMode: 'overlay'
      networkDataplane: 'azure'
      networkPolicy: 'azure'
    }
  }
}

resource acr 'Microsoft.ContainerRegistry/registries@2021-09-01' existing = {
  name: acrName
  scope: resourceGroup()
}

// Assign the AKS cluster access to the ACR
resource acrRoleAssignment 'Microsoft.Authorization/roleAssignments@2020-04-01-preview' = {
  name: guid(acr.id, aksCluster.id, '7f951dda-4ed3-4680-a7ca-43fe172d538d')
  scope: acr
  properties: {
    roleDefinitionId: resourceId('Microsoft.Authorization/roleDefinitions', '7f951dda-4ed3-4680-a7ca-43fe172d538d') // ACR Pull
    principalId: aksCluster.identity.principalId
    principalType: 'ServicePrincipal'
  }
}

The cluster is created correctly, and I can see that it has been granted AcrPull on the existing registry.

However, when I deploy the following to the cluster, the image pull fails with ImagePullBackOff and a 401 Unauthorized response from the registry:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myaksapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myaksapp
  template:
    metadata:
      labels:
        app: myaksapp
    spec:
      containers:
      - name: myaksapp
        image: mytestacr.azurecr.io/myaksapp:latest
        resources:
          limits:
            memory: "256Mi"
            cpu: "500m"
        ports:
        - containerPort: 80

Have I missed a step in the process? If I create the same cluster with

az aks create --resource-group myResourceGroup --name MyTestCluster --node-count 1 --generate-ssh-keys --attach-acr mytestacr

then everything deploys fine, as expected.


Solution

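  • The role assignment targets the wrong identity:

    principalId: aksCluster.identity.principalId

    aksCluster.identity.principalId is the cluster's system-assigned (control plane) identity. Images are pulled by the nodes, which authenticate to ACR with the kubelet identity, so AcrPull has to be granted to that identity instead. In Bicep its object ID is exposed as aksCluster.properties.identityProfile.kubeletidentity.objectId. This is also the identity that az aks create --attach-acr grants AcrPull to, which is why the CLI-created cluster works.

    See also: https://github.com/Azure/bicep/issues/4026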
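
A minimal sketch of the corrected role assignment, assuming the aksCluster and acr symbols from the question's template are in the same file. Only the principalId line differs from the original; everything else is copied from the question, and this has not been deployed here.

```bicep
// Assign AcrPull on the registry to the kubelet identity, which the nodes
// use when pulling images, instead of the control plane identity.
resource acrRoleAssignment 'Microsoft.Authorization/roleAssignments@2020-04-01-preview' = {
  name: guid(acr.id, aksCluster.id, '7f951dda-4ed3-4680-a7ca-43fe172d538d')
  scope: acr
  properties: {
    roleDefinitionId: resourceId('Microsoft.Authorization/roleDefinitions', '7f951dda-4ed3-4680-a7ca-43fe172d538d') // ACR Pull
    // Was: aksCluster.identity.principalId (control plane identity)
    principalId: aksCluster.properties.identityProfile.kubeletidentity.objectId
    principalType: 'ServicePrincipal'
  }
}
```

After redeploying, the AcrPull assignment on the registry should list the cluster's agentpool managed identity rather than the cluster itself. Pods already stuck in ImagePullBackOff may need to be deleted, or simply given time to retry the pull.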