azure azure-managed-identity azure-batch azure-python-sdk azure-batch-account

Create a pool in Azure Batch with a user-assigned managed identity


We are migrating our Python solution from Service Principal Name (SPN) to Managed Identity (MI) based authentication. We have made the necessary changes to the Python code to enable authentication with MI. Now we are trying to create a Batch pool with a user-assigned managed identity. Below is the Python code we have written:

import os
from azure.identity import DefaultAzureCredential, ManagedIdentityCredential
from azure.mgmt.batch import BatchManagementClient
from azure.mgmt.batch.models import Pool, CloudServiceConfiguration, PoolIdentityType, FixedScaleSettings, ContainerRegistry, ContainerConfiguration, PoolAllocationMode, ComputeNodeIdentityReference, VirtualMachineConfiguration, BatchPoolIdentity, UserAssignedIdentities, DeploymentConfiguration, ScaleSettings, FixedScaleSettings, ComputeNodeIdentityReference, NetworkConfiguration

app_name      = 'BatchTest' #The name of the Batch Application to load onto the pool.
app_version   = 0.0.1 #The version of the Batch Application to load. If omitted the default version will be loaded.
registry_name = 'sampleregistry.azurecr.io'
image_name    = 'test-image'
image_version = 'latest'
sku_to_use = 'batch.node.ubuntu 20.04' # Marketplace image sku
image_ref_to_use = : {'additional_properties': {}, 'publisher': 'microsoft-azure-batch', 'offer': 'ubuntu-server-container', 'sku': '20-04-lts', 'version': 'latest', 'virtual_machine_image_id': None}
client_id = "xxx-xxx-xx-xx-xx"
subscription_id = "yyyy-yyy-yyy-yyy-yyy"
resource_group_name = "TestRG"
user_assigned_identity_name = "testUMI"
resource_id = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group_name}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{user_assigned_identity_name}"
pool_vm_size = "STANDARD_DS4_V2"
pool_vm_count = 4
pool_vm_maxcount = 1

if app_version is None:
    version_str = " with the Default {} Version".format(app_name)
else:
    version_str = "with {} Version {}".format(app_name, app_version)

image_name = registry_name + image_name + image_version

application_references = [batchmodels.ApplicationPackageReference(application_id = app_name, version = app_version)]

network_config = None

# Authenticate using DefaultAzureCredential
credentials = ManagedIdentityCredential(client_id = f"{client_id}")
batch_client = BatchManagementClient(credential=credentials,subscription_id=f"{subscription_id}")
#batch_client = BatchManagementClient(credential, subscription_id)

# Define the container registry (with username and password if needed)
container_registry = ContainerRegistry(
    registry_server=f"{registry_name}",
    user_name=None,
    password=None,
    identity_reference = ComputeNodeIdentityReference(resource_id = resource_id))

# Define the container configuration
container_config = ContainerConfiguration(
    type="dockerCompatible",
    container_registries=[container_registry],
    container_image_names=[f"{registry_name}.azurecr.io/{image_name}:{image_version}"])    

vm_config = VirtualMachineConfiguration(
    image_reference=image_ref_to_use,
    node_agent_sku_id=sku_to_use,
    container_configuration=container_config)    

pool = Pool(
    identity=BatchPoolIdentity(
        type=PoolIdentityType.USER_ASSIGNED,
        user_assigned_identities={f"{resource_id}:UserAssignedIdentities()"}
    ), 
    display_name='managedidentitytest',
    vm_size=pool_vm_size,
    deployment_configuration=DeploymentConfiguration(
    virtual_machine_configuration=vm_config,
    ),
    scale_settings=ScaleSettings(
        fixed_scale=FixedScaleSettings(
            target_dedicated_nodes=pool_vm_count,
            target_low_priority_nodes=0)
    ),
    task_slots_per_node=max_tasks_per_node
    #,network_configuration = network_configuration
)

# Batch client
# Attempt to create the pool
batch_client.pool.create(
    resource_group_name="TestRG",
    account_name="Teststsystemdev",
    pool_name=pool_id,
    parameters=pool
    )

When we run the above code, we get the following error:

Code: LinkedAuthorizationFailed
Message: The client "xxx-xxx-xx-xx-xx" with object id "xxx-xxx-xx-xx-xx" has permission to perform action 'Microsoft.Batch/batchAccounts/pools/write' on scope '/subscriptions/yyyy-yyy-yyy-yyy-yyy/resourceGroups/TestRG/providers/Microsoft.Batch/batchAccounts/Teststsystemdev/pools/ForecastCluster'; however, it does not have permission to perform action(s) 'Microsoft.ManagedIdentity/userAssignedIdentities/assign/action' on the linked scope(s) '/subscriptions/yyyy-yyy-yyy-yyy-yyy/resourcegroups/ProdRG/providers/microsoft.managedidentity/userassignedidentities/testUMI' (respectively) or the linked scope(s) are invalid.

From ChatGPT, I learned that I needed to assign the Managed Identity Operator role to the user-assigned managed identity. I have done so as advised. However, I am now encountering a different error, as detailed below.

Encountered unexpected exception of type <class 'azure.core.exceptions.HttpResponseError'> when trying to create a pool. Exception details: (GatewayTimeout) The gateway did not receive a response from 'Microsoft.Batch' within the specified time period.
Code: GatewayTimeout
Message: The gateway did not receive a response from 'Microsoft.Batch' within the specified time period.

I am not sure what I am missing here. Kindly help me to get this issue resolved.


Solution

  • Encountered unexpected exception of type <class 'azure.core.exceptions.HttpResponseError'> when trying to create a pool. Exception details: (GatewayTimeout) The gateway did not receive a response from 'Microsoft.Batch' within the specified time period. Code: GatewayTimeout Message: The gateway did not receive a response from 'Microsoft.Batch' within the specified time period.

    According to an MS Q&A answer by vipullag-MSFT, a GatewayTimeout error usually means that the Azure service took too long to respond. This can happen for various reasons, such as network problems, the service being temporarily unavailable, or incorrect configuration settings.
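If the timeout turns out to be transient, retrying the call with a growing delay is one common way to ride it out. The following is only a minimal sketch of that idea: `call_with_retry` and `is_gateway_timeout` are hypothetical helper names, not part of the Azure SDK, and the check simply assumes the error code appears in the exception text, as it does in the message quoted above.

```python
import time


def call_with_retry(operation, is_transient, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call operation(); retry with exponential backoff while is_transient(exc) is true."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt, or when the error is not a transient one.
            if attempt == attempts - 1 or not is_transient(exc):
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))


def is_gateway_timeout(exc):
    # Assumption: the exception text contains the error code "GatewayTimeout".
    return "GatewayTimeout" in str(exc)
```

It could wrap the pool creation call as `call_with_retry(lambda: batch_client.pool.create(...), is_gateway_timeout)`. Keep in mind that a request that timed out at the gateway may still have been processed by the service, so it is worth checking whether the pool already exists before relying on a retry.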

    You can use the code and configuration below to create a pool in Azure Batch with a user-assigned managed identity using the Azure Python SDK.

    Code:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.batch import BatchManagementClient
    from azure.mgmt.batch.models import (
        BatchPoolIdentity, ComputeNodeIdentityReference, ContainerConfiguration,
        ContainerRegistry, DeploymentConfiguration, FixedScaleSettings, Pool,
        PoolIdentityType, ScaleSettings, VirtualMachineConfiguration,
    )
    
    app_name = 'BatchTest'
    app_version = "0.0.1"
    registry_name = 'sampleregistry.azurecr.io'
    image_name = 'test-image'
    image_version = 'latest'
    sku_to_use = 'batch.node.ubuntu 20.04'
    image_ref_to_use = {
        'publisher': 'microsoft-azure-batch',
        'offer': 'ubuntu-server-container',
        'sku': '20-04-lts',
        'version': 'latest'
    }
    
    client_id = "77a2e89axxxxxb254"
    subscription_id = "xxxxxxxx"
    resource_group_name = "venkatesan-rg"
    user_assigned_identity_name = "venkat"
    resource_id = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group_name}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{user_assigned_identity_name}"
    pool_vm_size = "STANDARD_DS4_V2"
    pool_vm_count = 4
    pool_vm_maxcount = 1
    pool_id = "managedidentitytestpool"
    
    # Authenticate with DefaultAzureCredential, selecting the user-assigned identity by its client ID
    credentials = DefaultAzureCredential(managed_identity_client_id=client_id)
    batch_client = BatchManagementClient(credential=credentials, subscription_id=subscription_id)
    
    # Define the container registry
    container_registry = ContainerRegistry(
        registry_server=registry_name,
        identity_reference=ComputeNodeIdentityReference(resource_id=resource_id)
    )
    
    # Define the container configuration
    container_config = ContainerConfiguration(
        type="dockerCompatible",
        container_registries=[container_registry],
        container_image_names=[f"{registry_name}/{image_name}:{image_version}"]
    )
    
    # Define the VM configuration
    vm_config = VirtualMachineConfiguration(
        image_reference=image_ref_to_use,
        node_agent_sku_id=sku_to_use,
        container_configuration=container_config
    )
    
    # Define the pool identity
    pool_identity = BatchPoolIdentity(
        type=PoolIdentityType.USER_ASSIGNED,
        user_assigned_identities={resource_id: {}}
    )
    
    # Define the pool configuration
    pool = Pool(
        identity=pool_identity,
        display_name='managedidentitytest',
        vm_size=pool_vm_size,
        deployment_configuration=DeploymentConfiguration(
            virtual_machine_configuration=vm_config
        ),
        scale_settings=ScaleSettings(
            fixed_scale=FixedScaleSettings(
                target_dedicated_nodes=pool_vm_count,
                target_low_priority_nodes=0
            )
        )
    )
    
    # Attempt to create the pool
    try:
        batch_client.pool.create(
            resource_group_name=resource_group_name,
            account_name="Teststsystemdev",
            pool_name=pool_id,
            parameters=pool
        )
        print(f"Pool {pool_id} created successfully.")
    except Exception as e:
        print(f"Failed to create pool: {e}")
    

    Output

    Pool managedidentitytestpool created successfully.
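One difference from the code in the question is how `user_assigned_identities` is built. The question wraps a single f-string in braces, which in Python produces a set containing one string, whereas the code above passes a dict keyed by the identity's resource ID. The sketch below uses only plain Python and a placeholder resource ID (the `<...>` segments are stand-ins, not real values) to show the difference:

```python
resource_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>"
)

# Form used in the question: braces around one f-string build a set that
# holds a single string, with "UserAssignedIdentities()" as literal text.
from_question = {f"{resource_id}:UserAssignedIdentities()"}

# Form used in the code above: a dict keyed by the identity's resource ID.
from_answer = {resource_id: {}}

print(type(from_question).__name__)  # set
print(type(from_answer).__name__)    # dict
print(list(from_answer) == [resource_id])  # True
```

Whether this alone explains the behaviour in the question is not certain, but it is a quick thing to rule out.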
    


    Reference: Use managed identities in an Azure Batch account or pool - Azure | Microsoft Learn
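As a side note on the first error (LinkedAuthorizationFailed): the linked scope in that message names the resource group ProdRG, while the code in the question builds the identity's resource ID with TestRG. This may just be a result of anonymizing the post, but if it is not, the Managed Identity Operator role has to be assigned on the identity at the scope named in the error. The sketch below compares the two IDs as they appear in the question; `parse_resource_id` and `same_resource` are hypothetical helpers, and comparing the IDs case-insensitively is an assumption.

```python
def parse_resource_id(resource_id):
    """Split a resource ID into a dict of lowercased segment names to values."""
    parts = [p for p in resource_id.split("/") if p]
    return {parts[i].lower(): parts[i + 1] for i in range(0, len(parts) - 1, 2)}


def same_resource(id_a, id_b):
    # Assumption: resource IDs can be compared case-insensitively.
    return id_a.strip("/").lower() == id_b.strip("/").lower()


# Resource ID as built by the code in the question.
id_in_code = (
    "/subscriptions/yyyy-yyy-yyy-yyy-yyy/resourceGroups/TestRG"
    "/providers/Microsoft.ManagedIdentity/userAssignedIdentities/testUMI"
)
# Linked scope as quoted in the LinkedAuthorizationFailed message.
scope_in_error = (
    "/subscriptions/yyyy-yyy-yyy-yyy-yyy/resourcegroups/ProdRG"
    "/providers/microsoft.managedidentity/userassignedidentities/testUMI"
)

print(same_resource(id_in_code, scope_in_error))  # False
print(parse_resource_id(id_in_code)["resourcegroups"])      # TestRG
print(parse_resource_id(scope_in_error)["resourcegroups"])  # ProdRG
```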