
Removing Azure Functions using Flex Consumption SKU from a subnet leaves it in a corrupted state


I am deploying Azure Functions using the Flex Consumption SKU to a subnet on a virtual network. The subnet is delegated to Microsoft.App/environments and everything regarding the deployment works perfectly until I remove it and try to deploy it again.

For context, we have a development environment that I tear down each evening to save costs. Each morning I redeploy the resources, and everything works fine except for the Flex Consumption apps. For Function Apps specifically, I use the following Azure CLI commands to remove the app. Note that I disconnect it from the virtual network before deleting the resource itself, to make sure I'm not leaving it in a weird state:

az webapp vnet-integration remove --name $resource.name --resource-group $resourceGroup.name
az resource delete --ids $resource.id

Deploying the same Function App to the same subnet fails with the following error:

Internal Server Error

The error itself isn't particularly useful, but when you try to remove or edit the subnet you get a more useful error:

Failed to delete subnet 'xxxx'. Error: Subnet xxxx is in use by /subscriptions/xxxx/resourceGroups/xxxx/providers/Microsoft.Network/virtualNetworks/xxxx/subnets/xxxx/serviceAssociationLinks/legionservicelink and cannot be deleted. In order to delete the subnet, delete all the resources within the subnet. 

The resource it tells me is in use, legionservicelink, seems to be a hangover from the original linking to the Function App and isn't an actual resource governed by me as far as I can tell. I certainly can't find it in the subscription to remove it.
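The path in the error suggests that legionservicelink is a child of the subnet rather than a standalone resource, which would explain why it doesn't show up when browsing the subscription. A minimal sketch of how the links recorded on a subnet could be listed with the Azure CLI follows; the function name `list_subnet_links` and its arguments are hypothetical, and only the `az network vnet subnet show` call itself is a real command.

```shell
#!/usr/bin/env bash
# Sketch: print the service association links recorded on a subnet.
# list_subnet_links and its parameter names are illustrative, not part of the Azure CLI.
list_subnet_links() {
  local resource_group="$1" vnet_name="$2" subnet_name="$3"

  # serviceAssociationLinks is a property of the subnet resource itself.
  az network vnet subnet show \
    --resource-group "$resource_group" \
    --vnet-name "$vnet_name" \
    --name "$subnet_name" \
    --query "serviceAssociationLinks" \
    --output json
}
```

Calling `list_subnet_links <resource-group> <vnet> <subnet>` against the affected subnet should show whether a link is still present after the app has been removed. This only inspects the link; it does not remove it.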

I've opened a ticket with Microsoft and they are suggesting I should remove the app from the network before I delete it, but you can see that I am.

Does anyone know how I remove the subnet once it's in this state? Obviously I need to remove anything using it first, but I can't find the resource to remove it.

EDIT

I have since found what I believe to be reproducible steps to cause this bug. I still believe it's a bug because the same steps do not cause Logic Apps or other types of Function App to corrupt the subnet in a way that leaves it impossible to remove or edit. The scenario is actually linked to how the app is deployed via Bicep, rather than to how it is torn down as I originally suspected.

Here are the steps I am following to create the app via Bicep:

  1. Create a Function App with the Flex Consumption SKU using Bicep
  2. Attach the App to a subnet using Bicep

The subnet becomes corrupted if you reapply those two steps without first removing the Function App from the subnet or deleting it. As I said, this only seems to happen with Flex apps. On a redeploy, the following happens when those steps are re-run:

  1. Create a Function App with the Flex Consumption SKU using Bicep

    Because I'm not specifying any networking at this point, the Function App is updated and its networking is removed, which disconnects it from the network. This is expected and intended. However, this is where I believe the issue lies: the update does not in fact disconnect the app from the subnet correctly, and it leaves the subnet corrupted.

  2. Attach the App to a subnet using Bicep

    This step now fails because the subnet is still associated with the Function App, despite the Function App thinking it has been disconnected.

I have a workaround: I simply disconnect the Function App from the VNet using the Azure CLI before running the Bicep script. I am, however, curious whether anyone else is seeing this issue when redeploying Flex apps via Bicep?

FURTHER EDIT

See Thiago's comment beneath Dasari's answer below: this is an issue known to Microsoft.


Solution

  • Thanks to @ThiagoAlmeida I've had confirmation that this is indeed a bug with Flex Consumption apps in Azure. His comment is below but here is the relevant section from it:

    This is a known issue by the product group, the service association link is not being deleted in time and leaves that association still there (/legionservicelink is the Flex Consumption VNet integration association link).

    The error only occurs for me when you deploy the Function App using Bicep and attach it to a network separately in the same script. For clarity these are the steps to reproduce the error:

    1. Use Bicep code similar to below where the networking is configured after the Function App is provisioned.
    resource appHost 'Microsoft.Web/serverfarms@2021-03-01' = {
      name: hostingPlanName
      location: location
      sku: sku
      kind: 'functionapp'
      properties: {
        reserved: true
      }
    }
    
    resource functionApp 'Microsoft.Web/sites@2023-12-01' = {
      name: functionAppName
      location: location 
      kind: 'functionapp,linux'
      identity: {
        type: 'SystemAssigned'
      }
      properties: {
        reserved: true
        enabled: true
        hostNameSslStates: hostNameSslStates
        functionAppConfig: functionAppConfig
        serverFarmId: appHost.id
        siteConfig: siteConfig
      } 
    }
    
    resource hostNameBindings 'Microsoft.Web/sites/hostNameBindings@2018-11-01' = {
      parent: functionApp
      name: '${functionAppName}.azurewebsites.net'
      properties: {
        siteName: functionApp_siteName
        hostNameType: 'Verified'
      }
    }
    
    resource planNetworkConfig 'Microsoft.Web/sites/networkConfig@2021-01-01' = {
      parent: functionApp
      name: 'virtualNetwork'
      properties: {
        subnetResourceId: subnetId
        swiftSupported: true
      }
    }
    
    2. Deploy the Bicep script once.

    3. Observe that the Function App is created and attached to the network as expected.

    4. Run the Bicep script a second time without changing or deleting the Function App.

    5. Observe a 500 error related to the networking section in the deployment script.

      • The error itself doesn't give you much information about what has happened, but if you navigate to the Flex app you will see that it is no longer connected to the network. This is expected: the Bicep script creates the Function App without the network configuration, which is applied in a secondary step, and because that secondary step failed, the app has been refreshed without its attachment to the network. This is what corrupts the subnet. In refreshing the app, the Bicep script hasn't explicitly removed it from the subnet in the way it does for Logic Apps and other Function Apps, so the subnet still holds a broken link to the app and can't be edited or removed.
      • Try to manually reattach the Function App to the same subnet and observe the actual error:

      Error: Subnet xxxx is in use by /subscriptions/xxxx/resourceGroups/xxxx/providers/Microsoft.Network/virtualNetworks/xxxx/subnets/xxxx/serviceAssociationLinks/legionservicelink

    Workaround

    There is one workaround that I've found which has overcome this issue for us. Before running the Bicep script add a deployment step that removes the network binding from the app. Like this:

    az webapp vnet-integration remove --name $appName --resource-group $resourceGroupName
    

    Because the network binding has been explicitly removed, running the Bicep script no longer corrupts the subnet.
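As a rough illustration of how that pre-deployment step could sit in front of the Bicep deployment, here is a minimal sketch. The function name `redeploy_flex_app`, its arguments, and the existence check are assumptions added for illustration; the check is there only because the removal command would have nothing to act on if the app had already been torn down.

```shell
#!/usr/bin/env bash
# Sketch: detach the Function App from the VNet, then run the Bicep deployment.
# redeploy_flex_app and its parameters are hypothetical names, not from the original script.
redeploy_flex_app() {
  local app_name="$1" resource_group="$2" template_file="$3"

  # Assumption: only detach when the app already exists (e.g. not on the first deploy of the day).
  if az functionapp show --name "$app_name" --resource-group "$resource_group" --output none 2>/dev/null; then
    az webapp vnet-integration remove \
      --name "$app_name" \
      --resource-group "$resource_group" || return 1
  fi

  # Deploy the Bicep template, which recreates the app and reattaches the network.
  az deployment group create \
    --resource-group "$resource_group" \
    --template-file "$template_file"
}
```

It would be called as `redeploy_flex_app <app-name> <resource-group> <path-to-bicep-file>`. Stopping when the removal fails is a design choice: deploying on top of a half-detached app is the situation the workaround is meant to avoid.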

    Microsoft have advised they are working on a fix which should be available in the coming weeks.

    Observations

    If you remove a Flex Consumption app from a subnet manually via the portal, or delete it via the portal, or even via the Azure CLI, the subnet does not become corrupted.

    It is also possible that combining the network binding with the module that creates the app itself does not encounter the same error because the app is refreshed along with its network configuration. I haven't tested this scenario because our specific situation requires us to bind the network after the app is created.