Tags: azure, azure-synapse, azure-bicep, azure-synapse-analytics

Bicep: How to reference an existing Spark Configuration from a Synapse Workspace?


I have a working Spark Pool template, and I would like to select/reference an existing Spark Configuration that was already created in a given Synapse Workspace, so that it is the configuration selected on the Spark pool.

But if I run the template below:

@description('Creates Spark Pool resources within the specified Synapse Workspace based on the provided configurations.')
resource synapseWorkspaceDSSparkPools 'Microsoft.Synapse/workspaces/bigDataPools@2021-06-01' = [
  for sparkPool in sparkPools: {
    parent: synapseWorkspaceDS
    name: sparkPool.name
    location: location
    properties: {
      nodeCount: sparkPool.nodeCount
      nodeSizeFamily: sparkPool.nodeSizeFamily
      nodeSize: sparkPool.nodeSize
      autoScale: sparkPool.autoScaleEnabled
        ? {
            enabled: true
            minNodeCount: sparkPool.minNodeCount
            maxNodeCount: sparkPool.maxNodeCount
          }
        : null
      autoPause: sparkPool.autoPauseEnabled
        ? {
            enabled: true
            delayInMinutes: sparkPool.autoPauseDelayInMinutes
          }
        : null
      // This property needs to be empty when creating a new Spark Pool, because Synapse Spark pool must be created before libraries are installed.
      customLibraries: []
      // This property needs to be empty when creating a new Spark Pool, because Synapse Spark pool must be created before libraries are installed.
      libraryRequirements: {}
      sparkVersion: sparkPool.sparkVersion
      sparkConfigProperties: {
        configurationType: 'Artifact'
        filename: sparkConfigurationDSName
        content: '{"id":"${synapseWorkspace.id}/sparkconfigurations/${sparkConfigurationName}","name":"${sparkConfigurationName}","type":"Microsoft.Synapse/workspaces/sparkconfigurations","properties":{"description":"Configuration enabled to track logs from Spark Pools to Log Analytics. The documentation can be found here: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-azure-log-analytics","configs":{"spark.synapse.logAnalytics.enabled":"true","spark.synapse.logAnalytics.workspaceId":"${operationalInsightsWorkspace.properties.customerId}","spark.synapse.logAnalytics.secret":"${logAnalyticsWorkspaceSharedKey}"},"annotations":[],"configMergeRule":{"artifact.currentOperation.spark.synapse.logAnalytics.enabled":"replace","artifact.currentOperation.spark.synapse.logAnalytics.workspaceId":"replace"}}}'
      }
      isComputeIsolationEnabled: sparkPool.isComputeIsolationEnabled
      sessionLevelPackagesEnabled: sparkPool.sessionLevelPackagesEnabled
      dynamicExecutorAllocation: sparkPool.dynamicExecutorAllocationEnabled
        ? {
            enabled: true
            minExecutors: sparkPool.minExecutorCount
            maxExecutors: sparkPool.maxExecutorCount
          }
        : null
      cacheSize: sparkPool.cacheSize
    }
  }
]

It creates a new Spark Configuration instead of assigning the existing one.

But if I remove content and keep only the name in filename, with configurationType set to either of the available options (Artifact or File), I get the error below:

"message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-deployment-operations for usage details.","details":[{"code":"ValidationFailed","message":"Spark pool request validation failed.","details":[{"code":"SparkComputePropertiesCorrupted","message":"SparkConfigProperties field is corrupted with Content:, Filename:gmatheus01rsynwsparkConfiguration"}]}



Solution

  • It is actually possible. I created the Spark Configuration in my Synapse Workspace using this script (called from a Deployment Script):

    param(
        [string]$SubscriptionId,
        [string]$UserManagedIdentity,
        [string]$UserManagedIdentityId,
        [string[]]$SynapseWorkspaces,
        [string]$LogAnalyticsWorkspaceName,
        [string]$LogAnalyticsWorkspaceId,
        [string]$LogAnalyticsSecret,
        [string]$DeployerId,
        [string]$DeployerName
    )
    
    # Initialize counters
    $successCount = 0
    $failCount = 0
    
    $DeploymentScriptOutputs = @{
        success = $successCount
        fails   = $failCount
    }
    
    # Connect to Azure
    try {
        $retryIntervalSec = 300
        $maxRetryCount = 1
        $retryCount = 0
        $connected = $false
    
        while (-not $connected -and $retryCount -lt $maxRetryCount) {
            try {
                Write-Host "Using User Managed Identity '$UserManagedIdentity' to connect to Azure..."
                Set-AzContext -SubscriptionId $SubscriptionId -ErrorAction Stop
                Write-Host "Context set to SubscriptionId '$SubscriptionId'."
                $connected = $true
            } catch {
                $retryCount++
                if ($retryCount -lt $maxRetryCount) {
                    Write-Warning "Failed to connect to Azure. Retrying in $retryIntervalSec seconds... (Attempt $retryCount of $maxRetryCount)"
                    Start-Sleep -Seconds $retryIntervalSec
                } else {
                    Write-Error "Failed to connect to Azure after $maxRetryCount attempts. Error: $_"
                    $DeploymentScriptOutputs['success'] = $successCount
                    $DeploymentScriptOutputs['fails'] = $failCount
                    exit 1
                }
            }
        }
    } catch {
        Write-Error "Unexpected error during Azure connection setup. Error: $_"
        $DeploymentScriptOutputs['success'] = $successCount
        $DeploymentScriptOutputs['fails'] = $failCount
        exit 1
    }
    
    # Get token for Synapse - try multiple resource URLs
    $token = $null
    $resourceUrls = @(
        "https://dev.azuresynapse.net/",
        "https://management.azure.com/",
        "https://dev.azuresynapse.net"
    )
    
    # If User Managed Identity is provided, use it to get the token
    # Reference: https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-to-use-vm-token#get-a-token-using-powershell
    foreach ($resourceUrl in $resourceUrls) {
        try {
            $tokenUrl = "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=$resourceUrl&client_id=$UserManagedIdentityId"
            Write-Host "Attempting to get token for resource: $resourceUrl. Token URL: $tokenUrl"
            $response = Invoke-WebRequest -Uri $tokenUrl -Headers @{Metadata="true"} -ErrorAction Stop
            $content = $response.Content | ConvertFrom-Json
            if ($content -and $content.access_token) {
                $token = $content.access_token
                Write-Host "Successfully obtained token for resource: $resourceUrl"
                break
            }
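        } catch {
            # The managed identity endpoint failed; warn and fall back to the current Az context.
            Write-Warning "Failed to retrieve token using User Managed Identity '$UserManagedIdentity'. Error: $_"
            Write-Host "Retrying with Get-AzAccessToken for resource: $resourceUrl..."
            try {
                $azToken = (Get-AzAccessToken -ResourceUrl $resourceUrl -ErrorAction Stop).Token
                # Newer Az.Accounts versions return the token as a SecureString; convert it to plain text for the header.
                if ($azToken -is [securestring]) {
                    $azToken = [System.Net.NetworkCredential]::new('', $azToken).Password
                }
                $token = $azToken
            } catch {
                $token = $null
            }
            if ($token) {
                Write-Host "Successfully obtained token using Get-AzAccessToken for resource: $resourceUrl"
                break
            } else {
                Write-Warning "Failed to obtain token for resource: $resourceUrl. Continuing to next resource."
                continue
            }
        }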
    }
    
    if (-not $token) {
        Write-Error "Failed to retrieve access token."
        $DeploymentScriptOutputs.fails = ++$failCount
        exit 1
    }
    
    function Get-ISO8601Timestamp {
        return (Get-Date).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ssZ")
    }
    
    Write-Host "Synapses to be updated: $($SynapseWorkspaces -join ', ')"
    
    foreach ($workspaceName in $SynapseWorkspaces) {
        $ConfigName = "$workspaceName-SparkConfigurationToLogAnalytics-$LogAnalyticsWorkspaceName"
        $uri = "https://$workspaceName.dev.azuresynapse.net/sparkconfigurations/$ConfigName" + "?api-version=2021-06-01-preview"
        Write-Host "`nURI: $uri"
        $headers = @{
            "Authorization" = "Bearer $token"
            "Content-Type"  = "application/json"
        }
    
        $Property1 = "spark.synapse.logAnalytics.enabled"
        $Property2 = "spark.synapse.logAnalytics.workspaceId"
        $Property3 = "spark.synapse.logAnalytics.secret"
    
        $body = @{
            properties = @{
                description = "Configuration enabled to track logs from Spark Pools to Log Analytics. The documentation can be found here: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-azure-log-analytics"
                configs = @{
                    $Property1 = "true"
                    $Property2 = $LogAnalyticsWorkspaceId
                    $Property3 = $LogAnalyticsSecret
                }
                configMergeRule = @{
                    "artifact.currentOperation.$Property1" = "replace"
                    "artifact.currentOperation.$Property2" = "replace"
                    "artifact.currentOperation.$Property3" = "replace"
                }
                created = Get-ISO8601Timestamp
                createdBy = $DeployerName
                annotations = @("Synapse workspace name: $workspaceName", "Log Analytics name: $LogAnalyticsWorkspaceName", "Deployer name: $DeployerName")
            }
        } | ConvertTo-Json -Depth 10
    
        try {
            Write-Host "`nCreating/Updating Spark Configuration '$ConfigName' in Synapse Workspace '$workspaceName'..."
            $response = Invoke-RestMethod -Uri $uri -Method PUT -Headers $headers -Body $body -ErrorAction Stop
            Write-Host "`n$response"
            Write-Host "`nSpark Configuration created/updated successfully in '$workspaceName'.`n"
            $successCount++
        } catch {
            Write-Error "`nFailed to create/update Spark Configuration in '$workspaceName'. Error: $_"
            $failCount++
        }
    }
    
    # Final result
    Write-Host "`nSpark Configuration update completed. Success: $successCount, Failed: $failCount."
    $DeploymentScriptOutputs.success = $successCount
    $DeploymentScriptOutputs.fails = $failCount
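
    For readers wondering how the script above might be wired into a template, here is a minimal sketch of a deployment script resource that calls it. It is an assumption about the setup, not the exact resource used here: the parameter names, the resource name, the script path, and the azPowerShellVersion value are all placeholders.

    ```bicep
    // Sketch only: every name, the script path, and the Az version are placeholders.
    param location string = resourceGroup().location
    param userManagedIdentityName string
    param userManagedIdentityResourceId string // full ARM resource ID of the identity
    param userManagedIdentityClientId string
    param synapseWorkspaceName string
    param logAnalyticsWorkspaceName string
    param logAnalyticsWorkspaceId string
    @secure()
    param logAnalyticsSecret string
    param deployerName string

    resource sparkConfigurationScript 'Microsoft.Resources/deploymentScripts@2023-08-01' = {
      name: 'create-spark-configuration'
      location: location
      kind: 'AzurePowerShell'
      identity: {
        type: 'UserAssigned'
        userAssignedIdentities: {
          '${userManagedIdentityResourceId}': {}
        }
      }
      properties: {
        azPowerShellVersion: '11.0' // assumed; use a version available in your environment
        retentionInterval: 'PT1H'
        timeout: 'PT30M'
        // Assumed location of the PowerShell script shown above, relative to this file.
        scriptContent: loadTextContent('./scripts/New-SparkConfiguration.ps1')
        // The script accepts the secret as a parameter, so it is passed here;
        // consider a secure environment variable instead if you can change the script.
        arguments: join([
          '-SubscriptionId \'${subscription().subscriptionId}\''
          '-UserManagedIdentity \'${userManagedIdentityName}\''
          '-UserManagedIdentityId \'${userManagedIdentityClientId}\''
          '-SynapseWorkspaces \'${synapseWorkspaceName}\''
          '-LogAnalyticsWorkspaceName \'${logAnalyticsWorkspaceName}\''
          '-LogAnalyticsWorkspaceId \'${logAnalyticsWorkspaceId}\''
          '-LogAnalyticsSecret \'${logAnalyticsSecret}\''
          '-DeployerName \'${deployerName}\''
        ], ' ')
      }
    }

    // $DeploymentScriptOutputs from the script surfaces under properties.outputs.
    output createdCount int = sparkConfigurationScript.properties.outputs.success
    ```

    The Spark pool resource would then need to depend on this script so that the configuration exists before the pool references it.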
    

    Then I assign the existing configuration to the Spark pool like this:

          sparkConfigProperties: {
            configurationType: 'Artifact'
            filename: sparkConfigurationName
            content: string({
              name: sparkConfigurationName
              properties: {
                configs: sparkConfiguration.properties.configs
                annotations: sparkConfiguration.properties.annotations
                type: sparkConfiguration.type
                description: sparkConfiguration.properties.description
                notes: ''
                created: sparkConfiguration.properties.created
                createdBy: sparkConfiguration.properties.createdBy
                configMergeRule: sparkConfiguration.properties.configMergeRule
              }
            })
          }
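
    The snippet above reads from a sparkConfiguration object without showing where it comes from. One possible way to supply it, sketched under the assumption that the values are passed in as a parameter (the parameter itself and the variable name are hypothetical), is:

    ```bicep
    // Hypothetical inputs: not part of the original template.
    param sparkConfigurationName string

    // Marked secure because configs carries the Log Analytics secret.
    // Expected shape, mirroring the properties the snippet above reads:
    //   type: 'Microsoft.Synapse/workspaces/sparkconfigurations'
    //   properties: { description, configs, configMergeRule, annotations, created, createdBy }
    @secure()
    param sparkConfiguration object

    // Serialize the existing configuration so the pool's content field is not empty.
    var sparkConfigContent = string({
      name: sparkConfigurationName
      properties: {
        configs: sparkConfiguration.properties.configs
        annotations: sparkConfiguration.properties.annotations
        type: sparkConfiguration.type
        description: sparkConfiguration.properties.description
        notes: ''
        created: sparkConfiguration.properties.created
        createdBy: sparkConfiguration.properties.createdBy
        configMergeRule: sparkConfiguration.properties.configMergeRule
      }
    })
    ```

    With this in place, sparkConfigProperties would set filename to sparkConfigurationName and content to sparkConfigContent. Keeping the values identical to what the script wrote is what makes the pool point at the existing configuration rather than a new one.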