azure-devops, databricks, azure-powershell, cicd, azure-pipelines-release-pipeline

Azure CICD pipelines


One of our customers is using release pipelines with PowerShell code to deploy Databricks notebooks (dev-tst-uat-prd).

The way it is built is not really practical. We have one workspace for multiple business projects, and therefore dedicated folders for each project in the workspace (Project 1, Project 2, Project 3).

The current CI/CD pipeline looks into all of those folders, picks up all the nested folders and files, and moves them across environments.

The major problem is this scenario:

Let's say the code for Project 1 and Project 2 is approved and pushed, and is now in the UAT environment. Then something happens and the business does not approve Project 2 for production, but Project 1 still needs to go to production. Here we have a problem: if I trigger the pipeline, all the code goes.

This is the code:

$admin_dir = "/ADMIN/"
$data_dir = "/DATA/"
$data_project1 = "/Project 1/"
$data_project2 = "/Project 2/"
$data_project3 = "/Project 3/"



$nb_folders = @($admin_dir, $data_dir, $data_project1, $data_project2, $data_project3)

foreach ($nb_folder in $nb_folders) {

    if ($target_ENV -eq 'd') {
        Write-Host "The environment is dev, so we don't change resource names in notebooks"
    }
    else {
        $nb_dir = $build_dir + $nb_folder
        Write-Host "Changing variables in notebooks"
        $files = gci -File -Path $nb_dir
        foreach ($f in $files) {
            Write-Output $f.name
            for ($i = 0; $i -lt $replace_str.Count; $i++) {
            (Get-Content $nb_dir\$f).replace($replace_str[$i].Source, $replace_str[$i].Target) | Set-Content $nb_dir\$f
            }
        }

        if ($target_ENV -eq 'p') {
            foreach ($f in $files) {
                Write-Output $f.name
                for ($i = 0; $i -lt $prd_replace_str.Count; $i++) {
                (Get-Content $nb_dir\$f).replace($prd_replace_str[$i].Source, $prd_replace_str[$i].Target) | Set-Content $nb_dir\$f
                }
            }
        }
    }
    if ($target_ENV -eq 'd') {
        Write-Host "The environment is dev, so command for copying scripts from repo will be skipped"   
    }
    else {
        Import-DatabricksFolder -BearerToken $DatabricksToken -Region "westeurope" -LocalPath $nb_dir -DatabricksPath $nb_folder
    }
}

#-------------------------------
# Cleanup
#-------------------------------

#Remove-Item $nb_dir\*

YAML code for that specific task:

steps:
- task: PowerShell@2
  displayName: 'Deploy notebooks'
  inputs:
    targetType: filePath
    filePath: './$(drop)/Deployment Scripts/Databricks/ads_deploynotebooks.ps1'
    arguments: '-build_dir $(drop)/Databricks/notebooks'

Could there be a solution to this without duplicating the CI/CD pipeline per project/path? For example, if ($target_ENV -eq 'u') (i.e. the environment is UAT), could the user who triggers the pipeline manually define the value of the $nb_folder variable?


Solution

  • You can try to use a pipeline runtime parameter to let the user pass a custom value to the PowerShell script:

    1. In your PowerShell script, define a PowerShell parameter for use when ($target_ENV -eq 'u').
    param (
        [string] $custom_nb_folder
    )
    
    if ($target_ENV -eq 'u')
    {
      # Directly use the PowerShell parameter $custom_nb_folder in this section.
      Write-Host "$custom_nb_folder"
      . . .
    }
    
    2. In the pipeline YAML, set a pipeline runtime parameter and pass its value to the PowerShell parameter defined in the PowerShell script.
    parameters:
    - name: custom_nb_folder
      type: string
      default: '/Project 1/'
    
    steps:
    - task: PowerShell@2
      displayName: 'Deploy notebooks'
      inputs:
        targetType: filePath
        filePath: './$(drop)/Deployment Scripts/Databricks/ads_deploynotebooks.ps1'
        arguments: '-build_dir $(drop)/Databricks/notebooks -custom_nb_folder "${{ parameters.custom_nb_folder }}"'
    

    This way, every time a user manually triggers the pipeline, they can specify a different value to override the default; otherwise the default value is used.
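
    Putting the pieces together: below is a minimal sketch of how the runtime parameter could feed the existing folder loop, assuming the same $target_ENV variable and folder list as in the original ads_deploynotebooks.ps1 (the rest of the script stays as it is).

    param (
        [string] $build_dir,
        [string] $custom_nb_folder = ''
    )
    
    # Default behaviour: deploy every folder, exactly as the script does today.
    $nb_folders = @("/ADMIN/", "/DATA/", "/Project 1/", "/Project 2/", "/Project 3/")
    
    # For a UAT run, narrow the deployment to the single folder chosen at queue time.
    if ($target_ENV -eq 'u' -and -not [string]::IsNullOrWhiteSpace($custom_nb_folder)) {
        $nb_folders = @($custom_nb_folder)
    }
    
    # The existing "foreach ($nb_folder in $nb_folders) { ... }" loop then runs unchanged.

    Because the override is guarded by $target_ENV -eq 'u', runs against the other environments keep deploying the full list; the same pattern could be extended to production if the business sign-off scenario from the question applies there as well.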



    UPDATE:

    If you want to provide the users with a list of all the available values, and let them select and pass one or more values into the PowerShell script, you can do it as below:

    1. In the PowerShell script, define an array type parameter to receive the multiple values passed into the script.
    param (
        [string[]] $custom_nb_folders
    )
    
    Write-Host $custom_nb_folders
    
    foreach ($item in $custom_nb_folders)
    {
        Write-Host "$item"
    }
    
    2. In the pipeline YAML, define an object type parameter to provide a list of all the available values.
    parameters:
    - name: custom_nb_folders
      displayName: 'Delete un-required and only keep required items:'
      type: object
      default:
        - '/ADMIN/'
        - '/DATA/'
        - '/Project 1/'
        - '/Project 2/'
        - '/Project 3/'
    
    steps:
    - task: PowerShell@2
      displayName: 'Deploy notebooks'
      env:
        JSON_CONTENT: ${{ convertToJson(parameters.custom_nb_folders) }}
      inputs:
        pwsh: true
        targetType: 'inline'
        script: |
          Out-File -FilePath nb_folders.json -InputObject "$env:JSON_CONTENT"
          $arr = Get-Content nb_folders.json | ConvertFrom-Json
          Remove-Item -Path nb_folders.json
          ./$(drop)/"Deployment Scripts"/Databricks/ads_deploynotebooks.ps1 -build_dir "$(drop)/Databricks/notebooks" -custom_nb_folders $arr
    

    This way, when users manually trigger the pipeline, they can delete the un-required items from the provided default list and keep only the ones they want to deploy.
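
    The JSON detour in the inline step above is just a way to turn the YAML object parameter into a PowerShell array: convertToJson serialises the list, the JSON_CONTENT environment variable carries it into the task, and ConvertFrom-Json turns it back into strings. A minimal sketch of the same idea, assuming the $nb_folders variable and foreach loop from the original ads_deploynotebooks.ps1:

    # The object parameter arrives as a JSON string, e.g. ["/ADMIN/", "/Project 1/"],
    # in the JSON_CONTENT environment variable set by the task's env: mapping.
    # It can also be parsed directly; the Out-File / Get-Content round-trip in the
    # answer above achieves the same result.
    $custom_nb_folders = "$env:JSON_CONTENT" | ConvertFrom-Json
    
    # Inside the deployment script, the array simply replaces the hard-coded list,
    # and the original loop deploys only the folders the user left in the parameter.
    $nb_folders = $custom_nb_folders
    foreach ($nb_folder in $nb_folders) {
        Write-Host "Will deploy notebooks from $nb_folder"
        # ... existing string replacement and Import-DatabricksFolder calls ...
    }

    This keeps a single pipeline and a single deployment script; which folders actually move between environments is decided by whoever queues the run.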
