The last step of my Azure Data Factory pipeline executes another pipeline whose flow contains a Notebook step. As part of my requirements, I need to capture the details of the error message when this step fails (to store them in the database).
However, the output of the last step does not provide this information, only the link to the other pipeline. My challenge is that I cannot modify the "internal" pipeline to get that data (we are interested in the URL of the error); I need to do everything from the pipeline I have created.
Unfortunately, I have not been able to find any solution or documentation for this. I'd appreciate any suggestions.
Every activity stores its output in the format below.
@activity('*activityName*').output.*subfield1*.*subfield2*
To access the output in case of a failed activity, you can select Add activity on failure stream and use it to set a variable.
However, in this scenario, since another pipeline is being executed, the output returned to the parent pipeline (ExecutePipeline activity) is just the child PipelineName and PipelineRunId.
So let us make use of this PipelineRunId. We can use a Web activity to call the REST API Activity Runs - Query By Pipeline Run:
POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}/pipelineruns/{runId}/queryActivityruns?api-version=2018-06-01
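If you want to try the same call outside ADF first, here is a minimal Python sketch of how such a request could be assembled. The function names are my own, the one-hour window is just an example, and it assumes you already have a bearer token for https://management.azure.com/ (obtaining it is not shown). It only builds the request and does not send it.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.request import Request

API_VERSION = "2018-06-01"


def build_query_url(subscription_id, resource_group, factory_name, run_id):
    """Return the queryActivityruns endpoint for one pipeline run."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
        f"/pipelineruns/{run_id}/queryActivityruns"
        f"?api-version={API_VERSION}"
    )


def build_query_body(now=None, window_hours=1):
    """Return the time-range filter: window_hours before and after `now`."""
    now = now or datetime.now(timezone.utc)
    delta = timedelta(hours=window_hours)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {
        "lastUpdatedAfter": (now - delta).strftime(fmt),
        "lastUpdatedBefore": (now + delta).strftime(fmt),
    }


def build_request(url, body, bearer_token):
    """Assemble (but do not send) the POST request.

    bearer_token is assumed to be a valid token for the
    https://management.azure.com/ resource.
    """
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {bearer_token}",
        },
        method="POST",
    )
```

Passing the result of build_request to urllib.request.urlopen would send it; inside ADF the Web activity plays that role.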
1. Get the PipelineRunId from the ExecutePipeline activity output and store it in a variable.
pipelineRunId: @activity('Execute Pipeline').output.pipelineRunId
2. Use this run ID to build the URL dynamically.
url: https://management.azure.com/subscriptions/b83c1ed3-xxxx-xxxx-xxxx-2b83a074c23f/resourceGroups/myrg/providers/Microsoft.DataFactory/factories/ktestadf/pipelineruns/@{variables('pipelineRunId')}/queryActivityruns?api-version=2018-06-01
3. Now prepare the REST API call. Use the url already prepared and set the method (POST), header and body.
The body of the request is the time range in which the queried pipeline ran. We set it dynamically to a window from 1 hour before to 1 hour after the call is made.
Method: POST
Header: Content-Type: application/json
Body: {"lastUpdatedAfter":"@{getPastTime(1, 'Hour')}","lastUpdatedBefore":"@{getFutureTime(1, 'Hour')}"}
Authentication: If using MSI set Resource: https://management.azure.com/
4. Finally, store the returned activity runs (which include the error message) in another Array variable, which you can then parse and write to a SQL sink as required.
Here are its contents:
{
  "name": "ErrorMsg",
  "value": [
    {
      "activityRunEnd": "2021-10-14T07:43:20.1291832Z",
      "activityName": "Notebook",
      "activityRunStart": "2021-10-14T07:43:18.3867797Z",
      "activityType": "DatabricksNotebook",
      "durationInMs": 1742,
      "retryAttempt": null,
      "error": {
        "errorCode": "2011",
        "message": "Caller was not found on any access policy in this key vault, secretName: databricksclientsecret, secretVersion: , vaultBaseUrl: https://ktestkeyvaults.vault.azure.net/. The error message is: The user, group or application 'name=Microsoft.DataFactory/factories;appid=3ecdccaf-xxxx-xxxx-8818-4f30b45856eb;oid=7ee614a9-xxxx-xxxx-a7cd-fbad1afc715b;iss=https://sts.windows.net/72f988bf-xxxx-41af-91ab-2d7cd011db47/' does not have secrets get permission on key vault 'ktestkeyvaultss;location=eastus'. For help resolving this issue, please see https://go.microsoft.com/fwlink/?linkid=2125287.",
        "failureType": "UserError",
        "target": "Notebook",
        "details": []
      },
      "activityRunId": "6c9519e1-b646-4d5b-a974-29bef371d7e5",
      "iterationHash": "",
      "input": {
        "notebookPath": "https://adb-7020907718042127.7.azuredatabricks.net/?o=7020907718041127#notebook/171399934287251/command/171399934287255"
      },
      "linkedServiceName": "",
      "output": {
        "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (Central US)",
        "executionDuration": 0,
        "durationInQueue": {
          "integrationRuntimeQueue": 1
        },
        "billingReference": {
          "activityType": "ExternalActivity",
          "billableDuration": [
            {
              "meterType": "AzureIR",
              "duration": 0.016666666666666666,
              "unit": "Hours"
            }
          ]
        }
      },
      "userProperties": {},
      "pipelineName": "error2",
      "pipelineRunId": "6a717388-516e-46d7-883c-fdcf2d517bd8",
      "status": "Failed",
      "recoveryStatus": "None",
      "integrationRuntimeNames": [
        "defaultintegrationruntime"
      ],
      "executionDetails": {
        "integrationRuntime": [
          {
            "name": "DefaultIntegrationRuntime",
            "type": "Managed",
            "location": "Central US"
          }
        ]
      },
      "id": "/SUBSCRIPTIONS/B83C2ED3-xxxx-xxxx-xxxx-2B80A074C23F/RESOURCEGROUPS/myrg/PROVIDERS/MICROSOFT.DATAFACTORY/FACTORIES/KTESTADF/pipelineruns/6a717388-516e-46d7-883c-fdcf2d517bd8/activityruns/6c9519e1-b646-4d5b-a974-29bef371d7e5"
    }
  ]
}
My error message is different from yours, but you would find the required URL in the same way.
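If you end up parsing the array outside ADF (for example before writing rows to SQL), something along these lines could work. This is only a sketch: extract_failures and find_urls are names I made up, the field names are the ones visible in the activity run output of this answer, and the URL regex is a rough heuristic, not a full URL parser.

```python
import re

# Rough heuristic: a URL runs until whitespace or a quote character.
URL_PATTERN = re.compile(r"https?://[^\s'\"]+")


def find_urls(message):
    """Return the URLs found in an error message, without trailing punctuation."""
    return [url.rstrip(".,;") for url in URL_PATTERN.findall(message or "")]


def extract_failures(activity_runs):
    """Return one flat record per failed activity run."""
    failures = []
    for run in activity_runs:
        if run.get("status") != "Failed":
            continue
        error = run.get("error") or {}
        failures.append({
            "pipelineRunId": run.get("pipelineRunId"),
            "activityName": run.get("activityName"),
            "errorCode": error.get("errorCode"),
            "failureType": error.get("failureType"),
            "message": error.get("message"),
            "urls": find_urls(error.get("message")),
        })
    return failures


if __name__ == "__main__":
    # Shortened, made-up sample in the same shape as the array contents.
    sample_runs = [
        {
            "activityName": "Notebook",
            "pipelineRunId": "6a717388-516e-46d7-883c-fdcf2d517bd8",
            "status": "Failed",
            "error": {
                "errorCode": "2011",
                "failureType": "UserError",
                "message": "For help, please see https://go.microsoft.com/fwlink/?linkid=2125287.",
            },
        },
        {"activityName": "Other", "status": "Succeeded"},
    ]
    for failure in extract_failures(sample_runs):
        print(failure)
```

Each returned record is flat, so it should map onto a table row without further reshaping.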