We are looking to track the failure rate of our Azure Container Apps Jobs (not to be confused with Azure Container Apps), which let us "execute containerized tasks for a finite duration". As far as we can see, there is only an integration for Azure Container Apps and not for Azure Container Apps Jobs. Does anyone know how we could track this?
We looked through the available metrics in Datadog but nothing pops up that would cover this.
As I said, I don't think there is a direct Datadog integration specifically for Azure Container Apps Jobs. From what I could find, the only diagnostic category available for Azure Container Apps Jobs is the basic one, which covers only metrics.
You can monitor these metrics to get some idea of how the jobs are doing. For example, Replica Completion Count shows the number of replicas that must complete successfully for the execution to succeed.
Or, using Execution History, you can retrieve the status of each job execution and analyze failures over time:
az containerapp job execution list --name "arko-job" --resource-group "arkorg"
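To turn that execution list into an actual failure rate, something like the following could work. It is only a rough sketch: it assumes the command's JSON output is a list of records with the status under properties.status (with a top-level status as a fallback) and that finished executions report "Succeeded" or "Failed". The failure_rate helper and the sample records are made up for illustration, so check them against what your own job returns.

```python
import json
from collections import Counter


def failure_rate(executions):
    """Return (failed, finished, rate) for a list of execution records."""
    counts = Counter()
    for item in executions:
        # Assumption: status sits under "properties"; fall back to top level.
        props = item.get("properties") or {}
        status = props.get("status") or item.get("status")
        counts[status] += 1
    failed = counts["Failed"]
    # Only executions that have finished count toward the rate, so
    # in-progress states such as "Running" are left out of the denominator.
    finished = failed + counts["Succeeded"]
    rate = failed / finished if finished else 0.0
    return failed, finished, rate


# Hypothetical records shaped like the assumed CLI output.
sample = json.loads("""
[
  {"name": "arko-job-a1", "properties": {"status": "Succeeded"}},
  {"name": "arko-job-b2", "properties": {"status": "Failed"}},
  {"name": "arko-job-c3", "properties": {"status": "Succeeded"}},
  {"name": "arko-job-d4", "properties": {"status": "Running"}},
  {"name": "arko-job-e5", "properties": {"status": "Succeeded"}}
]
""")

failed, finished, rate = failure_rate(sample)
print(f"{failed}/{finished} finished executions failed ({rate:.0%})")
```

In practice you would replace the sample with the real output, for example by saving the command's JSON output to a file and loading it with json.load.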
To get these metrics into Datadog, check out the Datadog-Azure integration document, which covers setting up Azure Monitor to export them from Log Analytics to Datadog.