I am currently trying to find a way to monitor agents in Azure that were scaled by Azure Scale-Sets. To avoid costs I want to kill every agent that has been running a job for longer than 1 hour. Is there a solution from Azure that would help me in this case?
I was thinking about using alerts and Azure Functions, but I don't know which alert signal to use, since they all only consider metrics from the VMSS as a whole and not from a single agent.
Thank you
Terminate Azure Agent of Scale Set after a period of time
You can use an Azure Automation runbook that runs every hour to check and kill the agent service. To create the runbook and schedule the job, you can follow the Stack link that I answered.
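As a rough sketch of the scheduling step, an hourly schedule can be created and attached to a runbook with the Az.Automation cmdlets. This assumes the Az PowerShell module is installed, that you are already signed in with Connect-AzAccount, and that the runbook has been created and published. All four names below are placeholders, not values from this answer.

```powershell
# Placeholder names: replace with your own resources.
$resourceGroup     = "my-automation-rg"
$automationAccount = "my-automation-account"
$runbookName       = "Stop-LongRunningAgent"
$scheduleName      = "HourlyAgentCheck"

# Create a schedule that fires every hour, starting 10 minutes from now
# (Azure Automation requires the start time to be in the future).
New-AzAutomationSchedule `
    -ResourceGroupName $resourceGroup `
    -AutomationAccountName $automationAccount `
    -Name $scheduleName `
    -StartTime (Get-Date).AddMinutes(10) `
    -HourInterval 1

# Link the published runbook to that schedule.
Register-AzAutomationScheduledRunbook `
    -ResourceGroupName $resourceGroup `
    -AutomationAccountName $automationAccount `
    -RunbookName $runbookName `
    -ScheduleName $scheduleName
```

The same schedule can be created in the portal under the Automation account's Schedules blade if you prefer not to script it.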
Here is a PowerShell script that monitors the Azure Monitor Agent service on each instance and terminates it if it has been running for longer than an hour.
az login --identity --username "ed65fffjfnjff fhc01dcf5"
# Define the script content
# (single-quoted here-string, so no $variable is expanded locally before it is sent)
$scriptContent = @'
# Get the Azure Monitor Agent process
$process = Get-Process "AzureMonitorAgentService" -ErrorAction SilentlyContinue
# Check if the process exists
if ($process) {
    # Get the process start time
    $processStartTime = $process.StartTime
    # Get the current time
    $currentTime = Get-Date
    # Calculate the elapsed time since the process started
    $elapsedTime = New-TimeSpan -Start $processStartTime -End $currentTime
    # Check if the elapsed time is greater than 1 hour
    if ($elapsedTime.TotalHours -gt 1) {
        # Stop the process, then the service, without an interactive prompt
        Stop-Process -Name AzureMonitorAgentService -Force
        # Verify that this matches the service name on your instances
        Stop-Service -Name "Azure Monitor Agent" -Force -Confirm:$false
        Write-Output "Azure Monitor Agent Service has been stopped."
    } else {
        Write-Output "Azure Monitor Agent Service has not been running for more than 1 hour."
        Write-Output "Process Start Time: $processStartTime"
        Write-Output "Current Time: $currentTime"
        Write-Output "Elapsed Time: $elapsedTime"
    }
} else {
    Write-Output "Azure Monitor Agent process is not running."
}
'@
# Define the resource group and VMSS name
$resourceGroup = "venkat-vmss_group"
$vmssName = "venkat-vmss"
# Get the list of instance IDs in the VMSS
# (tsv output is one ID per line, which PowerShell collects into an array)
$instanceIds = az vmss list-instances --resource-group $resourceGroup --name $vmssName --query "[].instanceId" -o tsv
# Loop through each instance ID and execute the script
foreach ($instanceId in $instanceIds) {
    Write-Host "Running script on VMSS instance $instanceId ..."
    az vmss run-command invoke `
        --resource-group $resourceGroup `
        --name $vmssName `
        --instance-id $instanceId `
        --command-id RunPowerShellScript `
        --scripts "$scriptContent"
}
Write-Host "Script execution completed for all VMSS instances."
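The elapsed-time check in the script above can also be isolated into a small helper, which makes the one-hour threshold easy to change and to try out locally. This is only a sketch: `Test-RunningTooLong` and its parameters are hypothetical names, not part of any Azure module.

```powershell
# Hypothetical helper: returns $true when StartTime lies further in the past than Limit.
function Test-RunningTooLong {
    param(
        [Parameter(Mandatory)] [datetime] $StartTime,
        [timespan] $Limit = [timespan]::FromHours(1),
        [datetime] $Now = (Get-Date)
    )
    # Subtracting two DateTime values yields a TimeSpan, which can be compared directly.
    return ($Now - $StartTime) -gt $Limit
}

# Example: compare two start times against the default one-hour limit.
$now = Get-Date
Test-RunningTooLong -StartTime ($now.AddMinutes(-90)) -Now $now   # True
Test-RunningTooLong -StartTime ($now.AddMinutes(-30)) -Now $now   # False
```

Passing `-Now` explicitly keeps the check deterministic; inside the remote script you would pass the process start time and let `-Now` default to the current time.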
To schedule the job so that it checks the agent service status hourly, you can follow the steps shown in the snapshot below.
Output: