Select jobid, count (servername) from [HangFire].[State] with(nolock)
where [JobId] in (1,2,3)
group by jobid
having count(servername)>1
This issue happens mainly during scale-out and scale-in.
In scale-out, multiple WebJob instances may pick the same Hangfire job due to timing issues in how Hangfire locks jobs with SQL Server.
In scale-in, if an instance is removed while running a job, that job can restart on another instance, leading to duplicates or delays.
Always On helps keep the app running but doesn't stop this behavior during scaling.
To resolve the issue:
Limit the worker count to 1 per instance
This ensures only one background worker runs on each instance, which reduces the chance of job duplication.
var options = new BackgroundJobServerOptions
{
    WorkerCount = 1
};
app.UseHangfireServer(options);
Use the [DisableConcurrentExecution] attribute
Apply it to long-running or sensitive job methods to prevent parallel execution of the same job method. The timeout is how long a second execution waits to acquire the lock before it fails.
[DisableConcurrentExecution(timeoutInSeconds: 300)]
public void ProcessJob()
{
}
Use a database or cache to record jobs that have already run, and check this at the start of each job to avoid running it again.
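The check described above could look roughly like the sketch below. The names IProcessedJobStore, InMemoryProcessedJobStore, JobRunner and the jobKey parameter are hypothetical and not part of Hangfire. The in-memory dictionary only stands in for a shared store: with multiple instances you would need a database table with a unique key or a distributed cache, since each instance has its own memory.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical abstraction over "a database or cache" that records claimed jobs.
public interface IProcessedJobStore
{
    // Returns true only for the first caller that claims the key.
    bool TryMarkStarted(string jobKey);
}

// Stand-in implementation for illustration only; it is NOT shared across instances.
public sealed class InMemoryProcessedJobStore : IProcessedJobStore
{
    private readonly ConcurrentDictionary<string, DateTime> _seen =
        new ConcurrentDictionary<string, DateTime>();

    // TryAdd is atomic, so two concurrent callers cannot both claim the same key.
    public bool TryMarkStarted(string jobKey) => _seen.TryAdd(jobKey, DateTime.UtcNow);
}

public sealed class JobRunner
{
    private readonly IProcessedJobStore _store;

    public JobRunner(IProcessedJobStore store)
    {
        _store = store;
    }

    // Returns true if the work ran, false if it was skipped as a duplicate.
    public bool ProcessJob(string jobKey)
    {
        if (!_store.TryMarkStarted(jobKey))
        {
            Console.WriteLine($"Skipping {jobKey}: already claimed by another run.");
            return false;
        }

        // The actual job work would go here.
        Console.WriteLine($"Processing {jobKey} on {Environment.MachineName}.");
        return true;
    }
}
```

One trade-off to be aware of: this sketch records the job when it starts, so a job interrupted by scale-in would be skipped on retry. If interrupted jobs must run again, record completion instead, or store a status and timestamp so stale claims can be detected.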
Log which instance picks up each job, along with a timestamp, to better understand concurrency behavior.
Console.WriteLine($"Job started by {Environment.MachineName} at {DateTime.UtcNow}");
If possible, upgrade to Hangfire 1.7+ for better handling of job locking with SQL Server. Alternatively, consider Durable Functions or Azure Service Bus with Azure Functions for more reliable job processing across multiple instances.