amazon-web-services, spot-instances

Is there data for AWS spot interruption rate over time?


We are running an EMR cluster with spot instances as task nodes. The cluster executes Spark jobs that sometimes run for several hours. A spot instance interruption can cause a Spark job to fail, which then requires us to restart the job entirely.

I can see that AWS Spot Advisor provides some basic information on the "Frequency of interruption". However, this data seems very generic: I can't see historical trends, and it does not give the probability of interruption as a function of how long the spot instance has been running (which should have a significant impact on that probability).

Is this data available somewhere? Or are there other data points that can be used as a proxy?


Solution

  • I found this GitHub issue, which provides a link to this JSON file in the Spot Advisor S3 bucket. The file includes interruption rates.

    https://spot-bid-advisor.s3.amazonaws.com/spot-advisor-data.json
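As a rough illustration of how that file could be read, here is a minimal Python sketch. The layout it relies on is an assumption based on how the file has looked when inspected, not a documented contract: a top-level `ranges` list of `{"index", "label", ...}` buckets, and `spot_advisor[region][os][instance_type]` entries of the form `{"r": <range index>, "s": <savings percent>}`. The helper names `fetch_spot_advisor` and `interruption_label` are made up for this example.

```python
import json
from urllib.request import urlopen

SPOT_ADVISOR_URL = "https://spot-bid-advisor.s3.amazonaws.com/spot-advisor-data.json"


def fetch_spot_advisor(url=SPOT_ADVISOR_URL):
    """Download and parse the Spot Advisor JSON file (hypothetical helper)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def interruption_label(data, region, instance_type, os_name="Linux"):
    """Return (interruption range label, savings percent), or None if not listed.

    Assumed layout (not officially documented):
      data["ranges"]       -> [{"index": 0, "label": "<5%", ...}, ...]
      data["spot_advisor"] -> region -> OS -> instance type -> {"r": idx, "s": pct}
    """
    entry = (
        data.get("spot_advisor", {})
        .get(region, {})
        .get(os_name, {})
        .get(instance_type)
    )
    if entry is None:
        return None
    # Map each bucket index to its human-readable label, e.g. 0 -> "<5%".
    labels = {r["index"]: r["label"] for r in data.get("ranges", [])}
    return labels.get(entry["r"]), entry.get("s")


# Example use (performs a network request):
#   data = fetch_spot_advisor()
#   print(interruption_label(data, "us-east-1", "m5.xlarge"))
```

Note that the file appears to be a current snapshot only, so it would not by itself give historical trends or the probability of interruption by instance age. Saving a copy on a schedule would be one way to build up a history.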