We've been struggling for some time to properly publish internal metrics into Amazon's CloudWatch service. We have a number of different types of internal metrics that we map into CloudWatch's MetricDatum class and publish.
Each MetricDatum instance has a double value and also a StatisticSet, which accepts a sampleCount, a sum, and minimum/maximum values. For counters such as an httpd 200 page counter, it is more appropriate to use the StatisticSet and set both the sampleCount and the sum to the value of the counter. If you look at the ELB stats, for example, that is what Amazon does to publish them. This makes the sum, average, and other graph views work correctly when you graph the result.
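To make the counter mapping concrete, here is a minimal sketch in plain Java. CounterStats is a hypothetical stand-in for the four StatisticSet fields, not the AWS SDK class, and the choice of 1 for minimum and maximum is my assumption (each counted event is treated as one sample worth 1). It also assumes the average is derived as sum divided by sampleCount.

```java
// Hypothetical holder mirroring the four StatisticSet fields; NOT the AWS SDK class.
final class CounterStats {
    final double sampleCount;
    final double sum;
    final double minimum;
    final double maximum;

    private CounterStats(double sampleCount, double sum, double minimum, double maximum) {
        this.sampleCount = sampleCount;
        this.sum = sum;
        this.minimum = minimum;
        this.maximum = maximum;
    }

    // Treat a counter of n events as n samples that each contributed 1,
    // so sampleCount and sum both equal the counter value.
    // Assumption: min and max are both 1 because every sample is one event.
    static CounterStats fromCounter(long count) {
        if (count <= 0) {
            // This sketch only covers counters that actually saw events.
            throw new IllegalArgumentException("count must be positive");
        }
        return new CounterStats(count, count, 1.0, 1.0);
    }

    // Average as assumed above: sum / sampleCount.
    double average() {
        return sum / sampleCount;
    }
}
```

With this mapping, a counter of 42 yields a sum of 42 (the total you want on a sum graph) and an average of 1 per sample.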
The problem arises when the value of the counter is 0, because CloudWatch does not allow you to publish a StatisticSet with a sampleCount of 0. ELB handles this by not publishing anything for that period, which creates holes in the graph. This is a pain because you get INSUFFICIENT_DATA warnings whenever the counter is 0 for the time period. If you have notifications on ERROR and want to know when you transition back to OK, the INSUFFICIENT_DATA to OK alerts will keep you up all night.
You have 1 alarm in INSUFFICIENT DATA state in US East (N. Virginia) region.
Question: How do I properly publish CloudWatch metrics so that I don't see the INSUFFICIENT_DATA warnings but can still use the sampleCount with metrics that have a value of 0?
Although you cannot publish a StatisticSet with a sampleCount of 0, you can publish it with an extremely small sampleCount, since it is a double. We have found that a sampleCount of 0.000000001 gives the appearance of 0 on the graphs, but it still fills in the holes in the graph appropriately and does not cause the INSUFFICIENT_DATA alarms to happen.
    double sampleCount = numSamples;
    // our values come in as value and numSamples but StatisticSet wants a sum
    double sum = value * numSamples;
    if (numSamples == 0) {
        // special case here, CloudWatch does not allow a 0 sample count so
        // we have to set it to be slightly more
        sampleCount = 0.000000001D;
        // but sum can be 0
    }
    StatisticSet statisticSet =
        new StatisticSet().withMinimum(min)
                          .withMaximum(max)
                          .withSampleCount(sampleCount)
                          .withSum(sum);
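The same special case can be isolated so it is easy to check without the AWS SDK. The following is only a sketch: ZeroSafeStats and MIN_SAMPLE_COUNT are hypothetical names of mine, and it assumes the average is derived as sum divided by sampleCount, which is why a sum of 0 over a tiny sampleCount should still graph as 0.

```java
// Hypothetical helper isolating the zero-sample workaround; NOT part of the AWS SDK.
final class ZeroSafeStats {
    // Smallest sample count we publish when there were no samples at all.
    static final double MIN_SAMPLE_COUNT = 0.000000001D;

    final double sampleCount;
    final double sum;

    // value is the per-sample value and numSamples the number of samples,
    // matching the inputs used in the snippet above.
    ZeroSafeStats(double value, long numSamples) {
        // A sample count of exactly 0 is rejected, so substitute a tiny positive one.
        this.sampleCount = (numSamples == 0) ? MIN_SAMPLE_COUNT : numSamples;
        // The sum is allowed to be 0, so it needs no adjustment.
        this.sum = value * numSamples;
    }

    // Average as assumed above: sum / sampleCount.
    double average() {
        return sum / sampleCount;
    }
}
```

For an empty period, new ZeroSafeStats(5.0, 0) ends up with a sum of 0 and an average of 0, while the sampleCount stays just above 0. The two fields would then be passed to withSampleCount and withSum as in the original snippet.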
As an aside, I've coded some of this logic into my SimpleMetrics library which is designed to easily track and publish metrics.