I want to create a monitor in Datadog that will alert me on old messages in ActiveMQ queues (in AWS AmazonMQ).
Notifications work fine with my query, but even after I purge the queue or the messages expire, the alert value does not decrease and the monitor stays red for queues that no longer contain any messages.
Here is my Datadog Query for the alert:
(avg:aws.amazonmq.enqueue_time{project:myprj AND NOT queue:*.dlq} by {env,queue} / 86400000)
I just filter out the queues whose names end in .DLQ, and the monitor creates one alert per queue and environment.
I divide the value by 86400000 to get the number of days.
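As a quick check of that conversion, here is a minimal Python sketch. It assumes, as the division above implies, that enqueue_time is reported in milliseconds; the names MS_PER_DAY and ms_to_days are only illustrative.

```python
# Assumption: enqueue_time is reported in milliseconds.
MS_PER_DAY = 24 * 60 * 60 * 1000  # hours * minutes * seconds * milliseconds


def ms_to_days(age_ms: float) -> float:
    """Convert a message age in milliseconds to days."""
    return age_ms / MS_PER_DAY


print(MS_PER_DAY)               # 86400000
print(ms_to_days(129_600_000))  # 1.5
```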
I wanted to add another boolean condition to the query, such as AND aws.amazonmq.queue_size>0, so that it becomes:
(avg:aws.amazonmq.enqueue_time{project:myprj AND NOT queue:*.dlq AND aws.amazonmq.queue_size>0} by {env,queue} / 86400000)
but it seems we cannot use another metric inside the filter.
I also tried adding queue_size as query b and using the formula:
a * (b/b)
But I guess division by zero is not handled properly.
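To illustrate why this trick cannot bring the value down, here is a small Python sketch of the formula evaluated for a single data point. It is not Datadog's evaluator: the function gated_age and the use of None for "no value" are assumptions made for illustration only.

```python
from typing import Optional


def gated_age(a: float, b: float) -> Optional[float]:
    """Evaluate a * (b / b) for one data point.

    None stands for "no value"; how Datadog itself treats this case
    is an assumption here, not something confirmed in the question.
    """
    if b == 0:
        # 0 / 0 is undefined, so the formula cannot yield 0 for an empty queue.
        return None
    return a * (b / b)


print(gated_age(2.5, 10))  # 2.5
print(gated_age(2.5, 0))   # None
```

Either way, the formula never produces a value below the threshold for an empty queue, so it gives the monitor nothing to recover on.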
Perhaps enqueue_time is not the right metric to check, but I would be curious how I can achieve this in Datadog (I searched plenty of articles, but none fit this need).
Alerting on the queue size alone would be very hard, as we sometimes have a lot of pending messages that are fully consumed only after a few hours (or days).
It does not seem possible to do this in a single Datadog monitor.
However, I found that it is possible to create a composite monitor based on two others.
The link to the official doc is here.
For the example given in the question, we just need to create two "single" monitors:
Monitor a
- size of queue greater than 0:
sum(last_1m):avg:aws.amazonmq.queue_size.sum{project:myprj AND NOT queue:*.dlq} by {env}.as_count() > 0
Monitor b
- age of message greater than one day:
(avg:aws.amazonmq.enqueue_time{project:myprj AND NOT queue:*.dlq} by {env,queue} / 86400000)
and we do not notify anybody in it.
To alert on both conditions, we just need to create a third, composite monitor based on the two above:
For a, you choose your first monitor, and for b the second.
In the Trigger when field you just put: a && b
You can then set up email alerting in the Say what's happening section, as in any other monitor.
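If you prefer to define the composite monitor outside the UI, the sketch below only builds a definition as a Python dict and does not call any API. It assumes that, in a definition, the composite query refers to the two underlying monitors by their numeric IDs joined with &&; the IDs 1111 and 2222, the name, and the message are hypothetical placeholders.

```python
import json


def composite_monitor_payload(size_monitor_id: int, age_monitor_id: int) -> dict:
    """Build a composite monitor definition (sketch only, nothing is sent)."""
    return {
        # Hypothetical name and message, adapt to your own conventions.
        "name": "Old messages in non-empty AmazonMQ queues",
        "type": "composite",
        # Assumption: monitors are referenced by ID in the composite query.
        "query": f"{size_monitor_id} && {age_monitor_id}",
        "message": "Messages older than one day are still waiting in a queue.",
    }


# 1111 and 2222 stand in for the IDs of monitor a and monitor b.
print(json.dumps(composite_monitor_payload(1111, 2222), indent=2))
```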
This way, the alert goes back to green when the queue is empty.
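The effect of a && b can be sketched in plain Python. This only models the logic, not Datadog itself; composite_alert and the one-day default threshold are illustrative assumptions taken from the example above.

```python
def composite_alert(queue_size: float, age_days: float, max_age_days: float = 1.0) -> bool:
    """Return True only when both underlying conditions are in alert."""
    a = queue_size > 0           # monitor a: the queue is not empty
    b = age_days > max_age_days  # monitor b: the oldest message is too old
    return a and b


# Purged queue whose age metric still reports a stale value: no alert.
print(composite_alert(queue_size=0, age_days=3.0))    # False
# Busy queue with only recent messages: no alert.
print(composite_alert(queue_size=500, age_days=0.2))  # False
# Non-empty queue holding a message older than one day: alert.
print(composite_alert(queue_size=12, age_days=3.0))   # True
```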