
Drawing ALB RequestCountPerTarget accurately on an AWS CloudWatch dashboard


AWS ALB target groups have a metric, "RequestCountPerTarget", that at first sight seems very interesting. However, it only seems to be displayed accurately in the full detailed view of the metric, and it is completely distorted when it appears alongside other metrics on the CloudWatch dashboard.

When I configure the metric, I get the following graph, which is the view that is most useful to me, i.e. the number of requests per minute received by a single server:

[Screenshot: RequestCountPerTarget graph in the detailed metric view]

Using this graph, I can quickly determine whether my application is overloaded: from the average response rate of my servers, I can deduce the maximum RPM (requests per minute) a single server can handle (around 200 RPM per server in my case).

However, on the CloudWatch dashboard, this metric appears like this:

[Screenshot: the same metric as displayed on the CloudWatch dashboard]

If my understanding is correct, the CloudWatch dashboard uses interpolation to avoid requesting too many datapoints. In this case, however, the interpolation does not seem to average "RequestCountPerTarget during 1 min" over the dashboard period (1 week in the screenshots); it seems to sum it, which completely defeats the purpose of the metric. I don't care about the total number of requests received over 1 week (if those requests are distributed evenly over the time frame, that total tells me basically nothing about the load on my servers), but I do care about the per-minute request rate, its average and its peaks, during that week (since this reflects the actual request spikes).
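To illustrate the difference I mean, here is a small sketch with made-up numbers. The sample values and the `aggregate` helper are hypothetical and only mimic what I assume happens when 1-minute datapoints are re-bucketed into a longer period; this is not CloudWatch's actual implementation.

```python
# Hypothetical per-minute RequestCountPerTarget samples (made-up numbers).
per_minute = [180, 200, 190, 210, 170, 220]


def aggregate(samples, bucket_size, stat):
    """Re-bucket 1-minute samples into longer periods using one statistic."""
    buckets = [
        samples[i:i + bucket_size]
        for i in range(0, len(samples), bucket_size)
    ]
    if stat == "sum":
        return [sum(b) for b in buckets]
    if stat == "average":
        return [sum(b) / len(b) for b in buckets]
    if stat == "maximum":
        return [max(b) for b in buckets]
    raise ValueError(f"unsupported statistic: {stat}")


# Re-bucket the six 1-minute samples into two 3-minute periods.
summed = aggregate(per_minute, 3, "sum")        # grows with the period length
averaged = aggregate(per_minute, 3, "average")  # stays on the per-minute scale
peaks = aggregate(per_minute, 3, "maximum")     # keeps the per-minute spikes

print(summed, averaged, peaks)
```

With a summed statistic, the plotted value depends on how long the period is, so it can no longer be compared against a per-minute limit such as 200 RPM per server.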

Is there a way around this?


Solution

  • In your first graph, you have the period set to 1 minute, and CloudWatch respects that.

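    When you put that graph on a dashboard and change the dashboard's time range, CloudWatch adjusts the period so that the dashboard loads faster. With a longer period, each datapoint aggregates more 1-minute samples, so a summed metric shows much larger values.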

    You can change that behaviour via Actions -> Period at the top of the open dashboard: change the value from Auto to Do not override. The dashboard will then respect the period you have set on the graph.

    To make the change permanent, go to Actions -> View/edit source and put "periodOverride": "inherit" above the widgets list (make sure to save the dashboard afterwards; it doesn't save automatically).

    {
        "periodOverride": "inherit",
        "widgets": ...
    }
    

    For more info:
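The source edit described above can also be scripted. The following is a minimal sketch, not an official recipe: the helper name `with_period_override` and the example dashboard source are hypothetical, and the function only manipulates the dashboard JSON locally.

```python
import json


def with_period_override(dashboard_body: str, mode: str = "inherit") -> str:
    """Return the dashboard source JSON with "periodOverride" set to `mode`."""
    # "inherit" keeps each widget's own period; "auto" lets the dashboard adjust it.
    if mode not in ("inherit", "auto"):
        raise ValueError(f"unexpected periodOverride value: {mode}")
    body = json.loads(dashboard_body)
    body["periodOverride"] = mode
    return json.dumps(body, indent=4)


# Hypothetical, minimal dashboard source used only for illustration.
example_source = '{"widgets": []}'
updated = with_period_override(example_source)
print(updated)

# Applying it to a real dashboard would look roughly like this with boto3
# (assumes configured credentials; "my-dashboard" is a placeholder name):
#
#   import boto3
#   cw = boto3.client("cloudwatch")
#   current = cw.get_dashboard(DashboardName="my-dashboard")["DashboardBody"]
#   cw.put_dashboard(DashboardName="my-dashboard",
#                    DashboardBody=with_period_override(current))
```

Keeping the JSON edit in a pure function makes it easy to check the result before writing anything back to CloudWatch.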