
rrdtool graph ignoring --step?


I have RRD files holding several months of PDP data at a 5-minute interval.

For general-purpose graphs it is fine for rrdtool to decide automatically which RRA to use for displaying the graph.

But some of my graphs show 95th-percentile data in the legend, and I need that value to be calculated from the "exact" 5-minute data, because calculating a percentile from aggregated data points can (by its nature) produce dramatically incorrect values.
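
To illustrate how far off the value can be, here is a minimal sketch in Python. The traffic pattern is invented, and the nearest-rank percentile used here is an assumption for illustration; it is not necessarily the exact method rrdtool's PERCENT operator uses.

```python
import math

def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def consolidate_average(values, factor):
    """Average consecutive groups of `factor` samples, like an AVERAGE RRA."""
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values) - factor + 1, factor)]

# Hypothetical data: one day of 5-minute samples (288 points),
# a steady 10 units with one 5-minute burst of 100 units every hour.
raw = [100 if i % 12 == 0 else 10 for i in range(288)]

# The same day consolidated to 1-hour averages (12 PDPs per row).
hourly = consolidate_average(raw, 12)

p95_raw = percentile_nearest_rank(raw, 95)
p95_hourly = percentile_nearest_rank(hourly, 95)
print(p95_raw, p95_hourly)  # 100 17.5
```

With this made-up data the bursts make up more than 5% of the raw samples, so the raw percentile lands on the burst value, while the hourly averages never get anywhere near it.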

graph:

...
--step 300
...
"VDEF:perca=a,95,PERCENT",
...

created with:

        '-s', '300',
       ...
        "RRA:AVERAGE:0.5:1:53568",      # 6 months pdp
        "RRA:AVERAGE:0.5:12:8904",      # 1 hour, 1 year.
        "RRA:AVERAGE:0.5:288:730",      # 1 day, 2 years.
        "RRA:AVERAGE:0.5:2016:520",     # 1 week, 10 years.
        "RRA:MAX:0.5:1:600",            # 5 min: 2 days
        "RRA:MAX:0.5:12:8904",          # 1 hour, 1 year.
        "RRA:MAX:0.5:288:730",          # 1 day, 2 years.
        "RRA:MAX:0.5:2016:520",         # 1 week, 10 years
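
As a sanity check on the retention comments above, the time each RRA covers should be step × PDPs-per-row × rows. A small sketch of that arithmetic (the helper name is made up for this example):

```python
STEP_SECONDS = 300  # from '-s', '300'

# (CF, PDPs per row, rows) copied from the RRA definitions above
rras = [
    ("AVERAGE", 1, 53568),
    ("AVERAGE", 12, 8904),
    ("AVERAGE", 288, 730),
    ("AVERAGE", 2016, 520),
    ("MAX", 1, 600),
    ("MAX", 12, 8904),
    ("MAX", 288, 730),
    ("MAX", 2016, 520),
]

def retention_days(steps, rows, step_seconds=STEP_SECONDS):
    """Days of history one RRA holds: step * PDPs per row * rows."""
    return steps * rows * step_seconds / 86400

for cf, steps, rows in rras:
    resolution = steps * STEP_SECONDS
    print(f"{cf:8} {resolution:>7}s per row, {retention_days(steps, rows):8.1f} days")
```

So the 5-minute AVERAGE RRA holds 186 days, while the 5-minute MAX RRA holds only about 2 days.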

Solution

  • This is due to data consolidation being performed prior to the VDEF calculation.

    Although your rrdtool graph arguments specify a step of 300s, that step is narrower than one pixel of the graph, so the data series is averaged further before it reaches the VDEF. All the CDEF and VDEF functions always work with a time series of one CDP per pixel. From the RRDTool manual:

    Note: a step smaller than one pixel will silently be ignored.

    This means that, while you can decrease the resolution of the data, you cannot increase it. Sadly, to get an accurate 95th Percentile, you need higher-resolution data.
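
To make the pixel limit concrete, here is a rough sketch in Python. The function names are hypothetical and the model is deliberately simplified: it only compares the graph's time span with its width in pixels, and ignores how rrdtool aligns the step to the available RRAs.

```python
import math

def seconds_per_pixel(start, end, width):
    """Time span covered by one pixel column of the graph."""
    return (end - start) / width

def min_width_for_step(start, end, step=300):
    """Smallest graph width (in pixels) at which one pixel is no wider than `step`."""
    return math.ceil((end - start) / step)

DAY = 86400

# A 30-day graph that is 400 pixels wide: each pixel spans far more than 300s,
# so a requested step of 300 is smaller than one pixel.
print(seconds_per_pixel(0, 30 * DAY, 400))  # 6480.0

# Width needed before a 300s step is no longer smaller than one pixel.
print(min_width_for_step(0, 30 * DAY))      # 8640
print(min_width_for_step(0, DAY))           # 288
```

Under this simplified model, a one-day graph would need a --width of at least 288 pixels, and a 30-day graph 8640 pixels, before 5-minute data could map to one point per pixel.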

    So, if you omit the --step 300 in a narrow graph, what will happen is:

    With the --step 300 it is slightly different process, but the same result:

    So, you can see the final outcome is the same; it's just a matter of where the 300s -> 1h consolidation happens, either in the RRA or at graph time.

    When using a wide graph, the time per pixel becomes smaller, and RRDTool then no longer needs to perform its additional consolidation of the data, resulting in a more accurate calculation:

    When you retrieve the raw data using rrdtool fetch, this extra consolidation doesn't happen, so you get:

    Your next question will likely be, how do I stop this from happening? The unfortunate answer is that you cannot. RRDTool does not have a Percentile type CF, and so the correct calculations cannot be performed in the RRA (this would be the only real solution).

    The Routers2 frontend for MRTG calculates 95th percentiles for its graphs by performing a high-resolution fetch to get the raw data, calculating the value internally, and then passing it in as an HRULE when making the graph. In other words, it doesn't use a VDEF at all, because of the problem you are experiencing.
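
A minimal Python sketch of that fetch-then-HRULE approach follows. This is not Routers2's actual code; the function names, the colour, the legend text, and the sample output are all assumptions made for illustration. It relies on `rrdtool fetch` printing one `timestamp: value ...` line per row, and uses a nearest-rank percentile, which may differ slightly from other percentile definitions. For fetch to return the requested resolution, the start and end times generally need to be multiples of that resolution.

```python
import math
import subprocess

def fetch_raw(rrd_path, start, end, resolution=300):
    """Run `rrdtool fetch` at the given resolution and return its text output.
    Defined for illustration; it is not called in this example."""
    cmd = ["rrdtool", "fetch", rrd_path, "AVERAGE",
           "--resolution", str(resolution),
           "--start", str(start), "--end", str(end)]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

def parse_fetch_output(text, ds_index=0):
    """Extract one data source column as floats, skipping unknown (NaN) rows."""
    values = []
    for line in text.splitlines():
        timestamp, sep, rest = line.partition(":")
        if not sep or not timestamp.strip().isdigit():
            continue  # header line with DS names, or a blank line
        fields = rest.split()
        if ds_index >= len(fields):
            continue
        value = float(fields[ds_index])
        if not math.isnan(value):
            values.append(value)
    return values

def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def hrule_argument(value, color="FF0000", legend="95th percentile"):
    """Build an HRULE graph argument (colour and legend are placeholders)."""
    return f"HRULE:{value}#{color}:{legend}"

# Hypothetical fetch output for a single data source named "a".
sample = """\
                          a

1700000100: 1.0000000000e+01
1700000400: 2.0000000000e+01
1700000700: -nan
1700001000: 3.0000000000e+01
"""

values = parse_fetch_output(sample)
p95 = percentile_nearest_rank(values, 95)
print(hrule_argument(p95))  # HRULE:30.0#FF0000:95th percentile
```

The resulting string would be passed to rrdtool graph in place of the VDEF-based percentile line, so the plotted value no longer depends on the graph width.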