I have RRD Files with multiple months of PDP Data (5min Interval).
For general-purpose graphs it's fine when rrdtool automatically decides which RRA to use for displaying the graph.
But some of my graphs contain 95th-percentile data in the legend, which I need to be calculated from "exact" 5-minute-interval data, because calculating a percentile from aggregated data points can (by its nature) lead to dramatically incorrect values.
I can fetch data from the RRD file with a step of 300 and get the right data to calculate the percentile on my own.
But in the graph, the percentile changes with the width of the graph, even if the time range is the same and 300s data is available for the whole time range.
graph:
...
--step 300
...
"VDEF:perca=a,95,PERCENT",
...
created with:
'-s', '300',
...
"RRA:AVERAGE:0.5:1:53568", # 6 months pdp
"RRA:AVERAGE:0.5:12:8904", # 1 hour, 1 year.
"RRA:AVERAGE:0.5:288:730", # 1 day, 2 years.
"RRA:AVERAGE:0.5:2016:520", # 1 week, 10 years.
"RRA:MAX:0.5:1:600", # 5 min: 2 days
"RRA:MAX:0.5:12:8904", # 1 hour, 1 year.
"RRA:MAX:0.5:288:730", # 1 day, 2 years.
"RRA:MAX:0.5:2016:520", # 1 week, 10 years
This is due to data consolidation being performed prior to the VDEF calculation.
Although your rrdtool graph arguments specify a step of 300s, this is less than the time covered by one pixel of the graph, and so the data series are further averaged before you get to the VDEF. All the CDEF and VDEF functions will always work with a time series of one CDP per pixel. From the RRDTool manual:
Note: a step smaller than one pixel will silently be ignored.
This means that, while you can decrease the resolution of the data, you cannot increase it. Sadly, to get an accurate 95th Percentile, you need higher-resolution data.
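To make the "step smaller than one pixel" condition concrete, here is a small arithmetic sketch. The function names are made up for illustration, and it only approximates the rule quoted above; it is not RRDTool's actual code.

```python
import math

def seconds_per_pixel(start, end, width_px):
    """Time span covered by one horizontal pixel of the graph area."""
    return (end - start) / width_px

def min_width_for_step(start, end, step=300):
    """Smallest graph width (in pixels) at which one pixel covers at most `step` seconds."""
    return math.ceil((end - start) / step)

# One week of data on a 400-pixel-wide graph:
week = 7 * 24 * 3600
print(seconds_per_pixel(0, week, 400))  # 1512.0 seconds per pixel, so a 300s step is ignored
print(min_width_for_step(0, week))      # 2016 pixels needed for one 300s point per pixel
```

So for a one-week range, the graph would have to be about 2016 pixels wide before the 5-minute data could be used unconsolidated.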
So, if you omit the --step 300 in a narrow graph, what will happen is:
With the --step 300 it is a slightly different process, but the same result:
So, you can see the final outcome is the same: it's just a question of where the 300s -> 1h consolidation happens, either in the RRA or at graph time.
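The effect of that 300s -> 1h averaging on a percentile can be shown with made-up numbers. This is only a sketch: the traffic pattern is hypothetical, and the nearest-rank percentile used here may differ in detail from what RRDTool's PERCENT does internally.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def consolidate_average(values, factor):
    """Average consecutive groups of `factor` samples, like an AVERAGE consolidation."""
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values) - factor + 1, factor)]

# Hypothetical traffic: each hour has eleven quiet 5-minute samples and one burst.
raw = ([10.0] * 11 + [100.0]) * 24     # 288 samples = one day at 300s
hourly = consolidate_average(raw, 12)  # 24 samples = one day at 3600s

print(percentile(raw, 95))     # 100.0
print(percentile(hourly, 95))  # 17.5
```

The bursts make up more than 5% of the raw samples, so the raw 95th percentile is the burst value, while every hourly average is the same smoothed figure.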
When using a wide graph, the time per pixel becomes smaller, and RRDTool then no longer needs to perform its additional consolidation of the data, resulting in a more accurate calculation:
When you retrieve the raw data using rrdtool fetch, this extra consolidation doesn't happen, so you get:
Your next question will likely be, how do I stop this from happening? The unfortunate answer is that you cannot. RRDTool does not have a Percentile type CF, and so the correct calculations cannot be performed in the RRA (this would be the only real solution).
The Routers2 frontend for MRTG calculates 95th percentiles for its graphs, and the way it does this is to perform a high-resolution fetch to get the raw data, calculate the value internally, and then pass it in as an HRULE when making the graph. In other words, it doesn't use a VDEF at all, because of the problem you are experiencing.
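A rough sketch of that fetch-then-HRULE approach, in Python rather than Routers2's own code. It assumes the rrdtool binary is on the PATH, that the first data source column is the one you want, and that the text output of rrdtool fetch has the usual "timestamp: value ..." layout. The helper names, colour and legend are placeholders.

```python
import math
import subprocess

def parse_fetch_output(text, column=0):
    """Extract one data column from `rrdtool fetch` text output, skipping unknown (NaN) rows."""
    values = []
    for line in text.splitlines():
        timestamp, sep, rest = line.partition(":")
        if not sep or not timestamp.strip().isdigit():
            continue  # header line with DS names, or a blank line
        value = float(rest.split()[column])
        if not math.isnan(value):
            values.append(value)
    return values

def percentile(values, pct):
    """Nearest-rank percentile (may differ in detail from RRDTool's PERCENT)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def fetch_percentile(rrd_path, start, end, pct=95, step=300):
    """Run `rrdtool fetch` at the requested resolution and compute the percentile ourselves."""
    out = subprocess.run(
        ["rrdtool", "fetch", rrd_path, "AVERAGE",
         "-r", str(step), "-s", str(start), "-e", str(end)],
        check=True, capture_output=True, text=True,
    ).stdout
    return percentile(parse_fetch_output(out), pct)

def hrule_argument(value, color="FF0000", legend="95th percentile"):
    """Build an HRULE graph argument that draws the precomputed value."""
    return f"HRULE:{value}#{color}:{legend}"
```

The string returned by hrule_argument would be added to the graph arguments in place of the VDEF-based line. For fetch to return the 300s RRA, the start and end times generally need to be multiples of the requested resolution.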