rrdtoolrrd

RRDTool data values (e.g. max value) are different in different time resolutions


currently I'm experimenting a bit with RRDTool. I'm aware that the accuracy gets lower the longer the time periods are selected. But I thought I could bypass this with my datasource settings.

For example temperature and humidity from my house, resoultion 1h:

1h resoultion

And now with the resolution of 1d:

1d resolution

As you could see, there is a great difference for the max. value of the blue line.

I created my datasources and archives with this values:

"rrdtool create temp.rrd --step 30",
"DS:temp:GAUGE:60:U:U",
"DS:humidity:GAUGE:60:U:U",
"RRA:AVERAGE:0.5:1:1051200",
"RRA:MAX:0.5:1:1051200",
"RRA:MIN:0.5:1:1051200",

I thought that 1051200 (1 year = 31536000 / 30 s (resoulution) = 1051200) is correct for saving every value for a year and that there should be no need for interpolating.

Is it possible to get the exact values displayed even if the resolution changes (for example the max humidity (Luftfeuchtigkeit) at 99.9%)?

Here are my values for image creation:

"--start" => "-1h", (-1d etc-)
"--title" => "Haustemperatur",
"--vertical-label" => "°C / % RLF",
"--width" => 800,
"--height" => 600,
"--lower-limit" => "-5",
"DEF:temperatur=$rrdFile:temperatur:LAST",
"DEF:humidity=$rrdFile:humidity:LAST",
"LINE1:temperatur#33CC33:Temperatur",
"GPRINT:temperatur:LAST:\t\tAktuell\: %4.2lf °C",
"GPRINT:temperatur:AVERAGE:Schnitt\: %4.2lf °C",
"GPRINT:temperatur:MAX:Maximum\: %4.2lf °C\j",
"LINE1:humidity#0000FF:Relative Luftfeuchtigkeit",
"GPRINT:humidity:LAST:Aktuell\: %4.2lf %%",
"GPRINT:humidity:AVERAGE:Schnitt\: %4.2lf %%",
"GPRINT:humidity:MAX:Maximum\: %4.2lf %%\j",

Thanks for your help and any suggestions.

P.S. I'm using a library to generate the graphs and the database, please do not be surprised about possible syntax errors.


Solution

  • Your problem is that you are causing the values to be rolled-up on the fly at graph time, but have not correctly specified which rollup function to use. Your second graph is showing the MAXIMUM of the LAST in the interval, not the true Maximum.

    There are a few issues to explain with this configuration:

    Firstly, your RRD is defined using 3 RRAs with 1cdp=1pdp and different consolidation functions (AVG, MIN, MAX). This means they are functionally identical, but they do not save you any time at graphing as they have not done any pre-rollup for you! You should definitely consider having just one of these (probably AVG) and adding others at lower resolution to help speed up graphing when you have a bigger time window.

    Secondly, you need to specify the on-the-fly rollup function. When graphing, RRDTool will work out the best RRA to use based on your DEF lines, and will perform any additional consolidation required on the fly. This can take a long time if the only available RRA is too high-granularity.

    Your graph request uses DEF:temperatur=$rrdFile:temperatur:LAST but you do not actually have a LAST type RRA, so RRDTool will grab the last average. Your RRA data points are at 30s interval, but your second graph has (approx) 5min per pixel, meaning that RRDTool needs to grab the 10 entries from the RRA, and print the last. Looking at the data in the top graph, it seems that the last in that interval was the 66 value, though previous ones were 100.

    So you have a choice. Do you want the graph to show the average for the time period, the maximum, or both? Do you want the figures at the bottom to show the maximum of the average, or the maximum of everything?

    For example

    "DEF:temperatur=$rrdFile:temperatur:AVERAGE",
    "DEF:humidity=$rrdFile:humidity:AVERAGE",
    "DEF:temperaturmax=$rrdFile:temperatur:MAX;reduce=MAX",
    "DEF:humiditymax=$rrdFile:humidity:MAX;reduce=MAX",
    "LINE1:temperatur#33CC33:Temperatur",
    "LINE1:temperaturmax#66EE66:Maximum Temperatur",
    "GPRINT:temperatur:LAST:\t\tAktuell\: %4.2lf °C",
    "GPRINT:temperatur:AVERAGE:Schnitt\: %4.2lf °C",
    "GPRINT:temperaturmax:MAX:Maximum\: %4.2lf °C\j",
    "LINE1:humidity#0000FF:Relative Luftfeuchtigkeit",
    "LINE1:humiditymax#3333FF:Maximum Luftfeuchtigkeit",
    "GPRINT:humidity:LAST:Aktuell\: %4.2lf %%",
    "GPRINT:humidity:AVERAGE:Schnitt\: %4.2lf %%",
    "GPRINT:humiditymax:MAX:Maximum\: %4.2lf %%\j",
    

    In this case, we define a separate DEF for the maximum data set, so that we can always obtain the highest value even after consolidation. This is also used in the GPRINT so that we get the MAX of the MAX rather than the MAX of the AVERAGE. The Maximum line is now drawn separately to the average line, so that we can see the effect of any rollup of data - the lines will be together at high-resolution but get further apart as the time window widens and resolution decreases.

    TheDEF is set to force any rollup function used for the maxima to be MAX rather than AVG, so we can be sure to get the maximum rather than average of maxima.

    We are also using AVERAGE rather than LAST in order to get more meaningful data after rollup. Note that we could also use a separate DEF for the LAST as well if we wanted to though it is of less usefulness.

    Note that, if you ever expect to be generating graphs over more than a few days, you should definitely consider adding some lower-resolution RRAs for AVERAGE and MAX or else the graphs will generate very slowly. RRDTool is designed with the intention that data will be rolled up over time, rather than (as in a traditional database) every sample kept as-is. So, unless you really need to have 30s resolution data kept for an entire year, you may prefer to keep this high resolution data for only a week, and then have separate RRAs that roll up to 1 hour resolution and keep for longer. Many people keep the 30s for 2 days, then 30min-summary for 2 weeks, 2h-summary for 2 months, and then 1day-summary for 2 years.

    For more information, see the RRDTool manual pages.