I'm monitoring more than 300 servers with Ganglia, which uses RRD as its database to collect and store resource data for each server.
I would like to keep about 2 years of history or more, so after reading this article, I think my RRA configuration should be:
RRAs "RRA:AVERAGE:0.5:1:17520"
17520 = (365 days [year] x 2) * 24 [hour]
This is the Ganglia default configuration, which is what I'm running today:
#
# Round-Robin Archives
# You can specify custom Round-Robin archives here (defaults are listed below)
#
# RRAs "RRA:AVERAGE:0.5:1:244" "RRA:AVERAGE:0.5:24:244" "RRA:AVERAGE:0.5:168:244" "RRA:AVERAGE:0.5:672:244" \
# "RRA:AVERAGE:0.5:5760:374"
#
Is my way of thinking right, or am I missing something here?
After studying this subject for a while, I came up with an answer that may help someone in the future. I read two articles many times, and I recommend both. Read "Creating an initial RRD" first, then "How to create an RRDTool database".
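I will try to explain it simply. The format is RRA:CF:xff:steps:rows:
RRA: Round Robin Archive
CF: Consolidation Function (for example AVERAGE, MIN, MAX or LAST)
XFF: xfiles factor, the fraction of a consolidation interval that may be unknown data while the consolidated value still counts as known
steps: how many primary data points are consolidated into one row
rows: how many rows the archive keeps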
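To make the field order concrete, here is a minimal sketch that splits an RRA definition into its parts. The `RRA` tuple and `parse_rra` helper are hypothetical names made up for this illustration, not part of rrdtool or Ganglia, and the sketch only handles the plain `RRA:CF:xff:steps:rows` form with numeric values.

```python
from typing import NamedTuple


class RRA(NamedTuple):
    cf: str     # consolidation function, e.g. AVERAGE
    xff: float  # xfiles factor
    steps: int  # primary data points consolidated into one row
    rows: int   # number of rows kept in the archive


def parse_rra(definition: str) -> RRA:
    """Split an 'RRA:CF:xff:steps:rows' string into typed fields."""
    tag, cf, xff, steps, rows = definition.split(":")
    if tag != "RRA":
        raise ValueError(f"not an RRA definition: {definition!r}")
    return RRA(cf, float(xff), int(steps), int(rows))


rra = parse_rra("RRA:AVERAGE:0.5:1:288")
print(rra.cf, rra.xff, rra.steps, rra.rows)
```

Having steps and rows as separate integers is what the rest of the calculation relies on.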
The biggest issue for me was working out the right values for steps and rows.
After reading, I came up with this explanation:
1 day - 5-minute resolution
1 week - 15-minute resolution
1 month - 1-hour resolution
1 year - 6-hour resolution
RRA:AVERAGE:0.5:1:288 \
RRA:AVERAGE:0.5:3:672 \
RRA:AVERAGE:0.5:12:744 \
RRA:AVERAGE:0.5:72:1460
Keep in mind that our step is 300 seconds, so the idea is very simple.
If I want to cover one day, which has 86400 seconds, at 5-minute resolution, as shown in the first example, how many rows do I need? The answer is 288 rows. Why?
86400 seconds [1 day] / 300 seconds [5 minutes] = 288 rows
Another example, if I want to cover:
1 week [ = 604800 seconds ]
at 15-minute resolution [ = 900 seconds ]
604800 / 900 = 672 rows
The same goes for the other values. This is how you find out how many rows you need.
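The row arithmetic above can be sketched in a few lines of Python. `rows_needed` is a hypothetical helper name, and the 31-day month is an assumption chosen because it reproduces the 744 rows used in the example.

```python
DAY = 86400      # seconds in one day
WEEK = 7 * DAY   # seconds in one week


def rows_needed(span_seconds: int, resolution_seconds: int) -> int:
    """Rows required to cover span_seconds with one row per resolution_seconds."""
    if span_seconds % resolution_seconds:
        raise ValueError("span is not a whole multiple of the resolution")
    return span_seconds // resolution_seconds


print(rows_needed(DAY, 300))        # 1 day at 5 minutes   -> 288
print(rows_needed(WEEK, 900))       # 1 week at 15 minutes -> 672
print(rows_needed(31 * DAY, 3600))  # 31 days at 1 hour    -> 744
```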
Finding out how many steps you need is very simple: it is the number you multiply your base step by to get the resolution you want.
Let me explain: our step is 300 seconds, right?
So if we want a resolution of 5 minutes [ = 300 seconds ], we just multiply by 1.
Likewise, 15 minutes means 300 seconds x 3, 1 hour means 300 x 12, 6 hours means 300 x 72, and so on.
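The same multiplier logic as a short sketch, assuming the 300-second base step used so far. `steps_needed` is a hypothetical helper name for this illustration.

```python
def steps_needed(resolution_seconds: int, base_step_seconds: int = 300) -> int:
    """How many base steps are consolidated into one row at this resolution."""
    if resolution_seconds % base_step_seconds:
        raise ValueError("resolution must be a whole multiple of the base step")
    return resolution_seconds // base_step_seconds


resolutions = [
    ("5 minutes", 300),
    ("15 minutes", 900),
    ("1 hour", 3600),
    ("6 hours", 21600),
]
for label, seconds in resolutions:
    print(f"{label}: steps = {steps_needed(seconds)}")
```

This prints 1, 3, 12 and 72, the steps values of the four RRAs shown earlier.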
In my specific case, I would like my step to be 30 seconds, so I came up with this structure:
steps   consolidates       resolution    calculation
1       every time         30 seconds    1 * 30s = 30s
2       every 2nd time     1 minute      2 * 30s = 1m
4       every 4th time     2 minutes     4 * 30s = 2m
10      every 10th time    5 minutes     10 * 30s = 5m
20      every 20th time    10 minutes    20 * 30s = 10m
60      every 60th time    30 minutes    60 * 30s = 30m
80      every 80th time    40 minutes    80 * 30s = 40m
100     every 100th time   50 minutes    100 * 30s = 50m
120     every 120th time   1 hour        120 * 30s = 1h
240     every 240th time   2 hours       240 * 30s = 2h
360     every 360th time   3 hours       360 * 30s = 3h
RRA:AVERAGE:0.5:1:120 \
RRA:AVERAGE:0.5:2:120 \
RRA:AVERAGE:0.5:4:120 \
RRA:AVERAGE:0.5:10:288 \
RRA:AVERAGE:0.5:20:1008 \
RRA:AVERAGE:0.5:60:1440 \
RRA:AVERAGE:0.5:80:3240 \
RRA:AVERAGE:0.5:100:5184 \
RRA:AVERAGE:0.5:120:8760 \
RRA:AVERAGE:0.5:240:8760 \
RRA:AVERAGE:0.5:360:8760
Which means:
1 hour - 30-second resolution
2 hours - 1-minute resolution
4 hours - 2-minute resolution
1 day - 5-minute resolution
1 week - 10-minute resolution
1 month - 30-minute resolution
3 months - 40-minute resolution
6 months - 50-minute resolution
1 year - 1-hour resolution
2 years - 2-hour resolution
3 years - 3-hour resolution
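As a sanity check, the time each archive covers is simply step x steps x rows. The sketch below applies that to the list above, assuming a 30-second step. `coverage_seconds` is a hypothetical helper name, and the last line applies the same formula to the single RRA proposed in the question.

```python
BASE_STEP = 30  # seconds; assumed step of the RRD

RRAS = [
    "RRA:AVERAGE:0.5:1:120",
    "RRA:AVERAGE:0.5:2:120",
    "RRA:AVERAGE:0.5:4:120",
    "RRA:AVERAGE:0.5:10:288",
    "RRA:AVERAGE:0.5:20:1008",
    "RRA:AVERAGE:0.5:60:1440",
    "RRA:AVERAGE:0.5:80:3240",
    "RRA:AVERAGE:0.5:100:5184",
    "RRA:AVERAGE:0.5:120:8760",
    "RRA:AVERAGE:0.5:240:8760",
    "RRA:AVERAGE:0.5:360:8760",
]


def coverage_seconds(definition: str, base_step: int = BASE_STEP) -> int:
    """Time span one RRA covers: base step * steps * rows."""
    _tag, _cf, _xff, steps, rows = definition.split(":")
    return base_step * int(steps) * int(rows)


for definition in RRAS:
    hours = coverage_seconds(definition) / 3600
    print(f"{definition:<26} covers {hours:g} hours ({hours / 24:.2f} days)")

# The RRA from the question, at the same assumed 30-second step:
question_days = coverage_seconds("RRA:AVERAGE:0.5:1:17520") / 86400
print(f"RRA:AVERAGE:0.5:1:17520 covers {question_days:.2f} days")
```

With steps set to 1, 17520 rows hold 17520 samples of 30 seconds each, which is about 6 days rather than 2 years. That is why the longer archives need a larger steps value.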
Well, I hope this helps someone, that's all.