I have a log file in this format:
2021-05-17 07:59:10.496 33821 ERROR bla bla bla
2021-05-17 08:03:10.957 33821 WARNING bla bla bla
2021-05-17 08:12:10.094 33821 ERROR bla bla bla
2021-05-17 08:40:10.592 33821 INFO bla bla bla
I need to count the messages of each level (ERROR, WARNING, INFO) separately, in time intervals of 4 hours. So far I can count the messages of each level across the entire log file, but I don't know how to count them per 4-hour interval. I am writing the script in bash and using awk:
awk '($4 ~ /INFO/)' $file | awk '{print $4}' | uniq -c | sort -r
and similarly for ERROR and WARNING.
First get the largest multiple of 4 hours at or before the event: int(substr($2,1,2)/4) * 4 (e.g. for 07:59:10 this returns 4). Then format it nicely, e.g. to print 04:00-07:59, and then sort everything and run uniq -c as you are doing already:
awk '($4 ~ /INFO/)' $file |
  awk '{
    x = int(substr($2,1,2)/4) * 4;
    printf "%s %02d:00-%02d:59 %s\n", $1, x, x+3, $4
  }' |
  sort |
  uniq -c
This will print all the counts sorted by date and 4-hour interval. For your example, using all lines (cat $file instead of awk '($4 ~ /INFO/)' $file), it gives:
1 2021-05-17 04:00-07:59 ERROR
1 2021-05-17 08:00-11:59 ERROR
1 2021-05-17 08:00-11:59 INFO
1 2021-05-17 08:00-11:59 WARNING
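If you would rather avoid the separate filter, sort and uniq -c steps, the same counts can probably be collected in a single awk pass with an associative array. This is only a sketch under some assumptions: the level is always field 4, the time is always field 2 in HH:MM:SS form, and the function name count_by_interval and the temporary sample file are made up here for illustration.

```shell
#!/bin/sh
# Sketch: count messages per date, 4-hour window and level in one awk pass.
# count_by_interval is a hypothetical helper name, not part of the question.
count_by_interval() {
    awk '
    $4 ~ /^(ERROR|WARNING|INFO)$/ {
        # Start hour of the 4-hour window, e.g. 07 -> 4, 08 -> 8
        start = int(substr($2, 1, 2) / 4) * 4
        key = sprintf("%s %02d:00-%02d:59 %s", $1, start, start + 3, $4)
        count[key]++
    }
    END {
        # Array order is unspecified in awk, so sorting happens afterwards
        for (key in count) print count[key], key
    }' "$1" | sort -k2,4
}

# Sample input: the four lines from the question, written to a temp file
sample=$(mktemp)
cat > "$sample" <<'EOF'
2021-05-17 07:59:10.496 33821 ERROR bla bla bla
2021-05-17 08:03:10.957 33821 WARNING bla bla bla
2021-05-17 08:12:10.094 33821 ERROR bla bla bla
2021-05-17 08:40:10.592 33821 INFO bla bla bla
EOF

count_by_interval "$sample"
```

The sort -k2,4 at the end orders the lines by date, interval and level rather than by count, so the layout matches the output above (minus the leading padding that uniq -c adds).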