bash awk grep logfile

Count log messages by level within time ranges in a log file


I have a log file in this format:

2021-05-17 07:59:10.496 33821 ERROR bla bla bla
2021-05-17 08:03:10.957 33821 WARNING bla bla bla
2021-05-17 08:12:10.094 33821 ERROR bla bla bla
2021-05-17 08:40:10.592 33821 INFO bla bla bla

I need to count the messages of each level (ERROR, WARNING, INFO) separately, in 4-hour time intervals. So far I can count the messages of each level across the entire log file, but I don't know how to count them per 4-hour interval. I am writing the script in bash and filtering with awk:

awk '($4 ~ /INFO/)' "$file" | awk '{print $4}' | uniq -c | sort -r

and similarly for ERROR and WARNING.


Solution

  • First round the event's hour down to the nearest multiple of 4: int(substr($2,1,2)/4) * 4 (e.g. for 07:59:10 this returns 4). Then format it as a readable range, e.g. 04:00-07:59, and finally sort the lines and run uniq -c, as you are already doing:

    awk '($4 ~ /INFO/)' "$file" |
        awk '{
            # Round the hour down to the start of its 4-hour bucket (0, 4, ..., 20).
            x = int(substr($2,1,2)/4) * 4;
            printf "%s %02d:00-%02d:59 %s\n", $1, x, x+3, $4
        }' |
        sort |
        uniq -c
    

    This prints the counts sorted by date and 4-hour interval. For your example, using all lines (cat $file instead of awk '($4 ~ /INFO/)' $file), it gives:

          1 2021-05-17 04:00-07:59 ERROR
          1 2021-05-17 08:00-11:59 ERROR
          1 2021-05-17 08:00-11:59 INFO
          1 2021-05-17 08:00-11:59 WARNING
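
The same bucketing can also be done in a single awk pass, without the separate filter and uniq -c. The following is only a sketch under the assumptions of the question's format (date in field 1, time in field 2, level in field 4); the function name count_by_bucket is hypothetical, not something from the original answer:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count messages per date, 4-hour bucket and level
# in one awk pass. Assumes: $1 = date, $2 = time (HH:MM:SS...), $4 = level.
count_by_bucket() {
    awk 'NF >= 4 {                                  # skip blank or short lines
        start = int(substr($2, 1, 2) / 4) * 4       # hour rounded down to a multiple of 4
        key = sprintf("%s %02d:00-%02d:59 %s", $1, start, start + 3, $4)
        count[key]++                                # one counter per date/bucket/level
    }
    END {
        # Array order is unspecified in awk, so sorting is left to sort(1).
        for (key in count) printf "%7d %s\n", count[key], key
    }' "$@" | LC_ALL=C sort -k2
}

# Usage (reads the named files, or standard input if none are given):
#   count_by_bucket "$file"
```

Because the counts are kept in an associative array, the log is read once and only the short summary is sorted, which may matter for large files. To restrict the output to a single level, the result can be filtered afterwards, e.g. with grep ' ERROR$'.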