I have access log file with data only per 1 day like:
10.2.21.120 "-" - [26/Jan/2013:19:15:11 +0000] "GET /server/ad?uid=abc&type=PST HTTP/1.1" 202 14 10
10.2.21.120 "-" - [26/Jan/2013:19:17:22 +0000] "GET /server/ad?uid=abc&type=PST HTTP/1.1" 204 14 9
10.2.22.130 "-" - [26/Jan/2013:19:27:53 +0000] "GET /server/ad?uid=abc&type=PST HTTP/1.1" 200 14 8
I am using the following command:
awk '$9 == 200 { s++ } END { print s / NR * 100; }' access.log
This awk may help you
$ awk -F[:\ ] '{count[$5]++}; $12 == 200 { hour[$5]++} END { for (i in hour) print i, hour[i]/count[i]*100 }' input
Test
$ cat input
10.1.20.123 "1.1.1.1" - [15/Oct/2014:12:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 10
10.1.20.123 "1.1.1.1" - [15/Oct/2014:12:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 100 3014 10
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:26 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 9
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 8
$ awk -F[:\ ] '{count[$5]++}; $12 == 200 { hour[$5]++} END { for (i in hour) print i, hour[i]/count[i]*100 }' input
12 50
13 100
What it does
{count[$5]++}
array count
stores the number of occurence of each hour from the log file.
$12 == 200 { hour[$5]++}
Now it the log is success, that is $12 == 200
then the corresponding value in hour
array is incremented.
So count[13]
will contain total enteries from hour 13
where as hour[13]
would contain count of succuessfull entries
END { for (i in hour) print i, hour[i]/count[i]*100 }
prints the hour, percentage