I need suggestions for better bash programming when processing a file (especially the key-value pairs within a line).
I was trying to do the following for each matching log line: print the request_id on a new line and, if IPA has the value "MASKED", append " MASK" to the request_id in the output.
I wrote the code below to process it:
while read line
do
    if [ $( echo "$line" | grep "critical/warning" | grep -c "request_id=") -gt 0 ]
    then
        request_id=$( echo "$line"| awk -F"request_id=" '{print $2}'| awk '{print $1}')
        if [ $(echo "$line" | grep -c "IPA=") -gt 0 ]
        then
            IPA=$(echo "$line"| awk -F"IPA=" '{print $2}'| awk '{print $1}');
            [[ "M$IPA" == "M\"MASKED\"" ]] && request_id="$request_id MASK"
        fi
        echo $request_id;
    fi
done < test.txt
Below is the sample log file:
Apr 10 11:17:35 jalaltu app/web.3: IP_MASKED - - [10/Apr/2020:18:17:35 +0000] "GET /backend/requests/editor/placeholder?shareLinkId=69dff0hba0nv HTTP/1.1" 200 148 "https://jalaltu.com" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0
Apr 10 11:17:35 critical/warning: at=info method=GET path="/backend/requests/editor/placeholder?key=s2fwad2Es2" host=jalaltu.com request_id=b19a87a1-1bbb-4e67-b207-bd9f23d46afa IPA="108.31.000.000" dyno=web.3 connect=0ms service=92ms status=200 bytes=3194 protocol=https
Apr 10 11:17:35 critical/warning: at=info method=GET path="/backend/requests/editor/placeholder?shareLinkId=tosrve4v8q8q" host=jalaltu.com request_id=910b07d1-3f71-4347-a1a7-bfa20384ef65 IPA="108.31.000.000" dyno=web.2 connect=1ms service=17ms status=200 bytes=4435 protocol=https
Apr 10 11:17:35 critical/warning: at=info method=GET path="/backend/requests/editor/placeholder?shareLinkId=tosrve4v8q8q" host=jalaltu.com request_id=097bf65e-e189-4f9f-9dfb-4758cff411b2 IPA="108.31.000.000" dyno=web.3 connect=1ms service=10ms status=200 bytes=4435 protocol=https
Apr 10 11:17:35 jalaltu app/web.2: IP_MASKED - - [10/Apr/2020:18:17:35 +0000] "GET /backend/requests/editor/placeholder?key=s2fwad2Es2 HTTP/1.1" 200 4263 "https://jalaltu.com" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36
Apr 10 11:17:35 critical/warning: at=info method=GET path="/backend/requests/editor/placeholder?shareLinkId=4eiramcmayu0" host=jalaltu.com request_id=d48278c2-5731-464e-be38-ab9ad84ac4a8 IPA="108.31.000.000" dyno=web.4 connect=1ms service=7ms status=200 bytes=3194 protocol=https
Apr 10 11:17:35 jalaltu app/web.3: IP_MASKED - - [10/Apr/2020:18:17:35 +0000] "GET /backend/requests/editor/placeholder?shareLinkId=tosrve4v8q8q HTTP/1.1" 200 4263 "https://jalaltu.com" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36
Apr 10 11:17:35 jalaltu app/web.3: IP_MASKED - - [10/Apr/2020:18:17:35 +0000] "GET /backend/requests/editor/placeholder?shareLinkId=tosrve4v8q8q HTTP/1.1" 200 4263 "https://jalaltu.com" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36
Apr 10 11:17:36 jalaltu app/web.4: IP_MASKED - - [10/Apr/2020:18:17:35 +0000] "GET /backend/requests/editor/placeholder?shareLinkId=4eiramcmayu0 HTTP/1.1" 200 3023 "https://jalaltu.com" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36
Apr 10 11:17:36 critical/warning: at=info method=GET path="/backend/requests/editor/placeholder?shareLinkId=tosrve4v8q8q" host=jalaltu.com request_id=8bb2413c-3c67-4180-8091-000313b8d9ca IPA="MASKED" dyno=web.3 connect=1ms service=32ms status=200 bytes=4435 protocol=https
Apr 10 11:17:36 critical/warning: at=info method=GET path="/backend/requests/editor/placeholder?shareLinkId=tosrve4v8q8q" host=jalaltu.com request_id=10f93da3-2753-48a3-9485-857a93d8a88a IPA="MASKED" dyno=web.3 connect=1ms service=37ms status=200 bytes=4435 protocol=https
Below is the output for the sample log file:
b19a87a1-1bbb-4e67-b207-bd9f23d46afa
910b07d1-3f71-4347-a1a7-bfa20384ef65
097bf65e-e189-4f9f-9dfb-4758cff411b2
d48278c2-5731-464e-be38-ab9ad84ac4a8
8bb2413c-3c67-4180-8091-000313b8d9ca MASK
10f93da3-2753-48a3-9485-857a93d8a88a MASK
Assumptions:
Although request_id always comes before IPA in the sample, I'm going to assume this may not always be the case.
One idea using a single awk invocation (which should be a bit faster than the current bash looping construct with its several sub-process calls to echo/grep/awk):
awk '
/critical[/]warning/ &&                       # if line contains "critical/warning" and ...
/request_id/ { mask=""                        # line contains "request_id", clear the "mask" variable
    for (i=1 ; i<=NF; i++)                    # loop through our input fields
    {   split($(i),arr,"=")                   # split current field on "=", store results in array "arr[]"
        if ( arr[1] == "request_id" )         # if field is "request_id" ...
           { reqid = arr[2] }                 # save the associated id
        if ( arr[1] == "IPA" && arr[2] ~ "MASKED" )   # if field is "IPA" and value matches "MASKED" ...
           { mask = " MASK" }                 # set our "mask" variable
    }
    print reqid mask                          # print our variables
}
' log.dat
NOTE: remove the comments to declutter the code.
The above generates:
b19a87a1-1bbb-4e67-b207-bd9f23d46afa
910b07d1-3f71-4347-a1a7-bfa20384ef65
097bf65e-e189-4f9f-9dfb-4758cff411b2
d48278c2-5731-464e-be38-ab9ad84ac4a8
8bb2413c-3c67-4180-8091-000313b8d9ca MASK
10f93da3-2753-48a3-9485-857a93d8a88a MASK
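If staying in the shell is a requirement, the sub-process calls can also be avoided with case patterns and parameter expansion. What follows is only a minimal sketch, assuming the fields are separated by spaces and that IPA should equal "MASKED" exactly (the awk code above uses a looser regex match). The function name print_request_ids is made up for illustration:

```shell
# Hypothetical helper: print the request_id of every "critical/warning" line
# in the file named by $1, appending " MASK" when IPA is exactly "MASKED".
print_request_ids() {
    local line padded rest reqid
    # "|| [ -n ... ]" also handles a last line that lacks a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip lines that are not "critical/warning" entries
        case $line in
            *critical/warning*) ;;
            *) continue ;;
        esac
        # Pad with spaces so a key at the start or end of the line still matches
        padded=" $line "
        case $padded in
            *" request_id="*) ;;
            *) continue ;;
        esac
        rest=${padded#*" request_id="}   # drop everything up to and including the key
        reqid=${rest%%[[:space:]]*}      # keep the value up to the next whitespace
        # The order of request_id and IPA on the line does not matter here
        case $padded in
            *' IPA="MASKED" '*) reqid="$reqid MASK" ;;
        esac
        printf '%s\n' "$reqid"
    done < "$1"
}
```

Called as print_request_ids log.dat, this should print the same six lines as above for the sample file. For large logs the single awk invocation is still likely the better choice, since shell read loops are slow.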