I am using FreeBSD (on a Citrix NetScaler). I need to extract the Mbps values from a log that has hundreds of thousands of lines.
The log looks something like this, where the Mbps number, with decimals, can range from 0.0 to 9999.99 or more, e.g.:
#>alphatext_anylength... (more_alphatext_in brackets)... Mbps (1.0)… alphatext_anylength... (more_alphatext_in brackets)...
#>alphatext_anylength... (more_alphatext_in brackets)... Mbps (500.15)… alphatext_anylength... (more_alphatext_in brackets)...
#>alphatext_anylength... (more_alphatext_in brackets)... Mbps (1500.01)… alphatext_anylength... (more_alphatext_in brackets)...
Now the challenge: I want to print only the bracketed Mbps numbers that are A) greater than 500 Mbps, B) together with their line numbers. For the sample output above, I want to see only the following:
#>[line number 20] 500.15
#>[line number 55] 1500.01
I have tried:
cat output.log | sed -n -e 's/^.*Mbps//p' |cut -c 3-10
This gives me characters 3 to 10 of whatever follows Mbps, but it is not smart enough to show only the bracketed decimal numbers that are greater than 500 Mbps.
I appreciate this might be a bit of a challenge, but I would be grateful to any bash scripting wizards out there who can create some magic!
Thanks in advance!
You can use awk to match the lines containing "Mbps (" followed by any characters other than ")" and then a closing ")".
Then replace everything from the beginning of the line up to "Mbps (" with an empty string, and likewise everything from ")" to the end of the line.
If what remains, converted to a number (+0), is greater than 500, print the line number and the value.
awk '
/Mbps \([^)]*\)/{
  sub(/.*Mbps \(/, ""); sub(/\).*/, "")   # keep only the bracketed value
  if ($0+0 > 500) print FNR, $0           # compare on matching lines only
}
' file
Edit: To match lines where the space after Mbps is optional, with a value > 50, use
awk '
/Mbps ?\([^)]*\)/{
  sub(/.*Mbps ?\(/, ""); sub(/\).*/, "")  # keep only the bracketed value
  if ($0+0 > 50) print FNR, $0            # compare on matching lines only
}
' file
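
If the threshold is likely to change, the same idea can be wrapped in a small shell function. This is only a sketch: the function name mbps_over and the awk variables min and val are made up for illustration and are not part of the original answer. It works on a copy of the line, so $0 is left untouched, and it takes the threshold as the first argument.

```shell
# Hypothetical helper: print "line_number value" for every bracketed
# Mbps value that is greater than the given threshold.
# Usage: mbps_over THRESHOLD FILE
mbps_over() {
    awk -v min="$1" '
        /Mbps ?\([^)]*\)/ {
            val = $0
            sub(/.*Mbps ?\(/, "", val)   # drop everything up to the opening bracket
            sub(/\).*/, "", val)         # drop the closing bracket and the rest
            if (val + 0 > min) print FNR, val
        }
    ' "$2"
}
```

For the log in the question it would be called as: mbps_over 500 output.log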