[SOLVED] Only print if the number in a field is greater than a value with awk

I'm still new to awk. What am I doing wrong? Apologies for the poor description; I have reformulated the question.

Goal

Print only the number in the second field, and only if that number is greater than 20

lorem v3  <--- no print
ipsum v5  <--- no print
text v21  <--- print "21"
expla v12 <--- no print

My attempt that does not work

awk ' { sub("^v","",$2); if ( $2 > 20 ) print $2 } '

Solution

Addressing OP's question about why the current code outputs 3:

Initially awk doesn't know if $2 is a number or a string.

The sub() call (a string function) assigns its result back to $2 as a string, so $2 is treated as a string from that point on.

This leads to $2 > 20 being treated as a string comparison ('3' > '20'). Strings are compared character by character, and since '3' sorts after '2', the string '3' is greater than the string '20', so a 3 is output.
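The difference between the two kinds of comparison can be seen in isolation. The following is only a minimal sketch: the function name compare_demo and the variable names are made up for illustration, and the string "3" stands in for $2 after sub() has modified it.

```shell
# Hypothetical demo (not part of the original answer).
compare_demo() {
  awk 'BEGIN {
    s = "3"                    # a string value, like $2 after sub()
    as_string = (s > 20)       # string vs. number: compared as strings, "3" > "20"
    as_number = (s + 0 > 20)   # adding 0 forces a numeric comparison, 3 > 20
    print as_string, as_number
  }'
}

compare_demo
```

The first value printed is 1 (true, string comparison) and the second is 0 (false, numeric comparison).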

To facilitate a numeric comparison we need a way to force awk to re-evaluate $2 as a number. One method is to add a zero, i.e., $2+0. Making this one change to OP's current code:

$ echo "lorem v3" | awk ' { sub("^v","",$2); if ( $2+0 > 20 ) print $2 } '
           <<< no output

NOTE: for more details see GNU awk - variable typing

Addressing the latest change to the question:

Sample input:

$ cat input.dat
lorem v3
ipsum v5
text v21
expla v12

Running our awk code (additional print added for clarification) against input.dat:

$ awk ' { print "######",$0; sub("^v","",$2); if ( $2+0 > 20 ) print $2 } ' input.dat
###### lorem v3
###### ipsum v5
###### text v21
21
###### expla v12
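
As a possible variation, the number can be extracted into its own variable instead of modifying $2. This is only a sketch, and it assumes the second field is always a "v" followed by digits; the function name print_over_20 and the variable n are made up for illustration.

```shell
# Hypothetical helper (not from the original answer): prints the numeric part
# of field 2 when it exceeds 20, leaving $2 and $0 untouched.
print_over_20() {
  awk '$2 ~ /^v[0-9]+$/ {
    n = substr($2, 2) + 0      # drop the leading "v"; "+ 0" forces a number
    if (n > 20) print n
  }' "$@"
}

printf 'lorem v3\nipsum v5\ntext v21\nexpla v12\n' | print_over_20
```

Lines whose second field does not match the assumed pattern are skipped rather than compared, and because $2 is never assigned to, $0 is not rebuilt.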