[SOLVED] Select only those rows from a column where column 2 has more than 2 leading zeroes in Linux

I want to select only the rows that have two or more leading zeroes in the ID column ($2), looking at the 5th through 10th characters. For example, if column 2 holds the ID 156700923134, characters 5 through 10 (1567-009231-34) are 009231, which starts with zeroes. In the second row the ID is 777754635373, so the substring is 546353, which has no leading zeroes. I am working with a pipe-delimited file.

Ex: 1 | 156700923134 | hkohin | 23

4 | 777754635373 | hhkdys | 45

3 | 678387700263 | ieysff | 09

Expected output: 1 | 156700923134 | hkohin | 23 
                        --OR-- 
                     156700923134 

So far I get the substrings 009231, 546353 and 877002 as output, but I don't know how to check them for leading zeroes.

This is what I used to get to the above result:
awk -F'|' '{print $2, substr($2, 5, 6) }' file.dat | head -5

Solution

The test condition in front of the action block, written here in (), can be any valid awk expression:

awk -F'|' '( match($2,"^....00") ) { print $2, substr($2, 5, 6) }' file.dat
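
If the whole row is wanted, as in the expected output of the question, the same test can be written with substr() and no action block, since awk prints the matching line by default. This is only a sketch: the `rows` and `result` variables are made up for illustration, and it assumes the real file has no spaces around the pipes (which the substr output shown in the question suggests).

```shell
# Sample rows from the question, assumed to have no spaces around the pipes.
rows='1|156700923134|hkohin|23
4|777754635373|hhkdys|45
3|678387700263|ieysff|09'

# Keep rows whose ID has "00" at characters 5-6.
# With no action block, awk prints the whole matching row.
result=$(printf '%s\n' "$rows" | awk -F'|' 'substr($2, 5, 2) == "00"')
printf '%s\n' "$result"
```

To print only the ID, add the action `{ print $2 }` after the condition. If the file does contain spaces around the pipes, the character offsets shift by one and need adjusting.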

Answer #2:

A generic version takes a few more lines:

zstart=5
zcnt=3
zeros=$(eval printf '0%.0s' {1..$zcnt})

echo 'xxx|1234000890|end' |
  awk -F'|' -v zstart="$zstart" -v zcnt="$zcnt" -v zeros="$zeros" '
     ### debug { print substr($2, zstart, zcnt); }
     (zeros == substr($2, zstart, zcnt)) { print }'
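
A possible variant builds the string of zeroes inside awk, so it needs neither eval nor bash brace expansion. This is a sketch under the same assumptions as above (start position 5, three zeroes); the `result` variable is only there for illustration.

```shell
zstart=5   # 1-based position where the run of zeroes must begin
zcnt=3     # number of zeroes required at that position

result=$(echo 'xxx|1234000890|end' |
  awk -F'|' -v zstart="$zstart" -v zcnt="$zcnt" '
    # Build a string of zcnt zeroes once, before any input is read.
    BEGIN { for (i = 0; i < zcnt; i++) zeros = zeros "0" }
    # Print the row when the substring at zstart is exactly that string.
    substr($2, zstart, zcnt) == zeros')
printf '%s\n' "$result"
```

Setting zcnt=2 matches the "two or more leading zeroes" wording of the question, because any longer run of zeroes also begins with two.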