I'm trying to find phone numbers in any of the following formats: +1.570.555.1212, 570.555.1212, (570)555-1212, and 570-555-1212. We also need to look in compressed folders using zgrep, but when I try that, my code comes back with "No matches found". The code below works as it is for finding phone numbers in txt files. It is very bad, but here it is:
Code:
#!/bin/bash
egrep '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}.[0-9]{3}.[0-9]{4}|([0-9]{3})[0-9]{3}-[0-9]{4}|+(1).[0-9]{3}.[0-9]{3}.[0-9]{4}' *
if [ $? -eq 0 ] ; then echo $1 ; else echo "No matches found" ; fi 2>/dev/null
zgrep without any options is equivalent in its regex capabilities to grep; you need to say zgrep -E if you want to use grep -E (aka egrep) regex syntax when searching compressed files.
#!/bin/bash
if zgrep -E -q '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}.[0-9]{3}.[0-9]{4}|([0-9]{3})[0-9]{3}-[0-9]{4}|+(1).[0-9]{3}.[0-9]{3}.[0-9]{4}' *
then
echo "$1"
else
echo "No matches found" >&2
fi
Notice also "Why is testing “$?” to see if a command succeeded or not, an anti-pattern?" and "When to wrap quotes around a shell variable", as well as the preference for -q over redirecting to /dev/null, and the display of error messages on standard error (the >&2 redirection).
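As a rough illustration of the difference between zgrep and zgrep -E, the sketch below builds a throwaway gzip file and runs the same pattern both ways. The temporary directory, the sample file name, and the variable names are made up for the demonstration, and it assumes gzip and zgrep are installed.

```shell
#!/bin/bash
# Hypothetical demo: the file name and sample text are invented for illustration.
tmpdir=$(mktemp -d)
printf 'call 570-555-1212 today\n' | gzip > "$tmpdir/sample.txt.gz"

pattern='[0-9]{3}-[0-9]{3}-[0-9]{4}'

# Plain zgrep uses basic regex syntax, where {3} means the literal text "{3}".
if zgrep -q "$pattern" "$tmpdir/sample.txt.gz"
then
    basic_result="match"
else
    basic_result="no match"
fi

# zgrep -E uses extended syntax, where {3} is a repetition count.
if zgrep -E -q "$pattern" "$tmpdir/sample.txt.gz"
then
    extended_result="match"
else
    extended_result="no match"
fi

echo "without -E: $basic_result"
echo "with -E: $extended_result"

rm -rf "$tmpdir"
```

Testing the command directly in the if statement, with -q to suppress output, avoids both the separate $? check and the redirection to /dev/null.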
Your regex could also use some refactoring; maybe try
(\+\(1\).)?[0-9]{3}.[0-9]{3}.[0-9]{4}
Notice how round brackets and the plus sign need to be backslash-escaped to match literally, and how, after refactoring out the +(1) prefix as optional, the rest of the regex subsumes all the other variants you had enumerated, because . matches - and ( and . and many other characters. (The optional prefix could also be dropped completely and this would still match the same strings, but I had to guess some things so I am leaving it in with this remark.)
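To check that claim, here is a small sketch that runs the refactored regex against the four sample numbers from the question. The loop and the variable names are only for illustration; it uses plain grep -E on standard input, so no compressed files are involved.

```shell
#!/bin/bash
# The regex suggested above; the sample numbers come from the question.
regex='(\+\(1\).)?[0-9]{3}.[0-9]{3}.[0-9]{4}'
matched=0

for number in '+1.570.555.1212' '570.555.1212' '(570)555-1212' '570-555-1212'
do
    # grep -q only reports success or failure through its exit status.
    if printf '%s\n' "$number" | grep -E -q "$regex"
    then
        echo "match: $number"
        matched=$((matched + 1))
    else
        echo "no match: $number" >&2
    fi
done

echo "$matched of 4 formats matched"
```

Because grep looks for the pattern anywhere in the line, each sample is accepted as soon as some part of it fits; the unescaped dots will also accept separators that are not in the list, so a stricter pattern may be preferable if false positives matter.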