
grep -f using -i option


I want to use grep with the -f, -i, and -v options. I have a pattern file with the following contents:

vchkpw-pop3
vchkpw-submission
user_unknown
unknown_user
address_rejected
no_such_user
does_not_exist
invalid_recipient
mailbox_unavailable
user_not_found
no_mailbox_here

and I want to exclude all of the above terms when I am processing my qmail mail log files.

With grep 2.5.1, this doesn't appear to work for any pattern from the third one onward.

I am using a single line of bash to parse my maillog file:

cat /var/log/maillog | tai64n2tai | awk '{$1="";$2="";$3="";$4="";$5="";print}' |
grep -v vchkpw-pop3 | grep -v vchkpw-submission | awk '{sub(/^[ \t]+/,"")};1' |
qlogselect start $STARTDAY end $ENDDAY | matchup > $QMAILSTATS 5>/dev/null

Instead of chaining multiple grep -v "sometext" commands in the pipeline, I wanted to use a single grep -vif patterns.txt in their place.

However, my problem is that my version of grep won't let me use the -f and -i options together if the patterns contain an underscore (_). If I remove the underscore, the patterns match as expected.
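A quick way to check whether a particular grep build handles this combination is a small repro like the sketch below. The sample pattern, the shortened log lines, and the temporary paths are made up for illustration; they are not taken from the real maillog. If grep handles -v, -i, and -f together correctly, only the second sample line should be printed.

```shell
#!/bin/sh
# Hypothetical repro: the pattern and log lines are illustrative only.
tmpdir=$(mktemp -d)
printf '%s\n' 'user_unknown' > "$tmpdir/patterns.txt"
printf '%s\n' \
  'failure: Remote_host_said:_550_5.1.1_User_unknown/' \
  'success: did_0+0+1/' > "$tmpdir/sample.log"

# -v inverts the match, -i ignores case, -f reads the patterns from a file.
kept=$(grep -v -i -f "$tmpdir/patterns.txt" "$tmpdir/sample.log")
printf '%s\n' "$kept"

rm -rf "$tmpdir"
```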

Here is an example of a line that I want to omit when parsing my maillog:

Sep 20 15:46:50 m qmail: 1348123610.323831 delivery 11150428: failure: 204.119.19.51_does_not_like_recipient./Remote_host_said:_550_5.1.1_User_unknown/Giving_up_on_204.119.19.51./ 

Since the error message is dependent upon the mail server I am contacting, sometimes the pattern user_unknown has capital letters and sometimes it doesn't.

Does anyone have a better solution?

I like the idea of not having to edit the one-line bash command every time, and instead just adding or removing a pattern in a file.


Solution

  • Here's one way using GNU awk, assuming you have the patterns saved in a file called patterns.txt. These are the contents of script.awk:

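    BEGIN {
        IGNORECASE=1    # gawk extension: case-insensitive regex matching
    }
    
    # First file (patterns.txt): store each distinct, non-empty pattern
    FNR==NR {
        if ($0 != "" && !($0 in patterns)) {
            patterns[$0]++
            counter++
        }
        next
    }
    
    # Log lines: blank the first five fields and trim leading whitespace
    {
        $1=$2=$3=$4=$5=""
        sub(/^[ \t]+/,"")
    
        # Count the patterns that do NOT match this line
        for (i in patterns) {
            if ($0 !~ i) {
                count++
            }
        }
    
        # Print only non-empty lines that matched none of the patterns
        if (counter == count \
            && !/^$/) {
                print
        }
    
        count = 0
    }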
    

    Run like this:

    tai64n2tai < /var/log/maillog | awk -f script.awk patterns.txt - | qlogselect start $STARTDAY end $ENDDAY | matchup > $QMAILSTATS 5>/dev/null
    

    Alternatively, if you prefer not to keep a separate script file, the same logic fits in a one-liner:

    tai64n2tai < /var/log/maillog | awk 'BEGIN { IGNORECASE=1 } FNR==NR { if ($0 != "" && !($0 in patterns)) { patterns[$0]++; counter++ } next } { $1=$2=$3=$4=$5=""; sub(/^[ \t]+/,""); for (i in patterns) { if ($0 !~ i) { count++ } } if (counter == count && !/^$/) { print } count = 0 }' patterns.txt - | qlogselect start $STARTDAY end $ENDDAY | matchup > $QMAILSTATS 5>/dev/null
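If GNU awk is not available, IGNORECASE has no effect. A possible variation, sketched below, lowercases both the patterns and each line with tolower() and tests for plain substrings with index(). Note that this is a different technique from the script above: the patterns are treated as fixed strings rather than regular expressions, and the field stripping is left out to keep the example short. The sample patterns, log lines, and temporary paths are invented for the demonstration.

```shell
#!/bin/sh
# Sketch only: sample data and temp paths are hypothetical.
tmpdir=$(mktemp -d)
printf '%s\n' 'user_unknown' 'does_not_exist' > "$tmpdir/patterns.txt"
printf '%s\n' \
  'delivery 1: failure: Remote_host_said:_550_5.1.1_User_unknown/' \
  'delivery 2: success: did_0+0+1/' \
  'delivery 3: failure: Mailbox_Does_Not_Exist/' > "$tmpdir/sample.log"

filtered=$(awk '
    # First file: remember each non-empty pattern, lowercased
    FNR == NR { if ($0 != "") pats[tolower($0)] = 1; next }
    # Remaining input: skip the line if any pattern occurs in it as a substring
    {
        line = tolower($0)
        for (p in pats) if (index(line, p)) next
        print
    }
' "$tmpdir/patterns.txt" "$tmpdir/sample.log")

printf '%s\n' "$filtered"
rm -rf "$tmpdir"
```

In the real pipeline, the awk program would read patterns.txt and - (standard input) in place of the two sample files, in the same way as the gawk version.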