linux · bash · shell · unix · sh

How to use grep to check whether a line matching a pattern already exists, and write it to another file?


I have a log file with contents like this:

The color is Orange then can used to binding.
The color is Black then can used to binding.
The animal Lion need to carefully. 
The color is Black then can used to binding.
The animal Zebra need to carefully. 
The animal Tiger need to carefully.
The animal Bee need to skip.
The color is White then can used to binding.
The color is Yellow then can used to binding.
The animal Ant need to skip.
The animal Tiger need to carefully. 
The color is Red then can used to filled.
The color is Green then can used to filled.

I want to check whether a line matches one of the patterns and does not already exist in the destination file, and if so, write it to another log.

#!/bin/bash
source_file="/home/user1/source/source_sample.log"
dest_file="/home/user1/dest/dest_sample.log"

#define line with pattern
pattern1=".*then can used to binding"
pattern2=".*need to carefully"

#check if a line matching the pattern already exists in the destination file; if not, write it
grep -e "$pattern1" -e "$pattern2" "$source_file" >> "$dest_file"

The expected output in dest_sample.log would be like this: if a line matching a pattern does not exist in the destination file, write it; if it already exists, don't write it again.

The color is Orange then can used to binding.
The color is Black then can used to binding.
The animal Lion need to carefully. 
The animal Zebra need to carefully. 
The animal Tiger need to carefully.
The color is White then can used to binding.
The color is Yellow then can used to binding.

Solution

  • I think the language barrier is causing some trouble here.
    Apologies if I miss what you want.

    First, there are some lines in your data which look identical, but aren't; some have spaces at the end while others do not.

    I removed the spaces before processing. If they matter, just leave them. Other answers here have examples of removing them in-process, though I could add that logic if you like.
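
    For what it's worth, a minimal in-process version (assuming you only want trailing spaces and tabs dropped) is to run the source through sed as the first stage of the pipelines below:

    sed 's/[[:space:]]*$//' "$source_file"   # strip trailing whitespace from every line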

    What you asked was a way to do this with grep:

    grep -Fxvf "$dest_file" "$source_file" | # fixed-string, whole-line, inverted match
      grep -E 'then can used to binding|need to carefully' >> "$dest_file"
    

    It requires two passes.

    First, a -F (fixed-string), -x (exact whole-line match), -v (inverted) scan of $source_file, using the lines of $dest_file as the -f file of match strings. This gives us only the lines from $source_file that are NOT already in $dest_file.

    Pipe this to grep -E 'then can used to binding|need to carefully' (or grep -e "$pattern1" -e "$pattern2") to select the lines you want from the output of the first grep, and append them to $dest_file.
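
    Using the pattern variables from your script, the same pipeline is:

    grep -Fxvf "$dest_file" "$source_file" |
      grep -e "$pattern1" -e "$pattern2" >> "$dest_file"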

    If it is possible there can be duplicate new lines in $source_file, you may want to eliminate the dups before adding them to $dest_file:

    grep -Fxvf "$dest_file" "$source_file" | # fixed-string search
      grep -E 'then can used to binding|need to carefully' |
      sort -u >> "$dest_file"
    

    This will alter your ordering unless $source_file has order you can reproduce, but you said order doesn't matter.
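
    If you would rather drop the duplicates without reordering, the usual awk dedup idiom can stand in for sort -u:

    grep -Fxvf "$dest_file" "$source_file" |
      grep -E 'then can used to binding|need to carefully' |
      awk '!seen[$0]++' >> "$dest_file"   # print each line only the first time it appears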

    A Better Way

    What you asked was a way to use grep, but awk can do this in one process, one pass (each file), without reordering your data.

    awk 'NR==FNR{seen[$0]=1;next}                                    # 1st file (dest): remember every existing line
      1==seen[$0]{next}                                              # skip anything already in dest or already printed
      /then can used to binding|need to carefully/{seen[$0]=1;print} # print new matching lines and remember them
    ' "$dest_file" "$source_file" >> "$dest_file"
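
    For reference, here is the whole thing packaged as a script, a sketch using the paths from your question. Appending back to $dest_file is safe because awk finishes reading it before it prints anything. The only tweak is testing FILENAME==ARGV[1] instead of NR==FNR, so the first run still behaves when $dest_file starts out empty or missing (the >> redirection creates it before awk reads it; with an empty first file, NR==FNR would also be true for every line of the source).

    #!/bin/bash
    source_file="/home/user1/source/source_sample.log"
    dest_file="/home/user1/dest/dest_sample.log"

    # 1st file: remember the destination lines that already exist
    # 2nd file: append only new lines that match one of the two phrases
    awk 'FILENAME==ARGV[1]{seen[$0]=1;next}
      1==seen[$0]{next}
      /then can used to binding|need to carefully/{seen[$0]=1;print}
    ' "$dest_file" "$source_file" >> "$dest_file"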