Tags: regex, bash, grep, pcre, pcregrep

Bash: how to capture certain lines but exclude certain lines in between?


I have a file that looks like this:

a: 0
a: 0
a: 0
a: 1
b: 1
c: 1
d: 1
e: 1
f: 1
a: 2
b: 2
c: 2
d: 2
e: 2
f: 2
a: 3
b: 3
c: 3
d: 3
e: 3
f: 3
c: 4
c: 4
c: 4

I want to capture and output all of the a and c lines of the form <a line><anything other than an a or c line><c line> so the output would look like:

a: 1
c: 1

a: 2
c: 2

a: 3
c: 3

Note that neither the a: 0 lines at the beginning nor the c: 4 lines at the end are captured because they don't follow the pattern I mentioned. Note also that the b lines between the a and c lines are removed.

I've been trying to do this with lookarounds using pcregrep in Bash, but haven't found a solution yet. Any ideas?

Thanks!


Solution

  • Using awk

    Try:

    $ awk -F: '$1=="a"{aline=$0} $1=="c"{if(aline)print aline ORS $0 ORS; aline=""}' file
    a: 1
    c: 1
    
    a: 2
    c: 2
    
    a: 3
    c: 3
    

    How it works

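    By default, awk reads in one line at a time. The option -F: makes the colon the field separator, so $1 is the letter at the start of each line. When $1 is a, the whole line is saved in the variable aline, replacing any a line saved earlier. When $1 is c, the saved a line (if there is one) is printed, followed by the c line and an empty line; aline is then cleared so that later c lines print nothing until a new a line has been seen.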
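To see which lines the awk command above keeps and drops, it can help to run it on a smaller input. The following is only a sketch: the file name sample.txt and its contents are made up for illustration and are not part of the original question.

```shell
# Hypothetical sample: a stray c line first, then two a lines before the next c.
printf '%s\n' 'c: 0' 'a: 1' 'a: 2' 'b: 2' 'c: 2' 'c: 3' > sample.txt

out=$(awk -F: '
    # Remember the most recent a line; an earlier one is overwritten.
    $1=="a" { aline = $0 }
    $1=="c" {
        # Print the pair only when an a line is pending.
        if (aline) print aline ORS $0 ORS
        # A c line always clears the pending a line.
        aline = ""
    }' sample.txt)

printf '%s\n' "$out"
```

Here c: 0 is skipped because no a line precedes it, a: 1 is replaced by a: 2 before any c line arrives, and c: 3 is skipped because c: 2 has already consumed the pending a line.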

    Multiline version

    For those who prefer their commands spread over several lines:

    awk -F: '
        $1=="a"{
            aline=$0
        }
    
        $1=="c"{
            if(aline)
                print aline ORS $0 ORS
            aline=""
        }' file
    

    Using sed

    $ sed -n '/^a/h; /^c/{x;/^a/{p;x;s/$/\n/;p};h}' file
    a: 1
    c: 1
    
    a: 2
    c: 2
    
    a: 3
    c: 3
    
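The sed one-liner above is dense, so here is the same script written with one command per line and a comment for each step. This is a sketch that assumes GNU sed (the \n in the replacement text and comments inside a block are not guaranteed elsewhere); the file name sample.txt and its contents are hypothetical.

```shell
# Hypothetical sample input, shortened from the question.
printf '%s\n' 'a: 0' 'a: 1' 'b: 1' 'c: 1' 'c: 4' > sample.txt

out=$(sed -n '
    # An a line: copy it into the hold space, replacing any earlier one.
    /^a/h
    /^c/{
        # Swap: pattern space gets the held line, hold space gets this c line.
        x
        /^a/{
            # The held line was an a line, so print it.
            p
            # Swap back to get the c line into the pattern space.
            x
            # Append a newline so that an empty line follows the pair.
            s/$/\n/
            p
        }
        # Overwrite the hold space so that no a line stays pending.
        h
    }' sample.txt)

printf '%s\n' "$out"
```

With this input only a: 1 and c: 1 are printed: a: 0 is overwritten in the hold space by a: 1, and c: 4 finds no pending a line.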

    How it works