Bash: how to capture certain lines but exclude certain lines in between?

I have a file that looks like this:

a: 0
a: 0
a: 0
a: 1
b: 1
c: 1
d: 1
e: 1
f: 1
a: 2
b: 2
c: 2
d: 2
e: 2
f: 2
a: 3
b: 3
c: 3
d: 3
e: 3
f: 3
c: 4
c: 4
c: 4

I want to capture and output all of the a and c lines of the form <a line><anything other than an a or c line><c line> so the output would look like:

a: 1
c: 1

a: 2
c: 2

a: 3
c: 3

Note that neither the a: 0 lines at the beginning nor the c: 4 lines at the end are captured because they don't follow the pattern I mentioned. Note also that the b lines between the a and c lines are removed.

I've been trying to do this with lookarounds using pcregrep in Bash, but haven't found a solution yet. Any ideas?

Thanks!

Solution

Using awk

Try:

$ awk -F: '$1=="a"{aline=$0} $1=="c"{if(aline)print aline ORS $0 ORS; aline=""}' file
a: 1
c: 1

a: 2
c: 2

a: 3
c: 3

How it works

By default, awk reads in one line at a time.

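-F:

This tells awk to use : as the field separator, so $1 is the text before the first colon.

$1=="a"{aline=$0}

Every time an a line is observed, save the line in the variable aline. A later a line overwrites an earlier one, so only the a line closest to the next c line is kept.

$1=="c"{if(aline)print aline ORS $0 ORS; aline=""}

Every time a c line is observed, check whether aline is nonempty. If so, print aline and the current line, each followed by the output record separator ORS (a newline by default). Because print appends one more ORS, a blank line follows each pair. Then reset aline to an empty string, so that a second c line is not paired with the same a line.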
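
To see the edge cases in action, here is a minimal sketch that feeds a shortened input to the same awk program. The function name extract_pairs is only an illustrative wrapper and is not part of the original command.

```shell
# extract_pairs is a hypothetical helper name; it reads the input on stdin.
extract_pairs() {
    awk -F: '$1=="a"{aline=$0} $1=="c"{if(aline)print aline ORS $0 ORS; aline=""}'
}

# Shortened input: a repeated a line at the start, an extra c line at the end.
printf '%s\n' 'a: 0' 'a: 1' 'b: 1' 'c: 1' 'c: 4' | extract_pairs
```

This should print a: 1 and c: 1 followed by a blank line. The a: 0 line is overwritten before any c line arrives, and c: 4 is skipped because aline was already reset.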

Multiline version

For those who prefer their commands spread over several lines:

awk -F: '
    $1=="a"{
        aline=$0
    }

    $1=="c"{
        if(aline)
            print aline ORS $0 ORS
        aline=""
    }' file
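
If the two keys are not always a and c, the same logic can be parameterized. The following is only a sketch: the function name extract_between and the awk variables startkey, endkey and held are made up for illustration, and the keys are passed in with awk's -v option.

```shell
# extract_between START END: hypothetical helper, reads the input on stdin.
extract_between() {
    awk -F: -v startkey="$1" -v endkey="$2" '
        # Remember the most recent line whose key matches the start key.
        $1==startkey { held=$0 }

        # On an end-key line, print the pair only if a start line is pending.
        $1==endkey {
            if (held != "")
                print held ORS $0 ORS
            held=""
        }'
}

# Same behavior as the a/c version above:
# extract_between a c < file
```

The sketch assumes the two keys differ and contain no backslashes, since awk processes escape sequences in -v values.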

Using sed

$ sed -n '/^a/h; /^c/{x;/^a/{p;x;s/$/\n/;p};h}' file
a: 1
c: 1

a: 2
c: 2

a: 3
c: 3

How it works

-n

This tells sed not to print anything unless we explicitly ask it to.

/^a/h

Any time we have a line that starts with a, we save it to the hold space.

/^c/{ x; /^a/{ p; x; s/$/\n/; p}; h}

Any time we have a line that starts with c, we:
- Swap (x) the pattern space with the hold space.
- If the new pattern space starts with a, print (p) it, swap (x) again, append a newline to the end of the new pattern space (s/$/\n/), and print (p) it. This relies on GNU sed, which treats \n in the replacement as a newline.
- Lastly, save (h) the current pattern space to the hold space. After a match, that is the c line; otherwise it is whatever was held before. Either way the hold space no longer starts with a, so a later c line is not paired with the same a line.
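
The one-liner uses \n in the replacement, which GNU sed turns into a newline but other sed implementations may not. As a hedged alternative, the sketch below prints the c line, empties the pattern space, and prints it again to get the blank line. The function name extract_pairs_sed is invented for illustration, and the script has not been checked against every sed implementation.

```shell
# extract_pairs_sed is a hypothetical helper name; it reads the input on stdin.
extract_pairs_sed() {
    sed -n '
    /^a/h
    /^c/{
        x
        /^a/{
            p
            x
            p
            s/.*//
            p
        }
        h
    }'
}

printf '%s\n' 'c: 0' 'a: 1' 'b: 1' 'c: 1' 'c: 4' | extract_pairs_sed
```

Here the final h stores an empty pattern space after a match, which serves the same purpose as storing the c line: the hold space no longer starts with a.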