Using pcregrep to grep multiple lines


I have a file with the following pattern.

Foo $var1
.........
.........

Foo $var2 
..........
..........
..........
Yes

I would like to match only the "section" that starts with "Foo" and has "Yes". (Note that each section is followed by an empty line.)

The expected output is:

Foo $var2 
..........
..........
..........
Yes

I tried

pcregrep -M "^Foo(.|\n)*^Yes"

Unfortunately, this starts matching at the previous section and lumps it together with the section that has the "Yes". Instead of one section that starts with "Foo" and has "Yes", I get every earlier section that starts with "Foo" as well.

My dilemma is how to discard the previous match if at the end of the section I could not see "Yes" though I matched "Foo".

I tried to use a lookbehind, but a lookbehind cannot match a variable-length pattern.


Solution

  • You could match Foo at the start of a line, then match all following lines that do not start with either Yes or Foo, and finally match the line that starts with Yes.

    If Foo and Yes should not be part of a larger word, you could add a word boundary \b:

    ^Foo\b.*(?:\n(?!Yes\b|Foo\b).*)*\nYes\b
    

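    In parts

      • ^Foo\b.* matches a line that starts with the word Foo
      • (?: opens a non-capturing group
      • \n(?!Yes\b|Foo\b) matches a newline, then asserts with a negative lookahead that the next line does not start with Yes or Foo
      • .* matches the rest of that line
      • )* closes the group and repeats it zero or more times
      • \nYes\b matches a newline followed by the word Yes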

    For example

    pcregrep -Mo '^Foo\b.*(?:\n(?!Yes\b|Foo\b).*)*\nYes\b' file
    

    Output

    Foo $var2
    ..........
    ..........
    ..........
    Yes
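
If pcregrep is not at hand, the same pattern can be checked with Python's re module. This is only a sketch: the SAMPLE text mirrors the file from the question, find_sections is a hypothetical helper name, and re.MULTILINE is assumed to stand in for the per-line ^ anchoring that pcregrep -M provides.

```python
import re

# Sample input mirroring the question: two "Foo" sections,
# only the second one ends with a "Yes" line.
SAMPLE = (
    "Foo $var1\n"
    ".........\n"
    ".........\n"
    "\n"
    "Foo $var2\n"
    "..........\n"
    "..........\n"
    "..........\n"
    "Yes\n"
)

# Same pattern as in the pcregrep command above.
# re.MULTILINE lets ^ match at the start of every line.
SECTION = re.compile(r"^Foo\b.*(?:\n(?!Yes\b|Foo\b).*)*\nYes\b", re.MULTILINE)


def find_sections(text):
    """Return every block that starts with a Foo line and ends with a Yes line."""
    return [m.group(0) for m in SECTION.finditer(text)]


if __name__ == "__main__":
    for section in find_sections(SAMPLE):
        print(section)
```

Because the lookahead rejects any line that starts with Foo, a section without a Yes cannot run on into the next section, which is what the (.|\n)* attempt allowed.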