I have a file with the following pattern.
Foo $var1
.........
.........

Foo $var2
..........
..........
..........
Yes

I would only like to match the "section" which starts with "Foo" and has "Yes". (You will notice there is an empty line feed at the end of each section)
The expected output should be.
Foo $var2
..........
..........
..........
Yes
I tried
pcregrep -M "^Foo(.|\n)*^Yes"
Unfortunately, this starts matching at the previous section and lumps it together with the section that has the "Yes". So instead of one section that starts with "Foo" and contains "Yes", I also get every section before it that started with "Foo".
My dilemma is how to discard the previous match if at the end of the section I could not see "Yes" though I matched "Foo".
I tried to use a lookbehind, but it cannot be used with variable-length patterns.
You could match Foo at the start of a line, followed by all lines that do not start with either Yes or Foo, and then a line that starts with Yes.
If Foo and Yes should not be part of a larger word, you could use a word boundary \b:
^Foo\b.*(?:\n(?!Yes\b|Foo\b).*)*\nYes\b
In parts
^
  Start of the line
Foo\b.*
  Match Foo and a word boundary, followed by 0+ times any char except a newline
(?:
  Non-capturing group
\n
  Match a newline
(?!Yes\b|Foo\b)
  Negative lookahead, assert that what is directly to the right is not Yes or Foo
.*
  Match any char except a newline 0+ times
)*
  Close the group and repeat it 0+ times
\nYes\b
  Match a newline, then Yes followed by a word boundary
For example
pcregrep -Mo '^Foo\b.*(?:\n(?!Yes\b|Foo\b).*)*\nYes\b' file
Output
Foo $var2
..........
..........
..........
Yes
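To check the pattern without pcregrep, here is a minimal sketch using Python's re module. It assumes that re.MULTILINE gives ^ the same line-anchored meaning it has under pcregrep -M, and that the sample text (a hypothetical reconstruction of the file in the question, with a blank line after each section) matches the real input closely enough. It also runs the original greedy pattern for contrast.

```python
import re

# Hypothetical sample mirroring the file in the question:
# two sections, each followed by an empty line.
text = (
    "Foo $var1\n"
    ".........\n"
    ".........\n"
    "\n"
    "Foo $var2\n"
    "..........\n"
    "..........\n"
    "..........\n"
    "Yes\n"
    "\n"
)

# re.MULTILINE lets ^ match at the start of every line, not only at
# the start of the whole string.
pattern = re.compile(r"^Foo\b.*(?:\n(?!Yes\b|Foo\b).*)*\nYes\b", re.MULTILINE)

# Each match is one section that starts with Foo and ends with Yes.
matches = [m.group(0) for m in pattern.finditer(text)]
for section in matches:
    print(section)

# The pattern from the question, for contrast: (.|\n)* is greedy and can
# cross a later "Foo" line, so the match begins at the first section.
greedy = re.search(r"^Foo(.|\n)*^Yes", text, re.MULTILINE).group(0)
```

Because the negative lookahead refuses to consume a line that starts with Foo, an attempt that begins at the first section can never reach Yes and fails, so only the second section is returned. The greedy variable starts with "Foo $var1", which is the over-matching described in the question.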