
Regex to find the end of a multiline YAML comment block


I'm working on adding folding markers to my FastColoredTextBox for YAML. For this I need two regexes that detect the start and the end of each block of comments I want to be able to collapse. For example:

# line1 <-- start pattern matches here
# line2
# line3
# line4 <-- end pattern matches here

# line5 <-- no match, since it's a single line
hi <-- no match

# line6 <-- start pattern matches here
# line7 <-- end pattern matches here

I've already got a regex that matches the start of such a block:

    (?<!#.*\r?\n)#.*(?=\r?\n#)

It matches every comment line that has no comment on the line before it but does have one on the line after it (to rule out single-line comments). However, I can't figure out a pattern to match the end. I tried:

    (?<=#.*\r?\n)#.*(?!\r?\n#)

While I think this should work, it still picks up unwanted matches, such as line2 and line3 in the example above.


Solution

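  • You could use two capture groups. Group 1 captures the first line, and a repeated group then matches one or more following lines that start with #. Because a repeated capture group keeps only its last iteration, group 2 ends up holding the last line of the block. The ^ anchor assumes multiline mode is enabled.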

    ^(#.*)(?:\r?\n(#.*))+
    

    See a regex 101 demo

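    Or, if your regex engine supports variable-length lookbehind (as .NET does), with lookaround assertions. The $ anchor (with multiline mode enabled) forces #.* to run to the end of the line. Without it, .* can backtrack by one character so that the negative lookahead succeeds in the middle of a line, which is why your attempt also matched on line2 and line3.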

    For the opening line:

    (?<!^#.*\r?\n)#.*$(?=\r?\n#)
    

    For the closing line:

    (?<=^#.*\r?\n)#.*$(?!\r?\n#)
    

    See matching both lines here on regex 101
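
To see the effect of the $ anchor in isolation, here is a small Python sketch of just the lookahead half of the closing pattern. Python's built-in re module only accepts fixed-width lookbehind, so the lookbehind part is deliberately left out; the sample text and the variable names are made up for illustration.

```python
import re

# Hypothetical sample: one block of four comment lines.
text = "# line1\n# line2\n# line3\n# line4\n"

# Without an end-of-line anchor, `.*` can give back one character so that
# the negative lookahead no longer sees a newline followed by '#'.
loose = re.compile(r"#.*(?!\r?\n#)")
loose_matches = [m.group() for m in loose.finditer(text)]

# With `$` in multiline mode the match has to run to the end of the line,
# so only a line that is not followed by another comment line can match.
anchored = re.compile(r"#.*$(?!\r?\n#)", re.MULTILINE)
anchored_matches = [m.group() for m in anchored.finditer(text)]

print(loose_matches)
print(anchored_matches)
```

The unanchored pattern yields a truncated "# line" match on each of the first three lines plus "# line4", while the anchored one yields only "# line4".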
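
If you'd rather work with the two-group pattern, something along these lines turns its matches into line ranges for folding. This is a rough Python sketch, not FastColoredTextBox code: fold_ranges and the sample text are hypothetical, it assumes \n line endings, and the line numbers it returns are 0-based.

```python
import re

# Two-group pattern from the answer: group 1 is the first comment line,
# group 2 keeps only the last iteration, i.e. the last comment line.
BLOCK = re.compile(r"^(#.*)(?:\r?\n(#.*))+", re.MULTILINE)


def fold_ranges(text):
    """Return (first_line, last_line) 0-based line indices per comment block."""
    ranges = []
    for m in BLOCK.finditer(text):
        # Counting the newlines before a position gives its line index.
        first = text.count("\n", 0, m.start(1))
        last = text.count("\n", 0, m.start(2))
        ranges.append((first, last))
    return ranges


# Same layout as the example in the question.
sample = (
    "# line1\n# line2\n# line3\n# line4\n"
    "\n"
    "# line5\nhi\n"
    "\n"
    "# line6\n# line7\n"
)
ranges = fold_ranges(sample)
print(ranges)
```

For the sample this gives [(0, 3), (8, 9)]: the single comment on line5 is skipped because the repeated group needs at least one following comment line.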