Tags: ruby-on-rails, regex, ruby, string

How to match repeating patterns containing special characters in Ruby using regex?


I am trying to use a regex in Ruby to match repeating patterns that contain special characters. I can do this when the number of repetitions is known in advance, but not when it varies. An example string I want to match is:

Draw a square that is {{coords.width}} pixels wide by {{coords.height}} pixels tall.

For this string, the following works:

arr = value.scan(/\{\{(\w+?\.\w+?)\}\}/).flatten

After running this, arr is:

["coords.width", "coords.height"]

But how do I write a regex that also matches when the dotted segment repeats an arbitrary number of times, for example

Draw a square that is {{shape.rectangle.coords.width}} pixels wide by {{shape.rectangle.coords.height}} pixels tall.

while still matching when there is no "." at all, as in

Draw a square that is {{width}} pixels wide by {{height}} pixels tall.


Solution

  • You can use the regular expression

    r = /(?<=\{\{)[a-z]+(?:\.[a-z]+)*(?=\}\})/
    

    Rubular demo / PCRE demo at regex101.com

    I've included the PCRE demo because regex101.com provides a detailed explanation of each element of the regex (hover the cursor).

    For example,

    str = "Draw a square {{coords.width}} wide by {{coords.height}} " +
          "tall by {{coords deep}} deep"
    
    str.scan(r)
      #=> ["coords.width", "coords.height"]
    

    Notice that "coords deep" does not match because it does not have (what I have assumed is) a valid form. Notice also that I did not have to flatten the return value from scan because the regex has no capture groups.
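
    As a quick check against the two other strings from the question, the same regex can be run on the deeply nested and the dot-free forms. This is only a sketch: the variable names (nested, plain, grouped and so on) are made up for illustration, and the last line uses a capture-group pattern purely to show why the question's version needed flatten.

    ```ruby
    # Same regex as above.
    r = /(?<=\{\{)[a-z]+(?:\.[a-z]+)*(?=\}\})/

    nested = "Draw a square that is {{shape.rectangle.coords.width}} pixels wide " \
             "by {{shape.rectangle.coords.height}} pixels tall."
    plain  = "Draw a square that is {{width}} pixels wide by {{height}} pixels tall."

    nested_keys = nested.scan(r)
      #=> ["shape.rectangle.coords.width", "shape.rectangle.coords.height"]
    plain_keys = plain.scan(r)
      #=> ["width", "height"]

    # With a capture group, scan returns one array per match,
    # which is why the question's version called flatten.
    grouped = plain.scan(/\{\{(\w+)\}\}/)
      #=> [["width"], ["height"]]
    ```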

    We can write the regular expression in free-spacing mode to make it self-documenting.

    /
    (?<=      # begin a positive lookbehind
      \{\{    # match '{{'
    )         # end the positive lookbehind
    [a-z]+    # match 1 or more lower case letters
    (?:       # begin a non-capture group
      \.      # match a period
      [a-z]+  # match 1 or more lower case letters
    )         # end the non-capture group
    *         # execute the non-capture group zero or more times
    (?=       # begin a positive lookahead
      \}\}    # match '}}'
    )         # end positive lookahead
    /x        # free-spacing regex definition mode
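
The regex above assumes keys consist only of lower-case letters. If keys may also contain digits, underscores or upper-case letters (the question's original pattern used \w), one possible variant swaps [a-z] for \w. This is a sketch under that assumption, not part of the answer's regex; key_pattern and the sample string are hypothetical.

```ruby
# Hypothetical variant: \w in place of [a-z], written in free-spacing mode.
key_pattern = /
  (?<=\{\{)     # must be preceded by two opening braces
  \w+           # first segment: letters, digits or underscores
  (?:\.\w+)*    # zero or more further segments, each led by a period
  (?=\}\})      # must be followed by two closing braces
/x

str = "Size is {{Shape.rect_1.width}} by {{height2}}, not {{bad key}} or {{.x}}."

keys = str.scan(key_pattern)
  #=> ["Shape.rect_1.width", "height2"]
```

As with the original, placeholders containing a space or starting with a period are skipped, because the lookbehind and lookahead require the whole dotted key to sit directly between the braces.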