regex openrefine grel

Inserting Regular Expressions into OpenRefine facets


Working in OpenRefine, I want to find word pairs where the second word is 'Street'.

I have a Python regular expression that works for this, but I can't get it to work in OpenRefine.

https://regex101.com/r/igjCuo/94 shows the regex working:

\w+(\s+Street)

My issue is that I am obviously not inserting the regex into OpenRefine correctly.

Testing

If I try

value.find("Street") 

then all cells that contain the word are returned correctly.

However, putting the regular expression into the same query doesn't work.

I know that this is something basic about formatting the query, but I am at a loss and would really appreciate some help.


Solution

  • You did not form a correct regex literal or string pattern, and you added extra double quotation marks.

    You may use either of the following:

    value.find("\\w+\\s+Street")
    value.find(/\w+\s+Street/)
    

    Note that you do not need a capturing group, since you want the whole match.

    See OpenRefine 3.0 onwards:

    NOTE: If p is a String then we compile it into a regex pattern, otherwise, If p is already a regex, then we just use that regex pattern. NOTE: When supplying the regex pattern in string quotes, you will need to use escaping (double slashes)
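
As a rough illustration in Python (not GREL), the sketch below shows the same two ideas: a pattern written inside ordinary string quotes needs doubled backslashes, and a capturing group changes what a "find all" call returns. The `cells` list and the helper names are hypothetical sample data and names chosen for this example; Python's raw string is only loosely analogous to GREL's `/.../` literal.

```python
import re

# Hypothetical sample cell values, made up for illustration.
cells = ["12 High Street", "Street", "Baker Street and Oxford Street", "Main Road"]

# Raw string: backslashes reach the regex engine unchanged
# (loosely comparable to writing /\w+\s+Street/ in GREL).
raw_pattern = r"\w+\s+Street"

# Ordinary string: every backslash must be doubled
# (comparable to writing "\\w+\\s+Street" in GREL).
escaped_pattern = "\\w+\\s+Street"

# Both spellings produce the same pattern text.
assert raw_pattern == escaped_pattern


def street_pairs(text):
    """Return every whole 'word Street' match in text."""
    return re.findall(raw_pattern, text)


def street_pairs_grouped(text):
    """Same search, but with the capturing group from the question.

    With one capturing group, re.findall returns only the group's text,
    not the whole match.
    """
    return re.findall(r"\w+(\s+Street)", text)


if __name__ == "__main__":
    for cell in cells:
        print(repr(cell), street_pairs(cell), street_pairs_grouped(cell))
```

In this Python sketch the ungrouped pattern yields the full word pair, while the grouped one yields only the text captured by the group, which is why dropping the group is the simpler choice when the whole match is what you want.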