
How to limit lookbehind to strings which do not start with certain characters?


In InDesign, I’m using the GREP expression (?<=.)/(?=.) to locate all occurrences of the slash character / throughout a document.

For example, I want to find the character / in Color/Colour or American English/British English in order to apply a certain styling to the slash.

However, I want to limit this to all words/strings that do not begin with either http or www, so the slashes in https://usa.gov/about or www.gov.uk/about should not be included in the results. Lone slashes should/can be ignored.

I have managed to find all words/strings that begin with either http or www with \<www|\<http, but I’m not able to combine the two expressions.

I’ve tried the following but with no success:

(?<=.)(?<!\<www|\<http)/(?=.)

From what I can see, InDesign uses the Perl Regular Expression Syntax of the Boost libraries.


Solution

  • If the regex engine is Boost, as you state, you can use the (*SKIP)(*F) backtracking control verbs to first match what you don't want and then skip that match:

    (?<!\S)(?:(?:https?|www)\S+|/+(?!\S))(*SKIP)(*F)|/
    

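    The pattern matches:

      • (?<!\S) asserts that the current position is not preceded by a non-whitespace character, i.e. the start of a word/string
      • (?: opens a non-capturing group with two alternatives
          • (?:https?|www)\S+ matches http, https or www followed by one or more non-whitespace characters
          • /+(?!\S) matches one or more slashes that are not followed by a non-whitespace character, i.e. a lone slash
      • ) closes the group
      • (*SKIP)(*F) fails the match for what was consumed so far and resumes the search after it
      • | or
      • / matches a slash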
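To try the idea outside InDesign, here is a minimal Python sketch. Python's built-in `re` module does not support `(*SKIP)(*F)`, so this swaps in a different but related technique: the unwanted text is consumed by the first alternative and the wanted slash is captured in group 1, and only matches where group 1 took part are kept. The names `PATTERN`, `slash_positions` and `sample` are made up for this illustration.

```python
import re

# Emulation of (*SKIP)(*F) with a capture group (an assumption of this sketch,
# not the InDesign syntax): the left alternative consumes URL-like tokens and
# lone slashes, the right alternative captures a slash we actually want.
PATTERN = re.compile(r"(?<!\S)(?:(?:https?|www)\S+|/+(?!\S))|(/)")


def slash_positions(text: str) -> list[int]:
    """Return the index of each slash that is neither part of a token
    starting with http/www nor a lone slash."""
    return [
        m.start(1)
        for m in PATTERN.finditer(text)
        if m.group(1) is not None  # group 1 is empty for the skipped parts
    ]


sample = (
    "Color/Colour, see https://usa.gov/about or www.gov.uk/about / "
    "American English/British English"
)

for pos in slash_positions(sample):
    # Show a little context around every slash that was kept.
    print(pos, repr(sample[max(0, pos - 5):pos + 6]))
```

In InDesign itself the single pattern from the answer should be enough; the capture-group filter is only needed in engines without backtracking control verbs.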