regex regex-negation

Regular expression that doesn't contain a certain string


I have something like this

aabbabcaabda

To select the minimal group wrapped by a, I have /a([^a]*)a/, which works just fine.

But I have a problem with groups wrapped by aa, where I'd need something like /aa([^aa]*)aa/, which doesn't work. I can't use the first approach as /aa([^a]*)aa/ either, because it would stop at the first occurrence of a, which I don't want.

In general, is there a way to say "does not contain this string" in the same way that I can say "does not contain this character" with [^a]?

Simply put, I need aa, followed by any characters except the sequence aa, and then a closing aa.


Solution

  • In general, it's a pain to write a regular expression for strings that do not contain a particular string. We had to do this for models of computation: you take an NFA, which is easy enough to define, and then reduce it to a regular expression. The expression for strings not containing "cat" was about 80 characters long.

    Edit: I just finished and yes, it's:

    aa([^a]|a[^a])*aa
    

    Here is a very brief tutorial. I found some great ones before, but I can't find them anymore.
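To check the pattern from the edit, here is a minimal Python sketch. It assumes the pattern is written without spaces and with the group repeated via `*`, that is `aa([^a]|a[^a])*aa`. The function names and the extra outer capturing group are my own additions for illustration, not part of the answer. The second pattern uses a negative lookahead, which Python's `re` and most backtracking engines support. It is an alternative way to say "any character that does not start aa", not the construction described above.

```python
import re

# Pattern from the answer, with the repetition made explicit.
# The inner group is non-capturing so that group(1) holds the whole body;
# this capturing layout is an assumption added for illustration.
ALTERNATION = re.compile(r"aa((?:[^a]|a[^a])*)aa")

# Alternative (not from the answer): consume one character at a time,
# but only where the text does not start with "aa".
LOOKAHEAD = re.compile(r"aa((?:(?!aa).)*)aa")


def wrapped_groups(text):
    """Return the bodies of all minimal aa...aa groups (alternation form)."""
    return [m.group(1) for m in ALTERNATION.finditer(text)]


def wrapped_groups_lookahead(text):
    """Same idea, using the negative-lookahead form."""
    return [m.group(1) for m in LOOKAHEAD.finditer(text)]


if __name__ == "__main__":
    sample = "aabbabcaabda"  # the string from the question
    print(wrapped_groups(sample))            # ['bbabc']
    print(wrapped_groups_lookahead(sample))  # ['bbabc']
```

Both forms stop the body at the first aa after the opening aa, so a single a inside the body (as in "bbabc") no longer ends the group.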