Some special characters are allowed only if they are preceded by an escape character


I want to construct a regular expression (in the style of lex, with a more OCaml-like syntax) for a class of strings in which the 4 characters [, ], #, ' are allowed only if they are preceded by the escape character '.

Here are some valid examples:

Here are some non-valid examples:

I hope the definition is clear. First, does anyone know how to construct such a regular expression? Second, does anyone know how to write it in a form (lex style, with a more OCaml-like syntax) that ocamllex accepts?


Solution

  • You don't say what the accepted strings look like, other than with a few examples. Just for concreteness, let's say that lower-case letters and digits are allowed, and the 4 special characters are allowed only if preceded by '.

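    This language is then the Kleene closure of a set of 36 one-character strings (26 lower-case letters plus 10 digits) and 4 two-character strings (the escape character ' followed by one of the 4 special characters).

    In ocamllex syntax, it looks like this: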

     (['a' - 'z' '0' - '9'] | '\'' ['\'' '#' '[' ']'])*
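
To try the expression out, here is a minimal sketch of a complete ocamllex file built around it. The file name escaped.mll and the names plain, escaped, accepts and matches are made up for illustration, and the alphabet is still the assumed one (lower-case letters and digits). The rule returns true only when the whole input matches, because the first pattern ends in eof.

```ocaml
(* escaped.mll (hypothetical file name); generate with: ocamllex escaped.mll *)

(* Assumed alphabet of ordinary characters: lower-case letters and digits. *)
let plain = ['a'-'z' '0'-'9']

(* The escape character followed by one of the 4 special characters. *)
let escaped = '\'' ['\'' '#' '[' ']']

(* The first case needs the entire input, up to end of file, to be a
   sequence of plain and escaped items. Any other input falls through
   to the second case, which consumes one character and rejects. *)
rule accepts = parse
  | (plain | escaped)* eof { true }
  | _                      { false }

{
  (* Hypothetical helper: run the rule over a whole string. *)
  let matches s = accepts (Lexing.from_string s)
}
```

With this in place, matches "a'#b'[c']" should return true, since the string splits into a, '#, b, '[, c, ']. matches "a#b" should return false, because the # is not escaped. In a real lexer you would more likely use the expression as one token pattern among several, without the trailing eof.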