petitparser

Parsing delimited strings using petitparser


I was originally looking to (manually) write a simple tokeniser/parser for my grammar, but one of my requirements makes tokenising a bit fiddly.

I need to support delimited strings where the delimiter could be any character. Strings are most likely to be delimited using double quotes (e.g. "hello"), but it could just as easily be /hello/ or ,hello, or, pathologically, xhellox

So I started looking at alternatives that combine tokenising and parsing, which is when I stumbled across PetitParser.

Can this type of delimited string be parsed using PetitParser? Thanks.


Solution

  • There are multiple ways to achieve this with PetitParser. Probably the most elegant is to use a continuation parser:

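    import 'package:petitparser/petitparser.dart';

    final delimited = any().callCC((continuation, context) {
      // Run any() at the current position; the character it reads is the delimiter.
      final delimiter = continuation(context).value.toParser();
      // Build a parser for: delimiter, anything but the delimiter, delimiter.
      final parser = [
        delimiter,
        delimiter.neg().star().flatten(),
        delimiter,
      ].toSequenceParser().pick<String>(1);
      // Parse from the original position and keep only the contents.
      return parser.parseOn(context);
    });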

    The above snippet parses the start character with any() (this can be restricted further, if necessary) and then dynamically creates a delimiter parser from it. It then builds a sequence parser from that delimiter parser: the start character, the contents (any run of characters that are not the delimiter), and the end character. This new parser is what consumes the input, and pick(1) returns just the contents. This approach also gives really nice error messages.
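
For comparison, the same "first character is the delimiter" logic can be written by hand in plain Dart, without PetitParser. This is only a minimal sketch: `parseDelimited` is a hypothetical helper name, not part of the library. It assumes a single UTF-16 code unit as the delimiter and null-safe Dart, and it returns null on failure rather than an error message.

```dart
/// Hypothetical helper (not a PetitParser API): treats the first character of
/// [input] as the delimiter and returns the text up to its next occurrence.
/// Returns null if the input is empty or the closing delimiter is missing.
String? parseDelimited(String input) {
  if (input.isEmpty) return null;

  // The opening character decides what the closing delimiter must be.
  final delimiter = input[0];

  // Look for the closing delimiter, starting just after the opening one.
  final end = input.indexOf(delimiter, 1);
  if (end == -1) return null;

  // Everything strictly between the two delimiters is the string's contents.
  return input.substring(1, end);
}
```

With this sketch, `parseDelimited('"hello"')`, `parseDelimited('/hello/')` and `parseDelimited('xhellox')` all return `hello`, while an unterminated input such as `"hello` returns null. The PetitParser version is likely the better fit once the delimited string is only one part of a larger grammar.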