Tags: parsing, f#, indentation, fparsec

Is it possible to parse "off-side" (indentation-based) languages with FParsec?


I want to use FParsec for a Python-like, indentation-based language.

I understand that this must be done in the lexing phase, but FParsec doesn't have a lexing phase. Is it possible to do this with FParsec directly, or, if not, how can I feed it the output of a separate lexer?

P.S.: I'm new to F#, but experienced in other languages.


Solution

  • Yes, it's possible.

    Here is a relevant article by the FParsec author. If you want to go deeper into the subject, this paper might be worth a read. The paper points out that there are multiple packages for indentation-aware parsing based on Parsec, the parser combinator library that inspired FParsec.

    FParsec doesn't have a separate lexing phase; instead, it fuses lexing and parsing into a single phase. IMO indentation-aware parsing is better done with parser combinators (FParsec) than with parser generators (fslex/fsyacc). The reason is that you need to track the current indentation manually and report good error messages based on context.
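
To make "track the current indentation manually" concrete, here is a minimal, language-agnostic sketch of the usual indentation-stack technique, written in Python for brevity. It is not taken from the linked article or paper. The names `layout_tokens`, `LayoutError`, and the token kinds `LINE`, `INDENT`, and `DEDENT` are hypothetical, and the sketch assumes spaces-only indentation. In FParsec, the same stack could plausibly live in the parser's user state instead of a separate pass.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text); the kinds are hypothetical names


class LayoutError(Exception):
    """Raised when a line dedents to a column that no open block uses."""


def layout_tokens(source: str) -> List[Token]:
    """Turn leading spaces into explicit INDENT/DEDENT tokens."""
    stack = [0]  # indentation widths of the enclosing blocks, outermost first
    tokens: List[Token] = []
    for lineno, raw in enumerate(source.splitlines(), start=1):
        text = raw.strip()
        if not text:  # blank lines do not affect block structure
            continue
        width = len(raw) - len(raw.lstrip(" "))  # assumption: spaces only
        if width > stack[-1]:
            # Deeper than the current block: open a new one.
            stack.append(width)
            tokens.append(("INDENT", ""))
        else:
            # Shallower: close blocks until the widths line up again.
            while width < stack[-1]:
                stack.pop()
                tokens.append(("DEDENT", ""))
            if width != stack[-1]:
                raise LayoutError(
                    f"line {lineno}: unindent to column {width} "
                    "does not match any enclosing block"
                )
        tokens.append(("LINE", text))
    # Close any blocks still open at the end of the input.
    tokens.extend(("DEDENT", "") for _ in stack[1:])
    return tokens
```

Once indentation has been turned into explicit tokens like these, the grammar can treat `INDENT` and `DEDENT` like ordinary opening and closing brackets. The error branch illustrates why a hand-tracked stack helps with messages: the failing line and column are known at the point of failure.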