Is there any way to return multiple tokens in OCamlLex?
I'm trying to write a lexer and parser for an indentation-based language, and I would like my lexer to return multiple DEDENT tokens when it notices that the indentation level is less than it previously was. This will allow it to notify the parser when multiple blocks have ended.
By following this method, I would be able to use INDENT and DEDENT as drop-in replacements for BEGIN and END, as these two tokens would be implied by the INDENT and DEDENT tokens.
Return the list of tokens. If the parser cannot handle that natively (ocamlyacc, for example), just insert a cache in between:
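(* Wrap a list-returning lexer rule so the parser sees one token per call. *)
let cache =
  let l = ref [] in              (* tokens lexed but not yet consumed *)
  fun lexbuf ->
    match !l with
    | x :: xs -> l := xs; x      (* serve a buffered token first *)
    | [] ->
      (* buffer empty: ask the lexer for the next batch of tokens *)
      (match Lexer.tokens lexbuf with
       | [] -> failwith "oops"   (* the lexer must return at least one token *)
       | x :: xs -> l := xs; x)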
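For illustration, here is a minimal sketch of how a list-returning rule such as Lexer.tokens might decide what to emit at the start of a line, using a stack of indentation widths. The token type, `indent_stack`, and `tokens_for_indent` are hypothetical names made up for this example, and it assumes the lexer has already measured the column at which the line starts.

```ocaml
(* Hypothetical token type; a real grammar would declare these in the parser. *)
type token = INDENT | DEDENT | NEWLINE

(* Enclosing indentation widths, innermost first; 0 is the top level. *)
let indent_stack = ref [0]

(* Tokens to emit when a new line starts at column [col]. *)
let tokens_for_indent col =
  match !indent_stack with
  | top :: _ when col > top ->
    (* Deeper than the current block: open exactly one new block. *)
    indent_stack := col :: !indent_stack;
    [INDENT]
  | _ ->
    (* Pop every level deeper than [col], emitting one DEDENT for each. *)
    let rec pop acc =
      match !indent_stack with
      | top :: rest when col < top ->
        indent_stack := rest;
        pop (DEDENT :: acc)
      | top :: _ when top <> col ->
        (* [col] does not line up with any enclosing block. *)
        failwith "inconsistent dedent"
      | _ -> acc
    in
    (match pop [] with
     | [] -> [NEWLINE]          (* same level: just a line break *)
     | dedents -> dedents)
```

With this in place, a line that drops from column 8 straight back to column 0 yields two DEDENT tokens in one call, which the cache then hands to the parser one at a time.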
Or you can run the lexer on the full document and then run the parser on the full token stream.
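A rough sketch of that second approach, assuming the lexer rule returns a batch of tokens per call and that some batch eventually contains EOF (otherwise `lex_all` would not terminate). The token type and the names `lex_all` and `feeder` are made up for this example.

```ocaml
(* Hypothetical token type standing in for the one the parser declares. *)
type token = INDENT | DEDENT | WORD of string | EOF

(* Call a batch-returning lexer until a batch contains EOF,
   then return all tokens in the order they were produced. *)
let lex_all (next_batch : Lexing.lexbuf -> token list) lexbuf =
  let rec go acc =
    let batch = next_batch lexbuf in
    let acc = List.rev_append batch acc in
    if List.mem EOF batch then List.rev acc else go acc
  in
  go []

(* Turn a token list into the one-token-per-call function a parser expects.
   The lexbuf argument is ignored, so position information is lost. *)
let feeder tokens =
  let rest = ref tokens in
  fun (_ : Lexing.lexbuf) ->
    match !rest with
    | [] -> EOF                      (* keep answering EOF once exhausted *)
    | t :: ts -> rest := ts; t
```

An ocamlyacc entry point takes a `Lexing.lexbuf -> token` function and a lexbuf, so something like `Parser.main (feeder toks) (Lexing.from_string "")` should work, where `Parser.main` stands for whatever your grammar's start symbol is called. The dummy lexbuf means error positions reported by the parser will not be meaningful.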
BTW, have you seen ocaml+twt? It is a preprocessor that gives OCaml an indentation-based syntax.