parsingocamlmenhir

How to parse list of expression using menhir?


Here are my current lexer and parser for Andrew Appel's Tiger language (ocaml).

I'm currently trying to support mutual recursive functions, but the following parser code doesn't work:

decs :
    | l = list(dec) { l }

dec :
    | t = nonempty_list(loc(tydec)) { S.TypeDec t }
    | v = loc(vardec) { S.VarDec v }
    | f = nonempty_list(loc(fundec)) { S.FunDec f }

%inline fundec :
    | Function fun_name = symbol LPar params = tyfields RPar
        Eq body = loc(exp) {
        S.{ fun_name; args = params; return_type = None; body }
    }
    | Function fun_name = symbol LPar params = tyfields RPar
        Colon result_type = symbol Eq body = loc(exp) {
        S.{ fun_name; args = params; return_type = Some result_type; body }
    }

For the small example:

let
    function f1(x : int) : int =
        f2(x)

    function f2(x : int) : int =
        f1(x)

in
    f1 (0)
end

I get two FunDec tokens with a singleton list instead of a single FunDec token with a list made of two elements.

How can I use menhir to parse a list of fundec ?

PS: I know I can merge these list in a second pass, but I'd like the parser to do it for me if possible


Solution

  • Since there is no marker for a group of functions, you have to declare your list yourself, with several constructors:

    decs :
        | hd=nonempty_list(fundec) tl=decs_no_function { (S.Fundecs hd)::tl }
        | l=decs_no_function { l }
    
    decs_no_functions :
        | hd=dec tl=decs { hd::tl } (* dec same as yours, without functions *)
        | { [] }
    

    Here decs_no_functions corresponds to "any list of declaration that does not start with a function". Note that a single function declaration will be inside a single element list.