ocamlpolymorphic-variants

Polymorphic variants and let%bind type error


I'm trying to use the technique in Composable Error Handling in OCaml (Result type with polymorphic variants for errors) for some code I've written. The types of the functions I'm trying to use look like this:

val parse : parser -> token list -> (Nominal.term, [> `ParseError of string ]) Result.t
val lex : lexer -> string -> (token list, [> `LexError of string ]) Result.t

My attempt at composing them is this:

let lex_and_parse
  : parser -> lexer -> string -> (Nominal.term, [> `ParseError of string | `LexError of string ]) Result.t
  = fun parser lexer input ->
    let open Result.Let_syntax in
    let%bind tokens = lex lexer input in
    parse parser tokens

Unfortunately the compiler (4.09.0) reports a type error:

File "src/Pratt.ml", line 147, characters 4-23:
147 |     parse parser tokens
          ^^^^^^^^^^^^^^^^^^^
Error: This expression has type
         (Nominal.term, [ `ParseError of string ]) result
       but an expression was expected of type
         (Nominal.term, [> `LexError of string ]) result
       The first variant type does not allow tag(s) `LexError

Note that if I do the equivalent by hand, the code compiles:

let lex_and_parse
  : parser -> lexer -> string -> (Nominal.term, [> `ParseError of string | `LexError of string ]) Result.t
  = fun parser lexer input ->
    match lex lexer input with
      | Error (`LexError err) -> Error (`LexError err)
      | Ok tokens ->
        (match parse parser tokens with
        | Ok result -> Ok result
        | Error (`ParseError err) -> Error (`ParseError err))

Actually, that's not quite true. The equivalent is this, which also fails to compile (in the same way):

    match lex lexer input with
      | Error err -> Error err
      | Ok tokens ->
        match parse parser tokens with
        | Ok result -> Ok result
        | Error err -> Error err
File "src/Pratt.ml", line 155, characters 29-32:
155 |         | Error err -> Error err
                                   ^^^
Error: This expression has type [ `ParseError of string ]
       but an expression was expected of type
         [> `LexError of string | `ParseError of string ]
       The first variant type does not allow tag(s) `LexError

So my question is this. Note that the error message says "This expression has type (Nominal.term, [ `ParseError of string ]) result". This is what I don't understand -- I never specify that type anywhere, in fact, both places ParseError is mentioned, it's with a > constraint. So where does this type come from? IE where does [>ParseError of string ]become[ ParseError of string ]?

Also:

Edit:

I uploaded all of my code for context.

Edit 2 (sorry):

I did some exploration and came up with this implementation:

let parse : parser -> token list -> (Nominal.term, [> `ParseError of string ]) Result.t
  = fun parser toks ->
    match expression parser toks with
    (* | [], result -> result *)
    (* | _, Error err -> Error err *)
    | _, Ok _ -> Error (`ParseError "leftover tokens")
    | _, _ -> Error (`ParseError "unimplemented")

If I remove either of the commented lines, the implementation of lex_and_parse starts to fail again. It's a bit disturbing to me that parse compiles and its type signature never changes, yet a caller can fail to typecheck. How is this possible? This kind of nonlocal effect seriously violates my expectation for how type checking / signatures (ought to) work. I'd really like to understand what's happening.


Solution

  • So first of all, your intuition is right, your code shall work, e.g., the following code is accepted by 4.09.0 without any type errors:

    open Base
    
    module type S = sig
      type parser
      type lexer
      type token
      type term
    
      val parse : parser -> token list -> (term, [> `ParseError of string ]) Result.t
      val lex : lexer -> string -> (token list, [> `LexError of string ]) Result.t
    end
    
    module Test (P : S) = struct
      open P
      let lex_and_parse :
        parser -> lexer -> string -> (term, [> `ParseError of string | `LexError of string ]) Result.t
        = fun parser lexer input ->
          let open Result.Let_syntax in
          let%bind tokens = lex lexer input in
          parse parser tokens
    end
    
    
    module PL : S = struct
      type parser
      type lexer
      type token
      type term
    
      let parse _parser _tokens = Error (`ParseError "not implemented")
      let lex _ _ = Error (`LexError "not implemented")
    end
    

    So my question is this. Note that the error message says "This expression has type (Nominal.term, [ `ParseError of string ]) result". This is what I don't understand -- I never specify that type anywhere, in fact, both places ParseError is mentioned, it's with a > constraint. So where does this type come from? IE where does [>ParseError of string ]become[ ParseError of string ]?

    You are right, this is the main culprit. For some reason, your parse function returns a value of type

    (term, [`ParseError of string])
    

    where the type of the error constituent is a ground type, i.e., it is not polymorphic and cannot be extended. It is hard to tell, why this has happened, but I bet that there should be some type annotation that you have put that stops the type checker from inferring the most general type for the parse function. In any case, the culprit is hiding somewhere and is not in the code that you have shown us.

    Is there a way to weaken a polymorphic variant from [ x ] to [> x ]? (other than mapping all the tags by hand from the first type to the second)

    Yes,

    # let weaken x = (x : [`T]  :> [> `T]);;
    val weaken : [ `T ] -> [> `T ] = <fun>
    

    What's the difference between my attempt and Vladimir's original (which I assume compiles)?

    Your parse function in fact returns a non-extensible type. Note that to turn an non-extensible type to extensible, you have to use the full form coercion, e.g., if you will define lex_and_parse as

      let lex_and_parse :
        parser -> lexer -> string -> (term, [> `ParseError of string | `LexError of string ]) Result.t
        = fun parser lexer input ->
          let open Result.Let_syntax in
          let parse = (parse
                       :  _ -> _ -> (_, [ `ParseError of string]) Result.t
                       :> _ -> _ -> (_, [> `ParseError of string]) Result.t) in
    
          let%bind tokens = lex lexer input in
          parse parser tokens
    

    it will compile. But again the main culprit is the type of your parse function.

    Where the actual bug was hiding

    After OP have uploaded the source code we were able to identify why and where the OCaml typechecker was denied from inferring the general and polymorphic type.

    Here is the story, the parse function is implemented as

    let parse : parser -> token list -> (Nominal.term, [> `ParseError of string ]) Result.t
      = fun parser toks -> match expression parser toks with
        | [], result -> result
        | _, Ok _ -> Error (`ParseError "leftover tokens")
        | _, Error err -> Error err
    

    So its return type is a unification of the types of the folowing expressions:

    in addition we have a type constraint

    parser -> token list -> (Nominal.term, [> `ParseError of string ]) Result.t
    

    And here is an important thing to understand, a type constrained is not a definition, so when you say let x : 'a = 42 you are not defining x to have a universal polymorphic type 'a. A type constraint (expr : typeexpr) forces the type of expr to be compatible with typexpr. In other words, type constraint can only constrain the type, but the type itself is always inferred by the type checker. If the inferred type is more general, e.g., 'a list than the constraint, e.g., int list, then it will be constrained to int list. But you can't move the other way around as it will defeat the type soundness, e.g., if the inferred type is int list and your constraint is 'a list, then it will still be 'a list (treat it like intersection of types). Again, the type inference will infer the most general type, and you can only make it less general.

    So, finally, the return type of the parse subroutine is the result of unification of the three expressions above plus the user constraint. The type of result is the smallest type, since you have constrained the expression function here to return errors of the non-extensible ground type parse_error.

    Now to mitigations.

    The easiest solution is to remove type annotations at all and rely on the type checker, merlin, and well-defined interfaces (signatures) when you program. Indeed, the type annotation only confused you here. You wrote an extensible [> ...] type annotation and believed that the inferred type is extensible which wasn't true.

    If you need to keep them or if you need to make the expression function a part of your interface, then you have two options, either make your parse_error extensible, and this means polymorphic or use type coercion to weaken the type of result and make it extensible, e.g.,

    | [], result -> (result : parse_error :> [> parse_error])
    

    If you will decide to make your parse_error type extensible, you can't just say

    type parse_error = [> `ParseError of string]
    

    because now parse_error denotes a whole family of types, so we need to represent this variability of type with a type variable, two syntaxes here are applicable,

    type 'a parse_error = [>
      | `ParseError of string
      | `MoonPhaseError of int
    ] as 'a
    

    or a more verbose, but to my taste more precise,

    type 'a parse_error = 'a constraint 'a = [>
        | `ParseError of string
        | `MoonPhaseError of int
      ]
    

    Both definitions are equivalent. The all mean that type 'a parser_error is a type variable 'a s.t. 'a includes the ParseError, the MoonPhaseError and infinitely more errors of unspecified genera.