I'm trying to use the technique in Composable Error Handling in OCaml (Result type with polymorphic variants for errors) for some code I've written. The types of the functions I'm trying to use look like this:
val parse : parser -> token list -> (Nominal.term, [> `ParseError of string ]) Result.t
val lex : lexer -> string -> (token list, [> `LexError of string ]) Result.t
My attempt at composing them is this:
let lex_and_parse
: parser -> lexer -> string -> (Nominal.term, [> `ParseError of string | `LexError of string ]) Result.t
= fun parser lexer input ->
let open Result.Let_syntax in
let%bind tokens = lex lexer input in
parse parser tokens
Unfortunately the compiler (4.09.0) reports a type error:
File "src/Pratt.ml", line 147, characters 4-23:
147 | parse parser tokens
^^^^^^^^^^^^^^^^^^^
Error: This expression has type
(Nominal.term, [ `ParseError of string ]) result
but an expression was expected of type
(Nominal.term, [> `LexError of string ]) result
The first variant type does not allow tag(s) `LexError
Note that if I do the equivalent by hand, the code compiles:
let lex_and_parse
: parser -> lexer -> string -> (Nominal.term, [> `ParseError of string | `LexError of string ]) Result.t
= fun parser lexer input ->
match lex lexer input with
| Error (`LexError err) -> Error (`LexError err)
| Ok tokens ->
(match parse parser tokens with
| Ok result -> Ok result
| Error (`ParseError err) -> Error (`ParseError err))
Actually, that's not quite true. The equivalent is this, which also fails to compile (in the same way):
match lex lexer input with
| Error err -> Error err
| Ok tokens ->
match parse parser tokens with
| Ok result -> Ok result
| Error err -> Error err
File "src/Pratt.ml", line 155, characters 29-32:
155 | | Error err -> Error err
^^^
Error: This expression has type [ `ParseError of string ]
but an expression was expected of type
[> `LexError of string | `ParseError of string ]
The first variant type does not allow tag(s) `LexError
So my question is this. Note that the error message says "This expression has type (Nominal.term, [ `ParseError of string ]) result
". This is what I don't understand -- I never specify that type anywhere, in fact, both places ParseError
is mentioned, it's with a >
constraint. So where does this type come from? IE where does [>
ParseError of string ]become
[ ParseError of string ]
?
Also:
[ x ]
to [> x ]
? (other than mapping all the tags by hand from the first type to the second)Edit:
I uploaded all of my code for context.
Edit 2 (sorry):
I did some exploration and came up with this implementation:
let parse : parser -> token list -> (Nominal.term, [> `ParseError of string ]) Result.t
= fun parser toks ->
match expression parser toks with
(* | [], result -> result *)
(* | _, Error err -> Error err *)
| _, Ok _ -> Error (`ParseError "leftover tokens")
| _, _ -> Error (`ParseError "unimplemented")
If I remove either of the commented lines, the implementation of lex_and_parse
starts to fail again. It's a bit disturbing to me that parse
compiles and its type signature never changes, yet a caller can fail to typecheck. How is this possible? This kind of nonlocal effect seriously violates my expectation for how type checking / signatures (ought to) work. I'd really like to understand what's happening.
So first of all, your intuition is right, your code shall work, e.g., the following code is accepted by 4.09.0 without any type errors:
open Base
module type S = sig
type parser
type lexer
type token
type term
val parse : parser -> token list -> (term, [> `ParseError of string ]) Result.t
val lex : lexer -> string -> (token list, [> `LexError of string ]) Result.t
end
module Test (P : S) = struct
open P
let lex_and_parse :
parser -> lexer -> string -> (term, [> `ParseError of string | `LexError of string ]) Result.t
= fun parser lexer input ->
let open Result.Let_syntax in
let%bind tokens = lex lexer input in
parse parser tokens
end
module PL : S = struct
type parser
type lexer
type token
type term
let parse _parser _tokens = Error (`ParseError "not implemented")
let lex _ _ = Error (`LexError "not implemented")
end
So my question is this. Note that the error message says "This expression has type (Nominal.term, [ `ParseError of string ]) result". This is what I don't understand -- I never specify that type anywhere, in fact, both places ParseError is mentioned, it's with a > constraint. So where does this type come from? IE where does [>ParseError of string ]become[ ParseError of string ]?
You are right, this is the main culprit. For some reason, your parse
function returns a value of type
(term, [`ParseError of string])
where the type of the error constituent is a ground type, i.e., it is not polymorphic and cannot be extended. It is hard to tell, why this has happened, but I bet that there should be some type annotation that you have put that stops the type checker from inferring the most general type for the parse
function. In any case, the culprit is hiding somewhere and is not in the code that you have shown us.
Is there a way to weaken a polymorphic variant from [ x ] to [> x ]? (other than mapping all the tags by hand from the first type to the second)
Yes,
# let weaken x = (x : [`T] :> [> `T]);;
val weaken : [ `T ] -> [> `T ] = <fun>
What's the difference between my attempt and Vladimir's original (which I assume compiles)?
Your parse function in fact returns a non-extensible type. Note that to turn an non-extensible type to extensible, you have to use the full form coercion, e.g., if you will define lex_and_parse
as
let lex_and_parse :
parser -> lexer -> string -> (term, [> `ParseError of string | `LexError of string ]) Result.t
= fun parser lexer input ->
let open Result.Let_syntax in
let parse = (parse
: _ -> _ -> (_, [ `ParseError of string]) Result.t
:> _ -> _ -> (_, [> `ParseError of string]) Result.t) in
let%bind tokens = lex lexer input in
parse parser tokens
it will compile. But again the main culprit is the type of your parse
function.
After OP have uploaded the source code we were able to identify why and where the OCaml typechecker was denied from inferring the general and polymorphic type.
Here is the story, the parse
function is implemented as
let parse : parser -> token list -> (Nominal.term, [> `ParseError of string ]) Result.t
= fun parser toks -> match expression parser toks with
| [], result -> result
| _, Ok _ -> Error (`ParseError "leftover tokens")
| _, Error err -> Error err
So its return type is a unification of the types of the folowing expressions:
result
, Error ('ParseError "leftover tokens")
Error err
in addition we have a type constraint
parser -> token list -> (Nominal.term, [> `ParseError of string ]) Result.t
And here is an important thing to understand, a type constrained is not a definition, so when you say let x : 'a = 42
you are not defining x
to have a universal polymorphic type 'a
. A type constraint (expr : typeexpr)
forces the type of expr
to be compatible with typexpr
. In other words, type constraint can only constrain the type, but the type itself is always inferred by the type checker. If the inferred type is more general, e.g., 'a list
than the constraint, e.g., int list
, then it will be constrained to int list
. But you can't move the other way around as it will defeat the type soundness, e.g., if the inferred type is int list
and your constraint is 'a list
, then it will still be 'a list
(treat it like intersection of types). Again, the type inference will infer the most general type, and you can only make it less general.
So, finally, the return type of the parse
subroutine is the result of unification of the three expressions above plus the user constraint. The type of result
is the smallest type, since you have constrained the expression
function here to return errors of the non-extensible ground type parse_error.
Now to mitigations.
The easiest solution is to remove type annotations at all and rely on the type checker, merlin, and well-defined interfaces (signatures) when you program. Indeed, the type annotation only confused you here. You wrote an extensible [> ...]
type annotation and believed that the inferred type is extensible which wasn't true.
If you need to keep them or if you need to make the expression function a part of your interface, then you have two options, either make your parse_error
extensible, and this means polymorphic or use type coercion to weaken the type of result and make it extensible, e.g.,
| [], result -> (result : parse_error :> [> parse_error])
If you will decide to make your parse_error
type extensible, you can't just say
type parse_error = [> `ParseError of string]
because now parse_error denotes a whole family of types, so we need to represent this variability of type with a type variable, two syntaxes here are applicable,
type 'a parse_error = [>
| `ParseError of string
| `MoonPhaseError of int
] as 'a
or a more verbose, but to my taste more precise,
type 'a parse_error = 'a constraint 'a = [>
| `ParseError of string
| `MoonPhaseError of int
]
Both definitions are equivalent. The all mean that type 'a parser_error
is a type variable 'a
s.t. 'a
includes the ParseError, the MoonPhaseError and infinitely more errors of unspecified genera.