haskelltypesfunctional-programminggadt

Difference between GADTs and GADTSyntax


When I define a new type using GADT-Syntax, what exactly makes the difference between a "normal" algebraic data type and a generalized algebraic data type? I think it has to do with the type signatures of the defined data constructors but could never find an exact definition.

Furthermore, what are consequences of this difference that justify having to enable GADTs explicitly? I read that they make type inference indecidable, but why cant we treat the type constructors as functions with that type signature and plug it into the inference algorithm?


Solution

  • Pattern-matching on a normal algebraic datatype (regardless of whether it's defined in GADT syntax) only injects value information into the body. Pattern-matching on a GADT can also inject type information.

    For example,

    data NonGADT a where
      Agnostic :: a -> NonGADT a
    
    data GADT a where
      IntSpecific :: Int -> GADT Int
      CharSpecific :: Char -> GADT Char
    

    All these constructors have in a sense the same signature a -> □ADT a, so a priori that's what the compiler would need to work with when typechecking a function involving pattern matches on them.

    extractNonGADT :: NonGADT a -> a
    extractNonGADT v = case v of
       Agnostic x -> x
    
    extractGADT :: GADT a -> a
    extractGADT v = case v of
       IntSpecific x -> x
       CharSpecific x -> x
    

    The difference is that extractGADT could make use of the fact that the types inside are actually constrained to a single one, to do stuff like

    extractGADT' :: GADT a -> a
    extractGADT' v = case v of
       IntSpecific x -> x + 5
       CharSpecific x -> toUpper x
    

    A standard Hindley-Milner compiler would not be able to handle this: it would assume that a is everywhere the same within the whole definition. In fact, technically speaking it is also everywhere the same in GHC, except that some parts of the code have additional constraints a ~ Int or a ~ Char in scope, respectively. But it's not enough to add handling for that: this additional information must not be allowed to escape the scope (because it would just not be true), and that requires an extra check.