Genericize GADT

I'm implementing an interpreter for a functional language that will need to do runtime type checking, as different expressions in the input language return different types of values. I started off with a basic boxing data type like this:

-- Extended value type
data Value 
  = StringValue T.Text 
  | IntValue Integer
  | BoolValue Bool
  | Error EvalError
  deriving (Eq, Show)

This values would be returned from expressions in the parse tree on evaluation, something like this:

data Expression = Constant Value a |
                  UnaryOp (Value a -> Value b) Expression |
                  BinaryOp (Value a -> Value b  -> Value c) Expression Expression

For defining operations, I created a helper function:

lift :: (Int -> Int -> Int) -> Value -> Value -> Value
lift f (IntValue x) (IntValue y) = IntValue $ f x y
lift _ _ _ = Error TypeError

Which could be used to build operators like this:

case op of
  '+' -> (BinaryOp (lift (+)) arg1 arg2, rest2)
-- etc.

Implementation started to get a bit cumbersome, and I thought it was a little weird to define my own lift method and all the other surrounding operators, when Haskell has type classes for exactly this reason. I thought about trying to make Value an instance of Functor and Applicative, but that doesn't really work without Value having a type variable, then I learned about GADTs, and I thought I could use that for my Value type, so I changed it into this:

data Value a where
  StringValue :: T.Text    -> Value T.Text
  IntValue    :: Int       -> Value Int
  BoolValue   :: Bool      -> Value Bool
  Error       :: EvalError -> Value EvalError

Now I'm trying to make a Functor and Applicative instance. I started off with my Functor instance looking like this:

instance Functor Value where
  fmap f (StringValue t) = StringValue $ f t
  fmap f (IntValue x)    = IntValue $ f x
  fmap f (BoolValue b)   = BoolValue $ f b
  fmap _ _               = Error TypeError

I thought it was a bit silly to have all of those repeated fmap instances, and it would get really tricky to manage if I wanted to have combinations of types, like a function mapping an int to a string for example. I thought maybe I could get around this by using Applicative and helper functions, so now I have this:

instance Functor Value where
  fmap f v = pure $ f (extractValue v)

extractValue :: Value a -> a
extractValue (StringValue t) = t
extractValue (IntValue n) = n
extractValue (BoolValue b) = b
extractValue (Error e) = e

It seems a little silly to me to have all of these copies of unboxing code that are all doing the same thing, and my pure implementation is going to also need a bunch of copies going the other way. I could just make Value a generic constructor, but then I won't be able to pattern match on the value type later on, and I'm sure that's going raise issues like with the addition implementation above if the compiler can't be sure the value type actually is an int.

It would be nice if I could just pattern match on a generic kind of constructor, but I don't know if that's possible in Haskell. If I could just have all of the constructors match against something like

extractValue :: Value a -> a
extractValue (T t) = t

Instance Applicative Value where
  pure x = T x
-- etc.

that would be great. Is there a way to do that in Haskell?

Solution

An easy way to avoid duplication is to factor out the "type" witnesses:

data Ty a where
  StringTy :: Ty T.Text
  IntTy    :: Ty Int
  BoolTy   :: Ty Bool
  ErrorTy  :: Ty EvalError

data Value a = Value (Ty a) a

You can then use pattern synonyms to recover your original constructors:

pattern StringValue a = Value StringTy a
pattern IntValue a = Value IntTy a
pattern BoolValue a = Value BoolTy a
pattern Error a = Value ErrorTy a

The extractValue function is now straightforward to implement:

extractValue :: Value a -> a
extractValue (Value _ a) = a

However, as pointed out by others in the comments, Value is not a Functor on all of Hask. The best you can do is define its action on endomorphisms:

emap :: (a -> a) -> Value a -> Value a
emap f (Value t a) = Value t (f a)

Or take an explicit witness that the target type is in Ty:

fmap' :: Ty b -> (a -> b) -> Value a -> Value b
fmap' t' f (Value t a) = Value t' (f a)

Very informally, Value is the inclusion of the full subcategory of Hask carved out by the types in Ty, so you could reasonably see it as an "exofunctor".

Depending on your use case, it might make more sense to use the "identity functor" instead of Value (so a instead of Value a) and constrain a separately when needed, possibly using a typeclass instead of a GADT.