The Haskell Wiki does a good job of explaining how to use existential types, but I don't quite grok the theory behind them.
Consider this example of an existential type:
data S = forall a. Show a => S a -- (1)
to define a type wrapper for things that we can convert to a String
. The wiki mentions that what we really want to define is a type like
data S = S (exists a. Show a => a) -- (2)
i.e. a true "existential" type - loosely I think of this as saying "the data constructor S
takes any type for which a Show
instance exists and wraps it". In fact, you could probably write a GADT as follows:
data S where -- (3)
S :: Show a => a -> S
I haven't tried compiling that, but it seems as though it should work. To me, the GADT is obviously equivalent to the code (2) that we'd like to write.
However, it's completely not obvious to me why (1) is equivalent to (2). Why does moving the data constructor to the outside turn the forall
into an exists
?
The closest thing I can think of are De Morgan's Laws in logic, where interchanging the order of a negation and a quantifier turns existential quantifiers into universal quantifiers, and vice-versa:
¬(∀x. px) ⇔ ∃x. ¬(px)
but data constructors seem to be a totally different beast to the negation operator.
What is the theory that lies behind the ability to define existential types using forall
instead of the non-existent exists
?
First of all, take a look at the "Curry Howard correspondence" which states that the types in a computer program correspond to formulas in intuitionistic logic. Intuitionistic logic is just like the "regular" logic you learned in school but without the law of the excluded middle or double negation elimination:
Not an axiom: P ⇔ ¬¬P (but P ⇒ ¬¬P is fine)
Not an axiom: P ∨ ¬P
You are on the right track with DeMorgan's laws, but first we are going to use them to derive some new ones. The relevant version of DeMorgan's laws are:
We can derive (∀x. P ⇒ Q(x)) = P ⇒ (∀x. Q(x)):
And (∀x. Q(x) ⇒ P) = (∃x. Q(x)) ⇒ P (this one is used below):
Note that these laws hold in intuitionistic logic as well. The two laws we derived are cited in the paper below.
The simplest types are easy to work with. For example:
data T = Con Int | Nil
The constructors and accessors have the following type signatures:
Con :: Int -> T
Nil :: T
unCon :: T -> Int
unCon (Con x) = x
Now let's tackle type constructors. Take the following data definition:
data T a = Con a | Nil
This creates two constructors,
Con :: a -> T a
Nil :: T a
Of course, in Haskell, type variables are implicitly universally quantified, so these are really:
Con :: ∀a. a -> T a
Nil :: ∀a. T a
And the accessor is similarly easy:
unCon :: ∀a. T a -> a
unCon (Con x) = x
Let's add the existential quantifier, ∃, to our original type (the first one, without the type constructor). Rather than introducing it in the type definition, which doesn't look like logic, introduce it in the constructor / accessor definitions, which do look like logic. We'll fix the data definition later to match.
Instead of Int
, we will now use ∃x. t
. Here, t
is some kind of type expression.
Con :: (∃x. t) -> T
unCon :: T -> (∃x. t)
Based on the rules of logic (the second rule above), we can rewrite the type of Con
to:
Con :: ∀x. t -> T
When we moved the existential quantifier to the outside (prenex form), it turned into a universal quantifier.
So the following are theoretically equivalent:
data T = Con (exists x. t) | Nil
data T = forall x. Con t | Nil
Except there is no syntax for exists
in Haskell.
In non-intuitionistic logic, it is permissible to derive the following from the type of unCon
:
unCon :: ∃ T -> t -- invalid!
The reason this is invalid is because such a transformation is not permitted in intuitionistic logic. So it is impossible to write the type for unCon
without an exists
keyword, and it is impossible to put the type signature in prenex form. It's hard to make a type checker guaranteed to terminate in such conditions, which is why Haskell doesn't support arbitrary existential quantifiers.
"First-class Polymorphism with Type Inference", Mark P. Jones, Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages (web)