I was experimenting with phantom types in Haskell. My goal is to convert the LangCode
Type to it's corresponding Phantom Type representation via Type Classes for example DE
to Lang DE
.
module Main (main) where
import Data.Proxy (Proxy(..))
data DE
data EN
data LangCode
= DE
| EN
deriving (Eq, Show)
type Lang a = Proxy a
de :: Lang DE
de = Proxy
en :: Lang EN
en = Proxy
class ToLangCode a where
toLangCode :: Lang a -> LangCode
instance ToLangCode DE where
toLangCode _ = DE
instance ToLangCode EN where
toLangCode _ = EN
class FromLangCode a where
fromLangCode :: LangCode -> Lang a
instance FromLangCode DE where
fromLangCode DE = Proxy
instance FromLangCode EN where
fromLangCode EN = Proxy
main :: IO ()
main = do
print $ de -- Output => Proxy
print $ en -- Output => Proxy
print $ toLangCode de -- Output => DE
print $ toLangCode en -- Output => EN
-- works
print $ (fromLangCode DE :: Lang DE) -- Output => Proxy
print $ (fromLangCode EN :: Lang EN) -- Output => Proxy
-- throws an error
print $ fromLangCode DE -- Output => Proxy
print $ fromLangCode EN -- Output => Proxy
With a type annotation it works fine. But without it I get this error.
[1 of 1] Compiling Main ( main.hs, main.o )
main.hs:50:11: error:
* Ambiguous type variable `a0' arising from a use of `fromLangCode'
prevents the constraint `(FromLangCode a0)' from being solved.
Probable fix: use a type annotation to specify what `a0' should be.
These potential instances exist:
instance FromLangCode DE -- Defined at main.hs:32:10
instance FromLangCode EN -- Defined at main.hs:34:10
* In the second argument of `($)', namely `fromLangCode DE'
In a stmt of a 'do' block: print $ fromLangCode DE
In the expression:
do print $ de
print $ en
print $ toLangCode de
print $ toLangCode en
....
|
50 | print $ fromLangCode DE -- Output => Proxy
| ^^^^^^^^^^^^^^^
main.hs:51:11: error:
* Ambiguous type variable `a1' arising from a use of `fromLangCode'
prevents the constraint `(FromLangCode a1)' from being solved.
Probable fix: use a type annotation to specify what `a1' should be.
These potential instances exist:
instance FromLangCode DE -- Defined at main.hs:32:10
instance FromLangCode EN -- Defined at main.hs:34:10
* In the second argument of `($)', namely `fromLangCode EN'
In a stmt of a 'do' block: print $ fromLangCode EN
In the expression:
do print $ de
print $ en
print $ toLangCode de
print $ toLangCode en
....
|
51 | print $ fromLangCode EN -- Output => Proxy
| ^^^^^^^^^^^^^^^
exit status 1
My question is. Is it possible to implement it in a way so that the type annotations are no longer needed?
Updated version
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE FlexibleInstances #-}
module Main (main) where
data LangCode = DE | EN deriving (Eq, Show)
data SLangCode a where
SDE :: SLangCode DE
SEN :: SLangCode EN
data Lang (a :: LangCode) where
LangDE :: Lang 'DE
LangEN :: Lang 'EN
deriving instance Show (Lang 'DE)
deriving instance Show (Lang 'EN)
slcDE :: SLangCode a -> Lang 'DE -> Lang a
slcDE SDE t = t
slcDE SEN _ = LangEN
slcEN :: SLangCode a -> Lang 'EN -> Lang a
slcEN SEN t = t
slcEN SDE _ = LangDE
class ToLangCode a where
toLangCode :: Lang a -> LangCode
instance ToLangCode 'DE where
toLangCode _ = DE
instance ToLangCode 'EN where
toLangCode _ = EN
class FromLangCode a where
fromLangCode :: SLangCode a -> LangCode -> Lang a
instance FromLangCode 'DE where
fromLangCode SDE DE = LangDE
instance FromLangCode 'EN where
fromLangCode SEN EN = LangEN
main :: IO ()
main = do
print $ toLangCode LangDE -- Output => DE
print $ toLangCode LangEN -- Output => EN
print $ fromLangCode SDE DE -- Output => LangDE
print $ fromLangCode SEN EN -- Output => LangEN
Here's the problem. It's a fundamental property of types that if the term-level expressions e1
and e2
have the same type t
, then substituting e1
with e2
in a program doesn't change any of the program's types. That's what makes them types.
You want the expression (with no explicit type signature):
fromLangCode EN
to have type Lang EN
, and that's easy enough. But if the term-level expressions EN
and DE
are constructors from the same sum type, LangCode
, then substituting one for the other won't change any types, so the expression:
fromLangCode DE
will still have type Lang EN
, which is clearly not what you want.
So, if you want two different inferred types:
fromLangCode EN :: Lang EN
fromLangCode DE :: Lang DE
then any solution is going to require that the term-level expressions EN
and DE
have different types, which means that you can't have:
data LangCode = EN | DE
So, that's the short answer -- you can't freely convert between a sum type LangCode
and proxies of type Lang lang
. Or rather, you can easily convert from type Lang lang
to term LangCode
using a type class (like ToLangCode
), but you can't really convert back.
There are many solutions to this "problem", but it kind of depends on what you're trying to do, which is why folks in the comments are asking you questions about "use cases" and "expected behavior".
One simple solution is to write:
data EN = EN
data DE = DE
Here, the term-level expressions EN
and DE
have different types (EN
and DE
respectively). This allows you to easily implement the desired interface using your main
function verbatim:
import Data.Proxy
data EN = EN deriving (Show)
data DE = DE deriving (Show)
type Lang a = Proxy a
de :: Lang DE
de = Proxy
en :: Lang EN
en = Proxy
class ToLangCode a where
toLangCode :: Lang a -> a
instance ToLangCode DE where
toLangCode _ = DE
instance ToLangCode EN where
toLangCode _ = EN
class FromLangCode a where
fromLangCode :: a -> Lang a
instance FromLangCode DE where
fromLangCode DE = Proxy
instance FromLangCode EN where
fromLangCode EN = Proxy
main :: IO ()
main = do
print $ de -- Output => Proxy
print $ en -- Output => Proxy
print $ toLangCode de -- Output => DE
print $ toLangCode en -- Output => EN
-- works
print $ (fromLangCode DE :: Lang DE) -- Output => Proxy
print $ (fromLangCode EN :: Lang EN) -- Output => Proxy
-- works fine now
print $ fromLangCode DE -- Output => Proxy
print $ fromLangCode EN -- Output => Proxy
If you're suspicious of this "solution", you're right to be. It doesn't really accomplish anything, since the term-level expressions EN
and DE
are already distinct at the type level, and this program is really just converting between one type level representation (types EN
and DE
) and another (types Lang EN
and Lang DE
).
One way to kind of do what you want is to use a GADT. If we define LangCode
as a "generalized" sum type:
{-# LANGUAGE GADTs #-}
data EN
data DE
data LangCode lang where
EN :: LangCode EN
DE :: LangCode DE
everything works more or less like my previous example, with a few minor changes to type signatures, and main
left unchanged, as below:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
import Data.Proxy
data EN
data DE
data LangCode lang where
EN :: LangCode EN
DE :: LangCode DE
deriving instance Show (LangCode a)
type Lang a = Proxy a
de :: Lang DE
de = Proxy
en :: Lang EN
en = Proxy
class ToLangCode a where
toLangCode :: Lang a -> LangCode a
instance ToLangCode DE where
toLangCode _ = DE
instance ToLangCode EN where
toLangCode _ = EN
class FromLangCode a where
fromLangCode :: LangCode a -> Lang a
instance FromLangCode DE where
fromLangCode DE = Proxy
instance FromLangCode EN where
fromLangCode EN = Proxy
main :: IO ()
main = do
print $ de -- Output => Proxy
print $ en -- Output => Proxy
print $ toLangCode de -- Output => DE
print $ toLangCode en -- Output => EN
-- works
print $ (fromLangCode DE :: Lang DE) -- Output => Proxy
print $ (fromLangCode EN :: Lang EN) -- Output => Proxy
-- works fine now
print $ fromLangCode DE -- Output => Proxy
print $ fromLangCode EN -- Output => Proxy
So, we can freely convert between this generalized sum type and a phantom type representation.
This really isn't a big improvement over the previous example. The two constructors are now formally part of a generalized sum type, but the term-level expressions EN
and DE
are already distinct at the type level, and we're just converting between one type-level representation (types LangCode EN
and LangCode DE
) and another (types Lang EN
and Lang DE
).
However, the same criticism could be levelled at your "updated example". By introducing singletons (a generalized sum type), you too are already making the expressions fromLangCode SDE DE
and fromLangCode SEN EN
distinct at the type-level in that first, singleton argument. The second, term-level argument plays no useful role here and could be eliminated, so you're just converting from one type-level representation (SLangCode DE
versus SLangCode EN
) to another (Lang DE
versus Lang EN
).
Actual useful conversion between term and type-level representations usually involves an existential type somewhere in the mix. It helps to consider a slightly more realistic example. Suppose you might want to be able to use the type system to help avoid inappropriately mixing languages. For example:
import Data.List.Extra
data EN
data DE
newtype Text lang = Text String deriving (Show)
item :: Text EN
item = Text "The big elephant"
artikel :: Text DE
artikel = Text "Die großen Elefanten"
fixß :: Text DE -> Text DE
fixß (Text x) = Text $ replace "ß" "ss" x
pluralize :: Text EN -> Text EN
pluralize (Text noun) | "s" `isSuffixOf` noun = Text $ noun ++ "ses"
| otherwise = Text $ noun ++ "s"
message_en :: Text EN
message_en = pluralize item
message_de :: Text DE
message_de = fixß artikel
-- type system prevents applying german manipulations to english text
type_error_1 = fixß item
But, you might also like to make a run-time decision about which language is being used in a particular expression:
data LangCode = EN | DE
main :: IO ()
main = do
let language = EN -- assume this comes from args or user input
-- type error: `mytext` can't be both `Text EN` and `Text DE`
let mytext = case language of
EN -> message_en
DE -> message_de
print mytext
This doesn't work because the type of mytext
can't depend on a runtime computation. That is, there's no simple way to convert the runtime term-level value language :: LangCode
(a sum type) to a type-level value Lang language
, the desired type of mytext
.
The usual solution in is to use an existential type:
{-# LANGUAGE ExistentialQuantification #-}
data SomeText = forall lang. SomeText (Text lang)
Here, the type SomeText
represents text in some unspecified language (i.e., a value of type Text lang
for some unspecified type lang
). Now, mytext
can be assigned text in a runtime-determined language by wrapping it with the SomeText
constructor.
let mytext = case language of
EN -> SomeText message_en
DE -> SomeText message_de
We are limited in what we can do with a SomeText
value like mytext
-- we can't do anything that depends on knowing the language, like applying pluralize
or fixß
or whatever. However, one thing we can do is extract the string, since that works for any language:
getText :: SomeText -> String
getText (SomeText (Text str)) = str
which allows us to write a useful main
:
main :: IO ()
main = do
let language = EN
let mytext = case language of
EN -> SomeText message_en
DE -> SomeText message_de
print $ getText mytext
Here's the full working example:
{-# LANGUAGE ExistentialQuantification #-}
import Data.List.Extra
data EN
data DE
data LangCode = EN | DE
newtype Text lang = Text String deriving (Show)
data SomeText = forall lang. SomeText (Text lang)
item :: Text EN
item = Text "The big elephant"
artikel :: Text DE
artikel = Text "Die großen Elefanten"
fixß :: Text DE -> Text DE
fixß (Text x) = Text $ replace "ß" "ss" x
pluralize :: Text EN -> Text EN
pluralize (Text noun) | "s" `isSuffixOf` noun = Text $ noun ++ "ses"
| otherwise = Text $ noun ++ "s"
message_en :: Text EN
message_en = pluralize item
message_de :: Text DE
message_de = fixß artikel
getText :: SomeText -> String
getText (SomeText (Text str)) = str
main :: IO ()
main = do
let language = DE
let mytext = case language of
EN -> SomeText message_en
DE -> SomeText message_de
print $ getText mytext
What we've done here is successfully converted a term-level value from a sum type (language
) to a type-level value Text language
by wrapping it in a SomeText
constructor.
We can use the same technique to convert freely between a sum type and type-level proxies, by shielding the type-level proxies in an existential. It might look something like this. Note how I've used a custom Show
instance to differentiate proxies of different types, to prove we're doing something useful.
{-# LANGUAGE ExistentialQuantification #-}
import Data.Proxy
data DE
data EN
-- sum type
data LangCode
= DE
| EN
deriving (Eq, Show)
-- existential for type-level proxies
type Lang = Proxy
data SomeLang = forall a. ToLangCode a => SomeLang (Lang a)
instance Show SomeLang where
show (SomeLang lang) = "SomeLang Proxy<" ++ show (toLangCode' lang) ++ ">"
-- convert from LangCode to SomeLang
fromLangCode :: LangCode -> SomeLang
fromLangCode EN = SomeLang (Proxy :: Lang EN)
fromLangCode DE = SomeLang (Proxy :: Lang DE)
-- convert from SomeLang to LangCode
class ToLangCode lang where
toLangCode' :: Lang lang -> LangCode
instance ToLangCode EN where
toLangCode' Proxy = EN
instance ToLangCode DE where
toLangCode' Proxy = DE
toLangCode :: SomeLang -> LangCode
toLangCode (SomeLang lang) = toLangCode' lang
de :: SomeLang
de = SomeLang (Proxy :: Lang DE)
en :: SomeLang
en = SomeLang (Proxy :: Lang EN)
main :: IO ()
main = do
print $ de -- Output => SomeLang Proxy<DE>
print $ en -- Output => SomeLang Proxy<EN>
print $ toLangCode de -- Output => DE
print $ toLangCode en -- Output => EN
print $ fromLangCode DE -- Output => SomeLang Proxy<DE>
print $ fromLangCode EN -- Output => SomeLang Proxy<EN>