TL;DR: How do I write interpreters (run :: Lang a b -> whatever
) for my Lang
GADT?
I have a DSL:
data Lang a b where
F :: Lang x y
G :: Lang x y
I want to build programs like:
prog :: Lang a b
prog = G . F
And interpret them arbitrarily, where the replacement for F
and G
can be any function, so long as their types compose properly.
-- run ~= (+1) . fst
run :: Lang a b -> ((Int, b) -> Int)
run F = fst
run G = (+1) -- error: couldn't match type (Int, b) with Int
Intepretation is my big issue. How do I write functions like:
run :: Lang a b -> whatever
-- eg reusing the variables
run :: Lang a b -> (a -> IO b)
I've added notions of composition to my language:
data Lang a b where
...
Lift :: (a -> b) -> Lang a b
Comp :: Lang b c -> Lang a b -> Lang a c
instance C.Category Lang where
id = Lift id
(.) = Comp
This allows me to build programs, but not interpret them. I understand the compiler's errors to me, when I try to instantiate one specific term to some type that is not the overall run
fn's type.
This seems a bit like Free
approaches, perhaps an Indexed Free? Relatedly, this is a much simplified version of my real problem, that largely uses Arrows, but, I think this simplified formulation encapsulates my issue.
Here's a runnable test environment:
{-# LANGUAGE GADTs #-}
module T15_GADT_Lang where
import qualified Control.Category as C
------------------------------
-- * My Lang
data Lang a b where
-- | DSL
F :: Lang x y
G :: Lang x y
-- | Composition
Lift :: (a -> b) -> Lang a b
Comp :: Lang b c -> Lang a b -> Lang a c
instance C.Category Lang where
id = Lift id
(.) = Comp
------------------------------
-- * Program A can be interpreted in 2 different ways
progA :: Lang a b
progA = F
runA1 :: Lang a b -> (Int -> String)
runA1 F x = show x <> "!"
runA2 :: Lang a b -> ([Double] -> String)
runA2 F x = show x <> "!"
progAWorksNicely = do
putStrLn $ runA1 progA 123
putStrLn $ runA2 progA [1..3]
------------------------------
-- * Program B, I can't interpret the interim functions indepenedently
progB :: Lang a b
progB = G C.. F
-- | get the fst and increment it
runB1 :: Lang a b -> ((Int, b) -> Int)
runB1 F = fst
runB1 G = (+1) -- error: couldn't match type (Int, b) with Int
-- | Get the head and duplicate it
runB2 :: Lang a b -> ([a] -> (a, a))
runB2 F xs = head xs -- you're giving an `a`, but it should be `(a,a)`
runB2 G x = (x, x)
You can write these different interpretations using only Lift
and Comp
, but it is less obvious how to interpret F
as both fst
and head
in a way that makes sense
-- > run (Lift (\x -> show x <> "!")) 1
-- "1!"
-- > run (Lift (\x -> show x <> "!")) [pi,pi]
-- "[3.141592653589793,3.141592653589793]!"
-- > run (Lift (1 +) `Comp` Lift fst) (100, "ok")
-- 101
-- dup a = (a, a)
-- >> run (Lift dup `Comp` Lift head) [1,2,3]
-- (1, 1)
run :: Lang a b -> (a -> b)
run (Comp f g) = run f . run g
run (Lift f) = f
The problem with F
and G
is that their indices are completely universally quantified, that means that composing with them might as well be untyped because any input and any output is valid!
If you want type-safe composition of G . F
their interpretation must connect to those types somehow so you must provide indexes for Lang
that unify these two uses
run F = fst :: (a, b) -> a
run G = (+) 1 :: Int -> Int
run F = head :: [a] -> a
run G = dup :: a -> (a, a)
In the case of Lift
the input and output types are clearly reflected in the indices. Each of them could be made into a separate constructors that are indexed differently.
Lift (\x -> show x <> "!") :: Lang Int String
Lift (\x -> show x <> "!") :: Lang [Double] String
Lift fst :: Lang (a, b) a
Lift (1 +) :: Lang Int Int
Lift head :: Lang [a] a
Lift dup :: Lang a (a, a)
If you want to interpret F
as fst
and G
as (+ 1)
you you have the types match
type Lang :: Cat Type
data Lang a b where
One :: Lang (a, b) a
Plus1 :: Lang Int Int
..
-- > run (Plus1 . One) (100, 200)
-- 101
run :: Lang a b -> (a -> b)
run One = fst
run Plus1 = (+) 1
..
you can similarly interpret as an IO-kleisli arrow
-- > runIO (Plus1 . One) (100, 200)
-- One
-- (+) 1
-- 101
runIO :: Lang a b -> (a -> IO b)
runIO One ~(a, _) = do
putStrLn "One"
pure a
runIO Plus1 n = do
putStrLn "(+) 1"
pure (1 + n)
runIO (Comp f g) a =
(runIO f <=< runIO g) a
tangent
You can also interpret it by abstracting the constructors into methods of a type class and interpreting it constrained by that class
type Lang :: Cat Type
data Lang a b where
One :: Lang (a, b) a
Plus1 :: Lang Int Int
Lift :: (a -> b) -> Lang a b
Comp :: Lang b c -> Lang a b -> Lang a c
type Langc :: Cat Type -> Constraint
class Langc lang where
one :: lang (a, b) a
plus1 :: lang Int Int
lift :: (a -> b) -> lang a b
comp :: lang b c -> lang a b -> lang a c
run :: Lang a b -> (forall lang. Langc lang => lang a b)
run = \case
One -> one
Plus1 -> plus1
Lift f -> lift f
Comp f g -> comp (run f) (run g)
I imagine the aim is to treat Lift id
as the identity of Comp
like the category operations are. Language can be expressed as a "free Category
"
type Langc :: Cat Type -> Constraint
class Category cat => Langc cat where
one :: cat (a, b) a
plus1 :: cat Int Int
lift :: (a -> b) -> cat a b
run :: Lang a b -> (forall lang. Langc lang => lang a b)
run = \case
One -> one
Plus1 -> plus1
Lift f -> lift f
Comp f g -> run f . run g
or a free Arrow
which has arr :: Arrow arr => (a -> b) -> arr a b
type Langc :: Cat Type -> Constraint
class Arrow arr => Langc arr where
one :: cat (a, b) a
plus1 :: cat Int Int
run :: Lang a b -> (forall lang. Langc lang => lang a b)
run = \case
One -> one
Plus1 -> plus1
Lift f -> arr f
Comp f g -> run f . run g
Which can be interpreted at Langc (->)
and Langc (Kleisli IO)
.
In fact you can skip having an "initial" Lang
encoding and define it as "final"
type Lang :: Cat Type
type Lang a b = (forall lang. Langc lang => lang a b)
f :: Lang (Int, a) Int
f = one >>> plus1
-- > run f (100, "ok")
-- 101
run :: Lang a b -> (a -> b) -- requires: instance Langc (->)
run lang = lang