haskelldsldependent-typegadtfree-monad

How can I polymorphically interpret an Arrow-like GADT DSL?


TL;DR: How do I write interpreters (run :: Lang a b -> whatever) for my Lang GADT?


I have a DSL:

data Lang a b where
  F :: Lang x y
  G :: Lang x y

I want to build programs like:

prog :: Lang a b
prog = G . F

And interpret them arbitrarily, where the replacement for F and G can be any function, so long as their types compose properly.

-- run ~= (+1) . fst
run :: Lang a b -> ((Int, b) -> Int)
run F = fst
run G = (+1) -- error: couldn't match type (Int, b) with Int

Intepretation is my big issue. How do I write functions like:

run :: Lang a b -> whatever
-- eg reusing the variables
run :: Lang a b -> (a -> IO b)

I've added notions of composition to my language:

data Lang a b where
  ...

  Lift :: (a -> b) -> Lang a b
  Comp :: Lang b c -> Lang a b -> Lang a c

instance C.Category Lang where
  id = Lift id
  (.) = Comp

This allows me to build programs, but not interpret them. I understand the compiler's errors to me, when I try to instantiate one specific term to some type that is not the overall run fn's type.

This seems a bit like Free approaches, perhaps an Indexed Free? Relatedly, this is a much simplified version of my real problem, that largely uses Arrows, but, I think this simplified formulation encapsulates my issue.

Here's a runnable test environment:

{-# LANGUAGE GADTs #-}

module T15_GADT_Lang where

import qualified Control.Category as C

------------------------------
-- * My Lang

data Lang a b where
  -- | DSL
  F :: Lang x y
  G :: Lang x y

  -- | Composition
  Lift :: (a -> b) -> Lang a b
  Comp :: Lang b c -> Lang a b -> Lang a c

instance C.Category Lang where
  id = Lift id
  (.) = Comp

------------------------------
-- * Program A can be interpreted in 2 different ways

progA :: Lang a b
progA = F

runA1 :: Lang a b -> (Int -> String)
runA1 F x = show x <> "!"

runA2 :: Lang a b -> ([Double] -> String)
runA2 F x = show x <> "!"

progAWorksNicely = do
  putStrLn $ runA1 progA 123
  putStrLn $ runA2 progA [1..3]


------------------------------
-- * Program B, I can't interpret the interim functions indepenedently

progB :: Lang a b
progB = G C.. F

-- | get the fst and increment it
runB1 :: Lang a b -> ((Int, b) -> Int)
runB1 F = fst
runB1 G = (+1) -- error: couldn't match type (Int, b) with Int

-- | Get the head and duplicate it
runB2 :: Lang a b -> ([a] -> (a, a))
runB2 F xs = head xs -- you're giving an `a`, but it should be `(a,a)`
runB2 G x = (x, x)

Solution

  • You can write these different interpretations using only Lift and Comp, but it is less obvious how to interpret F as both fst and head in a way that makes sense

    -- > run (Lift (\x -> show x <> "!")) 1
    -- "1!"
    -- > run (Lift (\x -> show x <> "!")) [pi,pi]
    -- "[3.141592653589793,3.141592653589793]!"
    -- > run (Lift (1 +) `Comp` Lift fst) (100, "ok")
    -- 101
    -- dup a = (a, a)
    -- >> run (Lift dup `Comp` Lift head) [1,2,3]
    -- (1, 1)
    run :: Lang a b -> (a -> b)
    run (Comp f g) = run f . run g
    run (Lift f)   = f
    

    The problem with F and G is that their indices are completely universally quantified, that means that composing with them might as well be untyped because any input and any output is valid!

    If you want type-safe composition of G . F their interpretation must connect to those types somehow so you must provide indexes for Lang that unify these two uses

    run F = fst   :: (a, b) -> a
    run G = (+) 1 ::           Int -> Int
    
    run F = head :: [a] -> a
    run G = dup  ::        a -> (a, a)
    

    In the case of Lift the input and output types are clearly reflected in the indices. Each of them could be made into a separate constructors that are indexed differently.

    Lift (\x -> show x <> "!") :: Lang Int      String
    Lift (\x -> show x <> "!") :: Lang [Double] String
    Lift fst                   :: Lang (a, b)   a
    Lift (1 +)                 :: Lang Int      Int
    Lift head                  :: Lang [a]      a
    Lift dup                   :: Lang a        (a, a)
    

    If you want to interpret F as fst and G as (+ 1) you you have the types match

    type Lang :: Cat Type
    data Lang a b where
      One   :: Lang (a, b) a
      Plus1 :: Lang Int    Int
      ..
    
    -- > run (Plus1 . One) (100, 200)
    -- 101
    run :: Lang a b -> (a -> b)
    run One   = fst
    run Plus1 = (+) 1
    ..
    

    you can similarly interpret as an IO-kleisli arrow

    -- > runIO  (Plus1 . One) (100, 200)
    -- One
    -- (+) 1
    -- 101
    runIO :: Lang a b -> (a -> IO b)
    runIO One ~(a, _) = do
      putStrLn "One"
      pure a
    runIO Plus1 n = do
      putStrLn "(+) 1"
      pure (1 + n)
    runIO (Comp f g) a = 
      (runIO f <=< runIO g) a
    

    tangent

    You can also interpret it by abstracting the constructors into methods of a type class and interpreting it constrained by that class

    type Lang :: Cat Type
    data Lang a b where
      One   :: Lang (a, b) a
      Plus1 :: Lang Int Int
      Lift  :: (a -> b) -> Lang a b
      Comp  :: Lang b c -> Lang a b -> Lang a c
    
    type  Langc :: Cat Type -> Constraint
    class Langc lang where
     one   :: lang (a, b) a
     plus1 :: lang Int    Int
     lift  :: (a -> b) -> lang a b
     comp  :: lang b c -> lang a b -> lang a c
    
    run :: Lang a b -> (forall lang. Langc lang => lang a b)
    run = \case
     One      -> one
     Plus1    -> plus1
     Lift f   -> lift f
     Comp f g -> comp (run f) (run g)
    

    I imagine the aim is to treat Lift id as the identity of Comp like the category operations are. Language can be expressed as a "free Category"

    type  Langc :: Cat Type -> Constraint
    class Category cat => Langc cat where
     one   :: cat (a, b) a
     plus1 :: cat Int    Int
     lift  :: (a -> b) -> cat a b
    
    run :: Lang a b -> (forall lang. Langc lang => lang a b)
    run = \case
     One      -> one
     Plus1    -> plus1
     Lift f   -> lift f
     Comp f g -> run f . run g
    

    or a free Arrow which has arr :: Arrow arr => (a -> b) -> arr a b

    type  Langc :: Cat Type -> Constraint
    class Arrow arr => Langc arr where
     one   :: cat (a, b) a
     plus1 :: cat Int    Int
    
    run :: Lang a b -> (forall lang. Langc lang => lang a b)
    run = \case
     One      -> one
     Plus1    -> plus1
     Lift f   -> arr f
     Comp f g -> run f . run g
    

    Which can be interpreted at Langc (->) and Langc (Kleisli IO).

    In fact you can skip having an "initial" Lang encoding and define it as "final"

    type Lang :: Cat Type
    type Lang a b = (forall lang. Langc lang => lang a b)
    
    f :: Lang (Int, a) Int
    f = one >>> plus1
    
    -- > run f (100, "ok")
    -- 101
    run :: Lang a b -> (a -> b)     -- requires: instance Langc (->)
    run lang = lang