haskell monads category-theory free-monad monadplus

Is there a Codensity MonadPlus that asymptotically optimizes a sequence of MonadPlus operations?

Recently there was a question about the relation between DList <-> [] versus Codensity <-> Free.

This made me think whether there is such a thing for MonadPlus. The Codensity monad improves the asymptotic performance only for the monadic operations, not for mplus.

Moreover, while there used to be Control.MonadPlus.Free, it has been removed in favor of FreeT f []. And since there is no explicit free MonadPlus, I'm not sure how one would express a corresponding improve variant. Perhaps something like

improvePlus :: Functor f => (forall m. (MonadFree f m, MonadPlus m) => m a) -> FreeT f [] a

Update: I attempted to create such a monad using the backtracking LogicT monad, which seems to be defined in a way similar to Codensity:

newtype LogicT r m a = LogicT { unLogicT :: forall r. (a -> m r -> m r) -> m r -> m r }

and is suited for backtracking computations, that is, MonadPlus.

Then I defined lowerLogic, similar to lowerCodensity as followd:

{-# LANGUAGE RankNTypes, FlexibleInstances, FlexibleContexts, MultiParamTypeClasses,
             UndecidableInstances, DeriveFunctor #-}
import Control.Monad
import Control.Monad.Trans.Free
import Control.Monad.Logic

lowerLogic :: (MonadPlus m) => LogicT m a -> m a
lowerLogic k = runLogicT k (\x k -> mplus (return x) k) mzero

Then, after supplementing the corresponding MonadFree instance

instance (Functor f, MonadFree f m) => MonadFree f (LogicT m) where
    wrap t = LogicT (\h z -> wrap (fmap (\p -> runLogicT p h z) t))

one can define

improvePlus :: (Functor f, MonadPlus mr)
            => (forall m. (MonadFree f m, MonadPlus m) => m a)
            -> FreeT f mr a
improvePlus k = lowerLogic k

However, something isn't right with it, as it seems from my initial experiments that for some examples k is distinct from improvePlus k. I'm not sure, if this is a fundamental limitation of LogicT and a different, more complex monad is needed, or just if I defined lowerLogic (or something else) wrongly.

Solution

The following is all based on my (mis)understanding of this very interesting paper posted by Matthew Pickering in his comment: From monoids to near-semirings: the essence of MonadPlus and Alternative (E. Rivas, M. Jaskelioff, T. Schrijvers). All results are theirs; all mistakes are mine.

From free monoids to `DList`

To build up the intuition, first consider the free monoid [] over the category of Haskell types Hask. One problem with [] is that if you have

(xs `mappend` ys) `mappend` zs = (xs ++ ys) ++ zs

then evaluating that requires traversing and re-traversing xs for each left-nested application of mappend.

The solution is to use CPS in the form of difference lists:

newtype DList a = DL { unDL :: [a] -> [a] }

The paper considers the generic form of this (called the Cayley representation) where we're not tied to the free monoid:

newtype Cayley m = Cayley{ unCayley :: Endo m }

with conversions

toCayley :: (Monoid m) => m -> Cayley m
toCayley m = Cayley $ Endo $ \m' -> m `mappend` m'

fromCayley :: (Monoid m) => Cayley m -> m
fromCayley (Cayley k) = appEndo k mempty

Two directions of generalization

We can generalize the above construction in two ways: first, by considering monoids not over Hask, but over endofunctors of Hask; i.e. monads; and second, by enriching the algebraic structure into near-semirings.

`Free` monads to `Codensity`

For any Haskell (endo)functor f, we can construct the free monad Free f, and it will have the analogous performance problem with left-nested binds, with the analogous solution of using the Cayley representation Codensity.

Near-semirings instead of just monoids

This is where the paper stops reviewing concepts that are well-known by the working Haskell programmer, and starts homing in on its goal. A near-semiring is like a ring, except simpler, since both addition and multiplication are just required to be monoids. The connection between the two operations is what you expect:

zero |*| a = zero
(a |+| b) |*| c = (a |*| c) |+| (b |*| c)

where (zero, |+|) and (one, |*|) are the two monoids over some shared base:

class NearSemiring a where
    zero :: a
    (|+|) :: a -> a -> a
    one :: a
    (|*|) :: a -> a -> a

The free near-semiring (over Hask) turns out to be the following Forest type:

newtype Forest a = Forest [Tree a]
data Tree a = Leaf | Node a (Forest a)

instance NearSemiring (Forest a) where
    zero = Forest []
    one = Forest [Leaf]
    (Forest xs) |+| (Forest ys) = Forest (xs ++ ys)
    (Forest xs) |*| (Forest ys) = Forest (concatMap g xs)
      where
        g Leaf = ys
        g (Node a n) = [Node a (n |*| (Forest ys))]

(good thing we don't have commutativity or inverses, those make free representations far from trivial...)

Then, the paper applies the Cayley representation twice, to the two monoidal structures.

However, if we do this naively, we do not get a good representation: we want to represent a near-semiring, and therefore the whole near-semiring structure must be taken into account and not just one chosen monoid structure. [...] [W]e obtain the semiring of endomorphisms over endomorphisms DC(N):

newtype DC n = DC{ unDC :: Endo (Endo n) }

instance (Monoid n) => NearSemiring (DC n) where
    f |*| g = DC $ unDC f `mappend` unDC g
    one = DC mempty
    f |+| g = DC $ Endo $ \h -> appEndo (unDC f) h `mappend` h
    zero = DC $ Endo $ const mempty

(I've changed the implementation here slightly from the paper to emphasize that we are using the Endo structure twice). When we'll generalize this, the two layers will not be the same. The paper then goes on to say:

Note that rep is not a near-semiring homomorphism from N into DC(N) as it does not preserve the unit [...] Nevertheless, [...] the semantics of a computation over a near-semiring will be preserved if we lift values to the representation, do the near-semiring computation there, and then go back to the original near-semiring.

`MonadPlus` is almost a near-semiring

The paper then goes on to reformulate the MonadPlus typeclass so that it corresponds to the near-semiring rules: (mzero, mplus) is monoidal:

m `mplus` mzero = m
mzero `mplus` m = m
m1 `mplus` (m2 `mplus` m3) = (m1 `mplus` m2) `mplus` m3

and it interacts with the monad-monoid as expected:

join mzero = mzero
join (m1 `mplus` m2) = join m1 `mplus` join m2

Or, using binds:

mzero >>= _ = mzero
(m1 `mplus` m2) >>= k = (m1 >>= k) `mplus` (m2 >>= k)

However, these are not the rules of the existing MonadPlus typeclass from base, which are listed as:

mzero >>= _  =  mzero
_ >> mzero   =  mzero

The paper calls MonadPlus instances that satisfy the near-semiring-like laws "nondeterminism monads", and cites Maybe as an example that is a MonadPlus but not a nondeterminism monad, since setting m1 = Just Nothing and m2 = Just (Just False) is a counter-example to join (m1 `mplus` m2) = join m1 `mplus` join m2.

Free and Cayley representation of nondeterminism monads

Putting everything together, on one hand we have the Forest-like free nondeterminism monad:

newtype FreeP f x = FreeP { unFreeP :: [FFreeP f x] }
data FFreeP f x = PureP x | ConP (f (FreeP f x))

instance (Functor f) => Functor (FreeP f) where
    fmap f x = x >>= return . f

instance (Functor f) => Monad (FreeP f) where
    return x = FreeP $ return $ PureP x
    (FreeP xs) >>= f = FreeP (xs >>= g)
      where
        g (PureP x) = unFreeP (f x)
        g (ConP x) = return $ ConP (fmap (>>= f) x)

instance (Functor f) => MonadPlus (FreeP f) where
    mzero = FreeP mzero
    FreeP xs `mplus` FreeP ys = FreeP (xs `mplus` ys)

and on the other, the double-Cayley representation of the two monoidal layers:

newtype (:^=>) f g x = Ran{ unRan :: forall y. (x -> f y) -> g y }
newtype (:*=>) f g x = Exp{ unExp :: forall y. (x -> y) -> (f y -> g y) }

instance Functor (g :^=> h) where
    fmap f m = Ran $ \k -> unRan m (k . f)

instance Functor (f :*=> g) where
    fmap f m = Exp $ \k -> unExp m (k . f)

newtype DCM f x = DCM {unDCM :: ((f :*=> f) :^=> (f :*=> f)) x}

instance Monad (DCM f) where
    return x = DCM $ Ran ($x)
    DCM (Ran m) >>= f = DCM $ Ran $ \g -> m $ \a -> unRan (unDCM (f a)) g

instance MonadPlus (DCM f) where
    mzero = DCM $ Ran $ \k -> Exp (const id)
    mplus m n = DCM $ Ran $ \sk -> Exp $ \f fk -> unExp (a sk) f (unExp (b sk) f fk)
      where
        DCM (Ran a) = m
        DCM (Ran b) = n

caylize :: (Monad m) => m a -> DCM m a
caylize x = DCM $ Ran $ \g -> Exp $ \h m -> x >>= \a -> unExp (g a) h m

-- I wish I called it DMC earlier...
runDCM :: (MonadPlus m) => DCM m a -> m a
runDCM m = unExp (f $ \x -> Exp $ \h m -> return (h x) `mplus` m) id mzero
  where
    DCM (Ran f) = m

The paper gives the following example of a computation running in a nondeterminism monad that will behave poorly for FreeP:

anyOf :: (MonadPlus m) => [a] -> m a
anyOf [] = mzero
anyOf (x:xs) = anyOf xs `mplus` return x

Indeed, while

length $ unFreeP (anyOf [1..100000] :: FreeP Identity Int)

takes ages, the Cayley-transformed version

length $ unFreeP (runDCM $ anyOf [1..100000] :: FreeP Identity Int)

returns instantly.