haskellrecursion-schemes

How to use recursion-schemes to `cata` two mutually-recursive types?


I started with this type for leaf-valued trees with labeled nodes:

type Label = String
data Tree a = Leaf Label a 
            | Branch Label [Tree a]

I have some folds I'd like to write over this tree, and they all take the form of catamorphisms, so let's let recursion-schemes do the recursive traversal for me:

{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable, TemplateHaskell, TypeFamilies #-}
import Data.Functor.Foldable.TH (makeBaseFunctor)
import Data.Functor.Foldable (cata)

type Label = String
data Tree a = Leaf Label a 
            | Branch Label [Tree a]
makeBaseFunctor ''Tree

allLabels :: Tree a -> [Label]
allLabels = cata go
  where go (LeafF l _) = [l]
        go (BranchF l lss) = l : concat lss

And all is well: we can traverse a tree:

λ> allLabels (Branch "root" [(Leaf "a" 1), Branch "b" [Leaf "inner" 2]])
["root","a","b","inner"]

But that definition of Tree is a little clunky: each data constructor needs to handle the Label separately. For a small structure like Tree this isn't too bad, but with more constructors it would be quite a nuisance. So let's make the labeling its own layer:

data Node' a = Leaf' a
             | Branch' [Tree' a]
data Labeled a = Labeled Label a
newtype Tree' a = Tree' (Labeled (Node' a))
makeBaseFunctor ''Tree'
makeBaseFunctor ''Node'

Great, now our Node type represents the structure of a tree without labels, and Tree' and Labeled conspire to decorate it with labels. But I no longer know how to use cata with these types, even though they are isomorphic to the original Tree type. makeBaseFunctor doesn't see any recursion, so it just defines base functors that are identical to the original types:

$ stack build --ghc-options -ddump-splices
...
newtype Tree'F a r = Tree'F (Labeled (Node' a))
...
data Node'F a r = Leaf'F a | Branch'F [Tree' a]

Which like, fair enough, I don't know what I'd want it to generate either: cata expects a single type to pattern-match on, and of course it can't synthesize one that's a combination of two of my types.

So what's the plan here? Is there some adaptation of cata that works here if I define my own Functor instances? Or a better way to define this type that avoids duplicate handling of Label but still is self-recursive instead of mutually recursive?

I think this question is probably related to Recursion schemes with several types, but I don't understand the answer there: Cofree is so far mysterious to me, and I can't tell whether it's essential to the problem or just a part of the representation used; and the types in that question are not quite mutally-recursive, so I don't know how to apply the solution there to my types.


Solution

  • One answer to the linked question mentions adding an extra type parameter, so that instead of Tree (Labeled a) we use Tree Labeled a:

    type Label = String
    data Labeled a = Labeled Label a deriving Functor
    data Tree f a = Leaf (f a)
                  | Branch (f [Tree f a])
    

    This way, a single type (Tree) is responsible for the recursion, and so makeBaseFunctor should recognize the recursion and abstract it over a functor. And it does do that, but the instances it generates aren't quite right. Looking at -ddump-splices again, I see that makeBaseFunctor ''Tree produces:

    data TreeF f a r = LeafF (f a) | BranchF (f [r]) deriving (Functor, Foldable, Traversable)
    type instance Base (Tree f a) = TreeF f a
    instance Recursive (Tree f a) where
      project (Leaf x) = LeafF x
      project (Branch x) = BranchF x
    instance Corecursive (Tree f a) where
      embed (LeafF x) = Leaf x
      embed (BranchF x) = Branch x
    

    but this doesn't compile, because the Recursive and Corecursive instances are only correct when f is a functor. In the worst case, I can copy the splices into my file directly and add the constraint myself:

    data TreeF f a r = LeafF (f a) | BranchF (f [r]) deriving (Functor, Foldable, Traversable)
    type instance Base (Tree f a) = TreeF f a
    instance Functor f => Recursive (Tree f a) where
      project (Leaf x) = LeafF x
      project (Branch x) = BranchF x
    instance Functor f => Corecursive (Tree f a) where
      embed (LeafF x) = Leaf x
      embed (BranchF x) = Branch x
    

    After which I can use cata in a way very similar to the original version in my question:

    allLabels :: Tree Labeled a -> [Label]
    allLabels = cata go
      where go (LeafF (Labeled l _)) = [l]
            go (BranchF (Labeled l lss)) = l : concat lss
    

    However, dfeuer explains in a (now-deleted) comment that recursion-schemes has a facility already for saying "please generate the base functor as you normally would, but include this constraint in the generated class instances". So, you can write

    makeBaseFunctor [d| instance Functor f => Recursive (Tree f a) |]
    

    to generate the same instances that I produced above by hand-editing the splices.