haskellfunctional-programmingfoldcategory-theoryrecursion-schemes

Recursion scheme allowing dependencies between recursive calls (an ordered catamorphism?)


I'm interested in a higher-order way (recursion scheme) to write recursive code in which there might be dependencies between recursive calls.

As a simplified example, consider a function that traverses a tree of integers, checking if the sum is less than some value. We could sum the entire tree and compare it against the maximum. Alternatively, we could keep a running sum and short-circuit as soon as we've exceeded the maximum:

data Tree = Leaf Nat | Node Tree Tree

sumLT :: Nat -> Tree -> Bool
sumLT max t = sumLT' max t > 0

sumLT' :: Nat -> Tree -> Int
sumLT' max (Leaf n) = max - n
sumLT' max (Node l r) = 
  let max' = sumLT' max l
   in if max' > 0 then sumLT' max' r else 0

Is there a way to express this idea -- essentially an ordered traversal -- with a recursion scheme? I'm interested in as-generically-as-possible composing ordered traversals like this.

Ideally, I'd like some way to write traversals in which the function being folded (or unfolded) over the data structure determines the order of traversal. Whatever abstraction I end up with, I'd like to be able to also write the reverse-ordered version of the sumLT' traversal above, where we go right-to-left instead:

sumLT'' :: Nat -> Tree -> Int
sumLT'' max (Leaf n) = max - n
sumLT'' max (Node l r) = 
  let max' = sumLT'' max r
   in if max' > 0 then sumLT'' max' l else 0

Solution

  • As usual, folding into endofunctions gives you a notion of processing order / state passing:

    import Numeric.Natural
    
    data Tree = Leaf Natural | Node Tree Tree
    
    cata :: (Natural -> r) -> (r -> r -> r) -> Tree -> r
    cata l n (Leaf a)     = l a
    cata l n (Node lt rt) = n (cata l n lt) (cata l n rt)
    
    sumLT :: Natural -> Tree -> Bool
    sumLT max t = cata (\ a max -> max - a)     -- left-to-right
                       (\ l r max -> let max' = l max in
                            if max' > 0 then r max' else 0)
                       t max > 0
    
    sumLT' :: Natural -> Tree -> Bool
    sumLT' max t = cata (\ a max -> max - a)     -- right-to-left
                        (\ l r max -> let max' = r max in
                             if max' > 0 then l max' else 0)
                        t max > 0
    

    Trying it out:

    > sumLT 11 (Node (Leaf 10) (Leaf 0))
    True
    
    > sumLT 11 (Node (Leaf 10) (Leaf 1))
    False
    
    > sumLT 11 (Node (Leaf 10) (Leaf undefined))
    *** Exception: Prelude.undefined
    
    > sumLT 11 (Node (Leaf 11) (Leaf undefined))
    False
    
    > sumLT 11 (Node (Leaf 10) (Node (Leaf 1) (Leaf undefined)))
    False
    
    > sumLT' 11 (Node (Leaf undefined) (Leaf 11))
    False
    

    As is also usual, Haskell's laziness provides for the ability to short-circuit / exit early. As can be seen in the examples, if cata's second argument, the node-folding function, does not demand the value of one of its arguments, the corresponding branch is not actually visited at all.