I'm working through Ullmans Elements of ML programming in my spare-time. End goal is to self-study Andrew Appels Modern Compiler Implementation in ML.
In Elements of ML, Ullman describes the difference list:
There is a trick known to LISP programmers as difference lists, in which one manipulates lists more efficiently by keeping, as an extra parameter of your function, a list that represents in some way what you have already accomplished. The idea comes up in a number of different applications;
Ullman uses reverse
as an example of the difference list technique. Here is a slow function that runs in O(n^2).
fun reverse nil = nil
| reverse (x::xs) = reverse(xs) @ [x]
And the faster one using a difference list
fun rev1(nil, M) = M
| rev1(x::xs, ys) = rev1(xs, x::ys)
fun reverse L = rev1(L, nil)
I have this Binary Search Tree (BST) data type.
datatype 'a btree = Empty
| Node of 'a * 'a btree * 'a btree
A naive solution for collecting a list of the elements in pre-order would be
fun preOrder Empty = nil
| preOrder (Node(x, left, right)) = [x] @ preOrder left @ preOrder right
But Ullman points out that the @ operator is slow and suggests in exercise 6.3.5 that I implement preOrder
using a difference list.
After some head scratching I came up with this function:
fun preOrder tree = let
fun pre (Empty, L) = L
| pre (Node(x, left, right), L) = let
val L = pre(right, L)
val L = pre(left, L)
in
x::L
end
in
pre (tree, nil)
end
It outputs the elements in pre-order. BUT it evaluates the tree in post-order! And the code is uglier than the naive preOrder
one.
> val t = Node(5,
Node(3,
Node(1, Empty, Empty),
Node(4, Empty, Empty)),
Node(9, Empty, Empty))
> preOrder t
val it = [5,3,1,4,9] : int list
I tried searching for references to difference lists in ML programming, and found John Hughes original article describing how to use difference lists for reverse.
I also found Matthew Brecknells difference list blog post with examples in Haskell. He makes a distinction between using an accumulator, like Ullmans reverse example and creating a new type for difference lists. He also presents a tree flattener. But I have a hard time understanding the Haskell code and would appreciate a similar expose but in Standard ML. abc
How implement a function that actually evaluate the tree in pre-order and collects the elements in pre-order? Do I have to reverse the list after my traversal? Or is there some other trick?
How can I generalize this technique to work for in-order and post-order traversal?
What is the idiomatic way for using a difference list for a BST algorithm?
Your eventual method of doing this is is the best it reasonably gets. The nice way to do this turns out to be
fun preOrderHelper (Empty, lst) = lst
| preOrderHelper (Node(x, left, right), lst) =
x :: preOrderHelper(left, preOrderHelper(right, lst))
fun preOrder tree = preOrderHelper(tree, Nil)
Note that the run time of preOrderHelper(tree, list)
is only a function of tree
. Call r(t)
the run time of preOrderHelper
on tree t
. Then we have r(Empty) = O(1)
and r(Node(x, left, right)) = O(1) + r(left) + r(right)
, so clearly r(t)
is linear in the size of t
.
What is the derivation of this technique? Is there a more principled way of deriving it? In general, when you're turning a data structure into a list, you want to foldr
onto an empty list. I don't know enough ML to say what the equivalent of typeclasses is, but in Haskell, we would approach the situation as follows:
data Tree a = Empty | Node a (Tree a) (Tree a)
instance Foldable Tree where
foldr f acc t = foldrF t acc where
foldrF Empty acc = acc
foldrF (Node x left right) acc = f x (foldrF left (foldrF right acc))
To convert a Tree a
to a [a]
, we would call Data.Foldable.toList
, which is defined in Data.Foldable
as
toList :: Foldable f => f a -> [a]
toList = foldr (:) []
Unfolding this definition gives us the equivalent of the ML definition above.
As you can see, your technique is actually a special case of a very principled way to turn data structures into lists.
In fact, in modern Haskell, we can do this totally automatically.
{-# LANGUAGE DeriveFoldable #-}
data Tree a = Empty | Node a (Tree a) (Tree a) deriving Foldable
will give us the equivalent(*) of the above Foldable
implementation automatically, and we can then immediately use toList
. I don't know what the equivalent is in ML, but I'm sure there's something analogous.
The difference between ML and Haskell is that Haskell is lazy. Haskell's laziness means that the evaluation of preOrder
actually walks the tree in the pre-Order order. This is one of the reasons I prefer laziness. Laziness permits very fine-grained control over the order of evaluation without resorting to non-functional techniques.
(*) (up to the arguments order, which does not count in the lazy Haskell.)