listhaskelloptimization

What is Haskell's Stream Fusion


What is Haskell's Stream Fusion and how do I use it?


Solution

  • The paper that Logan points to is great, but it's a little difficult. (Just ask my students.) It's also a great deal about 'how stream fusion works' and only a fraction 'what stream fusion is and how you can use it'.

    The problem that stream fusion solves is that functional codes as written often allocate intermediate lists, e.g., to create an infinite list of node numbers, you might write

    nodenames = map ("n"++) $ map show [1..]
    

    Naive code would allocate an infinite list of integers [1, 2, 3, ...], an infinite list of strings ["1", "2", "3", ...], and eventually an infinite list of names ["n1", "n2", "n3", ...]. That's too much allocation.

    What stream fusion does is translate a definition like nodenames into something which uses a recursive function that allocates only what is needed for the result. In general, eliminating allocation of intermediate lists is called deforestation.

    To use stream fusion, you need to write non-recursive list functions that use the functions from the stream-fusion library described in GHC ticket 915 (map, foldr, and so on) instead of explicit recursion. This library contains new versions of all the Prelude functions which have been rewritten to exploit stream fusion. Apparently this stuff is slated to make it into the next GHC release (6.12) but is not in the current stable version (6.10). If you want to use the library Porges has a nice simple explanation in his answer.

    If you actually want an explanation of how stream fusion works, post another question---but that's much harder.