haskell arrow-abstraction

What are arrows, and how can I use them?


I tried to learn the meaning of arrows, but I didn't understand them.

I used the Wikibooks tutorial, but I think its main problem is that it seems to be written for somebody who already understands the topic.

Can somebody explain what arrows are and how I can use them?


Solution

  • I don't know of a tutorial, but I think it's easiest to understand arrows if you look at some concrete examples. The biggest problem I had learning how to use arrows was that none of the tutorials or examples actually show how to use arrows, just how to compose them. So, with that in mind, here's my mini-tutorial. I'll examine two different arrows: functions and a user-defined arrow type MyArr.

    -- type representing a computation
    data MyArr b c = MyArr (b -> (c,MyArr b c))
    

    1) An Arrow is a computation from an input of a specified type to an output of a specified type. The Arrow typeclass takes a single argument, the arrow type constructor, which itself takes two type arguments: the input type and the output type. The instance heads therefore name only the constructor:

    instance Arrow (->) where
    instance Arrow MyArr where
    

    The Arrow (either (->) or MyArr) is an abstraction of a computation.

    For a function b -> c, b is the input and c is the output.
    For a MyArr b c, b is the input and c is the output.

    2) To actually run an arrow computation, you use a function specific to your arrow type. For functions you simply apply the function to an argument. For other arrows, there needs to be a separate function (just like runIdentity, runState, etc. for monads).

    -- run a function arrow
    runF :: (b -> c) -> b -> c
    runF = id
    
    -- run a MyArr arrow, discarding the remaining computation
    runMyArr :: MyArr b c -> b -> c
    runMyArr (MyArr step) = fst . step
    

    3) Arrows are frequently used to process a list of inputs. For functions, each input can be processed independently (even in parallel), but for some arrows the output at any given step depends on previous inputs (e.g. keeping a running total of inputs).

    -- run a function arrow over multiple inputs
    runFList :: (b -> c) -> [b] -> [c]
    runFList f = map f
    
    -- run a MyArr over multiple inputs.
    -- Each step of the computation gives the next step to use
    runMyArrList :: MyArr b c -> [b] -> [c]
    runMyArrList _ [] = []
    runMyArrList (MyArr step) (b:bs) = let (this, step') = step b
                                       in this : runMyArrList step' bs
    

    This is one reason Arrows are useful. They provide a computation model that can implicitly make use of state without ever exposing that state to the programmer. The programmer can use arrowized computations and combine them to create sophisticated systems.

    Here's a MyArr that keeps count of the number of inputs it has received:

    -- count the number of inputs received:
    count :: MyArr b Int
    count = count' 0
      where
        count' n = MyArr (\_ -> (n+1, count' (n+1)))
    

    Now the function runMyArrList count will take a list of length n as input and return a list of Ints from 1 to n.
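
    To make this concrete, here is a small self-contained sketch that runs count and also the running total mentioned in point 3. The definitions of MyArr, runMyArrList and count are repeated from above; the name total is my own addition for illustration, not something from a library.

    ```haskell
    -- type representing a computation (repeated from above)
    data MyArr b c = MyArr (b -> (c, MyArr b c))

    -- run a MyArr over multiple inputs, threading the next step along
    runMyArrList :: MyArr b c -> [b] -> [c]
    runMyArrList _ [] = []
    runMyArrList (MyArr step) (b:bs) =
      let (this, step') = step b
      in this : runMyArrList step' bs

    -- count the number of inputs received (repeated from above)
    count :: MyArr b Int
    count = count' 0
      where
        count' n = MyArr (\_ -> (n + 1, count' (n + 1)))

    -- hypothetical example: running total of the inputs seen so far
    total :: Num a => MyArr a a
    total = go 0
      where
        go acc = MyArr (\x -> let acc' = acc + x in (acc', go acc'))

    main :: IO ()
    main = do
      print (runMyArrList count "abc")                -- [1,2,3]
      print (runMyArrList total [10, 20, 30 :: Int])  -- [10,30,60]
    ```

    In both cases the accumulator lives inside the closure returned at each step, so the caller never sees or passes the state.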

    Note that we still haven't used any "arrow" functions, that is either Arrow class methods or functions written in terms of them.

    4) Most of the code above is specific to each Arrow instance[1]. Everything in Control.Arrow (and Control.Category) is about composing arrows to make new arrows. If we pretend that Category is part of Arrow instead of a separate class:

    -- combine two arrows in sequence
    (>>>) :: Arrow a => a b c -> a c d -> a b d
    
    -- the function arrow instance
    -- (>>>) :: (b -> c) -> (c -> d) -> (b -> d)
    -- this is just flip (.)
    
    -- MyArr instance
    -- (>>>) :: MyArr b c -> MyArr c d -> MyArr b d
    

    The >>> function takes two arrows and uses the output of the first as input to the second.
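
    The instances for MyArr are not shown in this answer, so the following is only a sketch of one plausible way to write them, assuming the MyArr definition from the top. Composition runs the first arrow's step, feeds its output to the second arrow's step, and composes the two continuations.

    ```haskell
    import Prelude hiding (id, (.))
    import Control.Category
    import Control.Arrow

    data MyArr b c = MyArr (b -> (c, MyArr b c))

    -- assumed instance: identity and sequential composition
    instance Category MyArr where
      id = MyArr (\b -> (b, id))
      MyArr g . MyArr f = MyArr $ \b ->
        let (c, f') = f b
            (d, g') = g c
        in (d, g' . f')

    -- assumed instance: lift a pure function, and act on the first
    -- component of a pair while passing the second through unchanged
    instance Arrow MyArr where
      arr f = MyArr (\b -> (f b, arr f))
      first (MyArr f) = MyArr $ \(b, x) ->
        let (c, f') = f b
        in ((c, x), first f')

    runMyArrList :: MyArr b c -> [b] -> [c]
    runMyArrList _ [] = []
    runMyArrList (MyArr step) (b:bs) =
      let (this, step') = step b
      in this : runMyArrList step' bs

    count :: MyArr b Int
    count = count' 0
      where
        count' n = MyArr (\_ -> (n + 1, count' (n + 1)))

    main :: IO ()
    main = do
      -- function arrows: add one, then double
      print (((+ 1) >>> (* 2)) (5 :: Int))               -- 12
      -- MyArr: count the inputs, then double each count
      print (runMyArrList (count >>> arr (* 2)) "abc")   -- [2,4,6]
    ```

    Prelude's id and (.) are hidden here so that the Category versions can be used inside the instance without qualification.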

    Here's another operator, commonly called "fanout":

    -- &&& applies two arrows to a single input in parallel
    (&&&) :: Arrow a => a b c -> a b c' -> a b (c,c')
    
    -- function instance type
    -- (&&&) :: (b -> c) -> (b -> c') -> (b -> (c,c'))
    
    -- MyArr instance type
    -- (&&&) :: MyArr b c -> MyArr b c' -> MyArr b (c,c')
    
    -- first and second omitted for brevity, see the accepted answer from KennyTM's link
    -- for further details.
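
    For a quick feel of the omitted combinators, here is a minimal sketch that uses them on plain functions only. All four names come from Control.Arrow; the sample values are arbitrary.

    ```haskell
    import Control.Arrow (first, second, (***), (&&&))

    main :: IO ()
    main = do
      -- first: apply to the first component, leave the second alone
      print (first (+ 1) (1 :: Int, "x"))               -- (2,"x")
      -- second: apply to the second component, leave the first alone
      print (second length (1 :: Int, "abc"))           -- (1,3)
      -- ***: apply one arrow to each component of a pair
      print (((+ 1) *** (* 2)) (1 :: Int, 5 :: Int))    -- (2,10)
      -- &&&: feed one input to both arrows
      print (((+ 1) &&& (* 2)) (5 :: Int))              -- (6,10)
    ```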
    

    Since Control.Arrow provides a means to combine computations, here's one example:

    -- function that, given an input n, returns "n+1" and "n*2"
    calc1 :: Int -> (Int,Int)
    calc1 = (+1) &&& (*2)
    

    I've frequently found functions like calc1 useful in complicated folds, or functions that operate on pointers for example.
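
    As an illustration of the fold use case (my own examples, not taken from any particular codebase), the sketch below uses &&& to compute two summaries of a list at once and *** to update a pair of accumulators inside a foldr. The names mean and minMax are hypothetical.

    ```haskell
    import Control.Arrow ((&&&), (***), (>>>))

    -- sum and length computed side by side, then divided
    mean :: [Double] -> Double
    mean = (sum &&& (fromIntegral . length)) >>> uncurry (/)

    -- each element updates the (minimum, maximum) pair in one step;
    -- an empty list yields (maxBound, minBound)
    minMax :: [Int] -> (Int, Int)
    minMax = foldr (\x -> min x *** max x) (maxBound, minBound)

    main :: IO ()
    main = do
      print (mean [1, 2, 3, 4])        -- 2.5
      print (minMax [3, 1, 4, 1, 5])   -- (1,5)
    ```

    Note that mean still traverses the list twice (once for sum, once for length); the arrow combinators only tidy up the plumbing.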

    The Monad type class provides us with a means to combine monadic computations into a single new monadic computation using the >>= function. Similarly, the Arrow class provides us with means to combine arrowized computations into a single new arrowized computation using a few primitive functions (first, arr, and ***, with >>> and id from Control.Category). Also similar to Monads, the question of "What does an arrow do?" can't be generally answered. It depends on the arrow.
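
    To see "it depends on the arrow" in practice, here is a sketch of one combinator written only against the Arrow class and then run with both arrows from this answer. The MyArr instances are my assumption of how they could be written, and addA and total are hypothetical names introduced for the example.

    ```haskell
    import Prelude hiding (id, (.))
    import Control.Category
    import Control.Arrow

    data MyArr b c = MyArr (b -> (c, MyArr b c))

    -- assumed instances for MyArr
    instance Category MyArr where
      id = MyArr (\b -> (b, id))
      MyArr g . MyArr f = MyArr $ \b ->
        let (c, f') = f b
            (d, g') = g c
        in (d, g' . f')

    instance Arrow MyArr where
      arr f = MyArr (\b -> (f b, arr f))
      first (MyArr f) = MyArr $ \(b, x) ->
        let (c, f') = f b
        in ((c, x), first f')

    runMyArrList :: MyArr b c -> [b] -> [c]
    runMyArrList _ [] = []
    runMyArrList (MyArr step) (b:bs) =
      let (this, step') = step b
      in this : runMyArrList step' bs

    count :: MyArr b Int
    count = count' 0
      where
        count' n = MyArr (\_ -> (n + 1, count' (n + 1)))

    -- hypothetical: running total of the inputs
    total :: MyArr Int Int
    total = go 0
      where
        go acc = MyArr (\x -> let acc' = acc + x in (acc', go acc'))

    -- hypothetical: works for any Arrow, adds the outputs of two arrows
    addA :: Arrow a => a b Int -> a b Int -> a b Int
    addA f g = (f &&& g) >>> arr (uncurry (+))

    main :: IO ()
    main = do
      -- as a function: (5 + 1) + (5 * 2)
      print (addA (+ 1) (* 2) 5)                            -- 16
      -- as a MyArr: count so far plus running total so far
      print (runMyArrList (addA count total) [10, 20, 30])  -- [11,32,63]
    ```

    The same addA is stateless when instantiated at functions and stateful when instantiated at MyArr; the combinator itself knows nothing about either.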

    Unfortunately I don't know of many examples of arrow instances in the wild. Functions and FRP seem to be the most common applications. HXT is the only other significant usage that comes to mind.

    [1] Except count. It's possible to write a count function that does the same thing for any instance of ArrowLoop.