list, recursion, f#, tail-recursion, unfold

custom unfold returning the accumulator


I'm trying to create a custom unfold function that returns its last accumulator value like:

val unfold' : generator:('State -> ('T * 'State) option) -> state:'State -> 'T list * 'State

I managed to make the following:

let unfold' generator state =
    let rec loop resultList state =
        match generator state with
        | Some (value, state) -> loop (value :: resultList) state
        | None -> (List.rev resultList), state
    loop [] state

But I would like to avoid calling List.rev on the resulting list and instead generate it in the correct order from the start. I imagine it would be necessary to use continuations to build the list, but I'm quite new to functional programming and have not yet managed to wrap my mind around continuations. All the alternatives I can imagine would either put the accumulator inside the resulting list or not allow it to be returned by the function.

Is there some way to do this?

As this is a personal learning exercise I would prefer an answer explaining how to do it instead of simply giving the completed code.


Solution

  • The way to do without List.rev is to pass a function in place of the resultList parameter. Let's call that function buildResultList. At each step, this function takes the already-built tail of the list, prepends the current item, and passes the result to the function from the previous step, which prepends the previous item, passes that to the function from the step before it, and so on. The function from the very first step prepends the very first item to the list. The result of the whole recursive loop is the last function of the chain (it calls all the previous ones), which you then call with an empty list as the argument. I'm afraid this is as clear as I can go without just writing the code.
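
    For readers who do want to see it spelled out, the chain described above can be sketched roughly as follows. The names unfoldCps and countTo5 are made up for this illustration, and starting the chain with the identity function id is an assumption about how to seed it.

    ```fsharp
    // A minimal sketch of the continuation-passing version (hypothetical name).
    let unfoldCps generator state =
        // buildResultList takes the already-built tail and returns the full list.
        let rec loop buildResultList state =
            match generator state with
            | Some (value, nextState) ->
                // Prepend this step's value to the tail, then hand the result
                // to the function from the previous step.
                loop (fun tail -> buildResultList (value :: tail)) nextState
            | None ->
                // No more items: call the last function of the chain with an
                // empty list and return the final state alongside the result.
                buildResultList [], state
        loop id state

    // Hypothetical generator: counts from the state up to 5.
    let countTo5 n = if n > 5 then None else Some (n, n + 1)

    let cpsItems, cpsFinalState = unfoldCps countTo5 1
    // cpsItems = [1; 2; 3; 4; 5], cpsFinalState = 6
    ```

    Note that the chain of closures is only unwound when the generator returns None, so nothing is saved compared to the List.rev version.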

    However, the thing is, this wouldn't be any better, for any definition of "better". Since the computation progresses "forward" and the resulting list is built "backward" (head :: tail, Lisp-style), you have to accumulate the result somewhere. In your code, you're accumulating it in a temporary list, but if you modify it to use continuations, you'll be accumulating it on the heap as a series of closures that reference each other in a chain. One could argue that it would be, in essence, the same list, only obfuscated.

    Another approach you could try is to use a lazy sequence instead: build a recursive seq computation, which will yield the current item and then yield! itself. You can then enumerate this sequence, and it won't require any "temporary" storage. However, if you still want to get a list at the end, you'll have to convert the sequence to a list via List.ofSeq, and guess how that's going to be implemented? Theoretically, from a purely mathematical standpoint, List.ofSeq would be implemented in exactly the same way: by building a temp list first and then reversing it. But the F# library cheats here: it builds the list in a mutable way, so it doesn't have to reverse.
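
    A rough sketch of that recursive seq computation is below. The names unfoldSeq and countTo4 are hypothetical, and this version only produces the items: it makes no attempt to hand back the final state, which would need extra plumbing.

    ```fsharp
    // A minimal sketch of a lazy, recursive sequence (hypothetical name).
    let rec unfoldSeq generator state =
        seq {
            match generator state with
            | Some (value, nextState) ->
                yield value                          // the current item
                yield! unfoldSeq generator nextState // then the rest, lazily
            | None -> ()                             // end of the sequence
        }

    // Hypothetical generator: counts from the state up to 4.
    let countTo4 n = if n > 4 then None else Some (n, n + 1)

    // Converting to a list is where the library does its mutable trick.
    let seqItems = unfoldSeq countTo4 1 |> List.ofSeq
    // seqItems = [1; 2; 3; 4]
    ```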

    And finally, since this is a learning exercise, you could also implement the equivalent of a lazy sequence yourself. Now, the standard .NET sequences (aka IEnumerable<_>, which is what seq<_> is an alias for) are inherently mutable: you're changing the internal state of the iterator every time you move to the next item. You can do that, or, in the spirit of learning, you can do an immutable equivalent. That would be almost like a list (i.e. head::tail), except that, since it's lazy, the "tail" has to be a promise rather than the actual sequence, so:

    type LazySeq<'t> = LazySeq of (unit -> LazySeqStep<'t>)
    and LazySeqStep<'t> = Empty | Cons of head: 't * tail: LazySeq<'t>
    

    The way to enumerate is to invoke the function, which returns either Empty or the next item plus the tail of the sequence. Then you can write your unfold as a function that returns the current item as the head and itself, wrapped in a closure, as the tail. It turns out pretty simple, actually.
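
    One possible shape for this, as a sketch only: the type definitions are repeated so the block stands alone, and unfoldLazy, iter and upTo3 are names invented for the example. As with the seq version, the final state is not surfaced here.

    ```fsharp
    type LazySeq<'t> = LazySeq of (unit -> LazySeqStep<'t>)
    and LazySeqStep<'t> = Empty | Cons of head: 't * tail: LazySeq<'t>

    // Unfold: each step is a closure that, when invoked, yields the current
    // item as the head and a not-yet-evaluated unfold of the rest as the tail.
    let rec unfoldLazy generator state =
        LazySeq (fun () ->
            match generator state with
            | Some (value, nextState) -> Cons (value, unfoldLazy generator nextState)
            | None -> Empty)

    // Enumerate: invoke the function, act on the head, continue with the tail.
    let rec iter action (LazySeq next) =
        match next () with
        | Cons (head, tail) ->
            action head
            iter action tail
        | Empty -> ()

    // Hypothetical generator: counts from the state up to 3.
    let upTo3 n = if n > 3 then None else Some (n, n + 1)

    // Prints 1, 2 and 3 on separate lines.
    unfoldLazy upTo3 1 |> iter (printfn "%d")
    ```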