clojuretransducer

What does sequence do with a transducer?


Two related questions about sequence:

Given a transducer, e.g. (def xf (comp (filter odd?) (map inc))),

  1. What's the relationship between (into [] xf (range 10)) or (into () xf (range 10)), and (sequence xf (range 10))? Is it just that there's no syntax for a lazy sequence that can be used as the second argument of into, so we need a separate function sequence for this purpose? (I know that sequence has another, non-transducer use, coercing a collection into a sequence of one kind or another.)

  2. The Clojure transducers page says, about uses of sequence like the one above,

The resulting sequence elements are incrementally computed. These sequences will consume input incrementally as needed and fully realize intermediate operations. This behavior differs from the equivalent operations on lazy sequences.

To me that sounds as if sequence doesn't return a lazy sequence, yet the docstring for sequence says "When a transducer is supplied, returns a lazy sequence of applications of the transform to the items in coll(s), ....", and in fact (class (sequence xf (range 10))) returns clojure.lang.LazySeq. I think I don't understand the last sentence quoted above from the Clojure transducers page.


Solution

  • (sequence xform from) creates lazy-seq (RT.chunkIteratorSeq) over TransformerIterator to which xform and from are passed. When next value is requested, xform (composition of transformations) is invoked over next value from from.

    This behavior differs from the equivalent operations on lazy sequences.

    What would be equivalent operations on lazy sequences? With your xf as an example, applying filter odd? to (range 10), producing intermediate lazy sequence, and applying map inc to intermediate lazy sequence, producing final lazy sequence as result.

    I would say that (into to xform from) is similar to (into to (sequence xform from)) when from is some collection which does not implement IReduceInit.

    into internally uses (transduce xform conj to from) which does the same as (reduce (xform conj) to from) and at the end clojure.core.protocols/coll-reduce is called:

    (into [] (sequence xf (range 10)))
    ;[2 4 6 8 10]
    (into [] xf (range 10))
    ;[2 4 6 8 10]
    (transduce xf conj [] (range 10))
    ;[2 4 6 8 10]
    (reduce (xf conj) [] (range 10))
    ;[2 4 6 8 10]
    

    I modified a bit your transducer into:

    (defn hof-pr
         "Prints char c on each invocation of function f within higher order function"
     ([hof f c]
       (hof (fn [e] (print c) (f e))))
     ([hof f c coll]
       (hof (fn [e] (print c) (f e)) coll)))
    (def map-inc-pr (partial hof-pr map inc \m))
    (def filter-odd-pr (partial hof-pr filter odd? \f))
    (def xf (comp (filter-odd-pr) (map-inc-pr)))
    

    so that it prints out character on each transformation step.

    Create s1 in REPL as follows:

    (def s1 (into [] xf (range 10)))
    ffmffmffmffmffm
    

    s1 is eagerly evaluated (printed f for filtering and m for mapping). No evaluation when s1 is requested again:

    s1
    [2 4 6 8 10]
    

    Let's create s2:

    (def s2 (sequence xf (range 10)))
    ffm 
    

    Only first item in s2 is evaluated. Next items will be evaluated when requested:

    s2
    ffmffmffmffm(2 4 6 8 10)
    

    Additionally, create s3, old way:

    (def s3 (map-inc-pr (filter-odd-pr (range 10))))
    s3
    ffffffffffmmmmm(2 4 6 8 10)
    

    As you can see, no evaluation when s3 is defined. When s3 is requested, filtering over 10 elements is applied and after that mapping over remaining 5 elements is applied, producing final sequence.