
Spliterator vs Stream.Builder


I read some questions about how to create a finite Stream ("Finite generated Stream in Java - how to create one?", "How do streams stop?").

The answers suggested implementing a Spliterator. The Spliterator would contain the logic for how to provide the next element, and which one (tryAdvance). But there are other non-default methods that I would have to implement as well: trySplit and estimateSize(), plus characteristics().

The JavaDoc of Spliterator says:

An object for traversing and partitioning elements of a source. The source of elements covered by a Spliterator could be, for example, an array, a Collection, an IO channel, or a generator function. ... The Spliterator API was designed to support efficient parallel traversal in addition to sequential traversal, by supporting decomposition as well as single-element iteration. ...

On the other hand, I could wrap the logic for advancing to the next element around a Stream.Builder and bypass the Spliterator entirely. On every advance I would call accept or add, and at the end build. That looks quite simple.
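
The builder-based approach described here might look like the following minimal sketch. The countdown source and the names BuilderCountdown and countdown are assumptions made purely for illustration; they are not part of any library.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class BuilderCountdown {

    // Eagerly pushes start, start-1, ..., 1 into a Stream.Builder.
    // Every element is already stored in the builder when build() is called.
    static Stream<Integer> countdown(int start) {
        Stream.Builder<Integer> builder = Stream.builder();
        for (int i = start; i > 0; i--) {
            builder.accept(i); // add(i) does the same but returns the builder for chaining
        }
        return builder.build();
    }

    public static void main(String[] args) {
        List<Integer> result = countdown(3).collect(Collectors.toList());
        System.out.println(result); // prints [3, 2, 1]
    }
}
```

The loop runs to completion before any stream operation sees an element, so this only suits sources that are small and finite.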

What does the JavaDoc of Stream.Builder say?

A mutable builder for a Stream. This allows the creation of a Stream by generating elements individually and adding them to the Builder (without the copying overhead that comes from using an ArrayList as a temporary buffer.)

Using StreamSupport.stream I can obtain a Stream from a Spliterator, and a Builder will also provide a Stream.

When should / could I use a Stream.Builder?
Only if a Spliterator wouldn't be more efficient (for instance because the source cannot be partitioned and its size cannot be estimated)?


Solution

  • Note that you can extend Spliterators.AbstractSpliterator. Then, there is only tryAdvance to implement.

    So the complexity of implementing a Spliterator is not higher.

    The fundamental difference is that a Spliterator’s tryAdvance method is only invoked when a new element is needed. In contrast, the Stream.Builder has internal storage which must be filled with all stream elements before you can acquire a Stream.

    So a Spliterator is the first choice for all kinds of lazy evaluations, as well as when you have an existing storage you want to traverse, to avoid copying the data.
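
    The lazy behaviour described above can be sketched with Spliterators.AbstractSpliterator, where only tryAdvance is implemented. The class LazyCountdown, its countdown method and the produced counter are hypothetical names chosen for this illustration; the counter exists only to make the number of tryAdvance calls visible.

    ```java
    import java.util.Spliterator;
    import java.util.Spliterators;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.function.Consumer;
    import java.util.stream.Stream;
    import java.util.stream.StreamSupport;

    final class LazyCountdown {

        // Counts the elements handed out by tryAdvance, to make laziness observable.
        static final AtomicInteger produced = new AtomicInteger();

        static Stream<Integer> countdown(int start) {
            // Long.MAX_VALUE signals an unknown size; ORDERED declares an encounter order.
            Spliterator<Integer> sp = new Spliterators.AbstractSpliterator<Integer>(
                    Long.MAX_VALUE, Spliterator.ORDERED) {
                private int next = start;

                @Override
                public boolean tryAdvance(Consumer<? super Integer> action) {
                    if (next <= 0) return false; // no more elements: the stream ends here
                    produced.incrementAndGet();
                    action.accept(next--);
                    return true;
                }
            };
            return StreamSupport.stream(sp, false); // false = sequential stream
        }

        public static void main(String[] args) {
            Integer first = countdown(1_000_000).findFirst().get();
            // prints "1000000 after producing 1 element(s)"
            System.out.println(first + " after producing " + produced.get() + " element(s)");
        }
    }
    ```

    With a Stream.Builder, all one million elements would have to be added before findFirst could run; here the short-circuiting operation stops after the first call that yields an element.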

    The builder is the first choice when the creation of the elements is non-uniform, so you can’t express the creation of an element on demand. Think of situations where you would otherwise use Stream.of(…), but it turns out to be too inflexible.

    E.g. you have Stream.of(a, b, c, d, e), but now it turns out that c and d are optional. So the solution is:

    Stream.Builder<MyType> builder = Stream.builder();
    builder.add(a).add(b);
    if (someCondition) builder.add(c).add(d); // c and d are added only when needed
    builder.add(e).build()
       /* stream operations */
    

    Other use cases are this answer, where a Consumer was needed to query an existing spliterator and push the value back to a Stream afterwards, or this answer, where a structure without random access (a class hierarchy) should be streamed in the opposite order.
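
    As an illustration of the last case, here is one way a superclass chain could be streamed root-first with a Stream.Builder. This is a sketch under my own assumptions, not necessarily the code of the linked answer; HierarchyStream, rootFirst and addRootFirst are hypothetical names.

    ```java
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    final class HierarchyStream {

        // Streams the superclass chain starting at the root: Object, ..., type.
        static Stream<Class<?>> rootFirst(Class<?> type) {
            Stream.Builder<Class<?>> builder = Stream.builder();
            addRootFirst(type, builder);
            return builder.build();
        }

        // Recurses up to the root before adding, so superclasses are added first.
        private static void addRootFirst(Class<?> type, Stream.Builder<Class<?>> builder) {
            if (type == null) return; // getSuperclass() of Object returns null
            addRootFirst(type.getSuperclass(), builder);
            builder.accept(type);
        }

        public static void main(String[] args) {
            List<String> names = rootFirst(Integer.class)
                    .map(Class::getSimpleName)
                    .collect(Collectors.toList());
            System.out.println(names); // prints [Object, Number, Integer]
        }
    }
    ```

    The hierarchy can only be walked from subclass to superclass, so an on-demand tryAdvance has no cheap way to produce the root first. The recursion reaches the root and then fills the builder on the way back.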