
Difference between Iterator and Spliterator in Java 8


While studying, I learned that parallelism is the main advantage of Spliterator.

This may be a basic question, but can anyone explain the main differences between Iterator and Spliterator to me and give some examples?


Solution

  • The names are pretty much self-explanatory, to me. Spliterator == Splittable Iterator: it can split a source, and it can iterate over it too. It offers roughly the same functionality as an Iterator, plus the ability to potentially split itself into multiple pieces; this is what trySplit is for. Splitting is needed for parallel processing.
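    To make the splitting concrete, here is a minimal sketch (the class and method names are made up for illustration). It splits a list's Spliterator once with trySplit and sums each part separately; in real parallel code each part would typically be handed to a different thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

// Hypothetical demo class, not part of any library.
class TrySplitDemo {
    // Sums whatever elements the given Spliterator still covers.
    static int sum(Spliterator<Integer> sp) {
        int[] total = {0};
        sp.forEachRemaining(x -> total[0] += x);
        return total[0];
    }

    // Splits the list's Spliterator once and returns the sum of each part.
    static int[] sumInTwoParts(List<Integer> numbers) {
        Spliterator<Integer> rest = numbers.spliterator();
        // trySplit() hands part of the elements to a new Spliterator,
        // or returns null if this Spliterator cannot be split.
        Spliterator<Integer> prefix = rest.trySplit();
        int a = (prefix == null) ? 0 : sum(prefix);
        int b = sum(rest);
        return new int[] {a, b};
    }

    public static void main(String[] args) {
        int[] parts = sumInTwoParts(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        System.out.println(parts[0] + " + " + parts[1] + " = " + (parts[0] + parts[1]));
    }
}
```

    How the elements are divided is up to the source, but the two partial sums always add up to the full total (36 here), because a split never loses or duplicates elements.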

    An Iterator never exposes a size: you can only traverse elements via hasNext/next. A Spliterator can report the size (which also helps other operations internally): either an exact one via getExactSizeIfKnown or an approximate one via estimateSize.
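    A small sketch of the size reporting, assuming an ArrayList as the source (the class and method names are hypothetical). It also shows what happens when the size is not known: a Spliterator built around a plain Iterator without the SIZED characteristic reports -1 from getExactSizeIfKnown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterators;

// Hypothetical demo class, not part of any library.
class SizeDemo {
    // An ArrayList's Spliterator is SIZED, so the exact size is available.
    static long exactSizeOf(List<String> list) {
        return list.spliterator().getExactSizeIfKnown();
    }

    // A Spliterator wrapped around a bare Iterator has no size information;
    // getExactSizeIfKnown() then returns -1.
    static long exactSizeOfWrapped(Iterator<String> it) {
        return Spliterators.spliteratorUnknownSize(it, 0).getExactSizeIfKnown();
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(exactSizeOf(list));                   // 3
        System.out.println(exactSizeOfWrapped(list.iterator())); // -1
    }
}
```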

    On the other hand, tryAdvance does what hasNext/next do in an Iterator, but as a single method, which is much easier to reason about, IMO. Related to this is forEachRemaining, which in the default implementation delegates to tryAdvance, but implementations are free to override it (see ArrayList for example).
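    For comparison, here are the two traversal styles side by side, as a sketch with hypothetical names. Both methods copy the source into a new list; the Spliterator version needs only one call per element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;

// Hypothetical demo class, not part of any library.
class TraversalDemo {
    // Classic Iterator loop: two calls per element (hasNext, then next).
    static List<String> withIterator(List<String> source) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Spliterator loop: tryAdvance passes the next element to the action
    // and returns false once there is nothing left.
    static List<String> withSpliterator(List<String> source) {
        List<String> out = new ArrayList<>();
        Spliterator<String> sp = source.spliterator();
        while (sp.tryAdvance(out::add)) {
            // nothing to do here, the action already consumed the element
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("x", "y", "z");
        System.out.println(withIterator(source).equals(withSpliterator(source)));
    }
}
```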

    A Spliterator is also a "smarter" Iterator, thanks to its characteristics such as DISTINCT or SORTED (which you need to report correctly when implementing your own Spliterator). These flags are used internally to skip unnecessary operations; see for example this optimization:

     someStream().map(x -> y).count();
    

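    Because map does not change the number of elements, it can be skipped entirely when the source reports a known size (SIZED), since all we do is count. Note that this particular shortcut applies to Java 9 and later; Java 8 still runs the pipeline.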
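    The characteristics themselves can be inspected with hasCharacteristics. The sketch below (hypothetical class and method names) checks two JDK collections: a TreeSet's Spliterator reports SORTED and DISTINCT, while an ArrayList's does not report SORTED.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Spliterator;
import java.util.TreeSet;

// Hypothetical demo class, not part of any library.
class CharacteristicsDemo {
    // True only if the Spliterator reports both SORTED and DISTINCT.
    static boolean isSortedAndDistinct(Spliterator<?> sp) {
        return sp.hasCharacteristics(Spliterator.SORTED | Spliterator.DISTINCT);
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>(Arrays.asList(3, 1, 2));
        ArrayList<Integer> list = new ArrayList<>(Arrays.asList(3, 1, 2));

        System.out.println(isSortedAndDistinct(set.spliterator()));  // true
        System.out.println(isSortedAndDistinct(list.spliterator())); // false
    }
}
```

    A stream built on the TreeSet can therefore treat a later sorted() or distinct() step as already satisfied, which is the kind of internal shortcut these flags enable.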

    You can create a Spliterator around an Iterator if you need to, via:

    Spliterators.spliteratorUnknownSize(yourIterator, properties)
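
    As a fuller sketch of that call (the class and method names are hypothetical), the wrapped Iterator can then be turned into a stream via StreamSupport.stream. ORDERED is passed here on the assumption that the Iterator yields elements in a meaningful order; pass 0 if no characteristics apply.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Hypothetical demo class, not part of any library.
class WrapIteratorDemo {
    // Wraps an Iterator in a Spliterator and processes it as a stream.
    static List<String> upperCase(Iterator<String> it) {
        Spliterator<String> sp =
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
        // The second argument selects a sequential (false) or parallel (true) stream.
        return StreamSupport.stream(sp, false)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(upperCase(Arrays.asList("a", "b").iterator())); // [A, B]
    }
}
```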