Help! I'm learning to love Javascript after programming in C# for quite a while but I'm stuck learning to love the iterable protocol!
Why did Javascript adopt a protocol that requires creating a new object for each iteration? Why have next() return a new object with properties done and value instead of adopting a protocol like C#'s IEnumerable and IEnumerator, which allocates no object at the expense of requiring two calls (one to moveNext to see if the iteration is done, and a second to current to get the value)?
Are there under-the-hood optimizations that skip the allocation of the object returned by next()? Hard to imagine, given the iterable doesn't know how the object could be used once returned...
Generators don't seem to reuse the next object as illustrated below:
function* generator() {
  yield 0;
  yield 1;
}
var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();
console.log(result0.value) // 0
console.log(result1.value) // 1
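For contrast, here is a minimal sketch of a hand-written iterator that does reuse a single result object. The name reusingRange is made up for illustration, and the sketch assumes the consumer reads done and value before calling next() again, as for...of does. It also shows why reuse is risky: anyone holding on to an earlier result sees it change.

```javascript
// Hypothetical helper: iterates 0..limit-1 while reusing ONE result object.
function reusingRange(limit) {
  const result = { value: undefined, done: false };
  let i = 0;
  return {
    [Symbol.iterator]() {
      return this;
    },
    next() {
      if (i < limit) {
        result.value = i++;
        result.done = false;
      } else {
        result.value = undefined;
        result.done = true;
      }
      return result; // same object every time, mutated in place
    },
  };
}

// for...of reads done/value immediately, so reuse goes unnoticed here.
const collected = [];
for (const n of reusingRange(3)) collected.push(n);
console.log(collected); // [0, 1, 2]

// Holding on to earlier results exposes the mutation.
const it = reusingRange(2);
const first = it.next();
const second = it.next();
console.log(first === second); // true
console.log(first.value); // 1, not 0
```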
Hm, here's a clue (thanks to Bergi!):
We will answer one important question later (in Sect. 3.2): Why can iterators (optionally) return a value after the last element? That capability is the reason for elements being wrapped. Otherwise, iterators could simply return a publicly defined sentinel (stop value) after the last element.
And in Sect. 3.2 they discuss using generators as lightweight threads. It seems to say the reason for returning an object from next is so that a value can be returned even when done is true! Whoa. Furthermore, generators can return values in addition to yield and yield*-ing values, and a value generated by return ends up in value when done is true!
And all this allows for pseudo-threading. And that feature, pseudo-threading, is worth allocating a new object for each time around the loop... Javascript. Always so unexpected!
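To make that concrete, here is a small sketch (the generator names inner and outer are made up for illustration) of how a generator's return value surfaces as the result of a yield* expression in the delegating generator:

```javascript
function* inner() {
  yield "a";
  return "inner result"; // arrives as { value: "inner result", done: true }
}

function* outer() {
  // yield* forwards inner's yielded values and evaluates to its return value.
  const result = yield* inner();
  yield result.toUpperCase();
}

const values = [...outer()];
console.log(values); // ["a", "INNER RESULT"]
```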
Although, now that I think about it, allowing yield* to "return" a value to enable pseudo-threading still doesn't justify returning an object. The IEnumerator protocol could be extended to return an object after moveNext() returns false: just add a property hasCurrent to test after the iteration is complete that, when true, indicates current has a valid value...
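A rough sketch of what that extended IEnumerator-style protocol might look like in Javascript. Everything here (makeEnumerator, moveNext, current, hasCurrent) is hypothetical and not part of any standard; it only illustrates that no per-step object is needed to carry a final value.

```javascript
// Hypothetical C#-style enumerator over an array, with an optional final value.
function makeEnumerator(items, returnValue) {
  let index = -1;
  return {
    current: undefined,
    hasCurrent: false,
    moveNext() {
      index += 1;
      if (index < items.length) {
        this.current = items[index];
        this.hasCurrent = true;
        return true;
      }
      // Iteration is over; expose the optional "return" value, if any.
      this.current = returnValue;
      this.hasCurrent = returnValue !== undefined;
      return false;
    },
  };
}

const enumerator = makeEnumerator([0, 1], "finished");
const seen = [];
while (enumerator.moveNext()) seen.push(enumerator.current);

console.log(seen); // [0, 1]
console.log(enumerator.hasCurrent, enumerator.current); // true "finished"
```

The trade-off is the one described above: two member accesses per step instead of one call, but no allocation per step.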
And the compiler optimizations are non-trivial. This will result in quite wild variance in the performance of an iterator... doesn't that cause problems for library implementors?
All these points are raised in this thread discovered by the friendly SO community. Yet those arguments didn't seem to carry the day.
However, regardless of returning an object or not, no one is going to be checking for a value after iteration is "complete", right? E.g. most everyone would think the following would log all values returned by an iterator:
function logIteratorValues(iterator) {
  var next;
  while (next = iterator.next(), !next.done)
    console.log(next.value);
}
Except it doesn't, because even though done is true the iterator might still have returned another value. Consider:
function* generator() {
  yield 0;
  return 1;
}
var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();
console.log(`${result0.value}, ${result0.done}`) // 0, false
console.log(`${result1.value}, ${result1.done}`) // 1, true
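The built-in consumers behave the same way as logIteratorValues. As a quick check (zeroThenReturnOne is just an illustrative name), for...of, spread and Array.from all stop at done: true and never look at the accompanying value:

```javascript
function* zeroThenReturnOne() {
  yield 0;
  return 1; // only visible to callers that inspect next() directly, or to yield*
}

const viaForOf = [];
for (const v of zeroThenReturnOne()) viaForOf.push(v);

const viaSpread = [...zeroThenReturnOne()];
const viaFrom = Array.from(zeroThenReturnOne());

console.log(viaForOf, viaSpread, viaFrom); // [0] [0] [0]
```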
Is an iterator that returns a value after it's "done" really an iterator? What is the sound of one hand clapping? It just seems quite odd...
And here is an in-depth post on generators I enjoyed. Much time is spent controlling the flow of an application as opposed to iterating members of a collection.
Another possible explanation is that IEnumerable/IEnumerator requires two interfaces and three methods and the JS community preferred the simplicity of a single method. That way they wouldn't have to introduce the notion of groups of symbolic methods aka interfaces...
Are there under-the-hood optimizations that skip the allocation of the object returned by next()?
Yes. Those iterator result objects are small and usually short-lived. Particularly in for … of loops, the compiler can do a trivial escape analysis to see that the object doesn't face the user code at all (but only the internal loop evaluation code). They can be dealt with very efficiently by the garbage collector, or even be allocated directly on the stack.
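To see why the result object never reaches user code, here is a simplified sketch of what a for … of loop does internally. The function name forOfSketch is made up, and the cleanup handling is reduced to the essentials, so treat it as an approximation of the specified behaviour rather than the real thing:

```javascript
// Approximation of `for (const x of iterable) body(x)`.
function forOfSketch(iterable, body) {
  const iterator = iterable[Symbol.iterator]();
  while (true) {
    // The result object is only ever touched on the next two lines,
    // which is what lets an engine prove it does not escape.
    const result = iterator.next();
    if (result.done) return;
    try {
      body(result.value); // user code only sees the unwrapped value
    } catch (err) {
      // Abrupt exit: give the iterator a chance to clean up.
      if (typeof iterator.return === "function") iterator.return();
      throw err;
    }
  }
}

const logged = [];
forOfSketch(["x", "y", "z"], (item) => logged.push(item));
console.log(logged); // ["x", "y", "z"]
```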
Here are some sources:
StopIteration exceptions
The key to great performance for iteration is to make sure that the repeated calls to iterator.next() in the loop are optimized well, and ideally completely avoid the allocation of the iterResult using advanced compiler techniques like store-load propagation, escape analysis and scalar replacement of aggregates. To really shine performance-wise, the optimizing compiler should also completely eliminate the allocation of the iterator itself - the iterable[Symbol.iterator]() call - and operate on the backing-store of the iterable directly.