javascript performance memory-management ecmascript-6 iterable

Why does Javascript `iterator.next()` return an object?


Help! I'm learning to love Javascript after programming in C# for quite a while but I'm stuck learning to love the iterable protocol!

Why did Javascript adopt a protocol that requires creating a new object for each iteration? Why have next() return a new object with properties done and value instead of adopting a protocol like C# IEnumerable and IEnumerator which allocates no object at the expense of requiring two calls (one to moveNext to see if the iteration is done, and a second to current to get the value)?

Are there under-the-hood optimizations that skip the allocation of the object returned by next()? Hard to imagine given the iterable doesn't know how the object could be used once returned...

Generators don't seem to reuse the next object as illustrated below:

function* generator() {
  yield 0;
  yield 1;
}

var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();

console.log(result0.value) // 0
console.log(result1.value) // 1

Hm, here's a clue (thanks to Bergi!):

We will answer one important question later (in Sect. 3.2): Why can iterators (optionally) return a value after the last element? That capability is the reason for elements being wrapped. Otherwise, iterators could simply return a publicly defined sentinel (stop value) after the last element.

And in Sect. 3.2 they discuss using generators as lightweight threads. It seems to say the reason for returning an object from next() is so that a value can be returned even when done is true! Whoa. Furthermore, generators can return values in addition to yield-ing and yield*-ing values, and a value produced by return ends up in value when done is true!
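To make that concrete, here is a minimal sketch of how a return value travels through yield*. The generator names inner and outer are made up for illustration; the behavior shown is standard generator semantics.

```javascript
function* inner() {
  yield "a";
  return "inner result"; // delivered as { value: "inner result", done: true }
}

function* outer() {
  // yield* forwards inner's yielded values and evaluates to inner's return value
  const result = yield* inner();
  yield result;
}

const outerValues = [...outer()];
console.log(outerValues); // [ 'a', 'inner result' ]
```

The return value never shows up as an element of inner itself; it only becomes visible because outer chose to yield it again.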

And all this allows for pseudo-threading. And that feature, pseudo-threading, is worth allocating a new object for each time around the loop... Javascript. Always so unexpected!


Although, now that I think about it, allowing yield* to "return" a value to enable pseudo-threading still doesn't justify returning an object. The IEnumerator protocol could be extended to return a value after moveNext() returns false -- just add a property hasCurrent to test after the iteration is complete; when it is true, current holds a valid value...
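Here is a rough sketch of what I mean, written in JavaScript. Everything in it (createEnumerator, moveNext, current, hasCurrent, the returnValue parameter) is hypothetical and only mimics the C# shape; none of it is part of the actual iteration protocol.

```javascript
// Hypothetical C#-style enumerator over an array; no per-step result object is allocated.
function createEnumerator(items, returnValue) {
  let index = -1;
  return {
    current: undefined,
    hasCurrent: false,
    moveNext() {
      index += 1;
      if (index < items.length) {
        this.current = items[index];
        this.hasCurrent = true;
        return true;
      }
      // Iteration is over: expose the optional final value through the same property.
      this.current = returnValue;
      this.hasCurrent = returnValue !== undefined;
      return false;
    },
  };
}

const enumerator = createEnumerator([0, 1], "finished");
const seen = [];
while (enumerator.moveNext()) {
  seen.push(enumerator.current);
}

console.log(seen);                  // [ 0, 1 ]
console.log(enumerator.hasCurrent); // true
console.log(enumerator.current);    // finished
```

The trade-off is visible in the loop: two member accesses per step (moveNext and current) instead of one next() call, but only a single enumerator object for the whole iteration.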

And the compiler optimizations are non-trivial. This will result in quite wild variance in the performance of an iterator... doesn't that cause problems for library implementors?

All these points are raised in this thread discovered by the friendly SO community. Yet those arguments didn't seem to carry the day.


However, regardless of returning an object or not, no one is going to be checking for a value after iteration is "complete", right? E.g. most everyone would think the following would log all values returned by an iterator:

function logIteratorValues(iterator) {
  var next;
  while(next = iterator.next(), !next.done)
    console.log(next.value)
}

Except it doesn't, because even though done is true the iterator might still have returned another value. Consider:

function* generator() {
  yield 0;
  return 1;
}

var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();

console.log(`${result0.value}, ${result0.done}`) // 0, false
console.log(`${result1.value}, ${result1.done}`) // 1, true
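For comparison, a small sketch showing that the built-in consumers behave like logIteratorValues above: for...of, spread and Array.from all stop as soon as done is true and never look at the accompanying value. The name zeroThenReturnOne is just an illustrative rename of the generator above.

```javascript
function* zeroThenReturnOne() {
  yield 0;
  return 1; // arrives with done: true, so the consumers below discard it
}

const viaForOf = [];
for (const value of zeroThenReturnOne()) {
  viaForOf.push(value);
}
const viaSpread = [...zeroThenReturnOne()];
const viaArrayFrom = Array.from(zeroThenReturnOne());

console.log(viaForOf);     // [ 0 ]
console.log(viaSpread);    // [ 0 ]
console.log(viaArrayFrom); // [ 0 ]
```

So the returned 1 is only reachable by calling next() by hand or by delegating with yield*.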

Is an iterator that returns a value after it's "done" really an iterator? What is the sound of one hand clapping? It just seems quite odd...


And here is an in-depth post on generators I enjoyed. Much of it is spent on controlling the flow of an application as opposed to iterating over the members of a collection.


Another possible explanation is that IEnumerable/IEnumerator requires two interfaces and three methods and the JS community preferred the simplicity of a single method. That way they wouldn't have to introduce the notion of groups of symbolic methods aka interfaces...


Solution

  • Are there under-the-hood optimizations that skip the allocation of the object returned by next()?

    Yes. Those iterator result objects are small and usually short-lived. Particularly in for … of loops, the compiler can do a trivial escape analysis to see that the object doesn't face the user code at all (but only the internal loop evaluation code). They can be dealt with very efficiently by the garbage collector, or even be allocated directly on the stack.

    Here are some sources: