I'm experimenting with iteration on an array using a for .. in ..
loop. My question is related to the case where the collection is changed within the loop body.
It seems that the iteration is safe, even if the list shrinks in the meantime. The for
iteration variables successively take the values of the (indexes and) elements that were in the array at the start of the loop, despite the changes made on the flow. Example:
var slist = [ "AA", "BC", "DE", "FG" ]
for (i, st) in slist.enumerated() { // for st in slist gives a similar result
print ("Index \(i): \(st)")
if st == "AA" { // at one iteration change completely the list
print (" --> check 0: \(slist[0]), and 2: \(slist[2])")
slist.append ("KLM")
slist.insert(st+"XX", at:0) // shift the elements in the array
slist[2]="bc" // replace some elements to come
print (" --> check again 0: \(slist[0]), and 2: \(slist[2])")
slist.remove(at:3)
slist.remove(at:3)
slist.remove(at:1) // makes list shorter
}
}
print (slist)
This works very well, the iteration being made on the values [ "AA", "BC", "DE", "FG" ]
even if after the first iteration the array is completely changed to ["AAXX", "bc", "KLM"]
I wanted to know if I can safely rely on this behavior. Unfortunately, the language guide does not tell anything about iterating on a collection when the collection is modified. And the for .. in
section doesn't address this question either. So:
The documentation for IteratorProtocol says "whenever you use a for-in loop with an array, set, or any other collection or sequence, you’re using that type’s iterator." So, we are guaranteed that a for in
loop is going to be using .makeIterator()
and .next()
which is defined most generally on Sequence
and IteratorProtocol
respectively.
The documentation for Sequence says that "the Sequence
protocol makes no requirement on conforming types regarding whether they will be destructively consumed by iteration." As a consequence, this means that an iterator for a Sequence
is not required to make a copy, and so I do not think that modifying a sequence while iterating over it is, in general, safe.
This same caveat does not occur in the documentation for Collection, but I also don't think there is any guarantee that the iterator makes a copy, and so I do not think that modifying a collection while iterating over it is, in general, safe.
But, most collection types in Swift are struct
s with value semantics or copy-on-write semantics. I'm not really sure where the documentation for this is, but this link does say that "in Swift, Array
, String
, and Dictionary
are all value types... You don’t need to do anything special — such as making an explicit copy — to prevent other code from modifying that data behind your back." In particular, this means that for Array
, .makeIterator()
cannot hold a reference to your array because the iterator for Array
does not have to "do anything special" to prevent other code (i.e. your code) from modifying the data it holds.
We can explore this in more detail. The Iterator
type of Array
is defined as type IndexingIterator<Array<Element>>
. The documentation IndexingIterator says that it is the default implementation of the iterator for collections, so we can assume that most collections will use this. We can see in the source code for IndexingIterator
that it holds a copy of its collection
@frozen
public struct IndexingIterator<Elements: Collection> {
@usableFromInline
internal let _elements: Elements
@usableFromInline
internal var _position: Elements.Index
@inlinable
@inline(__always)
/// Creates an iterator over the given collection.
public /// @testable
init(_elements: Elements) {
self._elements = _elements
self._position = _elements.startIndex
}
...
}
and that the default .makeIterator()
simply creates this copy.
extension Collection where Iterator == IndexingIterator<Self> {
/// Returns an iterator over the elements of the collection.
@inlinable // trivial-implementation
@inline(__always)
public __consuming func makeIterator() -> IndexingIterator<Self> {
return IndexingIterator(_elements: self)
}
}
Although you might not want to trust this source code, the documentation for library evolution claims that "the @inlinable
attribute is a promise from the library developer that the current definition of a function will remain correct when used with future versions of the library" and the @frozen
also means that the members of IndexingIterator
cannot change.
Altogether, this means that any collection type with value semantics and an IndexingIterator
as its Iterator
must make a copy when using using for in
loops (at least until the next ABI break, which should be a long-way off). Even then, I don't think Apple is likely to change this behavior.
In Conclusion
I don't know of any place that it is explicitly spelled out in the docs "you can modify an array while you iterate over it, and the iteration will proceed as if you made a copy" but that's also the kind of language that probably shouldn't be written down as such code could definitely confuse a beginner.
However, there is enough documentation lying around which says that a for in
loop just calls .makeIterator()
and that for any collection with value semantics and the default iterator type (for example, Array
), .makeIterator()
makes a copy and so cannot be influenced by code inside the loop. Further, because Array
and some other types like Set
and Dictionary
are copy-on-write, modifying these collections inside a loop will have a one-time copy penalty as the body of the loop will not have a unique reference to its storage (because the iterator will have a reference too). This is the exact same penalty that modifying the collection outside the loop will have if you don’t have a unique reference to the storage.
Without these assumptions, you aren't guaranteed safety, but you might have it anyway in some circumstances.
Edit:
I just realized we can create some cases where this is unsafe for sequences.
import Foundation
/// This is clearly fine and works as expected.
print("Test normal")
for _ in 0...10 {
let x: NSMutableArray = [0,1,2,3]
for i in x {
print(i)
}
}
/// This is also okay. Reassigning `x` does not mutate the reference that the iterator holds.
print("Test reassignment")
for _ in 0...10 {
var x: NSMutableArray = [0,1,2,3]
for i in x {
x = []
print(i)
}
}
/// This crashes. The iterator assumes that the last index it used is still valid, but after removing the objects, there are no valid indices.
print("Test removal")
for _ in 0...10 {
let x: NSMutableArray = [0,1,2,3]
for i in x {
x.removeAllObjects()
print(i)
}
}
/// This also crashes. `.enumerated()` gets a reference to `x` which it expects will not be modified behind its back.
print("Test removal enumerated")
for _ in 0...10 {
let x: NSMutableArray = [0,1,2,3]
for i in x.enumerated() {
x.removeAllObjects()
print(i)
}
}
The fact that this is an NSMutableArray
is important because this type has reference semantics. Since NSMutableArray
conforms to Sequence
, we know that mutating a sequence while iterating over it is not safe, even when using .enumerated()
.