Say
struct Teste: Codable {
let a: Int
lazy var x: Int = {
print("I just changed this Teste item")
return a + 1000
}()
}
and you
var hugeData: [Teste] // 750 million of these
So it's something like your data from the cloud, held in a singleton or such.
Say then you happen to
let see: [Teste] = hugeData.filter{ .. }
// 4 of these. we'll say one of them happens to be hugeData[109333072]
maybe to display in a view or such. Next! While see
exists, you happen to for some reason
let width = hugeData[109333072].x
At that moment, it has to copy-due-to-write all of hugeData
, just because the stupid little see
exists at that time.
It only just occurred to me that a huge danger (if you deal with massive amounts of data), that if something changes in the big original array, while you just happen to have some pissant little copy-on-write array in existence, it's a huge performance woe.
(And/or, as mentioned in the bullet point, maybe Swift deals with this by actually doing the copy-due-to-write on the little one??)
In short, is it correct that copy on write is applied to (all) the "original" array if a change is made to that array while there happens to be a copy extant?
Here's a short self-explanatory demo ...
struct Teste: Codable {
let a: Int
lazy var x: Int = {
print("I just calculated it", a + 1000)
return a + 1000
}()
}
print("yo")
tt = [Teste(a:7), Teste(a:8), Teste(a:9)]
let tt2 = tt.filter{ $0.a > 8 }
print(tt)
print(tt2)
print("yes, still nil in tt2")
print("go..", tt[2].x)
print("go again..", tt[2].x)
print(tt)
print(tt2)
print("yes, copy on write was? applied to the original huge array")
print("ie YES, still nil in tt2")
let tt3 = tt.filter{ $0.a > 8 }
print(tt)
print(tt3)
print("yes, the set value seems to carry through to tt3")
result
yo
[.Teste(a: 7, $__lazy_storage_$_x: nil), .Teste(a: 8, $__lazy_storage_$_x: nil), .Teste(a: 9, $__lazy_storage_$_x: nil)]
[.Teste(a: 9, $__lazy_storage_$_x: nil)]
yes, still nil in tt2
I just calculated it 1009
go.. 1009
go again.. 1009
[.Teste(a: 7, $__lazy_storage_$_x: nil), .Teste(a: 8, $__lazy_storage_$_x: nil), .Teste(a: 9, $__lazy_storage_$_x: Optional(1009))]
[.Teste(a: 9, $__lazy_storage_$_x: nil)]
yes, copy on write was? applied to the original huge array
ie YES, still nil in tt2
[.Teste(a: 7, $__lazy_storage_$_x: nil), .Teste(a: 8, $__lazy_storage_$_x: nil), .Teste(a: 9, $__lazy_storage_$_x: Optional(1009))]
[.Teste(a: 9, $__lazy_storage_$_x: Optional(1009))]
yes, the set value seems to carry through to tt3
So in such a case would it copy the 750 million item array?
Array CoW is the mechanism in which, when you modify an array, and that array shares its storage with another array, the contents of that storage will be copied, and the two arrays will no longer share their storages, because they each have their own copy.
var x = [1,2,3]
var y = x // now x and y share storage
x.append(4) // now the shared storage is copied
The situation you have described does not involve arrays that share storage. filter
returns a new array that has the filtered elements. This new array cannot possibly share the same storage as the old array, because they have different elements. In this case, see
will have a storage containing 4 elements.
So CoW semantics are really irrelevant here. There are no arrays that share storage. You just have two arrays, that each have their own separate storages.
So the line
let width = hugeData[109333072].x
Will only change hugeData
and will not cause any copies
A clarification on what "share storage" means:
The Array
struct contains a reference to some class
type that is in the heap. This is where the array's elements are stored. When you copy the value of the Array
struct value like in var y = x
, this reference is copied. i.e. x._storage === y._storage
. This is what I mean by "shares the same storage".
When you modify y
in some way, y._storage
will be assigned to some other class instance (that is a copy of the elements in the storage), and it is no longer the case that x._storage === y._storage
.