I am trying to implement a generic Mutable Ordered Set type and it needs to conform to many protocols to behave the same way as an Array and a Set does in Swift. First of all to accomplish that the generic type element needs to conform to Hashable
and the generic struct needs to conform to RandomAccessCollection
, SetAlgebra
, ExpressibleByArrayLiteral
,AdditiveArithmetic
, RangeReplaceableCollection
and MutableCollection
. I would like also to allow subscript access to its SubSequence
adding support to PartialRangeThrough
, PartialRangeUpTo
, PartialRangeFrom
and UnboundedRange
as well.
This is my generic OrderedSet
struct declaration:
public struct OrderedSet<Element: Hashable> {
public init() { }
private var elements: [Element] = []
private var set: Set<Element> = []
}
Even though this is a self-answered question I would really appreciate and encourage new answers, some feedback on this implementation and/or suggestions on how to fix/improve it as well:
edit/update:
The sorted
method works as expected but the mutating sort
it is not even changing the elements order.
MutableCollection
Declaration mutating
func sort()
Available when Self conforms to RandomAccessCollection and Element conforms to Comparable.
var numbers: OrderedSet = [15, 40, 10, 30, 60, 25, 5, 100]
numbers[0..<4] // [15, 40, 10, 30]
numbers[0..<4].sorted() // [10, 15, 30, 40]
numbers[0..<4].sort() // [15, 40, 10, 30, 60, 25, 5, 100]
print(numbers)
// Prints "[15, 40, 10, 30, 60, 25, 5, 100]"
// But it should print "[10, 15, 30, 40, 60, 25, 5, 100]"
How can I fix it?
OrderedSet
a MutableCollection
In a MutableCollection
you can change an individual element (or a slice of elements) via a subscript that supports write access. And that is where the problems start: What should the output of
var oset: OrderedSet = [1, 2, 3, 4]
oset[0] = 3
print(oset)
be? We cannot simply replace the first element because then the set members are not unique anymore. Your current implementation returns [1, 2, 3, 4]
, i.e. it rejects the setting if the new member is already present in the set.
That makes many default implementations of MutableCollection
methods fail: sort()
, swapAt()
, shuffle()
and probably more:
var oset: OrderedSet = [4, 3, 2, 1]
oset.swapAt(0, 2)
print(oset) // [4, 3, 2, 1]
oset.sort()
print(oset) // [4, 3, 2, 1]
oset.shuffle()
print(oset) // [4, 3, 2, 1]
In your implementation you chose Slice<OrderedSet<Element>>
as the SubSequence
type. Slice
uses the storage from the originating (base) collection and only maintains its own startIndex
and endIndex
. That leads to unexpected results:
let oset: OrderedSet = [1, 2, 3, 4, 5]
var oslice = oset[0..<3]
oslice[0] = 5
print(oslice) // [1, 2, 3]
Setting oslice[0]
is rejected because the originating set contains the new member. That is certainly not expected. Sorting a slice
var oset: OrderedSet = [6, 5, 4, 3, 2, 1]
oset[0..<4].sort()
print(oset) // [6, 5, 4, 3, 2, 1]
fails because the sorted elements are written back one by one, and that fails because the members are already present in the set. The same happens with a slice assignment:
var o1: OrderedSet = [1, 2]
let o2: OrderedSet = [2, 1]
o1[0..<2] = o2[0..<2]
print(o1) // [1, 2]
Another problem is that a slice oset[0..<3]
does not conform to OrderedSetProtocol
:It is a (mutable) collection, but for example not a SetAlgebra
, so that it cannot be used to form unions, intersections, or symmetric differences.
MutableCollection
conformance?I would seriously consider not to adopt the MutableCollection
protocol. That does not make the ordered set immutable: It only means that individual members cannot be modified via the subscript setter. You can still insert or remove elements, or form unions or intersections with other sets. Only for “complex” operations like sorting you have to go via an extra temporary set:
var oset: OrderedSet = [4, 3, 2, 1]
oset = OrderedSet(oset.sorted())
print(oset) // [1, 2, 3, 4]
The big advantage is that there is no unclear behavior anymore.
MutableCollection
conformance!OK, you asked for it – let's see what we can do. We could try to fix this by “fixing” the subscript setter. One attempt is your commented out code:
set {
guard set.update(with: newValue) == nil else {
insert(remove(at: elements.firstIndex(of: newValue)!), at: index)
return
}
elements[index] = newValue
}
This has the effect of moving an existing member to the given location, shifting other elements:
var oset: OrderedSet = [1, 2, 3, 4]
oset[0] = 3
print(oset) // [3, 1, 2, 4]
This seems to make most methods work correctly:
var oset: OrderedSet = [4, 3, 2, 1]
oset.swapAt(0, 2)
print(oset) // [2, 3, 4, 1]
oset.sort()
print(oset) // [1, 2, 3, 4]
oset.shuffle()
print(oset) // [1, 4, 3, 2]
and even the subscript sorting:
var oset: OrderedSet = [6, 5, 4, 3, 2, 1]
oset[0..<4].sort()
print(oset) // [3, 4, 5, 6, 2, 1]
But I see two disadvantages:
Another option is to leave the subscript setter as it is (i.e. reject invalid settings), and implement the problematic methods instead of using the default implementations of MutableCollection
:
extension OrderedSet {
public mutating func swapAt(_ i: Index, _ j: Index) {
elements.swapAt(i, j)
}
public mutating func partition(by belongsInSecondPartition: (Element) throws -> Bool) rethrows -> Index {
try elements.partition(by: belongsInSecondPartition)
}
public mutating func sort(by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows {
try elements.sort(by: areInIncreasingOrder)
}
}
extension OrderedSet where Element : Comparable {
public mutating func sort() {
elements.sort()
}
}
In addition we need to implement the subscript setter taking a range
public subscript(bounds: Range<Index>) -> SubSequence
so that a sorted slice is assigned back to the set as one operation, and not each element individually.
That worked in my tests, but there is a risk that I overlooked something.
And for the slicing I would make OrderedSet
its own SubSequence
type. That means that the elements are duplicated. This could be avoided by making the element
storage an ArraySlice
but – as we saw above – the set
of distinct members has to be duplicated anyway, to avoid unwanted side effects when the originating set is mutated.
This is what I have so far. It works correctly as far as I can tell, but needs more testing.
Note that some methods need not be implemented, e.g. ExpressibleByArrayLiteral
has already a default implementation in SetAlgebra
, and various index calculations have default implementations because the Index
is Strideable
.
public struct OrderedSet<Element: Hashable> {
private var elements: [Element] = []
private var set: Set<Element> = []
public init() { }
}
extension OrderedSet {
public init<S>(distinctElements elements: S) where S : Sequence, S.Element == Element {
self.elements = Array(elements)
self.set = Set(elements)
precondition(self.elements.count == self.set.count, "Elements must be distinct")
}
}
extension OrderedSet: SetAlgebra {
public func contains(_ member: Element) -> Bool {
set.contains(member)
}
@discardableResult public mutating func insert(_ newMember: Element) -> (inserted: Bool, memberAfterInsert: Element) {
let insertion = set.insert(newMember)
if insertion.inserted { elements.append(newMember) }
return insertion
}
@discardableResult public mutating func remove(_ member: Element) -> Element? {
if let oldMember = set.remove(member) {
let index = elements.firstIndex(of: member)!
elements.remove(at: index)
return oldMember
} else {
return nil
}
}
@discardableResult public mutating func update(with newMember: Element) -> Element? {
if let member = set.update(with: newMember) {
return member
} else {
elements.append(newMember)
return nil
}
}
public mutating func formUnion(_ other: Self) {
other.elements.forEach { self.insert($0) }
}
public mutating func formIntersection(_ other: Self) {
for element in elements {
if !other.contains(element) {
remove(element)
}
}
}
public mutating func formSymmetricDifference(_ other: Self) {
for member in other.elements {
if set.contains(member) {
remove(member)
} else {
insert(member)
}
}
}
public func union(_ other: Self) -> Self {
var orderedSet = self
orderedSet.formUnion(other)
return orderedSet
}
public func intersection(_ other: Self) -> Self {
var orderedSet = self
orderedSet.formIntersection(other)
return orderedSet
}
public func symmetricDifference(_ other: Self) -> Self {
var orderedSet = self
orderedSet.formSymmetricDifference(other)
return orderedSet
}
public init<S>(_ elements: S) where S : Sequence, S.Element == Element {
elements.forEach { insert($0) }
}
}
extension OrderedSet: CustomStringConvertible {
public var description: String { elements.description }
}
extension OrderedSet: MutableCollection, RandomAccessCollection {
public typealias Index = Int
public typealias SubSequence = OrderedSet
public subscript(index: Index) -> Element {
get {
elements[index]
}
set {
if !set.contains(newValue) || elements[index] == newValue {
set.remove(elements[index])
set.insert(newValue)
elements[index] = newValue
}
}
}
public subscript(bounds: Range<Index>) -> SubSequence {
get {
return OrderedSet(distinctElements: elements[bounds])
}
set {
replaceSubrange(bounds, with: newValue.elements)
}
}
public var startIndex: Index { elements.startIndex}
public var endIndex: Index { elements.endIndex }
public var isEmpty: Bool { elements.isEmpty }
}
extension OrderedSet {
public mutating func swapAt(_ i: Index, _ j: Index) {
elements.swapAt(i, j)
}
public mutating func partition(by belongsInSecondPartition: (Element) throws -> Bool) rethrows -> Index {
try elements.partition(by: belongsInSecondPartition)
}
public mutating func sort(by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows {
try elements.sort(by: areInIncreasingOrder)
}
}
extension OrderedSet where Element : Comparable {
public mutating func sort() {
elements.sort()
}
}
extension OrderedSet: RangeReplaceableCollection {
public mutating func replaceSubrange<C>(_ subrange: Range<Index>, with newElements: C) where C : Collection, C.Element == Element {
set.subtract(elements[subrange])
let insertedElements = newElements.filter {
set.insert($0).inserted
}
elements.replaceSubrange(subrange, with: insertedElements)
}
}
I already said, dropping the MutableCollection
conformance would be the safer solution.
The above works but is fragile: I had to “guess” which methods must be implemented because the default implementation does not work. If the MutableCollection
protocol in the Swift standard library gets a new method with a default implementation. things can break again.