arraysswiftsetnsorderedsetnsmutableorderedset

How to implement a Mutable Ordered Set generic type formerly known as NSMutableOrderedSet in native Swift?


I am trying to implement a generic Mutable Ordered Set type and it needs to conform to many protocols to behave the same way as an Array and a Set does in Swift. First of all to accomplish that the generic type element needs to conform to Hashable and the generic struct needs to conform to RandomAccessCollection, SetAlgebra, ExpressibleByArrayLiteral,AdditiveArithmetic, RangeReplaceableCollection and MutableCollection. I would like also to allow subscript access to its SubSequence adding support to PartialRangeThrough, PartialRangeUpTo, PartialRangeFrom and UnboundedRange as well.

This is my generic OrderedSet struct declaration:

public struct OrderedSet<Element: Hashable> {
    public init() { }
    private var elements: [Element] = []
    private var set: Set<Element> = []
}

Even though this is a self-answered question I would really appreciate and encourage new answers, some feedback on this implementation and/or suggestions on how to fix/improve it as well:

edit/update:

The sorted method works as expected but the mutating sort it is not even changing the elements order.

MutableCollection

Declaration mutating

func sort()

Available when Self conforms to RandomAccessCollection and Element conforms to Comparable.

var numbers: OrderedSet = [15, 40, 10, 30, 60, 25, 5, 100]

numbers[0..<4]           // [15, 40, 10, 30]
numbers[0..<4].sorted()  // [10, 15, 30, 40]

numbers[0..<4].sort()    // [15, 40, 10, 30, 60, 25, 5, 100]

print(numbers) 
// Prints "[15, 40, 10, 30, 60, 25, 5, 100]"
// But it should print "[10, 15, 30, 40, 60, 25, 5, 100]"

How can I fix it?


Solution

  • Problem 1: Making OrderedSet a MutableCollection

    In a MutableCollection you can change an individual element (or a slice of elements) via a subscript that supports write access. And that is where the problems start: What should the output of

    var oset: OrderedSet = [1, 2, 3, 4]
    oset[0] = 3
    print(oset)
    

    be? We cannot simply replace the first element because then the set members are not unique anymore. Your current implementation returns [1, 2, 3, 4], i.e. it rejects the setting if the new member is already present in the set.

    That makes many default implementations of MutableCollection methods fail: sort(), swapAt(), shuffle() and probably more:

    var oset: OrderedSet = [4, 3, 2, 1]
    oset.swapAt(0, 2)
    print(oset) // [4, 3, 2, 1]
    oset.sort()
    print(oset) // [4, 3, 2, 1]
    oset.shuffle()
    print(oset) // [4, 3, 2, 1]
    

    Problem 2: Slicing

    In your implementation you chose Slice<OrderedSet<Element>> as the SubSequence type. Slice uses the storage from the originating (base) collection and only maintains its own startIndex and endIndex. That leads to unexpected results:

    let oset: OrderedSet = [1, 2, 3, 4, 5]
    var oslice = oset[0..<3]
    oslice[0] = 5
    print(oslice) // [1, 2, 3]
    

    Setting oslice[0] is rejected because the originating set contains the new member. That is certainly not expected. Sorting a slice

    var oset: OrderedSet = [6, 5, 4, 3, 2, 1]
    oset[0..<4].sort()
    print(oset) // [6, 5, 4, 3, 2, 1]
    

    fails because the sorted elements are written back one by one, and that fails because the members are already present in the set. The same happens with a slice assignment:

    var o1: OrderedSet = [1, 2]
    let o2: OrderedSet = [2, 1]
    o1[0..<2] = o2[0..<2]
    print(o1) // [1, 2]
    

    Another problem is that a slice oset[0..<3] does not conform to OrderedSetProtocol:It is a (mutable) collection, but for example not a SetAlgebra, so that it cannot be used to form unions, intersections, or symmetric differences.

    Do you really need the MutableCollection conformance?

    I would seriously consider not to adopt the MutableCollection protocol. That does not make the ordered set immutable: It only means that individual members cannot be modified via the subscript setter. You can still insert or remove elements, or form unions or intersections with other sets. Only for “complex” operations like sorting you have to go via an extra temporary set:

    var oset: OrderedSet = [4, 3, 2, 1]
    oset = OrderedSet(oset.sorted())
    print(oset) // [1, 2, 3, 4]
    

    The big advantage is that there is no unclear behavior anymore.

    Yes, I want the MutableCollection conformance!

    OK, you asked for it – let's see what we can do. We could try to fix this by “fixing” the subscript setter. One attempt is your commented out code:

        set {
            guard set.update(with: newValue) == nil else {
                insert(remove(at: elements.firstIndex(of: newValue)!), at: index)
                return 
            }
            elements[index] = newValue
        }
    

    This has the effect of moving an existing member to the given location, shifting other elements:

    var oset: OrderedSet = [1, 2, 3, 4]
    oset[0] = 3
    print(oset) // [3, 1, 2, 4]
    

    This seems to make most methods work correctly:

    var oset: OrderedSet = [4, 3, 2, 1]
    oset.swapAt(0, 2)
    print(oset) // [2, 3, 4, 1]
    oset.sort()
    print(oset) // [1, 2, 3, 4]
    oset.shuffle()
    print(oset) // [1, 4, 3, 2]
    

    and even the subscript sorting:

    var oset: OrderedSet = [6, 5, 4, 3, 2, 1]
    oset[0..<4].sort()
    print(oset) // [3, 4, 5, 6, 2, 1]
    

    But I see two disadvantages:

    Another option is to leave the subscript setter as it is (i.e. reject invalid settings), and implement the problematic methods instead of using the default implementations of MutableCollection:

    extension OrderedSet {
        public mutating func swapAt(_ i: Index, _ j: Index) {
            elements.swapAt(i, j)
        }
    
        public mutating func partition(by belongsInSecondPartition: (Element) throws -> Bool) rethrows -> Index {
            try elements.partition(by: belongsInSecondPartition)
        }
    
        public mutating func sort(by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows {
            try elements.sort(by: areInIncreasingOrder)
        }
    }
    
    extension OrderedSet where Element : Comparable {
        public mutating func sort() {
            elements.sort()
        }
    }
    

    In addition we need to implement the subscript setter taking a range

    public subscript(bounds: Range<Index>) -> SubSequence
    

    so that a sorted slice is assigned back to the set as one operation, and not each element individually.

    That worked in my tests, but there is a risk that I overlooked something.

    And for the slicing I would make OrderedSet its own SubSequence type. That means that the elements are duplicated. This could be avoided by making the element storage an ArraySlice but – as we saw above – the set of distinct members has to be duplicated anyway, to avoid unwanted side effects when the originating set is mutated.

    The code

    This is what I have so far. It works correctly as far as I can tell, but needs more testing.

    Note that some methods need not be implemented, e.g. ExpressibleByArrayLiteral has already a default implementation in SetAlgebra, and various index calculations have default implementations because the Index is Strideable.

    public struct OrderedSet<Element: Hashable> {
        private var elements: [Element] = []
        private var set: Set<Element> = []
    
        public init() { }
    
    }
    
    extension OrderedSet {
        public init<S>(distinctElements elements: S) where S : Sequence, S.Element == Element {
            self.elements = Array(elements)
            self.set = Set(elements)
            precondition(self.elements.count == self.set.count, "Elements must be distinct")
        }
    }
    
    extension OrderedSet: SetAlgebra {
        public func contains(_ member: Element) -> Bool {
            set.contains(member)
        }
    
        @discardableResult public mutating func insert(_ newMember: Element) -> (inserted: Bool, memberAfterInsert: Element) {
            let insertion = set.insert(newMember)
            if insertion.inserted { elements.append(newMember) }
            return insertion
        }
    
        @discardableResult public mutating func remove(_ member: Element) -> Element? {
            if let oldMember = set.remove(member) {
                let index = elements.firstIndex(of: member)!
                elements.remove(at: index)
                return oldMember
            } else {
                return nil
            }
        }
    
        @discardableResult public mutating func update(with newMember: Element) -> Element? {
            if let member = set.update(with: newMember) {
                return member
            } else {
                elements.append(newMember)
                return nil
            }
        }
    
        public mutating func formUnion(_ other: Self) {
            other.elements.forEach { self.insert($0) }
        }
    
        public mutating func formIntersection(_ other: Self) {
            for element in elements {
                if !other.contains(element) {
                    remove(element)
                }
            }
        }
    
        public mutating func formSymmetricDifference(_ other: Self) {
            for member in other.elements {
                if set.contains(member) {
                    remove(member)
                } else {
                    insert(member)
                }
            }
        }
    
        public func union(_ other: Self) -> Self {
            var orderedSet = self
            orderedSet.formUnion(other)
            return orderedSet
        }
    
        public func intersection(_ other: Self) -> Self {
            var orderedSet = self
            orderedSet.formIntersection(other)
            return orderedSet
        }
    
        public func symmetricDifference(_ other: Self) -> Self {
            var orderedSet = self
            orderedSet.formSymmetricDifference(other)
            return orderedSet
        }
    
        public init<S>(_ elements: S) where S : Sequence, S.Element == Element {
            elements.forEach { insert($0) }
        }
    }
    
    extension OrderedSet: CustomStringConvertible {
        public var description: String { elements.description }
    }
    
    extension OrderedSet: MutableCollection, RandomAccessCollection {
    
        public typealias Index = Int
        public typealias SubSequence = OrderedSet
    
        public subscript(index: Index) -> Element {
            get {
                elements[index]
            }
            set {
                if !set.contains(newValue) || elements[index] == newValue {
                    set.remove(elements[index])
                    set.insert(newValue)
                    elements[index] = newValue
                }
            }
        }
    
        public subscript(bounds: Range<Index>) -> SubSequence {
            get {
                return OrderedSet(distinctElements: elements[bounds])
            }
            set {
                replaceSubrange(bounds, with: newValue.elements)
            }
    
        }
        public var startIndex: Index { elements.startIndex}
        public var endIndex:   Index { elements.endIndex }
    
        public var isEmpty: Bool { elements.isEmpty }
    } 
    
    extension OrderedSet {
        public mutating func swapAt(_ i: Index, _ j: Index) {
            elements.swapAt(i, j)
        }
    
        public mutating func partition(by belongsInSecondPartition: (Element) throws -> Bool) rethrows -> Index {
            try elements.partition(by: belongsInSecondPartition)
        }
    
        public mutating func sort(by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows {
            try elements.sort(by: areInIncreasingOrder)
        }
    }
    
    extension OrderedSet where Element : Comparable {
        public mutating func sort() {
            elements.sort()
        }
    }
    
    extension OrderedSet: RangeReplaceableCollection {
    
        public mutating func replaceSubrange<C>(_ subrange: Range<Index>, with newElements: C) where C : Collection, C.Element == Element {
    
            set.subtract(elements[subrange])
            let insertedElements = newElements.filter {
                set.insert($0).inserted
            }
            elements.replaceSubrange(subrange, with: insertedElements)
        }
    }
    

    Conclusion

    I already said, dropping the MutableCollection conformance would be the safer solution.

    The above works but is fragile: I had to “guess” which methods must be implemented because the default implementation does not work. If the MutableCollection protocol in the Swift standard library gets a new method with a default implementation. things can break again.