I'm trying to create a thread-safe property wrapper. I could only think of GCD queues and semaphores as the most Swifty and reliable approaches. Are semaphores simply more performant (if that's even true), or is there another reason to prefer one over the other for concurrency?
Below are two variants of atomic property wrappers:
@propertyWrapper
struct Atomic<Value> {
    private var value: Value
    private let queue = DispatchQueue(label: "Atomic serial queue")

    var wrappedValue: Value {
        get { queue.sync { value } }
        set { queue.sync { value = newValue } }
    }

    init(wrappedValue value: Value) {
        self.value = value
    }
}
@propertyWrapper
struct Atomic2<Value> {
    private var value: Value
    private var semaphore = DispatchSemaphore(value: 1)

    var wrappedValue: Value {
        get {
            semaphore.wait()
            let temp = value
            semaphore.signal()
            return temp
        }
        set {
            semaphore.wait()
            value = newValue
            semaphore.signal()
        }
    }

    init(wrappedValue value: Value) {
        self.value = value
    }
}
struct MyStruct {
    @Atomic var counter = 0
    @Atomic2 var counter2 = 0
}

func test() {
    var myStruct = MyStruct()
    DispatchQueue.concurrentPerform(iterations: 1000) {
        myStruct.counter += $0
        myStruct.counter2 += $0
    }
}
How can they be properly tested and measured to see the difference between the two implementations and if they even work?
You asked:
How can they be properly tested and measured to see the difference between the two implementations and if they even work?
A few thoughts:
I’d suggest doing far more than 1,000 iterations. You want to do enough iterations that the results are measured in seconds, not milliseconds. I used ten million iterations in my example.
The unit testing framework is a good tool for both verifying correctness and measuring performance via the measure method (which repeats the performance test 10 times per unit test, with the results captured in the unit test reports):
So, create a project with a unit test target (or add a unit test target to existing project if you want) and then create unit tests, and execute them with command+u.
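A measurement test might look something like this (a sketch; it assumes the queue-based Atomic wrapper from the question is visible to the test target, and the class and method names are illustrative):

```swift
import XCTest

final class AtomicPerformanceTests: XCTestCase {
    // Measures the queue-based `Atomic` wrapper from the question;
    // duplicate this method for `Atomic2` to compare the two.
    func testQueueBasedAtomic() {
        measure {                                   // repeats this block 10 times
            @Atomic var value = 0
            DispatchQueue.concurrentPerform(iterations: 10_000_000) { i in
                value = i                           // synchronized write
                _ = value                           // synchronized read
            }
        }
    }
}
```

Correctness can be checked in the same target, e.g. with XCTAssertEqual on the final value, bearing in mind the read-modify-write caveat discussed later in this answer.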
If you edit the scheme for your target, make sure parallel execution of tests is off (so that the multi-threaded synchronizations are not put at an artificial disadvantage), and you may want to randomize the order of your tests, to make sure the order in which they execute doesn't affect the performance.
I would also make the test target use a release build to make sure you are testing an optimized build.
You might also consider using OSSignposter with a category of .pointsOfInterest, which you can then profile in Instruments. See How to identify key events in Xcode Instruments? After profiling the app, you can select the "Points of Interest" lane, and the "Details" pane at the bottom will show you the min, max, avg, and stdev of how long the intervals took.
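For example (a sketch; the subsystem string and function name are placeholders, and NSLock stands in for whichever synchronization mechanism is under test):

```swift
import os
import Foundation

let signposter = OSSignposter(subsystem: "com.example.MyApp", category: .pointsOfInterest)

func runBenchmark() -> Int {
    // This interval shows up in the "Points of Interest" lane in Instruments.
    let state = signposter.beginInterval("increment loop")
    defer { signposter.endInterval("increment loop", state) }

    let lock = NSLock()
    var counter = 0
    DispatchQueue.concurrentPerform(iterations: 1_000_000) { _ in
        lock.withLock { counter += 1 }
    }
    return counter
}
```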
Needless to say, while I am stress-testing the locks by running ten million iterations, incrementing by one per iteration, that is horribly inefficient. There simply is not enough work on each thread to justify the overhead of the thread handling. One would generally stride through the data set, doing more work per thread and reducing the number of synchronizations. (As you can see in my results above, clever striding can have a far more dramatic impact than changing synchronization methods.)
The practical implication is that in a well-designed parallel algorithm, where you are doing enough work to justify the multiple threads, you are also reducing the number of synchronizations that take place. Thus, the minor variances between the different synchronization techniques become unobservable. If the synchronization mechanism is making an observable performance difference, that probably suggests a deeper problem in the parallelization algorithm. Focus on reducing synchronizations, not on making synchronizations faster.
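To illustrate the striding idea, a chunked version might synchronize once per chunk rather than once per increment (a sketch; the chunk count and the choice of NSLock are arbitrary):

```swift
import Foundation

let lock = NSLock()
var total = 0

let iterations = 10_000_000
let chunks = 8                          // e.g., roughly the core count
let chunkSize = iterations / chunks

DispatchQueue.concurrentPerform(iterations: chunks) { chunk in
    // Do the bulk of the work on a thread-local value…
    var local = 0
    for _ in 0 ..< chunkSize { local += 1 }

    // …and synchronize only once per chunk: 8 synchronizations
    // instead of 10,000,000.
    lock.withLock { total += local }
}

print(total)    // 10,000,000
```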
Probably needless to say, nowadays, with Swift concurrency and async-await, we generally use actors for synchronization. See the WWDC 2021 videos Protect mutable state with Swift actors and Meet async/await in Swift.
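For example, the counter above could be expressed as an actor (a sketch; the names are illustrative):

```swift
import Dispatch

actor Counter {
    private(set) var value = 0

    func increment(by amount: Int) {
        value += amount
    }
}

// Actor isolation serializes the increments; no manual locking required.
func experiment() async -> Int {
    let counter = Counter()

    await withTaskGroup(of: Void.self) { group in
        for _ in 0 ..< 10_000 {
            group.addTask { await counter.increment(by: 1) }
        }
    }

    return await counter.value
}
```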
A few observations regarding different synchronization techniques:
FWIW, the GCD approach will offer better performance than the semaphore approach. Consider:
@propertyWrapper
class Atomic<Value> {
    private var value: Value
    private let queue = DispatchQueue(label: (Bundle.main.bundleIdentifier ?? "Atomic") + ".synchronize")

    var wrappedValue: Value {
        get { queue.sync { value } }
        set { queue.async { self.value = newValue } }
    }

    init(wrappedValue value: Value) {
        self.value = value
    }
}
That is faster than the semaphore technique.
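A common variation on the GCD approach is the reader-writer pattern: a concurrent queue with barrier writes, which allows concurrent reads while keeping writes exclusive (a sketch; the Synchronized name is illustrative, to avoid clashing with the wrappers above):

```swift
import Foundation

@propertyWrapper
class Synchronized<Value> {
    private var value: Value
    private let queue = DispatchQueue(label: "synchronized", attributes: .concurrent)

    var wrappedValue: Value {
        get { queue.sync { value } }                                    // concurrent reads
        set { queue.async(flags: .barrier) { self.value = newValue } }  // exclusive writes
    }

    init(wrappedValue value: Value) {
        self.value = value
    }
}
```

Whether this outperforms a simple serial queue depends on the read/write mix; again, benchmark it for your use case.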
Even better, consider NSLock:
@propertyWrapper
class Atomic<Value> {
    private var value: Value
    private let lock = NSLock()

    var wrappedValue: Value {
        get { lock.withLock { value } }
        set { lock.withLock { value = newValue } }
    }

    init(wrappedValue value: Value) {
        self.value = value
    }
}
Or you can use unfair locks. On iOS 16+ and macOS 13+, you can do this with OSAllocatedUnfairLock:
import os.lock

@propertyWrapper
class Atomic<Value> {
    private let lock: OSAllocatedUnfairLock<Value>

    var wrappedValue: Value {
        get { lock.withLock { $0 } }
        set { lock.withLock { $0 = newValue } }
    }

    init(wrappedValue value: Value) {
        lock = OSAllocatedUnfairLock(initialState: value)
    }
}
For what it is worth, historically unfair locks were significantly faster than NSLock, but in my most recent tests that advantage has been eliminated. You should benchmark optimized builds on your target hardware, OS, and specific use case to verify.
Also, to use unfair locks on earlier OS versions, rather than OSAllocatedUnfairLock, you have to write your own UnfairLock wrapper:
import Foundation
import os.lock

// One should not use `os_unfair_lock` directly in Swift (because Swift
// can move `struct` types), so we'll wrap it in an `UnsafeMutablePointer`.
// See https://github.com/apple/swift/blob/88b093e9d77d6201935a2c2fb13f27d961836777/stdlib/public/Darwin/Foundation/Publishers%2BLocking.swift#L18
// for a stdlib example of this pattern.
final class UnfairLock: NSLocking {
    private let unfairLock: UnsafeMutablePointer<os_unfair_lock> = {
        let pointer = UnsafeMutablePointer<os_unfair_lock>.allocate(capacity: 1)
        pointer.initialize(to: os_unfair_lock())
        return pointer
    }()

    deinit {
        unfairLock.deinitialize(count: 1)
        unfairLock.deallocate()
    }

    func lock() {
        os_unfair_lock_lock(unfairLock)
    }

    func tryLock() -> Bool {
        os_unfair_lock_trylock(unfairLock)
    }

    func unlock() {
        os_unfair_lock_unlock(unfairLock)
    }
}
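Note that Foundation's NSLocking.withLock scoped-locking helper is itself only available on recent OS versions; on older targets you might add your own equivalent (a sketch):

```swift
extension UnfairLock {
    // Scoped-locking convenience for OS versions that predate
    // Foundation's `NSLocking.withLock`: lock, run the body, and
    // guarantee the unlock even if the body throws.
    func withLock<T>(_ body: () throws -> T) rethrows -> T {
        lock()
        defer { unlock() }
        return try body()
    }
}
```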
And then you could use that in a property wrapper:
@propertyWrapper
class Atomic<Value> {
    private var value: Value
    private let lock = UnfairLock()

    var wrappedValue: Value {
        get { lock.withLock { value } }
        set { lock.withLock { value = newValue } }
    }

    init(wrappedValue value: Value) {
        self.value = value
    }
}
Note, none of these will pass muster with the "Strict Concurrency Checking" build setting of "Complete" if the wrapped type is not Sendable. (If it were Sendable, none of this would be necessary, anyway.) For example, consider this example with a non-Sendable type, Foo:
final class Bar: Sendable {
    @Atomic var foo = Foo()
}
… will produce a warning:
Stored property '_foo' of 'Sendable'-conforming class 'Bar' is mutable; this is an error in the Swift 6 language mode
If you try to get around that by making the wrappedValue property nonisolated, that will produce the following warning/error:

'nonisolated' is not supported on properties with property wrappers; this is an error in the Swift 6 language mode
You may have to abandon the property wrapper pattern, and just do something like:
class Atomic<Value>: @unchecked Sendable {
    private var value: Value
    private let lock = NSLock()

    var wrappedValue: Value {
        get { lock.withLock { value } }
        set { lock.withLock { value = newValue } }
    }

    init(wrappedValue value: Value) {
        self.value = value
    }

    func synchronized(_ body: (inout Value) throws -> Void) rethrows {
        try lock.withLock { try body(&value) }
    }
}
Which you can then use like so:

let foo = Atomic(wrappedValue: Foo())
foo.synchronized { $0.value = 42 }
It is not as elegant as the property wrapper approach, but it is a last-resort option when adapting legacy codebases with non-Sendable types whose interactions you want to synchronize to make them thread-safe. Just make sure you do not access the Foo instance outside of the wrapper.
While I have preserved your terminology in the above, I personally would hesitate to call this “atomic”, because it practically invites the use of non-atomic operations. Consider this simple experiment, where we increment an integer ten million times:
func threadSafetyExperiment() {
    @Atomic var foo = 0

    DispatchQueue.global().async {
        DispatchQueue.concurrentPerform(iterations: 10_000_000) { _ in
            foo += 1
        }
        print(foo)
    }
}
You’d expect foo to equal 10,000,000, but it will not. That is because the whole "retrieve the value, increment it, and save it" interaction needs to be wrapped in a single synchronization mechanism.
But you can add an atomic increment method:
extension Atomic where Value: Numeric {
    func increment(by increment: Value) {
        lock.withLock { value += increment }
    }
}
And then this works fine:
func threadSafetyExperiment() {
    @Atomic var foo = 0

    DispatchQueue.global().async {
        DispatchQueue.concurrentPerform(iterations: 10_000_000) { _ in
            _foo.increment(by: 1)
        }
        print(foo)
    }
}
In short, be wary of accessor-level property wrappers: That is generally the wrong layer of abstraction to perform the synchronization.
You might consider using a package, such as Swift Atomics, which handles this better. But even its "Proceed at Your Own Risk" section warns us:
The Atomics package provides carefully considered API for atomic operations that follows established design principles for Swift APIs. However, the underlying operations work on a very low level of abstraction. Atomics – even more than other low-level concurrency constructs – are notoriously difficult to use correctly.
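For example, using the package's ManagedAtomic type (a sketch; it assumes the swift-atomics package has been added as a dependency of the target):

```swift
import Atomics
import Dispatch

let counter = ManagedAtomic<Int>(0)

DispatchQueue.concurrentPerform(iterations: 10_000_000) { _ in
    // A single hardware-level atomic read-modify-write; no lock needed.
    counter.wrappingIncrement(by: 1, ordering: .relaxed)
}

print(counter.load(ordering: .relaxed))   // 10,000,000
```

Here the increment itself is atomic, so the read-modify-write race described above cannot occur.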