I have this code, which uses a concurrent dispatch queue and the barrier flag to support parallel reads while blocking both reads and writes during a write:
struct SafeDict<Element> {
    private var dict = [String: Element]()
    private var queue = DispatchQueue(label: "WriteQueue", attributes: .concurrent)

    func get(_ key: String) -> Element? {
        queue.sync { return dict[key] }
    }

    mutating func set(_ key: String, value: Element) {
        queue.sync(flags: .barrier) { dict[key] = value }
    }
}

var safeDict = SafeDict<Int>()
for i in 1...4 {
    DispatchQueue.global().async {
        switch i {
        case 1:
            safeDict.get("one")
        case 2:
            safeDict.set("one", value: 1) // Waits for get (good)
        case 3:
            safeDict.get("one") // Runs after set in parallel
        case 4:
            safeDict.get("two") // Runs after set in parallel
        default:
            print("done")
        }
    }
}
But since actor functions are async, parallel reads will wait on each other. How can this be avoided cleanly?
One alternative I can think of is to not use an actor and instead have 2 async methods (get set). And get will await a set task if not nil. But that seems too tedious.
actor SafeDictActor<Element> {
    private var dict = [String: Element]()
    func get(_ key: String) -> Element? { dict[key] }
    func set(_ key: String, value: Element) { dict[key] = value }
}

let safeDictActor = SafeDictActor<Int>()
for i in 1...4 {
    Task {
        switch i {
        case 1:
            await safeDictActor.get("one")
        case 2:
            await safeDictActor.set("one", value: 1) // Waits for get (good)
        case 3:
            await safeDictActor.get("one") // waits for set (good)
        case 4:
            await safeDictActor.get("two") // waits for previous get (bad)
        default:
            print("done")
        }
    }
}
As Paulw11 mentioned in the comments to a recent question, How to solve reader/writer problem with swift concurrency? (and as you note in your question), actors use a basic synchronization mechanism that is essentially incompatible with the reader-writer pattern. Actors are based on a simple (yet elegant) mechanism that prevents races by avoiding any parallel access to shared state, whereas reader-writer is predicated on permitting parallel reads. The two ideas cannot be reconciled within a single actor.
That having been said, I would make a few additional observations:
1. If you really want reader-writer (and I am not sure why you would; see point 3, below), you could just stick with GCD for that object. Yes, we generally avoid intermingling GCD with a Swift concurrency codebase, but it can be done (especially if compartmentalized to a single type, rather than intermingling different tech stacks within a single type). I would not generally advise it unless absolutely essential.
But if integrating this with a Swift concurrency codebase, you might consider conforming this type to @unchecked Sendable. Swift concurrency employs compile-time validation of code thread-safety (especially with the “Strict Concurrency” build setting set to “Complete”; effectively a preview of the sort of checking that we will enjoy in Swift 6). The @unchecked Sendable conformance effectively lets the compiler know that you are vouching for the thread-safety of this object. It lets your object play well within Swift concurrency.
For more information about Sendable, WWDC 2022’s video Eliminate data races using Swift Concurrency is a great primer on the topic.
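To make the first point concrete, here is a minimal sketch of a GCD reader-writer wrapper that is compartmentalized to a single type and marked @unchecked Sendable. The type name SynchronizedDict and the queue label are hypothetical, and the switch from a struct to a final class is my assumption, not something from your question.

```swift
import Dispatch

/// Reader-writer dictionary backed by a concurrent GCD queue.
/// `@unchecked Sendable` is our promise to the compiler that every access
/// to `dict` goes through `queue`; the compiler does not verify this.
final class SynchronizedDict<Element>: @unchecked Sendable {
    private var dict = [String: Element]()
    private let queue = DispatchQueue(label: "SynchronizedDict.queue", attributes: .concurrent)

    /// Reads may run in parallel with other reads.
    func get(_ key: String) -> Element? {
        queue.sync { dict[key] }
    }

    /// The barrier waits for in-flight reads and blocks new ones until the write finishes.
    func set(_ key: String, value: Element) {
        queue.sync(flags: .barrier) { dict[key] = value }
    }
}
```

Because the synchronization lives entirely inside the type, callers in a Swift concurrency codebase can share an instance across tasks without knowing that GCD is involved.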
2. I notice that your reader-writer implementation is using a value type. But the whole idea of value types is to provide a simple mechanism for thread-safety, whereby we provide each thread with its own copy of an object, thereby eliminating potential races. The idea of reader-writer, by contrast, is to allow thread-safe shared access to the same object from multiple threads. Now, you can have a reader-writer value type, but it is a bit curious to say that you want both thread-safe access to a shared object across threads and a separate copy for each thread. There are some cases where you might do this, but many reader-writer use cases call for shared mutable state, which begs for reference semantics. (As an aside, the actor implementation you contemplate uses reference semantics, too.)
3. I understand the intuitive appeal of the reader-writer pattern. But in my experience, a GCD-based implementation generally fails to realize its promise. In all of my benchmarks, (a) it is only marginally more performant than a GCD serial queue; (b) it is often much slower than simple locks; and (c) it often introduces more problems than it solves. E.g., if, as you contemplate, you have 1000 reads, there is a deeper thread-explosion problem, and reader-writer only compounds that problem. In my experience, the overhead of GCD can start to outweigh the potential benefits of parallelized execution. I have tried, in vain, to construct realistic scenarios where reader-writer was faster than, say, a simple unfair lock. I would suggest benchmarking your particular use-case, and avoid assuming that the reader-writer will be faster.
Bottom line, actors are great for high-level thread safety/integrity. Where performance is critical (e.g., that 3% scenario that Knuth contemplated, where performance really is critical, such as compute-intensive parallelized algorithms), I personally skip reader-writer and jump to something more performant (e.g., an unfair lock). But for most practical use-cases, actors are more than adequate.
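For comparison, a lock-based version might look like the sketch below. Note the assumptions: LockedDict is a hypothetical name, and I am using NSLock as a portable stand-in for the unfair lock mentioned above (on Apple platforms you would likely reach for os_unfair_lock or OSAllocatedUnfairLock instead). Treat it as a starting point for your own benchmarks, not as a measured recommendation.

```swift
import Foundation

/// Dictionary guarded by a plain lock. Reads do not run in parallel,
/// but each critical section is only a dictionary lookup or store.
/// `@unchecked Sendable` is our own claim that `lock` guards all access.
final class LockedDict<Element>: @unchecked Sendable {
    private var dict = [String: Element]()
    private let lock = NSLock() // stand-in for an unfair lock (assumption)

    func get(_ key: String) -> Element? {
        lock.lock()
        defer { lock.unlock() }
        return dict[key]
    }

    func set(_ key: String, value: Element) {
        lock.lock()
        defer { lock.unlock() }
        dict[key] = value
    }
}
```

The interface matches the GCD version, so the two can be swapped in a benchmark harness without touching the calling code.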
You said:
But since actor functions are async, parallel reads will wait on each other. How can this be avoided cleanly?
One alternative I can think of is to not use an actor and instead have 2 async methods (get set). And get will await a set task if not nil. But that seems too tedious.
actor SafeDictActor<Element> {
    private var dict = [String: Element]()
    func get(_ key: String) -> Element? { dict[key] }
    func set(_ key: String, value: Element) { dict[key] = value }
}
I’m going to set aside that you suggested “not use an actor” and then showed us an actor implementation. I assume you meant “use an actor.” I’m also going to set aside the suggestion that get and set are async, as they are not. (Yes, you need to await them when you call them from outside the actor, but they are synchronous functions of the actor.)
But an actor does elegantly synchronize this for us. No GCD queues or locks are needed. Simply make it an actor and you are done. But you are correct, two get calls cannot run in parallel. The actor will execute one function at a time. This is how it ensures thread-safety.
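To illustrate the distinction between "synchronous function of the actor" and "needs await at the call site", here is a small sketch. The Sendable constraint on Element, the setIfAbsent method, and the demo function are my additions for illustration and are not part of your code.

```swift
actor SafeDictActor<Element: Sendable> {
    private var dict = [String: Element]()

    func get(_ key: String) -> Element? { dict[key] }
    func set(_ key: String, value: Element) { dict[key] = value }

    /// Inside the actor, `get` and `set` are ordinary synchronous calls,
    /// so no `await` is needed and nothing can interleave between them.
    func setIfAbsent(_ key: String, value: Element) -> Element {
        if let existing = get(key) { return existing }
        set(key, value: value)
        return value
    }
}

/// Hypothetical caller: from outside the actor, each call crosses the
/// isolation boundary and therefore must be awaited.
func demo() async -> (Int?, Int?) {
    let store = SafeDictActor<Int>()
    await store.set("one", value: 1)
    let one = await store.get("one")
    let two = await store.get("two")
    return (one, two)
}
```

The await marks a possible suspension while the caller waits its turn on the actor; the body of get or set itself never suspends.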
For the vast majority of situations, this simple actor implementation is more than sufficient. But for those critical 3% of the cases, where performance really is of paramount concern, only then would we consider other patterns.