ios · swift · grand-central-dispatch · semaphore · dispatch-async

DispatchQueue threads don't always set the correct results


I am working on the coding game Minesweeper; the following code sets the numbers around the landmines. For a test, I chose the minimum level, 9 x 9 with 10 landmines.

For faster performance, I tried to use more threads when setting the numbers, but one day I found that it doesn't always produce the correct number arrangement. I made a loop that generates the board 1000 times and found that 20 ~ 40 out of 1000 runs are wrong.

Here are several wrong results ("*" represents a landmine, "0" means no landmine around):

wrong mine: index 1 should be "*"

10212*100
112*21211
0011101*1
000000111
000011100
01123*200
01*2**200
023432100
01**10000

wrong num: index 0 should be "1"

0*212*100
112*21211
0011101*1
000000111
000011100
01123*200
01*2**200
023432100
01**10000

wrong num: index 73 should be "1"

1*212*100
112*21211
0011101*1
000000111
000011100
01123*200
01*2**200
023432100
00**10000

Without using DispatchQueue, or with the DispatchSemaphore's value set to 1, it gives the correct number arrangement 1000 times out of 1000:

1*212*100
112*21211
0011101*1
000000111
000011100
01123*200
01*2**200
023432100
01**10000

Here is the sample code:

// actually indexes created randomly every time
let minesIndexArr = [59, 74, 1, 12, 50, 56, 75, 58, 5, 25]
var defaultCellArr = [String]()
var totalCellArr = [String]()
var count = 0

for _ in 1...81 {
    defaultCellArr.append("0")
}

runLoop()

func runLoop() {
    if count == 1000 {
        return
    }
    totalCellArr = defaultCellArr
    setNums()
}

func setNums() {
    let group = DispatchGroup()
    let queue = DispatchQueue(label: "com.setnums", attributes: .concurrent)
    let semaphore = DispatchSemaphore(value: 10)
    for i in 0..<self.totalCellArr.count {
        semaphore.wait()
        group.enter()
        queue.async(group: group, execute: {
            if self.minesIndexArr.firstIndex(of: i) != nil {
                self.totalCellArr[i] = "*"
            } else {
                var num = 0
                let neighbourIndexes = self.getNeighbourIndex(i)
                for v in neighbourIndexes.values {
                    if self.minesIndexArr.firstIndex(of: v) != nil {
                        num += 1
                    }
                }
                self.totalCellArr[i] = String(num)
            }
                            
            group.leave()
            semaphore.signal()
        })
        
    }
    group.notify(queue: DispatchQueue.main) {
        printMap()
        count += 1
        self.runLoop()
    }
}

Solution

  • tl;dr

    You are using this non-zero semaphore to constrain the degree of concurrency of your parallel calculations to something reasonable; for that purpose, I would recommend concurrentPerform instead.

    But the issue here is not how you are constraining the degree of parallelism, but rather that you are using the same properties (shared by all of these concurrent tasks) for your calculations, which means that one iteration on one thread can be mutating these properties while they are being used/mutated by another parallel iteration on another thread.

    So, I would avoid using any shared properties at all (short of the final array of boards). Use local variables only. And make sure to synchronize the updating of this final array so that you don't have two threads mutating it at the same time.
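
    A minimal sketch of that pattern (illustrative, not from the original post): each parallel iteration computes with local variables only, and the single shared array is mutated behind a lock.

    ```swift
    import Foundation

    // Each iteration works on purely local values; the shared array is the
    // only shared state, and it is only ever touched while holding the lock.
    let lock = NSLock()
    var results = [Int]()

    DispatchQueue.concurrentPerform(iterations: 100) { i in
        let local = i * i             // purely local computation
        lock.lock()
        results.append(local)         // the only touch of shared state
        lock.unlock()
    }

    print(results.count)              // 100: no lost updates
    ```

    Because concurrentPerform blocks until every iteration has finished, `results` is complete and safe to read on the next line.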


    So, for example, if you wanted to create the boards in parallel, I would probably use concurrentPerform as outlined in my prior answer:

    func populateBoards(count: Int, rows: Int, columns: Int, mineCount: Int, completion: @escaping ([Board]) -> Void) {
        var boards: [Board] = []
        let lock = NSLock()
    
        DispatchQueue.global().async {
            DispatchQueue.concurrentPerform(iterations: count) { index in
                let board = Board(rows: rows, columns: columns, mineCount: mineCount)
                lock.synchronize {
                    boards.append(board)
                }
            }

            // deliver the boards only after every concurrentPerform iteration has finished
            DispatchQueue.main.async {
                lock.synchronize {
                    completion(boards)
                }
            }
        }
    }
    

    Note, I'm not referencing any ivars. It is all local variables, passing the result back in a closure.

    And to avoid race conditions where multiple threads might be trying to update the same array of boards, I am synchronizing access with an NSLock. (You can use whatever synchronization mechanism you want, but locks are a very performant solution, probably better than a GCD serial queue or a reader-writer pattern in this particular scenario.) That synchronize method is as follows:

    extension NSLocking {
        func synchronize<T>(block: () throws -> T) rethrows -> T {
            lock()
            defer { unlock() }
            return try block()
        }
    }
    

    That is a nice generalized solution (handling closures that might return values, throw errors, etc.), but if that is too complicated to follow, here is a minimalistic rendition that is sufficient for our purposes here:

    extension NSLocking {
        func synchronize(block: () -> Void) {
            lock()
            block()
            unlock()
        }
    }
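
    A quick usage sketch of the generalized version (the extension is reproduced here so the example is self-contained): because the closure's result is propagated through the method, a value-returning read needs no extra variable juggling.

    ```swift
    import Foundation

    extension NSLocking {
        func synchronize<T>(block: () throws -> T) rethrows -> T {
            lock()
            defer { unlock() }
            return try block()
        }
    }

    let lock = NSLock()
    var cache = [String: Int]()

    lock.synchronize { cache["answer"] = 42 }         // Void-returning write
    let value = lock.synchronize { cache["answer"] }  // value-returning read

    print(value ?? 0)                                 // 42
    ```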
    

    Now, I confess that I'd probably employ a different model for the board. I would define a Square enum for the individual squares of the board, and then define a Board that is an array (of rows) of arrays (of columns) of these squares. Anyway, this is my implementation of the Board:

    enum Square {
        case count(Int)
        case mine
    }
    
    struct Board {
        let rows: Int
        let columns: Int
        var squares: [[Square]]
    
        init(rows: Int, columns: Int, mineCount: Int) {
            self.rows = rows
            self.columns = columns
    
            // populate board with all zeros
    
            self.squares = (0..<rows).map { _ in
                Array(repeating: Square.count(0), count: columns)
            }
    
            // now add mines
    
            addMinesAndUpdateNearbyCounts(mineCount)
        }
    
        mutating func addMinesAndUpdateNearbyCounts(_ mineCount: Int) {
            let mines = (0..<rows * columns)
                .map { index in
                    index.quotientAndRemainder(dividingBy: columns)
                }
                .shuffled()
                .prefix(mineCount)
    
            for (mineRow, mineColumn) in mines {
                squares[mineRow][mineColumn] = .mine
                for row in mineRow-1 ... mineRow+1 where row >= 0 && row < rows {
                    for column in mineColumn-1 ... mineColumn+1 where column >= 0 && column < columns  {
                        if case .count(let n) = squares[row][column] {
                            squares[row][column] = .count(n + 1)
                        }
                    }
                }
            }
        }
    }
    
    extension Board: CustomStringConvertible {
        var description: String {
            var result = ""
    
            for row in 0..<rows {
                for column in 0..<columns {
                    switch squares[row][column] {
                        case .count(let n): result += String(n)
                        case .mine: result += "*"
                    }
                }
                result += "\n"
            }
    
            return result
        }
    }
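
    One detail worth calling out from addMinesAndUpdateNearbyCounts above: quotientAndRemainder(dividingBy:) converts a flat index (like the values in the question's minesIndexArr) into a (row, column) pair. A tiny sketch:

    ```swift
    // On a 9-column board, flat index 59 maps to row 6, column 5, which is
    // exactly where one of the question's mines (index 59) sits.
    let columns = 9
    let (row, column) = 59.quotientAndRemainder(dividingBy: columns)
    print(row, column)   // 6 5
    ```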
    

    Anyway, I would generate 1000 9×9 boards with ten mines each like so:

    exercise.populateBoards(count: 1000, rows: 9, columns: 9, mineCount: 10) { boards in
        for board in boards {
            print(board)
            print("")
        }
    }
    

    Feel free to use whatever model you want, but I'd suggest encapsulating the model for the board in its own type. It not only abstracts the details of generating a single board away from the multithreaded algorithm that creates lots of boards, but it also naturally avoids any unintended sharing of properties between threads.


    Now, all of this said, this is not a great example of parallelized code, because the creation of a single board is not computationally intensive enough to justify the (admittedly very minor) overhead of running it in parallel. This is not a problem that is likely to benefit much from parallelization: you might see a modest performance improvement, but not nearly as much as you would with something more computationally intensive.
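
    A rough timing sketch, if you want to see this for yourself (illustrative only; the absolute numbers are machine-dependent): when each iteration is this cheap, the synchronization cost dominates, so concurrentPerform's win over a plain serial loop is modest at best.

    ```swift
    import Foundation

    // Time a closure with a simple wall-clock measurement.
    func measure(_ body: () -> Void) -> TimeInterval {
        let start = Date()
        body()
        return Date().timeIntervalSince(start)
    }

    let iterations = 1_000
    let lock = NSLock()
    var sink = 0

    let serialTime = measure {
        for i in 0..<iterations { sink += i }
    }
    let parallelTime = measure {
        DispatchQueue.concurrentPerform(iterations: iterations) { i in
            lock.lock()
            sink += i          // trivial work: synchronization dominates
            lock.unlock()
        }
    }

    print(sink)                // 999000: both loops summed 0..<1000
    print(serialTime, parallelTime)
    ```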