I want to use a SharedArray in Julia, and sometimes several processes need to update the same cell. To make this safe I tried using a lock, but it does not work. After running the code below, increment_array should contain a single entry equal to 10000, but instead it ends up with 99xx. I start Julia with julia -p auto, after which nprocs() returns 17.
using Distributed
using SharedArrays
@everywhere using Distributed, SharedArrays
@everywhere safe_increment_lock = ReentrantLock()
function main()
    increment_array = SharedArray{Int}(1)
    @sync @distributed for i in range(1, 10000)
        lock(safe_increment_lock) do
            increment_array[1] += 1
        end
    end
    return increment_array
end
Locks are for thread safety (see Multi-Threading), not for distribution: a ReentrantLock only synchronizes tasks and threads within a single process. You can verify this as follows:
julia> lock(safe_increment_lock)
julia> islocked(safe_increment_lock)
true
julia> @fetchfrom 2 islocked(safe_increment_lock)
false
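As you can see, each process has its own lock; the locks are not shared. The line @everywhere safe_increment_lock = ReentrantLock() creates a separate lock object on every process, so locking one has no effect on the others.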
A version that uses multi-threading instead of distribution could look like this:
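# One lock shared by all threads of this process (needed in a fresh session).
safe_increment_lock = ReentrantLock()

function main()
    increment_array = [0]
    Threads.@threads for i in range(1, 10000)
        lock(safe_increment_lock) do
            increment_array[1] += 1
        end
    end
    return increment_array
end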
Running with julia -t auto:
julia> Threads.nthreads()
8
julia> main()
1-element Vector{Int64}:
10000
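For a single counter like this one, an atomic value can take the place of the lock. The following is only a minimal sketch, assuming plain integer increments are all you need; atomic_main is a hypothetical name that is not part of the code above:

```julia
using Base.Threads

# Hypothetical variant of main() that uses an atomic counter instead of a lock.
function atomic_main(n = 10000)
    counter = Atomic{Int}(0)
    @threads for i in 1:n
        atomic_add!(counter, 1)  # atomic read-modify-write, no lock required
    end
    return counter[]  # read the current value of the atomic
end

atomic_main()
```

Like the lock, the atomic only protects against other threads in the same process, so it does not help with @distributed either.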
Various techniques exist if you really need distribution. Often it makes sense to let each process do as much work as possible on its own and aggregate the results at the end. In your example, something like:
julia> function main()
           increment_array = SharedArray{Int}(nprocs())
           @sync @distributed for i in range(1, 10000)
               increment_array[myid()] += 1
           end
           return sum(increment_array)
       end
main (generic function with 1 method)
julia> main()
10000
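The same aggregate-at-the-end idea can also be written with the reducer form of @distributed, which avoids the SharedArray altogether. This is a sketch rather than a drop-in replacement, assuming the per-iteration work can be expressed as a value to be summed; reduce_main is a hypothetical name:

```julia
using Distributed

# Hypothetical variant of main(): each worker sums its own share of the
# iterations, and (+) combines the partial sums on the calling process.
function reduce_main(n = 10000)
    return @distributed (+) for i in 1:n
        1  # the value of the last expression in the body is what gets reduced
    end
end

reduce_main()
```

With a reducer, @distributed waits for all workers and returns the combined value, so no @sync is needed.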
This is of course not always as easy, or even possible, and there are other techniques. I recommend taking a look at the Transducers.jl package, which can help structure the logic and simplify running it with either distribution or multi-threading.