From what I understand, modifications to `IORef`s are very quick; all they involve is updating a thunk pointer. Of course, a reader (i.e. someone who wants to see the value on their webpage) will then need to take time to evaluate these thunks, which may build up if writers never read back the results.

I was thinking it would be good to start actually evaluating the modification thunks on the `IORef` in parallel, since in many circumstances they will probably have to be evaluated at some point anyway (obviously, this will break with infinite data structures).
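To make the buildup concrete, here is a minimal, self-contained sketch of the problem (`thunkBuildup` is a name invented for illustration): each `atomicModifyIORef` call only swaps in a new unevaluated thunk, and the eventual reader pays the whole evaluation cost at once.

```haskell
import Data.IORef

-- Stack up n unevaluated (+1) thunks in an IORef.  Nothing is evaluated
-- during the writes; the value read back is a tower of n thunks that the
-- caller pays to evaluate.
thunkBuildup :: Int -> IO Int
thunkBuildup n = do
  ref <- newIORef 0
  -- each call just swaps in a new thunk (+1 applied to the old contents)
  mapM_ (\_ -> atomicModifyIORef ref (\x -> (x + 1, ()))) [1 .. n]
  readIORef ref

main :: IO ()
main = thunkBuildup 1000 >>= print  -- prints 1000
```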
So I've written the following function, with a similar signature to `atomicModifyIORef`:
```haskell
import Control.DeepSeq (NFData, force)  -- from the deepseq package
import Control.Parallel (par)           -- from the parallel package
import Data.IORef

atomicModifyIORefPar :: (NFData a) => IORef a -> (a -> (a, b)) -> IO b
atomicModifyIORefPar ioref f =
  let
    g olddata =
      let (newdata, result) = f olddata in (newdata, (result, newdata))
  in do
    (result, newdata) <- atomicModifyIORef ioref g
    force newdata `par` return result
```
This seems to work (test code here). Is there anything I've done wrong here? Or is there a better way to do this?
Edit: Second attempt

Inspired by Carl's answer below. We now actually store `force newdata` into the `IORef`. This is the same value as `newdata` anyway, but it shows the runtime that we want to keep `force newdata` around for later, so that it doesn't garbage-collect the spark.
```haskell
atomicModifyIORefPar :: (NFData a) => IORef a -> (a -> (a, b)) -> IO b
atomicModifyIORefPar ioref f =
  let
    g olddata =
      let
        (newdata, result) = f olddata
        newdata_forced = force newdata
      in
        (newdata_forced, (result, newdata_forced))
  in do
    (result, newdata_forced) <- atomicModifyIORef ioref g
    newdata_forced `par` return result
```
This may or may not work, depending on the version of GHC. The spark pool's interaction with GC has varied throughout history. In some versions, the fact that the expression `force newdata` isn't referred to by anything in scope after `atomicModifyIORefPar` returns means that it's likely to be garbage-collected before the spark created by `par` is ever converted, which means that the spark will also be collected.
Other versions of GHC have treated the spark pool as roots in GC analysis, but that has problems too. I don't remember what the current state is, but I suspect it's that the spark pool does not count as GC roots. The problems it raises (loss of parallelism when the returned expressions don't refer to the expressions being evaluated in parallel) are less bad than the problems created by treating the spark pool as GC roots (retaining memory that isn't needed).
Edit - second attempt at answering
This new implementation looks right, for the reasons you give. The expression being evaluated in parallel is also reachable from the GC roots.
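For completeness, one way to sidestep the spark pool's GC behaviour altogether is to fork a thread that forces the new value: the running thread itself keeps the expression reachable, so the work can't be lost the way an unreferenced spark can. This is a sketch, not code from the post; `atomicModifyIORefEager` is a hypothetical name, and it uses only `base` and `deepseq` (it trades cheap sparks for a real `forkIO` thread per write):

```haskell
import Control.Concurrent (forkIO)
import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)
import Control.Monad (void)
import Data.IORef

-- Hypothetical variant: instead of `par`, fork a thread that forces the
-- new value.  The thunk stored in the IORef is shared with `newdata`, so
-- forcing it in the thread updates the IORef's contents in place.
atomicModifyIORefEager :: NFData a => IORef a -> (a -> (a, b)) -> IO b
atomicModifyIORefEager ioref f = do
  (result, newdata) <- atomicModifyIORef ioref g
  -- the forked thread holds a reference to `force newdata`, so the
  -- evaluation can't be discarded by GC before it runs
  void (forkIO (void (evaluate (force newdata))))
  return result
  where
    g olddata =
      let (newdata, result) = f olddata
      in (newdata, (result, newdata))

main :: IO ()
main = do
  ref <- newIORef [1, 2, 3 :: Int]
  r <- atomicModifyIORefEager ref (\xs -> (map (* 2) xs, length xs))
  print r                  -- 3
  readIORef ref >>= print  -- [2,4,6]
```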