I have been looking at how transactional memory is implemented in Haskell, and I am not sure I understand how the STM operations exposed to the programmer hook into the runtime system functions written in C. In ghc/libraries/base/GHC/Conc/Sync.hs of the git repo, I see the following definitions:
    -- |A monad supporting atomic memory transactions.
    newtype STM a = STM (State# RealWorld -> (# State# RealWorld, a #))
                    deriving Typeable

    -- |Shared memory locations that support atomic memory transactions.
    data TVar a = TVar (TVar# RealWorld a)
                  deriving Typeable

    -- |Create a new TVar holding a value supplied
    newTVar :: a -> STM (TVar a)
    newTVar val = STM $ \s1# ->
        case newTVar# val s1# of
             (# s2#, tvar# #) -> (# s2#, TVar tvar# #)
Then in ghc/rts/PrimOps.cmm, I see the following C-- definition:
    stg_newTVarzh (P_ init){
        W_ tv;

        ALLOC_PRIM_P (SIZEOF_StgTVar, stg_newTVarzh, init);

        tv = Hp - SIZEOF_StgTVar + WDS(1);
        SET_HDR (tv, stg_TVAR_DIRTY_info, CCCS);

        StgTVar_current_value(tv) = init;
        StgTVar_first_watch_queue_entry(tv) = stg_END_STM_WATCH_QUEUE_closure;
        StgTVar_num_updates(tv) = 0;

        return (tv);
    }
My questions:
What does # mean in (# s2#, TVar tvar# #)? I've read before that putting a # after a variable is just a naming convention indicating something is unboxed, but what does it mean when it is by itself?
How do we get from newTVar# to stg_newTVarzh? It seems like I am missing another definition between these two. Does the compiler rewrite newTVar# into a call to the C-- function listed?
What are P_ and W_ in the C-- code?
I have only been able to find one other occurrence of newTVar#, in ghc/compiler/prelude/primops.txt.pp:
    primop  NewTVarOp "newTVar#" GenPrimOp
           a
        -> State# s -> (# State# s, TVar# s a #)
       {Create a new {\tt TVar\#} holding a specified initial value.}
       with
       out_of_line  = True
       has_side_effects = True
According to https://ghc.haskell.org/trac/ghc/wiki/Commentary/PrimOps, this is how primitives are defined so that the compiler knows about them.
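(# s2#, TVar tvar# #) is an unboxed tuple. The (# and #) delimiters are the unboxed tuple syntax itself (enabled by the UnboxedTuples extension); they are not part of any variable name.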
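To make the syntax concrete, here is a minimal sketch of a function that returns an unboxed tuple and a wrapper that consumes it. The names sumProd# and sumProd are hypothetical, invented for this illustration; they are not part of GHC or base. The trailing # on sumProd# is only the naming convention (allowed by MagicHash), while the (# ... #) brackets are what actually make the tuple unboxed.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

-- Hypothetical helper: returns two results at once as an unboxed tuple.
-- No tuple constructor is allocated on the heap for the result; the two
-- components are handed back to the caller directly.
sumProd# :: Int -> Int -> (# Int, Int #)
sumProd# x y = (# x + y, x * y #)

-- Ordinary wrapper: pattern-match on the unboxed tuple with case,
-- in the same way newTVar matches on the result of newTVar#.
sumProd :: Int -> Int -> (Int, Int)
sumProd x y =
  case sumProd# x y of
    (# s, p #) -> (s, p)

main :: IO ()
main = print (sumProd 3 4)  -- prints (7,12)
```

The shape of newTVar is the same: the primop returns the new state token and the raw TVar# together as (# s2#, tvar# #), and the wrapper rebuilds the result with the boxed TVar constructor.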
The name stg_newTVarzh is built from:
the stg_ prefix, which is common to the whole GHC runtime and stands for the Spineless Tagless G-machine, an abstract machine for evaluating functional languages;
newTVar, which is the first part of newTVar#;
the final zh, which is the so-called z-encoding of #: this encoding produces a plain name usable by the linker and the ABI on all platforms by replacing special characters such as the hash (#).
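The construction of the name can be sketched in a few lines of Haskell. This is a simplified illustration, not GHC's own encoder: zEncodeChar, zEncode and rtsSymbol are hypothetical names, and the table below covers only a handful of characters (the real z-encoding handles many more symbols as well as non-ASCII characters).

```haskell
module Main where

import Data.Char (isAlphaNum, isAscii)

-- Partial z-encoding table. Literal 'z' and 'Z' are doubled so that
-- the escape sequences stay unambiguous when decoding.
zEncodeChar :: Char -> String
zEncodeChar 'z'  = "zz"
zEncodeChar 'Z'  = "ZZ"
zEncodeChar '#'  = "zh"
zEncodeChar '.'  = "zi"
zEncodeChar '_'  = "zu"
zEncodeChar '\'' = "zq"
zEncodeChar '$'  = "zd"
zEncodeChar c
  | isAscii c && isAlphaNum c = [c]  -- plain letters and digits pass through
  | otherwise = error ("zEncodeChar: character not covered by this sketch: " ++ [c])

-- Encode a whole identifier character by character.
zEncode :: String -> String
zEncode = concatMap zEncodeChar

-- Hypothetical helper: prepend the runtime's stg_ prefix to the encoded name.
rtsSymbol :: String -> String
rtsSymbol primop = "stg_" ++ zEncode primop

main :: IO ()
main = putStrLn (rtsSymbol "newTVar#")  -- prints stg_newTVarzh
```

Since newTVar contains no z or Z, only the trailing # is rewritten, which gives exactly the symbol defined in PrimOps.cmm.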