haskell garbage-collection ghc memory-safety

Under what circumstances can a write to a Haskell unsafeThaw-ed array/vector result in a segfault (via lost GC root)?


This question tries to assemble, from fragments of information, the full picture of whether and when a stale object reference can arise from an old-generation (immutable) array referring to a younger-generation object.

Preface: I was originally researching unsafeThaw/unsafeFreeze for vectors, but since these seem to be built on the underlying unsafe[...]Array operations of (M)Array, I continued the search on those, assuming that vector's thin wrapper around them doesn't change the situation much.

Edit: extending the scope of the quoted message thread and updating the synthesis part based on new learnings.

The information fragments

In https://mail.haskell.org/pipermail/glasgow-haskell-users/2014-May/024978.html (Using mutable array after an unsafeFreezeArray, and GC details, reply by Edward Z. Yang):

  1. What happens when I do newArray s x >>= \a-> unsafeFreezeArray a >> return a and then use a? What problems could that cause?

[...] newArray# and unsafeFreezeArray#, what this operation does is allocate a new array of pointers (initially recorded as mutable), and then freezes it in-place (by changing the info-table associated with it), but while maintaining a pointer to the original mutable array. Nothing bad will happen immediately, but if you use this mutable reference to mutate the pointer array, you can cause a crash (in particular, if the array makes it to the old generation, it will not be on the mutable list and so if you mutate it, you may be missing roots.)

In this scenario the important points are that the newly created mutable array is unsafe-frozen, and that a reference to the original array (still mutable from the Haskell user-land point of view) is kept and used later.

I thought the following thread contradicted the unsafety stated above: https://mail.haskell.org/pipermail/glasgow-haskell-users/2012-March/022140.html (How unsafe are unsafeThawArray# and unsafeFreezeArray#, reply by Simon Marlow)

I just ran across some code that calls unsafeThawArray#, writeArray#, and unsafeFreezeArray#, in that order. How unsafe is that?

  • Is it unsafe in the sense that if someone has a reference to the original Array# they will see the value of that pure array change?

Yes.

  • Is it unsafe in the sense things will crash and burn?

No. (at least, we would consider it a bug if a crash was the result)
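The thaw, write, freeze sequence asked about in that thread can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes the boxed-array wrappers unsafeThawSTArray, writeSTArray and unsafeFreezeSTArray from GHC.Arr in base (which sit on top of the primops named above), and unsafeSetInPlace is a hypothetical helper name.

```haskell
import Control.Exception (evaluate)
import Control.Monad.ST (stToIO)
import GHC.Arr
  ( Array, elems, listArray
  , unsafeFreezeSTArray, unsafeThawSTArray, writeSTArray )

-- | Overwrite one slot of an "immutable" array in place.
-- Hypothetical helper for illustration only.
unsafeSetInPlace :: Array Int e -> Int -> e -> IO (Array Int e)
unsafeSetInPlace arr i x = stToIO $ do
  marr <- unsafeThawSTArray arr   -- no copy: same heap object, now writable
  writeSTArray marr i x           -- mutate through the fresh mutable handle
  unsafeFreezeSTArray marr        -- no copy: marr must not be used after this

main :: IO ()
main = do
  -- Force the array so it exists on the heap before it is thawed.
  original <- evaluate (listArray (0, 2) [0, 0, 0] :: Array Int Int)
  updated  <- unsafeSetInPlace original 1 42
  print (elems original)  -- read through the old "pure" reference
  print (elems updated)   -- read through the re-frozen reference
```

Since no copy is made, both references point at the same heap object, so the write should be visible through original as well. That is the "Yes" in the reply above: referential transparency is broken, but the sequence itself is not expected to lose GC roots, because the write goes through the handle returned by the thaw.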

The RTS implementation details of unsafeThawArray included at the bottom of the reply also hint that these unsafe thaw/freeze operations not only change the array's mutability marker in place, but also move the array between generations and on or off the mutable list in some way.

The last message of this thread, https://mail.haskell.org/pipermail/glasgow-haskell-users/2012-March/022145.html (reply by Johan Tibell), also hints that unsafeFreezeArray is necessary for its side effect of marking the array as immutable (and, combining this with Simon's reply mentioned above, the array still stays on the mutable list as FROZEN0 until a GC eventually moves it to an immutable GC space), but I am not sure whether this is the full picture.

Searching some more: in the Q&A "Mutable Array in GHC Compact Region", Ben Gamari nicely describes the multi-generation mutable reference scenario, and in the comments section we find the question

When someone calls unsafeFreezeArray#, it seems like they would end up with an immutable value that can point to younger generations. It doesn't seem like GHC changes the frozen array to be in generation 0, so I don't understand how this works with garbage collection.

which, I believe, touches on what we are after. It continues:

just found out about eager promotion, which answers my question.

Then https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts/storage/gc/eager-promotion and https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts/storage/gc/remembered-sets describe how a younger-generation object that is referred to would be promoted into the old generation in which the immutable array lives.

Synthesis?

I originally thought the second (earlier) thread meant that the first thread's unsafety concern does not hold as stated (or is missing some circumstance). I have since learned that the important difference is that, in the first case, the mutable reference is retained, which allows the by-then-immutable array to be mutated later, when it is no longer on the mutable list.

Aside: in Johan Tibell's example, what would happen if the final unsafeFreezeArray were not called?

The unsafeThawArray likely moves the array to a mutable list

because the RTS comments referred to above mention:

   // So, when we thaw a MUT_ARR_PTRS_FROZEN, we must cope with two cases:
   // either it is on a mut_list, or it isn't.  We adopt the convention that
   // the closure type is MUT_ARR_PTRS_FROZEN0 if it is on the mutable list,
   // and MUT_ARR_PTRS_FROZEN otherwise.  In fact it wouldn't matter if
   // we put it on the mutable list more than once, but it would get scavenged
   // multiple times during GC, which would be unnecessarily slow.

Then, since unsafeFreezeArray would never be called, the array would happily stay on the mutable list (it is not clear which generation's mutable list; I assume its own generation's), serving as a GC root for younger generations. That would be fine; it would only prevent the eager promotion optimization (edit: and, by staying on the mutable list, it would add to the linear GC overhead).

Apart from the performance loss, would there be anything unsafe about doing it this way?

Aside over

So the main question is, can this lost-object-ref issue ever arise, and what are the exact circumstances / mechanics involved?


Solution

  • Quoting @andrás-kovács

    in the default moving garbage collector, mutable arrays are always present in a list ("mutable list") that gets scanned on every GC run. Each array has its own dirty flag and bitmask for detecting changed elements. An unsafe freezing marks the array so that on the next garbage collection it gets removed from the list. After this, if you do a mutable write to the array, that marks the array itself as dirty, but it does not add the array back to the mutable list, so it does not get scanned on subsequent GC runs.

    So retaining (and using) a mutable reference after unsafeFreeze-ing is the culprit sequence of operations.

    (Sidenote: for a moment one might think that Storable vectors don't have this problem, since their memory is not managed by the GHC RTS and the frozen and mutable vectors refer to the same ForeignPtr. Then one realizes that Storable vectors can't store references anyway. So in a sense Storable vectors sidestep this problem, but it would still be unwise to rely on this pattern, in case the operation gets generalized beyond Storable and this detail is missed.)
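To make the culprit sequence concrete, here is a hedged sketch that contrasts it with two ways of avoiding it. It assumes the boxed-array functions from GHC.Arr in base; the names culprit, copyingFreeze and rethaw are hypothetical and only for illustration. The second alternative relies on the RTS comment quoted in the question, which suggests that a thaw copes with the array being on or off the mutable list.

```haskell
import Control.Monad.ST (ST, runST)
import GHC.Arr
  ( Array, elems, freezeSTArray, newSTArray
  , unsafeFreezeSTArray, unsafeThawSTArray, writeSTArray )

-- | The culprit sequence, shown for contrast only and never called below:
-- the old mutable handle is written through after an unsafe freeze.
culprit :: ST s (Array Int Int)
culprit = do
  marr   <- newSTArray (0, 2) 0
  frozen <- unsafeFreezeSTArray marr
  writeSTArray marr 0 1          -- stale handle: the array may already be
                                 -- off the mutable list when this runs
  pure frozen

-- | Alternative 1: a copying freeze keeps the mutable handle valid.
copyingFreeze :: ST s (Array Int Int)
copyingFreeze = do
  marr   <- newSTArray (0, 2) 0
  frozen <- freezeSTArray marr   -- allocates a separate immutable copy
  writeSTArray marr 0 1          -- touches only the mutable original
  pure frozen

-- | Alternative 2: thaw again and write through the new handle.
rethaw :: ST s (Array Int Int)
rethaw = do
  marr   <- newSTArray (0, 2) 0
  frozen <- unsafeFreezeSTArray marr
  marr'  <- unsafeThawSTArray frozen  -- fresh handle obtained via a thaw
  writeSTArray marr' 0 1
  unsafeFreezeSTArray marr'

main :: IO ()
main = do
  print (elems (runST copyingFreeze))
  print (elems (runST rethaw))
```

In a program this small the culprit version would probably appear to work, since the array is unlikely to have been promoted and dropped from the mutable list before the write; the danger is that nothing guarantees this in a longer-running program.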