In Shenandoah 1.0 every single Object had an additional header, called the forwarding pointer. Why was that needed, and what led to its elimination in Shenandoah 2.0?
First of all, every single Java Object has two headers: klass and mark. They have been there in each instance since forever (a JVM may handle their flags slightly differently internally in recent versions, for example) and are used for various reasons (I will go into detail about only one of them a bit further in the answer).
The need for a forwarding pointer is literally in the second part of this answer. The forwarding pointer is needed in both the read barrier and the write barrier in Shenandoah 1.0 (though a read could skip the barrier for some field types - I will not go into detail). In very simple words, it makes concurrent copying much simpler. As said in that answer, it allows the collector to atomically switch the forwarding pointer to the new copy of the Object and then concurrently update all references to point to that new Object.
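To make the "atomically switch, then update references" idea a bit more concrete, here is a minimal sketch in plain Java. It is only a model, not JVM code: the class ForwardedObj, its fwd slot and the resolve/evacuate methods are names I made up for illustration, and the compare-and-set on an AtomicReference stands in for whatever the collector really does.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of an object carrying an extra forwarding slot (not JVM code).
final class ForwardedObj {
    // Points to the object itself until a to-space copy is installed.
    final AtomicReference<ForwardedObj> fwd = new AtomicReference<>();
    final int i;
    final int j;

    ForwardedObj(int i, int j) {
        this.i = i;
        this.j = j;
        fwd.set(this);
    }

    // Barrier sketch: every access first goes through the forwarding slot.
    static ForwardedObj resolve(ForwardedObj ref) {
        return ref.fwd.get();
    }

    // Evacuation sketch: copy the object, then publish the copy with one CAS.
    static ForwardedObj evacuate(ForwardedObj from) {
        ForwardedObj copy = new ForwardedObj(from.i, from.j);
        if (from.fwd.compareAndSet(from, copy)) {
            return copy;        // this caller installed the copy
        }
        return from.fwd.get();  // another caller won the race; use its copy
    }
}
```

After evacuate(a) returns, resolve(a) yields the copy for everyone who still holds the old reference, which is why the references themselves can be fixed up later and concurrently.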
Things have changed a bit in Shenandoah 2.0, where the "to-space invariant" is in place: all writes and reads are done via the to-space. This means one interesting thing: once the to-space copy is established, the from-space copy is never used. Imagine a situation like this:
refA               refB
 |                  |
fwdPointer1 ---- fwdPointer2
                    |
---------        ---------
| i = 0 |        | i = 0 |
| j = 0 |        | j = 0 |
---------        ---------
In Shenandoah 1.0 there were cases when a read via refA could bypass the barrier (not use it at all) and still read from the from-space copy. This was allowed for final fields, for example (via a special flag). This means that even if the to-space copy already existed and there were already references to it, there could still be reads (via refA) that would go to the from-space copy. In Shenandoah 2.0 this is prohibited.
This information was used in a rather interesting way. Every object in Java is aligned to 64 bits (8 bytes) - meaning the last 3 bits of its address are always zero. So, they dropped the forwarding pointer and said: if the last two bits of the mark word are 11 (this is allowed since no one else uses it in this manner), this is a forwarding pointer; otherwise the to-space copy does not yet exist and this is a plain header. You can see it in action right here and you can trace the masking here and here.
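The bit trick itself is easy to show in isolation. The following is a small sketch with plain long values, not the real HotSpot code: MarkWordSketch, TAG_MASK, FORWARDED_TAG and the three methods are names I invented, and the "address" is just a number that happens to be 8-byte aligned.

```java
// Hypothetical helper illustrating the tagging scheme (not the HotSpot implementation).
final class MarkWordSketch {
    static final long TAG_MASK = 0b11L;       // the two lowest bits of the word
    static final long FORWARDED_TAG = 0b11L;  // 11 means "this word is a forwardee"

    // True if the word holds a forwarding pointer rather than a plain mark word.
    static boolean isForwarded(long markWord) {
        return (markWord & TAG_MASK) == FORWARDED_TAG;
    }

    // Store an 8-byte-aligned address in the word and tag it with 11.
    static long encodeForwardee(long toSpaceAddress) {
        if ((toSpaceAddress & 0b111L) != 0) {
            throw new IllegalArgumentException("address must be 8-byte aligned");
        }
        return toSpaceAddress | FORWARDED_TAG;
    }

    // Strip the tag to recover the address of the to-space copy.
    static long decodeForwardee(long markWord) {
        return markWord & ~TAG_MASK;
    }
}
```

Because aligned addresses always end in 000, OR-ing in 11 loses no information, and masking the two bits off gives the original address back.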
It used to look like this:
| -------------------|
| forwarding Pointer |
| -------------------|
| -------------------|
| mark |
| -------------------|
| -------------------|
| class |
| -------------------|
And has transformed to:
| -------------------|
| mark or forwarding | // depending on the last two bits
| -------------------|
| -------------------|
| class |
| -------------------|
So here is a possible scenario (I'll skip the class header for simplicity):
refA, refB
|
mark (last two bits are 00)
|
---------
| i = 0 |
| j = 0 |
---------
GC kicks in. The object referenced by refA/refB is alive, thus must be evacuated (it is said to be in the "collection set"). First a copy is created, and the mark is atomically made to reference that copy (the last two bits are also set to 11 to make it a forwardee and not a mark word):
refA, refB
    |
mark (11) ------ mark (00)
                   |
---------        ---------
| i = 0 |        | i = 0 |
| j = 0 |        | j = 0 |
---------        ---------
Now one of the mark words has a bit pattern (ends in 11) that indicates that it is a forwardee and not a mark word anymore.
refA              refB
 |                 |
mark (11) ------ mark (00)
                   |
---------        ---------
| i = 0 |        | i = 0 |
| j = 0 |        | j = 0 |
---------        ---------
refB can be moved concurrently, and then so can refA; ultimately there are no references to the from-space object and it is garbage. This is how the mark word acts as a forwarding pointer, if needed.
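The whole scenario can be replayed with a toy model. Again, this is a sketch under loud assumptions and not how the JVM is written: EvacuationSketch, its Obj class, the fake heap map and the made-up addresses are all hypothetical, the code is single-threaded apart from the CAS, and a real collector would also preserve the original mark word in the copy, which I skip here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of "mark word doubles as forwarding pointer" (hypothetical, not JVM code).
final class EvacuationSketch {
    static final long FWD_TAG = 0b11L;
    static final Map<Long, Obj> heap = new HashMap<>(); // fake address -> object
    static long nextAddress = 8;                        // fake, 8-byte-aligned addresses

    static final class Obj {
        final long address;
        final AtomicLong mark = new AtomicLong(0L);     // ends in 00: plain mark word
        final int i;
        final int j;

        Obj(int i, int j) {
            this.i = i;
            this.j = j;
            this.address = nextAddress;
            nextAddress += 8;
            heap.put(address, this);
        }
    }

    // Follow the forwardee if the mark word ends in 11, otherwise keep the reference.
    static Obj resolve(Obj ref) {
        long m = ref.mark.get();
        return (m & FWD_TAG) == FWD_TAG ? heap.get(m & ~FWD_TAG) : ref;
    }

    // Copy the object and atomically turn the old mark word into a forwardee.
    static Obj evacuate(Obj from) {
        long m = from.mark.get();
        if ((m & FWD_TAG) == FWD_TAG) {
            return heap.get(m & ~FWD_TAG);              // already evacuated
        }
        Obj copy = new Obj(from.i, from.j);
        if (from.mark.compareAndSet(m, copy.address | FWD_TAG)) {
            return copy;
        }
        return resolve(from);                           // lost the race; use the winner's copy
    }
}
```

With refA and refB both pointing at the same Obj, calling evacuate once and then refB = resolve(refB) followed by refA = resolve(refA) reproduces the three pictures above: first both references see mark (00), then the old mark ends in 11, and finally nobody references the from-space object.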