javamemory-modeljava-memory-modeljls

Why no data races in sequentially consistent executions is enough to guarantee that there will be no data races in all executions?


According to the Java Memory Model (JMM):

A program is correctly synchronized if and only if all sequentially consistent executions are free of data races.

If a program is correctly synchronized, then all executions of the program will appear to be sequentially consistent (§17.4.3).

I don't see how the the fact that no SC execution has data races guarantees that every execution has no data races (which means that every execution is SC).
Is there a proof of that?


What I found:

  1. A blog post by Jeremy Manson (one of the authors of the JMM).
    The following paragraph might mean that the guarantee is provided by causality (but I don't see how):

    So there is the intuition behind the model. When you want to justify the fact that a read returns the value of a write, you can do so if:

    a) That write happened before it, or
    b) You have already justified the write.

    The way that the model works is that you start with an execution that has all of the actions justified by property a), and you start justifying actions iteratively. So, in the second example above, you create successive executions that justify 0, then 2, then 3, then 4, then 1.

    This definition has the lovely property of being reasonably intuitive and also guaranteeing SC for DRF programs.

  2. Foundations of the C++ Concurrency Memory Model.
    This article describes C++ memory model (which has similarities with the JMM).
    Section 8 of the article has a proof of a similar guarantee for C++:

    THEOREM 8.1. If a program allows a type 2 data race in a consistent execution, then there exists a sequentially consistent execution, with two conflicting actions, neither of which happens before the other.8

    In effect, we only need to look at sequentially consistent executions in order to determine whether there is a data race in a consistent execution.
    [...]
    8 The latter is essentially the condition used in [25] to define “correctly synchronized” for Java.

    Unfortunately, I'm not sure this proof holds for the JMM because the following doesn't work:

    Consider the longest prefix P of <T that contains no data race. Note that each load in P must see a store that precedes it in either the synchronization or happens-before orders.

    It seems to me that the above doesn't work in JMM because causality allows a read to return a later store.


Solution

  • The proof is in The Java Memory Model by J.Manson, W.Pugh and S.Adve:

    9.2.1 Correctly synchronized programs exhibit only sequentially consistent behaviors

    Lemma 2. Consider an execution E of a correctly synchronized program P that is legal under the Java memory model. If, in E, each read sees a write that happens-before it, E has sequentially consistent behavior.

    Proof:
    Since the execution is legal under the memory model, the execution is happens-before consistent and synchronization order consistent.
    A topological sort on the happens-before edges of the actions in an execution gives a total order consistent with program order and synchronization order. Let r be the first read in E that doesn’t see the most recent conflicting write w in the sorted order but instead sees w′. Let the topological sort of E be αw′βwγrδ.
    Let αw′βwγr′δ′ be the topological sort of an execution E′. E′ is obtained exactly as E, except that instead of r, it performs the action r′, which is the same as r except that it sees w; δ′ is any sequentially consistent completion of the program such that each read sees the previous conflicting write.
    The execution E′ is sequentially consistent, and it is not the case that w′ ‒hb→ w ‒hb→ r, so P is not correctly synchronized.
    Thus, no such r exists and the program has sequentially consistent behavior. □

    Theorem 3. If an execution E of a correctly synchronized program is legal under the Java memory model, it is also sequentially consistent.

    Proof: By Lemma 2, if an execution E is not sequentially consistent, there must be a read r that sees a write w such that w does not happen-before r. The read must be committed, because otherwise it would not be able to see a write that does not happen-before it. There may be multiple reads of this sort; if so, let r be the first such read that was committed. Let Eᵢ be the execution that was used to justify committing r.
    The relative happens-before order of committed actions and actions being committed must remain the same in all executions considering the resulting set of committed actions. Thus, if we don’t have w ‒hb→ r in E, then we didn’t have w ‒hb→ r in Eᵢ when we committed r.
    Since r was the first read to be committed that doesn’t see a write that happens-before it, each committed read in Eᵢ must see a write that happens-before it. Non-committed reads always sees writes that happens-before them. Thus, each read in Eᵢ sees a write that happens-before it, and there is a write w in Eᵢ that is not ordered with respect to r by happens-before ordering.
    A topological sort of the actions in Eᵢ according to their happens-before edges gives a total order consistent with program order and synchronization order. This gives a total order for a sequentially consistent execution in which the conflicting accesses w and r are not ordered by happens-before edges. However, Lemma 2 shows that executions of correctly synchronized programs in which each read sees a write that happens-before it must be sequentially consistent. Therefore, this program is not correctly synchronized. This is a contradiction. □