javamultithreadingcompiler-optimizationinstruction-reordering

How to demonstrate Java instruction reordering problems?


With Java instruction reordering the execution order of the code is changed by the JVM at compile time or run time, possibly causing unrelated statements to be executed out-of-order.

Edit: [Instruction reordering can produce counter-intuitive results. Many CPU architectures can reorder the memory interactions of machine instructions which leads to similar unexpected results even if the compiler did not change the instruction order. Thus, the term memory reordering may be a better fit than instruction reordering.]

So my question is:

Can someone provide an example Java program/snippet, that reliably shows an instruction reordering problem, that is not caused also by other synchronization issues ( such as caching/visibility or non-atomic r/w, as in my failed attempt at such a demo in my previous question )

To emphasize, I am not looking for examples of theoretical reordering issues. What I am looking for is a way to actually demonstrate them by seeing incorrect or unexpected results of a running program.

Barring a faulty behavior example, just showing actual reordering happening in the assembly of a simple program could also be nice.


Solution

  • This demonstrates reordering of certain assignments, out of 1M iterations there is usually couple of printed lines.

    public class App {
    
        public static void main(String[] args) {
    
            for (int i = 0; i < 1000_000; i++) {
                final State state = new State();
    
                // a = 0, b = 0, c = 0
    
                // Write values
                new Thread(() -> {
                    state.a = 1;
                    // a = 1, b = 0, c = 0
                    state.b = 1;
                    // a = 1, b = 1, c = 0
                    state.c = state.a + 1;
                    // a = 1, b = 1, c = 2
                }).start();
    
                // Read values - this should never happen, right?
                new Thread(() -> {
                    // copy in reverse order so if we see some invalid state we know this is caused by reordering and not by a race condition in reads/writes
                    // we don't know if the reordered statements are the writes or reads (we will se it is writes later)
                    int tmpC = state.c;
                    int tmpB = state.b;
                    int tmpA = state.a;
    
                    if (tmpB == 1 && tmpA == 0) {
                        System.out.println("Hey wtf!! b == 1 && a == 0");
                    }
                    if (tmpC == 2 && tmpB == 0) {
                        System.out.println("Hey wtf!! c == 2 && b == 0");
                    }
                    if (tmpC == 2 && tmpA == 0) {
                        System.out.println("Hey wtf!! c == 2 && a == 0");
                    }
                }).start();
    
            }
            System.out.println("done");
        }
    
        static class State {
            int a = 0;
            int b = 0;
            int c = 0;
        }
    
    }
    

    Printing the assembly for the write lambda gets this output (among other..)

                                                    ; {metadata('com/example/App$$Lambda$1')}
      0x00007f73b51a0100: 752b                jne       7f73b51a012dh
                                                    ;*invokeinterface run
                                                    ; - java.lang.Thread::run@11 (line 748)
    
      0x00007f73b51a0102: 458b530c            mov       r10d,dword ptr [r11+0ch]
                                                    ;*getfield arg$1
                                                    ; - com.example.App$$Lambda$1/1831932724::run@1
                                                    ; - java.lang.Thread::run@-1 (line 747)
    
      0x00007f73b51a0106: 43c744d41402000000  mov       dword ptr [r12+r10*8+14h],2h
                                                    ;*putfield c
                                                    ; - com.example.App::lambda$main$0@17 (line 18)
                                                    ; - com.example.App$$Lambda$1/1831932724::run@4
                                                    ; - java.lang.Thread::run@-1 (line 747)
                                                    ; implicit exception: dispatches to 0x00007f73b51a01b5
      0x00007f73b51a010f: 43c744d40c01000000  mov       dword ptr [r12+r10*8+0ch],1h
                                                    ;*putfield a
                                                    ; - com.example.App::lambda$main$0@2 (line 14)
                                                    ; - com.example.App$$Lambda$1/1831932724::run@4
                                                    ; - java.lang.Thread::run@-1 (line 747)
    
      0x00007f73b51a0118: 43c744d41001000000  mov       dword ptr [r12+r10*8+10h],1h
                                                    ;*synchronization entry
                                                    ; - java.lang.Thread::run@-1 (line 747)
    
      0x00007f73b51a0121: 4883c420            add       rsp,20h
      0x00007f73b51a0125: 5d                  pop       rbp
      0x00007f73b51a0126: 8505d41eb016        test      dword ptr [7f73cbca2000h],eax
                                                    ;   {poll_return}
      0x00007f73b51a012c: c3                  ret
      0x00007f73b51a012d: 4181f885f900f8      cmp       r8d,0f800f985h
    

    I am not sure why the last mov dword ptr [r12+r10*8+10h],1h is not marked with putfield b and line 16, but you can see the swapped assignment of b and c (c right after a).

    EDIT: Because writes happen in order a,b,c and reads happen in reverse order c,b,a you should never see an invalid state unless the writes (or reads) are reordered.

    Writes performed by single cpu (or core) are visible in same order by all processors, see e.g. this answer, which points to Intel System Programming Guide Volume 3 section 8.2.2.

    Writes by a single processor are observed in the same order by all processors.