javadecompiler

How to identify a ternary operator from Java bytecode


I am working on a Java bytecode project and need to identify ternary operators, including nested ones.

I have two questions:

  1. How can I determine that an if statement is actually a ternary operator? Based on the stack variables?
  2. How can I determine that a value an if statement consumes from the operand stack is the result of a ternary operator?

example:

((a > b ? 0 : 1) > (a > c ? 10 : 20)) ? 100 : 101

Here is a full example of a ternary operator chain.

Source code:

public void ddddd()
{
    int m,k, n, z;
    Random r = new Random();
    boolean p = (r.nextInt() > 100 || (30 > r.nextInt() ? (r.nextInt() > 4000 ? 1 : 0) : (r.nextInt() > 2000 ? 100 : 20)) <  (r.nextInt() > 1000 ? 0 : 1));
}

Structured bytecode:

 [   7          astore] java/util/Random var1(1) = <init>(v0)
 [   9           aload] java/util/Random var1(1) = (stack_var)var1
 [  11   invokevirtual] I v11 = var1.nextInt()
 [  14          bipush] (I)v14 = 100
 if { // block_id: 16 16 => 89  parent_id: 0
    [  16       if_icmpgt] v11 > v14 : goto => 85
    if_false_block { // block_id: 15 19 => 89  parent_id: 16
        [  19          bipush] (I)v19 = 30
        [  21           aload] java/util/Random var1(1) = (stack_var)var1
        [  23   invokevirtual] I v23 = var1.nextInt()
        if { // block_id: 9 26 => 64  parent_id: 15
            [  26       if_icmple] v19 <= v23 : goto => 48
            if_false_block { // block_id: 8 29 => 45  parent_id: 9
                [  29           aload] java/util/Random var1(1) = (stack_var)var1
                [  31   invokevirtual] I v31 = var1.nextInt()
                [  34          sipush] (I)v34 = 4000
                if { // block_id: 6 37 => 45  parent_id: 8
                    [  37       if_icmple] v31 <= v34 : goto => 44
                    if_false_block { // block_id: 5 40 => 41  parent_id: 6
                        [  40        iconst_1] (I)v40 = 1
                        [  41            goto] goto: 66
                    }
                    if_true_block { // block_id: 4 44 => 45  parent_id: 6
                        [  44        iconst_0] (I)v44 = 0
                        [  45            goto] goto: 66
                    }
                }
            }
            if_true_block { // block_id: 7 48 => 64  parent_id: 9
                [  48           aload] java/util/Random var1(1) = (stack_var)var1
                [  50   invokevirtual] I v50 = var1.nextInt()
                [  53          sipush] (I)v53 = 2000
                if { // block_id: 3 56 => 64  parent_id: 7
                    [  56       if_icmple] v50 <= v53 : goto => 64
                    if_false_block { // block_id: 2 59 => 61  parent_id: 3
                        [  59          bipush] (I)v59 = 100
                        [  61            goto] goto: 66
                    }
                    if_true_block { // block_id: 1 64 => 64  parent_id: 3
                        [  64          bipush] (I)v64 = 20
                    }
                }
            }
        }
        [  66           aload] java/util/Random var1(1) = (stack_var)var1
        [  68   invokevirtual] I v68 = var1.nextInt()
        [  71          sipush] (I)v71 = 1000
        if { // block_id: 12 74 => 81  parent_id: 15
            [  74       if_icmple] v68 <= v71 : goto => 81
            if_false_block { // block_id: 11 77 => 78  parent_id: 12
                [  77        iconst_0] (I)v77 = 0
                [  78            goto] goto: 82
            }
            if_true_block { // block_id: 10 81 => 81  parent_id: 12
                [  81        iconst_1] (I)v81 = 1
            }
        }
        if { // block_id: 14 82 => 89  parent_id: 15
            [  82       if_icmpge] v40 >= v77 : goto => 89
            [  85        iconst_1] (I)v85 = 1
            [  86            goto] goto: 90
            if_true_block { // block_id: 13 89 => 89  parent_id: 14
                [  89        iconst_0] (I)v89 = 0
            }
        }
    }
 }
 [  90          istore] I var2(1) = (stack_var)var2
 [  92          return] return

Solution

  • For a ternary operator, the Java compiler generates the same kind of bytecode as for an if/else block. For example:

    int a = m != 0 ? 1 : 0;
    

    After compilation, the class file's bytecode looks like the following (the offsets are made up, ignore them):

    0. iload_1  // load local variable m (an int, so iload rather than aload)
    1. ifeq #4  // jump to #4 when m == 0
    2. iconst_1
    3. goto #5
    4. iconst_0
    5. istore   // store into local variable a
    

    The control flow graph:

    block_0:
        0. iload_1
        1. ifeq #4
    block_1:
        2. iconst_1
        3. goto #5
    block_3:
        4. iconst_0
    block_4:
        5. istore
        
    

    This suggests a simple rule:

    1. block_4 starts with a store/return instruction, and the operand stack is non-empty on entry.
    2. block_4->prev->prev == block_0, and block_0 ends with an if instruction.
    3. Bingo! block_0 through block_4 can be a ternary operator.
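
The rule above can be sketched in Java roughly as follows. This is only an illustration of the naive check, not my actual decompiler code: the `Block` class, its fields, and `findTernaryHead` are hypothetical names made up for this example, and the CFG is built by hand for the six-instruction listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal CFG model, invented for this sketch.
class NaiveTernaryCheck {
    static class Block {
        final String name;
        final String firstOp;      // mnemonic of the first instruction in the block
        final String lastOp;       // mnemonic of the last instruction in the block
        final int stackDepthIn;    // operand stack depth on entry to the block
        final List<Block> preds = new ArrayList<>();

        Block(String name, String firstOp, String lastOp, int stackDepthIn) {
            this.name = name;
            this.firstOp = firstOp;
            this.lastOp = lastOp;
            this.stackDepthIn = stackDepthIn;
        }
    }

    static void edge(Block from, Block to) {
        to.preds.add(from);
    }

    /** Returns the branching block if 'join' looks like the end of a ternary, else null. */
    static Block findTernaryHead(Block join) {
        // Step 1: the join block starts with a value-consuming store/return
        // and something is already on the stack when it is entered.
        boolean consumes = join.firstOp.endsWith("store")
                || (join.firstOp.endsWith("return") && !join.firstOp.equals("return"));
        if (!consumes || join.stackDepthIn == 0 || join.preds.size() != 2) {
            return null;
        }
        // Step 2: both arms hang off the same block, and that block ends with an if.
        Block left = join.preds.get(0);
        Block right = join.preds.get(1);
        if (left.preds.size() != 1 || right.preds.size() != 1) {
            return null;
        }
        Block head = left.preds.get(0);
        if (head != right.preds.get(0) || !head.lastOp.startsWith("if")) {
            return null;
        }
        return head; // Step 3: head..join can be a ternary
    }

    /** Builds the four-block CFG from the listing and runs the check on block_4. */
    static String demo() {
        Block b0 = new Block("block_0", "iload_1", "ifeq", 0);
        Block b1 = new Block("block_1", "iconst_1", "goto", 0);
        Block b3 = new Block("block_3", "iconst_0", "iconst_0", 0);
        Block b4 = new Block("block_4", "istore", "istore", 1);
        edge(b0, b1);
        edge(b0, b3);
        edge(b1, b4);
        edge(b3, b4);
        Block head = findTernaryHead(b4);
        return head == null ? "none" : head.name;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "block_0"
    }
}
```

The check only matches a single, un-nested diamond, which is exactly why it breaks down on chained or nested expressions.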

    This was my naive idea at the beginning, but it does not hold up in the real world.

    For example, the method above does not work for the following expression:

    boolean dddd = (m = 20) > 100 || (30 > ((r.nextInt() > 100 && r.nextInt() != 1000) ? (100 & r.nextInt()) : 20) ? (Or.a1 = this.b1 = m = k = n = 100 > r.nextInt() ? r.nextInt() : 0) : (r.nextInt() > 2000 ? 100 : 20)) < (z = r.nextInt() > 1000 ? 0 : 1);
    

    I was stuck, which is why I asked the question.

    After many days I finally found a solution.

    The control flow graph should be built as follows, with every stack value given a name:

    block_0:
        0. v0 = iload_1
        1. ifeq v0 == 0 goto #4
    block_1:
        2. v1 = iconst_1
        3. goto #5
    block_3:
        4. v2 = iconst_0
    block_4:
        5. var1 = v3
    

    Because my code walks every path of the CFG, v3 is easily marked as the merge of v1 and v2. The CFG then becomes:

    block_0:
        0. v0 = iload_1
        1. ifeq v0 == 0 goto #4
    block_1:
        2. v1 = iconst_1
           v3 = v1
        3. goto #5
    block_3:
        4. v2 = iconst_0
           v3 = v2
    block_4:
        5. var1 = v3
    

    After copying the v3 assignment into block_1 and block_3 and inlining the stack variables:

    block_0:
        0. v0 = iload_1
        1. ifeq v0 == 0 goto #4
    block_1:
        2. v3 = iconst_1
        3. goto #5
    block_3:
        4. v3 = iconst_0
    block_4:
        5. var1 = v3
    

    OK, so the stack variable v3 is the result of a ternary.
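
A minimal Java sketch of the path-walking idea, under heavy simplifications: each block is reduced to "pop N values, then push these named values", and all class and method names (`StackMergeSketch`, `walk`, `mergedSlots`) are hypothetical. It assumes every path reaches a block with the same stack depth, which the JVM verifier requires for valid bytecode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model: a block pops 'pops' values, then pushes the named values.
class StackMergeSketch {
    static class Block {
        final String name;
        final int pops;
        final List<String> pushes;
        final List<Block> succs = new ArrayList<>();
        // Every distinct operand stack seen on entry, one per explored path.
        final Set<List<String>> entryStacks = new LinkedHashSet<>();

        Block(String name, int pops, String... pushes) {
            this.name = name;
            this.pops = pops;
            this.pushes = Arrays.asList(pushes);
        }
    }

    /** Walks every path from b, recording the symbolic stack at each block entry. */
    static void walk(Block b, List<String> stack) {
        if (!b.entryStacks.add(stack)) {
            return; // this (block, stack) state was already explored
        }
        List<String> out = new ArrayList<>(stack.subList(0, stack.size() - b.pops));
        out.addAll(b.pushes);
        for (Block s : b.succs) {
            walk(s, out);
        }
    }

    /** Stack slots that hold different values depending on the path taken. */
    static Map<Integer, Set<String>> mergedSlots(Block b) {
        Map<Integer, Set<String>> merged = new TreeMap<>();
        if (b.entryStacks.size() < 2) {
            return merged;
        }
        int depth = b.entryStacks.iterator().next().size();
        for (int slot = 0; slot < depth; slot++) {
            Set<String> values = new TreeSet<>();
            for (List<String> stack : b.entryStacks) {
                values.add(stack.get(slot));
            }
            if (values.size() > 1) {
                merged.put(slot, values); // candidate ternary result (the "v3")
            }
        }
        return merged;
    }

    /** The four-block example: block_1 leaves v1, block_3 leaves v2, block_4 stores. */
    static Map<Integer, Set<String>> demo() {
        Block b0 = new Block("block_0", 0);
        Block b1 = new Block("block_1", 0, "v1");
        Block b3 = new Block("block_3", 0, "v2");
        Block b4 = new Block("block_4", 1);
        b0.succs.add(b1);
        b0.succs.add(b3);
        b1.succs.add(b4);
        b3.succs.add(b4);
        walk(b0, new ArrayList<>());
        return mergedSlots(b4);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "{0=[v1, v2]}"
    }
}
```

Slot 0 at the entry of block_4 holds v1 on one path and v2 on the other, so it is the merged value that the store consumes.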

    This is not the full story. Later I learned about Static Single Assignment (SSA) form:

    if block A has a non-empty dominance frontier && block A's stack-out depth > 0
      for each child in block A's successors
        insert a phi node at the head of child
    

    With phi nodes it is much easier to find v3, and there is no need to walk every path of the control flow graph.

    I can't write everything here, it's too much. I hope this helps others.
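
A rough Java sketch of the phi insertion for operand stack values. To stay short it does not compute real dominance frontiers: it uses "the successor has more than one predecessor" as a stand-in, which gives the same answer for a simple diamond but is an assumption of this example, not the general SSA construction. All names (`PhiSketch`, `insertPhis`, `phiOf`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical CFG model; each block leaves at most one value on the stack.
class PhiSketch {
    static class Block {
        final String name;
        final String stackOut; // value left on the stack at exit, or null if empty
        final List<Block> preds = new ArrayList<>();
        final List<Block> succs = new ArrayList<>();
        final List<String> phiOperands = new ArrayList<>();

        Block(String name, String stackOut) {
            this.name = name;
            this.stackOut = stackOut;
        }
    }

    static void edge(Block from, Block to) {
        from.succs.add(to);
        to.preds.add(from);
    }

    /**
     * For every block that exits with a non-empty stack, adds its value as a
     * phi operand at the head of each successor that is a join point.
     * Simplification: "join point" stands in for the dominance frontier check.
     */
    static void insertPhis(List<Block> blocks) {
        for (Block a : blocks) {
            if (a.stackOut == null) {
                continue;
            }
            for (Block child : a.succs) {
                if (child.preds.size() > 1) {
                    child.phiOperands.add(a.stackOut);
                }
            }
        }
    }

    /** Renders the phi node at the head of b, or "" if the block has none. */
    static String phiOf(Block b, String target) {
        if (b.phiOperands.isEmpty()) {
            return "";
        }
        return target + " = phi(" + String.join(", ", b.phiOperands) + ")";
    }

    /** Same four-block diamond as in the CFG listings. */
    static String demo() {
        Block b0 = new Block("block_0", null);
        Block b1 = new Block("block_1", "v1");
        Block b3 = new Block("block_3", "v2");
        Block b4 = new Block("block_4", null);
        edge(b0, b1);
        edge(b0, b3);
        edge(b1, b4);
        edge(b3, b4);
        insertPhis(Arrays.asList(b0, b1, b3, b4));
        return phiOf(b4, "v3");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "v3 = phi(v1, v2)"
    }
}
```

A phi node whose operands all come from the arms of one conditional branch is then the natural place to rebuild the `cond ? v1 : v2` expression.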