java android jvm dalvik

How come the same static field has two different ids in dalvik?


I'm writing my own toy Dalvik VM, and I can't seem to figure out how dalvik handles inherited static fields.

Consider the following java code:

class Parent { static int parent_int = 10; }
class MyCode extends Parent {
    public static void main(String[] args){
        System.out.println(parent_int + 1);
    }
}

When compiled and run with javac and java it prints 11 to the console, as one would expect. However, when this is compiled to dalvik, the parent_int value is turned into an sget statement to get field@0000, whilst in the <clinit> method of Parent the field id of parent_int is field@0001.

In my implementation of the Dalvik VM this becomes a problem, since field@0000 is not initialized, even though the Parent class and field@0001 have been.

How does the Dalvik VM handle this? How does it know that they are related, and should be considered the same? And why have they been turned into two different fields in the first place, when they could just as well be one?


Solution

  • Short answer: The field@0000 and field@0001 are actually different references.

    Longer: Let's take this one step at a time:

    Decompiling:

    [tmp]$ dextra  -j -d -D classes.dex
    /* 0 */ class   Parent  {
     /** 1 Static Fields (not printed - use -f to print) **/
     /** 2 Direct Methods  **/
     static  void <clinit> () // Class Constructor
        {
        /* # of Registers: 1 */
        /* # of Instructions: 5 */
        /* 0000: Op 1300 0a00       const/16 v0, 0xa */
        v0 = 10; 
        /* 0002: Op 6700 0100       sput v0, ?@1 */
        Parent.parent_int = v0; // (Field@1)
        /* 0004: Op 0e00            return-void  */
        return;
    
        } // end <clinit>
      void <init> () // Constructor
        {
        /* # of Registers: 1 */
        /* # of Instructions: 4 */
        /* 0000: Op 7010 0500 0000  invoke-direct { v0 } */
        result = java.lang.Object.<init>(v0); // (Method@5(v0))
        /* 0003: Op 0e00            return-void  */
        return;
    
        } // end <init>
        }  // end class Parent
    /* 1 */ class   MyCode extends Parent   {    /** 2 Direct Methods  **/
          void <init> () // Constructor
            {
            /* # of Registers: 1 */
            /* # of Instructions: 4 */
            /* 0000: Op 7010 0300 0000  invoke-direct { v0 } */
            result = Parent.<init>(v0); // (Method@3(v0))
            /* 0003: Op 0e00            return-void  */
            return;
    
            } // end <init>
         public static  void main (java.lang.String[])
            {
            /* # of Registers: 2 */
            /* # of Instructions: 19 */
            /* 0000: Op 6201 0200       sget-object v1, ?@2 */
            v1 = java.lang.System.out; // (Field@2)
            /* 0002: Op 6000 0000       sget v0, ?@0 */
            v0 = MyCode.parent_int; // (Field@0)
            /* 0004: Op d800 0001       add-int/lit8 v0, 1 */
            v0 +=  1;
            /* 0006: Op 6e20 0400 0100  invoke-virtual { v1, v0 } */
            result = java.io.PrintStream.println(java.lang.System.out, MyCode.parent_int); // (Method@4(v1, v0))
            /* 0009: Op 6201 0200       sget-object v1, ?@2 */
            v1 = java.lang.System.out; // (Field@2)
            /* 000b: Op 6000 0100       sget v0, ?@1 */
            v0 = Parent.parent_int; // (Field@1)
            /* 000d: Op d800 0001       add-int/lit8 v0, 1 */
            v0 +=  1;
            /* 000f: Op 6e20 0400 0100  invoke-virtual { v1, v0 } */
            result = java.io.PrintStream.println(java.lang.System.out, Parent.parent_int); // (Method@4(v1, v0))
            /* 0012: Op 0e00            return-void  */
            return;
    
            } // end main
        }  // end class MyCode
    

    (this is your code, with another "System.out.println(Parent.parent_int + 1);" added after yours)

    Now, as you correctly said, the <clinit> of the parent references field@1, whereas main looks at field@0. If we display the fields:

    [tmp]$ dextra  -F classes.dex
             Field (0) Definer: LMyCode; Name: parent_int type: I Flags: 0x0 
             Field (1) Definer: LParent; Name: parent_int type: I Flags: 0x0 
             Field (2) Definer: Ljava/lang/System; Name: out type: Ljava/io/PrintStream; Flags: 0x0
    

    we see that field 1 belongs to Parent, and field 0 to MyCode. Since they are both static, they are owned individually by each class: upon Dalvik's internal initialization of MyCode (which we don't see, since that's handled at the runtime level, nowadays by cloning the shadow from the .ART), the field reference "parent_int" is copied over (from the shadow of the child, not the parent). Further references to the parent field (like the System.out.println I added) will address field 1, not 0.


    EDIT: Note that from the class perspective, there is only one field, belonging to the parent:

    Class 0: Parent File: MyCode.java 1 static fields 2 Direct Methods
    Class 1: MyCode extends LParent; File: MyCode.java 2 Direct Methods

    so that "parent_int" is very much owned by the parent class.
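
    For a toy VM, one plausible way to make field@0 and field@1 land on the same storage is to treat each field id as a symbolic (definer, name, type) reference and resolve it by walking from the named definer up the superclass chain until a class that actually declares the field is found. The sketch below is only an illustration under that assumption, not Dalvik's actual code: ToyClass, FieldRef and ToyResolver are hypothetical names, static values are modelled as one-element int arrays, and interfaces are left out.

    ```java
    import java.util.HashMap;
    import java.util.Map;

    /** Symbolic field reference, modelled on one field id entry: (definer, name, type). */
    final class FieldRef {
        final String definer, name, type;
        FieldRef(String definer, String name, String type) {
            this.definer = definer; this.name = name; this.type = type;
        }
    }

    /** Hypothetical runtime class: only fields it declares itself get storage. */
    final class ToyClass {
        final String descriptor;
        final ToyClass superclass;                                  // null for the root class
        final Map<String, int[]> declaredStatics = new HashMap<>(); // "name:type" -> storage slot

        ToyClass(String descriptor, ToyClass superclass) {
            this.descriptor = descriptor;
            this.superclass = superclass;
        }

        void declareStatic(String name, String type) {
            declaredStatics.put(name + ":" + type, new int[1]);
        }
    }

    /** Hypothetical resolver: maps a symbolic reference to the declaring class's slot. */
    final class ToyResolver {
        private final Map<String, ToyClass> loaded = new HashMap<>();

        void register(ToyClass c) { loaded.put(c.descriptor, c); }

        int[] resolveStatic(FieldRef ref) {
            String key = ref.name + ":" + ref.type;
            // Start at the class named in the reference, then climb to its ancestors.
            for (ToyClass c = loaded.get(ref.definer); c != null; c = c.superclass) {
                int[] slot = c.declaredStatics.get(key);
                if (slot != null) {
                    // A VM would typically make sure c has been initialized (<clinit>) here.
                    return slot;
                }
            }
            throw new NoSuchFieldError(ref.definer + "->" + ref.name + ":" + ref.type);
        }
    }
    ```

    If Parent declares parent_int and MyCode declares nothing, resolving (LMyCode;, parent_int, I) and (LParent;, parent_int, I) both return Parent's slot, so the value stored by Parent's <clinit> is the one main reads. Caching the resolved slot per field id would avoid repeating the walk on every sget.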


    Note that you always reference a field in the context of this or that object/class. Strictly speaking, Dalvik could have collapsed both fields (being identical), but it won't do so, since the cost of having this seemingly "duplicate" field isn't high and is worth the fast access to the definer.