java serialization singleton effective-java

ElvisStealer from Effective Java


Here is the class that steals a reference to a copy of a singleton while the singleton is being deserialized.

public class ElvisStealer implements Serializable {

    static Elvis impersonator;
    private Elvis payload;

    private Object readResolve() {
        // Save a reference to the "unresolved" Elvis instance
        impersonator = payload;
        // Return an object of correct type for favorites field
        return new String[] { "A Fool Such as I" };
    }
    
    private static final long serialVersionUID = 0;
    
}

My questions are:

  1. Where and how exactly is the reference to the copy of the Elvis object acquired? This part does not say much about it:

         // Save a reference to the "unresolved" Elvis instance
         impersonator = payload;

     In other words, how can a child object's readResolve access a reference to its parent object?

  2. To make ElvisStealer work, I have to modify the serialized data by replacing the singleton's instance field (of type String[] in this case) with an ElvisStealer instance. In the book, the singleton contains a non-transient String[] instance field, and that field is replaced with an ElvisStealer instance in the serialized stream. When that field is deserialized, ObjectInputStream sees that it holds an ElvisStealer and invokes readResolve from the ElvisStealer class. My questions are: first, why doesn't the JVM report an error when parsing such a field, knowing that it should hold a String[] rather than an ElvisStealer? Second, why does the JVM invoke readResolve from the ElvisStealer class, knowing that the field should hold a String[] and not an ElvisStealer?

  3. Why does ElvisStealer contain an instance field of type Elvis in addition to the static field? Shouldn't the static field be enough?


Solution

  • Deserialization will always start with fresh instances conjured out of thin air and bytes from the ObjectInputStream.

    After that step you have all new instances like so:

    Elvis(new).favoriteSongs = ElvisStealer(new)
    ElvisStealer(new).payload = Elvis(new)   // same elvis, circular reference
    

    Then, in step 2, deserialization calls the readResolve methods of those instances to "resolve" the preliminary deserialized objects to their final form. It starts with the inner ElvisStealer, whose readResolve runs while the enclosing Elvis is still unresolved.

    Elvis(new).favoriteSongs = ElvisStealer(new).readResolve()
    => Elvis(new).favoriteSongs = String[] { .... }
    

    Next step is to resolve the Elvis instance

    result = Elvis(new).readResolve()
    => result = Elvis(INSTANCE)
    

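    The correct type (String[] instead of the actually invalid ElvisStealer) is only required to be present after the readResolve step, because the value assigned to the field is the object that readResolve returns, not the object that was read from the stream.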
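The resolution order described above can be reproduced without hand-crafted bytes. The following is a minimal sketch under one loud assumption: the favoriteSongs field is declared as Object rather than String[] (as in the book), so that the Elvis/ElvisStealer cycle can be written with a regular ObjectOutputStream. The class name StealDemo and the helper run() are made up for this illustration; the real attack uses a crafted byte[] instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

class StealDemo {
    static class Elvis implements Serializable {
        private static final long serialVersionUID = 0;
        static final Elvis INSTANCE = new Elvis();
        // Assumption for this demo: Object instead of String[], so the cycle
        // can be serialized normally instead of being patched into the bytes.
        private Object favoriteSongs = new String[] { "Hound Dog", "Heartbreak Hotel" };
        private Elvis() { }
        private Object readResolve() { return INSTANCE; }
    }

    static class ElvisStealer implements Serializable {
        private static final long serialVersionUID = 0;
        static Elvis impersonator;
        private Elvis payload;
        private Object readResolve() {
            // payload is a back reference to the Elvis that is still being read
            impersonator = payload;
            return new String[] { "A Fool Such as I" };
        }
    }

    // Writes Elvis -> ElvisStealer -> Elvis as a cycle, reads it back,
    // and returns whatever readObject() hands out for the outer object.
    static Object run() {
        Object originalSongs = Elvis.INSTANCE.favoriteSongs;
        try {
            ElvisStealer stealer = new ElvisStealer();
            stealer.payload = Elvis.INSTANCE;
            Elvis.INSTANCE.favoriteSongs = stealer; // temporary, restored below
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(Elvis.INSTANCE);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        } finally {
            Elvis.INSTANCE.favoriteSongs = originalSongs;
        }
    }

    public static void main(String[] args) {
        Object resolved = run();
        System.out.println("resolved == INSTANCE: " + (resolved == Elvis.INSTANCE));
        System.out.println("impersonator == INSTANCE: "
                + (ElvisStealer.impersonator == Elvis.INSTANCE));
        System.out.println("impersonator songs: "
                + Arrays.toString((Object[]) ElvisStealer.impersonator.favoriteSongs));
    }
}
```

The object returned by readObject() is the resolved singleton, while ElvisStealer.impersonator keeps the fresh Elvis that existed only during deserialization.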

    Having this "invalid" stage in between is useful. You can declare in the writeReplace method that you want a different object serialized instead, and then have that object's readResolve method produce an object of the right type.

    E.g. when you have

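    import java.io.ObjectStreamException;
    import java.io.Serializable;

    class ComplexThing implements Serializable {
        // Called by ObjectOutputStream: serialize the replacement instead of this object
        private Object writeReplace() throws ObjectStreamException {
            return new SimpleHiddenReplacement();
        }
    }
    // A top-level class cannot be private, so it is package-private here
    class SimpleHiddenReplacement implements Serializable {
        // Called by ObjectInputStream: turn the replacement back into the real type
        private Object readResolve() throws ObjectStreamException {
            return new ComplexThing();
        }
    }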
    

    You can pass a ComplexThing to an ObjectOutputStream and you'll get back a ComplexThing from the ObjectInputStream, but under the hood, the bytes those streams operate on are actually the representation of a SimpleHiddenReplacement.
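A runnable sketch of that round trip, in case it helps to see it end to end. The Holder class, the name field, and the helper methods are assumptions added for this illustration. Holder declares a field of type ComplexThing, yet the stream carries a SimpleHiddenReplacement in that position, which is the same "wrong type until readResolve" situation the attack exploits.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class ProxyDemo {
    static class ComplexThing implements Serializable {
        private static final long serialVersionUID = 0;
        final String name;
        ComplexThing(String name) { this.name = name; }
        // Serialize the replacement instead of this object
        private Object writeReplace() { return new SimpleHiddenReplacement(name); }
    }

    static class SimpleHiddenReplacement implements Serializable {
        private static final long serialVersionUID = 0;
        private final String name;
        SimpleHiddenReplacement(String name) { this.name = name; }
        // Rebuild the real type after the replacement has been read
        private Object readResolve() { return new ComplexThing(name); }
    }

    // Hypothetical container: its field is declared as ComplexThing
    static class Holder implements Serializable {
        private static final long serialVersionUID = 0;
        ComplexThing thing;
        Holder(ComplexThing thing) { this.thing = thing; }
    }

    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(o);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Crude check: does the raw stream mention the given text (e.g. a class name)?
    static boolean streamMentions(byte[] bytes, String text) {
        return new String(bytes, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new Holder(new ComplexThing("demo")));
        System.out.println("stream mentions the replacement class: "
                + streamMentions(bytes, "SimpleHiddenReplacement"));
        Holder copy = (Holder) deserialize(bytes);
        System.out.println("field type after reading: " + copy.thing.getClass().getSimpleName());
        System.out.println("name survived: " + copy.thing.name);
    }
}
```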

    The Elvis-stealing attack grabs the freshly created Elvis before its readResolve method has had a chance to replace (resolve) it, that is, while it is still "unresolved".


    ElvisStealer(new).payload = Elvis(new). Where does that happen? I see only the declaration of a payload variable, but it is not initialized in any way.

    The book says

    First, write a “stealer” class that has [..] an instance field that refers to the serialized singleton in which the stealer “hides.” In the serialization stream, replace the singleton’s nontransient field with an instance of the stealer.

    This setup happens in the form of a crafted byte[] serializedForm. It is a fake serialized form of an Elvis object, which unlike normal serialized Elvises contains the stealer with a backreference to the Elvis object.

    Secondly, shouldn't the static impersonator be enough?

    No, serialization does not write static fields, only non-transient instance fields. The attack relies on deserialization initializing the payload field, and deserialization does that only because the crafted serialized form contains a value for it.
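The difference between static and instance fields can be checked with a small experiment. This is only a sketch; the Carrier class and its field names are invented for the illustration. The static field is changed after writing, and reading the object back does not restore the old value, whereas the instance field comes back from the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class StaticFieldDemo {
    // Hypothetical class with one static and one instance field
    static class Carrier implements Serializable {
        private static final long serialVersionUID = 0;
        static String staticNote = "initial";
        String instanceNote;
    }

    static byte[] write(Carrier carrier) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(carrier);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Carrier read(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Carrier) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Crude check: does the raw stream mention the given text?
    static boolean streamMentions(byte[] bytes, String text) {
        return new String(bytes, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        Carrier carrier = new Carrier();
        carrier.instanceNote = "written";
        Carrier.staticNote = "before writing";
        byte[] bytes = write(carrier);

        Carrier.staticNote = "changed after writing";
        Carrier copy = read(bytes);

        System.out.println("instance field from stream: " + copy.instanceNote);
        System.out.println("static field after reading: " + Carrier.staticNote);
        System.out.println("stream mentions instanceNote: " + streamMentions(bytes, "instanceNote"));
        System.out.println("stream mentions staticNote: " + streamMentions(bytes, "staticNote"));
    }
}
```

This is why the stealer needs both fields: payload is the only one the stream can fill in, and impersonator is the only one that outlives the stealer once readResolve has replaced it.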