Here is the class that steals a reference to a copy of the singleton while the singleton is being deserialized:
public class ElvisStealer implements Serializable {
    static Elvis impersonator;
    private Elvis payload;

    private Object readResolve() {
        // Save a reference to the "unresolved" Elvis instance
        impersonator = payload;
        // Return an object of correct type for favorites field
        return new String[] { "A Fool Such as I" };
    }

    private static final long serialVersionUID = 0;
}
My questions are:
// Save a reference to the "unresolved" Elvis instance
impersonator = payload;
In other words, how can the child object's readResolve access a reference to the parent object?
To make ElvisStealer work, I have to modify the serialized data by replacing the singleton's instance field of type String[] (in this case) with an ElvisStealer instance. In the book, the singleton contains a non-transient String[] instance field, and that field is replaced with an ElvisStealer instance in the serialized stream. Then, when that field is deserialized, ObjectInputStream sees that the field holds an ElvisStealer and invokes readResolve from the ElvisStealer class. Why doesn't the JVM report an error when parsing such a field, given that it should hold a String[] rather than an ElvisStealer? And why does the JVM invoke readResolve from the ElvisStealer class when it knows the field should be a String[], not an ElvisStealer?
Why does ElvisStealer contain an instance field of type Elvis in addition to the static field? Shouldn't the static field be enough?
Deserialization will always start with fresh instances conjured out of thin air and bytes from the ObjectInputStream.
After that step you have all new instances like so:
Elvis(new).favoriteSongs = ElvisStealer(new)
ElvisStealer(new).payload = Elvis(new) // same elvis, circular reference
Then in step 2, deserialization uses the readResolve methods of those instances to "resolve" the preliminary deserialized objects to their final form. However, it starts with the inner ElvisStealer.
Elvis(new).favoriteSongs = ElvisStealer(new).readResolve()
=> Elvis(new).favoriteSongs = String[] { .... }
The next step is to resolve the Elvis instance:
result = Elvis(new).readResolve()
=> result = Elvis(INSTANCE)
The correct type (String[] instead of the actually invalid ElvisStealer) is only required to be present after the readResolve step.
Having this "invalid" stage in between is useful. You can declare in the writeReplace method that you want a different object serialized instead, then have code in that object's readResolve method that produces an object of the right type.
E.g. when you have
class ComplexThing implements Serializable {
    // Serialize a SimpleHiddenReplacement in place of this object
    private Object writeReplace() throws ObjectStreamException {
        return new SimpleHiddenReplacement();
    }
}

class SimpleHiddenReplacement implements Serializable {
    // Turn the deserialized replacement back into a ComplexThing
    private Object readResolve() throws ObjectStreamException {
        return new ComplexThing();
    }
}
You can pass a ComplexThing to an ObjectOutputStream and you'll get back a ComplexThing from the ObjectInputStream, but under the hood, the bytes those streams operate on are actually the representation of a SimpleHiddenReplacement.
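To make the writeReplace/readResolve round trip concrete, here is a minimal runnable sketch. It is an illustration under assumptions, not code from the book: the `name` field, the `serialize`/`deserialize` helpers and the `ReplacementDemo` wrapper class are all made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

public class ReplacementDemo {

    // The class that callers work with.
    static class ComplexThing implements Serializable {
        private static final long serialVersionUID = 0;
        final String name; // hypothetical state, only here to show it survives the trip

        ComplexThing(String name) { this.name = name; }

        // Called by ObjectOutputStream: write the stand-in instead of this object.
        private Object writeReplace() {
            return new SimpleHiddenReplacement(name);
        }
    }

    // The object whose bytes actually end up in the stream.
    static class SimpleHiddenReplacement implements Serializable {
        private static final long serialVersionUID = 0;
        private final String name;

        SimpleHiddenReplacement(String name) { this.name = name; }

        // Called by ObjectInputStream once the stand-in has been read.
        private Object readResolve() {
            return new ComplexThing(name);
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new ComplexThing("demo"));

        // Class names are stored as text in the stream, so a crude search works here.
        String raw = new String(data, StandardCharsets.ISO_8859_1);
        System.out.println(raw.contains("SimpleHiddenReplacement")); // true

        Object back = deserialize(data);
        System.out.println(back.getClass().getSimpleName());         // ComplexThing
    }
}
```

The caller only ever sees ComplexThing on both ends, while the stream carries the replacement.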
The Elvis-stealing attack steals the freshly created Elvis before its readResolve method has had a chance to replace (resolve) it, i.e. while it is still unresolved.
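The sequence described above can be reproduced without hand-editing bytes if one simplification is allowed. The sketch below assumes the singleton's field is declared as `Object` rather than `String[]`, so that a stealer can be placed in it and serialized normally; in the book the field is a `String[]` and the stream is crafted by hand. The class names `Singleton`, `Stealer` and `StealerDemo` and the `roundTrip` helper are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class StealerDemo {

    static class Singleton implements Serializable {
        private static final long serialVersionUID = 0;
        static final Singleton INSTANCE = new Singleton();

        // Assumption for this demo: typed as Object so that a Stealer can be
        // stored here and written by ordinary serialization.
        Object favoriteSongs = new String[] { "Hound Dog" };

        private Singleton() { }

        // Replaces every deserialized copy with the one true instance.
        private Object readResolve() { return INSTANCE; }
    }

    static class Stealer implements Serializable {
        private static final long serialVersionUID = 0;
        static Singleton impersonator;
        Singleton payload; // back-reference to the singleton the stealer hides in

        private Object readResolve() {
            // Runs before Singleton.readResolve, so payload is still the fresh copy.
            impersonator = payload;
            return new String[] { "A Fool Such as I" };
        }
    }

    // Serializes INSTANCE with a stealer planted in its field, then reads it back.
    static Object roundTrip() throws IOException, ClassNotFoundException {
        Object original = Singleton.INSTANCE.favoriteSongs;
        Stealer stealer = new Stealer();
        stealer.payload = Singleton.INSTANCE;
        Singleton.INSTANCE.favoriteSongs = stealer;

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(Singleton.INSTANCE);
        } finally {
            Singleton.INSTANCE.favoriteSongs = original; // undo the planting
        }

        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object result = roundTrip();
        System.out.println(result == Singleton.INSTANCE);               // true
        System.out.println(Stealer.impersonator != Singleton.INSTANCE); // true
        System.out.println(Arrays.toString(
            (String[]) Stealer.impersonator.favoriteSongs));            // [A Fool Such as I]
    }
}
```

The object returned by readObject is the real singleton, yet the static field now holds a second, distinct instance, which is exactly what the singleton's readResolve was supposed to prevent.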
ElvisStealer(new).payload = Elvis(new). Where does that happen? I see only the declaration of a payload variable, but it is not initialized in any way.
The book says
First, write a “stealer” class that has [..] an instance field that refers to the serialized singleton in which the stealer “hides.” In the serialization stream, replace the singleton’s nontransient field with an instance of the stealer.
This setup happens in the form of a crafted byte[] serializedForm. It is a fake serialized form of an Elvis object which, unlike normal serialized Elvises, contains the stealer with a back-reference to the Elvis object.
Secondly, shouldn't the static impersonator be enough?
No. Serialization doesn't handle static variables, only instance fields. The attack relies on deserialization initializing the payload field, and deserialization does that only because the serialized form contains a value for it.
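A small sketch can illustrate that static fields are not part of the serialized form. The `Holder` class, its fields and the `roundTripWithChange` helper are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class StaticFieldDemo {

    static class Holder implements Serializable {
        private static final long serialVersionUID = 0;
        static String shared = "before"; // static: not written to the stream
        String own;                      // instance field: written to the stream
    }

    // Writes a Holder, changes the static field, then reads the Holder back.
    static Holder roundTripWithChange() throws IOException, ClassNotFoundException {
        Holder.shared = "before";
        Holder h = new Holder();
        h.own = "written";

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(h);
        }

        Holder.shared = "after"; // changed after the bytes were produced

        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Holder) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Holder copy = roundTripWithChange();
        System.out.println(copy.own);      // written
        System.out.println(Holder.shared); // after
    }
}
```

Deserialization restores the instance field from the stream but leaves the static field alone, which is why the stealer needs the instance field payload to receive the reference before copying it into the static impersonator.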