java, fileinputstream, objectinputstream, objectoutputstream

Why is a deserialized object's size different from the serialized object's size?


I'm saving some Java objects in files. I serialize them this way:

Class Timestamp.java:

import java.text.SimpleDateFormat;
import java.util.Locale;

public class Timestamp implements java.io.Serializable {
    private static final long serialVersionUID = 1L;
    private static SimpleDateFormat staticFormat = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.ENGLISH);
    private SimpleDateFormat instanceFormat = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.ENGLISH);
    private long time;

    public Timestamp(final long time) {
        this.time = time;
    }
}

Class ObjId.java:

public class ObjId implements java.io.Serializable {
    private final int x;
    private final int y;
    private final int z;

    public ObjId(final int x, final int y, final int z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }
}

Class CachedObj.java:

class CachedObj implements java.io.Serializable {
        private static final long serialVersionUID = 1L;
        private final ObjId id;
        private final byte[] data;
        private final String eTag;
        private Timestamp modified;
        private Timestamp expires;
        private boolean mustRevalidate;

        public CachedObj(final ObjId ID, final byte[] data, final String eTag, final Timestamp modified, final Timestamp expires, final boolean mustRevalidate) {
            this.id = ID;
            this.data = data;
            this.eTag = eTag;
            this.modified = modified;
            this.expires = expires;
            this.mustRevalidate = mustRevalidate;
        }
}

The rest of my code:

    public static void saveObjToFlash(CachedObj obj, String fileName) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try {
            // Serialize into memory first, then copy the bytes to the file.
            ObjectOutputStream out = new ObjectOutputStream(bos);
            out.writeObject(obj);
            out.flush();
            out.close();
            try (OutputStream outputStream = new FileOutputStream(fileName)) {
                bos.writeTo(outputStream);
            } catch (Exception e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

Elsewhere in my code, I deserialize this way:

public static CachedObj getCachedObjFromFile(String filename) {
    try {
        File file = new File(filename);
        FileInputStream fileInput = new FileInputStream(file);
        ObjectInputStream objInput = new ObjectInputStream(fileInput);
        CachedObj obj = (CachedObj) objInput.readObject();
        objInput.close(); // also closes the underlying FileInputStream
        return obj;
    } catch (IOException ex) {
        ex.printStackTrace();
    } catch (java.lang.ClassNotFoundException exx) {
        exx.printStackTrace();
    }
    return null;
}

I also have a function that calculates the serialized size of a CachedObj object:

public static int getObjectSize(CachedObj obj) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try {
        ObjectOutputStream out = new ObjectOutputStream(bos);
        out.writeObject(obj);
        out.flush();
        out.close();
        // Number of bytes in the serialized form of obj.
        return bos.size();
    } catch (IOException e) {
        e.printStackTrace();
        return -1;
    }
}

My question concerns what happens when I run the following code:

    public static void main(String[] args) {
        byte[] bytesArr = new byte[4];
        CachedObj obj1 = new CachedObj(new ObjId(100, 100, 100), bytesArr, "etag100",
                new Timestamp(1659523700),
                new Timestamp(1659523700), false);
        int size1 = getObjectSize(obj1);
        saveObjToFlash(obj1, "file");
        CachedObj obj2 = getCachedObjFromFile("file"); // obj1 and obj2 appear to have exactly the same field values
        int size2 = getObjectSize(obj2); // I always get: size2 = size1 - 2 !!
    }

Why are size1 and size2 different? How can I compute the size of obj2 so that I get the same value as size1?


Solution

  • This isn't a complete answer. The slight change in serialised size comes from the SimpleDateFormat field in Timestamp. Commenting out the instanceFormat field, or making it transient, makes all the sizes the same again. You can also see the difference by changing your example to serialise just a SimpleDateFormat instead of a CachedObj:

    private SimpleDateFormat instanceFormat = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.ENGLISH);
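
    To narrow this down, a minimal sketch along the following lines serialises a lone SimpleDateFormat, deserialises it, and serialises the copy again. The class FormatSizeDemo and its helper names are hypothetical, not part of the question's code. The exact byte counts, and whether they differ at all, may depend on the JDK version, so no expected output is shown:

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.text.SimpleDateFormat;
    import java.util.Locale;

    // Hypothetical demo class, not part of the question's code.
    class FormatSizeDemo {

        // Serialises obj into an in-memory buffer and returns the bytes.
        static byte[] toBytes(Serializable obj) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
                out.writeObject(obj);
            }
            return bos.toByteArray();
        }

        // Rebuilds an object from bytes produced by toBytes.
        static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return in.readObject();
            }
        }

        public static void main(String[] args) throws Exception {
            SimpleDateFormat original =
                    new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.ENGLISH);
            byte[] first = toBytes(original);
            SimpleDateFormat copy = (SimpleDateFormat) fromBytes(first);
            byte[] second = toBytes(copy);
            // Compare the two sizes on your own JDK.
            System.out.println("size before round trip: " + first.length);
            System.out.println("size after round trip:  " + second.length);
        }
    }
    ```

    If the two printed sizes differ on your JDK, the difference comes from SimpleDateFormat itself rather than from CachedObj, ObjId, or your file handling.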
    

    SimpleDateFormat isn't a good type of field to include in the serialised form of your class, as it has a relatively big memory footprint (~48KB!). It would be better to create one instance in the application code, never serialised, and re-use it within the same thread (SimpleDateFormat is not thread-safe) for all your Timestamp <-> String conversions, rather than allocating a new instance per Timestamp.

    If you are performing a significant number of conversions, re-using the same SimpleDateFormat for date-to-string formatting will reduce overall memory churn on large application servers.
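
    One way to keep the formatter out of the serialised form is sketched below. SlimTimestamp and its members are hypothetical names, the static per-thread ThreadLocal is only one possible way to share an instance, and treating time as epoch milliseconds is an assumption. With only the long field serialised, the size should stay the same across a round trip:

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.Locale;

    // Hypothetical variant of Timestamp that does not serialise its formatter.
    class SlimTimestamp implements Serializable {
        private static final long serialVersionUID = 1L;

        // Static fields are never serialised; each thread gets its own formatter.
        private static final ThreadLocal<SimpleDateFormat> FORMAT = ThreadLocal.withInitial(
                () -> new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.ENGLISH));

        private final long time; // assumed to be epoch milliseconds

        SlimTimestamp(long time) {
            this.time = time;
        }

        String format() {
            return FORMAT.get().format(new Date(time));
        }

        // Returns the number of bytes in the serialised form of obj.
        static int serializedSize(Serializable obj) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
                out.writeObject(obj);
            }
            return bos.size();
        }

        // Serialises ts in memory and reads it back as a new object.
        static SlimTimestamp roundTrip(SlimTimestamp ts) throws IOException, ClassNotFoundException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
                out.writeObject(ts);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return (SlimTimestamp) in.readObject();
            }
        }

        public static void main(String[] args) throws Exception {
            SlimTimestamp ts = new SlimTimestamp(1659523700L);
            System.out.println("before: " + serializedSize(ts));
            System.out.println("after:  " + serializedSize(roundTrip(ts)));
        }
    }
    ```

    Marking the existing instanceFormat field transient has a similar effect on the size, but the field is then null after deserialisation unless you re-create it, which the static formatter avoids.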

    By the way, your calls can be simplified with try-with-resources and adapted for use by any Serializable:

    // Requires: import java.io.*;

    public static int getObjectSize(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        }
        return bos.size();
    }

    public static void saveObjToFlash(Serializable obj, String fileName) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(fileName))) {
            out.writeObject(obj);
        }
    }

    @SuppressWarnings("unchecked") // the caller chooses T; a wrong choice fails with ClassCastException at the call site
    public static <T extends Serializable> T getCachedObjFromFile(String filename) throws IOException, ClassNotFoundException {
        try (ObjectInputStream objInput = new ObjectInputStream(new FileInputStream(filename))) {
            return (T) objInput.readObject();
        }
    }