oop, serialization, terminology, data-representation

What do you call an object whose state can be completely described by its string representation?


Is there a name in the OOP world for such objects? For example, in Java:

"Word".toString();

This returns Word, which is a string representation of an entity that currently exists in the program.

Other data types, such as Double and Integer, and perhaps even lists or other data structures, could serve as further examples.
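To make the idea concrete, here is a minimal Java sketch of the property in question: converting a value to a string and parsing it back yields an equal value. The class and method names (`RoundTripDemo`, `intRoundTrips`, `doubleRoundTrips`) are made up for illustration.

```java
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class RoundTripDemo {
    // True if the decimal string of n parses back to the same int.
    static boolean intRoundTrips(int n) {
        return Integer.parseInt(Integer.toString(n)) == n;
    }

    // True if the string form of d parses back to the same double.
    // Double.compare is used so that NaN also compares equal to itself.
    static boolean doubleRoundTrips(double d) {
        return Double.compare(Double.parseDouble(Double.toString(d)), d) == 0;
    }

    public static void main(String[] args) {
        System.out.println(intRoundTrips(42));      // true
        System.out.println(doubleRoundTrips(0.1));  // true
        // A list has a readable string form, but the JDK offers no built-in parser for it.
        System.out.println(List.of(1, 2, 3));       // [1, 2, 3]
    }
}
```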

Other, more complex objects cannot be represented this way. For example, a full-fledged RESTful service class might not have a string representation of its current state.

What's the right terminology? Native? Immutable? Neither of those two terms really reflects this definition.

To expand on the question:

Imagine you have a function/method that converts a string to a map: given a string such as {key1=value1,key2=value2}, you would get a map back. This doesn't work for some complex objects. How would you describe the parameters of this function if you were to generalize its use to other simple object types?


Solution

  • You have an abstract object that consists of internal state.

    You have one or more concrete representations of that object's state.

    In one case, the concrete representation is a chunk of memory containing primitives and references to other component objects on the heap (this is the case in Java; other languages may differ).

    You have a different representation that is amenable to being stored in a contiguous block of characters or bytes, and possibly transmitted over a network.

    Both representations are canonically equivalent given equivalent contexts containing their non-state information (methods, class hierarchy, etc.), but they serve different purposes.

    Generically, this could be called a "change of representation". When the first representation above is converted to the second it's called "serialization", and the reverse process is "deserialization". Note that you could have many different representations fulfilling different requirements and supporting different functionality.
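    As a rough illustration of such a change of representation, the sketch below converts a small object between its in-memory form and a string form and back. The `Point` class and its `"x,y"` text format are hypothetical, chosen only to keep the example short.

    ```java
    // Hypothetical example type; neither the class nor the "x,y" format comes from the question.
    final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // In-memory representation -> string representation ("serialization").
        String serialize() {
            return x + "," + y;
        }

        // String representation -> in-memory representation ("deserialization").
        static Point deserialize(String text) {
            String[] parts = text.split(",");
            if (parts.length != 2) {
                throw new IllegalArgumentException("expected \"x,y\" but got: " + text);
            }
            return new Point(Integer.parseInt(parts[0].trim()), Integer.parseInt(parts[1].trim()));
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Point)) return false;
            Point p = (Point) other;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }

        public static void main(String[] args) {
            Point original = new Point(3, 4);
            String text = original.serialize();
            Point restored = Point.deserialize(text);
            System.out.println(text);                      // 3,4
            System.out.println(original.equals(restored)); // true
        }
    }
    ```

    The two forms carry the same state; only the context (here, the `Point` class itself) is needed to interpret the string.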

    One important point to note is that in both cases, in-memory and "serialized" (and any other representations), if an object's state contains references to other objects, then the entire "state" consists of that object and all the objects that can be reached from it, and objects reachable from those objects, etc. This is known as an "object graph", and it exists equally in all representations.
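    The object-graph point can be seen with Java's built-in serialization, which writes the root object together with everything reachable from it. This is only a sketch: the `Person` and `Address` classes and the helper names `toBytes` and `fromBytes` are invented for the example.

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.io.UncheckedIOException;

    // Hypothetical demo; class and method names are illustrative only.
    class ObjectGraphDemo {
        static class Address implements Serializable {
            private static final long serialVersionUID = 1L;
            final String city;
            Address(String city) { this.city = city; }
        }

        static class Person implements Serializable {
            private static final long serialVersionUID = 1L;
            final String name;
            final Address address; // reference to another object in the graph
            Person(String name, Address address) {
                this.name = name;
                this.address = address;
            }
        }

        // Serializes the root and every object reachable from it.
        static byte[] toBytes(Serializable root) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(root);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return buffer.toByteArray();
        }

        // Rebuilds the whole graph from its byte representation.
        static Object fromBytes(byte[] bytes) {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return in.readObject();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }

        public static void main(String[] args) {
            Person original = new Person("Ada", new Address("London"));
            Person copy = (Person) fromBytes(toBytes(original));
            System.out.println(copy.name + " lives in " + copy.address.city); // Ada lives in London
            System.out.println(copy.address != original.address);             // true: the Address was copied too
        }
    }
    ```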

    Which representation you should or shouldn't use depends entirely on your processing requirements.

    "for example a full fledged RESTful service class might not have a string representation of its current state"

    This is incorrect: you can always define a serialized representation of an object's state. It may be inconvenient to do so, but if it is required, it can be done.

    "Imagine you have a function/method that converts a string to a map, a string could be {key1=value1,key2=value2} and you would get a map back, this doesn't work for some complex objects"

    Again, it can always be made to work if it is a requirement, as long as the cost of doing so is justified.
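    For the flat case from the question, a minimal sketch might look like the following. The class `FlatMapCodec` and its methods are hypothetical, and the code assumes that keys and values never contain `{`, `}`, `,` or `=`. Lifting that assumption (escaping, nesting, typed values) is where the extra cost comes in.

    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical helper; assumes keys and values contain none of '{', '}', ',' or '='.
    class FlatMapCodec {
        // Parses text of the form {key1=value1,key2=value2} into an insertion-ordered map.
        static Map<String, String> parse(String text) {
            String trimmed = text.trim();
            if (!trimmed.startsWith("{") || !trimmed.endsWith("}")) {
                throw new IllegalArgumentException("expected {k=v,...} but got: " + text);
            }
            Map<String, String> result = new LinkedHashMap<>();
            String body = trimmed.substring(1, trimmed.length() - 1).trim();
            if (body.isEmpty()) {
                return result;
            }
            for (String entry : body.split(",")) {
                int eq = entry.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("missing '=' in entry: " + entry);
                }
                result.put(entry.substring(0, eq).trim(), entry.substring(eq + 1).trim());
            }
            return result;
        }

        // Writes the map back out in the same {k=v,...} form.
        static String format(Map<String, String> map) {
            StringBuilder sb = new StringBuilder("{");
            String separator = "";
            for (Map.Entry<String, String> e : map.entrySet()) {
                sb.append(separator).append(e.getKey()).append('=').append(e.getValue());
                separator = ",";
            }
            return sb.append('}').toString();
        }

        public static void main(String[] args) {
            Map<String, String> map = parse("{key1=value1,key2=value2}");
            System.out.println(map.get("key2")); // value2
            System.out.println(format(map));     // {key1=value1,key2=value2}
        }
    }
    ```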

    In summary, everything is a representation, and you can arrange to transform one representation to any other and back again, without loss, assuming you're willing to incur the costs of doing so. As mentioned above, one factor is the cost of representing not just the single object, but the entire object graph, which can be substantial.