I have a "User" class with 40+ private variables, including complex objects such as private/public keys (QCA library), custom QObjects, etc. The idea is that the class has a function called sign() which encrypts, signs, and serializes the object, and returns a QByteArray that can then be stored in a SQLite blob.
What's the best approach to serialize a complex object? Iterating through the properties with QMetaObject? Converting it to a protobuf object?
Could it be cast to a char array?
Could it be cast to a char array?
No, because you'd be casting QObject's internals that you know nothing about, pointers that are not valid the second time you run your program, etc.
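To make the point concrete without Qt, here is a small sketch. `PlainRecord` and `Account` are hypothetical stand-ins, not types from the question. An object that owns heap memory (here through `std::string`) holds pointers in its own bytes, so its byte image is not a serialization. The standard trait `std::is_trivially_copyable` reports whether a byte-wise copy is even well defined for a type.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical stand-ins, not types from the question.
struct PlainRecord {          // only fixed-size scalars
    std::uint32_t id;
    std::uint16_t flags;
};

struct Account {              // owns heap memory through std::string
    std::string name;
    std::uint32_t id;
};

// True when copying the raw bytes of T is well defined by the language.
template <typename T>
constexpr bool rawBytesAreCopyable() {
    return std::is_trivially_copyable<T>::value;
}
```

Even for a type like `PlainRecord`, padding and byte order make the raw image fragile across compilers and platforms, which is one more reason to prefer explicit stream operators.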
TL;DR: Implementing it manually is OK for explicit data elements, and leveraging the metaobject system for QObject and Q_GADGET classes will remove some of the drudgery.
The simplest solution might be to implement QDataStream operators for the object and the types you use. Make sure to follow good practice: each class that could conceivably ever change the format of the data it holds must emit a format identifier.
For example, let's take the following classes:
class User {
    QString m_name;
    QList<CryptoKey> m_keys;
    QList<Address> m_addresses;
    QObject m_props;
    ...
    friend QDataStream & operator<<(QDataStream &, const User &);
    friend QDataStream & operator>>(QDataStream &, User &);
public:
    ...
};
Q_DECLARE_METATYPE(User) // no semi-colon!
class Address {
    QString m_line1;
    QString m_line2;
    QString m_postCode;
    ...
    friend QDataStream & operator<<(QDataStream &, const Address &);
    friend QDataStream & operator>>(QDataStream &, Address &);
public:
    ...
};
Q_DECLARE_METATYPE(Address) // no semi-colon!
The Q_DECLARE_METATYPE macro makes the classes known to the QVariant and QMetaType type system. Thus, for example, it's possible to assign an Address to a QVariant, to convert such a QVariant back to an Address, to stream the variant directly to a datastream, etc.
First, let's address how to dump the QObject properties:
#include <QtCore>
#include <algorithm> // std::remove_if
#include <cstring>   // strlen
QList<QByteArray> publicNames(QList<QByteArray> names) {
    // Drop Qt-internal dynamic properties, whose names start with "_q_".
    names.erase(std::remove_if(names.begin(), names.end(),
        [](const QByteArray & v){ return v.startsWith("_q_"); }), names.end());
    return names;
}
bool isDumpable(const QMetaProperty & prop) {
    return prop.isStored() && !prop.isConstant() && prop.isReadable() && prop.isWritable();
}
void dumpProperties(QDataStream & s, const QObject & obj)
{
    s << quint8(0); // format
    // Dynamic properties: a list of names, followed by one value per name.
    QList<QByteArray> names = publicNames(obj.dynamicPropertyNames());
    s << names;
    for (const auto & name : names) s << obj.property(name.constData());
    // Static properties: (name, value) pairs, terminated by an empty name.
    auto mObj = obj.metaObject();
    for (int i = 0; i < mObj->propertyCount(); ++i) {
        auto prop = mObj->property(i);
        if (! isDumpable(prop)) continue;
        auto name = QByteArray::fromRawData(prop.name(), strlen(prop.name()));
        if (! name.isEmpty()) s << name << prop.read(&obj);
    }
    s << QByteArray();
}
In general, if we were to deal with data from a User that didn't have the m_props member, we'd need to be able to clear the properties. This idiom will come up every time you extend the stored object and upgrade the serialization format.
void clearProperties(QObject & obj)
{
    auto names = publicNames(obj.dynamicPropertyNames());
    const QVariant null;
    // Setting a dynamic property to an invalid QVariant removes it.
    for (const auto & name : names) obj.setProperty(name.constData(), null);
    auto const mObj = obj.metaObject();
    for (int i = 0; i < mObj->propertyCount(); ++i) {
        auto prop = mObj->property(i);
        if (! isDumpable(prop)) continue;
        if (prop.isResettable()) {
            prop.reset(&obj);
            continue;
        }
        prop.write(&obj, null);
    }
}
Now we know how to restore the properties from a stream:
void loadProperties(QDataStream & s, QObject & obj)
{
    quint8 format;
    s >> format;
    // We only support one format at the moment.
    // Dynamic properties: a list of names, followed by one value per name.
    QList<QByteArray> names;
    s >> names;
    for (const auto & name : names) {
        QVariant val;
        s >> val;
        obj.setProperty(name.constData(), val);
    }
    // Static properties: (name, value) pairs, terminated by an empty name.
    auto const mObj = obj.metaObject();
    forever {
        QByteArray name;
        s >> name;
        if (name.isEmpty()) break;
        QVariant value;
        s >> value;
        int idx = mObj->indexOfProperty(name.constData());
        if (idx < 0) continue;
        auto prop = mObj->property(idx);
        if (! isDumpable(prop)) continue;
        prop.write(&obj, value);
    }
}
We can thus implement the stream operators to serialize our objects:
QDataStream & operator<<(QDataStream & s, const User & user) {
    s << quint8(1) // format
      << user.m_name << user.m_keys << user.m_addresses;
    dumpProperties(s, user.m_props);
    return s;
}
QDataStream & operator>>(QDataStream & s, User & user) {
    quint8 format;
    s >> format;
    switch (format) {
    case 0: // older format: neither addresses nor properties were stored
        s >> user.m_name >> user.m_keys;
        user.m_addresses.clear();
        clearProperties(user.m_props);
        break;
    case 1: // current format, matching operator<< above
        s >> user.m_name >> user.m_keys >> user.m_addresses;
        loadProperties(s, user.m_props);
        break;
    }
    return s;
}
QDataStream & operator<<(QDataStream & s, const Address & address) {
    s << quint8(0) // format
      << address.m_line1 << address.m_line2 << address.m_postCode;
    return s;
}
QDataStream & operator>>(QDataStream & s, Address & address) {
    quint8 format;
    s >> format;
    switch (format) {
    case 0:
        s >> address.m_line1 >> address.m_line2 >> address.m_postCode;
        break;
    }
    return s;
}
The property system will also work for any other class, as long as you declare its properties and add the Q_GADGET macro (instead of Q_OBJECT). This is supported from Qt 5.5 onwards.
Suppose that we declared our Address class as follows:
class Address {
    Q_GADGET
    Q_PROPERTY(QString line1 MEMBER m_line1)
    Q_PROPERTY(QString line2 MEMBER m_line2)
    Q_PROPERTY(QString postCode MEMBER m_postCode)
    QString m_line1;
    QString m_line2;
    QString m_postCode;
    ...
    friend QDataStream & operator<<(QDataStream &, const Address &);
    friend QDataStream & operator>>(QDataStream &, Address &);
public:
    ...
};
Let's then declare the datastream operators in terms of [dump|clear|load]Properties modified for dealing with gadgets:
QDataStream & operator<<(QDataStream & s, const Address & address) {
    s << quint8(0); // format
    dumpProperties(s, &address);
    return s;
}
QDataStream & operator>>(QDataStream & s, Address & address) {
    quint8 format;
    s >> format;
    loadProperties(s, &address);
    return s;
}
We do not need to change the format designator even if the property set has been changed. We should retain the format designator in case we had other changes that couldn't be expressed as a simple property dump anymore. This is unlikely in most cases, but one must remember that a decision not to use a format specifier immediately sets the format of the streamed data in stone. It's not subsequently possible to change it!
Finally, the property handlers are slightly cut-down and modified variants of the ones used for the QObject properties:
// The non-template overload is declared first so that the template below can find it.
void dumpProperties(QDataStream & s, const QMetaObject & mObj, const void * gadget)
{
    s << quint8(0); // format
    for (int i = 0; i < mObj.propertyCount(); ++i) {
        auto prop = mObj.property(i);
        if (! isDumpable(prop)) continue;
        auto name = QByteArray::fromRawData(prop.name(), strlen(prop.name()));
        if (! name.isEmpty()) s << name << prop.readOnGadget(gadget);
    }
    s << QByteArray();
}
template <typename T> void dumpProperties(QDataStream & s, const T * gadget) {
    dumpProperties(s, T::staticMetaObject, gadget);
}
void clearProperties(const QMetaObject & mObj, void * gadget)
{
    const QVariant null;
    for (int i = 0; i < mObj.propertyCount(); ++i) {
        auto prop = mObj.property(i);
        if (! isDumpable(prop)) continue;
        if (prop.isResettable()) {
            prop.resetOnGadget(gadget);
            continue;
        }
        prop.writeOnGadget(gadget, null);
    }
}
template <typename T> void clearProperties(T * gadget) {
    clearProperties(T::staticMetaObject, gadget);
}
void loadProperties(QDataStream & s, const QMetaObject & mObj, void * gadget)
{
    quint8 format;
    s >> format;
    forever {
        QByteArray name;
        s >> name;
        if (name.isEmpty()) break;
        QVariant value;
        s >> value;
        auto index = mObj.indexOfProperty(name.constData());
        if (index < 0) continue;
        auto prop = mObj.property(index);
        if (! isDumpable(prop)) continue;
        prop.writeOnGadget(gadget, value);
    }
}
template <typename T> void loadProperties(QDataStream & s, T * gadget) {
    loadProperties(s, T::staticMetaObject, gadget);
}
TODO: An issue that was not addressed in the loadProperties implementations is clearing the properties that are present in the object but not present in the serialization.
It is very important to establish how the entire data stream is versioned when it comes to the internal version of QDataStream formats. The documentation is required reading.
One also has to decide how compatibility between versions of the software is handled. There are several approaches:
(Most typical and unfortunate) No compatibility: No format information is stored. New members are added to the serialization in an ad-hoc fashion. Older versions of the software will exhibit undefined behavior when faced with newer data. Newer versions will do the same with older data.
Backward compatibility: Format information is stored in the serialization of each custom type. New versions can properly deal with older versions of the data. Older versions must detect an unhandled format, abort deserialization, and indicate an error to the user. Ignoring newer formats leads to undefined behavior.
Full backward-and-forward compatibility: Each serialized custom type is stored in a QByteArray or a similar container. By doing this, you have information on how long the data record for the entire type is. The QDataStream version must be fixed. To read a custom type, its byte array is read first, then a QBuffer is set up, and a QDataStream reads from that buffer. You read the elements you can parse in the formats you know of, and ignore the rest of the data. This forces an incremental approach to formats, where a newer format can only append elements to an existing format. If a newer format abandons some data element from an older format, it must still dump it, but with a null or otherwise safe default value that keeps the older versions of your code "happy".
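The length-prefixed record idea can be sketched without Qt. This is a minimal illustration under stated assumptions, not the answer's implementation: `Bytes`, `appendRecord`, `OldPoint`, and `readRecord` are hypothetical names, a plain byte vector stands in for the QByteArray/QBuffer pair, and the size field is a single octet to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append one record: a one-octet payload length followed by the payload.
// (A real format would want a wider or variable-length size field.)
void appendRecord(Bytes & stream, const Bytes & payload) {
    assert(payload.size() <= 0xFF);
    stream.push_back(static_cast<std::uint8_t>(payload.size()));
    stream.insert(stream.end(), payload.begin(), payload.end());
}

// What an "old" reader understands: the first two octets of a payload.
struct OldPoint { std::uint8_t x = 0; std::uint8_t y = 0; };

// Read one record starting at pos. Parses the fields this reader knows,
// ignores trailing octets appended by a newer format, and advances pos
// to the next record. Returns false if the stream is exhausted or truncated.
bool readRecord(const Bytes & stream, std::size_t & pos, OldPoint & out) {
    out = OldPoint{};
    if (pos >= stream.size()) return false;
    const std::size_t len = stream[pos];
    const std::size_t begin = pos + 1;
    if (begin + len > stream.size()) return false;
    if (len >= 1) out.x = stream[begin];
    if (len >= 2) out.y = stream[begin + 1];
    pos = begin + len; // skip whatever a newer writer appended
    return true;
}
```

Because the reader always jumps to `begin + len`, a record with extra trailing fields from a newer writer does not desynchronize the records that follow it.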
If you think that the format bytes may ever run out, you can employ a variable-length encoding scheme, known as extension or extended octets, familiar across various ITU standards (e.g. Q.931 4.5.5 Bearer Capability information element). The idea is as follows: the highest bit of an octet (byte) is used to indicate whether the value needs more octets for representation. This makes the byte have 7 bits to represent the value, and 1 bit to mark extension. If the bit is set, you read the subsequent octets and concatenate them in little-endian fashion to the existing value. Here is how you might do this:
class VarLengthInt {
public:
    quint64 val;
    // Values are limited to 56 bits, i.e. at most 8 octets of 7 bits each.
    VarLengthInt(quint64 v = 0) : val(v) { Q_ASSERT(v < (1ULL<<(7*8))); }
    operator quint64() const { return val; }
};
QDataStream & operator<<(QDataStream & s, VarLengthInt v) {
    while (v.val > 127) {
        s << (quint8)((v.val & 0x7F) | 0x80);
        v.val = v.val >> 7;
    }
    Q_ASSERT(v.val <= 127);
    s << (quint8)v.val;
    return s;
}
QDataStream & operator>>(QDataStream & s, VarLengthInt & v) {
    v.val = 0;
    // The least significant 7-bit group comes first, so each group is
    // shifted left by its position before being merged into the value.
    for (int shift = 0; shift < 7*8; shift += 7) {
        quint8 octet;
        s >> octet;
        v.val |= quint64(octet & 0x7F) << shift;
        if (! (octet & 0x80)) break;
    }
    return s;
}
The serialization of VarLengthInt has variable length and always uses the minimum number of bytes possible for a given value: 1 byte up to 0x7F, 2 bytes up to 0x3FFF, 3 bytes up to 0x1F'FFFF, 4 bytes up to 0x0FFF'FFFF, etc. Apostrophes are valid in C++14 integer literals.
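The size thresholds can be checked with a Qt-free sketch of the same encoding. `encodeVarLength` and `decodeVarLength` are hypothetical helper names that operate on a byte vector instead of a QDataStream. The sketch assumes the octet order described above: least significant 7-bit group first.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode v with 7 value bits per octet, least significant group first.
// The top bit of an octet is set when more octets follow.
std::vector<std::uint8_t> encodeVarLength(std::uint64_t v) {
    std::vector<std::uint8_t> out;
    while (v > 0x7F) {
        out.push_back(static_cast<std::uint8_t>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(v));
    return out;
}

// Decode octets produced by encodeVarLength. Each 7-bit group is shifted
// into place by its position, mirroring the order used by the encoder.
std::uint64_t decodeVarLength(const std::vector<std::uint8_t> & in) {
    std::uint64_t v = 0;
    int shift = 0;
    for (std::uint8_t octet : in) {
        v |= static_cast<std::uint64_t>(octet & 0x7F) << shift;
        if (!(octet & 0x80)) break; // no extension bit: last octet
        shift += 7;
        if (shift >= 64) break;     // malformed input: too many octets
    }
    return v;
}
```

For instance, 300 encodes as the two octets 0xAC 0x02: the low seven bits (0x2C) with the extension bit set, followed by the remaining bits (0x02).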
It would be used as follows:
QDataStream & operator<<(QDataStream & s, const User & user) {
    s << VarLengthInt(1) // format
      << user.m_name << user.m_keys << user.m_addresses;
    dumpProperties(s, user.m_props);
    return s;
}
QDataStream & operator>>(QDataStream & s, User & user) {
    VarLengthInt format;
    s >> format;
    ...
    return s;
}