As the title states, I have a protobuf message with another message nested inside it, like this:
syntax = "proto3";
message Message
{
    message SubMessage {
        int32 number = 1;
    }
    SubMessage subMessage = 1;
}
My example.json is empty (which means default values everywhere):
{
}
In my Python script I parse this message with:
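import google.protobuf.json_format

# "example" is the module generated from the .proto file above
with open("example.json", "r") as f:
    example_json = f.read()
example_message = example.Message()
google.protobuf.json_format.Parse(example_json, example_message)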
and when I check the value of example_message.subMessage.number it is 0, which is correct.
Now I want to convert it into a dict where all values are present - even the default values.
For the conversion I use the function google.protobuf.json_format.MessageToDict().
But as you may know, MessageToDict() doesn't serialize default values unless I tell it to (as in this question: Protobuf doesn't serialize default values).
So I added the argument including_default_value_fields=True to the call of MessageToDict():
google.protobuf.json_format.MessageToDict(example_message, including_default_value_fields=True)
which returns:
{}
instead of what I expected:
{'subMessage': {'number': 0}}
A comment in the code of protobuf (found here: https://github.com/protocolbuffers/protobuf/blob/master/python/google/protobuf/json_format.py) confirms this behaviour:
including_default_value_fields: If True, singular primitive fields, repeated fields, and map fields will always be serialized. If False, only serialize non-empty fields. Singular message fields and oneof fields are not affected by this option.
So what can I do to get a dict with all values even when they are default values inside nested messages?
Interestingly, when my example.json looks like this:
{
    "subMessage" : {
        "number" : 0
    }
}
I get the expected output.
But I cannot make sure that the example.json will have all values written out, so this is not an option.
Based on the answer to "Looping over Protocol Buffers attributes in Python", I created a custom MessageToDict function:
def MessageToDict(message):
    message_dict = {}
    for descriptor in message.DESCRIPTOR.fields:
        key = descriptor.name
        value = getattr(message, descriptor.name)
        if descriptor.label == descriptor.LABEL_REPEATED:
            # Repeated field: convert each element, recursing into message elements
            message_list = []
            for sub_message in value:
                if descriptor.type == descriptor.TYPE_MESSAGE:
                    message_list.append(MessageToDict(sub_message))
                else:
                    message_list.append(sub_message)
            message_dict[key] = message_list
        else:
            # Singular field: recurse into nested messages so their defaults are included
            if descriptor.type == descriptor.TYPE_MESSAGE:
                message_dict[key] = MessageToDict(value)
            else:
                message_dict[key] = value
    return message_dict
Given the message read from the empty example.json, this function returns:
{'subMessage': {'number': 0}}
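The function above only reads DESCRIPTOR.fields and each field's name, label and type, so the recursion can be checked on repeated fields as well without a generated module. The following is a minimal sketch under that assumption: fake_field and fake_message are hypothetical stand-ins built with SimpleNamespace, not protobuf API, and message_to_dict is just a compact rewrite of the same logic. The numeric constants are meant to mirror the FieldDescriptor values.

```python
from types import SimpleNamespace

# These values mirror the FieldDescriptor constants in google.protobuf.descriptor.
LABEL_OPTIONAL, LABEL_REPEATED = 1, 3
TYPE_INT32, TYPE_MESSAGE = 5, 11


def fake_field(name, type_, label=LABEL_OPTIONAL):
    """Hypothetical stand-in for a field descriptor (not protobuf API)."""
    return SimpleNamespace(
        name=name, type=type_, label=label,
        LABEL_REPEATED=LABEL_REPEATED, TYPE_MESSAGE=TYPE_MESSAGE,
    )


def fake_message(fields, **values):
    """Hypothetical stand-in for a message: a DESCRIPTOR plus attribute values."""
    return SimpleNamespace(DESCRIPTOR=SimpleNamespace(fields=fields), **values)


def message_to_dict(message):
    """Same recursion as the custom MessageToDict, written compactly."""
    out = {}
    for field in message.DESCRIPTOR.fields:
        value = getattr(message, field.name)
        is_message = field.type == field.TYPE_MESSAGE
        if field.label == field.LABEL_REPEATED:
            # Repeated field: convert every element, recursing into messages
            out[field.name] = [message_to_dict(v) if is_message else v for v in value]
        else:
            # Singular field: recurse into nested messages, keep scalars as-is
            out[field.name] = message_to_dict(value) if is_message else value
    return out


# A nested message holding only its default value, plus two repeated fields
sub = fake_message([fake_field("number", TYPE_INT32)], number=0)
outer = fake_message(
    [
        fake_field("subMessage", TYPE_MESSAGE),
        fake_field("numbers", TYPE_INT32, LABEL_REPEATED),
        fake_field("items", TYPE_MESSAGE, LABEL_REPEATED),
    ],
    subMessage=sub, numbers=[1, 2], items=[sub, sub],
)
converted = message_to_dict(outer)
print(converted)
# {'subMessage': {'number': 0}, 'numbers': [1, 2], 'items': [{'number': 0}, {'number': 0}]}
```

The stand-ins do not model map fields, enums or bytes, so those cases would still need to be checked against real generated classes.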