python python-3.x networkx graphml

How do I call networkx.add_node(..) with optional properties?


I'm looping through a dictionary of objects constructed from JSON, and I'm creating vertices from them using networkx. The problem is that some of the JSON objects have missing properties, and if I do this:

self.graph.add_node(valueToCheck, 
                id=self.vertexDict[valueToCheck], 
                namespace=component["namespace"], 
                tenant=component["tenant"], 
                type=component.get("type")+"Component",
                artifactFileName=component.get("artifactFileName"),
                className=component.get("className"),
                userConfig=component.get("userConfig"),
                sourceType=component.get("sourceType"),
                sinkType=component.get("sinkType"))

then I can't export my graph using nx.write_graphml(..) because some of the vertex properties have the value None (which is the expected output of component.get(..) when the property is missing).

How do I use networkx to construct vertices when some of my properties might be missing in the JSON objects?

Here's what my JSON looks like:

[{'type': 'function',
  'namespace': 'campaigns',
  'name': 'campaign-record-transformer',
  'tenant': 'osp',
  'artifactFileName': 'osp-functions-1.1-SNAPSHOT-jar-with-dependencies.jar',
  'className': 'com.overstock.dataeng.pulsar.functions.CampaignRecordTransformer',
  'inputs': ['persistent://osp/campaigns/campaign-manager'],
  'logTopic': 'persistent://osp/logging/pulsar-log-topic',
  'output': 'persistent://osp/campaigns/campaign-records'},
 {'type': 'function',
  'namespace': 'campaignsTest',
  'name': 'campaign-metadata-transformer',
  'tenant': 'osp',
  'artifactFileName': 'osp-functions-1.1-SNAPSHOT-jar-with-dependencies.jar',
  'className': 'com.overstock.dataeng.pulsar.functions.CampaignMetadataTransformer',
  'logTopic': 'persistent://osp/logging/pulsar-log-topic',
  'output': 'persistent://osp/campaigns/campaign-metadata-output'}]

Notice that the inputs property is missing from the second object. In the actual data, there are at least 8 optional properties that can be missing in different combinations, and there are hundreds of objects like this.


Solution

  • I do not have enough reputation to comment, so I am posting this as an answer even though it is not a complete one.

    Have you tried simply excluding the properties that are missing from your add_node step?

    That is, instead of providing a key/value pair where the value is None, don't provide a key/value pair at all if the key is missing.

    You can probably achieve this quite easily by loading your JSON in Python and then unpacking each component into keyword arguments:

    import json

    components = json.load(...)
    for component in components:
        # value is the node key; only keys present in the JSON become attributes
        self.graph.add_node(value, **component)
    

    See https://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists
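If some values can still come through as None (for example an explicit null in the JSON), or as lists such as inputs, it may help to filter the dictionary before unpacking it. The following is only a sketch: the helper names present_attrs and build_graph are made up for illustration, using the name field as the node key is an assumption, and joining lists into a comma-separated string is just one possible encoding chosen here because GraphML attributes are expected to be scalar values.

```python
def present_attrs(component):
    """Return only the attributes that are safe to attach to a node.

    Keys whose value is None are dropped, and lists/tuples are flattened
    into a comma-separated string (an assumption made for this sketch).
    """
    attrs = {}
    for key, value in component.items():
        if value is None:
            continue  # missing or null property: leave it off the node
        if isinstance(value, (list, tuple)):
            value = ",".join(map(str, value))
        attrs[key] = value
    return attrs


def build_graph(components):
    # Imported here so present_attrs can be used without networkx installed.
    import networkx as nx

    graph = nx.DiGraph()
    for component in components:
        # Using the "name" field as the node key is an assumption.
        graph.add_node(component["name"], **present_attrs(component))
    return graph
```

With the components list from the question, the result of build_graph(components) could then be passed to nx.write_graphml(graph, "components.graphml"), where the file name is a placeholder.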