
Load a GraphML or GEXF file from S3 into AWS Lambda


I have a graph stored in GraphML format in S3. I would like to load it into a Lambda function so I can use it with the Python library networkx. I tried reading it as instructed in the docs, but that does not work: the file is in S3 rather than on the local filesystem, so it cannot be found.

I managed to get this working for JSON (code also below), but the JSON file grows very large very quickly, so it is not an option.

import json
import boto3
import networkx as nx

client = boto3.client('s3')
s3_bucket_name = "<bucket_name>"
s3_object_key = "example.graphml"
#s3_object_key = "example.json"

def lambda_handler(event, context):
    content_object = client.get_object(Bucket=s3_bucket_name, Key=s3_object_key)
    file_content = content_object["Body"].read().decode('utf-8')
    nx.read_graphml(file_content)
    #json_content = json.loads(file_content)
    #print(json_content)

To generate a sample GraphML file, you can run the following:

import networkx as nx
G = nx.Graph()
G.add_nodes_from(["A", "B", "C", "D", "E"])
G.add_edges_from([("A","C"), ("B","D"), ("B","E"), ("C", "E"), ("A", "E"), ("E", "D")])
nx.write_graphml_lxml(G, "example.graphml")



Solution

  • I think the issue you're running into is that read_graphml expects a file path (or an open file object) as its argument, not the GraphML content as a string. Since you already have the decoded string from S3, you can pass it to nx.parse_graphml instead, which parses GraphML directly from a string.
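
To make the suggestion concrete, here is a minimal sketch of the handler from the question using nx.parse_graphml. The helper names graph_from_graphml_text and load_graph_from_s3 and the returned dictionary are my own, chosen for illustration; the bucket name and key are the placeholders from the question. I have not run this inside Lambda, so treat it as a starting point rather than a verified deployment.

```python
import networkx as nx


def graph_from_graphml_text(text):
    """Parse GraphML that is already in memory as a string (hypothetical helper)."""
    # parse_graphml takes the GraphML document itself, not a path to it.
    return nx.parse_graphml(text)


def load_graph_from_s3(client, bucket, key):
    """Fetch an object via a boto3 S3 client and parse it as GraphML (hypothetical helper)."""
    response = client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode("utf-8")
    return graph_from_graphml_text(text)


def lambda_handler(event, context):
    # Imported here only so the helpers above can be exercised without boto3 installed.
    import boto3

    client = boto3.client("s3")
    graph = load_graph_from_s3(client, "<bucket_name>", "example.graphml")
    # The return value is an assumption; return whatever your function needs.
    return {"nodes": graph.number_of_nodes(), "edges": graph.number_of_edges()}
```

If you would rather keep using read_graphml, it should also accept a file-like object, so wrapping the raw bytes in io.BytesIO is another option that avoids writing to local disk.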