I'm dipping my toes into network visualizations in Python. I have a dataframe like the following:
| user | nodes |
| -----| ---------|
| A | [0, 1, 3]|
| B | [1, 2, 4]|
| C | [0, 3] |
|... | |
Is there a way to easily plot a network graph (NetworkX?) from data that contains the list of nodes on each row? The presence of a node in a row would increase the prominence of that node on the graph (or the prominence/weight of the edge in the relationship between two nodes).
I assume some transformation would be required to get the data into the appropriate format for NetworkX (or similar) to be able to create the graph relationships.
Thanks!
Since each row already holds a Python list, pandas would not make this any more efficient; plain Python is enough.
You could use `itertools` to enumerate the edges and `collections.Counter` to count them, then build the graph and plot each edge with a width based on its weight:
from itertools import combinations, chain
from collections import Counter
import networkx as nx
# count how often each (sorted) pair of nodes appears together in a row
c = Counter(chain.from_iterable(combinations(sorted(l), 2) for l in df['nodes']))
G = nx.Graph()
G.add_weighted_edges_from((*e, w) for e, w in c.items())
pos = nx.spring_layout(G)
# draw nodes, labels and default-width edges
nx.draw_networkx(G, pos)
# redraw each edge with a width equal to its weight
for *e, w in G.edges(data='weight'):
    nx.draw_networkx_edges(G, pos, edgelist=[e], width=w)
Output:
Used input:
import pandas as pd
df = pd.DataFrame({'user': ['A', 'B', 'C'],
                   'nodes': [[0, 1, 3], [1, 2, 4], [0, 3]],
                   })
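The question also asks about making a node more prominent the more rows it appears in. Here is a minimal sketch of one way to do that, on the same sample rows as above. The helper names (`edge_weights`, `node_counts`, `draw`) and the `base_size` scaling factor are my own, chosen for illustration. I am also assuming a node repeated inside a single row should count once, hence the `set(...)`.

```python
from collections import Counter
from itertools import chain, combinations

# Same sample rows as the input above, as plain lists.
rows = [[0, 1, 3], [1, 2, 4], [0, 3]]

def edge_weights(rows):
    """Count how many rows contain each unordered pair of nodes."""
    # Assumption: duplicates within a row are ignored (set), which avoids self-loops.
    return Counter(chain.from_iterable(combinations(sorted(set(r)), 2) for r in rows))

def node_counts(rows):
    """Count how many rows each node appears in."""
    return Counter(chain.from_iterable(set(r) for r in rows))

def draw(rows, base_size=300):
    """Plot the graph with edge width and node size scaled by the counts.

    base_size is an arbitrary scaling factor picked for this sketch.
    """
    import matplotlib.pyplot as plt
    import networkx as nx

    counts = node_counts(rows)
    G = nx.Graph()
    G.add_nodes_from(counts)  # keeps nodes from single-element rows too
    G.add_weighted_edges_from((u, v, w) for (u, v), w in edge_weights(rows).items())
    pos = nx.spring_layout(G, seed=0)  # fixed seed for a reproducible layout
    nx.draw_networkx(
        G, pos,
        node_size=[base_size * counts[n] for n in G.nodes()],
        width=[w for _, _, w in G.edges(data='weight')],
    )
    plt.show()

weights = edge_weights(rows)
counts = node_counts(rows)
```

With a dataframe you would call `draw(df['nodes'])`. Passing the widths as a list to a single `draw_networkx` call also avoids redrawing the edges one by one.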