I am trying to create a network graph on correlation data and would like to color the nodes based on categories.
Data:
import pandas as pd
links_data = pd.read_csv("https://raw.githubusercontent.com/johnsnow09/network_graph/refs/heads/main/links_filtered.csv")
graph code:
import networkx as nx
G = nx.from_pandas_edgelist(links_data, 'var1', 'var2')
# Plot the network:
nx.draw(G, with_labels=True, node_color='orange', node_size=200, edge_color='black', linewidths=.5, font_size=2.5)
All the nodes in this network graph are colored orange, but I would like to color them based on the Category
variable. I have looked for more examples but am not sure how to do it.
I am also open to using other python libraries if required.
Appreciate any help here !!
Since each var1 value maps to exactly one Category, you can build a list of node colors, aligned with the order of G.nodes, using:
import matplotlib as mpl
cmap = mpl.colormaps['Set3'].colors # this has 12 colors for 11 categories
cat_colors = dict(zip(links_data['Category'].unique(), cmap))
colors = (links_data
          .drop_duplicates('var1').set_index('var1')['Category']
          .map(cat_colors)
          .reindex(G.nodes)
         )
nx.draw(G, with_labels=True, node_color=colors, node_size=200,
        edge_color='black', linewidths=.5, font_size=2.5)
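One thing to watch for: a node that appears only in var2 has no entry in the var1-to-Category lookup, so the reindex step would give it a missing color. The sketch below shows one possible way to handle that with a fallback color. It uses a small made-up table (`toy_links`) in place of the real CSV, and the names `G_toy`, `node_category`, `fallback` and `node_colors` are illustrative only, not part of the original answer.

```python
import matplotlib as mpl
import networkx as nx
import pandas as pd

# Hypothetical stand-in for links_data (same column names as in the question).
toy_links = pd.DataFrame({
    "var1": ["a", "a", "b", "c"],
    "var2": ["b", "c", "d", "d"],
    "Category": ["X", "X", "Y", "Z"],
})

G_toy = nx.from_pandas_edgelist(toy_links, "var1", "var2")

# One color per category, taken from a qualitative colormap.
palette = mpl.colormaps["Set3"].colors
cat_colors = dict(zip(toy_links["Category"].unique(), palette))

# Node -> category lookup, built from var1 only (as in the answer above).
node_category = toy_links.drop_duplicates("var1").set_index("var1")["Category"]

# Assumption: nodes seen only in var2 (here "d") have no category,
# so they get a neutral gray instead of a missing value.
fallback = (0.8, 0.8, 0.8)
node_colors = [cat_colors.get(node_category.get(n), fallback) for n in G_toy.nodes]
```

The resulting `node_colors` list is in the same order as `G_toy.nodes`, so it can be passed directly as `node_color=node_colors` to `nx.draw`.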
If you also want a legend:
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
plt.legend(handles=[mpatches.Patch(color=c, label=label)
                    for label, c in cat_colors.items()],
           bbox_to_anchor=(1, 1))
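Because `bbox_to_anchor=(1, 1)` places the legend outside the axes, it can get cut off when the figure is saved. A minimal sketch of one way around this is shown below, assuming you save the figure rather than only display it. The `demo_colors` mapping is a made-up stand-in for `cat_colors`, and the Agg backend line is only there so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt

# Hypothetical category -> color mapping standing in for cat_colors.
demo_colors = {"X": "tab:blue", "Y": "tab:orange", "Z": "tab:green"}

fig, ax = plt.subplots(figsize=(6, 4))
ax.set_axis_off()  # the network itself would be drawn here, e.g. nx.draw(G, ax=ax, ...)

# One legend entry per category, anchored just outside the top-right corner.
handles = [mpatches.Patch(color=c, label=label) for label, c in demo_colors.items()]
legend = ax.legend(handles=handles, title="Category",
                   loc="upper left", bbox_to_anchor=(1, 1))

# bbox_inches="tight" grows the saved area so the outside legend is included.
buf = io.BytesIO()
fig.savefig(buf, format="png", bbox_inches="tight", dpi=150)
```

Writing to an in-memory buffer keeps the sketch side-effect free; passing a file name to `fig.savefig` works the same way.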