python, igraph, hierarchical-clustering

Using igraph in Python for community detection and writing the community number for each node to CSV


I have a network that I would like to analyze using the edge_betweenness community detection algorithm in igraph. I'm familiar with NetworkX, but I am trying to learn igraph because it offers community detection methods that NetworkX lacks.

My ultimate goal is to run edge_betweenness community detection and find the optimal number of communities and write a CSV with community membership for each node in the graph.

Below is my code as it currently stands. Any help figuring out community membership is greatly appreciated.

input data ('network.txt'):

1 2
2 3
2 7
3 1
4 2
4 6
5 4
5 6
7 4
7 8
8 9
9 7
10 7
10 8
10 9

igraph code:

import igraph

# load data into a graph
g = igraph.Graph.Read_Ncol('network.txt')

# plot graph
igraph.plot(g)

# identify communities
communities = g.community_edge_betweenness()

# not really sure what to do next
num_communities = communities.optimal_count
communities.as_clustering(num_communities)

What do I need to do to find the optimal number of communities and write the community that each node belongs to into a list?


Solution

  • You are on the right track. The optimal number of communities (where "optimal" means "the number of communities that maximizes the modularity score") is available as communities.optimal_count, and the community structure can be converted into a flat, disjoint clustering with communities.as_clustering(num_communities). The argument can be omitted when it equals communities.optimal_count, since that is the default. This gives you a VertexClustering object whose membership property holds the cluster index of each vertex in the graph.

    For the sake of clarity, I'm renaming your communities variable to dendrogram, because the edge betweenness community detection algorithm actually produces a dendrogram:

    # calculate dendrogram (graph is the igraph.Graph object, called g in the question)
    dendrogram = graph.community_edge_betweenness()
    # convert it into a flat clustering
    clusters = dendrogram.as_clustering()
    # get the membership vector
    membership = clusters.membership
    

    Now we can write the membership vector, along with the node names, to a CSV file:

    import csv

    # newline="" lets the csv module control line endings
    with open("output.csv", "w", newline="") as f:
        writer = csv.writer(f)
        # pair each vertex name with its cluster index
        for name, cluster in zip(graph.vs["name"], membership):
            writer.writerow([name, cluster])

    This is written for Python 3. On Python 2, use itertools.izip instead of zip and open the file in "wb" mode.
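
To make "optimal" in the answer a bit more concrete, here is a rough pure-Python sketch of the idea behind optimal_count: score a few candidate partitions by modularity, keep the best one, and write its membership to CSV. This is not igraph's implementation. The three candidate partitions are hand-picked for illustration rather than taken from a real dendrogram, the edge list from the question is treated as undirected and unweighted, and the names modularity, write_membership and candidates are hypothetical helpers.

```python
import csv
import io
from collections import defaultdict

# Edge list from the question's network.txt (assumed undirected here).
EDGES = [(1, 2), (2, 3), (2, 7), (3, 1), (4, 2), (4, 6), (5, 4), (5, 6),
         (7, 4), (7, 8), (8, 9), (9, 7), (10, 7), (10, 8), (10, 9)]


def modularity(edges, membership):
    """Modularity of a partition of an undirected, unweighted graph.

    membership maps each node to a community id.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in community c
    degree_sum = defaultdict(int)  # summed degree of the nodes in community c
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def write_membership(fileobj, membership):
    """Write one 'node,community' row per node, preceded by a header row."""
    writer = csv.writer(fileobj)
    writer.writerow(["node", "community"])
    for node in sorted(membership):
        writer.writerow([node, membership[node]])


# Hypothetical candidate partitions into 1, 2 and 3 communities.
candidates = {
    1: {n: 0 for n in range(1, 11)},
    2: {n: (0 if n <= 6 else 1) for n in range(1, 11)},
    3: {n: (0 if n <= 3 else 1 if n <= 6 else 2) for n in range(1, 11)},
}

# Score every candidate and keep the community count with the highest modularity.
scores = {k: modularity(EDGES, part) for k, part in candidates.items()}
best_k = max(scores, key=scores.get)

# Write to an in-memory buffer; swap in open("output.csv", "w", newline="")
# to produce a real file.
buffer = io.StringIO()
write_membership(buffer, candidates[best_k])
print(best_k, scores[best_k])
```

With igraph, the dendrogram supplies the candidate partitions (one per cut level) and as_clustering() picks the cut for you, so in practice you only need the membership vector and the CSV-writing loop.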