I want to implement a popularity feature for ranking multiple sentences in my project. How can I build a directed graph in which each node represents a sentence and an edge exists between two nodes if the cosine similarity between their sentences exceeds a threshold value?
Below is a piece of code that plots a graph with n nodes, where n is the number of strings in a list. The edges are given as tuples (i,j), where i and j are node numbers corresponding to indices in the string list. In this example, (0,2) would be an edge from 'Some' to 'Strings'.
Since you are looking to connect nodes whose similarity exceeds some threshold, your edge list would look something like: [(x,y) for x in range(len(words)) for y in range(len(words)) if x != y and similarity(words[x],words[y]) > threshold]
where similarity() is a function defined by you to check the similarity, and the x != y test keeps a node from being linked to itself.
from igraph import Graph, plot
words = ['Some', 'Random', 'Strings', 'Okay']  # Whatever your strings would be
n_nodes = len(words)  # One node per string
g = Graph(directed=True)
g.add_vertices(n_nodes)  # Add the nodes
edges = [(n, n + 1) for n in range(n_nodes - 1)]  # Connects each node to the next; replace this with your own (i, j) tuples
g.add_edges(edges)  # Add the edges
layout = g.layout('kk')  # Compute the Kamada-Kawai layout once the graph is populated
plot(g, layout=layout, bbox=(500, 500), margin=30, vertex_label=words)
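If you don't have a similarity function yet, here is a minimal sketch of one way to build the thresholded edge list. It assumes a plain bag-of-words cosine similarity on whitespace-split, lowercased tokens, which may be too crude for your data (TF-IDF or embeddings are common alternatives). The names similarity and build_edges, the sample sentences, and the 0.5 threshold are all made up for illustration.

```python
import math
from collections import Counter


def similarity(a, b):
    """Cosine similarity between two sentences using raw word counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(count * vb[word] for word, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an empty sentence as similar to nothing
    return dot / (norm_a * norm_b)


def build_edges(sentences, threshold):
    """Return (i, j) index pairs whose similarity exceeds the threshold."""
    n = len(sentences)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and similarity(sentences[i], sentences[j]) > threshold]


# Illustrative input only
sentences = ['the cat sat on the mat',
             'the cat sat on the rug',
             'stock prices fell sharply today']
edges = build_edges(sentences, threshold=0.5)
```

The resulting list can be passed to g.add_edges(edges) in place of the placeholder edges, with sentences used as the vertex labels. Note that cosine similarity is symmetric, so every qualifying pair shows up in both directions; if you want a single edge per pair, restrict the comprehension to i < j.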
Good luck!