I'm new to using the NetworkX library with Python.
Let's say that I import a Pajek-formatted file:
import networkx as nx
G=nx.read_pajek("pajek_network_file.net")
G=nx.Graph(G)
The contents of my file are as follows (in Pajek, nodes are called "Vertices"):
*Network
*Vertices 6
123 Author1
456 Author2
789 Author3
111 Author4
222 Author5
333 Author6
*Edges
123 333
333 789
789 222
222 111
111 456
Now I want to calculate the shortest path lengths between all pairs of nodes in my network, and I'm using this function, per the library documentation:
path = nx.all_pairs_shortest_path_length(G)
Returns: lengths – Dictionary of shortest path lengths keyed by source and target.
The return I'm getting:
print path
{u'Author4': {u'Author4': 0, u'Author5': 1, u'Author6': 3, u'Author1': 4, u'Author2': 1, u'Author3': 2}, u'Author5': {u'Author4': 1, u'Author5': 0, u'Author6': 2, u'Author1': 3, u'Author2': 2, u'Author3': 1}, u'Author6': {u'Author4': 3, u'Author5': 2, u'Author6': 0, u'Author1': 1, u'Author2': 4, u'Author3': 1}, u'Author1': {u'Author4': 4, u'Author5': 3, u'Author6': 1, u'Author1': 0, u'Author2': 5, u'Author3': 2}, u'Author2': {u'Author4': 1, u'Author5': 2, u'Author6': 4, u'Author1': 5, u'Author2': 0, u'Author3': 3}, u'Author3': {u'Author4': 2, u'Author5': 1, u'Author6': 1, u'Author1': 2, u'Author2': 3, u'Author3': 0}}
As you can see, it's really hard to read and awkward to reuse later.
Ideally, what I'd like is a return with a format similar to the below:
source_node_id, target_node_id, path_length
123, 456, 5
123, 789, 2
123, 111, 4
In short, I need output that uses (or at least includes) the node IDs instead of only the node labels, with every possible pair on its own line alongside its shortest path length.
Is this possible in NetworkX?
Function Reference: https://networkx.github.io/documentation/latest/reference/generated/networkx.algorithms.shortest_paths.unweighted.all_pairs_shortest_path_length.html
How about something like this?
import networkx as nx
G=nx.read_pajek("pajek_network_file.net")
G=nx.Graph(G)
# first get all the lengths
path_lengths = nx.all_pairs_shortest_path_length(G)
# now iterate over all pairs of nodes
for src in G.nodes():
    # look up the id as desired
    id_src = G.node[src].get('id')
    for dest in G.nodes():
        if src != dest: # ignore self-self paths
            id_dest = G.node[dest].get('id')
            l = path_lengths.get(src).get(dest)
            print "{}, {}, {}".format(id_src, id_dest, l)
This yields the following output:
111, 222, 1
111, 333, 3
111, 123, 4
111, 456, 1
111, 789, 2
...
If you need to do further processing (e.g. sorting), then store the l
values, together with the two ids, rather than just printing them.
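As a rough sketch of that "store instead of print" idea, something like the following could work. The helper name length_rows is made up for illustration, and the sample data is a hand-copied three-node subset of the output in the question rather than anything read from the file. It assumes you already have the lengths as a dict of dicts and a label-to-id mapping.

```python
def length_rows(path_lengths, node_ids):
    """Collect (source_id, target_id, length) tuples, skipping self-pairs.

    path_lengths: dict of dicts keyed by node label (assumed shape).
    node_ids:     dict mapping node label -> Pajek id (assumed shape).
    """
    rows = []
    for src, targets in path_lengths.items():
        for dest, length in targets.items():
            if src != dest:  # ignore self-self paths
                rows.append((node_ids[src], node_ids[dest], length))
    # Sort by path length first, then by source id and target id.
    rows.sort(key=lambda row: (row[2], row[0], row[1]))
    return rows


# Hand-copied subset of the question's data (Author1, Author3, Author6).
path_lengths = {
    "Author1": {"Author1": 0, "Author6": 1, "Author3": 2},
    "Author6": {"Author6": 0, "Author1": 1, "Author3": 1},
    "Author3": {"Author3": 0, "Author6": 1, "Author1": 2},
}
node_ids = {"Author1": "123", "Author6": "333", "Author3": "789"}

rows = length_rows(path_lengths, node_ids)
for row in rows:
    print("{}, {}, {}".format(*row))
```

With a real graph, the two inputs could probably be built with dict(nx.all_pairs_shortest_path_length(G)) and nx.get_node_attributes(G, 'id'). As far as I know, newer NetworkX releases return an iterator from the former and no longer provide G.node, so the dict(...) wrapper and the attribute helper are the safer route there.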
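(You could loop through the pairs more cleanly with something like itertools.combinations(G.nodes(), 2),
but the method above is a bit more explicit in case you aren't familiar with it.
Note that combinations yields each unordered pair only once, whereas the nested loops above print every pair in both directions.)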
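For reference, here is a minimal sketch of the itertools.combinations variant. The function name unique_pair_rows is hypothetical, and the sample data is again a hand-copied subset of the question's output. It assumes an undirected graph, so each pair only needs to be listed once.

```python
from itertools import combinations


def unique_pair_rows(path_lengths, node_ids):
    """Return one (source_id, target_id, length) tuple per unordered pair.

    path_lengths: dict of dicts keyed by node label (assumed shape).
    node_ids:     dict mapping node label -> Pajek id (assumed shape).
    """
    rows = []
    # sorted() only makes the pair order reproducible.
    for src, dest in combinations(sorted(node_ids), 2):
        # .get() leaves the length as None if dest is unreachable from src.
        length = path_lengths.get(src, {}).get(dest)
        rows.append((node_ids[src], node_ids[dest], length))
    return rows


# Hand-copied subset of the question's data (Author1, Author3, Author6).
path_lengths = {
    "Author1": {"Author1": 0, "Author6": 1, "Author3": 2},
    "Author6": {"Author6": 0, "Author1": 1, "Author3": 1},
    "Author3": {"Author3": 0, "Author6": 1, "Author1": 2},
}
node_ids = {"Author1": "123", "Author6": "333", "Author3": "789"}

pair_rows = unique_pair_rows(path_lengths, node_ids)
for row in pair_rows:
    print("{}, {}, {}".format(*row))
```

This gives n*(n-1)/2 rows instead of n*(n-1), which is usually what you want for an undirected network; for a directed graph you would keep the nested loops (or use itertools.permutations) instead.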