Get adjacency matrices of networkx.MultiDiGraph

I want to obtain the adjacency matrices of a networkx.MultiDiGraph. My code looks as follows:

import numpy as np
import networkx as nx
np.random.seed(123)

n_samples = 10


uv = [
    (1, 2),
    (2, 3),
    (3, 4),
    (4, 5),
    (5, 6)
]


G = nx.MultiDiGraph()

for u, v in uv:
    weights = np.random.uniform(0, 1, size=n_samples)
    G.add_edges_from([(u, v, dict(sample_id=s+1, weight=weights[s])) for s in range(n_samples)])

A = nx.to_numpy_array(G=G, nodelist=list(G.nodes))

As the docs state, the default behaviour of nx.to_numpy_array() for this type of graph is to sum the weights of the multiple edges. Therefore, the output looks as follows:

[[0.         5.44199353 0.         0.         0.         0.        ]
 [0.         0.         4.12783997 0.         0.         0.        ]
 [0.         0.         0.         5.37945594 0.         0.        ]
 [0.         0.         0.         0.         4.95418265 0.        ]
 [0.         0.         0.         0.         0.         5.18942126]
 [0.         0.         0.         0.         0.         0.        ]]

I would like to obtain 10 adjacency matrices, one for each s. My desired output should look as follows:

print(A.shape)
>> (6, 6, 10)

Please advise.

Solution

As indicated in the comments, you might want to generate individual DiGraphs (one per sample) instead of a single MultiDiGraph.
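A minimal sketch of that first suggestion, assuming you control how the graphs are built: create one DiGraph per sample and stack the per-sample adjacency matrices with np.stack. The names rng, graphs and H, and the use of np.random.default_rng, are illustrative choices and not part of the original code, so the random weights will differ from those in the question.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(123)  # assumption: any seeded generator works here
n_samples = 10
uv = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
nodes = sorted({n for edge in uv for n in edge})

# one weight per (edge, sample) pair
weights = rng.uniform(0, 1, size=(len(uv), n_samples))

graphs = []
for s in range(n_samples):
    H = nx.DiGraph()
    H.add_nodes_from(nodes)  # keep every node, even if it has no edge
    H.add_weighted_edges_from(
        (u, v, weights[i, s]) for i, (u, v) in enumerate(uv)
    )
    graphs.append(H)

# stack along the last axis to get the (n_nodes, n_nodes, n_samples) layout
A = np.stack([nx.to_numpy_array(H, nodelist=nodes) for H in graphs], axis=-1)
print(A.shape)
```

Here missing edges come out as 0 rather than NaN, because that is what nx.to_numpy_array uses for non-edges by default.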

That said, if you want to export multiple adjacency matrices based on the sample_id, you could export the edges to a pandas DataFrame with to_pandas_edgelist, reshape it with pivot_table, and split it into one array per sample with groupby:

nodes = list(G.nodes)

df = (nx.to_pandas_edgelist(G)
        .pivot_table(index=['sample_id', 'source'],
                     columns='target', values='weight')
        .reindex(columns=nodes)
     )

matrices = {k: g.droplevel(0).reindex(nodes).to_numpy()
            for k, g in df.groupby('sample_id')}

Then you'll have a dictionary of {sample_id: adjacency_matrix}:

matrices[6]

array([[     nan, 0.423106,      nan,      nan,      nan,      nan],
       [     nan,      nan, 0.737995,      nan,      nan,      nan],
       [     nan,      nan,      nan, 0.322959,      nan,      nan],
       [     nan,      nan,      nan,      nan, 0.312261,      nan],
       [     nan,      nan,      nan,      nan,      nan, 0.250455],
       [     nan,      nan,      nan,      nan,      nan,      nan]])

NB: if you want 0s instead of NaN for missing edges, add .fillna(0) before the .to_numpy() call.
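As a small illustration of that note, here is a sketch on a hypothetical 2x2 frame (the frame g and its values are made up for the example). It also shows np.nan_to_num, which does the same replacement on an array that has already been converted:

```python
import numpy as np
import pandas as pd

# hypothetical pivoted block for one sample: a single edge 1 -> 2
g = pd.DataFrame([[np.nan, 0.42], [np.nan, np.nan]],
                 index=[1, 2], columns=[1, 2])

with_nan = g.to_numpy()               # missing edges stay NaN
with_zeros = g.fillna(0).to_numpy()   # missing edges become 0

# same replacement applied after conversion, on the NumPy side
also_zeros = np.nan_to_num(with_nan, nan=0.0)
```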

Alternatively, to get a 3D numpy array directly from the DataFrame, you could complete/reindex the missing (source, target, sample_id) combinations and reshape:

# pip install pyjanitor
import janitor

nodes = list(G.nodes)
df = nx.to_pandas_edgelist(G)
samples = df['sample_id'].unique()
N = len(nodes)

A = (df.complete({'source': nodes, 'target': nodes, 'sample_id': samples})
       ['weight'].to_numpy().reshape(N, N, -1)
    )

Or:

import pandas as pd

nodes = list(G.nodes)
df = nx.to_pandas_edgelist(G)
samples = df['sample_id'].unique()
N = len(nodes)

A = (df.set_index(['source', 'target', 'sample_id'])
       .reindex(pd.MultiIndex.from_product([nodes, nodes, samples]))
       ['weight'].to_numpy().reshape(N, N, -1)
    )

Output of A[:, :, 5] (6th sample):

array([[     nan, 0.423106,      nan,      nan,      nan,      nan],
       [     nan,      nan, 0.737995,      nan,      nan,      nan],
       [     nan,      nan,      nan, 0.322959,      nan,      nan],
       [     nan,      nan,      nan,      nan, 0.312261,      nan],
       [     nan,      nan,      nan,      nan,      nan, 0.250455],
       [     nan,      nan,      nan,      nan,      nan,      nan]])

I would probably prefer a (10, 6, 6) shape, so that a sample can be accessed directly with A[i]. Note that i is the 0-based position in samples, so sample_id 6 is A[5]:

A = (df.complete({'sample_id': samples, 'source': nodes, 'target': nodes})
       ['weight'].to_numpy().reshape(-1, N, N)
    )

# or
A = (df.set_index(['sample_id', 'source', 'target'])
       .reindex(pd.MultiIndex.from_product([samples, nodes, nodes]))
       ['weight'].to_numpy().reshape(-1, N, N)
    )
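If you would rather avoid pandas altogether, the same (10, 6, 6) array could also be filled directly from the edge data. This is only a sketch under the assumption that every edge carries both a sample_id and a weight attribute, as in the question; the lookup dictionaries pos and spos are illustrative names.

```python
import numpy as np
import networkx as nx

# rebuild the graph from the question
np.random.seed(123)
n_samples = 10
uv = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

G = nx.MultiDiGraph()
for u, v in uv:
    weights = np.random.uniform(0, 1, size=n_samples)
    G.add_edges_from((u, v, dict(sample_id=s + 1, weight=weights[s]))
                     for s in range(n_samples))

nodes = list(G.nodes)
pos = {n: i for i, n in enumerate(nodes)}       # node label -> row/column
samples = sorted({d['sample_id'] for _, _, d in G.edges(data=True)})
spos = {s: k for k, s in enumerate(samples)}    # sample_id -> first axis

# NaN marks a missing edge; use np.zeros instead if you prefer 0s
A = np.full((len(samples), len(nodes), len(nodes)), np.nan)
for u, v, d in G.edges(data=True):
    A[spos[d['sample_id']], pos[u], pos[v]] = d['weight']

print(A.shape)
```

As a sanity check, np.nansum(A, axis=0) should reproduce the summed matrix that nx.to_numpy_array returns for the MultiDiGraph.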