I'm trying to use numpy.fill_diagonal() as shown in this answer to change the diagonal entries of my dataframe to be 20,000. However, when I run the example below, the diagonal entries in my dataframe stay zero. My question is "Why?"
Please don't post some answer showing how to change the diagonal entries with a for loop. I can change the diagonal entries myself. My question is why the below example does not work to change the entries. If I don't understand why the below example fails, I cannot rely on numpy.fill_diagonal() in any of my code. I'm using numpy version 2.1.0 and pandas version 2.2.2.
import numpy as np
import pandas as pd
nodes = ['A','B','C','D','E','F','G','H']
df_cap = pd.DataFrame(None, index=nodes, columns=nodes)
edges = {
('A', 'B'): 809, ('A', 'C'): 184, ('A', 'D'): 440, ('B', 'C'): 134,
('B', 'E'): 277, ('B', 'F'): 138, ('C', 'D'): 194, ('C', 'F'): 144,
('C', 'G'): 139, ('D', 'E'): 190, ('D', 'G'): 284, ('E', 'F'): 100,
('E', 'H'): 281,('F', 'H'): 922,('G', 'F'): 123,('G', 'H'): 232
}
for i in nodes:
    for j in nodes:
        if (i,j) in edges:
            df_cap.loc[i,j] = edges[(i,j)]
nodes = nodes + ["Dummy"]
df_cap = df_cap.astype(float).fillna(0)
df_cap.loc["Dummy", :] = 0
df_cap.loc[:, "Dummy"] = 0
np.fill_diagonal(df_cap.values, 20000)
df_cap.loc["A", "Dummy"] = 20000
df_cap.loc["Dummy", "H"] = 20000
df_cap
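The issue is that you initialized an object-dtype DataFrame and filled it with a loop. This created a fragmented DataFrame (i.e. its data is not stored in a single underlying NumPy array), so df.values returns a copy, not a view. np.fill_diagonal modifies the array it is given in place, so it fills that temporary copy and the DataFrame itself is left unchanged.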
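The view-versus-copy difference can be checked directly. This is a minimal sketch rather than the frames from the question: `single` and `mixed` are hypothetical names, and a mixed-dtype frame is used only because it is a case where `.values` has to build a new array.

```python
import numpy as np
import pandas as pd

# Hypothetical frames for illustration (not the data from the question).
# One homogeneous float array: stored as a single block.
single = pd.DataFrame(np.zeros((2, 2)), columns=["A", "B"])

# Columns of different dtypes: stored in separate blocks.
mixed = pd.DataFrame({"A": [1, 2], "B": [1.0, 2.0]})

# Two calls to .values share memory only when each call returns
# a view of the frame's own data instead of a freshly built array.
single_is_view = np.shares_memory(single.values, single.values)
mixed_is_view = np.shares_memory(mixed.values, mixed.values)
print(single_is_view, mixed_is_view)

# Filling the array returned for the mixed frame only changes that copy.
scratch = mixed.values
np.fill_diagonal(scratch, 99)
print(scratch[0, 0], mixed.loc[0, "A"])
```

If `mixed_is_view` comes out as `False`, any in-place change made through `.values` on that frame is lost, which matches the behaviour described in the question.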
A simple workaround could be to make a copy before calling np.fill_diagonal:
df_cap = df_cap.copy()
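If you would rather not depend on whether `.values` happens to be a view, another option is to fill an explicit copy of the data and rebuild the frame from it. This is only a sketch under assumptions: `df` is a small hypothetical stand-in for `df_cap`, and `filled` is a name introduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame standing in for df_cap.
nodes = ["A", "B", "C"]
df = pd.DataFrame(0.0, index=nodes, columns=nodes)
df["Dummy"] = 0        # extra column added afterwards, as in the question
df.loc["Dummy"] = 0    # extra row added afterwards

# Take an explicit, writable copy of the data, fill it, then rebuild the frame.
arr = df.to_numpy(dtype=float, copy=True)
np.fill_diagonal(arr, 20000)
filled = pd.DataFrame(arr, index=df.index, columns=df.columns)

print(filled)
```

The original `df` is untouched here, so the result has to be assigned back (for example `df = filled`) if you want to keep using the same name.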
A better approach would be to avoid using a loop to create df_cap:
nodes = ['A','B','C','D','E','F','G','H']
df_cap = (pd.Series(edges).unstack(fill_value=0)
            .reindex(index=nodes, columns=nodes, fill_value=0)
         )
np.fill_diagonal(df_cap.values, 20000)
And without fill_diagonal:
df_cap = (pd.Series(edges | {(k, k): 20000 for k in nodes})
            .unstack(fill_value=0)
            .reindex(index=nodes, columns=nodes, fill_value=0)
         )
Output:
       A      B      C      D      E      F      G      H
A  20000    809    184    440      0      0      0      0
B      0  20000    134      0    277    138      0      0
C      0      0  20000    194      0    144    139      0
D      0      0      0  20000    190      0    284      0
E      0      0      0      0  20000    100      0    281
F      0      0      0      0      0  20000      0    922
G      0      0      0      0    123      0  20000    232
H      0      0      0      0      0      0      0  20000
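To confirm that the loop-free construction puts the values where you expect, a quick check along these lines may help. It is a sketch on a reduced, three-node subset of the edges from the question; `df_check`, `diag_ok` and `off_diag_sum` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Reduced example: three nodes and a subset of the question's edges.
nodes = ["A", "B", "C"]
edges = {("A", "B"): 809, ("A", "C"): 184, ("B", "C"): 134}

# Add the diagonal entries to the edge dict before reshaping.
diagonal = {(k, k): 20000 for k in nodes}
df_check = (
    pd.Series({**edges, **diagonal})
    .unstack(fill_value=0)
    .reindex(index=nodes, columns=nodes, fill_value=0)
)

values = df_check.to_numpy()
# Every diagonal entry should be 20000.
diag_ok = bool((np.diag(values) == 20000).all())
# Everything off the diagonal should add up to the edge weights.
off_diag_sum = int(values.sum() - 20000 * len(nodes))

print(df_check)
print(diag_ok, off_diag_sum == sum(edges.values()))
```

The `{**edges, **diagonal}` merge is used instead of `edges | {...}` only so that the sketch also runs on Python versions older than 3.9.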