I'm trying to calculate the probability of each source address given a certain destination IP, both taken from PACKET_IN messages. To do this, I first collect the addresses in a DataFrame and then use a nested loop to compute the probability of each occurrence. The code works in my IDE, but it gives different output on the controller. It seems like something is wrong with the loop statement in my code. Could you give me a hand?
You can eliminate the loops by using the split-apply-combine features of pandas.
First, let's abstract away the "pox" portion of your problem by creating a dataframe with integer src/dst.
import pandas as pd
import numpy as np
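# dst takes integer values 0..4
dst = np.trunc(np.random.uniform(0, 5, size=1000))
# src is dst plus an offset of 0, 1 or 2, so the two columns are correlated
src = np.trunc(np.random.uniform(0, 3, size=1000)) + dst
df = pd.DataFrame({'dst': dst, 'src': src})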
In this example, src and dst are correlated. The frequency counts are available with a single line:
df.groupby('dst').src.value_counts()
which yields something like the following.
dst  src
0.0  0.0    71
     2.0    68
     1.0    45
1.0  3.0    80
     2.0    76
     1.0    60
2.0  4.0    84
     3.0    61
     2.0    56
3.0  3.0    90
     4.0    58
     5.0    50
4.0  4.0    71
     6.0    67
     5.0    63
This gives you raw counts of each src/dst pair. You can convert this into the fraction of time that each src appeared given a single dst by using the groupby object twice: once to compute the frequency of each src/dst like above, and once to compute the frequency of each dst.
g = df.groupby('dst')
g.src.value_counts() / g.size()
Which will yield something like
dst  src
0.0  0.0    0.385870
     1.0    0.244565
     2.0    0.369565
...
4.0  4.0    0.353234
     5.0    0.313433
     6.0    0.333333
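To check the approach on something small enough to verify by hand, here is a minimal sketch. It assumes the addresses from the PACKET_IN handler have already been collected as (src, dst) string pairs; the IP addresses and the `pairs` list are made up for illustration and are not from the question. It also shows two equivalent ways to get the same fractions: `value_counts(normalize=True)` on the grouped column, and an explicit division with `Series.div(..., level='dst')`.

```python
import pandas as pd

# Hypothetical (src, dst) pairs standing in for addresses read from PACKET_IN events.
pairs = [
    ("10.0.0.1", "10.0.0.9"),
    ("10.0.0.1", "10.0.0.9"),
    ("10.0.0.2", "10.0.0.9"),
    ("10.0.0.3", "10.0.0.9"),
    ("10.0.0.1", "10.0.0.8"),
    ("10.0.0.2", "10.0.0.8"),
]
df = pd.DataFrame(pairs, columns=["src", "dst"])

# P(src | dst): within each dst group, the share of rows carrying each src.
p_src_given_dst = df.groupby("dst")["src"].value_counts(normalize=True)

# The same fractions computed explicitly: pair counts divided by per-dst totals,
# matching the two Series on the shared 'dst' index level.
pair_counts = df.groupby(["dst", "src"]).size()
dst_totals = df.groupby("dst").size()
explicit = pair_counts.div(dst_totals, level="dst")

print(p_src_given_dst)
```

Within each dst the fractions sum to 1, which is a quick check to run on the controller as well: if the sums differ there, the input pairs differ, not the counting logic.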