python, probability, bayesian-networks, pgmpy

pgmpy returning incorrect conditional probabilities


I'm trying to use the pgmpy Python package to learn the transition probabilities between a set of states. However, when I fit the model, I find that the learned conditional probabilities are incorrect.

As a very simplified example of the sort of issue I'm talking about, consider a Bayesian network consisting of two binary nodes, A and B, with a single directed edge running from A to B. Suppose we have observed that whenever A is zero, B is one, and whenever A is one, B is zero. The code describing this situation is:

import pandas as pd
from pgmpy.models import BayesianModel

# Six observations in which B is always the complement of A.
data = pd.DataFrame(data={'A': [0, 0, 1, 1, 1, 1], 'B': [1, 1, 0, 0, 0, 0]})

# A network with a single directed edge A -> B; fit() defaults to
# maximum-likelihood estimation of the CPDs.
model = BayesianModel([('A', 'B')])
model.fit(data)

However, when we then inspect the fitted conditional probability table for B via model.cpds[1], we find that pgmpy has learned the following:

+------+------+------+
| A    | A(0) | A(1) |
+------+------+------+
| B(0) | 0.5  | 0.5  |
+------+------+------+
| B(1) | 0.5  | 0.5  |
+------+------+------+

when it should have learned

+------+------+------+
| A    | A(0) | A(1) |
+------+------+------+
| B(0) | 0.0  | 1.0  |
+------+------+------+
| B(1) | 1.0  | 0.0  |
+------+------+------+
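
As a sanity check, the maximum-likelihood estimate of P(B | A) can be computed directly from the observations with pandas, independently of pgmpy. A minimal sketch:

import pandas as pd

data = pd.DataFrame(data={'A': [0, 0, 1, 1, 1, 1], 'B': [1, 1, 0, 0, 0, 0]})

# Count co-occurrences of A and B, then normalize each A column so the
# entries are the conditional relative frequencies P(B | A).
print(pd.crosstab(data['B'], data['A'], normalize='columns'))
# A    0    1
# B
# 0  0.0  1.0
# 1  1.0  0.0

This reproduces the expected table above, so the data themselves are not the problem.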

Can someone please explain what is going on here? This is an extremely basic example, and I feel like I'm going crazy. Thanks!


Solution

  • The version of pgmpy available through pip (at the time of writing) has a bug that causes it to compute conditional probabilities incorrectly. Cloning the development repository from GitHub and installing it manually fixes the issue; a sketch of one way to do this follows below. Thanks to @lstbl for figuring this out here: https://stats.stackexchange.com/questions/292738/inconsistencies-between-conditional-probability-calculations-by-hand-and-with-pg
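
For reference, a minimal sketch of the reinstall. The repository location and branch name here are assumptions on my part rather than details taken from the linked answer, so check them against the pgmpy GitHub page first:

pip uninstall pgmpy
pip install git+https://github.com/pgmpy/pgmpy.git@dev

After reinstalling, re-running the snippet from the question should print the expected CPD.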