python-3.x, numpy, information-theory

Finding conditional mutual information between 3 discrete variables


I am trying to find the conditional mutual information between three discrete random variables using the pyitlib package for Python, with the help of the formula:

I(X;Y|Z)=H(X|Z)+H(Y|Z)-H(X,Y|Z)

The expected conditional mutual information value is 0.011.
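For reference, the right-hand side of this identity can be evaluated directly from empirical counts. Here is a minimal sketch (cond_entropy is my own helper, not a library function):

import math
from collections import Counter

def cond_entropy(t, c):
    # Empirical H(T|C) in bits: -sum over (t,c) of p(t,c) * log2 p(t|c)
    n = len(c)
    joint = Counter(zip(t, c))
    marg = Counter(c)
    return -sum(k / n * math.log2(k / marg[tc[1]]) for tc, k in joint.items())

# I(X;Y|Z) = cond_entropy(X, Z) + cond_entropy(Y, Z) - cond_entropy(list(zip(X, Y)), Z)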

My first code:

import numpy as np
from pyitlib import discrete_random_variable as drv

X=[0,1,1,0,1,0,1,0,0,1,0,0]
Y=[0,1,1,0,0,0,1,0,0,1,1,0]
Z=[1,0,0,1,1,0,0,1,1,0,0,1]

a = drv.entropy_conditional(X, Z)     # H(X|Z)
b = drv.entropy_conditional(Y, Z)     # H(Y|Z)
c = drv.entropy_conditional(X, Y, Z)  # intended to be H(X,Y|Z)

p = a + b - c
print(p)

The answer I am getting here is 0.4632245116328402.

My second code:

import numpy as np
from pyitlib import discrete_random_variable as drv

X=[0,1,1,0,1,0,1,0,0,1,0,0]
Y=[0,1,1,0,0,0,1,0,0,1,1,0]
Z=[1,0,0,1,1,0,0,1,1,0,0,1]

a = drv.information_mutual_conditional(X, Y, Z)  # I(X;Y|Z) computed directly
print(a)

The answer I am getting here is 0.1583445441575102.

While the expected result is 0.011.
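As a sanity check, the definition can also be evaluated directly from the empirical joint distribution. A minimal sketch (cmi is my own helper, not a library function):

import math
from collections import Counter

def cmi(xs, ys, zs):
    # I(X;Y|Z) = sum over (x,y,z) of p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) )
    n = len(zs)
    pxyz = Counter(zip(xs, ys, zs))
    pxz = Counter(zip(xs, zs))
    pyz = Counter(zip(ys, zs))
    pz = Counter(zs)
    return sum(c / n * math.log2(c * pz[z] / (pxz[x, z] * pyz[y, z]))
               for (x, y, z), c in pxyz.items())

X = [0,1,1,0,1,0,1,0,0,1,0,0]
Y = [0,1,1,0,0,0,1,0,0,1,1,0]
Z = [1,0,0,1,1,0,0,1,1,0,0,1]
print(cmi(X, Y, Z))  # ~0.1583, agreeing with information_mutual_conditional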

Can anybody help? Any help would be appreciated. Thanks in advance.


Solution

  • I think that the library function entropy_conditional(x,y,z) has a bug; I tested it on my own samples and the same problem happened. The two-variable form entropy_conditional(x,y) works fine, though. So I coded my own replacement for entropy_conditional(x,y,z), the entropy(x,y,z) function below, and its result is consistent with information_mutual_conditional (0.1583…), which suggests your second code was right and the expected 0.011 does not correspond to this data. The code may not be beautiful.

    import math
    import numpy as np
    from pyitlib import discrete_random_variable as drv
    
    def gen_dict(x):
        # Count the occurrences of each value in x
        dict_z = {}
        for key in x:
            dict_z[key] = dict_z.get(key, 0) + 1
        return dict_z
    
    def entropy(x, y, z):
        # Empirical H(X,Y|Z): group the samples by z, take the joint entropy
        # of (x, y) within each group, and average the groups weighted by p(z)
        x = np.array([x, y, z]).T
        x = x[x[:, -1].argsort()]  # sort the rows by the last column (z)
        z = x[:, -1]
    
        dict_z = gen_dict(z)
        list_z = [dict_z[i] for i in sorted(set(z))]  # group sizes, in sorted-z order
        p_z = np.array(list_z) / sum(list_z)
        pos = 0
        ent = 0
        for i in range(len(list_z)):
            # Rows belonging to the i-th value of z
            w = x[pos:pos + list_z[i], -3]
            y = x[pos:pos + list_z[i], -2]
            pos += list_z[i]
    
            # Contingency table of (w, y) within this group
            list_wy = np.zeros((len(set(w)), len(set(y))), dtype=float, order="C")
            list_w = list(set(w))
            list_y = list(set(y))
            for j in range(len(w)):
                pos_w = list_w.index(w[j])
                pos_y = list_y.index(y[j])
                list_wy[pos_w, pos_y] += 1
    
            # Entropy of the joint (w, y) distribution in this group
            list_p = list_wy.flatten()
            list_p = np.array([k for k in list_p if k > 0]) / sum(list_p)
            ent_t = 0
            for j in list_p:
                ent_t += -j * math.log2(j)
            ent += p_z[i] * ent_t
        return ent
        
    
    X = [0,1,1,0,1,0,1,0,0,1,0,0]
    Y = [0,1,1,0,0,0,1,0,0,1,1,0]
    Z = [1,0,0,1,1,0,0,1,1,0,0,1]
    
    a = drv.entropy_conditional(X, Z)  # H(X|Z)
    b = drv.entropy_conditional(Y, Z)  # H(Y|Z)
    c = entropy(X, Y, Z)               # H(X,Y|Z) from the function above
    p = a + b - c
    print(p)
    # 0.15834454415751043
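If you would rather stay within pyitlib, another workaround is to get H(X,Y|Z) from the chain rule H(X,Y|Z) = H(X,Y,Z) - H(Z). This is a sketch under the assumption that entropy_joint treats each row of a 2-D array as one variable, as the pyitlib documentation describes:

    import numpy as np
    from pyitlib import discrete_random_variable as drv
    
    X = [0,1,1,0,1,0,1,0,0,1,0,0]
    Y = [0,1,1,0,0,0,1,0,0,1,1,0]
    Z = [1,0,0,1,1,0,0,1,1,0,0,1]
    
    # Chain rule: H(X,Y|Z) = H(X,Y,Z) - H(Z)
    # Assumes entropy_joint(np.array([X, Y, Z])) computes H(X,Y,Z)
    c = drv.entropy_joint(np.array([X, Y, Z])) - drv.entropy(Z)
    p = drv.entropy_conditional(X, Z) + drv.entropy_conditional(Y, Z) - c
    print(p)  # should agree with drv.information_mutual_conditional(X, Y, Z)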