
Computing the Gini index in TensorFlow


I'm trying to write the Gini index calculation as a TensorFlow cost function. The Gini index is described here: https://en.wikipedia.org/wiki/Gini_coefficient

A NumPy solution would be:

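import numpy as np

def ginic(actual, pred):
    n = len(actual)
    a_s = actual[np.argsort(pred)]  # actual values ordered by ascending prediction
    a_c = a_s.cumsum()              # running total of the reordered values
    giniSum = a_c.sum() / a_s.sum() - (n + 1) / 2.0
    return giniSum / n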
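
As a quick sanity check of the function above, here is a minimal sketch on a small input that can be verified by hand. The arrays are made up for illustration, and the function is repeated only so the snippet runs on its own.

```python
import numpy as np

def ginic(actual, pred):
    # Same computation as the function above, repeated so this runs standalone.
    n = len(actual)
    a_s = actual[np.argsort(pred)]
    a_c = a_s.cumsum()
    gini_sum = a_c.sum() / a_s.sum() - (n + 1) / 2.0
    return gini_sum / n

# Made-up illustrative data, not from the original post.
actual = np.array([1, 0, 1, 0])
pred = np.array([0.9, 0.1, 0.8, 0.3])

# argsort(pred) -> [1, 3, 2, 0], so a_s = [0, 0, 1, 1] and a_c = [0, 0, 1, 2]
# a_c.sum() / a_s.sum() = 3 / 2 = 1.5; subtracting (4 + 1) / 2 gives -1.0
result = ginic(actual, pred)
print(result)  # -0.25
```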

Can someone help me figure out how to do this in TF? For example, as far as I know, TF has no argsort that can be part of a function that is differentiated.


Solution

  • You can perform the argsort with tf.nn.top_k(). This function returns a tuple whose second element holds the indices. Because top_k sorts in descending order, the indices must be reversed to match the ascending order of np.argsort.

    import tensorflow as tf

    def ginicTF(actual: tf.Tensor, pred: tf.Tensor):
        n = int(actual.get_shape()[-1])  # static length of the last axis
        # top_k returns (values, indices) in descending order; reversing the
        # indices gives the equivalent of np.argsort
        inds = tf.reverse(tf.nn.top_k(pred, n)[1], axis=[0])
        a_s = tf.gather(actual, inds)  # the equivalent of NumPy indexing
        a_c = tf.cumsum(a_s)
        giniSum = tf.reduce_sum(a_c) / tf.reduce_sum(a_s) - (n + 1) / 2.0
        return giniSum / n
    

    Here is code you can use to verify that this function returns the same numerical value as your NumPy function ginic:

    sess = tf.InteractiveSession()
    ac = tf.placeholder(shape=(50,), dtype=tf.float32)
    pr = tf.placeholder(shape=(50,), dtype=tf.float32)
    actual = np.random.normal(size=(50,))
    pred = np.random.normal(size=(50,))
    print('numpy version: {:.4f}'.format(ginic(actual, pred)))
    print('tensorflow version: {:.4f}'.format(
        ginicTF(ac, pr).eval(feed_dict={ac: actual, pr: pred})))
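
The index trick in the answer (take descending indices, then reverse them) can also be checked without TensorFlow. The sketch below uses NumPy stand-ins for tf.nn.top_k and tf.reverse, so it only illustrates the idea and is not the TensorFlow code path itself. The input values are made up.

```python
import numpy as np

pred = np.array([0.3, 0.9, 0.1, 0.8])  # made-up values, all distinct

# Stand-in for tf.nn.top_k(pred, n)[1]: indices of pred, largest value first.
desc_inds = np.argsort(-pred)
# Stand-in for tf.reverse(..., axis=[0]): flip to smallest value first.
asc_inds = desc_inds[::-1]

print(asc_inds.tolist())          # [2, 0, 3, 1]
print(np.argsort(pred).tolist())  # [2, 0, 3, 1]
```

With tied values in pred the two orderings may differ in how they place the ties, so exact agreement with np.argsort should only be expected when the predictions are distinct.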