python pandas dataframe function typeerror

TypeError: 'DataFrame' object is not callable


I wrote this code to calculate the variance of each column:

credit_card = pd.read_csv("default_of_credit_card_clients_Data.csv", skiprows=1)
    
for col in credit_card:
    var[col]=np.var(credit_card(col))

But I'm getting this error:

Traceback (most recent call last):
   File "C:/Python34/project.py", line 11, in <module>
     var[col]=np.var(credit_card(col)) 
TypeError: 'DataFrame' object is not callable

Solution

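  • The error occurs because credit_card(col) uses parentheses, which tries to call the DataFrame as a function. Columns are selected with square brackets: credit_card[col]. You do not need the loop at all, though; it seems you need DataFrame.var. From its documentation:

    Normalized by N-1 by default. This can be changed using the ddof argument.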

    var1 = credit_card.var()
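
If you would rather keep the original loop, a minimal sketch of the fix might look like the following. The small frame is a hypothetical stand-in for the CSV from the question, and the `var` dict is an assumption, since the question does not show how `var` was created. Note that `np.var` uses ddof=0 by default, so the loop produces population variances rather than the N-1 values that `DataFrame.var` returns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV data from the question.
credit_card = pd.DataFrame({"A": [8, 0, 2, 4, 4], "B": [8, 4, 2, 0, 1]})

var = {}  # assumed container; the question's snippet never defines it
for col in credit_card:
    # Square brackets select a column; parentheses would try to call
    # the DataFrame itself and raise the TypeError from the question.
    var[col] = np.var(credit_card[col])

print(var)
```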
    

    Sample:

    # random dataframe
    np.random.seed(100)
    credit_card = pd.DataFrame(np.random.randint(10, size=(5,5)), columns=list('ABCDE'))
    print (credit_card)
       A  B  C  D  E
    0  8  8  3  7  7
    1  0  4  2  5  2
    2  2  2  1  0  8
    3  4  0  9  6  2
    4  4  1  5  3  4
    
    var1 = credit_card.var()
    print (var1)
    A     8.8
    B    10.0
    C    10.0
    D     7.7
    E     7.8
    dtype: float64
    
    var2 = credit_card.var(axis=1)
    print (var2)
    0     4.3
    1     3.8
    2     9.8
    3    12.2
    4     2.3
    dtype: float64
    

    If you need a NumPy solution, use numpy.var:

    print (np.var(credit_card.values, axis=0))
    [ 7.04  8.    8.    6.16  6.24]
    
    print (np.var(credit_card.values, axis=1))
    [ 3.44  3.04  7.84  9.76  1.84]
    

    The results differ because pandas uses ddof=1 by default while numpy.var uses ddof=0. You can pass ddof=0 to pandas to get matching values:

    var1 = credit_card.var(ddof=0)
    print (var1)
    A    7.04
    B    8.00
    C    8.00
    D    6.16
    E    6.24
    dtype: float64
    
    var2 = credit_card.var(ddof=0, axis=1)
    print (var2)
    0    3.44
    1    3.04
    2    7.84
    3    9.76
    4    1.84
    dtype: float64