python pandas attributeerror

Pandas - 'Series' object has no attribute


I need to use a lambda function to do a row-by-row computation. For example, create a DataFrame:

import pandas as pd
import numpy as np

def myfunc(x, y):
    return x + y

colNames = ['A', 'B']
data = np.array([np.arange(10)]*2).T

df = pd.DataFrame(data, index=range(0, 10), columns=colNames)

Using 'myfunc' like this works:

df['D'] = (df.apply(lambda x: myfunc(x.A, x.B), axis=1))

But this second case does not work:

df['D'] = (df.apply(lambda x: myfunc(x.colNames[0], x.colNames[1]), axis=1))

It gives the error:

AttributeError: ("'Series' object has no attribute 'colNames'", u'occurred at index 0')

I really need the second form (accessing the columns through the colNames list), which is the one that raises the error. Any clues on how to do this?


Solution

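  • When you use df.apply() with axis=1, each row of your DataFrame is passed to your lambda function as a pandas Series. The frame's columns become the index of that Series, so you can access values with series[label]. Writing x.colNames looks for an attribute literally named 'colNames' on the row, which does not exist; that is why the AttributeError is raised.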

    So this should work:

    df['D'] = (df.apply(lambda x: myfunc(x[colNames[0]], x[colNames[1]]), axis=1))
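    For comparison, here is a minimal self-contained sketch of the same idea, reusing the question's setup. The names by_label, by_getattr and vectorized are only illustrative. It shows label lookup with brackets, the built-in getattr() as an equivalent when the label is a valid Python identifier, and a column-wise call that skips apply() entirely, which should work here because myfunc is plain addition.

```python
import numpy as np
import pandas as pd


def myfunc(x, y):
    return x + y


colNames = ['A', 'B']
data = np.array([np.arange(10)] * 2).T
df = pd.DataFrame(data, index=range(0, 10), columns=colNames)

# Bracket indexing: the column name is looked up as a label in the row's index.
by_label = df.apply(
    lambda row: myfunc(row[colNames[0]], row[colNames[1]]), axis=1
)

# getattr() resolves the attribute name at runtime, so it accepts a variable.
# This only works for labels that are valid Python identifiers.
by_getattr = df.apply(
    lambda row: myfunc(getattr(row, colNames[0]), getattr(row, colNames[1])),
    axis=1,
)

# Column-wise alternative: pass whole columns instead of iterating over rows.
# Assumes the function works element-wise on Series, as addition does.
vectorized = myfunc(df[colNames[0]], df[colNames[1]])

df['D'] = by_label
```

    All three variants should produce the same values for this data. The column-wise form is usually the faster choice when the function can operate on whole Series.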