python pandas dataframe if-statement ambiguous

Search for two values in a CSV file using pandas


My CSV file looks like this:

drug1   | drug2   | chem        | enzym | indication | offside     | pathway | sideeffect  | target | transporter
DB00945 | DB00526 | 0.217391304 | 0     | 0          | 0.139664804 | 0       | 0.470588235 | 0      | 0

I want to use pandas to find the row that matches both drug1 and drug2, and then take the corresponding features. This is what I tried. Note that drugName is a string and userDrugs is a list of drugs I want to iterate over.

def getData(drugName,userDrugs):
    data = pd.read_csv(r"C:\Users\soso-\Desktop\python & flutter\drug data.csv" ,sep=",")
    df = pd.DataFrame(data)
    for i in userDrugs:
        similarities=df[(df['drug1']==(drugName) ) and (df['drug2']==(i))]['chem','enzym','indication','offside','pathway','sideeffect','target','transporter']
        result = loaded_model.predict([similarities])
        print(result)

However, this error occurs:

getData('DB00945', ['DB00454','DB00952'])
  File "c:\Users\soso-\Desktop\python & flutter\model.py", line 9, in getData
    similarities=df[(df['drug1']==(drugName) ) and (df['drug2']==(i))]['chem','enzym','indication','offside','pathway','sideeffect','target','transporter']
  File "C:\Users\soso-\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\generic.py", line 1527, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I solve it? I want to look up drugName paired with userDrugs[0] and print the result, then repeat for each remaining drug until I have iterated over the whole list.


Solution

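  • Change and to & in the following line. Python's and asks for the truth value of a whole Series, which pandas refuses to guess (hence the ValueError), whereas & combines the two conditions element-wise. Keep the parentheses around each comparison, because & binds more tightly than ==. The column list also needs double brackets, [[...]], to select several columns at once.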

    similarities=df[(df['drug1']==(drugName)) & (df['drug2']==(i))][['chem','enzym','indication','offside','pathway','sideeffect','target','transporter']]
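
The fix can be tried without the original CSV or model. The sketch below is only an illustration under stated assumptions: the small DataFrame, the shortened FEATURES list, and the helper name get_features are made up for the example and are not from the question. It builds the mask with &, selects the feature columns with .loc, and skips pairs that are not present in the data.

```python
import pandas as pd

# Small in-memory stand-in for the CSV in the question (values are illustrative).
df = pd.DataFrame({
    "drug1": ["DB00945", "DB00945", "DB00001"],
    "drug2": ["DB00526", "DB00454", "DB00454"],
    "chem": [0.217391304, 0.1, 0.3],
    "enzym": [0.0, 0.5, 0.0],
})

FEATURES = ["chem", "enzym"]  # shortened, hypothetical feature list for the sketch


def get_features(df, drug_name, user_drugs):
    """Return {drug2: feature rows} for every (drug_name, drug2) pair found."""
    found = {}
    for other in user_drugs:
        # `&` combines the two boolean Series element-wise; each comparison
        # needs its own parentheses because `&` binds more tightly than `==`.
        mask = (df["drug1"] == drug_name) & (df["drug2"] == other)
        rows = df.loc[mask, FEATURES]
        if rows.empty:
            # Pair not present in the data: skip instead of passing on an empty frame.
            continue
        found[other] = rows
    return found


pairs = get_features(df, "DB00945", ["DB00454", "DB00952"])
print(pairs["DB00454"])
```

Each value in pairs is already a two-dimensional DataFrame. If loaded_model is a scikit-learn-style estimator (an assumption, since the question does not say), it can likely be passed as loaded_model.predict(rows) rather than wrapped in another list.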