I am importing a CSV file into pandas that contains data like the sample below. In the code that follows, I want to get the integer index of the row in df1 whose name column contains name_to_search.
name, ColB, ColC, ColD
P1, 1,1,1
P2, 0,1,0
P3, 1,1,0
...
df1 = pd.read_csv(filepath_or_buffer='file.csv', header=[0])
df1['name'].str.lower()
name_to_search = 'p1'
row_indx1 = df1.index.get_loc(df1[df1['name'] == name_to_search].index[0]) # error line
However, I am getting IndexError: index 0 is out of bounds for axis 0 with size 0 on the line marked as the error line. Any idea how to fix it?
With a default index, use next with iter, which returns a default value (here -1) when there is no match:
name_to_search = 'p1'
row_indx1 = next(iter(df1[df1['name'].str.lower() == name_to_search].index), -1)
print (row_indx1)
0
name_to_search = 'tmp'
row_indx1 = next(iter(df1[df1['name'].str.lower() == name_to_search].index), -1)
print (row_indx1)
-1
Or filter the index directly, which is faster for large DataFrames:
row_indx1 = next(iter(df1.index[df1['name'].str.lower() == name_to_search]), -1)
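The lookup above can be wrapped in a small helper. This is only a sketch: the function name find_row_index and the in-memory CSV (standing in for file.csv) are assumptions made for illustration, not part of the original code.

```python
import io
import pandas as pd

# In-memory stand-in for file.csv, using the sample rows from the question.
csv_text = "name,ColB,ColC,ColD\nP1,1,1,1\nP2,0,1,0\nP3,1,1,0\n"
df1 = pd.read_csv(io.StringIO(csv_text))

def find_row_index(df, name, default=-1):
    """Hypothetical helper: first index label whose lowercased name matches, else default."""
    mask = df['name'].str.lower() == name.lower()
    # next() with a default avoids IndexError when the filtered index is empty.
    return next(iter(df.index[mask]), default)

found = find_row_index(df1, 'p1')     # a match exists in the first row
missing = find_row_index(df1, 'tmp')  # no match, so the default is returned
print(found, missing)
```

Returning -1 follows the convention used above; returning None instead may be safer if -1 could later be mistaken for a valid position.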
Any idea what is going wrong in the original code?
The first problem is that the lowercased values are not assigned back, so the comparison is still made against the original uppercase values (P1, P2, ...):
df1['name'].str.lower()
print (df1['name'])
0 P1
1 P2
2 P3
Name: name, dtype: object
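A minimal sketch of the difference between calling str.lower() and assigning its result back. The small DataFrame is built inline as an assumption so the example runs without file.csv.

```python
import pandas as pd

# Inline stand-in for the CSV data from the question.
df1 = pd.DataFrame({'name': ['P1', 'P2', 'P3'], 'ColB': [1, 0, 1]})

# str.lower() returns a new Series; without assignment the column is unchanged.
df1['name'].str.lower()
unchanged = df1['name'].tolist()

# Assigning the result back is what actually modifies the column.
df1['name'] = df1['name'].str.lower()
lowered = df1['name'].tolist()

# The comparison against 'p1' now finds the first row.
matches = df1.index[df1['name'] == 'p1'].tolist()
print(unchanged, lowered, matches)
```

If the original casing should be preserved in the data, lowercasing only inside the comparison (as in the next/iter solution) avoids overwriting the column.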
The second problem is that when there is no match, it is not possible to select the first index value of an empty DataFrame, so IndexError is raised.
print (df1[df1['name'] == name_to_search])
Empty DataFrame
Columns: [name, ColB, ColC, ColD]
Index: []
print (df1[df1['name'] == name_to_search].index)
Index([], dtype='int64')
print (df1[df1['name'] == name_to_search].index[0])
IndexError: index 0 is out of bounds for axis 0 with size 0
With a default index, Index.get_loc returns the same value that is passed in; it is meant for getting the position of a label in a non-RangeIndex:
print (df1.index.get_loc(2))
2
df = pd.DataFrame({'a':range(3)}, index=['E','W','T'])
print (df)
a
E 0
W 1
T 2
print (df.index.get_loc('E'))
0
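If the DataFrame has a non-default index and the integer position is what is needed, one possible approach is to take the position from the boolean mask itself. This is a sketch under assumptions: the helper name find_row_position and the use of numpy.flatnonzero are illustrative choices, not something taken from the original code.

```python
import numpy as np
import pandas as pd

# Same names as in the question, but with string labels instead of a RangeIndex.
df = pd.DataFrame({'name': ['P1', 'P2', 'P3']}, index=['E', 'W', 'T'])

def find_row_position(df, name, default=-1):
    """Hypothetical helper: integer position of the first matching row, else default."""
    mask = (df['name'].str.lower() == name.lower()).to_numpy()
    positions = np.flatnonzero(mask)  # integer positions where the mask is True
    return int(positions[0]) if positions.size else default

pos = find_row_position(df, 'p2')
label = df.index[pos]  # the index label at that position
print(pos, label, df.index.get_loc(label))
```

For a unique index, Index.get_loc on the matching label gives the same position, so both routes agree here.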