I realize this issue has been explained many times before so I understand if this is closed as a duplicate, but I have some more theoretic questions to ask that may justify this as a new question. I'm new to Python (and SO), so bear with me.
I'm trying to read in a .csv file that has 16 columns and 30,000ish rows, populated with values from 0 to 17. There are no empty cells. What I would like to do is iterate through each of the rows, doing entry-wise subtraction with the cells from each other row. Currently, I'm attempting to do this using a Pandas DataFrame. So my first question is: Should I be using a different data structure? I've read that DataFrames are bad for iterating through rows.
Next, for the title question, I need help interpreting my error. Thus far, I've only written code to try this subtraction on a small subset of the data. Here's my code:
import numpy as np
import pandas as pd
scrambles = pd.read_csv('scrambles.csv')
df = pd.DataFrame(scrambles)
#print(df)
columns = list(df)
for i in columns:
    print(df[i][0] - df[i][1])
This all works as anticipated. However, when I change the last piece of code to the following, I get an error:
for i in range(15):
    print(df[i][0] - df[i][1])
I'll post a transcript of the error below. The reason I'm trying to do it this way even though I have working code is that when I write the full script, I'll be iterating over a known number of rows. For what it's worth, I'm doing this on Jupyter online.
KeyError Traceback (most recent call last)
/srv/conda/envs/notebook/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2889 try:
-> 2890 return self._engine.get_loc(key)
2891 except KeyError:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-6-0faa876fbe56> in <module>
1 for i in range(15):
----> 2 print (df[i][0]-df[i][1])
/srv/conda/envs/notebook/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
2973 if self.columns.nlevels > 1:
2974 return self._getitem_multilevel(key)
-> 2975 indexer = self.columns.get_loc(key)
2976 if is_integer(indexer):
2977 indexer = [indexer]
/srv/conda/envs/notebook/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2890 return self._engine.get_loc(key)
2891 except KeyError:
-> 2892 return self._engine.get_loc(self._maybe_cast_indexer(key))
2893 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2894 if indexer.ndim > 1 or indexer.size > 1:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
I'll expand on my comment to answer the original question - interpreting the exception.
The most likely cause of the error is that your dataframe is not using integers for its column names, so looking up a column with an integer from range(15) (0 through 14) fails on the very first value, 0. That is the KeyError you're seeing, and it is the final line of both exceptions: KeyError: 0
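Here is a minimal sketch of the difference between looking a column up by label and by position. It assumes your CSV has a header row, so that read_csv gives the columns string labels; the labels "a", "b", "c" and the values are made up for illustration and are not from your file.

```python
import pandas as pd

# Hypothetical stand-in for scrambles.csv: string column labels, as read_csv
# would produce from a header row.
df = pd.DataFrame({"a": [5, 3], "b": [7, 7], "c": [0, 17]})

print(list(df.columns))  # the labels that df[...] looks up

# df[0] looks for a column *labelled* 0, which does not exist here.
try:
    df[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# .iloc selects by position instead, so integers from range() work.
diffs = [int(df.iloc[0, i] - df.iloc[1, i]) for i in range(df.shape[1])]
print(diffs)
```

Using df.shape[1] instead of a hard-coded 15 keeps the loop in step with however many columns the file actually has.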
In the Traceback, Python is giving you additional context to the error that is happening.
When you make the attempt to access column 0 of your dataframe, the processing code reaches line 2890 of base.py in the function get_loc().
In that code, the KeyError that occurs is handled by the containing try/except. However, the handling call also raises a KeyError which is not handled (this call is also unfortunately not included in the Traceback). This is where the "During handling of the above exception, another exception occurred:" message comes in.
To illustrate using the code itself:
...
try:
    return self._engine.get_loc(key)  # <- KeyError raised here
except KeyError:  # <- Caught by except
    return self._engine.get_loc(self._maybe_cast_indexer(key))  # <- 2nd KeyError
...
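The same "During handling of the above exception" behaviour can be reproduced without pandas. The sketch below is not pandas' code: lookup is a hypothetical function and a plain dict stands in for the index engine. It only shows how an exception raised inside an except block keeps a reference to the one that was being handled.

```python
def lookup(table, key):
    try:
        return table[key]        # first KeyError raised here
    except KeyError:
        # Retry inside the handler; if this also fails, the new KeyError
        # escapes and Python chains it to the first one.
        return table[str(key)]

try:
    lookup({"a": 1}, 0)
    caught = None
except KeyError as exc:
    caught = exc

# The escaping exception remembers the one that was being handled.
print(repr(caught))
print(repr(caught.__context__))
```

If you let the exception propagate instead of catching it, Python prints both tracebacks separated by that message, which is the shape of the output in your question.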
Finally, as I said in the comment, the final line of the Traceback reveals the error:
KeyError: 0
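On your first question, one option worth trying is to pull the values out into a NumPy array and let broadcasting do the entry-wise subtraction for a whole row at a time. This is only a sketch with made-up data (3 rows, 2 columns), not a claim about what is fastest for your file.

```python
import numpy as np
import pandas as pd

# Made-up data standing in for scrambles.csv.
df = pd.DataFrame({"a": [5, 3, 1], "b": [7, 7, 0]})
values = df.to_numpy()

# Entry-wise difference between row 0 and every row, without a Python loop:
# broadcasting stretches values[0] across all rows of the array.
diff_from_row0 = values[0] - values
print(diff_from_row0)
```

With about 30,000 rows and 16 columns, computing every pair at once would mean 30,000 x 30,000 x 16 values, so it is probably safer to process one row (or a small batch of rows) at a time as above.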