I have to parse data from Excel files (and I can't control the incoming format). Four columns have column headers; the fifth column does not, and is often blank -- sometimes there is no data at all in this column, and sometimes one or two rows may have data, so there can be an arbitrary number of blanks in that column near the top. If I do this:
pd.read_excel(excel_file, usecols="A:E")
I get a warning: FutureWarning: Defining usecols with out of bounds indices is deprecated and will raise a ParserError in a future version.
and it doesn't give me column 5.
I have a workaround using openpyxl, but I would like to use pandas if possible. Is there a way to do this?
you can use dataframe.iloc the first value is the row range second value is the column range
so in this example
row will start at index 0 and end at the last entry
Column will start at the second last column and end at the last column
import pandas as pd
data_frame = pd.read_excel("test.xlsx")
print(data_frame.iloc[0:, -1:])