[SOLVED] How to consistently handle Excel boolean with pandas

How to consistently handle Excel boolean with pandas

I'm have an Excel file where the first column is:

(where TRUE/FALSE are Excel's boolean type or whatever Excel calls them)

I'm reading this file with python 3.6, pandas 1.0.3 and xlrd 1.2.0:

dat = pandas.read_excel('simple.xlsx', engine= 'xlrd', header= None)

The TRUE/FALSEs seem to be converted in an inconsistent way:

print(list(dat.iloc[:,0]))

['col1', 1, 2, 3, 6, 1, False, 1]

TRUE has been converted to 1 and FALSE to False. This means I cannot tell whether 1 is actually the number 1 or True.

Is there a way to tell pandas.read_excel to consistently convert TRUE to True and FALSE to False?

(NB I'm using xlrd rather then openpyxl since xlrd is supported by more versions of pandas - I'm happy to switch to openpyxl if that helps here...)

Solution

Not really answering the question but that's how I resolved it. Basically, I needed an Excel to csv exporter and I hoped to get away with a few lines of pandas.

I ended up writing my own excelToCsv with Java and the POI library which seems much more sophisticated than xlrd, openpyxl etc...