excel python-3.x pandas

File corruption while writing using Pandas


I am reading data from a perfectly valid xlsx file and processing it with Pandas in Python 3.5. At the end, I write the final dataframe to an Excel file using:

writer = pd.ExcelWriter(os.path.join(DATA_DIR, 'Data.xlsx'),
                        engine='xlsxwriter', options={'strings_to_urls': False})
manual_labelling_data.to_excel(writer, 'Sheet_A', index=False)
writer.save()

When I try to open Data.xlsx, Excel reports the error: We found a problem with some content in 'Data.xlsx'... If I let it proceed, the file loads into Excel with this message: Removed Records: Formula from /xl/worksheets/sheet1.xml part

I cannot find out what the problem is.


Solution

  • Thanks a lot to @jmcnamara for the help in the comments. The issue was that some strings in the data were wrongly being interpreted as formulas. Setting strings_to_formulas to False in the writer options prevents this. The corrected code is:

    import os
    import pandas as pd

    # Stop XlsxWriter from converting strings into formulas or hyperlinks.
    options = {'strings_to_formulas': False, 'strings_to_urls': False}
    writer = pd.ExcelWriter(os.path.join(DATA_DIR, 'Data.xlsx'),
                            engine='xlsxwriter', options=options)
    manual_labelling_data.to_excel(writer, 'Sheet_A', index=False)
    writer.save()
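
To see which values trigger the problem, it can help to scan the data before writing it. The sketch below assumes that XlsxWriter, with its default settings, treats any string beginning with "=" as a formula; that matches the error above, but check it against the XlsxWriter documentation for your version. The helper name find_formula_like_cells and the sample rows are hypothetical and only stand in for the real dataframe.

```python
def find_formula_like_cells(rows):
    """Return (row_index, col_index, value) for every string starting with '='."""
    hits = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            # Assumption: a leading '=' is what makes a string look like a formula.
            if isinstance(value, str) and value.startswith("="):
                hits.append((r, c, value))
    return hits


# Hypothetical sample data standing in for the rows of the real dataframe.
sample_rows = [
    ["id", "comment"],
    [1, "looks fine"],
    [2, "=== see note ==="],
    [3, "=SUM(A1:A2"],
]

suspects = find_formula_like_cells(sample_rows)
for r, c, value in suspects:
    print("row {}, column {}: {!r}".format(r, c, value))
```

For a dataframe, the same helper can be called as find_formula_like_cells(manual_labelling_data.itertuples(index=False)), which reports positions relative to the data rows rather than the header. Also note that, as far as I know, recent pandas releases no longer accept the options keyword or writer.save(); there the same settings would be passed through engine_kwargs={'options': options} and the writer closed with writer.close() or a with block.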