pyexcel get_book and get_records functions throw exceptions for XLSX files


I'm trying to open an XLSX file using pyexcel, but both get_book and get_records fail with the error below. If I convert the same file to XLS, it reads without problems. The files are uploaded by users, so I cannot stop them from uploading XLSX files.

>>> import pyexcel

>>> workbook = pyexcel.get_book(file_name='Sample_Employee_data_xls.xlsx')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/me/env/lib/python3.10/site-packages/pyexcel/core.py", line 47, in get_book
    book_stream = sources.get_book_stream(**keywords)
  File "/home/me/env/lib/python3.10/site-packages/pyexcel/internal/core.py", line 38, in get_book_stream
    sheets = a_source.get_data()
  File "/home/me/env/lib/python3.10/site-packages/pyexcel/plugins/sources/file_input.py", line 38, in get_data
    sheets = self.__parser.parse_file(self.__file_name, **self._keywords)
  File "/home/me/env/lib/python3.10/site-packages/pyexcel/plugins/parsers/excel.py", line 19, in parse_file
    return self._parse_any(file_name, **keywords)
  File "/home/me/env/lib/python3.10/site-packages/pyexcel/plugins/parsers/excel.py", line 40, in _parse_any
    sheets = get_data(anything, file_type=file_type, **keywords)
  File "/home/me/env/lib/python3.10/site-packages/pyexcel_io/io.py", line 86, in get_data
    data, _ = _get_data(
  File "/home/me/env/lib/python3.10/site-packages/pyexcel_io/io.py", line 105, in _get_data
    return load_data(**keywords)
  File "/home/me/env/lib/python3.10/site-packages/pyexcel_io/io.py", line 205, in load_data
    result = reader.read_all()
  File "/home/me/env/lib/python3.10/site-packages/pyexcel_io/reader.py", line 95, in read_all
    content_dict = self.read_sheet_by_index(sheet_index)
  File "/home/me/env/lib/python3.10/site-packages/pyexcel_io/reader.py", line 84, in read_sheet_by_index
    sheet_reader = self.reader.read_sheet(sheet_index)
  File "/home/me/env/lib/python3.10/site-packages/pyexcel_xlsx/xlsxr.py", line 148, in read_sheet
    sheet = SlowSheet(native_sheet, **self.keywords)
  File "/home/me/env/lib/python3.10/site-packages/pyexcel_xlsx/xlsxr.py", line 72, in __init__
    for ranges in sheet.merged_cells.ranges[:]:
TypeError: 'set' object is not subscriptable

>>> workbook = pyexcel.get_book(file_name='Sample_Employee_data_xls.xls') # working

Here is my requirements file.

asgiref==3.6.0
asttokens==2.2.1
autopep8==2.0.1
backcall==0.2.0
certifi==2022.12.7
chardet==5.1.0
charset-normalizer==2.1.1
decorator==5.1.1
Django==3.2.16
django-cors-headers==3.13.0
django-filter==22.1
djangorestframework==3.13.1
et-xmlfile==1.1.0
executing==1.2.0
idna==3.4
ipython==8.8.0
jedi==0.18.2
lml==0.1.0
matplotlib-inline==0.1.6
openpyxl==3.1.0
parso==0.8.3
pexpect==4.8.0
pickleshare==0.7.5
prompt-toolkit==3.0.36
ptyprocess==0.7.0
pure-eval==0.2.2
pycodestyle==2.10.0
pyexcel==0.7.0
pyexcel-io==0.6.6
pyexcel-xls==0.7.0
pyexcel-xlsx==0.6.0
Pygments==2.14.0
pytz==2022.7
requests==2.28.1
six==1.16.0
sqlparse==0.4.3
stack-data==0.6.2
texttable==1.6.7
tomli==2.0.1
traitlets==5.8.1
urllib3==1.26.13
wcwidth==0.2.5
xlrd==2.0.1
xlwt==1.3.0

Solution

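  • You can downgrade openpyxl to 3.0.10 for now. pyexcel reads XLSX files through pyexcel-xlsx, which uses openpyxl. Your traceback shows pyexcel-xlsx 0.6.0 slicing sheet.merged_cells.ranges with [:], which fails with the openpyxl 3.1.0 in your requirements file because ranges is a set there. I referenced your issue here:

    https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1960

    The commands are:

    pip uninstall openpyxl
    pip install openpyxl==3.0.10

    Also change openpyxl==3.1.0 to openpyxl==3.0.10 in your requirements file so the downgrade is not undone by the next install.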
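
If you want the application to notice this problem before a user upload fails, a startup check along these lines might help. It is only a sketch: the helper names (parse_version, needs_downgrade, installed_openpyxl_version) are made up for illustration, and the 3.1 threshold is an assumption based on the 3.1.0 in the question's requirements file and the 3.0.10 that is reported to work.

```python
from importlib import metadata

# Assumption: the 3.1 series (3.1.0 is the version in the question) is where
# the error starts; 3.0.10 is the version the answer pins as working.
FIRST_AFFECTED = (3, 1)


def parse_version(text):
    """Turn '3.0.10' into (3, 0, 10); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def needs_downgrade(openpyxl_version):
    """Return True if this openpyxl version is at or above FIRST_AFFECTED."""
    return parse_version(openpyxl_version) >= FIRST_AFFECTED


def installed_openpyxl_version():
    """Return the installed openpyxl version string, or None if absent."""
    try:
        return metadata.version("openpyxl")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_openpyxl_version()
    if version is None:
        print("openpyxl is not installed")
    elif needs_downgrade(version):
        print(f"openpyxl {version} may break pyexcel-xlsx 0.6.0; try 3.0.10")
    else:
        print(f"openpyxl {version} is below the assumed affected range")
```

The check only compares version numbers; it does not open a workbook, so it cannot tell whether a newer pyexcel-xlsx release has already adapted to the change.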