javaapache-poixlsxxssf

Is there a way I can open an xlsx file in without facing the CharconversionException exception?


My application processes excel files; but when opening xlsx Excel format, it gives me the following error. Is there something I'm doing wrong or anything I've left out?. Below is the code and the error stack trace; I've traced the error's origin to here:

myExcelBook = new XSSFWorkbook(new FileInputStream(f));
myExcelSheet = ((XSSFWorkbook) myExcelBook).getSheetAt(0);

The exception stack trace:

Caused by: java.io.CharConversionException: Characters larger than 4 bytes are not supported: byte 0x8f implies a length of more than 4 bytes
    at org.apache.xmlbeans.impl.piccolo.xml.UTF8XMLDecoder.decode(UTF8XMLDecoder.java:162)
    at org.apache.xmlbeans.impl.piccolo.xml.XMLStreamReader$FastStreamDecoder.read(XMLStreamReader.java:762)
    at org.apache.xmlbeans.impl.piccolo.xml.XMLStreamReader.read(XMLStreamReader.java:162)
    at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yy_refill(PiccoloLexer.java:3477)
    at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer.java:3962)
    at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290)
    at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:1400)
    at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
    at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3479)
    at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1277)
    at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1264)
    at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
    at org.openxmlformats.schemas.spreadsheetml.x2006.main.SstDocument$Factory.parse(Unknown Source)
    at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:119)
    at org.apache.poi.xssf.model.SharedStringsTable.<init>(SharedStringsTable.java:106)
    at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:67)
    ... 61 more

I wanted to get the number of columns and cell data from the xlsx file for processing in my application, but when I load the file in the JavaFx application, the error shows up and the application freezes.


Solution

  • The issue here is that I was using an outdated version of the apache poi and XML beans libraries.By upgrading to the apache poi version 5.2.3 , org.apache.xmlbeans version 5.1.1 and poi-ooxml version 5.2.3 , I was able to open the xlsx file. Also, I faced the error below during opening one of my larger xlsx file. Error:

      Exception in thread "main" org.apache.poi.util.RecordFormatException: Tried to read data but the maximum length for this record type is 100,000,000.
    If the file is not corrupt and not large, please open an issue on bugzilla to request 
    increasing the maximum allowable size for this record type.
    You can set a higher override value with IOUtils.setByteArrayMaxOverride()
        at org.apache.poi.poi/org.apache.poi.util.IOUtils.throwRecordTruncationException(IOUtils.java:607)
        at org.apache.poi.poi/org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:249)
        at org.apache.poi.poi/org.apache.poi.util.IOUtils.toByteArrayWithMaxLength(IOUtils.java:220)
        at org.apache.poi.ooxml/org.apache.poi.openxml4j.util.ZipArchiveFakeEntry.<init>(ZipArchiveFakeEntry.java:81)
        at org.apache.poi.ooxml/org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:98)
        at org.apache.poi.ooxml/org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:132)
        at org.apache.poi.ooxml/org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:319)
        at org.apache.poi.ooxml/org.apache.poi.ooxml.util.PackageHelper.open(PackageHelper.java:59)
        at org.apache.poi.ooxml/org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:290)
        at org.apache.poi.ooxml/org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:286)
        at com.example.scrapper/com.example.scrapper.ChangeToCSV.main(ChangeToCSV.java:15)
    
    Process finished with exit code 1
    

    I fixed the error by using the line below, just before declaring the Workbook and other variables to open the xlsx file: IOUtils.setByteArrayMaxOverride(1000000000); Illustration here:

    IOUtils.setByteArrayMaxOverride(1000000000);
    try {
         FileInputStream inputStream = new FileInputStream(xlsxFilePath);
         Workbook workbook = new XSSFWorkbook(inputStream);
         Sheet sheet = workbook.getSheetAt(0);
         //The rest of the code