javaexcelzipapache-poihssfworkbook

Unable to create HSSFWorkbook workbook for a .xls file that is within a ZIP file


My requirement is that there is a .xls file within a zip file, which can be downloaded using an URL. As I want to read this excel file in memory (for later processing), without downloading the zip locally, I have used ZipInputStream and this is how the main part of my code looks like:

String finalUrl = "https://server/myZip.zip"
URL url = new URL(finalUrl);
InputStream inputStream = new BufferedInputStream(url.openStream());
ZipInputStream zis = new ZipInputStream(inputStream);
ZipEntry file;
try {
  while ((file = zis.getNextEntry()) != null) {
    if (file.getName().endsWith(".xls")) {
      log.info("xls file found");
      log.info("file name : {}", file.getName());
      byte excelBytes[] = new byte[(int)file.getSize()];
      zis.read(excelBytes);
      InputStream excelInputStream = new ByteArrayInputStream(excelBytes);
      HSSFWorkbook wb = new HSSFWorkbook(excelInputStream);
      HSSFSheet sheet = wb.getSheetAt(8);
      log.info("sheet : {}", sheet.getSheetName());
    }
    else {
      log.info("xls file not found");
    }
  }
}
finally{
   zis.close();
}

But unfortunately I am receiving the following error:

java.lang.ArrayIndexOutOfBoundsException: Index -3 out of bounds for length 3247

Note: The .xls file is around 2MB and the zip file does not have any complex structure such as sub-directories or multiple files.

Any help here would be highly appreciated. Thanks!


Solution

  • Thanks to @PJ Fanning for highlighting this, The problem was in zis.read(excelBytes) which does not guarantee to read all the bytes. After using IOUtils.toByteArrayinstead, the problem was resolved. The correct code is:

    String finalUrl = "https://server/myZip.zip"
    URL url = new URL(finalUrl);
    InputStream inputStream = new BufferedInputStream(url.openStream());
    ZipInputStream zis = new ZipInputStream(inputStream);
    ZipEntry file;
    try {
      while ((file = zis.getNextEntry()) != null) {
        if (file.getName().endsWith(".xls")) {
          log.info("xls file found");
          log.info("file name : {}", file.getName());
          byte excelBytes[] = IOUtils.toByteArray(zis);
          InputStream excelInputStream = new ByteArrayInputStream(excelBytes);
          HSSFWorkbook wb = new HSSFWorkbook(excelInputStream);
          HSSFSheet sheet = wb.getSheetAt(8);
          log.info("sheet : {}", sheet.getSheetName());
        }
        else {
          log.info("xls file not found");
        }
      }
    }
    finally{
       zis.close();
    }