javaperformanceoptimizationxssf

What is causing my program to bog down when writing to XSSF Workbook?


Maybe "writing" wasn't the correct word since in this function, I am just setting the cells and then writing afterwards.

I have a function that I have pin pointed to be the cause of it bogging down. When it gets to this function, it spends over 10 minutes here before I just terminate it. This is the function that I am passing an output_wb to:

private static void buildRowsByListOfRows(int sheetNumber, ArrayList<Row> sheet, Workbook wb) {
    Sheet worksheet = wb.getSheetAt(sheetNumber);
    int lastRow;
    Row row;
    String cell_value;
    Cell cell;
    int x = 0;
    System.out.println("Size of array list: " + sheet.size());
    for (Row my_row : sheet) {
        try {
            lastRow = worksheet.getLastRowNum();
            row = worksheet.createRow(++lastRow);

            for (int i = 0; i < my_row.getLastCellNum(); i++) {
                cell_value = getCellContentAsString(my_row.getCell(i, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK));
                cell = row.createCell(i);
                cell.setCellValue(cell_value);
                System.out.println("setting row #: " + x + "with value =>" + cell_value);
            }
            x++;
        } catch (Exception e) {
            System.out.println("SOMETHING WENT WRONG");
            System.out.println(e);
        }
    }
}

The size of the ArrayList is 73,835. It starts off running pretty fast then it gets to around row 20,000 and it then you can see the print statements in the loop getting spread out further and further apart. Each row has 70 columns.

Is this function really written that poorly or is something else going on? What can I do to optimize this?

I create the output workbook like this if this matters:

// Create output file with the required sheets 
createOutputXLSFile(output_filename_path);
XSSFWorkbook output_wb = new XSSFWorkbook(new FileInputStream(output_filename_path));

And the createOutputXLSFile() looks like this:

private static void createOutputXLSFile(String output_filename_path) throws FileNotFoundException {
    try {
        // Directory path where the xls file will be created
        // Create object of FileOutputStream
        FileOutputStream fout = new FileOutputStream(output_filename_path);
        XSSFWorkbook wb = new XSSFWorkbook();
        wb.createSheet("Removed records"); 
        wb.createSheet("Added records"); 
        wb.createSheet("Updated records"); 

        // Build the Excel File
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        wb.write(outputStream);
        outputStream.writeTo(fout);
        outputStream.close();
        fout.close();
        wb.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
    private static String getCellContentAsString(Cell cell) {
        DataFormatter fmt = new DataFormatter();
        String data = null;
        if (cell.getCellType() == CellType.STRING) {
            data = String.valueOf(cell.getStringCellValue());
        } else if (cell.getCellType() == CellType.NUMERIC) {
            data = String.valueOf(fmt.formatCellValue(cell));
        } else if (cell.getCellType() == CellType.BOOLEAN) {
            data = String.valueOf(fmt.formatCellValue(cell));
        } else if (cell.getCellType() == CellType.ERROR) {
            data = String.valueOf(cell.getErrorCellValue());
        } else if (cell.getCellType() == CellType.BLANK) {
            data = String.valueOf(cell.getStringCellValue());
        } else if (cell.getCellType() == CellType._NONE) {
            data = String.valueOf(cell.getStringCellValue());
        }
        return data;
    }

Update #1- Seems to be happening here. If I comment out all 3 lines then it finishes:

    cell_value = getCellContentAsString(my_row.getCell(i, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK));
    cell = row.createCell(i);
    cell.setCellValue(cell_value);

Update #2 - If I comment out these two lines, then the loop finishes as expected:

    cell = row.createCell(i); // The problem
    cell.setCellValue(cell_value);

So now I know the problem is the row.createCell(i) but why? How can I optimize this?


Solution

  • I finally managed to resolve this issue. Turns out that using XSSF to write is just too slow if the files are large. So I converted the XSSF output workbook to an SXSSFWorkbook. To do that I just passed in my already existing XSSFWorkbook into SXSSFWorkbook like this :

            // Create output file with the required sheets 
            createOutputXLSFile(output_filename_path);
            XSSFWorkbook output_wb_temp = new XSSFWorkbook(new FileInputStream(output_filename_path));
            SXSSFWorkbook output_wb = new SXSSFWorkbook(output_wb_temp);
    

    The rest of the code works as is.