Tags: java, excel, apache-poi, sxssf

How to add columns to an existing large excel file using SXSSF Apache POI?


I am working with a large Excel file (larger than 40 MB, more than 100k rows and 50 columns). I read it successfully with the POI (version 3.10.1) event-stream API, do some calculations, and store the results in a List.

Now I have to append this List as a new column in the same file. This is the part I am having trouble with.

I tried to do this with the code below:

FileInputStream excelFile = new FileInputStream(new File(pathToFile));
Workbook workbook = new XSSFWorkbook(excelFile);
Sheet datatypeSheet = workbook.getSheetAt(0); // Get first sheet
Iterator<Row> iterator = datatypeSheet.iterator();

int i=0;
while (iterator.hasNext()) {  // Loop over each row
    Row currentRow = iterator.next();
    Cell cell = currentRow.createCell(currentRow.getLastCellNum());
    cell.setCellType(Cell.CELL_TYPE_STRING);
    if(currentRow.getRowNum() == 0)
        cell.setCellValue("OUTPUT-COLUMN"); // set column header for the new column
    else {
        cell.setCellValue(list.get(i));  // list contains the output to populate in new column
        i++;
    }

}

FileOutputStream fos = new FileOutputStream(new File(pathToOutput));
workbook.write(fos);
fos.close();

This works fine for smaller files, but I get an out-of-memory error for larger ones. I then tried to switch from XSSF to SXSSF to get around the memory issue (see the code below). But in testing, even for smaller files, the output file is identical to the input file.

FileInputStream excelFile = new FileInputStream(new File(pathToFile));
XSSFWorkbook xwb = new XSSFWorkbook(excelFile);
excelFile.close();

SXSSFWorkbook wb = new SXSSFWorkbook(xwb,100); 
wb.setCompressTempFiles(true);
SXSSFSheet sh = (SXSSFSheet) wb.getSheetAt(0);
Iterator<Row> iterator = sh.iterator();

int i=0;
while (iterator.hasNext()) {  // Loop over each row
    Row currentRow = iterator.next();
    Cell cell = currentRow.createCell(currentRow.getLastCellNum());
    cell.setCellType(Cell.CELL_TYPE_STRING);
    if(currentRow.getRowNum() == 0)
        cell.setCellValue("OUTPUT-COLUMN"); // set column header for the new column
    else {
        cell.setCellValue(list.get(i));  // list contains the output to populate in new column
        i++;
    }

}

FileOutputStream fos = new FileOutputStream(new File(pathToOutput));
wb.write(fos);
fos.close();

Using a database is not suitable for my use case, and I want to avoid holding the data in a temporary data structure before writing because of memory constraints.

Is there a way to write to the output workbook while streaming? Here is the code I use to read the file with the POI streaming API:

private class ExcelData implements SheetContentsHandler {

    LinkedHashMap<String, String> rowMap;

    public void startRow(int rowNum) {
    }

    public void endRow(int rowNum) {
        // Process the row
        // Handle write to output workbook ??
    }

    public void cell(String cellReference, String formattedValue,
            XSSFComment comment) {
        // Save current row in rowMap ( column name => cell value )
    }

    public void headerFooter(String text, boolean isHeader, String tagName) {
    }
}

Solution

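  • It is not possible to add a column to the existing rows of a workbook using POI SXSSF. SXSSF is a write-only streaming API: when an SXSSFWorkbook wraps an existing XSSFWorkbook, the rows already in the sheet are not accessible through it, and only new rows can be appended after them. That would explain why the SXSSF attempt in the question leaves the file unchanged. Constructing the XSSFWorkbook also still loads the whole file into memory, so the wrapper does not remove the out-of-memory risk.

    The only solution is to read the existing workbook and write a new workbook that contains the added column.

    To achieve this, store each row in a data structure or database in the endRow() method, and then use the persisted data to write the new workbook.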
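As a rough illustration of the per-row step in that read-then-rewrite approach, the sketch below builds the cells of one output row from the cells collected for a streamed input row. It is plain Java and does not call POI. The class `RowAppender` and the method `withExtraColumn` are hypothetical names, not part of any library. It assumes row 0 is the header, that data rows are contiguous (so data row n maps to element n - 1 of the result list), and that every row has the same number of cells.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Apache POI): builds the cells of one output row.
final class RowAppender {
    private RowAppender() {}

    /**
     * Returns a copy of the streamed cells with one extra trailing cell.
     * Row 0 is treated as the header row; data row n takes outputs.get(n - 1).
     */
    static List<String> withExtraColumn(int rowNum, List<String> cells,
                                        List<String> outputs, String header) {
        List<String> result = new ArrayList<>(cells);
        if (rowNum == 0) {
            result.add(header);
        } else {
            int idx = rowNum - 1;
            // Assumption: leave the new cell empty if no value was computed for this row.
            result.add(idx < outputs.size() ? outputs.get(idx) : "");
        }
        return result;
    }
}
```

The returned list could then be written to a row of the new workbook, for example with `createRow(rowNum)` on a sheet of a fresh `SXSSFWorkbook`, followed by `createCell(i).setCellValue(...)` for each element. Whether that write happens directly in `endRow` or from persisted data is a design choice that depends on when the computed values become available.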