I am working with a large Excel file (larger than 40 MB, with more than 100k rows and 50 columns). I read it successfully using the POI (version 3.10.1) event stream, do some calculations, and store the results in a List.
Now I have to append this List as a column in the same file, and this is where I am running into trouble.
I tried to achieve this with the code below:
    FileInputStream excelFile = new FileInputStream(new File(pathToFile));
    Workbook workbook = new XSSFWorkbook(excelFile);
    Sheet datatypeSheet = workbook.getSheetAt(0); // Get first sheet
    Iterator<Row> iterator = datatypeSheet.iterator();
    int i = 0;
    while (iterator.hasNext()) { // Loop over each row
        Row currentRow = iterator.next();
        Cell cell = currentRow.createCell(currentRow.getLastCellNum());
        cell.setCellType(Cell.CELL_TYPE_STRING);
        if (currentRow.getRowNum() == 0) {
            cell.setCellValue("OUTPUT-COLUMN"); // set column header for the new column
        } else {
            cell.setCellValue(list.get(i)); // list contains the output to populate in new column
            i++;
        }
    }
    FileOutputStream fos = new FileOutputStream(new File(pathToOutput));
    workbook.write(fos);
    fos.close();
This works fine with smaller files, but with larger files I get an out-of-memory error. I then tried using SXSSF in place of XSSF to get around the memory issue (see the code below). However, even with smaller files, the output file comes out identical to the input file.
    FileInputStream excelFile = new FileInputStream(new File(pathToFile));
    XSSFWorkbook xwb = new XSSFWorkbook(excelFile);
    excelFile.close();
    SXSSFWorkbook wb = new SXSSFWorkbook(xwb, 100); // keep 100 rows in memory
    wb.setCompressTempFiles(true);
    SXSSFSheet sh = (SXSSFSheet) wb.getSheetAt(0);
    Iterator<Row> iterator = sh.iterator();
    int i = 0;
    while (iterator.hasNext()) { // Loop over each row
        Row currentRow = iterator.next();
        Cell cell = currentRow.createCell(currentRow.getLastCellNum());
        cell.setCellType(Cell.CELL_TYPE_STRING);
        if (currentRow.getRowNum() == 0) {
            cell.setCellValue("OUTPUT-COLUMN"); // set column header for the new column
        } else {
            cell.setCellValue(list.get(i)); // list contains the output to populate in new column
            i++;
        }
    }
    FileOutputStream fos = new FileOutputStream(new File(pathToOutput));
    wb.write(fos);
    fos.close();
Using a database is not suitable for my use case, and I want to avoid holding the data in a temporary data structure for writing because of memory constraints.
Is there a way to write to the output workbook while streaming? Here is the code I use to read the file with the POI streaming API:
    private class ExcelData implements SheetContentsHandler {
        LinkedHashMap<String, String> rowMap;

        public void startRow(int rowNum) {
        }

        public void endRow(int rowNum) {
            // Process the row
            // Handle write to output workbook ??
        }

        public void cell(String cellReference, String formattedValue,
                XSSFComment comment) {
            // Save current row in rowMap ( column name => cell value )
        }

        public void headerFooter(String text, boolean isHeader, String tagName) {
        }
    }
It is not possible to add a column to an existing workbook using POI SXSSF. It only allows new rows to be added; rows that already exist in the source workbook are not accessible through the SXSSF sheet, which is why the loop in the second snippet never modifies anything.
The only solution is to read the existing workbook and write to a new workbook with the added column.
To achieve this, we can store the rows in a data structure or database in the endRow() method and then use the persisted data to write a new workbook.
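As a rough illustration of the read-and-rewrite idea, here is a minimal sketch of the per-row merge logic in plain Java, with no POI dependency. It is a variation on the approach above rather than a definitive implementation: instead of persisting every row, it assumes the result list has already been computed in a first pass and forwards each row straight to a writer during a second streaming pass, so only one row is held at a time. The names `StreamingColumnAppender`, `RowSink`, and `outputColumn` are hypothetical, and the callback methods only mirror the shape of `SheetContentsHandler` (whose exact signatures depend on the POI version).

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper: collects the cells of the current row and, at the end
 * of each row, appends one extra value before handing the row to a sink.
 */
final class StreamingColumnAppender {

    /** Hypothetical sink; real code might wrap SXSSFSheet.createRow(...) here. */
    interface RowSink {
        void writeRow(int rowNum, List<String> cells);
    }

    private final List<String> results;   // precomputed values, one per data row
    private final String header;          // header text for the new column
    private final int outputColumn;       // assumed to lie beyond the last data column
    private final RowSink sink;
    private final List<String> current = new ArrayList<>();
    private int nextResult = 0;

    StreamingColumnAppender(List<String> results, String header,
                            int outputColumn, RowSink sink) {
        this.results = results;
        this.header = header;
        this.outputColumn = outputColumn;
        this.sink = sink;
    }

    void startRow(int rowNum) {
        current.clear();
    }

    // Blank cells are typically not reported by the event API, so the column
    // is derived from the cell reference and skipped positions stay null.
    void cell(String cellReference, String formattedValue) {
        put(current, columnIndex(cellReference), formattedValue);
    }

    void endRow(int rowNum) {
        // Row 0 is assumed to be the header row, as in the question.
        String value = (rowNum == 0) ? header : results.get(nextResult++);
        put(current, outputColumn, value);
        sink.writeRow(rowNum, new ArrayList<>(current));
    }

    /** Converts a reference such as "C5" to a zero-based column index (2). */
    static int columnIndex(String cellReference) {
        int col = 0;
        for (int i = 0; i < cellReference.length(); i++) {
            char c = cellReference.charAt(i);
            if (!Character.isLetter(c)) {
                break;
            }
            col = col * 26 + (Character.toUpperCase(c) - 'A' + 1);
        }
        return col - 1;
    }

    // Pads the row with nulls up to the index, then stores the value there.
    private static void put(List<String> row, int index, String value) {
        while (row.size() <= index) {
            row.add(null);
        }
        row.set(index, value);
    }
}
```

In real use, the three callbacks would be invoked from a `SheetContentsHandler` implementation, and the sink would create a row in a fresh `SXSSFWorkbook` sheet, skip the null entries, and call `setCellValue` for the rest. Because rows are written in increasing order, this fits the append-only model of SXSSF.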