javascriptexceljssheetjs

How to get the validations applied on a cell in an Excel file using javascript?


In ExcelJs package there is dataValidation or _dataValidation attribute for cell object. This gives the type of data in cell like list, dropdown, decimal, boolean etc.

But, for some reason, it is throwing Javascript out of heap memory error for some files, So, I have increased the memory limit to 16GB (it is the max I can use) by node --max-old-space-size=[size_in_GB*1024] index.js but it failed to parse.

So, I am looking for alternative packages like sheet js and others but I coun't get all the data validations as in ExcelJs mainly options in a dropdown of an Excel file.

Please help me to solve this issue. Thank you.

PS: I have checked this question How to Get the DataType of excel cell value using nodejs version16.14.2 and Vue js and the convert-excel-to-json don't provide any validations on the cell.

I am getting this error while parsing the particular Excel file.

Error image

This is the code snippet that I have used and I got the above error after few minutes.

const ExcelJS = require('exceljs');
const fs = require("fs");

async function readExcelFile(filePath) {
  const workbook = new ExcelJS.Workbook();
  const stream = fs.createReadStream(filePath);

  await workbook.xlsx.read(stream);

  const worksheet = workbook.getWorksheet(1);

  worksheet.eachRow((row, rowNumber) => {
    
      row.eachCell((cell, colNumber) => {
        console.log(cell.dataValidations, cell._dataValidations);
        console.log(`Cell value at row ${rowNumber}, column ${colNumber}: ${cell.value}`);
      });
    });
}

readExcelFile('PAG KIC TAB A (1)(1).xlsx');

Solution

  • I had a look why the OOM happens and the culprit is actually a defined name (MS docs, ExcelJS docs) from a sheet named "[31]IRRs UPDATE EACH QUARTER". It references a huge range "C6:XFD1048576" which ExcelJS goes through cell by cell and by doing so uses up all memory. No wonder your workbook doesn't open in MS Office cloud.

    Now depending on what kind of processing you need, you can either inspect that offending sheet whether this defined name with such huge range makes even sense, fix/remove it and try again.

    Or you can use ExcelJS's Streaming API, like this:

    import ExcelJS from "exceljs";
    
    async function readExcelFile(filePath: string) {
      const workbookReader = new ExcelJS.stream.xlsx.WorkbookReader(filePath, {});
    
      for await (const worksheetReader of workbookReader) {
        for await (const row of worksheetReader) {
          row.eachCell((cell) => {
            console.log(cell.value);
          });
        }
      }
    }
    
    readExcelFile("PAG KIC TAB A (1)(1).xlsx");
    
    

    As is the case with streaming in general, it doesn't buffer all data in memory, but rather only processes one batch of data at a time, and is therefore much more efficient with memory usage. But I also found that this way of reading doesn't process the defined names at all, which can be a no-go for you.