phpphpexcelphpspreadsheetphpoffice-phpspreadsheet

how to improve PhpSpreadsheet data retrieve timing from local file?


I'm using PhpSpreadsheet to read a specific value from a local xlsx file, but it's taking 510ms, which is too slow. I need it to be under 20ms, similar to reading from a CSV file with the same data in a CSV format. Any tips to speed up the code?

<?php
require '/vendor/autoload.php'; // Include PhpSpreadsheet library

use PhpOffice\PhpSpreadsheet\IOFactory;

$xlsxFile = 'local.xlsx'; // path to XLSX file
$targetRowNumber = 302; // row number to access
$targetColumn = 3; // column number (C = 3)

try {
    $trystarttime = microtime(true);
    // Load the XLSX file
    $spreadsheet = IOFactory::load($xlsxFile);

    // Get the value of the specified cell (column 2, row 301)
    $value = $spreadsheet->getActiveSheet()->getCellByColumnAndRow($targetColumn, $targetRowNumber)->getValue();
    
    echo "Value in column $targetColumn, row $targetRowNumber: " . $value . "\n";

    $tryendtime = microtime(true);
    $durationInMilliseconds = (number_format(($tryendtime - $trystarttime) * 1000, 2)) . "ms to fetch the required row value" . PHP_EOL;

    echo $durationInMilliseconds;

} catch (Exception $e) {
    echo "Error: " . $e->getMessage() . "\n";
    echo "Trace: " . $e->getTraceAsString() . "\n";
}
?>

Solution

  • If you expect to be able to load a complex file format like Xlsx as efficiently as you can read a plaintext CSV file, then rethink your assumptions. That being said, if you only need to read a single cell value, then there is a lot that you can do to improve performance.

    If you only need data from the file, and not structure or style information, then load it with ReadDataOnly enabled.

    $reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
    $reader->setReadDataOnly(true);
    $spreadsheet = $reader->load($inputFileName);
    

    If the file has several worksheets, and you're only interested in the data from one of those worksheets, then tell the loader to load only that one worksheet.

    $reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
    $reader->setLoadSheetsOnly('Worksheet 1');
    $spreadsheet = $reader->load($inputFileName);
    

    If you know the cell Address (or Addresses) that you want to read, then use a Read Filter to read only those cells from the file.

    /**  Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter  */
    class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
    {
        public function readCell($targetColumn, $targetRowNumber, $worksheetName = '') {
            if ($row == $targetRowNumber) {
                if ($columnAddress == $targetColumn) {
                    return true;
                }
            }
            return false;
        }
    }
    
    /**  Create an Instance of our Read Filter  **/
    $filterSubset = new MyReadFilter('C', 103);
    
    $reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
    /**  Tell the Reader that we want to use the Read Filter  **/
    $reader->setReadFilter($filterSubset);
    /**  Load only the rows and columns that match our filter to Spreadsheet  **/
    $spreadsheet = $reader->load($inputFileName);
    

    These features are all documented.