excelazure-functions.net-8.0

Read .xlsx files sheets using XLWorkbook will raise error "corrupted data" on one sheet


I have this code inside my Azure Function that uses .NET 8.0, to read the first and fourth sheets inside a .xlsx file stored in Sharepoint Online:

var relativeFilePath = "/sites/marketing/Shared Documents/Reporting/datac.xlsx"; // relative to tenant
_logger.LogInformation("1");

string uri = siteUrl + "/_api/web/getfilebyserverrelativeurl('" + relativeFilePath + "')/$value";

var stream = await httpClient.GetStreamAsync(uri);

_logger.LogInformation("2");

List<int> worksheetnumbers = new List<int> { 1, 4 }; // read the first and fourth sheet

foreach (int w in worksheetnumbers)
{
    // 3. Open Excel as stream with OpenXmlReader
    using var workbook = new XLWorkbook(stream);
    _logger.LogInformation("3");

    var worksheet = workbook.Worksheet(w); // First worksheet
    _logger.LogInformation("4");

    var headerRow = worksheet.Row(1);
    _logger.LogInformation("5");

    var columnMap = new Dictionary<string, int>();
    _logger.LogInformation("6");

    for (int i = 1; i <= headerRow.LastCellUsed().Address.ColumnNumber; i++)
    {
        string header = headerRow.Cell(i).GetString().Trim();

        if (!string.IsNullOrWhiteSpace(header))
            columnMap[header] = i;
    }

    int lastRow = worksheet.LastRowUsed().RowNumber();

    for (int i = lastRow; i >= 2; i--)
    {
        // code goes here
    }
}

Now the code will read the first sheet correctly, but will raise this error on the second sheet

Error: File contains corrupted data.

Mainly on this code

using var workbook = new XLWorkbook(stream);

Any advice? What could cause a sheet to be corrupted? I checked the sheet (although it has 30K rows) and it contains a single table, and all the cells contain text or number.

Thanks


Solution

  • Try to do not open the workbook for every sheet:

    // 3. Open Excel as stream with OpenXmlReader
    using var workbook = new XLWorkbook(stream);
    _logger.LogInformation("3");
    foreach (int w in worksheetnumbers)
    {
        var worksheet = workbook.Worksheet(w); // First worksheet
        _logger.LogInformation("4");
        var headerRow = worksheet.Row(1);
        _logger.LogInformation("5");
        var columnMap = new Dictionary<string, int>();
        _logger.LogInformation("6");
        for (int i = 1; i <= headerRow.LastCellUsed().Address.ColumnNumber; i++)
        {
            string header = headerRow.Cell(i).GetString().Trim();
            if (!string.IsNullOrWhiteSpace(header))
                columnMap[header] = i;
        }
        int lastRow = worksheet.LastRowUsed().RowNumber();
        for (int i = lastRow; i >= 2; i--)
        {
        //code goes here
        }
    }