excelvb.netexceldatareader

ExcelDataReader changing columns name


I am reading an Excel file using ExcelDataReader

Dim entireExcel As DataTableCollection

Using ofd As OpenFileDialog = New OpenFileDialog() With {.Filter = "Excel Workbook|*.xlsx|Excel 97-2003 Workbook|*.xls"}
    If ofd.ShowDialog() = DialogResult.OK Then
    txtFileName.Text = ofd.FileName

        Using stream = File.OpenRead(ofd.FileName)
            System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance)
            Dim reader As IExcelDataReader = ExcelReaderFactory.CreateReader(stream)
            entireExcel = reader.AsDataSet(New ExcelDataSetConfiguration() With {
                             .ConfigureDataTable = Function(__) New ExcelDataTableConfiguration() With {
                             .UseHeaderRow = True}}).Tables
        End Using
    End If
End Using

It all works fine, with one exception: The column names in the sheets are not the real ones in Excel. If the column is a duplicate of a column to the left, it adds _1, _2, etc., to the name. If the column name is missing, it creates a name Column0, Column1, Column2, etc.

How can I get the real names, or how can I prevent changing them?


Solution

  • the reason for this behaviour is the implementation of DataSet throws a DuplicateNameException if you try to set duplicate column names [1]. ExcelDataReader works around this by generating the unique names for you.

    You need to find a workaround depending on your case. F.ex set UseHeaderRow to false and read the column names manually from the data. Or don't use AsDataSet() at all, but instead rely on the reader's Read()/GetValue()/GetXXX() methods.

    [1] https://learn.microsoft.com/en-us/dotnet/api/system.data.datacolumn.columnname