sqlsql-serversql-server-2017nvarcharsql-server-collation

How to retrieve German characters from a large CSV File into SQL Server 2017 script


I have a CSV file including a list of employees, where some of them includes German characters like 'ö' in their names. I need to create a temp table in my SQL Server 2017 script and fill it with the content of the CSV file. My script is:

CREATE TABLE #AllAdUsers(
    [PhysicalDeliveryOfficeName] [NVARCHAR](255) NULL,
    [Name] [NVARCHAR](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [DisplayName] [NVARCHAR](255) NULL,
    [Company] [NVARCHAR](255) NULL,
    [SAMAccountName] [NVARCHAR](255) NULL
)

--import AD users
BULK INSERT #AllAdUsers
    FROM 'C:\Employees.csv'
    WITH
    (
        FIRSTROW = 2,
        FIELDTERMINATOR = ',',  --CSV field delimiter
        ROWTERMINATOR = '\n',   --Use to shift the control to next row
        TABLOCK
    )

However, even though I use "Nvarchar" variable type with the collation of "SQL_Latin1_General_CP1_CI", the German characters are not seem OK, for instance "Kösker" seems like:

"K├╢sker"

I've tried many other collations but couldn't find a fix for it. Any help would be very much appreciated.


Solution

  • Adding codepage=65001 to the bulk insert query fixed the issue:

    --import AD users
    BULK INSERT #AllAdUsers
        FROM 'C:\Employees.csv'
        WITH
        (
            FIRSTROW = 2,
            CODEPAGE = '65001',
            FIELDTERMINATOR = ',',  --CSV field delimiter
            ROWTERMINATOR = '\n',   --Use to shift the control to next row
            TABLOCK
        )