
Import CSV file into SQL Server


I am looking for help importing a .csv file into SQL Server using BULK INSERT, and I have a few basic questions.

Issues:

  1. The CSV file data may contain a , (comma) inside a field (e.g. the description). How can I make the import handle such data?

  2. If the client creates the CSV from Excel, fields that contain a comma are enclosed in "" (double quotes), as in the example below. How can the import handle this?

  3. How do we track rows with bad data that the import skips? (Does the import skip rows that cannot be imported?)

Here is the sample CSV with header:

Name,Class,Subject,ExamDate,Mark,Description
Prabhat,4,Math,2/10/2013,25,Test data for prabhat.
Murari,5,Science,2/11/2013,24,"Test data for his's test, where we can test 2nd ROW, Test."
sanjay,4,Science,,25,Test Only.

And the SQL statement to import it:

BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',  --CSV field delimiter
    ROWTERMINATOR = '\n',   --Use to shift the control to next row
    TABLOCK
)

Solution

  • SQL Server CSV import

    1) The CSV file data may contain a , (comma) inside a field (e.g. the description). How can I make the import handle such data?

    Solution

    If you use , (comma) as the field terminator, a plain BULK INSERT cannot differentiate between a comma that ends a field and a comma inside your data. I would use a different FIELDTERMINATOR that never occurs in the data, such as ||, and have the file exported with that delimiter. Commas and single slashes in the data are then loaded as ordinary characters.
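
    A minimal sketch of that approach, assuming the client can export Schools.csv with || between fields instead of commas (the file itself has to change; the statement alone cannot fix a comma-delimited file):

    ```sql
    -- Assumes Schools.csv was exported with || as the field delimiter.
    BULK INSERT SchoolsTemp
    FROM 'C:\CSVData\Schools.csv'
    WITH
    (
        FIRSTROW = 2,             -- skip the header row
        FIELDTERMINATOR = '||',   -- delimiter that does not occur in the data
        ROWTERMINATOR = '\n',
        TABLOCK
    );
    ```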

    2) If the client creates the CSV from Excel, fields that contain a comma are enclosed in " ... " (double quotes), as in the example below. How can the import handle this?

    Solution

    A plain BULK INSERT (before SQL Server 2017, which added the FORMAT = 'CSV' and FIELDQUOTE options) has no way to handle double quotes, so the data is inserted into the rows with the double quotes included. After loading the data into the table, you can replace those double quotes with an empty string:

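    -- Strip the double quotes loaded as data (Description is the quoted column in the sample CSV)
    UPDATE SchoolsTemp
    SET Description = REPLACE(Description, '"', '')
    WHERE Description LIKE '%"%';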
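
    On SQL Server 2017 or later, the quoted fields can instead be parsed during the load. The following is a sketch under that version assumption, reusing the table and file from the question; it has not been run against the sample data:

    ```sql
    -- Requires SQL Server 2017 or later.
    BULK INSERT SchoolsTemp
    FROM 'C:\CSVData\Schools.csv'
    WITH
    (
        FORMAT = 'CSV',          -- parse the file as quoted CSV
        FIELDQUOTE = '"',        -- character that encloses fields containing commas
        FIRSTROW = 2,            -- skip the header row
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n',
        TABLOCK
    );
    ```

    With this form, the commas inside the quoted Description value should stay part of the field, and the enclosing quotes should not be stored.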
    

    3) How do we track rows with bad data that the import skips? (Does the import skip rows that cannot be imported?)

    Solution

    Rows that cannot be loaded because of invalid data or format are skipped, up to the MAXERRORS limit (10 by default); beyond that limit the whole load fails. To track them, use the ERRORFILE option and specify an error file name: the rejected rows are written to that file. The code should look like this:

    BULK INSERT SchoolsTemp
    FROM 'C:\CSVData\Schools.csv'
    WITH
    (
        FIRSTROW = 2,
        FIELDTERMINATOR = ',',  --CSV field delimiter
        ROWTERMINATOR = '\n',   --Use to shift the control to next row
        ERRORFILE = 'C:\CSVDATA\SchoolsErrorRows.csv',
        TABLOCK
    )
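
    If more than a handful of bad rows are expected, the same statement can raise the error limit. This is only a sketch: the value 50 is an arbitrary illustration, and the error file must not already exist when the statement runs.

    ```sql
    BULK INSERT SchoolsTemp
    FROM 'C:\CSVData\Schools.csv'
    WITH
    (
        FIRSTROW = 2,
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n',
        ERRORFILE = 'C:\CSVDATA\SchoolsErrorRows.csv',  -- rejected rows are copied here as-is
        MAXERRORS = 50,   -- arbitrary illustrative limit; the default is 10
        TABLOCK
    );
    ```

    SQL Server also writes a companion control file with the extension .Error.Txt next to the error file, describing why each row was rejected.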