sql-serverxmlbulkinsertbcp

Sql Bulk insert XML format file with double quotes in terminator


I'm trying to insert some data into a table from a csv document which has all of the fields delimited with ""

ie.

 APPLICANTID,NAME,CONTACT,PHONENO,MOBILENO,FAXNO,EMAIL,ADDR1,ADDR2,ADDR3,STATE,POSTCODE
 "3","Snoop Dogg","Snoop Dogg","411","","","","411 High Street","USA 
 ","","USA", "1111" "4","LL Cool J","LL Cool J","","","","","5 King
 Street","","","USA","1111"

I am using an xml format file to try and overcome the "" delimiters as I believe I would have to update the data again after importing to remove the inital " if it did not.

My format file looks like the following:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="NCharTerm" TERMINATOR='",' MAX_LENGTH="12"/>
  <FIELD ID="2" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="3" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="4" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="5" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="6" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="7" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="8" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="9" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="10" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="11" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r\n" COLLATION="Latin1_General_CI_AS"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" NAME="APPLICANTID" xsi:type="SQLINT"/>
  <COLUMN SOURCE="2" NAME="NAME" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="3" NAME="CONTACT" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="4" NAME="PHONENO" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="5" NAME="MOBILENO" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="6" NAME="FAXNO" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="7" NAME="EMAIL" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="8" NAME="ADDR1" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="9" NAME="ADDR2" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="10" NAME="ADDR3" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="11" NAME="STATE" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="12" NAME="POSTCODE" xsi:type="SQLCHAR"/>
 </ROW>
</BCPFORMAT>

and I am running the import with the following:

BULK INSERT [PracticalDB].dbo.applicant 
FROM 'C:\temp.csv'
WITH (KEEPIDENTITY, FORMATFILE='C:\temp.xml', FIRSTROW = 2)

I am getting the error:

Msg 4864, Level 16, State 1, Line 1 Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (APPLICANTID).

for all of the rows.

I have tried various different combinations for the terminator including using:

TERMINATOR="&quot;,"
TERMINATOR="\","
TERMINATOR='","
TERMINATOR='\","

and none of them seem to work.

Is there a correct way to escape the " so that it will be parsed correctly, assuming that that is my problem here.


Solution

  • Ok so I figured it out!

    You can use ' instead of " when you are defining the xml attributes ie TERMINATOR='', then you can use the " within them without worrying.

    Also I needed to eat the first " with a field so the other columns could be parsed correctly. This ended up with the format file

    <?xml version="1.0"?>
    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <RECORD>
      <FIELD ID="1" xsi:type="CharTerm" TERMINATOR='"' />
      <FIELD ID="2" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="3" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="4" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="5" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="6" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="7" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="8" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="9" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="10" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="11" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="12" xsi:type="CharTerm" TERMINATOR='","' />
      <FIELD ID="13" xsi:type="CharTerm" TERMINATOR='"\r\n' />
     </RECORD>
     <ROW>
      <COLUMN SOURCE="2" NAME="APPLICANTID" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="3" NAME="NAME" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="4" NAME="CONTACT" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="5" NAME="PHONENO" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="6" NAME="MOBILENO" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="7" NAME="FAXNO" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="8" NAME="EMAIL" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="9" NAME="ADDR1" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="10" NAME="ADDR2" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="11" NAME="ADDR3" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="12" NAME="STATE" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="13" NAME="POSTCODE" xsi:type="SQLCHAR"/>
     </ROW>
    </BCPFORMAT>
    

    Where the first field is just a throw away one to remove the first " and the other fields all separate on "," and the final separates on "(newline)