sql-serverlinuxetlbcp

BCP Inputting Incorrect Integer Data


I am running a bcp command to load some test data from local linux to a remote mssql db. The data is semi-randomly generated ints for testing purposes.

bcp test_load in /tmp/small_insert_test.csv -S <Server> -U <UID> -P <PWD> -d <DB> -f bcp.fmt -u

bcp.fmt looks like:

14.0
3
1       SQLINT              0       4       ","    1     site_id                                      ""
2       SQLINT              0       4       ","    2     view_id                                    ""
3       SQLINT              0       4       "\n"   3     content_id                                   ""

The target table test_load is a simple table with fields site_id, view_id and content_id, all INTs.

The weird thing is while the bcp is working, the data being inserted does not resemble the source data.

Here is the source data, all of it:

6,47603,4519004
6,47603,4521834
6,47603,4521904
6,47603,4521924
6,47603,4521934
6,47603,4522024
6,47603,4522094
6,47603,4522134
6,47603,4522544
6,47603,4522624

But here is the inserted data when I query select * from test_load (ignore the quotes etc thats just due to how I copied it to clipboard)

"54"    "808859444" "959526196"
"54"    "808859444" "825374004"
"54"    "808859444" "825374004"
"54"    "808859444" "825374004"
"54"    "808859444" "825374004"
"54"    "808859444" "842151220"
"54"    "808859444" "842151220"
"54"    "808859444" "842151220"
"54"    "808859444" "842151220"
"54"    "808859444" "842151220"

This is the input I get from running the bcp

Starting copy...

10 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total     : 124    Average : (80.6 rows per sec.)

I cannot understand what bcp is doing here. I've truncated and reran bcp a few times and these values are consistent. It looks like its somehow translating the original ints into different values, but I don't understand why.


Solution

  • I was able to identify a solution. The issue is that the bcp.fmt file was using SQLINT instead of SQLCHAR. Even though the values were integer, when bcp sees SQLINT apparently assumes the data is binary so it was expanding the integers.

    As documented:

    *Data files that are stored in character format use char as the file storage type. Therefore, for character data files, SQLCHAR is the only data type that appears in a format file.

    I also switched the row terminator from '\n' to '\r\n'.