Trying to complete step one of a project where I create a new table with
CREATE TABLE Customers (customerID SMALLINT,firstName VARCHAR(25),lastName VARCHAR(25),street VARCHAR(50),city VARCHAR(50),state VARCHAR(25),zipCode VARCHAR(10),telephone VARCHAR(15),PRIMARY KEY (customerID));
Then, I am supposed to upload a csv file from the application.
LOAD DATA INFILE "/home/codio/workspace/customers.csv" INTO TABLE Customers FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
or
LOAD DATA INFILE '/home/codio/workspace/customers.csv' INTO TABLE Customers FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n';
but both of these provide the error code
ERROR 1062 (23000): Duplicate entry '32767' for key 'PRIMARY'
Opening the file and searching for 32767 only yields one result in the telephone line. There is no duplicate. I have DELETE FROM Customers
and DROP TABLE Customers
. Both times confirming the table is empty by
mysql> SELECT * FROM Customers;
Empty set (0.00 sec)
I have been arguing with this for hours and am unsure what else to do. The worst part is I feel like I am not doing anything wrong. But that is where you come in. Fingers crossed.
I have deleted the table, cleared the table, recreated the table, at one point, went so far as creating a brand new file.
You have defined your primary key as
customerID SMALLINT
which means that it can only hold values between -32768 and 32767. When you use the LOAD DATA
statement any value outside this range gets truncated as mentioned in the manual for the LOAD DATA
statement:
Invalid values produce warnings rather than errors, and are converted to the “closest” valid value for the column data type. Examples:
[...]
An out-of-range numeric or temporal value is clipped to the closest endpoint of the range for the column data type.
This means that your first new row with a larger customerId
value than 32767 gets clipped to 32767. The second new row with a larger customerId
value than 32767 gets clipped to 32767 as well, resulting in a primary key conflict.
The solution is to use a bigger integer type for the customerId
column like INT
or BIGINT
.