For loading huge amounts of data into MySQL, LOAD DATA INFILE is by far the fastest option. Unfortunately, while its IGNORE and REPLACE modifiers make it behave like INSERT IGNORE or REPLACE, ON DUPLICATE KEY UPDATE is not currently supported.
However, ON DUPLICATE KEY UPDATE has advantages over REPLACE. The latter does a delete and an insert when a duplicate exists. This brings overhead for key management. Also, autoincrement ids will not stay the same on a replace.
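To make the difference concrete, here is a minimal sketch. The table demo_users and its columns are hypothetical and exist only for illustration. It assumes the duplicate is detected on a unique key (email) while the auto-increment id is not supplied by the statement.

```sql
-- Hypothetical demo table: auto-increment id plus a unique key on email.
CREATE TABLE demo_users (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  email VARCHAR(100) NOT NULL UNIQUE,
  name  VARCHAR(100)
);

INSERT INTO demo_users (email, name) VALUES ('a@example.com', 'Ann');
SELECT id, name FROM demo_users WHERE email = 'a@example.com';

-- REPLACE deletes the conflicting row and inserts a fresh one,
-- so the row comes back with a newly generated id.
REPLACE INTO demo_users (email, name) VALUES ('a@example.com', 'Anna');
SELECT id, name FROM demo_users WHERE email = 'a@example.com';

-- ON DUPLICATE KEY UPDATE modifies the existing row in place,
-- so the id seen by the previous SELECT is kept.
INSERT INTO demo_users (email, name) VALUES ('a@example.com', 'Annie')
  ON DUPLICATE KEY UPDATE name = VALUES(name);
SELECT id, name FROM demo_users WHERE email = 'a@example.com';

DROP TABLE demo_users;
```

Any other table that references the id would be left pointing at a row that no longer exists after the REPLACE, which is why keeping the id stable matters.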
How can ON DUPLICATE KEY UPDATE be emulated when using LOAD DATA INFILE?
These steps can be used to emulate this functionality:
Create a new temporary table.
CREATE TEMPORARY TABLE temporary_table LIKE target_table;
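Optionally, drop all indices from the temporary table to speed things up. Note that MySQL refuses to drop the key on an AUTO_INCREMENT column unless that attribute is removed first.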
SHOW INDEX FROM temporary_table;
DROP INDEX `PRIMARY` ON temporary_table;
DROP INDEX `some_other_index` ON temporary_table;
Load the CSV into the temporary table.
LOAD DATA INFILE 'your_file.csv'
INTO TABLE temporary_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(field1, field2);
Copy the data using ON DUPLICATE KEY UPDATE.
SHOW COLUMNS FROM target_table;
INSERT INTO target_table
SELECT * FROM temporary_table
ON DUPLICATE KEY UPDATE field1 = VALUES(field1), field2 = VALUES(field2);
Remove the temporary table.
DROP TEMPORARY TABLE temporary_table;
Using SHOW INDEX FROM and SHOW COLUMNS FROM, this process can be automated for any given table.
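One possible way to automate the copy step entirely in SQL is sketched below. Instead of parsing the output of SHOW COLUMNS FROM, it reads the same column names from information_schema.COLUMNS, which is easier to use inside a query. It assumes that target_table lives in the current database, that temporary_table has already been loaded as in the steps above, and that the table has no generated columns (those cannot be assigned to). The user variables @update_list and @sql are names chosen for this illustration.

```sql
-- Allow a long column list; the default GROUP_CONCAT limit is 1024 bytes.
SET SESSION group_concat_max_len = 100000;

-- Build "`col` = VALUES(`col`)" for every column of target_table.
SELECT GROUP_CONCAT(
         CONCAT('`', COLUMN_NAME, '` = VALUES(`', COLUMN_NAME, '`)')
         ORDER BY ORDINAL_POSITION
         SEPARATOR ', ')
  INTO @update_list
  FROM information_schema.COLUMNS
 WHERE TABLE_SCHEMA = DATABASE()
   AND TABLE_NAME   = 'target_table';

-- Assemble the copy statement and run it as a prepared statement.
SET @sql = CONCAT(
  'INSERT INTO target_table SELECT * FROM temporary_table ',
  'ON DUPLICATE KEY UPDATE ', @update_list);

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

The DROP INDEX statements for the temporary table could be generated along the same lines, but since information_schema generally does not list temporary tables, the index names would have to be read from target_table (which the temporary table was created LIKE) or from SHOW INDEX FROM in client code.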