mysql insert records

How can MySQL insert millions of records faster?


I want to insert millions of records into my database, but it is going very slowly, at about 40,000 records/hour. I don't think my hardware is too slow, because I saw that disk I/O stays under 2 MiB/s. I have many tables separated into different .sql files. Each record is also very simple: it has fewer than 15 columns, and each column holds fewer than 30 characters. I did this job under Arch Linux with MySQL 5.3. Do you guys have any ideas? Or is this speed not slow?


Solution

  • It's most likely because you're inserting records like this:

    INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
    INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
    INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
    INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
    INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
    

    Sending a new query each time you need to INSERT something is bad for performance. Instead, combine those queries into a single query, like this:

    INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2"),
                                                     ("data1", "data2"),
                                                     ("data1", "data2"),
                                                     ("data1", "data2"),
                                                     ("data1", "data2");
    

    You can also read more about insert speed in the MySQL Docs. They clearly describe the following:

    To optimize insert speed, combine many small operations into a single large operation. Ideally, you make a single connection, send the data for many new rows at once, and delay all index updates and consistency checking until the very end.

    Of course, don't combine ALL of them into one query if the amount is HUGE. Say you have 1000 rows to insert: don't insert them one at a time, but you probably shouldn't try to put all 1000 rows into a single query either. Instead, break them into smaller batches.

    If it's still really slow, then it might just be because your server is slow.

    Note that you of course don't need all those spaces in the combined query; they are only there to make the answer easier to read.
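
The batching advice above could be sketched in client code roughly like this. This is only a minimal illustration, not a definitive implementation: the helper names (`chunked`, `build_insert`, `insert_in_batches`) and the batch size of 500 are made up for the example, and it assumes a Python DB-API cursor from a MySQL driver that accepts `%s` placeholders (PyMySQL and mysqlclient do, as far as I know). Table and column names are put into the SQL text directly, so they must come from trusted code, never from user input.

```python
from typing import Iterator, List, Sequence, Tuple


def chunked(rows: Sequence[tuple], size: int) -> Iterator[Sequence[tuple]]:
    """Yield consecutive slices of `rows`, each holding at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_insert(table: str, columns: Sequence[str],
                 batch: Sequence[tuple]) -> Tuple[str, List[object]]:
    """Build one multi-row INSERT with %s placeholders and its flat parameter list."""
    column_list = ", ".join(f"`{name}`" for name in columns)
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (
        f"INSERT INTO `{table}` ({column_list}) VALUES "
        + ", ".join([row_placeholder] * len(batch))
    )
    # Flatten the rows so the values line up with the placeholders in order.
    params = [value for row in batch for value in row]
    return sql, params


def insert_in_batches(cursor, table: str, columns: Sequence[str],
                      rows: Sequence[tuple], batch_size: int = 500) -> int:
    """Execute one INSERT per batch on a DB-API cursor; the caller commits.

    Returns the number of statements sent. The default batch size is an
    arbitrary starting point, not a recommendation from the MySQL docs.
    """
    statements = 0
    for batch in chunked(rows, batch_size):
        sql, params = build_insert(table, columns, batch)
        cursor.execute(sql, params)
        statements += 1
    return statements
```

With a real connection you would call something like `insert_in_batches(conn.cursor(), "table1", ["field1", "field2"], rows)` and then `conn.commit()` once at the end, which is in the spirit of the "single large operation" quote above. The best batch size depends on row size and server settings, so it is worth measuring a few values.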