mysqlmysql5mysql-5.1mysql-5.5

Handling Deletes/Inserts/Select in a huge table


I have a dating website. In this website I used to send daily 10 photo matches to a user and store them in the structure as

SENDER RECEIVER
11      1
12      1
13      1
14      1

I maintain two month log. User can also check them by logging to my website.

Which means there is parallel insert and select, which surely is not an issue.

Problem is when a user becomes inactive or deletes its id, I need to remove all the entries from the log where sender='inactive-id'.

Size of log is approx 60 million. So whenever a delete queries comes in this huge table, all selects get locked and my site is getting down.

Note my table is merge myisam as I need to store 2-3 month records and on 1st of every month I change the definition


Solution

  • Normally, Table is the most granular object that is locked by DELETE statement. Therefore, by using MERGE table you combine several objects that can be lock independently into a single big object that will be locked, when DELETE hits ANY of its tables.

    MERGE is a solution for tables which change rarely or never: MERGE Table Advantages and Disadvantages.

    You have 2 options:

    Minimise impact of locks:

    Have smaller objects locked (rather than locking all your tables at once):

    Quote from MySQL Manual:

    An alternative to a MERGE table is a partitioned table, which stores partitions of a single table in separate files. Partitioning enables some operations to be performed more efficiently and is not limited to the MyISAM storage engine. For more information, see Chapter 18, Partitioning.

    I would strongly advocate for partitioning, because: - You can fully automate your logging / data retention process: a script can create new and remove empty partitions, move obsolete data to a different table and then truncate that table. - key uniqueness is enforced - Only partition that contains data to be deleted is locked. Selects on other partitions runs as normal. - Searches run on all partitions at the same time (as with MERGE), but you can use HASH SubPartitioning to further speed up searches.

    However if you believe that benefits of partitioning will be outweighed by cost of development, then may be you should not delete that data at all?