mysqlruby-on-railsruby-on-rails-4activerecordinnodb

Optimizing delete operations in Rails 4 / ActiveRecord / MySQL


Essentially, we need to delete T1 records that have not had a t3 record associated with it for given @user. Though not required, it would be nice to delete T2 records as well that do not have a T3 join.

This was the code that got pushed to production. Obviously, it's great, because it passed a unit test(ha!)...except it's causing a lock across millions of rows in production, resulting in a Server 500 deadlock (Mysql2::Error: Deadlock found when trying to get lock; try restarting transaction) when multiple users hit the DELETE query at the same time. Yes, indexes are in place:

T1.where(user_id: @user.id, enabled: true)
    .joins('LEFT JOIN t2 ON t2.t1_id = t1.id')
    .joins('LEFT JOIN t3 ON t3.id = t2.t3_id')
    .where('t3.id IS NULL').delete_all

Resulting SQL:

DELETE FROM `t1`
WHERE `t1`.`id` IN
(SELECT id FROM
     (SELECT `t1`.`id` FROM `t1`
      LEFT JOIN t2 ON t2.t1_id = t1.id
      LEFT JOIN t3 ON t3.id = t2.t3_id
      WHERE `t1`.`user_id` = 65987
      AND `t1`.`enabled` = 1
      AND (t2.id IS NULL)
     ) __active_record_temp
    );

The only reason I know that is the resulting SQL here, is due it being included with the Server 500 deadlock error. I can't seem to get delete_all queries to show in the console while testing. I was able to take the query output and convert it to a SELECT with explain, which showed the outermost select scanning millions of rows (which I believe translates to the same number of row locks for DELETE operation.) The innermost query scans only 27 rows.

Questions:

  1. What is the optimal approach to deleting records from one or multiple tables based on join values in Rails with ActiveRecord?
  2. What options do we have to review/test the SQL output in Rails to ensure I'm not at risk of poor performance & deadlock?

UPDATE: the plot thickens...adding current associations

class User < ActiveRecord::Base
has_many :T1s
has_many :T2s

class T1 < ActiveRecord::Base
belongs_to :user

class T2Custom < ActiveRecord::Base
self.table_name = "t2"
has_many :T3s, :foreign_key => :t2_id

class T3 < ActiveRecord::Base
belongs_to :T2, foreign_key: "t2_id"
belongs_to :T1

Solution

  • As a quick (possibly permanent) fix, I decided to replace .delete_all with .destroy_all which will essentially only run the inner query, scanning 27 rows, then instantiate and delete the records one by one, with the additional advantage of running destroy callbacks if required.

    The LEFT JOIN was to find records that were invalid at this point in time due to child records in other tables never being created. The number of records in scope should never be more than 30. The code is attempting to find and delete records that shouldn't and usually don't exist. Meaning, 95% the time, the snippet will exit after the (fast) initial query comes back empty.

    Running .to_sql as recommended by ruby_newbie was extremely helpful to sort this out.