Tags: sql, ruby-on-rails, activerecord, rails-activerecord, squeel

Optimize difficult query (possibly with squeel)


I have the following code (using the PublicActivity gem and Squeel):

  def index
    @activities = Activity.limit(20).order { created_at.desc }
    @one = @activities.where{trackable_type == 'Post'}.includes(trackable: [:author, :project])
    @two = @activities.where{trackable_type == 'Project'}.includes trackable: [:owner]
    @activities = @one + @two
  end

But it generates 7 SQL queries:

  SELECT "activities".* FROM "activities" WHERE "activities"."trackable_type" = 'Post' ORDER BY "activities"."created_at" DESC LIMIT 20

  SELECT "posts".* FROM "posts" WHERE "posts"."id" IN (800, 799, 798, 797, 796, 795, 794, 793, 792, 791, 790, 789, 788, 787, 786, 785, 784, 783, 782, 781)

  SELECT "users".* FROM "users" WHERE "users"."id" IN (880, 879, 878, 877, 876, 875, 874, 873, 872, 871, 869, 868, 867, 866, 865, 864, 863, 862, 861, 860)

  SELECT "projects".* FROM "projects" WHERE "projects"."id" IN (80, 79)

  SELECT "activities".* FROM "activities" WHERE "activities"."trackable_type" = 'Project' ORDER BY "activities"."created_at" DESC LIMIT 20

  SELECT "projects".* FROM "projects" WHERE "projects"."id" IN (80, 79, 78, 77, 76, 75, 74, 73, 72, 71, 70, 69, 68, 67, 66, 65, 64, 63, 62, 61)

  SELECT "users".* FROM "users" WHERE "users"."id" IN (870, 859, 848, 837, 826, 815, 804, 793, 782, 771, 760, 749, 738, 727, 716, 705, 694, 683, 672, 661)

The problems:

  1. The activities queries are not joined.
  2. Some users (post authors and project owners) are loaded twice.
  3. Some projects are loaded twice.
  4. @activities ends up as an Array. Rails relation merge methods (anything other than +) don't work with the code above.

Any ideas to optimize it?


Solution

  • In a nutshell, you can't optimize any further without using SQL. This is how Rails works: it doesn't give you access to joined fields outside the AR model the query is issued on, so to get values from other tables it runs a separate query for each one.

    It also doesn't support UNION or more elaborate WHERE conditions that would offer other ways of solving the problem.

    The good news is that these queries are all efficient (provided trackable_type is indexed). If the result set is of any substantial size (say a few dozen rows), the I/O time will dominate the slight additional overhead of 7 simple queries versus 1 complex one.

    Even using SQL, it will be difficult to get all the join results you want in one query. (It can be done, but the result will be a hash rather than an AR instance, so the dependent code will be ugly.) The one-query-per-table approach is wired pretty deeply into Active Record.
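
    If the duplicate loads from the question matter, one option that stays within the one-query-per-table pattern is to collect the ids first and look each table up once. The following is only a plain-Ruby sketch of that idea, not Active Record code: FakeTable, where_ids and load_feed are hypothetical names, the tables are in-memory stand-ins, and the author_id, project_id and owner_id columns are assumed from the association names in the question.

```ruby
# Hypothetical stand-in for a database table: rows are indexed by id, and every
# call to where_ids counts as one "SELECT ... WHERE id IN (...)" query.
class FakeTable
  attr_reader :queries

  def initialize(rows)
    @rows = rows.each_with_object({}) { |row, by_id| by_id[row[:id]] = row }
    @queries = 0
  end

  def where_ids(ids)
    @queries += 1
    ids.uniq.map { |id| @rows[id] }.compact
  end
end

# Loads everything the feed needs with exactly one lookup per table.
# activities is an Array of Hashes with :trackable_type and :trackable_id.
def load_feed(activities, posts:, projects:, users:)
  ids_by_type = Hash.new { |hash, type| hash[type] = [] }
  activities.each { |a| ids_by_type[a[:trackable_type]] << a[:trackable_id] }

  # Posts go first because they reference both projects and users.
  post_rows = posts.where_ids(ids_by_type['Post'])

  # Projects tracked directly, plus projects referenced by the loaded posts.
  project_ids = ids_by_type['Project'] + post_rows.map { |p| p[:project_id] }
  project_rows = projects.where_ids(project_ids)

  # Post authors and project owners live in the same table, so fetch them together.
  user_ids = post_rows.map { |p| p[:author_id] } + project_rows.map { |p| p[:owner_id] }
  user_rows = users.where_ids(user_ids)

  { posts: post_rows, projects: project_rows, users: user_rows }
end

# Usage with made-up rows (assumed column names):
posts    = FakeTable.new([{ id: 800, author_id: 880, project_id: 80 }])
projects = FakeTable.new([{ id: 80, owner_id: 880 }, { id: 79, owner_id: 870 }])
users    = FakeTable.new([{ id: 880 }, { id: 870 }])
activities = [
  { trackable_type: 'Post',    trackable_id: 800 },
  { trackable_type: 'Project', trackable_id: 80 },
  { trackable_type: 'Project', trackable_id: 79 }
]
feed = load_feed(activities, posts: posts, projects: projects, users: users)
```

    The trade-off is that the records then have to be wired together by hand, which is the bookkeeping includes normally does for you.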

    @Mr.Yoshi's solution is a good compromise using minimal SQL, except that it doesn't let you selectively load either author or project+owner based on the trackable_type field.
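
    Choosing associations by trackable_type can be pictured as a lookup table applied after a single activities query. This is a rough plain-Ruby sketch under that assumption: ASSOCIATIONS_BY_TYPE and preload_plan are hypothetical names, and activities are modelled as plain hashes. Handing each group to Active Record's preloader is deliberately left out, since that API differs between Rails versions.

```ruby
# Hypothetical map from polymorphic type to the associations worth preloading,
# taken from the two includes calls in the question.
ASSOCIATIONS_BY_TYPE = {
  'Post'    => [:author, :project],
  'Project' => [:owner]
}.freeze

# Splits activities by trackable_type and pairs each group with its associations.
# Types missing from the map get an empty association list.
def preload_plan(activities)
  activities.group_by { |a| a[:trackable_type] }.map do |type, group|
    {
      type: type,
      ids: group.map { |a| a[:trackable_id] },
      associations: ASSOCIATIONS_BY_TYPE.fetch(type, [])
    }
  end
end

# Usage with made-up activity rows:
sample = [
  { trackable_type: 'Post',    trackable_id: 800 },
  { trackable_type: 'Project', trackable_id: 80 },
  { trackable_type: 'Post',    trackable_id: 799 }
]
plan = preload_plan(sample)
```

    Each entry in the plan names one type, the ids to fetch for it, and the associations to load for just those records, so posts never get asked for an owner and projects never get asked for an author.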

    Edit

    The above is all correct for Rails 3. For Rails 4, as @CMW says, the eager_load method does the same as includes but uses an outer join instead of separate queries. This is why I love SO! I always learn something.