ruby-on-railspostgresqlpagy

Rails query to order records by single association row column


Say I have a books table that has id and name columns and a has_many book_ratings association. The book_ratings table has id, rating, rating_date, and book_id columns and belongs_to :book. I'm displaying the books in a table with 3 columns: the books name, the highest rating for the book, and the most recent rating for the book. I want each column sortable and I'm using the Pagy gem for pagination. I want to make a single call to each table to avoid N+1.

I made this query which includes the max_rating and sorts by max_rating:

@pagy, @books = pagy(
  Book.joins(:book_ratings)
      .select('books.*,MAX(rating) as max_rating')
      .order('max_rating')
      .group('books.id')
)

With each book record, I can call book.max_rating to display the max_rating in the table. I cannot think of a way to also include the rating with the latest date and sort the records by that rating. I know I can use MAX(rating_date) to get the date of the book_rating with the most recent date, but I need a way to include the rating value of the book_rating with the latest date and also be able to order the book records by those values.

Any ideas?


Solution

  • To get this to work, I ended up using a select method and then chaining two joins methods. Here's what I ended up using:

    Book.select('books.*,t2.rating as latest_rating,t4.rating as highest_rating')
        .joins(
          "INNER JOIN (
             SELECT book_ratings.rating, book_ratings.book_id, book_ratings.rating_date
             FROM book_ratings
             JOIN (
               SELECT book_id, MAX(rating_date) as max_date
               FROM book_ratings GROUP BY book_id
             ) t1
             ON t1.book_id = book_ratings.book_id
             AND t1.max_date = book_ratings.rating_date
           ) t2
           ON t2.book_id = books.id"
        )
        .joins(
          "INNER JOIN (
             SELECT book_ratings.rating, book_ratings.book_id
             FROM book_ratings
             JOIN (
               SELECT book_id, MAX(rating) as max_rating
               FROM book_ratings
               GROUP BY book_id
             ) t3
             ON t3.book_id = book_ratings.book_id
             AND t3.max_rating = book_ratings.rating
           ) t4
           ON t4.book_id = books.id"
        )
        .distinct
    

    Calling .to_sql on that query gives this SQL:

    SELECT DISTINCT books.*,t2.rating as latest_rating,t4.rating as highest_rating
    FROM "books"
    INNER JOIN (
      SELECT book_ratings.rating, book_ratings.book_id, book_ratings.date
      FROM book_ratings
      JOIN (
        SELECT book_id, MAX(date) as max_date
        FROM book_ratings
        GROUP BY book_id
      ) t1
      ON t1.book_id = book_ratings.book_id
      AND t1.max_date = book_ratings.date
    ) t2
    ON t2.book_id = books.id
    INNER JOIN (
      SELECT book_ratings.rating, book_ratings.book_id
      FROM book_ratings
      JOIN (
        SELECT book_id, MAX(rating) as max_rating
        FROM book_ratings
        GROUP BY book_id
      ) t3
      ON t3.book_id = book_ratings.book_id
      AND t3.max_rating = book_ratings.rating
    ) t4
    ON t4.book_id = books.id
    

    When iterating through this collection, latest_rating and highest_rating can be called on each object and provides the value from the rating column for either. It can also be used with .order('latest_rating asc') (or desc) or .order('highest_rating asc') (or desc)