ruby-on-railspostgresqlactiverecorddistinct-on

Rails 8, postgres, distinct on and ordering


This feels so much harder than I think it should!

I have a Rails model: MarketplaceMessage

Every message has an initial_record_id - if it's the first message, it's the ID of the record and if it's a reply it is the first record in the thread.

I want to show the distinct threads, ordered by the most recent message in the threads descending - so:

If it wasn't distinct, it would just be order(id: :desc) - but that returns multiple messages from the thread.

This mostly works - it returns the most recent messages from each thread - but the returned messages are ordered asc:

scope :distinct_threads, -> {
  select("DISTINCT ON (marketplace_messages.initial_record_id) *")
    .order("marketplace_messages.initial_record_id, marketplace_messages.id desc")
}

How do I get this to work and avoid the error:

ActiveRecord::StatementInvalid:
       PG::InvalidColumnReference: ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions

I assume I have to use some sql, but would prefer to do it as Railsy as possible


Solution

  • While in the comments I originally recommended:

    scope :distinct_threads, -> {
      where(
        id: select(arel_table[:id].maximum).group(:initial_record_id)
      ).order(id: :desc)
    }
    

    Result:

    SELECT 
      marketplace_messages.*
    FROM 
      marketplace_messages
    WHERE 
      marketplace_messages.id IN (
       SELECT 
         MAX(marketplace_messages.id) 
       FROM 
         marketplace_messages
       GROUP BY 
         marketplace_messages.initial_id
    ) 
    ORDER BY 
     marketplace_messages.id
    

    If you want to use DISTINCT ON we can do so using a bit of Arel to construct a source subquery as follows:

    scope :distinct_threads, -> {
      messages_table = arel_table
      sub_query = messages_table 
         .project(messages_table[Arel.star])
         .distinct_on(messages_table[:initial_record_id])
         .order(messages_table[:initial_record_id].asc, messages_table[:id].desc)
    
      from(sub_query.as(table_name)).order(id: :desc)
    }
    

    This should result in

    SELECT 
      marketplace_messages.*
    FROM 
      (
        SELECT DISTINCT ON (marketplace_messages.initial_record_id)
          marketplace_messages.*
        FROM 
          marketplace_messages
        ORDER BY 
          marketplace_messages.initial_record_id ASC,  marketplace_messages.id DESC
      ) AS marketplace_messages
    ORDER BY 
      marketplace_messages.id DESC