A :parent has_many :children
and I am trying to retrieve the age of the oldest child for a parent as an attribute on the parent
. I'm open to any solution that accomplishes that efficiently.
The reason I'm trying to do a subquery is to let the DB do the n+1
overhead instead of making a separate DB request for each parent. Both are inefficient, but using a subquery seems more efficient.
# attributes: id
class Parent < ActiveRecord::Base
has_many :children
# Results in an (n+1) request
def age_of_oldest_child
children.maximum(:age)
end
end
# attributes: id, parent_id, age
class Child < ActiveRecord::Base
belongs_to :parent
end
Sample use case:
parent = Parent.first.age_of_oldest_child # => 16
parents = Parent.all
parents.each do |parent|
puts parent.age_of_oldest_child # => 16, ...
end
My attempt:
sql = "
SELECT
(SELECT
MAX(children.age)
FROM children
WHERE children.parent_id = parents.id
) AS age_of_oldest_child
FROM
parents;
"
Parent.find_by_sql(sql)
This returns an array of maximum ages for all parents; I would like to restrict this to just 1 parent or also have it included as an attribute on a parent when I retrieve all parents.
Here is a workable solution I came up with; are there more efficient alternatives?
class Parent < ActiveRecord::Base
scope :with_oldest_child, -> { includes(:oldest_child) }
has_many :children
has_one :oldest_child, -> { order(age: :desc).select(:age, :parent_id) }, class_name: Child
def age_of_oldest_child
oldest_child && oldest_child.age
end
end
Example usage:
# 2 DB queries, 1 for parent and 1 for oldest_child
parent = Parent.with_oldest_child.find(1)
# No further DB queries
parent.age_of_oldest_child # => 16
Here are two ways of doing it:
parent.rb
class Parent < ActiveRecord::Base
has_many :children
# Leaves choice of hitting DB up to Rails
def age_of_oldest_child_1
children.max_by(&:age)
end
# Always hits DB, but avoids instantiating Child objects
def age_of_oldest_child_2
Child.where(parent: self).maximum(:age)
end
end
The first method uses the enumerable module's max_by
functionality and calls age
on each object in the collection. The advantage of doing it this way is you leave the logic of whether or not to hit the database to Rails. If the children
are already instantiated for some reason, it won't hit the database again. If they are not instantiated, it will perform a select query, load them into memory in a single query (thus avoiding N+1) and then go through each one calling its age
method.
The two disadvantages, however, are that if the underlying data has changed since the children were instantiated, it will still use the outdated result (this could be avoided by passing :true
when calling :children
. Also, it is loading every single child
into memory first, then counting them. If the child
object is large and/or a parent has a large number of children, that could be memory-intensive. It really depends on your use case.
If you decided you wanted to avoid loading all of those children
, you could do a straight DB hit every time using the count
query depicted in method 2. In fact, you would probably actually want to relocate that to a scope in Child
as perhaps some would consider it an anti-pattern to do queries like that outside of the target model, but this just makes it easier to see for the example.