sqlpostgresqljoinaggregate-filter

Conditional aggregate queries


I've seen other solutions. Curious if this method of solving these would work.

TABLES:

attendance_events : date | student_id | attendance

all_students : student_id | school_id | grade_level | date_of_birth | hometown


What percent of students attend school on their birthday?

With agg_join as (SELECT att.date as dates, att.attendance as attendance, als.date_of_birth as DOB, att.student_id as student_id
      FROM attendance_events att
      join all_students als on att.student_id = als.studentid)
      
Select count(DISTINCT student_id) as total_students, 
  count( Distinct case when DOB = dates and attendance = TRUE) as count_of_DOBS,
  total_students/ count_of_DOBS as percent_of_student
from agg_join

Which grade level had the largest drop in attendance between yesterday and today?

With agg_join as ( SELECT att.date as dates, att.attendance as attendance, als.grade_level as grade
      FROM attendance_events att
      join all_students als on att.student_id = als.studentid)
      
Select grade,  
      case when dates ( 'd', -1, currentdate) and attendance = True then 1
      else 0 end as yesterday_att,

       case when dates ( 'd', currentdate) and attendance = True then 1
      else 0 end as Today_att, 
      
      (Today_att - yesterday_att) * -1 AS DIFF
      
from agg_join
Group by grade
Order by DIFF DESC
Limit 1

Solution

  • What percent of students attend school on their birthday?

    SELECT 100.0
         * count(*) FILTER (WHERE EXISTS (SELECT FROM attendance_events a
                                          WHERE a.student_id = s.student_id
                                          AND f_mmdd(a.date) = f_mmdd(s.date_of_birth)
                                         ))
         / count(*) AS percentage
    FROM   all_students s;
    

    Where the custom function f_mmdd() is defined here:

    See:

    About the aggregate FILTER clause:

    Which grade level had the largest drop in attendance between yesterday and today?

    SELECT s.grade_level
         , count(*) FILTER (WHERE a.date = CURRENT_DATE - 1) AS yesterday_attendance
         , count(*) FILTER (WHERE a.date = CURRENT_DATE) AS today_attendance
    FROM   attendance_events a
    JOIN   all_students s USING (student_id)
    WHERE  a.date IN (CURRENT_DATE, CURRENT_DATE -1)  -- logically redundant
    GROUP  BY s.grade_level
    ORDER  BY today_attendance - yesterday_attendance
        -- short for: yesterday_attendance - today_attendance DESC
    LIMIT  1;
    

    WHERE a.date IN (CURRENT_DATE, CURRENT_DATE -1) is logically redundant, but makes the query much faster.

    Read up and try to understand why these work, if you are not solving them yourself.