mysqlsybasepervasivefaircom-db-sql

Selecting parent records when child mathes criteria


I am trying to limit returned results of users to results that are "recent" but where users have a parent, I also need to return the parent.

CREATE TABLE `users`  (
  `id` int(0) NOT NULL,
  `parent_id` int(0) NULL,
  `name` varchar(255) NULL,
  PRIMARY KEY (`id`)
);
CREATE TABLE `times` (
  `id` int(11) NOT NULL,
  `time` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (1, NULL, 'Alan');
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (2, 1, 'John');
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (3, NULL, 'Jerry');
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (4, NULL, 'Bill');
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (5, 1, 'Carl');

INSERT INTO `times`(`id`, `time`) VALUES (2, '2019-01-01 14:40:38');
INSERT INTO `times`(`id`, `time`) VALUES (4, '2019-01-01 14:40:38');

http://sqlfiddle.com/#!9/91db19

In this case I would want to return Alan, John and Bill, but not Jerry because Jerry doesn't have a record in the times table, nor is he a parent of someone with a record. I am on the fence about what to do with Carl, I don't mind getting the results for him, but I don't need them.

I am filtering tens of thousands of users with hundreds of thousands of times records, so performance is important. In general I have about 3000 unique id's coming from times that could be either an id, or a parent_id.

The above is a stripped down example of what I am trying to do, the full one includes more joins and case statements, but in general the above example should be what we work with, but here is a sample of the query I am using (full query is nearly 100 lines):

SELECT id                                   AS reference_id, 
       CASE WHEN (id != parent_id)
       THEN
       parent_id
       ELSE null END                                    AS parent_id, 
       parent_id                                          AS family_id, 
       Rtrim(last_name)                                 AS last_name, 
       Rtrim(first_name)                                AS first_name, 
       Rtrim(email)                                     AS email, 
       missedappt                                     AS appointment_missed, 
       appttotal                                      AS appointment_total, 
       To_char(birth_date, 'YYYY-MM-DD 00:00:00')       AS birthday, 
       To_char(first_visit_date, 'YYYY-MM-DD 00:00:00') AS first_visit, 
       billing_0_30
FROM   users AS p
      RIGHT JOIN(
                SELECT p.id, 
                       s.parentid, 
                       Count(p.id) AS appttotal, 
                       missedappt, 
                        billing0to30                                        AS billing_0_30
                FROM   times AS p 
                       JOIN (SELECT missedappt, parent_id, id                                     
                             FROM   users) AS s 
                         ON p.id = s.id 
                       LEFT JOIN (SELECT parent_id, billing0to30
                                  FROM   aging) AS aging 
                              ON aging.parent_id = p.id 
                WHERE  p.apptdate > To_char(Timestampadd(sql_tsi_year, -1, Now()), 'YYYY-MM-DD') 
                GROUP  BY p.id, 
                          s.parent_id, 
                          missedappt, 
                          billing0to30
                ) AS recent ON recent.patid = p.patient_id

This example is for a Faircom C-Tree database, but I also need to implement a similar solution in Sybase, MySql, and Pervasive, so just trying to understand what I should do for best performance.

Essentially what I need to do is somehow get the RIGHT JOIN to also include the users parent.


Solution

  • NOTES:

    We want to limit the number of times we have to join the times and users tables (a CTE would make this a bit easier to code/read).

    The main query (times -> users(u1) -> users(u2)) will give us child and parent names in separate columns so we'll use a 2-row dynamic table plus a case statement to to pivot the columns into their own rows (NOTE: I don't work with MySQL and didn't have time to research if there's a pivot capability in MySQL 5.6)

    -- we'll let 'distinct' filter out any duplicates (eg, 2 'children' have same 'parent')
    
    select distinct 
           final.name
    
    from
    
        -- cartesian product of 'allnames' and 'pass' will give us
        -- duplicate lines of id/parent_id/child_name/parent_name so 
        -- we'll use a 'case' statement to determine which name to display
    
        (select case when pass.pass_no = 1 
                     then allnames.child_name 
                     else allnames.parent_name 
                end as name
    
         from 
    
             -- times join users left join users; gives us pairs of
             -- child_name/parent_name or child_name/NULL
    
             (select u1.id,u1.parent_id,u1.name as child_name,u2.name as parent_name
              from   times t
              join   users u1
              on     u1.id = t.id
    
              left
              join   users u2
              on     u2.id = u1.parent_id) allnames
    
              join
    
              -- poor man's pivot code:
              -- 2-row dynamic table; no join clause w/ allnames will give us a
              -- cartesian product; the 'case' statement will determine which
              -- name (child vs parent) to display
    
              (select 1 as pass_no
               union
               select 2) pass
    
        ) final
    
    -- eliminate 'NULL' as a name in our final result set
    where final.name is not NULL
    
    order by 1
    

    Result set:

    name
    ==============
    Alan
    Bill
    John
    

    MySQL fiddle