sqloracle-databaseanti-join

Oracle - Finding missing /non-joined records


I have an issue in Oracle 12 that is easiest explained with the traditional database design scenario of students, classes, and students taking classes called registrations. I understand this model well. I have a scenario where I need to get a COMPLETE list, of all students against ALL classes, and whether or not they are taking that class or not...

Lets use this table design here...

CREATE TABLE CLASSES 
(CLASSID  VARCHAR2(10) PRIMARY KEY,
CLASSNAME  VARCHAR2(25),
INSTRUCTOR VARCHAR2(25) );


CREATE TABLE STUDENTS
(STUDENTID  VARCHAR2(10) PRIMARY KEY,
STUDENTNAMENAME  VARCHAR2(25)
STUDY_MAJOR VARCHAR2(25) );

CREATE TABLE REGISTRATION
(
 CLASSID VARCHAR2(10 BYTE), 
 STUDENTID VARCHAR2(10 BYTE), 
 GRADE NUMBER(4,0), 
 CONSTRAINT "PK1" PRIMARY KEY ("CLASSID", "STUDENTID"),
 CONSTRAINT "FK1" FOREIGN KEY ("CLASSID") REFERENCES "CLASSES" ("CLASSID") ENABLE, 
 CONSTRAINT "FK2" FOREIGN KEY ("STUDENTID") REFERENCES "EGR_MM"."STUDENTS" ("STUDENTID") ENABLE
 ) ;

So assume the following... 300 students, and 15 different classes... and the REGISTRATION table will show how many students taking how many classes... What I need is that info PLUS all the NON-TAKEN combinations... i.e. I need a report (SQL statement) that shows ALL possible combinations... i.e. 300 x 15, and then whether that row exists in the registration table...so for example, the output should look like this...

STUDENTID   Class1_GRADE  Class2_Grade      Class3_Grade`       Class4_Grade
101         A               B                   Not Taking          A
102         C               Not Taking          Not Taking          Not Taking
****** THIS STUDENT NOT TAKING ANY CLASSES So NOT in the Registrations Table
103         Not Taking      Not Taking          Not Taking          Not Taking  

This would work as well, and I can probably do a PIVOT to get the above listing.

STUDENTID   CLASSID  GRADE
101         Class1    A
101         Class2    B
101         Class3    Not Taking
101         Class4    A
...
102         Class1    C
102         Class2    Not Taking
102         Class3    Not Taking
102         Class4    Not Taking
...
103         Class1    Not Taking  // THIS STUDENT NOT TAKING ANY CLASSES
103         Class2    Not Taking
103         Class3    Not Taking
103         Class4    Not Taking

How do I fill in the missing data, i.e. the combination of students and classes NOT taken...?


Solution

  • CROSS JOIN the students and classes and then LEFT OUTER JOIN the registrations and then use COALESCE to get the Not taken value:

    SELECT s.studentid,
           c.classid,
           COALESCE( TO_CHAR( r.grade ), 'Not taken' ) AS grade
    FROM   students s
           CROSS JOIN classes c
           LEFT OUTER JOIN registration r
           ON ( s.studentid = r.studentid AND c.classid = r.classid )
    

    Which, if you have the data:

    INSERT INTO Classes
    SELECT LEVEL,
           'Class' || LEVEL,
           'Instructor' || LEVEL
    FROM   DUAL
    CONNECT BY LEVEL <= 3;
    
    INSERT INTO Students
    SELECT TO_CHAR( LEVEL, 'FM000' ),
           'Student' || LEVEL,
           'Major'
    FROM   DUAL
    CONNECT BY LEVEL <= 5;
    
    INSERT INTO Registration
    SELECT 1, '001', 4 FROM DUAL UNION ALL
    SELECT 1, '002', 2 FROM DUAL UNION ALL
    SELECT 1, '003', 5 FROM DUAL UNION ALL
    SELECT 2, '001', 3 FROM DUAL UNION ALL
    SELECT 3, '001', 1 FROM DUAL;
    

    Then it outputs:

    STUDENTID | CLASSID | GRADE    
    :-------- | :------ | :--------
    001       | 1       | 4        
    002       | 1       | 2        
    003       | 1       | 5        
    001       | 2       | 3        
    001       | 3       | 1        
    005       | 1       | Not taken
    004       | 2       | Not taken
    003       | 3       | Not taken
    005       | 3       | Not taken
    005       | 2       | Not taken
    002       | 2       | Not taken
    003       | 2       | Not taken
    004       | 1       | Not taken
    002       | 3       | Not taken
    004       | 3       | Not taken
    

    If you want to pivot it then:

    SELECT *
    FROM   (
      SELECT s.studentid,
             c.classid,
             COALESCE( TO_CHAR( r.grade ), 'Not taken' ) AS grade
      FROM   students s
             CROSS JOIN classes c
             LEFT OUTER JOIN registration r
             ON ( s.studentid = r.studentid AND c.classid = r.classid )
    )
    PIVOT ( MAX( grade ) FOR classid IN (
      1 AS Class1,
      2 AS Class2,
      3 AS Class3
    ) )
    ORDER BY StudentID
    

    Which outputs:

    STUDENTID | CLASS1    | CLASS2    | CLASS3   
    :-------- | :-------- | :-------- | :--------
    001       | 4         | 3         | 1        
    002       | 2         | Not taken | Not taken
    003       | 5         | Not taken | Not taken
    004       | Not taken | Not taken | Not taken
    005       | Not taken | Not taken | Not taken
    

    db<>fiddle here