In Microsoft SQL Server, I have one table (live table) that takes snapshot of student enrollment information on a daily basis such that one students can take multiple rows by date. In this table, we sore information like student_id, enrollmentStatus (1 or 0), attendnaceDate (date), and live_tablemakrer (all rows = 1). This table is set up to update daily in the data warehouse to bring in new enrollment data of that day.
In another static table (static table), it stores students' enrollment information on the first day of this semester. This table is static an will not change. It contains fields like student_id, enrolledonFD (all rows = 1). In this table, every student is an unique entry.
For each day, I want to compare who are in the live table and also in the static table, who are in the live table but not in the static table, and who are not in the live table but in the static table.
To achieve this goal, I'm thinking of doing a full outer join. But I got stuck on how to handle the date element in the live table. Basically, it should be a full outer join between the static table and the live table for every date.
Could someone give me some advice on how to do that?
Below is sample data and intended outcome
StaticTable
StudentID | Grade |
---|---|
A | K |
B | K |
C | PK |
LiveTable
StudentID | Date | Attendance | Grade |
---|---|---|---|
A | 9/1 | 1 | K |
A | 9/2 | 0 | K |
A | 9/3 | 1 | K |
B | 9/1 | 1 | K |
B | 9/2 | 1 | K |
C | 9/1 | 1 | PK |
C | 9/3 | 0 | PK |
D | 9/1 | 1 | PK |
D | 9/3 | 1 | PK |
F | 9/3 | 1 | K |
Intended Outcome
StudentID | Date | InStaticTable | InLiveTable | Attendance | Grade |
---|---|---|---|---|---|
A | 9/1 | 1 | 1 | 1 | K |
A | 9/2 | 1 | 1 | 0 | K |
A | 9/3 | 1 | 1 | 1 | K |
B | 9/1 | 1 | 1 | 1 | K |
B | 9/2 | 1 | 1 | 1 | K |
B | 9/3 | 1 | 0 | NaN | K |
C | 9/1 | 1 | 1 | 1 | PK |
C | 9/2 | 1 | 0 | NaN | PK |
C | 9/3 | 1 | 1 | 0 | PK |
D | 9/1 | 0 | 1 | 1 | PK |
D | 9/3 | 0 | 1 | 1 | PK |
F | 9/3 | 0 | 1 | 1 | K |
In other words:
You plan to use full join. However, the full join operation is DBMS-specific. Since the DBMS was not initially specified in the question, I suggest considering an example of a query without using full join.
out
rows, where Grade is null in booth st and lt tables - this row (StudentId,date) not exists in LiveTable with this date and not exists in StaticTable (your 4. condition)
See exampleselect s.StudentId,d.date
,case when st.Grade is not null then 1 else 0 end isStaticTable
,case when lt.Grade is not null then 1 else 0 end isLiveTable
,lt.Attendance
,coalesce(st.Grade,lt.Grade) Grade
from ( -- students
select StudentId from StaticTable
union
select StudentId from LiveTable
) s
left join ( -- dates
select distinct date from LiveTable
)d on 1=1
left join StaticTable st on st.StudentId=s.StudentId
left join LiveTable lt on lt.StudentId=s.StudentId and lt.date=d.date
where coalesce(st.Grade,lt.Grade) is not null
order by s.StudentId,d.date;
StudentId | date | isStaticTable | isLiveTable | Attendance | Grade |
---|---|---|---|---|---|
A | 2024-09-01 | 1 | 1 | 1 | K |
A | 2024-09-02 | 1 | 1 | 0 | K |
A | 2024-09-03 | 1 | 1 | 1 | K |
B | 2024-09-01 | 1 | 1 | 1 | K |
B | 2024-09-02 | 1 | 1 | 1 | K |
B | 2024-09-03 | 1 | 0 | null | K |
C | 2024-09-01 | 1 | 1 | 1 | PK |
C | 2024-09-02 | 1 | 0 | null | PK |
C | 2024-09-03 | 1 | 1 | 0 | PK |
D | 2024-09-01 | 0 | 1 | 1 | PK |
D | 2024-09-03 | 0 | 1 | 1 | PK |
F | 2024-09-03 | 0 | 1 | 1 | K |