sqlmysqlunion

SQL using unions/intersections to return rows where two columns have values meeting certain conditions in other tables


In SQL I have three tables as follows: table name = users, id is primary key and of type int, name is of type text

id name
1 user1
2 user2

table name = courses, id is primary key and of type int, name is of type text and userid is a foreign key for the ids in table users and is of type int

id name userid
1 course1 1
2 course2 1
3 course3 2
4 course4 2

table name = modules, id is primary key and of type int, name is of type text, userid is the foreign key for the ids in the users table, and courseid is another foreign key for the ids in the courses table

id name userid courseid
1 module1 1 1
2 module2 1 3
3 module3 2 3
4 module4 1 4

I am trying to build an sql query that can return the rows in the modules table where the userid and courseid contain values that should not normally result from the primary ids/foreign keys shown in the courses table (I am trying to solve a potential corruption of data).

In the example above, I should return the rows of the modules table with primary ids 2 and 4, since the courses table has no rows where the foreign key (the user) is 1 and the primary id (course) is 3, and there are also no rows where the foreign key (the user) is 1 and the primary id (course) is 4. Thus the output should be:

id name userid courseid
2 module2 1 3
4 module4 1 4

I am not sure how to use union/intersection along with potential not and in to solve this - any guidance would be appreciated.


Solution

  • I'm not sure why you think you want UNION or INTERSECT, because EXCEPT would be the correct set operator.

    SELECT m.courseid, m.userid
    FROM modules m
    EXCEPT
    SELECT c.id, c.userid
    FROM courses c;
    

    But you probably want a NOT EXISTS rather than EXCEPT, as this allows you to show all columns. Do not use NOT IN as this has weird effects in the presence of nulls.

    SELECT m.*
    FROM modules m
    WHERE NOT EXISTS (SELECT 1
        FROM courses c
        WHERE c.id = m.courseid
          AND c.userid = m.userid);
    

    db<>fiddle


    Having said that, your table design is heavily denormalized.

    courses

    id name
    1 course1
    2 course2
    3 course3
    4 course4

    user_courses

    courseid userid
    1 1
    2 1
    3 2
    4 2

    modules

    id name courseid
    1 module1 1
    2 module2 3
    3 module3 3
    4 module4 4

    This needs a secondary unique key (courseid, id) for the foreign key below.

    user_modules

    moduleid userid courseid
    1 1 1
    2 1 3
    3 2 3
    4 1 4

    This last table should be foreign-keyed like this:

    FOREIGN KEY (courseid, userid) REFERENCES user_courses (courseid, userid),
    FOREIGN KEY (courseid, moduleid) REFERENCES modules (courseid, id)
    

    Note that many-to-many join tables should not normally have a separate primary key. The PK is made up of the columns of the FKs.