I'm having trouble understanding how this query works. It is supposed to return the name of customers who have ordered ALL items.
R
refers to the table of item orders made by customers which contains the customer id (cid
) and item id (iid
).
I
refers to the table of items that can be ordered which contains the item id.
C
is the customer table with customer id.
SELECT cname
FROM Customer C
WHERE NOT EXISTS
( (SELECT I.iid
FROM Item I)
EXCEPT
(SELECT R.iid
FROM Order R
WHERE R.cid=C.cid))
The bottom nested query with SELECT R.iid
gets all the items ordered by any customer.
Then the nested query above the EXCEPT
with SELECT I.iid
finds all the items which have not been ordered before by subtracting the result of the query below it.
If it is nested, what statement does NOT EXISTS
evaluate? Is it R.cid = C.cid
because of FROM Customer C
? How does it get to the end result?
SELECT I.iid FROM Item I
means "IDs of all items".
SELECT R.iid FROM Order R WHERE R.cid=C.cid
means "IDs of all items that customer C
has ever ordered" (where C
is determined by the containing query).
In general, query1 EXCEPT query2
means "all rows that are returned by query1
and are not returned by query2
". In your specific case, (SELECT I.iid FROM Item I) EXCEPT (SELECT R.iid FROM Order R WHERE R.cid=C.cid)
means "IDs of all items that customer C
has never ordered".
Inside a NOT EXISTS (subquery)
expression, all that really matters is whether subquery
returns any rows (in which case it "exists") or not. So (SELECT I.iid FROM Item I) EXCEPT (SELECT R.iid FROM Order R WHERE R.cid=C.cid)
exists if there are is any item that customer C
has never ordered — and it doesn't exist if there isn't any such item.
So the query as a whole finds the name of customers who have ordered every single item.