I've had trouble understanding joins in sql and came upon this image which I think might help me. The problem is that I don't fully understand it. For example, the join in the top right corner of the image, which colors the full B circle red and but only the overlap from A. The image makes it seem like circle B is the primary focus of the sql statement, but the sql statement itself, by starting with A (select from A, join B), conveys the opposite impression to me, namely that A would be the focus of the sql statement.
Similarly, the image below that only includes data from the B circle, so why is A included at all in the join statement?
Question: Working clockwise from the top right and finishing in the center, can someone provide more information about the representation of each sql image, explaining
a) why a join would be necessary in each case (for example, especially in situations where no data's taken from A or B i.e. where only A or B but not both is colored)
b) and any other detail that would clarify why the image is a good representation of the sql
I think your main underlying confusion is that when (for example) only A
is highlighted in red, you're taking that to mean "the query only returns data from A
", but in fact it means "the query only returns data for those cases where A
has a record". The query might still contain data from B. (For cases where B
does not have a record, the query will substitute NULL
.)
Similarly, the image below that only includes data from the B circle, so why is A included at all in the join statement?
If you mean — the image where A
is entirely in white, and there's a red crescent-shape for the part of B
that doesn't overlap with A
, then: the reason that A
appears in the query is, A
is how it finds the records in B
that need to be excluded. (If A
didn't appear in the query, then Venn diagram wouldn't have A
, it would only show B
, and there'd be no way to distinguish the desired records from the unwanted ones.)
The image makes it seem like circle B is the primary focus of the sql statement, but the sql statement itself, by starting with A (select from A, join B), conveys the opposite impression to me, namely that A would be the focus of the sql statement.
Quite right. For this reason, RIGHT JOIN
s are relatively uncommon; although a query that uses a LEFT JOIN
can nearly always be re-ordered to use a RIGHT JOIN
instead (and vice versa), usually people will write their queries with LEFT JOIN
and not with RIGHT JOIN
.