sqldatabasejoindistributed-databasesemi-join

What is semi-join in database?


I am having trouble while trying to understand the concept of semi-join and how it is different from conventional join. I have tried some article already but not satisfied with the explanation, could someone please help me to understand it?


Solution

  • Simple example. Let's select students with grades using inner join:

    SELECT DISTINCT s.id
    FROM  students s
          INNER JOIN grades g ON g.student_id = s.id
    

    Now the same with left semi-join:

    SELECT s.id
    FROM  students s
    WHERE EXISTS (SELECT 1 FROM grades g
                  WHERE g.student_id = s.id)
    

    The latter is generally more efficient (depending on concrete DBMS and query optimizer).