sqlmysqlinner-joinrelational-division

Retrieve rows that are fully compatible with each other in an SQL database


I have this simple structure in my database

CREATE TABLE species (
  _id INTEGER PRIMARY KEY,
  name TEXT NOT NULL
);
CREATE TABLE compatibility (
  _id INTEGER PRIMARY KEY,
  speciesA INTEGER,
  speciesB INTEGER,
  compatibility TINYINT NOT NULL
);

I want to point out that speciesA and speciesB are compound unique to prevent duplicated information. For example, if there would be only 5 species in my database, the compatibility table would look like this:

INSERT INTO species VALUES (1, 'EspecieA');
INSERT INTO species VALUES (2, 'EspecieB');
INSERT INTO species VALUES (3, 'EspecieC');
INSERT INTO species VALUES (4, 'EspecieD');
INSERT INTO species VALUES (5, 'EspecieD');

INSERT INTO compatibility VALUES (null, 1, 2, 1);
INSERT INTO compatibility VALUES (null, 1, 3, 1);
INSERT INTO compatibility VALUES (null, 1, 4, 1);
INSERT INTO compatibility VALUES (null, 1, 5, 0);
INSERT INTO compatibility VALUES (null, 2, 3, 1);
INSERT INTO compatibility VALUES (null, 2, 4, 1);
INSERT INTO compatibility VALUES (null, 2, 5, 0);
INSERT INTO compatibility VALUES (null, 3, 4, 1);
INSERT INTO compatibility VALUES (null, 3, 5, 1);
INSERT INTO compatibility VALUES (null, 4, 5, 1);

I need to write a query that from a given list of species returns a list of species that are fully compatible with each other, meaning all species in the result list must be compatible with all the species in the list provided. Provided species must not be in the result list.

I have tried the following query, but it only returns species that are compatible with at least one of the provided species:

SELECT id, name
FROM species s
WHERE s.id NOT IN (
    SELECT IF(speciesA NOT IN (1,2,3), speciesA, speciesB) AS specie
    FROM compatibility 
    WHERE (speciesA IN (1,2,3)
    AND compatible IN (0)) OR (speciesB IN (1,2,3)
    AND compatible IN (0))
)
AND s.id NOT IN (1,2,3);

How can I modify this query to obtain the list of species that are fully compatible with each other?

For the query above the expected result should be a list of species that only contains species 4. Species 1, 2, 3 are excluded as are in the list provided, and 5 should be excluded cause is not compatible with species 1 and 2.

Any help or suggestions would be greatly appreciated. Thank you!


Solution

  • This will only include species that are explicitly linked with compatibility = 1 (ie assumes by default that species are not compatible)

    SELECT s._id, s.name
    FROM species s
    INNER JOIN compatibility c ON
        (s._id = c.speciesA AND c.speciesB IN (1, 2))
        OR (s._id = c.speciesB AND c.speciesA IN (1, 2))
    WHERE c.compatibility = 1
    AND s._id NOT IN (1, 2)
    

    This will include all species except those that are explicitly linked with compatibility = 0 (ie assumes by default that species are compatible)

    SELECT s._id, s.name
    FROM species s
    LEFT JOIN compatibility c ON
        ((s._id = c.speciesA AND c.speciesB IN (1, 2))
        OR (s._id = c.speciesB AND c.speciesA IN (1, 2)))
        AND c.compatibility = 0
    WHERE c._id IS NULL
    AND s._id NOT IN (1, 2)
    

    Whichever of these better fits your logic, I would also recommend that to optimise query performance you add indices on the compatibility.speciesA and compatibility.speciesB columns.