sqlpostgresqlpostgresql-12relational-division

Find entities based on "labels" in Postgresql


Say I have two tables in Postgresql 12, one which stores base data of a project and each of these projects may have an arbitrary set of labels:

CREATE TABLE projects (
  id SERIAL,
  title TEXT
);
CREATE TABLE labels (
  id SERIAL,
  projects_id INT NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
  key VARCHAR(127) NOT NULL,
  value TEXT NOT NULL,
  UNIQUE (projects_id, key)
);

Now I want to create a query which returns all projects which match all of multiple key/value combinations from labels. I would like to use an "arbitrary" number (say at least 5) of label matches.

Selecting projects matching exactly one label is easy:

SELECT 
  p.id, p.title 
FROM projects p 
JOIN labels l ON l.projects_id = p.id 
WHERE l.key = 'k1' AND l.value = 'v1';

Just using

SELECT p.id, p.title
FROM projects p
JOIN labels l ON l.projects_id = p.id 
WHERE 
  l.key = 'k1' AND l.value = 'v1' AND 
  l.key = 'k2' AND l.value = 'v2';

does not work and I understand why.

What is the best approach here using SQL only?


Solution

  • You can do:

    select p.*
    from projects p
    join (
      select projects_id 
      from labels
      where (key, value) in (
        ('k1', 'v1'), -- The list of labels to look for
        ('k2', 'v2'),
        ('k3', 'v3'),
        ('k4', 'v4'),
        ('k5', 'v5')
      )
      having count(*) = 5 -- We want full count of them
      group by projects_id
    ) x on x.projects_id = p.id