sqlpostgresqljoinleft-join

How to JOIN M:M table the right way?


My database structure:

CREATE TABLE categories (
    name VARCHAR(30) PRIMARY KEY
);

CREATE TABLE additives (
    name VARCHAR(30) PRIMARY KEY
);

CREATE TABLE beverages (
    name VARCHAR(30) PRIMARY KEY,
    description VARCHAR(200),
    price NUMERIC(5, 2) NOT NULL CHECK (price >= 0),
    category VARCHAR(30) NOT NULL REFERENCES categories(name) ON DELETE CASCADE ON UPDATE CASCADE
);

CREATE TABLE b_additives_xref (
    bname VARCHAR(30) REFERENCES beverages(name) ON DELETE CASCADE ON UPDATE CASCADE,
    aname VARCHAR(30) REFERENCES additives(name) ON DELETE CASCADE ON UPDATE CASCADE,
    PRIMARY KEY(bname, aname)
);


INSERT INTO categories VALUES
    ('Cocktails'), ('Biere'), ('Alkoholfreies');

INSERT INTO additives VALUES
    ('Kaliumphosphat (E 340)'), ('Pektin (E 440)'), ('Citronensäure (E 330)');

INSERT INTO beverages VALUES
    ('Mojito Speciale', 'Cocktail mit Rum, Rohrzucker und Minze', 8, 'Cocktails'),
    ('Franziskaner Weißbier', 'Köstlich mildes Hefeweizen', 6, 'Biere'),
    ('Augustiner Hell', 'Frisch gekühlt vom Fass', 5, 'Biere'),
    ('Coca Cola', 'Coffeeinhaltiges Erfrischungsgetränk', 2.75, 'Alkoholfreies'),
    ('Sprite', 'Erfrischende Zitronenlimonade', 2.50, 'Alkoholfreies'),
    ('Karaffe Wasser', 'Kaltes, gashaltiges Wasser', 6.50, 'Alkoholfreies');

INSERT INTO b_additives_xref VALUES
    ('Coca Cola', 'Kaliumphosphat (E 340)'),
    ('Coca Cola', 'Pektin (E 440)'),
    ('Coca Cola', 'Citronensäure (E 330)');

SqlFiddle

I am trying to list all beverages and their attributes (price, description, etc.) and add another column additives from the b_additives_xref table that holds a concatenated string with all additives contained in each beverage:

SELECT
    beverages.name AS name,
    beverages.description AS description,
    beverages.price AS price,
    beverages.category AS category,
    string_agg(additives.name, ', ') AS additives
FROM beverages, additives
    LEFT JOIN b_additives_xref ON b_additives_xref.aname = additives.name
GROUP BY beverages.name
ORDER BY beverages.category;

The output:

Coca Cola       | Coffeeinhaltiges Erfrischungsgetränk | 2.75 | Alkoholfreies | Kaliumphosphat (E 340), Pektin (E 440), Citronensäure (E 330)
Karaffe Wasser  | Kaltes, gashaltiges Wasser           | 6.50 | Alkoholfreies | Kaliumphosphat (E 340), Pektin (E 440), Citronensäure (E 330)
Sprite          | Erfrischende Zitronenlimonade        | 2.50 | Alkoholfreies | Kaliumphosphat (E 340), Pektin (E 440), Citronensäure (E 330)
Augustiner Hell | Frisch gekühlt vom Fass              | 5.00 | Biere         | Kaliumphosphat (E 340)[...]

Which is wrong since only 'Coca Cola' has existing rows in the b_additives_xref table. Except for the row 'Coca Cola' all other rows should have 'null' or 'empty field' values in the column additives. What am I doing wrong?


Solution

  • I believe you are looking for this

    SELECT 
        B.name AS name, 
        B.description AS description, 
        B.price AS price,
        B.category AS category, 
        string_agg(A.name, ', ') AS additives 
    FROM Beverages B
        LEFT JOIN b_additives_xref xref ON xref.bname = B.name 
        Left join additives A on A.name = xref.aname
    GROUP BY B.name
    ORDER BY B.category;
    

    Output

    NAME    DESCRIPTION                                 PRICE   CATEGORY        ADDITIVES
    Coca Cola   Coffeeinhaltiges Erfrischungsgetränk    2.75    Alkoholfreies   Kaliumphosphat (E 340), Pektin (E 440), Citronensäure (E 330)
    

    The problem was that you had a Cartesian product between your beverages and additives tables

    FROM beverages, additives
    

    Every record got placed with every other record. They both need to be explicitly joined to the xref table.