I have a sqlite database containing a table who looks like this, to store many to many relation :
ID_1 | ID_2 | TOOL |
---|---|---|
1 | 2 | x |
1 | 2 | y |
1 | 3 | x |
1 | 3 | z |
2 | 3 | x |
2 | 3 | y |
2 | 3 | z |
The two first columns contains ID of pairs of features, and the last one contains the name a tool that is related to these feature. A same pair of features can be associated with multiple tools.
I'm looking for a set of query to summarize this table so it looks like this, without modifying the database structure.
ID_1 | ID_2 | x | y | z |
---|---|---|---|---|
1 | 2 | 1 | 1 | 0 |
1 | 3 | 1 | 0 | 0 |
2 | 3 | 1 | 1 | 1 |
Displaying a new column for each unique TOOL name, containing 0 if the tool is not associated with the pair of ID, and 1 if it is. And having only one row per pair of ID.
Is there a combination of command that allow to do that ?
I tried using multiple CASE WHERE statements, but I'm stuck with this kind of results (here with a simplified example) :
CREATE TABLE IF NOT EXISTS test_table (
ID_1 integer NOT NULL,
ID_2 integer NOT NULL,
tool string NOT NULL
)
INSERT INTO test_table (ID_1, ID_2, tool)
VALUES
(1,2,"x"), (1,2,"y"), (1,3,"x"), (1,3,"z"), (2,3,"x"), (2,3,"y"), (2,3,"z")
SELECT *,
CASE WHEN tool == "x" THEN 1 ELSE 0 END AS x,
CASE WHEN tool == "y" THEN 1 ELSE 0 END AS y,
CASE WHEN tool == "z" THEN 1 ELSE 0 END AS z
FROM test_table
ID_1 | ID_2 | x | y | z |
---|---|---|---|---|
1 | 2 | 1 | 0 | 0 |
1 | 2 | 0 | 1 | 0 |
1 | 3 | 1 | 0 | 0 |
1 | 3 | 0 | 0 | 1 |
2 | 3 | 1 | 0 | 0 |
2 | 3 | 0 | 1 | 0 |
2 | 3 | 0 | 0 | 1 |
You can use conditional aggregation:
SELECT ID_1, ID_2,
MAX(CASE WHEN tool = 'x' THEN 1 ELSE 0 END) x,
MAX(CASE WHEN tool = 'y' THEN 1 ELSE 0 END) y,
MAX(CASE WHEN tool = 'z' THEN 1 ELSE 0 END) z
FROM test_table
GROUP BY ID_1, ID_2;
which, for SQLite, can be simplified:
SELECT ID_1, ID_2,
MAX(tool = 'x') x,
MAX(tool = 'y') y,
MAX(tool = 'z') z
FROM test_table
GROUP BY ID_1, ID_2;
See the demo.