The idea is that the user leaves likes-dislikes for different items, I need to get a list of users who have left the same ratings (likes and dislikes) as the selected user (USER_ID = 1), to determine their similarity.
RATING Column:
1 = like,
0 = dislike
Full table:
+---------+---------+--------+--------------------------------------------------+
| USER_ID | ITEM_ID | RATING | -EXAMPLE- |
+---------+---------+--------+--------------------------------------------------+
| 1 | 1 | 1 |-+
| 1 | 2 | 1 | |
| 1 | 3 | 1 | +-[1,1,1,0,0] user_1 vector of ratings
| 1 | 4 | 0 | | | | |
| 1 | 5 | 0 |-+ | | |
| 3 | 1 | 1 |----+ + + total_match with user_1 = 3 [1,0,0]
| 3 | 2 | 0 | | |
| 3 | 3 | 0 | | |
| 3 | 4 | 0 |----------+ |
| 3 | 5 | 0 |------------+
| 4 | 1 | 1 |
| 4 | 2 | 1 |
| 4 | 3 | 1 |
| 4 | 4 | 0 |
| 4 | 5 | 0 |
+---------+---------+--------+
Match computing:
user_3 likes_match with user_1 = 1
user_3 dislikes_match with user_1 = 2
total_match = likes_match + dislikes_match = 3
How to make a SQL query to get the following result:
+---------+-------------+----------------+-------------+
| user_id | likes_match | dislikes_match | total_match |
+---------+-------------+----------------+-------------+
| 3 | 1 | 2 | 3 |
| 4 | 3 | 2 | 5 |
+---------+-------------+----------------+-------------+
Any Ideas?
(This uses sqlite, but it shouldn't need much if anything to work on other databases):
Given the following table:
CREATE TABLE ratings(user_id INTEGER, item_id INTEGER, rating INTEGER
, PRIMARY KEY(user_id, item_id)) WITHOUT ROWID;
INSERT INTO ratings VALUES(1,1,1);
INSERT INTO ratings VALUES(1,2,1);
INSERT INTO ratings VALUES(1,3,1);
INSERT INTO ratings VALUES(1,4,0);
INSERT INTO ratings VALUES(1,5,0);
INSERT INTO ratings VALUES(3,1,1);
INSERT INTO ratings VALUES(3,2,0);
INSERT INTO ratings VALUES(3,3,0);
INSERT INTO ratings VALUES(3,4,0);
INSERT INTO ratings VALUES(3,5,0);
INSERT INTO ratings VALUES(4,1,1);
INSERT INTO ratings VALUES(4,2,1);
INSERT INTO ratings VALUES(4,3,1);
INSERT INTO ratings VALUES(4,4,0);
INSERT INTO ratings VALUES(4,5,0);
This query:
SELECT r1.user_id AS user_id
, sum(r1.rating) AS likes_match
, sum(CASE r1.rating WHEN 0 THEN 1 ELSE 0 END) AS dislikes_match
, count(*) AS total_match
FROM ratings AS r1
JOIN ratings AS r2 ON r2.user_id = 1
AND r1.item_id = r2.item_id
AND r1.rating = r2.rating
WHERE r1.user_id <> 1
GROUP BY r1.user_id
ORDER BY r1.user_id;
Prouduces:
user_id likes_match dislikes_match total_match
---------- ----------- -------------- -----------
3 1 2 3
4 3 2 5