I need to get the list of design_id with similar tags, ordered by entries having the most common tags count first
Tables structures:
shop_tags (tag_id, tag_name)
shop_tags_link (tag_id, design)
shop_tshirts (design_id, article_id)
Let's say I have 4 designs
design_id 1 tags = red, blue
design_id 2 tags = black, grey
design_id 3 tags = yellow, green
design_id 4 tags = white, red, black
design_id 5 tags = red
The input is an array of designs ID :
$input = array("1", "2");
I need to find designs with similar tags and sort them by the common tags count (desc), so if the input is designs 1 and 2 we're looking for designs with tags "red, blue, black, grey" which would give the following result in this order :
design 4, design 5
(excluding the original input designs of course)
Finally, instead of returning the designs ID, I need to find the corresponding article_id from table shop_tshirts with the matching design. What is the best way to do this? Should I do a separate query or another INNER JOIN?
Restating what we have to work with, so that the solution is more obvious:
+-----------+ +-----------------+ +--------------+
| shop_tags | | shop_tags_links | | shop_tshirts |
+-----------+ +-----------------+ +--------------+
| tag_id |----| tag_id | /-| design_id |
| tag_name | | design_id |-/ | article_id |
+-----------+ +-----------------+ +--------------+
design_id 1 tags = red, blue
design_id 2 tags = black, grey
design_id 3 tags = yellow, green
design_id 4 tags = white, red, black
design_id 5 tags = red
+-----------+ +-----------------+ +--------------+
| shop_tags | | shop_tags_links | | shop_tshirts |
+-----------+ +-----------------+ +--------------+
| 1 red | | 1 1 | | 1 ? |
| 2 blue | | 2 1 | | 2 ? |
| 3 black | | 3 2 | | 3 ? |
| 4 grey | | 4 2 | | 4 ? |
| 5 yellow | | 5 3 | | 5 ? |
| 6 green | | 6 3 |
| 7 white | | 7 4 |
| | | 1 4 |
| 3 4 |
| 1 5 |
"find designs with similar tags and sort them by the common tags count (desc)", so if the input is designs 1 and 2 we're looking for designs with tags "red, blue, black, grey" which would give the following result in this order :
design 4, design 5 (excluding the original input designs of course)"
This should get you pointed in the right direction. This is off the top of my head and untested, so it probably needs a little tweaking, but it should cover the main concepts.
select count(b.tag_id) as mysort, b.design_id, c.article_id
from shop_tags_links as a, shop_tags_links as b inner join shop_tshirts as c on b.design_id=c.design_id
where
a.design_id IN (?,?) AND
a.tag_id=b.tag_id AND
b.design_id NOT IN(?,?)
group by b.design_id
order by mysort desc