phpmysqlsqlinner-join

Find similar items and order by common tags count


I need to get the list of design_id with similar tags, ordered by entries having the most common tags count first

Tables structures:

shop_tags (tag_id, tag_name)
shop_tags_link (tag_id, design)
shop_tshirts (design_id, article_id)

Let's say I have 4 designs

design_id 1 tags = red, blue
design_id 2 tags = black, grey
design_id 3 tags = yellow, green
design_id 4 tags = white, red, black
design_id 5 tags = red

The input is an array of designs ID :

$input = array("1", "2");

I need to find designs with similar tags and sort them by the common tags count (desc), so if the input is designs 1 and 2 we're looking for designs with tags "red, blue, black, grey" which would give the following result in this order :

design 4, design 5 (excluding the original input designs of course)

Finally, instead of returning the designs ID, I need to find the corresponding article_id from table shop_tshirts with the matching design. What is the best way to do this? Should I do a separate query or another INNER JOIN?


Solution

  • Restating what we have to work with, so that the solution is more obvious:

    +-----------+    +-----------------+    +--------------+
    | shop_tags |    | shop_tags_links |    | shop_tshirts |
    +-----------+    +-----------------+    +--------------+
    | tag_id    |----| tag_id          |  /-| design_id    |
    | tag_name  |    | design_id       |-/  | article_id   |
    +-----------+    +-----------------+    +--------------+
    
    design_id 1 tags = red, blue
    design_id 2 tags = black, grey
    design_id 3 tags = yellow, green
    design_id 4 tags = white, red, black
    design_id 5 tags = red
    
    +-----------+   +-----------------+   +--------------+
    | shop_tags |   | shop_tags_links |   | shop_tshirts |
    +-----------+   +-----------------+   +--------------+
    | 1 red     |   |   1  1          |   | 1  ?         |
    | 2 blue    |   |   2  1          |   | 2  ?         |
    | 3 black   |   |   3  2          |   | 3  ?         |
    | 4 grey    |   |   4  2          |   | 4  ?         |
    | 5 yellow  |   |   5  3          |   | 5  ?         |
    | 6 green   |   |   6  3          |    
    | 7 white   |   |   7  4          |   
    |           |   |   1  4          |   
                    |   3  4          |   
                    |   1  5          |   
    

    "find designs with similar tags and sort them by the common tags count (desc)", so if the input is designs 1 and 2 we're looking for designs with tags "red, blue, black, grey" which would give the following result in this order :

    design 4, design 5 (excluding the original input designs of course)"

    This should get you pointed in the right direction. This is off the top of my head and untested, so it probably needs a little tweaking, but it should cover the main concepts.

    select count(b.tag_id) as mysort, b.design_id, c.article_id
    from shop_tags_links as a, shop_tags_links as b inner join shop_tshirts as c on b.design_id=c.design_id
    where 
        a.design_id IN (?,?) AND
        a.tag_id=b.tag_id AND
        b.design_id NOT IN(?,?) 
    group by b.design_id
    order by mysort desc