mysqlstatisticsanalyticschord-diagramtable-statistics

Creating chord diagram matrix in MySql / GROUP BY Syntax


I am doing some data analysis in MYSQL and hoping that the following is possible.

I have a table that documents user blog topics as follows. I have a total of 25 topics, and would like to abstract this enough that we can add topics later and not have to redo the query (if possible):

-----------------------
| user_id |  topic    |
-----------------------
|   01    |  art      | 
|   01    |  cooking  |
|   02    |  art      | 
|   03    |  outdoors |
|   03    |  art      |
|   03    |  cooking  |
-----------------------

In order to create a chord diagram (http://bl.ocks.org/4062006) from this data, I need to analyze the relationships between any two topics, either by grouping the rows by two topics, or by creating a matrix:

-----------------------------------------
| topic_combo            |  user_cnt    |
-----------------------------------------
|   art + cooking        |  2           | 
|   art + outdoors       |  1           |
|   cooking + outdoors   |  1           | 
-----------------------------------------

Or, even better...

---------------------------------------------
|          |  art  |  cooking  |  outdoors  |
---------------------------------------------
| art      |   3   |    2      |     1      |
| cooking  |   2   |    2      |     1      |
| outdoors |   1   |    1      |     1      |
---------------------------------------------

Are either of these possible in MYSQL? If so, what's the easiest way to do them?


Solution

  • Well, I do not know if this is the best way but at least it is one until you get a better answer. Use JOIN and GROUP BY as mentioned by Guvante.

    SELECT a.topic AS at, b.topic AS bt, COUNT(*) AS c FROM a JOIN a AS b ON a.user_id=b.user_id GROUP BY a.topic, b.topic;
    

    This will result in an output like this:

    at          bt          c
    -----------------------------
    art         art         3
    art         cooking     2
    art         outdoors    1
    cooking     art         2
    cooking     cooking     2
    cooking     outdoors    1
    outdoors    art         1
    outdoors    cooking     1
    outdoors    outdoors    1
    

    You can transform it into a matrix e.g. by PHP

    while($data = mysql_fetch_object($sql)) {
      $matrix[$data->at][$data->bt] = $data->c;
    }
    

    Edit: Removed duplication as mentioned in the comments after a hint of a friend.