I am doing some data analysis in MYSQL and hoping that the following is possible.
I have a table that documents user blog topics as follows. I have a total of 25 topics, and would like to abstract this enough that we can add topics later and not have to redo the query (if possible):
-----------------------
| user_id | topic |
-----------------------
| 01 | art |
| 01 | cooking |
| 02 | art |
| 03 | outdoors |
| 03 | art |
| 03 | cooking |
-----------------------
In order to create a chord diagram (http://bl.ocks.org/4062006) from this data, I need to analyze the relationships between any two topics, either by grouping the rows by two topics, or by creating a matrix:
-----------------------------------------
| topic_combo | user_cnt |
-----------------------------------------
| art + cooking | 2 |
| art + outdoors | 1 |
| cooking + outdoors | 1 |
-----------------------------------------
Or, even better...
---------------------------------------------
| | art | cooking | outdoors |
---------------------------------------------
| art | 3 | 2 | 1 |
| cooking | 2 | 2 | 1 |
| outdoors | 1 | 1 | 1 |
---------------------------------------------
Are either of these possible in MYSQL? If so, what's the easiest way to do them?
Well, I do not know if this is the best way but at least it is one until you get a better answer. Use JOIN and GROUP BY as mentioned by Guvante.
SELECT a.topic AS at, b.topic AS bt, COUNT(*) AS c FROM a JOIN a AS b ON a.user_id=b.user_id GROUP BY a.topic, b.topic;
This will result in an output like this:
at bt c
-----------------------------
art art 3
art cooking 2
art outdoors 1
cooking art 2
cooking cooking 2
cooking outdoors 1
outdoors art 1
outdoors cooking 1
outdoors outdoors 1
You can transform it into a matrix e.g. by PHP
while($data = mysql_fetch_object($sql)) {
$matrix[$data->at][$data->bt] = $data->c;
}
Edit: Removed duplication as mentioned in the comments after a hint of a friend.