I am quite new to mysql. I have 2 identical mysql tables which have 50K rows (70 columns) each. Those tables are updated everyday by a datafeed. I need to execute some nested queries like intersections / substractions etc.
One of the queries I try to use is as below. But it doesn't work properly. Either it takes 5 min. to 10 min. (through terminal) or it does not respond back.
SELECT *
FROM table1
WHERE table1.sku IN (SELECT t1.sku
FROM ((SELECT DISTINCT sku
FROM table2)
UNION ALL
(SELECT DISTINCT sku
FROM table1)) AS t1
GROUP BY sku
HAVING Count(*) >= 2)
How can I make it work faster/properly? How should I configure the tables/columns (index, primary key etc.) Or do I need to make any tuning on the mysql server?
I tried several things. I created indexes on the 'sku' which are varchar(75) columns. My database server runs on a 1 CoreProcessor (Digital Ocean) server with 512MB Memory.
--- query with 'EXPLAIN'
+----+--------------------+-----------------------+-------+---------------+---------+---------+------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------------------+-------+---------------+---------+---------+------+-------+---------------------------------+
| 1 | PRIMARY | table1 | ALL | NULL | NULL | NULL | NULL | 30260 | Using where |
| 2 | DEPENDENT SUBQUERY | <derived3> | ALL | NULL | NULL | NULL | NULL | 65677 | Using temporary; Using filesort |
| 3 | DERIVED | table2 | range | NULL | sku_idx | 227 | NULL | 31016 | Using index for group-by |
| 4 | UNION | table1 | range | NULL | sku | 227 | NULL | 30261 | Using index for group-by |
| NULL | UNION RESULT | <union3,4> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------------+-----------------------+-------+---------------+---------+---------+------+-------+---------------------------------+
If I understand this particular query correctly, you are trying to display all the records from table1 which have a corresponding sku
in table2.
That can be achieved by a much simpler query:
SELECT *
FROM table1
WHERE table1.sku IN (SELECT DISTINCT table2.sku FROM table2 )
GROUP BY table1.sku
Or, with joins:
SELECT table1.*
FROM table1
INNER JOIN table2 ON table1.sku = table2.sku
GROUP BY table1.sku
This should work in an instant if you have indexes on table1.sku
and table2.sku