I am trying to summarize some data from two data sources. The tables are following
Table t1
auto_id | unique_column | value |
---|---|---|
1 | d1 | 10 |
2 | d2 | 5 |
3 | d3 | 15 |
Table t2
auto_id | unique_column | value |
---|---|---|
null | d3 | 5 |
null | d4 | 6 |
Expected data in t1 after query:
auto_id | unique_column | value |
---|---|---|
1 | d1 | 10 |
2 | d2 | 5 |
3 | d3 | 20 |
4 | d4 | 6 |
To solve this I was planning to do the following query
REPLACE INTO t1
SELECT auto_id, unique_column, SUM(value) value FROM
(
SELECT * FROM t1
UNION ALL
SELECT * FROM t2
) GROUP BY unique_column;
Issue with this query
SELECT auto_id, unique_column, SUM(value) value FROM
(
SELECT * FROM t1
UNION ALL
SELECT * FROM t2
) GROUP BY unique_column;
The above query will give us
auto_id | unique_column | value |
---|---|---|
1 | d1 | 10 |
2 | d2 | 5 |
3 | d3 | 20 |
NULL | d4 | 6 |
Which is okay. In the insert query we will get the following result:
auto_id | unique_column | value |
---|---|---|
1 | d1 | 10 |
2 | d2 | 5 |
3 | d3 | 20 |
4 | d4 | 6 |
But the following query
SELECT auto_id, unique_column, SUM(value) value FROM
(
SELECT * FROM t2
UNION ALL
SELECT * FROM t1
) GROUP BY unique_column;
will give us
auto_id | unique_column | value |
---|---|---|
1 | d1 | 10 |
2 | d2 | 5 |
NULL | d3 | 20 |
NULL | d4 | 6 |
So, Insert query will product the following unexpected result
auto_id | unique_column | value |
---|---|---|
1 | d1 | 10 |
2 | d2 | 5 |
3 | d3 | 15 |
4 | d3 | 20 |
5 | d4 | 6 |
Alternative thought was, full outer join
but mysql decided not to implement this. What other alternative can be done with mysql?
You were close:
REPLACE INTO t1
SELECT MAX(auto_id), unique_column, SUM(value) value FROM
(
SELECT * FROM t1
UNION ALL
SELECT * FROM t2
) t
GROUP BY unique_column;
MAX(auto_id)
will skip NULL
fields, which will give us just 3
for 'd3'
; for the 'd4'
case, auto_id
will still be NULL
, but (as long as the column is declared as AUTO_INCREMENT NOT NULL
), NULL
will be replaced by the next number in the sequence.
EDIT for your edit: as long as you properly aggregate auto_id
using MAX
(or MIN
), the only scenario NULL
appears is if you have no prior auto_id
; it is not sensitive to ordering.