mysqlmariadbmariadb-10.3

MySQL syntax confusion - Merging together simple data for a graceful single-row insertion


I am building a MySQL table to aggregate a set of particular information for a customer. Much of it I've assembled and filtered successfully with simple INNER JOIN commands on records with shared keys/columns, but the pivoting/transposing of the data, even being of a fixed size, is still causing me some syntax confusion for my query. The schema for this table t_snapshots can be seen as follows:

+-------------+---------------------+------+-----+---------------------+----------------+
| Field       | Type                | Null | Key | Default             | Extra          |
+-------------+---------------------+------+-----+---------------------+----------------+
| id          | int(11)             | NO   | PRI | NULL                | auto_increment | 
| value_type  | tinyint(1) unsigned | YES  |     | NULL                |                |
| ch1_id      | varchar(20)         | YES  |     | NULL                |                |
| ch1_val     | float               | NO   |     | 0                   |                |
| ch2_id      | varchar(20)         | YES  |     | NULL                |                |
| ch2_val     | float               | NO   |     | 0                   |                |
| ch3_id      | varchar(20)         | YES  |     | NULL                |                |
| ch3_val     | float               | NO   |     | 0                   |                |
| ch4_id      | varchar(20)         | YES  |     | NULL                |                |
| ch4_val     | float               | NO   |     | 0                   |                |
| timestamp   | datetime            | NO   | MUL | current_timestamp() |                |
+-------------+---------------------+------+-----+---------------------+----------------+

Simply put, I want to select the most recently updated value from t_other_data0.ch[n] to send to t_snapshots.ch[n]_val and select the most recently updated id from t_id_pool.unit_id for each unique t_id_pool.channel_num to send to t_snapshots.ch[n]_id. t_id_pool.channel_num is what correlates to the n value of columns t_snapshots.ch[n]_val:

--EDIT--: Ideally, sample data from source table t_other_data0 that looks for the latest unit_id value from t_id_pool for channel_num=1,2,3,4 and outputs to table t_snapshots:

Gather latest process data from t_other_data0. In this case, rows with id 5-8 are selected because they span all distinct value_type with the latest timestamp.:

Table: t_other_data0
+----+------+------+------+------+------------+---------------------+
| id | ch1  | ch2  | ch3  | ch4  | value_type | timestamp           |
+----+------+------+------+------+------------+---------------------+
|  1 | 1.65 | 3.25 | 1.98 | 2.17 |          1 | 2021-07-22 16:26:40 |
|  2 | 3.12 | 2.33 | 6.42 | 3.22 |          2 | 2021-07-22 16:26:40 |
|  3 | 2.22 | 2.24 | 3.34 | 1.17 |          3 | 2021-07-22 16:26:40 |
|  4 | 1.52 | 1.34 |  1.9 | 2.01 |          4 | 2021-07-22 16:26:40 |
|  5 |  3.2 | 3.21 | 5.42 | 2.13 |          1 | 2021-07-22 16:26:50 |
|  6 | 1.55 | 1.92 | 4.32 | 4.12 |          2 | 2021-07-22 16:26:50 |
|  7 | 2.31 | 1.93 | 2.36 |  3.4 |          3 | 2021-07-22 16:26:50 |
|  8 | 1.78 | 2.17 | 5.62 | 2.34 |          4 | 2021-07-22 16:26:50 |
+----+------+------+------+------+------------+---------------------+

Because these permanent channels change what temporary equipment they're tied to, we determine the current unit_id for each channel_num using latest unit_id from t_id_pool:

Table: t_id_pool
+----+---------------------+-------------+---------+
| id | timestamp           | channel_num | unit_id |
+----+---------------------+-------------+---------+
|  1 | 2021-07-22 09:39:09 |           1 | S4251   |
|  2 | 2021-07-22 09:38:09 |           2 | S3552   |
|  3 | 2021-07-22 09:38:09 |           3 | S0001   |
|  4 | 2021-07-22 09:38:09 |           4 | S1001   |
|  5 | 2021-07-22 09:39:10 |           1 | P5251   |
|  6 | 2021-07-22 09:38:10 |           2 | P4552   |
|  7 | 2021-07-22 09:38:10 |           3 | P1001   |
|  8 | 2021-07-22 09:38:10 |           4 | P2001
+----+---------------------+-------------+---------+

Output to t_snapshots:

Table: t_snapshots
+-----+---------------------+------------+--------+---------+--------+---------+--------+---------+--------+---------+
| id  | timestamp           | value_type | ch1_id | ch1_val | ch2_id | ch2_val | ch3_id | ch3_val | ch4_id | ch4_val |
+-----+---------------------+------------+--------+---------+--------+---------+--------+---------+--------+---------+
| 211 | 2021-07-14 16:26:50 |          1 | P5251  |     3.2 | P4552  |    3.21 | P1001  |    5.42 | P2001  |    2.13 |
| 212 | 2021-07-14 16:26:50 |          2 | P5251  |    1.55 | P4552  |    1.92 | P1001  |    4.32 | P2001  |    4.12 |
| 213 | 2021-07-14 16:26:50 |          3 | P5251  |    2.31 | P4552  |    1.93 | P1001  |    2.36 | P2001  |     3.4 |
+-----+---------------------+------------+--------+---------+--------+---------+--------+---------+--------+---------+



 

Solution

  • Table t_other_data0 seems to be a pivot table. Therefore, I think the first step is to un-pivot it then join it with t_id_pool table to get the latest unit_id, then re-pivot it again. Maybe a query like this can work:

    SELECT 0 id, tod.timestamp, value_type,
           MAX(case when channel_num=1 THEN unit_id else 0 END) AS ch1_id,
           SUM(case when channel_num=1 then chan_val else 0 END) as ch1_val,
           MAX(CASE WHEN channel_num=2 THEN unit_id ELSE 0 END) AS ch2_id,
           SUM(CASE WHEN channel_num=2 THEN chan_val ELSE 0 END) AS ch2_val,
           MAX(CASE WHEN channel_num=3 THEN unit_id ELSE 0 END) AS ch3_id,
           SUM(CASE WHEN channel_num=3 THEN chan_val ELSE 0 END) AS ch3_val,
           MAX(CASE WHEN channel_num=4 THEN unit_id ELSE 0 END) AS ch4_id,
           SUM(CASE WHEN channel_num=4 THEN chan_val ELSE 0 END) AS ch4_val
    FROM   (SELECT value_type, ch1 AS chan_val, 1 AS chan_num, timestamp
            FROM   (SELECT *, Row_number() OVER (partition BY value_type ORDER BY id DESC) rn
                    FROM t_other_data0) AS A
            WHERE  rn = 1 UNION ALL
            SELECT value_type, ch2, 2,
                   timestamp
            FROM   (SELECT *, Row_number() OVER (partition BY value_type ORDER BY id DESC) rn
                    FROM t_other_data0) AS A
            WHERE  rn = 1 UNION ALL
            SELECT value_type, ch3, 3,
                   timestamp
            FROM   (SELECT *, Row_number() OVER (partition BY value_type ORDER BY id DESC) rn
                    FROM t_other_data0) AS A
            WHERE  rn = 1 UNION ALL
            SELECT value_type, ch4, 4,
                   timestamp
            FROM   (SELECT *, Row_number() OVER (partition BY value_type ORDER BY id DESC) rn
                    FROM t_other_data0) AS A
            WHERE  rn = 1) AS tod
           JOIN (SELECT id, timestamp, channel_num, unit_id, 
                        Row_number() OVER (partition BY channel_num ORDER BY timestamp DESC) rn
                 FROM t_id_pool) AS tip
             ON tod.chan_num = tip.channel_num AND tip.rn = 1
        GROUP BY tod.timestamp, value_type;
    

    One of the functions being used here is ROW_NUMBER() with a purpose to assign row number 1 on the latest value_type and channel_number timestamps. As for table t_other_data0, I'm using UNION ALL and there are 4 queries in total following the columns ch1, ch2, ch3 & ch4. Each of them I assign with a hardcoded chan_num according to which column I take.

    I'm not sure about the column id there of what to populate but I assume that since the main purpose of the query is to INSERT into another table, then maybe the id column is auto increment.

    Unfortunately dbfiddle.uk can't be use since yesterday so the fiddle here is for MySQL v8.0 instead of MariaDB 10.3. https://www.db-fiddle.com/f/xf1VmfYMbnGcabJS7dS6A1/1 . The result in the fiddle will have an extra row for t_other_data.id=8 (mentioned by @danblack in the comment) and will not include id=4 since your condition is "Gather latest process data from t_other_table0". But judging from your expected output, it seems like you didn't include id=4 so maybe there were some mistype in your description.