I have a database where I store a 1-minute snapshot for various exchange's product's best bid and ask. I have a requirement where I need to find:
product1 & exch1
, product2 & exch2
,product3 & exch3
)I am using MySQL 8.0.35 on Ubuntu 20.04
My table has the following structure
symbol(varchar)
exchange(varchar)
bid_price(double)
ask_price(double)
timestamp(bigint) # actual best bid ask received time from exchange (millisecond precision)
updated_at(bigint) # minutes precision stored as seconds since epoch e.g. 1702954500102
symbol,exchange,bid_price,ask_price,timestamp,updated_at
prod1,exch1,555.7,555.8,1702956540603,1702956540
prod1,exch1,555.8,555.9,1702956478591,1702956480
prod1,exch1,555.9,556,1702956420102,1702956420
prod1,exch1,556,556.1,1702956360610,1702956360
prod1,exch1,556,556.1,1702956299607,1702956300
prod1,exch1,556.1,556.2,1702956239595,1702956240
prod1,exch1,556.2,556.3,1702956179597,1702956180
prod1,exch1,556,556.1,1702956122111,1702956120
prod1,exch1,556,556.1,1702956061081,1702956060
prod1,exch1,555.8,555.9,1702955998590,1702956000
prod2,exch2,7.1041,7.1042,1702956600942,1702956600
prod2,exch2,7.104,7.1042,1702956541012,1702956540
prod2,exch2,7.1043,7.1045,1702956480465,1702956480
prod2,exch2,7.1043,7.1044,1702956420617,1702956420
prod2,exch2,7.1043,7.1044,1702956360264,1702956360
prod2,exch2,7.1037,7.1038,1702956300269,1702956300
prod2,exch2,7.1039,7.1041,1702956239092,1702956240
prod2,exch2,7.1041,7.1043,1702956180001,1702956180
prod2,exch2,7.1043,7.1045,1702956123891,1702956120
prod2,exch2,7.1045,7.1047,1702956060588,1702956060
prod3,exch3,72.78,72.79,1702956600332,1702956600
prod3,exch3,72.79,72.8,1702956540698,1702956540
prod3,exch3,72.8,72.81,1702956480542,1702956480
prod3,exch3,72.79,72.81,1702956420228,1702956420
prod3,exch3,72.8,72.81,1702956352133,1702956360
prod3,exch3,72.81,72.82,1702956287733,1702956300
prod3,exch3,72.81,72.82,1702956239441,1702956240
prod3,exch3,72.82,72.83,1702956179253,1702956180
prod3,exch3,72.81,72.82,1702956124140,1702956120
prod3,exch3,72.81,72.82,1702956058999,1702956060
With following query I am able to do requirement #1 and #2, but #3 is something I am unable to figure out yet
(SELECT *, FROM_UNIXTIME(updated_at, '%H%i') AS u_time
FROM algo_system.bbo_1min
WHERE
(
(symbol = 'prod1' AND exchange = 'exch1')
OR
(symbol = 'prod2' AND exchange = 'exch2')
OR
(symbol = 'prod3' AND exchange = 'exch3')
)
HAVING
(u_time >= 900 AND u_time <= 1014)
OR
(u_time >= 1030 AND u_time <= 1129)
ORDER BY TIMESTAMP DESC LIMIT 500)
Edit: From the image below, as can see the highlighted red section has prod2 and prod3 data for 21:00 but prod1 doesn't have, I want to remove the prod2 and prod3 data if there is no datapoint for prod1 at 21:00
A sub join or a where exists query may work. Do consider performance also -- I have no idea if the sub query will slow your work down too much.
Apologies for not answering in MySQL.
Here is an example in MsSQL using an inner join -- you can see where I mildly adapted to account for the difference between MsSQL and MySQL. You should be able to work with the idea though.
SELECT
symbol,
exchange,
bid_price,
ask_price,
--timestamp,
FORMAT(dateadd(S, convert(bigint, timestamp) / 1000, '1970-01-01'),'hhmm')+700 AS timestamp,
--updated_at
FORMAT(dateadd(S, convert(INT, updated_at), '1970-01-01'),'hhmm')+700 AS updated_at
into #tmp_db
FROM(
values
('prod1','exch1',7.1041,7.1042, '1702956600942', '1702956600'), -- 1030 for testing
('prod1','exch1',555.7,555.8, '1702956540603', '1702956540'),
('prod1','exch1',555.8,555.9, '1702956478591', '1702956480'),
('prod1','exch1',555.9,556, '1702956420102', '1702956420'),
('prod1','exch1',556,556.1, '1702956360610', '1702956360'),
('prod1','exch1',556,556.1, '1702956299607', '1702956300'),
('prod1','exch1',556.1,556.2, '1702956239595', '1702956240'),
('prod1','exch1',556.2,556.3, '1702956179597', '1702956180'),
('prod1','exch1',556,556.1, '1702956122111', '1702956120'),
('prod1','exch1',556,556.1, '1702956061081', '1702956060'),
('prod1','exch1',555.8,555.9, '1702955998590', '1702956000'),
('prod2','exch2',7.1041,7.1042, '1702956600942', '1702956600'), -- 1030 for testing
('prod2','exch2',7.104,7.1042, '1702956541012', '1702956540'),
('prod2','exch2',7.1043,7.1045, '1702956480465', '1702956480'),
('prod2','exch2',7.1043,7.1044, '1702956420617', '1702956420'),
('prod2','exch2',7.1043,7.1044, '1702956360264', '1702956360'),
('prod2','exch2',7.1037,7.1038, '1702956300269', '1702956300'),
('prod2','exch2',7.1039,7.1041, '1702956239092', '1702956240'),
('prod2','exch2',7.1041,7.1043, '1702956180001', '1702956180'),
('prod2','exch2',7.1043,7.1045, '1702956123891', '1702956120'),
('prod2','exch2',7.1045,7.1047, '1702956060588', '1702956060'),
('prod3','exch3',72.78,72.79, '1702956600332', '1702956600'), -- 1030 for testing
('prod3','exch3',72.79,72.8, '1702956540698', '1702956540'),
('prod3','exch3',72.8,72.81, '1702956480542', '1702956480'),
('prod3','exch3',72.79,72.81, '1702956420228', '1702956420'),
('prod3','exch3',72.8,72.81, '1702956352133', '1702956360'),
('prod3','exch3',72.81,72.82, '1702956287733', '1702956300'),
('prod3','exch3',72.81,72.82, '1702956239441', '1702956240'),
('prod3','exch3',72.82,72.83, '1702956179253', '1702956180'),
('prod3','exch3',72.81,72.82, '1702956124140', '1702956120'),
('prod3','exch3',72.81,72.82, '1702956058999', '1702956060')
) AS x (symbol, exchange, bid_price, ask_price, timestamp, updated_at)
SELECT top 500
*
--FROM_UNIXTIME(updated_at, '%H%i') AS u_time
FROM #tmp_db
JOIN (
SELECT
count(DISTINCT symbol) as product_count,
updated_at as sub_updated_at
FROM #tmp_db
WHERE
(
(symbol = 'prod1' AND exchange = 'exch1')
OR
(symbol = 'prod2' AND exchange = 'exch2')
OR
(symbol = 'prod3' AND exchange = 'exch3')
)
GROUP BY updated_at
) as x on x.product_count = 3 and x.sub_updated_at = updated_at
WHERE
(
(symbol = 'prod1' AND exchange = 'exch1')
OR
(symbol = 'prod2' AND exchange = 'exch2')
OR
(symbol = 'prod3' AND exchange = 'exch3')
)
--HAVING
-- (updated_at >= 900 AND updated_at <= 1014)
-- OR
-- (updated_at >= 1030 AND updated_at <= 1129)
and (
(updated_at >= 900 AND updated_at <= 1014)
OR
(updated_at >= 1030 AND updated_at <= 1129)
)
ORDER BY updated_at DESC --LIMIT 500
drop table #tmp_db
all three if the prod1 is included
none if the prod1 is not showing
Updated: added "DISTINCT symbol" to ensure all three products are counted, in case a prod can show more than once.