I've been asking a lot of questions using Clickhouse lately I hope someone can save me from this suffering.
I have multiple dates in my DB,
ex. my db has dates of everyday for month of may (2020-05-01 ~ 2020-05-31)
then I want to set 1st may and 31st may as starting date and end date
and get the SELECT result of dates in array with certain interval
for ex.
SELECT (some query about setting 2 timestamps as start / end and interval of 5 days)
then the expecting result will be
2020-05-01
2020-05-05
2020-05-10
2020-05-15 . . .goes on till 30
and I want this interval to be month, day, hour, minute, second, milisecond
but as far as I've studied about clickhouse, is it true that we cant use milisecond with clickhouse ????
if so, do I have to convert the date into UInt64 then do some interval tricks to get the result UInt64 and then cast it to datetime ???
please help me :(
example of input data:
start_at
2020-01-14 18:04:36.000
2020-01-14 19:07:48.000
2020-01-14 20:46:48.000
2020-01-14 23:21:12.000
2020-01-15 00:02:00.000
2020-01-15 03:36:00.000
2020-01-15 04:54:24.000
2020-01-15 08:04:00.000
2020-01-15 09:04:00.000
2020-01-15 10:04:00.000
2020-01-15 11:04:00.000
2020-01-15 14:04:00.000
2020-01-15 18:04:00.000
2020-01-16 11:04:00.000
2020-01-16 17:04:00.000
2020-01-16 17:04:00.000
2020-01-17 11:04:00.000
2020-01-17 18:04:00.000
2020-01-17 20:04:00.000
2020-01-18 01:04:00.000
2020-01-18 15:04:00.000
expected result (with interval of 2 days for example)
time count
2020-01-14 18:04:36.000
2020-01-16 18:04:36.000
2020-01-18 18:04:36.000
or with interval 1 day
time count
2020-01-14 18:04:36.000
2020-01-15 18:04:36.000
2020-01-16 18:04:36.000
2020-01-17 18:04:36.000
2020-01-18 18:04:36.000
or with 12 hours
time count
2020-01-14 18:04:36.000
2020-01-15 06:04:36.000
2020-01-15 18:04:36.000
2020-01-16 06:04:36.000
2020-01-16 18:04:36.000
2020-01-17 06:04:36.000
2020-01-17 18:04:36.000
2020-01-18 06:04:36.000
2020-01-18 18:04:36.000
Try this query:
WITH
toDateTime64('2020-01-14 18:04:36.000', 3) AS start_from,
toUnixTimestamp64Milli(start_from) AS start_from_ts,
((12 * 60) * 60) * 1000 AS interval_msec
SELECT
fromUnixTimestamp64Milli(toInt64(ts)) dt,
count
FROM (
SELECT
start_from_ts + interval_msec * interval_number AS ts,
floor((toUnixTimestamp64Milli(start_at) - start_from_ts) / interval_msec) AS interval_number,
count() AS count
FROM
(
/* emulate the test dataset */
SELECT toDateTime64(dt, 3) AS start_at
FROM (
SELECT arrayJoin([
('2020-01-14 18:04:36.000'),
('2020-01-14 19:07:48.000'),
('2020-01-14 20:46:48.000'),
('2020-01-14 23:21:12.000'),
('2020-01-15 00:02:00.000'),
('2020-01-15 03:36:00.000'),
('2020-01-15 04:54:24.000'),
('2020-01-15 08:04:00.000'),
('2020-01-15 09:04:00.000'),
('2020-01-15 10:04:00.000'),
('2020-01-15 11:04:00.000'),
('2020-01-15 14:04:00.000'),
('2020-01-15 18:04:00.000'),
('2020-01-16 11:04:00.000'),
('2020-01-16 17:04:00.000'),
('2020-01-16 17:04:00.000'),
('2020-01-17 11:04:00.000'),
('2020-01-17 18:04:00.000'),
('2020-01-17 20:04:00.000'),
('2020-01-18 01:04:00.000'),
('2020-01-18 15:04:00.000')]) dt)
)
WHERE start_at >= start_from
GROUP BY interval_number
ORDER BY ts WITH FILL FROM toUnixTimestamp64Milli(toDateTime64('2020-01-14 18:04:36.000', 3)) TO toUnixTimestamp64Milli(toDateTime64('2020-01-18 18:04:36.000', 3)) STEP ((12 * 60) * 60) * 1000
)
/* result
┌──────────────────────dt─┬─count─┐
│ 2020-01-14 18:04:36.000 │ 7 │
│ 2020-01-15 06:04:36.000 │ 6 │
│ 2020-01-15 18:04:36.000 │ 0 │
│ 2020-01-16 06:04:36.000 │ 3 │
│ 2020-01-16 18:04:36.000 │ 0 │
│ 2020-01-17 06:04:36.000 │ 2 │
│ 2020-01-17 18:04:36.000 │ 2 │
│ 2020-01-18 06:04:36.000 │ 1 │
└─────────────────────────┴───────┘
*/
The query above calculates values for the interval 12 hours.
To apply it to other intervals needs to modify: