I'm building a table with a few joins, a few dimensions and a few metrics. The final select uses only one date column, mm.date below, which comes from the table that holds my main metric (mm.measure). The servers table also has a create_date column that I'm not selecting; it tells me when a server_component was created. As you can see, server_component '2' only shows up for the 1/4/2024 mm.date in the result, but it should show up on every row starting from 1/1/2024. In other words, the query should generate extra rows in the date column based on create_date from the servers table, even though create_date is not in the select.
select
s.server_location
, s.server_component
, mm.date
, max(mm.measure)
, count(distinct s.server_component)
from
servers s
left join main_metric_table mm
on s.server_component = mm.server_component
group by
s.server_location
, s.server_component
, mm.date
;
Current result example:
Desired result example:
So the goal is not to generate extra rows starting from min(mm.date), but to generate them from a different date column (create_date as the start date for each server_component). These new rows should have max(measure) as null (so the numbers are not inflated), but should still have the count() column filled.
I was thinking of some sort of dummy calendar table, but the tables are big and I'm not sure that approach would perform well.
You appear to want to generate a calendar, CROSS JOIN it to the servers table, and then aggregate:
WITH calendar (day) AS (
SELECT DATE '2024-01-01' + LEVEL - 1 AS dt
FROM DUAL
CONNECT BY LEVEL <= 4
)
select s.server_location
, s.server_component
, c.day
, max(mm.measure)
, COUNT(distinct s.server_component) OVER (PARTITION BY s.server_location)
AS server_component_count
from calendar c
CROSS JOIN servers s
LEFT OUTER JOIN main_metric_table mm
ON ( s.server_component = mm.server_component
AND c.day = mm.dt )
group by
s.server_location
, s.server_component
, c.day;
Note: server_component is included in the GROUP BY clause, so COUNT(DISTINCT server_component) can be at most 1 in each group (if there are different values, they will be in different groups). To get a count across multiple groups you need to use an analytic function.
Which, for the sample data:
CREATE TABLE servers (server_location, server_component) AS
SELECT 'EU', 1 FROM DUAL UNION ALL
SELECT 'EU', 2 FROM DUAL;
CREATE TABLE main_metric_table (dt, measure, server_component) AS
SELECT DATE '2024-01-01', 0.0173, 1 FROM DUAL UNION ALL
SELECT DATE '2024-01-02', 0.5783, 1 FROM DUAL UNION ALL
SELECT DATE '2024-01-03', 0.1200, 1 FROM DUAL UNION ALL
SELECT DATE '2024-01-04', 0.3734, 1 FROM DUAL UNION ALL
SELECT DATE '2024-01-04', 0.9194, 2 FROM DUAL;
Outputs:
SERVER_LOCATION | SERVER_COMPONENT | DAY | MAX(MM.MEASURE) | SERVER_COMPONENT_COUNT |
---|---|---|---|---|
EU | 1 | 2024-01-01 00:00:00 | .0173 | 2 |
EU | 1 | 2024-01-02 00:00:00 | .5783 | 2 |
EU | 1 | 2024-01-03 00:00:00 | .12 | 2 |
EU | 1 | 2024-01-04 00:00:00 | .3734 | 2 |
EU | 2 | 2024-01-01 00:00:00 | null | 2 |
EU | 2 | 2024-01-02 00:00:00 | null | 2 |
EU | 2 | 2024-01-03 00:00:00 | null | 2 |
EU | 2 | 2024-01-04 00:00:00 | .9194 | 2 |
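
The question asks for rows to start at each component's create_date rather than at a fixed day. A minimal sketch of that variation follows. It assumes that servers has a DATE column named create_date (mentioned in the question but absent from the sample data above) and that main_metric_table.dt holds midnight dates. The bounds CTE and the max_measure alias are names introduced here for illustration only. The calendar range is derived from the data instead of being hard-coded, and the CROSS JOIN becomes a join that keeps only the days on or after each server's creation date:

```sql
-- Assumption: servers.create_date exists and is a DATE (not in the sample data).
WITH bounds (start_day, end_day) AS (
  -- One row: earliest creation day and latest metric day.
  SELECT (SELECT MIN(TRUNC(create_date)) FROM servers),
         (SELECT MAX(TRUNC(dt)) FROM main_metric_table)
  FROM   DUAL
),
calendar (day) AS (
  -- One row per day between the two bounds, inclusive.
  SELECT start_day + LEVEL - 1
  FROM   bounds
  CONNECT BY LEVEL <= end_day - start_day + 1
)
SELECT s.server_location
     , s.server_component
     , c.day
     , MAX(mm.measure) AS max_measure
     , COUNT(DISTINCT s.server_component) OVER (PARTITION BY s.server_location)
         AS server_component_count
FROM   servers s
       -- Only generate days on or after the component was created.
       INNER JOIN calendar c
       ON c.day >= TRUNC(s.create_date)
       LEFT OUTER JOIN main_metric_table mm
       ON (    s.server_component = mm.server_component
           AND c.day = mm.dt )
GROUP BY
       s.server_location
     , s.server_component
     , c.day;
```

Days without a matching metric row still come back with a null max_measure, as in the output above. Because the calendar has only one row per day, it stays small even when the joined tables are large; whether the join performs acceptably on the real data would need to be checked against the actual execution plan.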