I have data like the one below and am looking to get it grouped based on the time intervals where it is repeating for col1+col2 with start and end dates. Can anyone provide some tips to get the result in Oracle?
Col1 Col2 Date_col Quantity
I1 L1 1-Jul-24 10
I1 L1 2-Jul-24 10
I1 L1 3-Jul-24 10
---------------------------------
I1 L1 12-Jul-24 10
I1 L1 13-Jul-24 10
I1 L1 14-Jul-24 10
---------------------------------
I2 L2 5-Jul-24 26
I2 L2 6-Jul-24 26
I2 L2 7-Jul-24 26
I2 L2 8-Jul-24 26
---------------------------------
I2 L2 10-Jul-24 34
I2 L2 11-Jul-24 34
---------------------------------
I2 L2 12-Jul-24 28
I2 L2 13-Jul-24 28
I2 L2 14-Jul-24 28
---------------------------------
I2 L2 21-Jul-24 20
I2 L2 22-Jul-24 20
I2 L2 23-Jul-24 20
I2 L2 24-Jul-24 20
Need to aggregate data grouped by col1 and col2 and get the data with start and end dates as the first and last dates of the pattern.
Desired Output
Col1 Col2 Startdate Enddate Quantity
I1 L1 1/Jul/2024 3/Jul/2024 10
I1 L1 12/Jul/2024 14/Jul/2024 10
I2 L2 5/Jul/2024 8/Jul/2024 26
I2 L2 10/Jul/2024 11/Jul/2024 34
I2 L2 12/Jul/2024 14/Jul/2024 28
I2 L2 21/Jul/2024 24/Jul/2024 20
Below are the DDLs to create the above data.
Create table tab1 as
select 'I1' col1,'L1' col2,to_date('01-07-2024','DD-MM-YYYY') Date_col,10 quantity from dual
union
select 'I1','L1',to_date('02-07-2024','DD-MM-YYYY'),10 from dual
union
select 'I1','L1',to_date('03-07-2024','DD-MM-YYYY'),10 from dual
union
select 'I1','L1',to_date('12-07-2024','DD-MM-YYYY'),10 from dual
union
select 'I1','L1',to_date('13-07-2024','DD-MM-YYYY'),10 from dual
union
select 'I1','L1',to_date('14-07-2024','DD-MM-YYYY'),10 from dual
union
select 'I2','L2',to_date('05-07-2024','DD-MM-YYYY'),26 from dual
union
select 'I2','L2',to_date('06-07-2024','DD-MM-YYYY'),26 from dual
union
select 'I2','L2',to_date('07-07-2024','DD-MM-YYYY'),26 from dual
union
select 'I2','L2',to_date('08-07-2024','DD-MM-YYYY'),26 from dual
union
select 'I2','L2',to_date('10-07-2024','DD-MM-YYYY'),34 from dual
union
select 'I2','L2',to_date('11-07-2024','DD-MM-YYYY'),34 from dual
union
select 'I2','L2',to_date('12-07-2024','DD-MM-YYYY'),28 from dual
union
select 'I2','L2',to_date('13-07-2024','DD-MM-YYYY'),28 from dual
union
select 'I2','L2',to_date('14-07-2024','DD-MM-YYYY'),28 from dual
union
select 'I2','L2',to_date('21-07-2024','DD-MM-YYYY'),20 from dual
union
select 'I2','L2',to_date('22-07-2024','DD-MM-YYYY'),20 from dual
union
select 'I2','L2',to_date('23-07-2024','DD-MM-YYYY'),20 from dual
union
select 'I2','L2',to_date('24-07-2024','DD-MM-YYYY'),20 from dual;
From Oracle 12, you can use MATCH_RECOGNIZE
to perform row-by-row pattern matching:
SELECT *
FROM tab1
MATCH_RECOGNIZE(
PARTITION BY col1, col2
ORDER BY date_col
MEASURES
FIRST(quantity) AS quantity,
FIRST(date_col) AS start_date,
LAST(date_col) AS end_date
PATTERN (first_row consecutive*)
DEFINE
consecutive AS quantity = FIRST(quantity)
AND date_col <= PREV(date_col) + INTERVAL '1' DAY
)
Note: You (probably) do not want to include quantity
in the columns you are partitioning by and, instead, check for identical quantities in the pattern matching.
Which, for the sample data:
Create table tab1 (col1, col2, date_col, quantity) as
SELECT 'I1','L1',DATE '2024-07-01',10 FROM DUAL UNION ALL
SELECT 'I1','L1',DATE '2024-07-02',10 FROM DUAL UNION ALL
SELECT 'I1','L1',DATE '2024-07-03',10 FROM DUAL UNION ALL
SELECT 'I1','L1',DATE '2024-07-12',10 FROM DUAL UNION ALL
SELECT 'I1','L1',DATE '2024-07-13',10 FROM DUAL UNION ALL
SELECT 'I1','L1',DATE '2024-07-14',10 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-05',26 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-06',26 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-07',26 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-08',26 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-10',34 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-11',34 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-12',28 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-13',28 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-14',28 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-21',20 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-22',20 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-23',20 FROM DUAL UNION ALL
SELECT 'I2','L2',DATE '2024-07-24',20 FROM DUAL;
Outputs:
COL1 | COL2 | QUANTITY | START_DATE | END_DATE |
---|---|---|---|---|
I1 | L1 | 10 | 2024-07-01 00:00:00 | 2024-07-03 00:00:00 |
I1 | L1 | 10 | 2024-07-12 00:00:00 | 2024-07-14 00:00:00 |
I2 | L2 | 26 | 2024-07-05 00:00:00 | 2024-07-08 00:00:00 |
I2 | L2 | 34 | 2024-07-10 00:00:00 | 2024-07-11 00:00:00 |
I2 | L2 | 28 | 2024-07-12 00:00:00 | 2024-07-14 00:00:00 |
I2 | L2 | 20 | 2024-07-21 00:00:00 | 2024-07-24 00:00:00 |
In earlier versions you can use:
SELECT col1,
col2,
MIN(date_col) AS start_date,
MAX(date_col) AS end_date,
MAX(quantity) AS quantity
FROM (
SELECT col1,
col2,
date_col,
quantity,
SUM(has_changed_group) OVER (PARTITION BY col1, col2 ORDER BY date_col)
AS grp
FROM (
SELECT col1,
col2,
date_col,
quantity,
CASE
WHEN quantity = LAG(quantity) OVER (PARTITION BY col1, col2 ORDER BY date_col)
AND date_col <= LAG(date_col) OVER (PARTITION BY col1, col2 ORDER BY date_col)
+ INTERVAL '1' DAY
THEN 0
ELSE 1
END AS has_changed_group
FROM tab1 t
)
)
GROUP BY col1, col2, grp