oracle-databaseoracle-analytics

Using Oracle analytics for average


How should I use the analytics to give me a rolling average where the sample size changes?

create table MyVals  (Item_no char(10), epoch number, Yield number, Skip_Period char(1), Reset_Period char(1));

insert into MyVals values ('A00001',1705, 12, 'N','N');     /* 17.18181818  average of epochs 1705..1610 & 1607..1606 */    
insert into MyVals values ('A00001',1704, 13, 'N','N');     /* 19.45454545  average of epochs 1704..1610 & 1607..1605 */    
insert into MyVals values ('A00001',1703, 9,  'N','N');     /* 20.36363636  average of epochs 1703..1610 & 1607..1604 */    
insert into MyVals values ('A00001',1702, 11, 'N','N');     /* 21.5     average of epochs 1702..1610 & 1607..1604 */    
insert into MyVals values ('A00001',1701, 4,  'N','N');     /* 22.66666667  average of epochs 1701..1610 & 1607..1604 */    
insert into MyVals values ('A00001',1613, 16, 'N','N');     /* 25       average of epochs 1613..1610 & 1607..1604 */    
insert into MyVals values ('A00001',1612, 33, 'N','N');     /* 26.28571429  average of epochs 1612..1610 & 1607..1604 */    
insert into MyVals values ('A00001',1611, 2,  'N','N');     /* 25.16666667  average of epochs 1611..1610 & 1607..1604 */    
insert into MyVals values ('A00001',1610, 1,  'N','N');     /* 29.8     average of epochs 1610 & 1607..1604       */    
insert into MyVals values ('A00001',1609, 66, 'Y','N');     /* 37       average of epochs 1607..1604              */    
insert into MyVals values ('A00001',1608, 23, 'Y','N');     /* 37       average of epochs 1607..1604              */    
insert into MyVals values ('A00001',1607, 22, 'N','N');     /* 37       average of epochs 1607..1604              */    
insert into MyVals values ('A00001',1606, 66, 'N','N');     /* 42       average of epochs 1606..1604              */    
insert into MyVals values ('A00001',1605, 37, 'N','N');     /* 30       average of epochs 1605..1604              */    
insert into MyVals values ('A00001',1604, 23, 'N','Y');     /* 23       average of epochs 1604                    */    
insert into MyVals values ('A00001',1603, 77, 'N','N');     /* 44.83333333  average of epochs 1603..1511              */    
insert into MyVals values ('A00001',1602, 15, 'N','N');     /* 38.4     average of epochs 1602..1511              */    
insert into MyVals values ('A00001',1601, 82, 'N','N');     /* 44.25    average of epochs 1601..1511              */    
insert into MyVals values ('A00001',1513, 4,  'N','N');     /* 31.66666667  average of epochs 1513..1511              */    
insert into MyVals values ('A00001',1512, 7,  'N','N');     /* 45.5     average of epochs 1512..1511              */    
insert into MyVals values ('A00001',1511, 84, 'N','N');     /* 84       average of epochs 1511                    */    

How do I get the the average Yield of the preceding maximum 13 records where Skip_Period = 'N' while Reset_Period = 'N'

So the window for the average changes depending on the value of skip_Period and Reset_Period based on these rules:

If a row has Reset_Period = 'Y', then don't go back any further than that record. If a row has Skip_period = 'Y', then exclude that period from the average sample

I can't figure how to create a range between expression that will give me the rolling average I need using analytics.

Any suggestions welcome :)


Solution

  • I think this is what you're after:

    WITH res AS (SELECT item_no,
                        epoch_number,
                        yield,
                        skip_period,
                        reset_period,
                        SUM(CASE WHEN reset_period = 'Y' THEN 1 ELSE 0 END) OVER (PARTITION BY item_no ORDER BY epoch_number) grp
                 FROM   myvals)
    SELECT item_no,
           epoch_number,
           yield,
           skip_period,
           reset_period,
           grp,
           AVG(CASE WHEN skip_period = 'N' THEN yield END) OVER (PARTITION BY item_no, grp
                                                                 ORDER BY epoch_number
                                                                 rows 12 preceding) rolling_avg_yield
    FROM   res
    ORDER BY epoch_number DESC;
    
    ITEM_NO EPOCH_NUMBER      YIELD SKIP_PERIOD RESET_PERIOD        GRP ROLLING_AVG_YIELD
    ------- ------------ ---------- ----------- ------------ ---------- -----------------
    A00001          1705         12 N           N                     1  17.1818181818182
    A00001          1704         13 N           N                     1  19.4545454545455
    A00001          1703          9 N           N                     1  20.3636363636364
    A00001          1702         11 N           N                     1              21.5
    A00001          1701          4 N           N                     1  22.6666666666667
    A00001          1613         16 N           N                     1                25
    A00001          1612         33 N           N                     1  26.2857142857143
    A00001          1611          2 N           N                     1  25.1666666666667
    A00001          1610          1 N           N                     1              29.8
    A00001          1609         66 Y           N                     1                37
    A00001          1608         23 Y           N                     1                37
    A00001          1607         22 N           N                     1                37
    A00001          1606         66 N           N                     1                42
    A00001          1605         37 N           N                     1                30
    A00001          1604         23 N           Y                     1                23
    A00001          1603         77 N           N                     0  44.8333333333333
    A00001          1602         15 N           N                     0              38.4
    A00001          1601         82 N           N                     0             44.25
    A00001          1513          4 N           N                     0  31.6666666666667
    A00001          1512          7 N           N                     0              45.5
    A00001          1511         84 N           N                     0                84
    

    First of all, you need to work out the groups you're averaging over. We can do this by generating values of 1 or 0 based on whether the reporting group changes or not, and then doing a running sum across those values.

    Once we have that, it's just a matter of including that column in the partition by and then doing a conditional avg if the skip period is N, over the current and 12 preceding rows.