sqlteradata

Find max consecutive days with no sales in SQL


DBMS I am using is Teradata.

I have a table called Sales, with columns: Store, Item, DT, and Volume

This is a daily sales table, where Store and Item are numbers, and DT is a date.

Now, it could be the case that if a store/item combo did not have any sales for a certain date, that it could have no entry in the table, or its entry has a 0 as the value for volume. It is imperative that I consider both cases.

I would like to calculate the max consecutive days with no sales for a store/item combo, where the date range is between '2023-12-31' and '2024-03-30'

This is what I have come up with, but it unfortunately does not consider many cases:

SELECT 
    sq.STORE,
    sq.ITEM,
    MAX(sq.Consec_No_Sales_Days) AS Max_Consec_No_Sales_Days
FROM (
    SELECT 
        STORE,
        ITEM,
        DT - LAG(DT) 
        OVER(ORDER BY STORE,ITEM,DT) AS Consec_No_Sales_Days
    FROM
    SALES

    WHERE
    SALES.VOLUME > 0
    AND DT BETWEEN '2023-12-31' AND '2024-03-30'


    GROUP BY
    STORE,
    ITEM,
    DT
) sq

WHERE 
    sq.Consec_No_Sales_Days > 0 
GROUP BY
    sq.STORE,
    sq.ITEM

Solution

  • If you are fetching the data for a specific period ('2023-12-31' -'2024-03-30') and you have the data before start of period then what would be previous sales date for the STORE/ITEM having first sale within period on '2024-01-10' ???
    NOTE: In such a case the code below takes start of the period as previous sales date (using Coalesce function). The same has been done with end of the period against next sales date. If that is not ok - you should make the appropriate adjustment to meet your needs.

    /*  SALES table sample data
         STORE       ITEM DT             VOLUME
    ---------- ---------- ---------- ----------
            10        101 2023-11-30         12
            10        101 2023-12-31         10
            10        101 2024-01-25         21
            10        103 2024-03-25         32
            10        103 2024-03-30         10
            20        107 2024-01-30         54
            20        107 2024-02-12         23
            20        109 2024-03-01         43    
            30        110 2024-01-10         17   */
    

    Create inner query that excludes null or zero VOLUME rows and fetches just the rows within period (Where clause). Without null/zero volume rows LAG()/LEAD() Over() analytic functons will work ok if partition by and order by parts are ok. There is also a date math calculating DAY_DIFF column described in NOTE above. Outer query just aggregates and groups the resultset of the inner query.

    --      S Q L :
    SELECT    STORE, ITEM, Max(DAY_DIFF) AS MAX_DAY_DIFF
    FROM    ( Select    STORE, ITEM, DT, VOLUME, 
                         GREATEST( DT - Coalesce( LAG(DT) Over(Partition By STORE, ITEM Order By DT), To_Date('2023-12-31', 'yyyy-mm-dd') ),
                                   Coalesce( LEAD(DT) Over(Partition By STORE, ITEM Order By DT), To_Date('2024-03-30', 'yyyy-mm-dd') ) - DT ) AS DAY_DIFF
              From      sales
              Where     Coalesce(VOLUME, 0) > 0 And 
                        DT Between DATE '2023-12-31' And DATE '2024-03-30'
            )
    GROUP BY  STORE, ITEM
    HAVING    Max(DAY_DIFF) > 0
    ORDER BY  STORE, ITEM
    
    /*        R e s u l t : 
         STORE       ITEM MAX_DAY_DIFF
    ---------- ---------- ------------
            10        101           65
            10        103           85
            20        107           47
            20        109           61
            30        110           80  */