sqlteradata

Finding customers who've had multiple tickets within a 3-day period over the last month


I have a table that includes columns for receive_date, customer_id, and ticket_number. I want to find all the tickets for customers who've had more than 1 ticket created within 3 days of each other.

Tickets Table

row ticket_number customer_id receive_date
1 NY1 0001 2024-10-01
2 NY2 0002 2024-10-01
3 NY3 0001 2024-10-02
4 MD1 0003 2024-10-05
5 NY4 0001 2024-10-06
6 MD2 0003 2024-10-06
7 MD3 0003 2024-10-07

In the above example table, I'd want my query to return rows 1 & 2 for customer 0001, and rows 4, 6, & 7 for customer 3

Query Output

ticket_number customer_id receive_date
NY1 0001 2024-10-01
NY3 0001 2024-10-02
MD1 0003 2024-10-05
MD2 0003 2024-10-06
MD3 0003 2024-10-07

I feel like I should be using a Window Function of some kind, but I can't for the life of me figure out what I should be doing with it. Closest I've gotten is basically just counting all the times the customer_id shows up in the previous month:

SELECT customer_id,
    COUNT(customer_id)
FROM tickets
WHERE EXTRACT(
        YEAR
        FROM receive_date
    ) = YEARNUMBER_OF_CALENDAR(ADD_MONTHS(CURRENT_DATE, -1))
    AND EXTRACT(
        MONTH
        FROM receive_date
    ) = MONTHNUMBER_OF_YEAR(ADD_MONTHS(CURRENT_DATE, -1))
HAVING COUNT(customer_id) > 1
GROUP BY customer_id
ORDER BY COUNT(customer_id) DESC

Any pointers would be greatly appreciated.


Solution

  • If the next or previous row is within 3 days of the current row it's at least 2 rows.

    Untestet:

    SELECT *
    FROM tickets
    QUALIFY
       LAG(receive_date)    -- previous row within 3 days
       OVER (PARTITION BY customer_id ORDER BY receive_date ASC) >= receive_date - 2
    OR 
       LEAD(receive_date)   -- next row within 3 days
       OVER (PARTITION BY customer_id ORDER BY receive_date ASC) <= receive_date + 2