sqlwhere-clause

SQL - WHERE - variable dynamic filter


I'm trying to write a WHERE clause that will filter data differently for respective entities, based on time frames that are unique to each entity, like so:

import pandas as pd

product_dates = [
        {'name': 'product_1', 'start_date': '2024-01-01', 'end_date': '2024-05-01'},
        {'name': 'product_2', 'start_date': '2024-03-05', 'end_date': '2024-04-26'},
        {'name': 'product_3', 'start_date': '2024-02-09', 'end_date': '2024-11-08'}
]

df_product = pd.DataFrame(product_dates)

What I'd like to see for the output is the number of sales from each product that are between the given dates, like so:

output_data = [
        {'name': 'product_1', 'time_frame': '2024-01-01 to 2024-05-01', 'sales': 204},
        {'name': 'product_2', 'time_frame': '2024-03-05 to 2024-04-26', 'sales': 1890},
        {'name': 'product_3', 'time_frame': '2024-02-09 to 2024-11-08', 'sales': 766}
]


df_output = pd.DataFrame(output_data)

And to get that, there's a table of product sales structured like this:

sales_data = [
        {'name': 'product_1', 'date': '2024-01-01', 'sales': 1},
        {'name': 'product_2', 'date': '2024-01-01', 'sales': 3},
        {'name': 'product_3', 'date': '2024-01-01', 'sales': 4},
    ...
]


df_sales = pd.DataFrame(sales_data)

Theoretically the SQL clause would be:

SELECT name, time_frame, SUM(quantity) AS sales
FROM sales_data
WHERE date BETWEEN start_date AND end_date
GROUP BY ALL

But I'm not sure how to have the WHERE clause reflect the fact that the dates are variable by product.

Any tips? Thank you!


Solution

  • As per Dale K.s comment, In SQL you can use a JOIN for the sales_data and product_dates, this should create a link that you can use to filter the sales to a start date and end date then use SUM to add up the amount sold in that date range.

    It could look something like this:

    SELECT 
        s.name, 
        CONCAT(p.start_date, ' to ', p.end_date) AS time_frame, 
        SUM(s.sales) AS sales
    FROM sales_data s
    JOIN product_dates p ON s.name = p.name
    WHERE s.date BETWEEN p.start_date AND p.end_date
    GROUP BY s.name, p.start_date, p.end_date;