pythonpython-3.xscikit-learn

Trying to Find Optimal Price Point in a Data Set


I have a data set that looks like this.

Year    Quarter Quantity    Price
2000    1   23  142
2000    2   23  144
2000    3   23  147
2000    4   23  151
2001    1   22  160
2001    2   22  183
2001    3   22  186
2001    4   22  186
2002    1   21  212
2002    2   19  232
2002    3   19  223
2002    4   19  224
2003    1   19  231
2003    2   19  228
2003    3   19  238
2003    4   19  238
2004    1   19  234
2004    2   19  231
2004    3   20  239
2004    4   19  235
2005    1   19  233
2005    2   19  243
2005    3   20  244
2005    4   19  233
2006    1   19  234
2006    2   19  241

I am trying to figure out a pricing strategy to maximize revenue (optimal price) using Python. I found a few example online, but didn't know how to adapt them to my specific scenario.

This one looks good, but I'm not sure how to modify it to fit my data set.

https://www.datascience.com/resources/notebooks/python-dynamic-pricing

This one looks goo too, but again, I'm not sure how to make it work for me.

https://cscherrer.github.io/post/max-profit/

If someone here knows how to modify the sample code at one of those sites to fit the data I'm working with, I'd really like to see it. Or, if you have a link to another site that answers my question, please do share it. Thanks!


Solution

  • Fundamentally I don't think there's a enough data here to be able to implement a pricing strategy based on pure statistics. The differences in quantity are barely outside of the standard deviation (std 1.6, mean 20.2). However theoretically what you want to do is:

    1. convert the data to a more easily importable formate such as csv
    Year,Quarter,Quantity,Price
    2000,1,23,142
    2000,2,23,144
    2000,3,23,147
    2000,4,23,151
    2001,1,22,160
    2001,2,22,183
    2001,3,22,186
    2001,4,22,186
    2002,1,21,212
    2002,2,19,232
    2002,3,19,223
    2002,4,19,224
    2003,1,19,231
    2003,2,19,228
    2003,3,19,238
    2003,4,19,238
    2004,1,19,234
    2004,2,19,231
    2004,3,20,239
    2004,4,19,235
    2005,1,19,233
    2005,2,19,243
    2005,3,20,244
    2005,4,19,233
    2006,1,19,234
    2006,2,19,241
    
    1. load in the data prices = pd.read_csv("price_data.csv")
    2. make a graph to visually show change in price with respect to quantity sns.scatterplot(x=prices["Price"], y=prices["Quantity"]) Scatter plot showing relationship between price and quantity
    3. add columns for change in quantity and change in price
    prices["% Change in Quantity"] = prices["Quantity"].pct_change()
    prices["% Change in Price"] = prices["Price"].pct_change()
    
    1. calculate the price elasticity prices["Price Elasticity"] = prices["% Change in Quantity"] / prices["% Change in Price"]
    2. Graph price elasticity vs the price Scatter plot of price elasticity vs price of goods

    from this data you could then fit a model (depending on the complexity some order of polynomial makes sense), and use this to figure out at what point the price elasticity becomes too high and you wouldn't be making enough sales. But that's highly subjective and based more on business needs than anything else.