statistics hierarchical-bayesian

BTYD: Prior model tweaking


I recently ran into a challenge with BTYD models, specifically the Pareto/NBD model. From the papers by Fader et al. that I have read, the model rests on a few assumptions, the first and foremost being:

i) Customers go through two stages in their “lifetime” with a specific firm: they are “alive” for some period of time, then become permanently inactive.

We want to challenge the "then become permanently inactive" part. Our brand value is very high and our products cost around 2 to 3 dollars per unit (think of a candy shop). Our working assumption from the business setting is: "Yes, our customers may pause every now and then, but judging from their behavior they will come back, and most likely they will never churn."

The reason I want to challenge it is that the model tends to underestimate the conditional expected purchase frequency over a very long horizon (say, 3 or 5 years from the point where we compute CLV). Using the p_alive function from the lifetimes Python package, after fitting to my RFM table, I can see that quite a large share of customers is predicted to churn after 400~500 days.

We want to modify this behavior so that p_alive decays to some "low value" (we are still deciding what that value should be) after "x" days and then stays at that value indefinitely.
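To make the effect of such a floor concrete, here is a minimal sketch at the population level. It is not the conditional p_alive that lifetimes computes for an individual customer. It assumes the standard Pareto/NBD setup, where the dropout rate is Gamma(s, beta) and the mean purchase rate is r / alpha, so the unconditional survival curve is (beta / (beta + t))^s. The parameter values and the floor of 0.2 are made up for illustration, and flooring the curve is an ad hoc adjustment rather than a proper probability model.

```python
import numpy as np

def pareto_survival(t, s, beta):
    """P(customer still alive at time t) when dropout rate ~ Gamma(shape=s, rate=beta)."""
    t = np.asarray(t, dtype=float)
    return (beta / (beta + t)) ** s

def floored_survival(t, s, beta, floor):
    """Ad hoc version: the survival curve is never allowed to drop below `floor`."""
    return np.maximum(pareto_survival(t, s, beta), floor)

def expected_purchases(t_end, r, alpha, s, beta, floor=0.0, n_grid=20001):
    """Expected purchases in (0, t_end] for a random new customer.

    Equals mean purchase rate (r / alpha) times the area under the
    (possibly floored) survival curve, computed with the trapezoid rule.
    """
    grid = np.linspace(0.0, t_end, n_grid)
    surv = floored_survival(grid, s, beta, floor)
    dt = grid[1] - grid[0]
    area = dt * (surv.sum() - 0.5 * (surv[0] + surv[-1]))
    return (r / alpha) * area

if __name__ == "__main__":
    # Hypothetical parameters, time measured in days.
    r, alpha, s, beta = 0.8, 40.0, 1.5, 300.0
    for years in (3, 5):
        t = 365.0 * years
        base = expected_purchases(t, r, alpha, s, beta)
        floored = expected_purchases(t, r, alpha, s, beta, floor=0.2)
        print(f"{years} years: no floor = {base:.2f}, floor 0.2 = {floored:.2f}")
```

With a floor, expected purchases keep growing linearly in the horizon once the curve hits the floor, which is why the choice of that value dominates any long-range CLV figure.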

I understand that this will very likely overestimate future visits and, in turn, CLV. However, it is the best approach I have come up with so far to compensate for the underestimation described above.

If we modify the priors (the Gamma distributions and the exponential distribution), how should I approach changing them so that the model suits my situation?
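One way to express "p_alive levels off at a low value" as a change to the prior, rather than as a post hoc clip, is to put a point mass at zero on the dropout rate: a fraction pi of customers never churns, and the rest follow the usual Gamma(s, beta) dropout prior. The sketch below is an illustration under that assumption, not something lifetimes provides. The name pi and all parameter values are hypothetical, and the quantities are unconditional (for a random new customer), not conditioned on an individual's RFM history.

```python
import numpy as np

def survival_with_immortals(t, s, beta, pi):
    """P(alive at t): fraction `pi` never churns, the rest follow Pareto/NBD dropout."""
    t = np.asarray(t, dtype=float)
    return pi + (1.0 - pi) * (beta / (beta + t)) ** s

def expected_purchases_with_immortals(t, r, alpha, s, beta, pi):
    """Expected purchases in (0, t] for a random new customer (requires s != 1)."""
    t = np.asarray(t, dtype=float)
    # Expected time alive within (0, t] for the customers who can churn.
    mortal_time = beta / (s - 1.0) * (1.0 - (beta / (beta + t)) ** (s - 1.0))
    return (r / alpha) * (pi * t + (1.0 - pi) * mortal_time)

def simulate_survival(t, s, beta, pi, n=200_000, seed=0):
    """Monte Carlo check of the survival formula above."""
    rng = np.random.default_rng(seed)
    immortal = rng.random(n) < pi
    mu = rng.gamma(shape=s, scale=1.0 / beta, size=n)  # dropout rates
    lifetime = rng.exponential(1.0 / mu)               # exponential lifetimes
    lifetime[immortal] = np.inf                        # these customers never churn
    return float(np.mean(lifetime > t))

if __name__ == "__main__":
    s, beta, pi = 1.5, 300.0, 0.2  # hypothetical values, time in days
    for t in (100.0, 500.0, 1825.0):
        exact = float(survival_with_immortals(t, s, beta, pi))
        approx = simulate_survival(t, s, beta, pi)
        print(f"t={t:6.0f}: formula={exact:.3f}, simulation={approx:.3f}")
```

Under this assumption the survival curve tends to pi as t grows, so pi plays the role of the long-run "low value". Estimating pi from data would require rederiving the likelihood, which this sketch does not attempt.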

Tried: calculating a retention table for every future month and using its value as p_alive once p_alive has dropped to a sufficiently low value.


Solution

  • Perhaps this note gives you a better insight into the Pareto/NBD model: https://brucehardie.com/notes/031/

    Are you trying to validate the Pareto/NBD model the way Fader et al. do in their paper? I would start there before over-complicating your problem. I don't see how you have shown that the Pareto/NBD model underestimates CLV.

    If it does underestimate, you can try a model in which the second stage is not 'death'. A good example is: https://faculty.wharton.upenn.edu/wp-content/uploads/2009/08/Schweidel_Fader_IJRM_2009.pdf (Sadly, no package is available.)