I am trying to fit a distribution to some data using scipy.stats.fit
. I specifically need the stats.nbinom
distribution to be fit using a non-integer value for parameter n
.
I've made this minimal example of what I'd like to achieve. I can't see what I've done incorrectly.
This is a modified version of the example here
from scipy import stats
from scipy.optimize import differential_evolution
import numpy as np
rng = np.random.default_rng()
def optimizer(fun, bounds, *, integrality):
return differential_evolution(fun, bounds, strategy='best2bin',
rng=rng, integrality=[False, False, False])
data = [0,0,0,1,2,4,11]
bounds = [(1.5, 1.55), (0, 1)]
res4 = stats.fit(stats.nbinom, data, bounds, optimizer=optimizer)
print(res4.params)
This returns:
ValueError: There are no integer values for `n` on the interval defined by the user-provided bounds and the domain of the distribution.
EDIT:
Here's an example to show that the negative binomial distribution will accept non-integer n
.
from scipy import stats
import matplotlib.pyplot as plt
n = 0.4
p = 0.1
x = np.arange(stats.nbinom.ppf(0.01, n, p),
stats.nbinom.ppf(0.99, n, p))
plt.bar(x, stats.nbinom.pmf(x, n, p))
plt.show()
For some reason this is not available by default and you need to patch original distribution if you'd like to do such optimization. The reason why it is not available lay most probably in distribution definition - Negative binomial distribution describes a sequence of i.i.d. Bernoulli trials, repeated until a predefined, non-random number of successes occurs.
Based on this definition n
must be an integer.
Here is a way to patch this behaviour:
from scipy import stats
from scipy.optimize import differential_evolution
import numpy as np
from scipy.stats._discrete_distns import nbinom_gen, _ShapeInfo
class new_nbinom(nbinom_gen):
def _shape_info(self):
# in Shapeinfo we specify 'n' as non-integer
return [_ShapeInfo("n", False, (0, np.inf), (True, False)),
_ShapeInfo("p", False, (0, 1), (True, True))]
def optimizer(fun, bounds, *, integrality):
return differential_evolution(fun,
bounds,
strategy='best2bin',
integrality=[False, False, False])
data = [0,0,0,1,2,4,11]
bounds = [(0.1, 1.55), (0, 1)]
distr = new_nbinom(name = 'nbinom')
res4 = stats.fit(distr, data, bounds, optimizer=optimizer)
print(res4)