I am trying to define a custom distribution in SciPy. For simplicity, assume we are looking at the "affine" distribution, i.e. a mix of uniform and triangular.
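from scipy import stats

class affine_distribution_gen(stats.rv_continuous):
    def _argcheck(self, c):
        # Elementwise check, so array-valued c also works
        return (0 <= c) & (c <= 2)

    def _pdf(self, x, c):
        return (2 - 2 * c) * x + c

    def _cdf(self, x, c):
        # Integral of the pdf from 0 to x: c*x + (1 - c)*x**2
        return x * (c + (1 - c) * x)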
I then attempt fitting the parameters to data:
affine = affine_distribution_gen(name='affine', a=0, b=1)
stats.fit(affine, data, {'c': (0, 2)})
The problem is that I get the following exceptions:
AttributeError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/scipy/stats/_fit.py in fit(dist, data, bounds, guess, method, optimizer)
539 try:
--> 540 param_info = dist._param_info()
541 except AttributeError as e:
2 frames
/usr/local/lib/python3.10/dist-packages/scipy/stats/_distn_infrastructure.py in _param_info(self)
2921 def _param_info(self):
-> 2922 shape_info = self._shape_info()
2923 loc_info = _ShapeInfo("loc", False, (-np.inf, np.inf), (False, False))
AttributeError: 'affine_distribution_gen' object has no attribute '_shape_info'
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
<ipython-input-11-779300b83040> in <cell line: 2>()
1 affine = affine_distribution_gen(name='affine', a=0, b=1)
----> 2 stats.fit(affine, data, {'c': (0, 2)})
/usr/local/lib/python3.10/dist-packages/scipy/stats/_fit.py in fit(dist, data, bounds, guess, method, optimizer)
543 "`scipy.stats.fit` because shape information has "
544 "not been defined.")
--> 545 raise ValueError(message) from e
546
547 # data input validation
ValueError: Distribution `affine` is not yet supported by `scipy.stats.fit` because shape information has not been defined.
The documentation says that the shape is somehow inferred from the signatures of _cdf and _pdf, and provides no explanation on how to do this manually. How should I proceed here?
In case you haven't seen it, there are instructions on how to add a new distribution in the SciPy manual. It does not seem to be a very refined process.
Following the errors and looking up existing distributions gets you this:
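from scipy import stats
# _ShapeInfo lives in a private module, so this import may break between releases
from scipy.stats._distn_infrastructure import _ShapeInfo

class affine_distribution_gen(stats.rv_continuous):
    def _argcheck(self, c):
        # Elementwise check, so array-valued c also works
        return (0 <= c) & (c <= 2)

    def _pdf(self, x, c):
        return (2 - 2 * c) * x + c

    def _cdf(self, x, c):
        # Integral of the pdf from 0 to x: c*x + (1 - c)*x**2
        return x * (c + (1 - c) * x)

    def _shape_info(self):
        # Arguments: name, integrality, domain, inclusive endpoints
        return [_ShapeInfo("c", False, (0, 2), (True, True))]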
The _ShapeInfo arguments are the name of the parameter, whether it is an integer (True or False), the domain of the parameter, and whether the lower and upper limits are included or not.
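To check that the fit then goes through, here is a minimal end-to-end sketch. It assumes SciPy 1.9 or later (where stats.fit is available) and relies on the private _ShapeInfo helper, which is not part of the public API. The data is synthetic: the true value c = 0.5, the seed, and the sample size are all illustrative choices of mine, and the samples are drawn by inverting the CDF by hand rather than coming from the original question.

```python
import numpy as np
from scipy import integrate, stats
from scipy.stats._distn_infrastructure import _ShapeInfo  # private API; may change


class affine_distribution_gen(stats.rv_continuous):
    """Density (2 - 2c) x + c on [0, 1], with 0 <= c <= 2."""

    def _argcheck(self, c):
        return (0 <= c) & (c <= 2)

    def _pdf(self, x, c):
        return (2 - 2 * c) * x + c

    def _cdf(self, x, c):
        return x * (c + (1 - c) * x)

    def _shape_info(self):
        return [_ShapeInfo("c", False, (0, 2), (True, True))]


affine = affine_distribution_gen(name="affine", a=0, b=1)

# Sanity check: the closed-form CDF should match the numerically integrated PDF.
area, _ = integrate.quad(affine.pdf, 0, 0.5, args=(0.5,))
cdf_matches_pdf = bool(np.isclose(area, affine.cdf(0.5, 0.5)))

# Synthetic data (assumption: true c = 0.5), drawn by inverse transform sampling.
# For c = 0.5 the CDF is 0.5*x + 0.5*x**2, whose inverse is sqrt(0.25 + 2u) - 0.5.
c_true = 0.5
rng = np.random.default_rng(0)
u = rng.random(2000)
data = np.sqrt(0.25 + 2 * u) - 0.5

# Only c gets bounds; loc and scale are left at their defaults.
res = stats.fit(affine, data, {"c": (0, 2)})
print(cdf_matches_pdf, res.success, res.params)
```

The fitted c in res.params should land near the value used to generate the data. Since no bounds are passed for loc and scale, stats.fit should keep them fixed at 0 and 1, which is what you want for a distribution supported on [0, 1].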