Multiple symfit model instances share parameter objects with the same name. I'd like to understand where this behaviour comes from, what its intent is, and whether it's possible to deactivate it.
To illustrate what I mean, a minimal example:
import symfit as sf
# Create Parameters and Variables
a = sf.Parameter('a',value=0)
b = sf.Parameter('b',value=1,fixed=True)
x, y = sf.variables('x, y')
# Instantiate two models
model1=sf.Model({y:a*x+b})
model2=sf.Model({y:a*x+b})
# They are indeed not the same
id(model1) == id(model2)
>>False
# There are two parameters
print(model1.params)
>>[a,b]
print(model1.params[1].name, model1.params[1].value)
>>b 1
print(model2.params[1].name, model2.params[1].value)
>>b 1
#They are initially identical
# We want to manually modify the fixed one in only one model
model1.params[1].value = 3
# Both have changed
print(model1.params[1].name, model1.params[1].value)
>>b 3
print(model2.params[1].name, model2.params[1].value)
>>b 3
id(model1.params[1]) == id(model2.params[1])
>>True
# The parameter is the same object
I want to fit multiple data streams with different models, but with fixed parameter values that depend on the data stream. Renaming the parameters in each instance of the model would work, but is ugly given that the parameter represents the same quantity. Processing them sequentially and modifying the parameters in between is possible, but I worry about unintended interactions between steps.
PS: Can someone with sufficient reputation please create the symfit tag?
Excellent question. In principle this is because Parameter objects are a subclass of sympy.Symbol, and from its docstring:
Symbols are identified by name and assumptions:
>>> from sympy import Symbol
>>> Symbol("x") == Symbol("x")
True
>>> Symbol("x", real=True) == Symbol("x", real=False)
False
This is fundamental to the inner workings of sympy, and therefore something we also use in symfit. But the value and fixed arguments are not viewed as assumptions, so they are not used to distinguish parameters.
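To make the "identified by name and assumptions" rule concrete, here is a toy sketch in plain Python. ToySymbol is a hypothetical stand-in written for this illustration; it is not how sympy or symfit implement symbols, it only mimics the comparison rule described above, in which extra data such as value plays no part in equality.

```python
class ToySymbol:
    """Hypothetical stand-in: identified by name and assumptions only."""

    def __init__(self, name, value=None, **assumptions):
        self.name = name
        self.assumptions = frozenset(assumptions.items())
        # Extra data, deliberately ignored by __eq__ and __hash__.
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, ToySymbol):
            return NotImplemented
        return (self.name, self.assumptions) == (other.name, other.assumptions)

    def __hash__(self):
        return hash((self.name, self.assumptions))


# Same name, different value: still considered the same symbol.
same_name = ToySymbol("b", value=1) == ToySymbol("b", value=3)
# Same name, different assumptions: considered different symbols.
different_assumptions = ToySymbol("x", real=True) == ToySymbol("x", real=False)
print(same_name, different_assumptions)
```

Under this rule, two objects named b cannot be told apart by their values, which is why giving each model its own distinct b requires a different name.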
Now, to your question on how this would affect fitting. Like you say, working sequentially is a good solution, and one that will not have any side effects:
model = sf.Model({y: a*x + b})
b.fixed = True
fit_results = []
# datastream yields (b_value, xdata, ydata) tuples
for b_value, xdata, ydata in datastream:
    b.value = b_value
    fit = sf.Fit(model, x=xdata, y=ydata)
    fit_results.append(fit.execute())
So there is no need to define a new Parameter for every iteration. The b.value attribute stays constant within each iteration, so there is no way this can go wrong. The only way I can imagine this going wrong is if you use threading, which will probably create race conditions. But threading is not desirable for CPU-bound tasks anyway; multiprocessing is the way to go. In that case separate processes are spawned, each its own microcosm, so there should be no problem there either.
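If you want an extra guard against one step leaking into the next, one option is to set the value inside a context manager that restores the previous value afterwards. The sketch below is only an illustration: temporary_value and StandInParameter are hypothetical names invented here, and the stand-in class exists only so the snippet runs without symfit. It assumes nothing beyond an assignable .value attribute, which is what the loop above already relies on.

```python
from contextlib import contextmanager


class StandInParameter:
    """Hypothetical stand-in exposing only the attributes used here."""

    def __init__(self, name, value=0.0, fixed=False):
        self.name = name
        self.value = value
        self.fixed = fixed


@contextmanager
def temporary_value(param, value):
    """Set param.value for the duration of the block, then restore it."""
    old_value = param.value
    param.value = value
    try:
        yield param
    finally:
        # Restore even if the body raises.
        param.value = old_value


b = StandInParameter("b", value=1, fixed=True)
seen = []
for b_value in (3, 5):
    with temporary_value(b, b_value):
        # The fit for this data stream would run here.
        seen.append(b.value)

print(seen, b.value)
```

After the loop, b carries its original value again, so later code does not silently inherit the value from the last data stream.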
I hope this answers your question, if not let me know.
p.s. I'm slowly answering my way up to 1500 to make that tag, but if someone beats me to it I'd be all the happier for it of course ;)