When I ingested values into a feature set, the pipeline was called twice as many times as expected (I used MLRun version 1.2.1). This looks like a bug; do you know why it happens?
I used this code:
import mlrun
import mlrun.feature_store as fstore
# mlrun: start-code
import math
def calc(x):
    x['fn2']=math.sin(x['fn2'])*100.0
    print('calc')
    return x
# mlrun: end-code
mlrun.set_env_from_file("mlrun-nonprod.env")
project = mlrun.get_or_create_project(project_name, context='./', user_project=False)
feature_derived = fstore.get_feature_set(f"{project_name}/{feature_derivedName}")
...
# dataFrm has only two values
feature_derived.graph.to(name="calc", handler='calc')
fstore.ingest(feature_derived, dataFrm)
I got this output (the calc method was called four times) for dataFrm with two values:
> calc
> calc
> calc
> calc
The solution is easy: switch off preview mode by passing infer_options=0 to the ingest method. See this part of the code:
...
feature_derived.graph.to(name="calc", handler='calc')
fstore.ingest(feature_derived, dataFrm, infer_options=0)
...
The output now shows only two calls (as expected):
> calc
> calc
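To illustrate why the call count doubles, here is a toy model in plain Python. It does not use MLRun, and it is not how MLRun is implemented. It only assumes, as described above, that an enabled inference/preview step runs the graph once more before the real ingestion. The names toy_ingest and run_graph are hypothetical helpers made up for this sketch.

```python
import math

calls = 0  # counts how many times the handler runs


def calc(row):
    """Same transformation as in the question, with a call counter instead of print."""
    global calls
    calls += 1
    row = dict(row)  # copy so the input rows are not modified
    row["fn2"] = math.sin(row["fn2"]) * 100.0
    return row


def run_graph(rows, handler):
    """Hypothetical helper: apply the handler to every row once."""
    return [handler(r) for r in rows]


def toy_ingest(rows, handler, infer_options=1):
    """Hypothetical stand-in for an ingest call, NOT the MLRun implementation.

    Assumption: a non-zero infer_options triggers one extra pass over the data.
    """
    if infer_options:
        run_graph(rows, handler)  # extra preview/inference pass
    return run_graph(rows, handler)  # actual ingestion pass


rows = [{"fn2": 0.0}, {"fn2": 1.0}]

calls = 0
toy_ingest(rows, calc)  # inference enabled
calls_with_infer = calls

calls = 0
result = toy_ingest(rows, calc, infer_options=0)  # inference disabled
calls_without_infer = calls

print(calls_with_infer, calls_without_infer)
```

In this sketch the handler runs four times for two rows when the extra pass is enabled and two times when it is disabled, which matches the outputs shown above. If the handler has side effects (printing, counters, writes), it is worth keeping it idempotent or disabling the extra pass as shown in the answer.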