When the time index is an integer (e.g. starting from 0 for each user), running dfs shows this warning:
UnusedPrimitiveWarning: Some specified primitives were not used during DFS:
agg_primitives: ['avg_time_between', 'time_since_first', 'time_since_last', 'trend']
groupby_trans_primitives: ['cum_count', 'time_since', 'time_since_previous']
This may be caused by a using a value of max_depth that is too small, not setting interesting values, or it may indicate no compatible variable types for the primitive were found in the data.
However, the time index can be an integer in many cases (e.g. https://www.kaggle.com/c/riiid-test-answer-prediction/data).
In this case, even though I set the timestamp variable as ft.variable_types.TimeIndex(numeric_time_index) when creating the entityset, dfs still showed the same warning, and the features generated by ['avg_time_between', 'time_since_first', 'time_since_last', 'trend'] didn't appear.
How can I handle it?
Thanks for the question. The time_since and time_since_first primitives are currently implemented to handle only Datetime and DatetimeTimeIndex variables. To handle cases where the time index is numeric, you can create custom primitives to handle NumericTimeIndex variables.
from featuretools.primitives import AggregationPrimitive, TransformPrimitive
from featuretools.variable_types import NumericTimeIndex


class TimeSinceNumeric(TransformPrimitive):
    input_types = [NumericTimeIndex]
    ...


class TimeSinceFirstNumeric(AggregationPrimitive):
    input_types = [NumericTimeIndex]
    ...
Then, you can pass in the custom primitives directly to DFS.
ft.dfs(
    ...
    trans_primitives=[TimeSinceNumeric],
    agg_primitives=[TimeSinceFirstNumeric],
)
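The class bodies in the answer are elided with ..., so the arithmetic that a numeric time index needs is not spelled out. As a rough sketch only, and independent of Featuretools, the plain-Python functions below show what such primitives might compute on integer timestamps. The function names are hypothetical. The time argument assumes the primitive sets uses_calc_time = True so that the cutoff time is passed to the function returned by get_function; verify that against the Featuretools version you use.

```python
import math


def time_since_first(values, time):
    """Elapsed time from the earliest observation to the cutoff `time`.

    Assumes `values` is non-empty.
    """
    return time - min(values)


def time_since_last(values, time):
    """Elapsed time from the latest observation to the cutoff `time`.

    Assumes `values` is non-empty.
    """
    return time - max(values)


def avg_time_between(values):
    """Mean gap between consecutive observations; NaN with fewer than two."""
    n = len(values)
    if n < 2:
        return math.nan
    return (max(values) - min(values)) / (n - 1)


def time_since_previous(values):
    """Gap to the preceding observation, in input order; NaN for the first."""
    values = list(values)
    if not values:
        return []
    return [math.nan] + [b - a for a, b in zip(values, values[1:])]


# Integer timestamps for one user, with a cutoff time of 20.
timestamps = [0, 3, 7, 15]
print(time_since_first(timestamps, time=20))  # 20
print(time_since_last(timestamps, time=20))   # 5
print(avg_time_between(timestamps))           # 5.0
print(time_since_previous(timestamps))        # [nan, 3, 4, 8]
```

Because the inputs are plain numbers, the results stay in whatever unit the time index uses, with no conversion to seconds. A function like these could be returned from get_function in the custom primitive classes, provided the rest of each class (name, return type, and so on) is filled in as the library requires.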