python time-series tsfresh

tsfresh timeseries missing values


I am confused about the tsfresh input format. Can I pass a dataframe in which different ids have values at different time stamps? For example, timeseries 1 {t0: 1, t2: 4, t5: 1} and timeseries 2 {t1: 5, t2: 2}. Should I fill the missing values (t1, t3, etc.) with 0? Thanks in advance.


Solution

  • tsfresh does not "care" about the time entries of your data. Most of its feature calculators do not need fixed time intervals (e.g. the mean of a time series is the same no matter which time stamps the values belong to). So yes, technically it is possible to have different times for different ids.

    That being said, some feature calculators do rely on the time stamps and on proper time intervals (e.g. the Fourier transformation). However, there are many different ways to fill these missing values, and choosing among them takes a lot of domain knowledge. That is why tsfresh does not do this "automatically". Many libraries (e.g. pandas) offer several options for this, e.g. resampling methods.
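
    As a rough illustration of both points, here is a minimal pandas sketch using the two series from the question. The column names `id`, `time` and `value`, the helper `fill_regular_grid`, the integer time steps, and the choice of linear interpolation are all assumptions made for this example, not something tsfresh prescribes. Whether to interpolate, forward-fill, or fill with 0 depends on what a missing value means in your data.

```python
import pandas as pd

# Long format: one row per (id, time) observation.
# The two ids do not share the same time stamps.
df = pd.DataFrame(
    {
        "id": [1, 1, 1, 2, 2],
        "time": [0, 2, 5, 1, 2],
        "value": [1.0, 4.0, 1.0, 5.0, 2.0],
    }
)

# A feature such as the mean ignores the time stamps entirely.
means = df.groupby("id")["value"].mean()


def fill_regular_grid(group: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: put one series on a regular integer time grid.

    Missing steps between the first and last observation are filled by
    linear interpolation (an assumption; swap in .ffill() or .fillna(0)
    if that matches your domain better).
    """
    series = group.set_index("time")["value"]
    grid = pd.RangeIndex(int(series.index.min()), int(series.index.max()) + 1)
    filled = series.reindex(grid).interpolate(method="linear")
    return filled.rename_axis("time").reset_index()


# Fill each id separately so values never leak from one series into another.
parts = []
for series_id, group in df.groupby("id"):
    part = fill_regular_grid(group)
    part.insert(0, "id", series_id)
    parts.append(part)

regular = pd.concat(parts, ignore_index=True)
print(regular)
```

    Either `df` or `regular` is in the long format that tsfresh accepts, so it could then be passed along the lines of `extract_features(regular, column_id="id", column_sort="time")`. The regular grid only matters for the calculators that assume evenly spaced samples.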