[SOLVED] GMM/EM on time series cluster

According to a paper, this is supposed to work, but as someone still learning the scikit-learn package I do not see how. All the sample code I have found clusters points into ellipses or circles, as here.

I would really like to know how to cluster the following plot by different patterns. Features 0-3 are the mean power over four time periods, while 4, 5 and 6 correspond to the standard deviation over the year, the variance between weekday and weekend, and the variance between winter and summer, respectively. So the y-axis label does not necessarily apply to 4, 5 and 6.

Following the sample, BIC did indicate that the optimal number of clusters is 5.

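import numpy as np
import matplotlib.pyplot as plt
from sklearn.mixture import GaussianMixture
# input: array of shape (n_samples, n_features)
n_components = np.arange(1, 21)
models = [GaussianMixture(n, covariance_type='full', random_state=0).fit(input)
          for n in n_components]
# Score every model on the same data it was fitted on
plt.plot(n_components, [m.bic(input) for m in models], label='BIC')
plt.legend(loc='best')
plt.xlabel('n_components')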

If I plot with the sample code available, however, it returns something completely weird, not worth sharing. I thought a negative BIC was OK, but I don't even know whether it clustered correctly enough to conclude that 5 is the optimal number.
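As a side note on the BIC step, here is a minimal sketch of picking the component count with the lowest BIC. It runs on synthetic data, not my power data: the three blobs, the 7 features, and the names X, bics and best_n are all assumptions made for illustration. BIC is -2 log L + k log n, so it can be negative whenever the log-likelihood is positive; only the ordering of the values matters, not their sign.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in data (assumption): three well-separated blobs, 7 features each
rng = np.random.default_rng(0)
centers = np.array([0.0, 10.0, 20.0])
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(200, 7)) for c in centers])

# Fit one model per candidate component count
n_components = np.arange(1, 7)
models = [GaussianMixture(n_components=int(n), covariance_type='full',
                          random_state=0).fit(X)
          for n in n_components]

# Score each model on the data it was fitted on
bics = np.array([m.bic(X) for m in models])

# Lower BIC is better, regardless of sign
best_n = int(n_components[np.argmin(bics)])
print("BIC per n_components:", dict(zip(n_components.tolist(), bics.round(1).tolist())))
print("n_components with lowest BIC:", best_n)
```

Because the synthetic data is built from a known number of blobs, the pick can be compared against that number, which is a check I cannot do on the real data.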

Solution

Basically, in an effort to close this question, the following answers how to cluster using GMM.

Create a model with the parameters you need

from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=10, covariance_type='full',
                      init_params='random', max_iter=100, random_state=0)

Fit your data (an array of shape number of samples x number of attributes), which is named input in my case

gmm.fit(input)
print(gmm.means_.round(2))
cluster = gmm.predict(input)

cluster contains the label assigned to each sample of my input
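To see what the labels mean as patterns, one option is to look at the size and the mean vector of each cluster. The sketch below is only an illustration on synthetic data: the two groups, the 7 attributes, and the names X, labels, probs and sizes are assumptions, not my actual data or results.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in (assumption): 300 samples x 7 attributes, two groups
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(150, 7)),
               rng.normal(8.0, 1.0, size=(150, 7))])

gmm = GaussianMixture(n_components=2, covariance_type='full',
                      random_state=0).fit(X)

labels = gmm.predict(X)        # hard label per sample, shape (n_samples,)
probs = gmm.predict_proba(X)   # soft membership, shape (n_samples, n_components)

# Number of samples assigned to each cluster
sizes = np.bincount(labels, minlength=gmm.n_components)

for k in range(gmm.n_components):
    # gmm.means_[k] holds one mean value per attribute: the "pattern" of cluster k
    print(k, sizes[k], gmm.means_[k].round(2))
```

Plotting each row of gmm.means_ against the attribute index gives one line per cluster, which is closer to clustering by pattern than the ellipse plots in the samples.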

Feel free to add to this if I've gotten anything wrong.