I'm looking for an easy way to print the SSE value and Silhouette score of a Dtaidistance (https://dtaidistance.readthedocs.io/en/latest/index.html) k-means model after it has been trained on data. tslearn's k-means exposes inertia_ and labels_, from which I can retrieve the information I need, but there doesn't seem to be an equivalent in the Dtaidistance library. I'd like to avoid another training run because I have a huge dataset of time series. Thank you everyone :)
    # k-means with k = 4 (Python): dtaidistance KMeans settings
    # assumed import behind the dtaikm alias used below
    from dtaidistance.clustering.kmeans import KMeans as dtaikm

    km0 = dtaikm(
        k=4,
        max_it=5,
        max_dba_it=5,
        thr=0.0001,
        drop_stddev=3,
        initialize_with_kmeanspp=True,
        initialize_sample_size=4,
        show_progress=True
    )
    # fit
    cluster_idx, performed_it = km0.fit_fast(x_red)
    # now km0.means[i] holds centroid i, and
    # cluster_idx[i] holds the ids of the rows assigned to cluster i
This was not supported. We (authors here) have added an extra argument to the fit function (monitor_distances) that accepts a function in which you can compute inertia. This is available in the master branch on GitHub (and will be part of the next release).
This allows you to do something like:
    def mymonitor(clusters_distances, clustering_ended):
        # one (cluster, distance) pair per series
        clusters, distances = zip(*clusters_distances)
        # ... compute inertia and print/plot/save
        return True

    cluster_idx, performed_it = km0.fit_fast(x_red, monitor_distances=mymonitor)
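To make the "compute inertia" step concrete, here is a minimal sketch of such a monitor. It assumes that each entry of clusters_distances is a (cluster, distance) pair, as the zip call above suggests, and that SSE is taken to be the sum of squared distances from each series to its assigned centroid. The names make_inertia_monitor and history are hypothetical helpers, not part of dtaidistance.

```python
def make_inertia_monitor(history):
    """Build a monitor callback that appends one inertia value per call.

    `history` is a plain list supplied by the caller (hypothetical helper,
    not part of dtaidistance).
    """
    def monitor(clusters_distances, clustering_ended):
        # Assumption: each item is a (cluster, distance) pair, one per series.
        distances = [d for _, d in clusters_distances]
        # Assumption: inertia (SSE) = sum of squared distances to the centroid.
        inertia = sum(d * d for d in distances)
        history.append(inertia)
        if clustering_ended:
            print(f"final inertia: {inertia}")
        # Return True, as in the snippet above.
        return True
    return monitor


# Stand-alone check with made-up (cluster, distance) pairs
history = []
monitor = make_inertia_monitor(history)
result = monitor([(0, 1.0), (1, 2.0), (0, 0.5)], False)
```

With the model from the question, this would be passed as km0.fit_fast(x_red, monitor_distances=make_inertia_monitor(history)), after which the last element of history would hold the inertia of the final assignment, with no second training run.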