I am new to machine learning. I have a set of trajectories that I wish to cluster, because some of them are actually the same path and only SEEM different due to noise.
In addition, the trajectories are not all the same length. So although Trajectory A may not be the same as Trajectory B, it may still be part of Trajectory B. I wish to present this property after the clustering as well.
I only know a little about K-means clustering and fuzzy C-means clustering. How should I choose between the two? Or should I adopt another method?
Is there any method that takes this "belongness" into consideration?
(E.g. after the clustering, I have 3 clusters A, B and C. One particular trajectory X belongs to cluster A. And a shorter trajectory Y, although it is not clustered in A, is identified as part of trajectory B.)
=================== UPDATE ======================
The aforementioned trajectories are pedestrian trajectories. They can be represented either as a series of (x, y) points or as a series of step vectors (length, direction). The representation is under my control.
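To make the two representations concrete, here is a minimal sketch of converting between them. The function names are just illustrative, and I am assuming the direction is an angle in radians measured from the positive x-axis. Note that the step-vector form drops the absolute starting position, so it has to be supplied separately when converting back.

```python
import math

def points_to_steps(points):
    """Convert (x, y) points into (length, direction) step vectors.

    Assumption: direction is in radians, measured from the +x axis.
    """
    return [
        (math.hypot(x1 - x0, y1 - y0), math.atan2(y1 - y0, x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]

def steps_to_points(steps, origin=(0.0, 0.0)):
    """Rebuild (x, y) points from step vectors, starting at `origin`."""
    points = [origin]
    for length, direction in steps:
        x, y = points[-1]
        points.append((x + length * math.cos(direction),
                       y + length * math.sin(direction)))
    return points
```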
It might be a little late, but I am also working on the same problem. I suggest you take a look at TRACLUS, an algorithm created by Jae-Gil Lee, Jiawei Han and Kyu-Young Whang, published at SIGMOD '07. https://hanj.cs.illinois.edu/pdf/sigmod07_jglee.pdf
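This is so far the best approach I have seen for clustering trajectories, because it clusters line segments of trajectories rather than whole trajectories, so it can discover a path that is shared by only part of a trajectory.
Basically, it is a two-phase approach (the paper calls it partition-and-group): first, each trajectory is partitioned into line segments; second, similar line segments are grouped into clusters with a density-based method.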
Finally, they compute a representative trajectory for each cluster, which is nothing more than the common sub-trajectory discovered in that cluster.
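To give a rough feel for the partition-and-group idea, here is a minimal pure-Python sketch. It is not TRACLUS itself, only an illustration under simplifying assumptions: I split a trajectory wherever the heading turns by more than a threshold (the paper chooses split points with an MDL criterion), I use a simple one-sided endpoint distance between segments (the paper combines perpendicular, parallel and angle distances), and the grouping is a plain DBSCAN-style pass. All function names and the parameters `angle_tol`, `eps` and `min_pts` are my own.

```python
import math

def partition(traj, angle_tol=math.radians(20)):
    """Phase 1 stand-in: cut a list of (x, y) points into straight-ish segments."""
    if len(traj) < 2:
        return []
    segments, start = [], 0
    for i in range(1, len(traj) - 1):
        h_in = math.atan2(traj[i][1] - traj[start][1], traj[i][0] - traj[start][0])
        h_out = math.atan2(traj[i + 1][1] - traj[i][1], traj[i + 1][0] - traj[i][0])
        turn = abs(math.atan2(math.sin(h_out - h_in), math.cos(h_out - h_in)))
        if turn > angle_tol:  # heading changed enough: close the current segment
            segments.append((traj[start], traj[i]))
            start = i
    segments.append((traj[start], traj[-1]))
    return segments

def partition_all(trajs):
    """Partition every trajectory, remembering which trajectory each segment came from."""
    return [(tid, seg) for tid, traj in trajs.items() for seg in partition(traj)]

def point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    if len2 == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def seg_dist(s, t):
    """Simplified distance: small when one segment lies along (part of) the other.

    Ignores travel direction, unlike a full trajectory distance.
    """
    s_on_t = max(point_to_segment(p, *t) for p in s)
    t_on_s = max(point_to_segment(p, *s) for p in t)
    return min(s_on_t, t_on_s)

def group(segments, eps=0.5, min_pts=2):
    """Phase 2 stand-in: DBSCAN-style grouping. Returns one label per segment, -1 = noise."""
    n = len(segments)
    neighbors = [[j for j in range(n) if seg_dist(segments[i], segments[j]) <= eps]
                 for i in range(n)]
    labels, cluster = [None] * n, 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # not dense enough; may be claimed later as a border segment
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border segment joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # core segment: keep expanding
                queue.extend(neighbors[j])
        cluster += 1
    return labels
```

Because each segment keeps its trajectory id, you can read off which trajectories contribute segments to the same cluster. With this distance, a short trajectory that runs along part of a longer one ends up in the longer one's cluster, which is the "belongness" asked about. The thresholds would need tuning for real pedestrian data.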
They have pretty cool examples, and the paper explains everything very well. Again, this is not my algorithm, so don't forget to cite them if you are doing research.
PS: I made some slides based on their work, just for educational purposes: http://www.slideshare.net/ivansanchez1988/trajectory-clustering-traclus-algorithm