solr, unsupervised-learning, carrot2

Implementing incremental clustering using Carrot2 DCS


Carrot2 accepts XML input that can include the 'clusters' it previously exported for some other documents. If I want to implement incremental clustering, i.e. introduce new documents alongside the previous clusters, I have to keep the older documents in the input as well. That makes the input grow linearly with every batch.

Is there a way to extract clusters along with document features for the respective clusters so as to solve this incremental/online clustering problem?


Solution

  • Incremental clustering is currently only available in the Lingo3G algorithm (a commercial add-on to Carrot2). In Carrot2 itself, the only option for now is to re-cluster the whole enlarged document set.
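
A minimal sketch of the re-clustering workaround, in Python: keep every document seen so far and rebuild the full input each time a new batch arrives. The `DocumentStore` class and its method names are hypothetical helpers, not part of Carrot2, and the element names (`searchresult`, `query`, `document`, `title`, `url`, `snippet`) are assumed from the Carrot2 3.x XML input format, so check them against the version you run.

```python
import xml.etree.ElementTree as ET


class DocumentStore:
    """Accumulates all documents seen so far (hypothetical helper, not a Carrot2 API)."""

    def __init__(self):
        self._docs = []

    def add_batch(self, docs):
        # docs: iterable of dicts with "title", "snippet" and an optional "url".
        self._docs.extend(docs)

    def to_carrot2_xml(self, query=""):
        # Rebuild the input for the WHOLE enlarged set; nothing is carried
        # over from earlier clustering runs.
        # Element names are assumed from the Carrot2 3.x XML input format.
        root = ET.Element("searchresult")
        ET.SubElement(root, "query").text = query
        for i, doc in enumerate(self._docs):
            node = ET.SubElement(root, "document", id=str(i))
            ET.SubElement(node, "title").text = doc.get("title", "")
            ET.SubElement(node, "url").text = doc.get("url", "")
            ET.SubElement(node, "snippet").text = doc.get("snippet", "")
        return ET.tostring(root, encoding="unicode")


store = DocumentStore()
store.add_batch([{"title": "First doc", "snippet": "text of the first document"}])
# Later, when new documents arrive:
store.add_batch([{"title": "Second doc", "snippet": "text of the second document"}])
xml_input = store.to_carrot2_xml(query="example")
```

The resulting `xml_input` string would then be posted to the DCS as usual. The input still grows linearly, as the question notes; this only keeps the bookkeeping in one place.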