solr, cluster-analysis, carrot2, lingo

Tokenizing cluster labels of Carrot2 Lingo Clustering on Solr


I use the Carrot2 Lingo clustering algorithm to cluster my Solr search results. Now I want to process the cluster labels further, so I need to tokenize each label into its individual tokens.

Is there some kind of post tokenizer available to achieve this or do I have to process the results myself?

Thanks for your help!

Tim


Solution

  • There's no special tokenizer for this; you'll need to tokenize the labels on your own. Tokenizing on whitespace will be a good choice in most cases.
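
Whitespace tokenization of the labels could be sketched as follows. This is a minimal sketch, not a Carrot2 or Solr API: the function name `tokenize_labels` and the sample data are hypothetical, and it assumes the clustering results have already been parsed from the JSON response into a list of dicts, each with a "labels" list of strings.

```python
def tokenize_labels(clusters):
    """Return (label, tokens) pairs, splitting each label on whitespace."""
    tokens_per_label = []
    for cluster in clusters:
        # Assumption: each cluster carries its labels under the "labels" key.
        for label in cluster.get("labels", []):
            # str.split() with no arguments splits on any run of whitespace
            # and ignores leading and trailing whitespace.
            tokens_per_label.append((label, label.split()))
    return tokens_per_label


# Hypothetical sample shaped like parsed clustering output.
sample_clusters = [
    {"labels": ["Data Mining"], "docs": ["1", "2"]},
    {"labels": ["Knowledge Discovery in Databases"], "docs": ["3"]},
]

for label, tokens in tokenize_labels(sample_clusters):
    print(label, "->", tokens)
```

If labels can contain punctuation or hyphenated terms that should be split further, a regular expression or a proper analyzer may be a better fit than plain whitespace splitting.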