I like how Carrot2 works. I mostly use XML import at the moment. I'd like to import an XML file with TF-IDF results instead of snippets. That would allow me to prepare the data as I wish.
I tried passing TF-IDF keywords (without metrics) in the snippets and it more or less worked. Unfortunately, Carrot2 computes TF-IDF again on my data and the results are mediocre. It would be great if I could pass my keywords together with their importance metrics and then use Carrot2 only to fine-tune the results.
I searched the API for such a solution, but I couldn't find one. Is it possible somehow?
Carrot2 does not support the direct input of TF-IDF data, unfortunately. One hack you could try is to feed in the keywords separated by periods (.), repeating each keyword as many times as indicated by its importance metric (scaled and rounded to the nearest integer). Separating the keywords with a period will ensure that Carrot2 does not try to join adjacent keywords into phrases.
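To illustrate the hack, here is a rough Python sketch that turns a keyword-to-weight mapping into such a pseudo-snippet. It is not part of Carrot2; the function name `weights_to_snippet` and the `scale` and `min_repeats` parameters are made up for this example, and the weights are invented sample values. You would still need to place the resulting string in the snippet field of your XML input yourself.

```python
def weights_to_snippet(weights, scale=1.0, min_repeats=1):
    """Build a period-separated pseudo-snippet from {keyword: weight}.

    Hypothetical helper, not a Carrot2 API. Each keyword is repeated
    round(weight * scale) times, but at least min_repeats times so that
    low-weight keywords do not disappear entirely.
    """
    parts = []
    for keyword, weight in weights.items():
        # Note: Python's round() rounds exact halves to the nearest even integer.
        repeats = max(min_repeats, round(weight * scale))
        parts.extend([keyword] * repeats)
    if not parts:
        return ""
    # The period between keywords is what should keep adjacent
    # keywords from being joined into phrases.
    return ". ".join(parts) + "."


if __name__ == "__main__":
    # Invented sample weights; scale=10 maps 0.31 to 3 repeats and 0.12 to 1.
    sample = {"clustering": 0.31, "search": 0.12}
    print(weights_to_snippet(sample, scale=10))
```

The `scale` factor is something you would have to tune: too small and all keywords collapse to the same count, too large and the snippets become very long.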