Getting Java Heap Space error while using Carrot2


I have all my search results in XML and am trying to run the Lingo algorithm on them in the Carrot2 Workbench, but I keep running into a Java heap space error.

The XML follows the format that Carrot2 expects. I am running the Carrot2 Workbench on a Mac.

I have two questions:

  1. Is there a way, such as a setting, to increase the Java heap space for the application?
  2. Is there a limit on the number of documents I can pass to the application for clustering? (I have around 10k documents.)

The error message is: An internal error occurred during: "Searching for 'gene therapy'...". Java heap space


Solution

    1. To set the maximum Java heap size, pass a suitable -Xmx JVM parameter when starting the workbench: carrot2-workbench -vmargs -Xmx256m (increase the value if the error persists).

    2. Carrot2 is designed for small to medium collections of documents (a few hundred). The practical limit depends largely on the algorithm. See "Got java heap size error when trying to cluster 15980 documents via carrot2workbench" for more details.
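
To see what effect an -Xmx value has on a JVM, a small standalone check can help. This is only a sketch: the class name HeapCheck and its helper methods are made up for illustration, and it reports the limit of the JVM it runs in, not of the workbench itself. It relies on the standard Runtime.maxMemory() call.

```java
public class HeapCheck {
    // Converts a byte count to whole mebibytes (hypothetical helper).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Maximum amount of memory this JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

On Java 11 or later this can be run directly as java -Xmx256m HeapCheck.java and again with a larger value to compare. The reported figure may be slightly lower than the -Xmx value, depending on the JVM and garbage collector.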