java porter-stemmer snowball

Is there a Java implementation of the Porter2 stemmer?


Do you know of any Java implementation of the Porter2 stemmer (or any better stemmer written in Java)? I know that there is a Java version of Porter (not Porter2) at http://tartarus.org/~martin/PorterStemmer/java.txt, but on http://tartarus.org/~martin/PorterStemmer/ the author notes that the original Porter stemmer is a bit outdated and recommends using Porter2, available at http://snowball.tartarus.org/algorithms/english/stemmer.html.

The problem is that Porter2 is written in Snowball, which I had never heard of before and know nothing about. What I am looking for is a Java version of it.


Solution

  • The Snowball stemmers are available as a Java download

    And from snowball.tartarus.org:

    Feb 2002 - Java support: Richard has modified the snowball code generator to produce Java output as well as ANSI C output. This means that pure Java systems can now use the snowball stemmers.

    This is what you want, right?

    You can create an instance of it like so:

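      // "english" is Snowball's name for Porter2; "porter" selects the original Porter algorithm
      String lang = "english";
      Class<?> stemClass = Class.forName("org.tartarus.snowball.ext." + lang + "Stemmer");
      // Assumes the tartarus.org distribution, where stem() is declared on SnowballStemmer
      SnowballStemmer stemmer = (SnowballStemmer) stemClass.getDeclaredConstructor().newInstance();
      stemmer.setCurrent("your_word");
      stemmer.stem();
      String your_stemmed_word = stemmer.getCurrent();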
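
    If you would rather not compile against the Snowball base classes at all, the same three calls can be made purely through reflection. This is only a minimal sketch and not part of the Snowball distribution: SnowballReflect and stemWith are hypothetical names, and it assumes the generated stemmer class has a public no-argument constructor and public setCurrent(String), stem() and getCurrent() methods, as used in the snippet above.

```java
import java.util.Optional;

// Hypothetical helper (not part of Snowball): drives a stemmer class by name,
// so this file compiles even when the Snowball jar is absent.
class SnowballReflect {

    // className is the fully qualified stemmer class, for example
    // "org.tartarus.snowball.ext.englishStemmer" (assumed to be on the classpath).
    // Returns the stemmed word, or Optional.empty() if the class cannot be
    // loaded or does not offer the three expected methods.
    static Optional<String> stemWith(String className, String word) {
        try {
            Class<?> stemClass = Class.forName(className);
            Object stemmer = stemClass.getDeclaredConstructor().newInstance();
            stemClass.getMethod("setCurrent", String.class).invoke(stemmer, word);
            stemClass.getMethod("stem").invoke(stemmer);
            Object result = stemClass.getMethod("getCurrent").invoke(stemmer);
            return Optional.of((String) result);
        } catch (ReflectiveOperationException e) {
            // Covers a missing class, a missing method, and failures inside the stemmer.
            return Optional.empty();
        }
    }
}
```

    Calling SnowballReflect.stemWith("org.tartarus.snowball.ext.englishStemmer", "running") should give the stem when the jar is present and an empty Optional otherwise. If the language is fixed at compile time, creating the stemmer directly with new englishStemmer() is simpler than either reflective approach.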