java nlp named-entity-recognition gate

How do I perform named entity extraction with GATE ANNIE in Java?


I am new to GATE ANNIE. I have tried the GATE GUI and have some experience running tasks in it. How can I implement named entity extraction in Java?

I have searched but was unable to find any tutorial on named entity extraction.

Is there any example code for named entity extraction with GATE ANNIE in Java?


Solution

  • import gate.*;
    import gate.creole.ANNIEConstants;
    import gate.util.persistence.PersistenceManager;
    import java.io.File;
    import java.util.*;
    
    public class AnnieNerExample {
    
        public static void main(String[] args) throws Exception {
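            // Point GATE at the local installation (adjust the path), then initialise it
            Gate.setGateHome(new File("C:\\Program Files\\GATE_Developer_8.1"));
            Gate.init();
    
            // Load the default ANNIE pipeline from the ANNIE plugin directory
            LanguageAnalyser controller = (LanguageAnalyser) PersistenceManager
                    .loadObjectFromFile(new File(new File(Gate.getPluginsHome(),
                            ANNIEConstants.PLUGIN_DIR), ANNIEConstants.DEFAULT_FILE));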
    
            // Wrap the text in a document and a corpus, then run ANNIE over it
            Corpus corpus = Factory.newCorpus("corpus");
            Document document = Factory.newDocument(
                    "Michael Jordan is a professor at the University of California, Berkeley.");
            corpus.add(document);
            controller.setCorpus(corpus);
            controller.execute();
    
            document.getAnnotations().get(new HashSet<>(Arrays.asList("Person", "Organization", "Location")))
                .forEach(a -> System.err.format("%s - \"%s\" [%d to %d]\n", 
                        a.getType(), Utils.stringFor(document, a),
                        a.getStartNode().getOffset(), a.getEndNode().getOffset()));
    
            // Don't forget to release GATE resources
            Factory.deleteResource(document);
            Factory.deleteResource(corpus);
            Factory.deleteResource(controller);
        }
    }
    

    The output:

    Person - "Michael Jordan" [0 to 14]
    Organization - "University of California" [37 to 61]
    Location - "Berkeley" [63 to 71]
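
The bracketed numbers in the output are character offsets into the document text. As a rough sketch of how to read them, assuming the document content is exactly the string passed to `Factory.newDocument` and that each pair is a start-inclusive, end-exclusive span, plain `String.substring` recovers the same text. The class `OffsetCheck` and its `covered` helper are hypothetical names used only for illustration; they are not part of GATE.

```java
public class OffsetCheck {

    // Returns the text covered by the character span [start, end).
    public static String covered(String text, int start, int end) {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "Michael Jordan is a professor at the University of California, Berkeley.";

        // Offsets copied from the output above.
        System.out.println(covered(text, 0, 14));   // Michael Jordan
        System.out.println(covered(text, 37, 61));  // University of California
        System.out.println(covered(text, 63, 71));  // Berkeley
    }
}
```

This can help when the offsets need to be mapped back onto the original text, for example to highlight the entities in a UI.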
    

    Jars

    There are two possibilities:

    1. Manual

    Quick Start with GATE Embedded:

    Add $GATE_HOME/bin/gate.jar and the JAR files in $GATE_HOME/lib to the Java CLASSPATH ($GATE_HOME is the GATE root directory).

    2. Maven

      <dependency>
          <groupId>uk.ac.gate</groupId>
          <artifactId>gate-core</artifactId>
          <version>8.4</version>
      </dependency>
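
For the manual option, the classpath string can be assembled by hand or with a small helper. The following is a minimal sketch, assuming the directory layout described above (`bin/gate.jar` plus JAR files directly under `lib`). `GateClasspath` and `build` are hypothetical names, and the helper uses only the Java standard library, not any GATE API.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class GateClasspath {

    // Builds a classpath string: bin/gate.jar first, then every *.jar in lib (sorted).
    public static String build(Path gateHome) throws IOException {
        List<String> entries = new ArrayList<>();
        entries.add(gateHome.resolve("bin").resolve("gate.jar").toString());

        try (Stream<Path> files = Files.list(gateHome.resolve("lib"))) {
            files.filter(p -> p.getFileName().toString().endsWith(".jar"))
                 .sorted()
                 .map(Path::toString)
                 .forEach(entries::add);
        }
        // File.pathSeparator is ';' on Windows and ':' on Unix-like systems.
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GateClasspath <GATE root directory>");
            return;
        }
        System.out.println(build(Paths.get(args[0])));
    }
}
```

The printed string can then be passed to `javac` and `java` via the `-cp` option.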