I am new to GATE ANNIE. I have tried the GATE GUI and have some experience running tasks in it. How can I implement named entity extraction in Java?
I searched but could not find any tutorial on named entity extraction.
Is there any code available for named entity extraction with GATE ANNIE in Java?
    import gate.*;
    import gate.creole.ANNIEConstants;
    import gate.util.persistence.PersistenceManager;
    import java.io.File;
    import java.util.*;

    public class AnnieNerExample {
        public static void main(String[] args) throws Exception {
            // Point GATE at the local installation, then initialise the framework
            Gate.setGateHome(new File("C:\\Program Files\\GATE_Developer_8.1"));
            Gate.init();

            // Load the default ANNIE application shipped in the ANNIE plugin directory
            LanguageAnalyser controller = (LanguageAnalyser) PersistenceManager
                    .loadObjectFromFile(new File(new File(Gate.getPluginsHome(),
                            ANNIEConstants.PLUGIN_DIR), ANNIEConstants.DEFAULT_FILE));

            // Wrap the text in a document, put it in a corpus, and run ANNIE over it
            Corpus corpus = Factory.newCorpus("corpus");
            Document document = Factory.newDocument(
                    "Michael Jordan is a professor at the University of California, Berkeley.");
            corpus.add(document);
            controller.setCorpus(corpus);
            controller.execute();

            // Print type, covered text, and character offsets of each entity
            document.getAnnotations()
                    .get(new HashSet<>(Arrays.asList("Person", "Organization", "Location")))
                    .forEach(a -> System.err.format("%s - \"%s\" [%d to %d]\n",
                            a.getType(), Utils.stringFor(document, a),
                            a.getStartNode().getOffset(), a.getEndNode().getOffset()));

            // Don't forget to release GATE resources
            Factory.deleteResource(document);
            Factory.deleteResource(corpus);
            Factory.deleteResource(controller);
        }
    }
The output:
Person - "Michael Jordan" [0 to 14]
Organization - "University of California" [37 to 61]
Location - "Berkeley" [63 to 71]
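To read the offsets in that output: each pair is a character position in the document text, with the end offset exclusive, so it can be passed straight to String.substring. The following is only a small sketch in plain Java (no GATE on the classpath) that checks the three spans against the sample sentence. The class name OffsetCheck and the helper covered are made up for this illustration and are not part of GATE.

```java
public class OffsetCheck {
    // Same sentence that is passed to Factory.newDocument in the example
    static final String TEXT =
            "Michael Jordan is a professor at the University of California, Berkeley.";

    // Hypothetical helper: returns the text covered by [start, end), end exclusive
    static String covered(String text, int start, int end) {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        // Offsets copied from the output listed in the answer
        int[][] spans = { {0, 14}, {37, 61}, {63, 71} };
        for (int[] span : spans) {
            System.out.println(covered(TEXT, span[0], span[1]));
        }
    }
}
```

Inside GATE itself, Utils.stringFor(document, a) from the example already returns the covered text, so a helper like this is only useful when working with stored offsets outside GATE.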
Two possibilities:
Quick Start with GATE Embedded: add $GATE_HOME/bin/gate.jar and the JAR files in $GATE_HOME/lib to the Java CLASSPATH ($GATE_HOME is the GATE root directory).
Maven
    <dependency>
        <groupId>uk.ac.gate</groupId>
        <artifactId>gate-core</artifactId>
        <version>8.4</version>
    </dependency>