
USE THE GATE (NLP) TOOL FOR NAMED-ENTITY EXTRACTION


Can I use GATE (http://gate.ac.uk/) within my Java program to extract named entities? If yes, could you give some examples or point me to some sources? Thank you.


Solution

  • Your question is really two questions: how to use GATE to find named entities, and possibly how to embed GATE into your own application.

    Named entity recognition and classification is a huge field of research, and the most effective approach depends on which named entities you want to find. GATE ships with ANNIE, a very basic approach based on gazetteer lists and rules that finds some categories of named entities in English text. If the categories ANNIE finds are the ones you are interested in, one way to start is to understand and improve what ANNIE already provides. The ANNIE pipeline creates annotations such as Person and Organization in your document; you only need to use or write a processing resource (PR) that accesses those annotations and does whatever you need with their features or text. Look at the GATE manual at http://gate.ac.uk/sale/tao/split.html: it explains ANNIE and also has some documentation on how to embed GATE, that is, how to use GATE directly from your Java program without running the GUI.
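To make the gazetteer idea concrete, here is a rough plain-Java sketch of the general technique. It does not use the GATE API at all, and ANNIE itself is considerably more elaborate. The class `GazetteerSketch`, the `Span` type, the helper methods and the three sample entries are all made up for illustration. The sketch only shows how a phrase list can yield typed, offset-based annotations that you then filter by type, which is roughly how you would later consume the Person or Organization annotations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GazetteerSketch {

    /** A typed span over the text, identified by character offsets (hypothetical type). */
    static final class Span {
        final String type;
        final int start;
        final int end;

        Span(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** Hypothetical sample gazetteer: known phrase -> entity category. */
    static Map<String, String> sampleGazetteer() {
        Map<String, String> gazetteer = new LinkedHashMap<>();
        gazetteer.put("John Smith", "Person");
        gazetteer.put("University of Sheffield", "Organization");
        gazetteer.put("Sheffield", "Location");
        return gazetteer;
    }

    /**
     * Marks every exact occurrence of every gazetteer phrase in the text.
     * Overlapping matches are all kept; no disambiguation rules are applied.
     */
    static List<Span> annotate(String text, Map<String, String> gazetteer) {
        List<Span> spans = new ArrayList<>();
        for (Map.Entry<String, String> entry : gazetteer.entrySet()) {
            String phrase = entry.getKey();
            int from = text.indexOf(phrase);
            while (from >= 0) {
                spans.add(new Span(entry.getValue(), from, from + phrase.length()));
                from = text.indexOf(phrase, from + phrase.length());
            }
        }
        return spans;
    }

    /** Returns the covered text of all spans of the given type. */
    static List<String> textsOfType(String text, List<Span> spans, String type) {
        List<String> result = new ArrayList<>();
        for (Span span : spans) {
            if (span.type.equals(type)) {
                result.add(text.substring(span.start, span.end));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "John Smith works at the University of Sheffield.";
        List<Span> spans = annotate(text, sampleGazetteer());
        System.out.println("Person: " + textsOfType(text, spans, "Person"));
        System.out.println("Organization: " + textsOfType(text, spans, "Organization"));
    }
}
```

Note that "Sheffield" is also matched as a Location inside "University of Sheffield". Resolving that kind of overlap is exactly the job that rules layered on top of the plain list lookup are meant to do.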