uimaruta

WORDTABLE - Not matching the word - UIMA RUTA


I'm trying to match words using WORDTABLE, but some entries are not matched.

In the input below, the word Afghanistan is not matched. If I remove the line A Coruña;n.a. from the WORDTABLE file, it is matched.

Sample Input:

Afghanistan
Report
report

Sample CSV (test.csv):

Afghanistan;Afghan.
report;rep.
A Coruña;n.a.

Code:

PACKAGE uima.ruta.example;
RETAINTYPE(SPACE);
WORDTABLE Table = 'test.csv';
DECLARE Annotation Abbr(STRING short);
Document{-> MARKTABLE(Abbr, 1, Table, true, 0, "", 0, "short" = 2)};
RETAINTYPE;

Solution

  • This is most likely caused by the whitespace in the wordlist, here in the entry A Coruña, which is the only one containing a space. There are several ways to avoid the problem, e.g., activating the configuration parameter dictRemoveWS of the Ruta analysis engine.
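
As a rough sketch, dictRemoveWS could be switched on in the analysis engine descriptor used to run the script. This assumes the script is executed through an XML descriptor that already declares the parameter as a boolean (the surrounding settings are not shown in the question and are only indicated by a comment here):

```xml
<configurationParameterSettings>
  <!-- existing settings of the descriptor stay unchanged -->
  <nameValuePair>
    <name>dictRemoveWS</name>
    <value>
      <!-- assumption: strips whitespace from dictionary entries on load -->
      <boolean>true</boolean>
    </value>
  </nameValuePair>
</configurationParameterSettings>
```

If the engine is created programmatically instead, the same parameter name and value would be passed as configuration data when the engine is built.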