I'm trying to annotate some text in between two annotations. While this seems simple enough, I want to explicitly exclude passages of text that contain another annotation.
Let's say I previously annotated the last Token of the Document and some POIs as several other annotations. Now I want to SHIFT the last POI Annotation of the Document to that Token. But there mustn't be other POIs in that text.
What I currently do:
POI1 {->SHIFT(POI1,1,3)} ANY*?{-CONTAINS(POI2)} EndToken;
With this rule, Ruta still annotates text that contains POI2 annotations.
What am I missing?
Your rule won't work because ANY* doesn't have a stopper, so the star-greedy quantifier * overrides the EndToken. Additionally, you would normally use CONTAINS to check for an annotation exclusively inside another. Therefore the PARTOF condition is indicated for your case.
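To make the difference between the two conditions concrete, here is a minimal plain-Python sketch. It is not Ruta's API: the functions contains and part_of, the (begin, end) offset tuples and the sample text are all made up for illustration, and it assumes the condition is checked against each single token matched by ANY.

```python
# Plain-Python model of the two conditions; spans are (begin, end) character offsets.
# This is NOT Ruta's API, only an illustration of the semantics described above.

def contains(matched, annotations):
    """CONTAINS-like check: some annotation lies completely inside the matched span."""
    mb, me = matched
    return any(mb <= b and e <= me for b, e in annotations)


def part_of(matched, annotations):
    """PARTOF-like check: the matched span lies completely inside some annotation."""
    mb, me = matched
    return any(b <= mb and me <= e for b, e in annotations)


# Hypothetical text "see Main Street now": POI2 covers "Main Street" (offsets 4..15).
poi2 = [(4, 15)]
token_main = (4, 8)    # the single token "Main", inside POI2
token_now = (16, 19)   # the single token "now", outside POI2

# A single token never contains the whole multi-token POI2, so a negated
# CONTAINS check lets it through; PARTOF is the check that catches it.
blocked_by_contains = contains(token_main, poi2)
blocked_by_part_of = part_of(token_main, poi2)
outside_part_of = part_of(token_now, poi2)
```

In this model, blocked_by_contains is False while blocked_by_part_of is True, which matches the observation that -CONTAINS(POI2) does not stop the match on tokens inside a POI2.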
Within UIMA Ruta you could use at least two approaches to tackle your problem:
1.) Boundary Matching - annotates the text between predefined boundaries
(Annotation1 ANY+{-PARTOF(Annotation2)} Annotation3){-> Annotation4};
2.) Transformation - modifies the offset of the target annotation (your solution)
Annotation1{->SHIFT(Annotation1,1,2)} Annotation2;
If your goal is to just change the offsets of the POI1 annotation, then the following rule should work:
POI1{->SHIFT(POI1,1,3)} ANY*?{-PARTOF(POI2), -PARTOF(EndToken)} EndToken;
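The outcome this rule is meant to produce can be sketched in plain Python as follows. This only models the intended result under simplifying assumptions (character offsets as (begin, end) tuples, tokens given in document order); the function name shift_poi_to_end and the sample offsets are hypothetical and say nothing about how the Ruta engine works internally.

```python
# Model of the intended result: extend POI1 up to the end of EndToken,
# but only if no token in between is part of a POI2 annotation.

def shift_poi_to_end(poi1, tokens, poi2_spans, end_token):
    """Return the new (begin, end) for poi1, or None if a POI2 blocks the shift."""
    # Tokens strictly between the end of POI1 and the begin of EndToken.
    between = [t for t in tokens if poi1[1] <= t[0] and t[1] <= end_token[0]]
    for tb, te in between:
        # PARTOF-like check: the token lies inside some POI2 annotation.
        if any(b <= tb and te <= e for b, e in poi2_spans):
            return None
    # Same effect as SHIFT(POI1,1,3): begin of element 1, end of element 3.
    return (poi1[0], end_token[1])


# Hypothetical text "A b c Z" with one token per letter.
tokens = [(0, 1), (2, 3), (4, 5), (6, 7)]
poi1 = (0, 1)
end_token = (6, 7)

shifted = shift_poi_to_end(poi1, tokens, [], end_token)        # no POI2 in between
blocked = shift_poi_to_end(poi1, tokens, [(2, 3)], end_token)  # "b" is a POI2
```

Here shifted is the widened span (0, 7), while blocked is None because a POI2 sits between POI1 and the EndToken.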