
uimaFIT: Build up a list of JCas instances


To evaluate my uimaFIT pipeline, I want to build a list of JCas instances that were annotated by the pipeline and written to XMI files. In the evaluation I read the XMI files back in and want to store the JCas for each file in a list for further processing.

JCasIterable goldIterable = SimplePipeline.iteratePipeline(xmiReaderGold);
JCasIterator goldIterator = goldIterable.iterator();

ArrayList<JCas> goldJCasList = new ArrayList<JCas>();

while (goldIterator.hasNext()) {
    JCas goldJCas = goldIterator.next().getCas().getJCas();
    goldJCasList.add(goldJCas);
}

The problem is that in every iteration of the while loop, the JCas added to the list in the previous iteration gets overwritten by the current JCas. How can I avoid this and build the list correctly? I tried creating a new JCas object with JCas goldJCas = JCasFactory.createJCas() before calling next() on the iterator and adding the JCas to the list, but I still get the same result.


Solution

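  • The JCas instance returned by iteratePipeline is always the same one; it is reused for performance reasons. Every entry in your list therefore refers to the same object, which holds whichever document was read last.

    If you want a list of JCas instances, create a new JCas for each document and let the reader fill it, roughly like this: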

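    // <parameters> stands for your reader's configuration parameters
    CollectionReader reader = CollectionReaderFactory.createReader(MyReader.class, <parameters>);
    List<JCas> documents = new ArrayList<>();
    while (reader.hasNext()) {
        // Create a fresh JCas per document so earlier list entries are not overwritten
        JCas document = JCasFactory.createJCas();
        // Let the reader load the next document into the new CAS
        reader.getNext(document.getCas());
        documents.add(document);
    }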
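
To see why the list entries appear to be overwritten, here is a minimal plain-Java sketch of the same effect, without any UIMA dependency. The names Doc, FillingReader and ReuseDemo are hypothetical stand-ins invented for illustration (Doc plays the role of a JCas, FillingReader the role of a CollectionReader); this models the behaviour described above and is not how uimaFIT is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a JCas: a mutable container for one document.
class Doc {
    String text = "";
}

// Hypothetical stand-in for a CollectionReader: fills a container it is given.
class FillingReader {
    private final Iterator<String> source;

    FillingReader(List<String> texts) {
        this.source = texts.iterator();
    }

    boolean hasNext() {
        return source.hasNext();
    }

    void getNext(Doc target) {
        target.text = source.next(); // overwrites the container's content in place
    }
}

class ReuseDemo {
    // Mirrors the question: one shared container is refilled and stored repeatedly.
    static List<String> collectShared(List<String> texts) {
        FillingReader reader = new FillingReader(texts);
        Doc shared = new Doc();
        List<Doc> docs = new ArrayList<>();
        while (reader.hasNext()) {
            reader.getNext(shared);
            docs.add(shared); // adds the same reference every time
        }
        return textsOf(docs);
    }

    // Mirrors the answer: a new container is created for each document.
    static List<String> collectFresh(List<String> texts) {
        FillingReader reader = new FillingReader(texts);
        List<Doc> docs = new ArrayList<>();
        while (reader.hasNext()) {
            Doc fresh = new Doc();
            reader.getNext(fresh);
            docs.add(fresh); // each entry is an independent object
        }
        return textsOf(docs);
    }

    // Reads the current content of every stored container.
    private static List<String> textsOf(List<Doc> docs) {
        List<String> out = new ArrayList<>();
        for (Doc d : docs) {
            out.add(d.text);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a.xmi", "b.xmi", "c.xmi");
        System.out.println(collectShared(input)); // prints [c.xmi, c.xmi, c.xmi]
        System.out.println(collectFresh(input));  // prints [a.xmi, b.xmi, c.xmi]
    }
}
```

The list in collectShared has three entries, but all three point at one object, so they all show the last document. Allocating the container inside the loop, as in collectFresh, gives each entry its own state.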