For the evaluation of my uimaFIT-pipeline I want to build up a list of JCas instances that were annotated by the pipeline and written to xmi-files. In my evaluation I read in the xmi-files and want to access the JCas for each xmi-file and save it in a list in order to process them further.
JCasIterable goldIterable = SimplePipeline.iteratePipeline(xmiReaderGold);
JCasIterator goldIterator = goldIterable.iterator();
ArrayList<JCas> goldJCasList = new ArrayList<JCas>();
while (goldIterator.hasNext()) {
    JCas goldJCas = goldIterator.next().getCas().getJCas();
    goldJCasList.add(goldJCas);
}
The problem is that in every iteration of the while-loop, the JCas that was added to the list in the previous iteration gets overwritten by the current JCas. How do I avoid this, and how can I build up my list correctly? I tried creating a new JCas object with JCas goldJCas = JCasFactory.createJCas() before calling next() on the iterator and adding the JCas to the list, but I still get the same result.
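The JCas instance returned by iteratePipeline is always the same one; it is reused for performance reasons. Every entry in your list therefore points to that single instance, which holds whatever document was read last. Creating a new JCas before calling next() does not help, because the assignment from next() simply replaces your reference with the shared instance again.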
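To make the effect easier to see, here is a small plain-Java sketch with no UIMA dependency. Doc and ReusingIterator are hypothetical stand-ins invented for this illustration (for the JCas and for the iterator you get from iteratePipeline); they only mimic the reuse behaviour and are not part of uimaFIT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {

    /** Hypothetical stand-in for a mutable container such as a JCas. */
    static final class Doc {
        String text;
    }

    /** Hypothetical iterator that hands out the same Doc on every call to next(). */
    static final class ReusingIterator implements Iterator<Doc> {
        private final Iterator<String> source;
        private final Doc shared = new Doc();

        ReusingIterator(List<String> texts) {
            this.source = texts.iterator();
        }

        @Override
        public boolean hasNext() {
            return source.hasNext();
        }

        @Override
        public Doc next() {
            shared.text = source.next(); // overwrites the content of the one shared instance
            return shared;
        }
    }

    /** Stores the returned reference directly: every list entry is the same object. */
    static List<Doc> collectShared(List<String> texts) {
        List<Doc> out = new ArrayList<>();
        Iterator<Doc> it = new ReusingIterator(texts);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    /** Copies the content into a fresh Doc per element, so earlier entries survive. */
    static List<Doc> collectFresh(List<String> texts) {
        List<Doc> out = new ArrayList<>();
        Iterator<Doc> it = new ReusingIterator(texts);
        while (it.hasNext()) {
            Doc copy = new Doc();
            copy.text = it.next().text;
            out.add(copy);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList("a", "b", "c");
        System.out.println(collectShared(texts).get(0).text); // prints "c"
        System.out.println(collectFresh(texts).get(0).text);  // prints "a"
    }
}
```

In collectShared the list has three entries, but all of them are the same object, so the first entry shows the last text. Giving each element its own container, as in collectFresh, is the only way to keep them apart.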
If you want a list of JCas instances, skip iteratePipeline and let the reader fill a freshly created JCas for each document, roughly like this:
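import java.util.ArrayList;
import java.util.List;
import org.apache.uima.collection.CollectionReader;
import org.apache.uima.fit.factory.CollectionReaderFactory;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.jcas.JCas;

CollectionReader reader = CollectionReaderFactory.createReader(MyReader.class, <parameters>);
List<JCas> documents = new ArrayList<>();
while (reader.hasNext()) {
    JCas document = JCasFactory.createJCas(); // fresh JCas per document, so nothing gets overwritten
    reader.getNext(document.getCas());        // the reader fills this JCas with the next document
    documents.add(document);
}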