apache-nifi, journal, data-ingestion

NiFi FlowFile Repository failed to update


I'm using Apache NiFi to ingest and preprocess some CSV files, but when it runs for a long time, it always fails. The error is always the same:

FlowFile Repository failed to update

Searching the logs, I always see this error:

2018-07-11 22:42:49,913 ERROR [Timer-Driven Process Thread-10] o.a.n.p.attributes.UpdateAttribute UpdateAttribute[id=c7f45dc9-ee12-31b0-8dee-6f1746b3c544] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: FlowFile Repository failed to update: org.apache.nifi.processor.exception.ProcessException: FlowFile Repository failed to update
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository failed to update
        at org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:405)
        at org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:336)
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:28)
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
        at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Cannot update journal file ./flowfile_repository/journals/8772495.journal because this journal has already been closed
        at org.apache.nifi.wali.LengthDelimitedJournal.checkState(LengthDelimitedJournal.java:223)
        at org.apache.nifi.wali.LengthDelimitedJournal.update(LengthDelimitedJournal.java:178)
        at org.apache.nifi.wali.SequentialAccessWriteAheadLog.update(SequentialAccessWriteAheadLog.java:121)
        at org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.updateRepository(WriteAheadFlowFileRepository.java:300)
        at org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.updateRepository(WriteAheadFlowFileRepository.java:257)

This makes me believe that the root cause is that NiFi cannot update the journal file ./flowfile_repository/journals/8772495.journal because the journal has already been closed, as seen in the log file.

How can I solve this issue?

Thanks!


Solution

  • If NiFi is having trouble writing to the journal file, there are a few things to check.

    To solve the problem, you may need to tweak a couple of things. Does the flow handle the CSV files efficiently? Does NiFi have enough memory for what it needs to do with the data? Would it be more appropriate to handle the CSV files as records? If that concept is unfamiliar, check out this post that introduces record processing in NiFi. I hope some of these resources help you get a little closer to a solution. If you have a follow-up question, let me know.
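One quick way to approach the memory question is to look at the heap size NiFi was started with. The sketch below is only an illustration, not part of NiFi itself: it assumes a standard install where the JVM heap is set through `java.arg.N=-Xms...` and `java.arg.N=-Xmx...` lines in `conf/bootstrap.conf`, and the helper name `heap_settings` is made up for this example.

```python
import re

# Matches JVM heap flags such as -Xms512m or -Xmx4g inside a config line.
HEAP_FLAG = re.compile(r"-(Xms|Xmx)(\d+)([kKmMgG]?)")
UNIT_BYTES = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_settings(conf_text):
    """Return the heap flags found in bootstrap.conf text as {'Xms': bytes, 'Xmx': bytes}.

    Assumption: heap flags live on uncommented lines that start with 'java.arg.'.
    Flags that are missing from the text are simply absent from the result.
    """
    found = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a JVM argument line.
        if line.startswith("#") or not line.startswith("java.arg."):
            continue
        for name, amount, unit in HEAP_FLAG.findall(line):
            found[name] = int(amount) * UNIT_BYTES[unit.lower()]
    return found
```

To try it, read the file yourself and pass its contents in, for example `heap_settings(open("conf/bootstrap.conf").read())`. A small maximum heap (`Xmx`) on a flow that loads whole CSV files into memory would be one thing to rule out; it does not by itself prove that memory is the cause of the journal error.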