java, jsr352, jberet

Stop chunk processing from a Processor or Writer


I have batch jobs that perform chunk processing. This loop usually continues until the Reader has no more items to read and returns null, or until an exception is thrown at some point, in which case the batch job fails and the current transaction is rolled back.

Now I have a generic Reader, e.g. a JDBC reader that reads from a database, plus some Processor and Writer. In my experience, either the Processor or the Writer can detect that no more records should be read, but the Reader cannot. You may think of this as a "quota limit reached" condition in my application.

How can the Processor or the Writer exit the chunk processing loop without failing the job and thus rolling back the last transaction?


Solution

  • I found two alternatives that allow using generic readers while still terminating the processing loop from the Processor or Writer. The first is a variant of @cheng's answer. The other makes use of JSR-352 error handling concepts.

    1. Variant of the suggested solution

    Let's assume we have some Processor or Writer that communicates its state (or health, or whatever you want to call it) in the StepContext via the transient user data. But at the same time we have a Reader that does not look at the StepContext. What can we do?

    We can attach an ItemReadListener. It is called before and after the Reader, and in afterRead we can indeed look at the StepContext and decide not to pass the read item on. To the chunk processing it looks as if the Reader returned null, and the loop terminates for good.
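To make the flow concrete, here is a minimal standalone sketch of the signalling half of this setup. All class names are hypothetical, and a small stand-in is used for the real jakarta.batch.runtime.context.StepContext (whose setTransientUserData/getTransientUserData methods the sketch mirrors) so the snippet compiles without a batch runtime; in a real job artifact you would @Inject the StepContext instead.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal stand-in for StepContext's transient-user-data accessors so the
// sketch runs standalone; a real artifact injects the container-provided
// StepContext instead.
class StepContextStub {
    private final AtomicReference<Object> transientUserData = new AtomicReference<>();
    public void setTransientUserData(Object data) { transientUserData.set(data); }
    public Object getTransientUserData() { return transientUserData.get(); }
}

// Hypothetical processor: counts processed items and flags the quota
// condition through the transient user data, where any other step
// artifact (such as an ItemReadListener) can see it.
class QuotaAwareProcessor {
    private final StepContextStub stepContext;
    private final long quota;
    private long processed;

    QuotaAwareProcessor(StepContextStub stepContext, long quota) {
        this.stepContext = stepContext;
        this.quota = quota;
    }

    public Object processItem(Object item) {
        if (++processed >= quota) {
            stepContext.setTransientUserData(Boolean.TRUE); // "quota reached" flag
        }
        return item;
    }
}

// Hypothetical listener side: in a real job this check would live in an
// ItemReadListener's afterRead, consulting the same injected StepContext.
class QuotaReadListener {
    private final StepContextStub stepContext;

    QuotaReadListener(StepContextStub stepContext) { this.stepContext = stepContext; }

    public boolean quotaReached() {
        return Boolean.TRUE.equals(stepContext.getTransientUserData());
    }
}
```

The transient user data is a good channel here because it is scoped to the running step and is not persisted to the job repository, so it cannot leak into a restart.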

    2. JSR-352 concepts

    This mechanism is even easier to use, and it lets some other built-in features of Java Batch shine.

    Again, let's assume we have some Processor or Writer that can identify the condition that should end the loop, and a Reader that is not aware of it. A very simple approach is for the Processor or Writer to throw an exception. This exception should not be a generic Exception, though; it should be something more specific like TargetRecordLockedException or QuotaLimitReachedException. But does that not terminate the whole batch job?

    We are not done yet. In the job XML, within your chunk, add something like the following (in a real job the class attributes must contain fully qualified class names):

        <skippable-exception-classes>
            <include class="TargetRecordLockedException"/>
        </skippable-exception-classes>
        <retryable-exception-classes>
        </retryable-exception-classes>
        <no-rollback-exception-classes>
            <include class="QuotaLimitReachedException"/>
        </no-rollback-exception-classes>
    

    With that you tell the chunk processing what to do with the exception, and you have four choices:

    • Include it in skippable-exception-classes: the offending item is skipped and processing continues with the next item.
    • Include it in retryable-exception-classes: the current chunk is rolled back and retried.
    • Include it in no-rollback-exception-classes: the current chunk transaction is not rolled back when the exception occurs.
    • Include it nowhere: the default applies, i.e. the job fails and the current transaction is rolled back.
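As a sketch of the exception-driven variant, here is a standalone, hypothetical writer that enforces a quota by throwing a specific exception. The exception name matches the one used in the job XML above but is otherwise an assumption; a real artifact would implement the spec's ItemWriter interface and throw from writeItems, letting the runtime match the exception against the configured exception classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain exception, specific enough to be listed in the
// job XML's exception-classes elements.
class QuotaLimitReachedException extends Exception {
    QuotaLimitReachedException(String message) { super(message); }
}

// Sketch of a quota-enforcing writer; in a real job this would implement
// jakarta.batch.api.chunk.ItemWriter.
class QuotaCheckingWriter {
    private final long quota;
    private final List<Object> written = new ArrayList<>();

    QuotaCheckingWriter(long quota) { this.quota = quota; }

    public void writeItems(List<Object> items) throws QuotaLimitReachedException {
        for (Object item : items) {
            if (written.size() >= quota) {
                // Instead of silently dropping the item, signal the
                // condition; the batch runtime then applies the skip /
                // no-rollback behavior configured in the job XML.
                throw new QuotaLimitReachedException("quota of " + quota + " items reached");
            }
            written.add(item); // stands in for the real persistence call
        }
    }

    public long writtenCount() { return written.size(); }
}
```

Note that throwing per item keeps the decision close to the record that triggered it, which is what makes the skip metrics discussed below meaningful.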

    Hmm, you could have achieved this by simply throwing or not throwing an exception, couldn't you? Not quite: the batch runtime internally keeps metrics, among them the number of records read, the number of records written, and the number of records filtered (meaning the Processor returned null). If your Writer does not write for some reason but does not tell anyone, the metrics will show X items written although the actual number is lower. How much lower? If you use a skippable exception instead, you can see directly how many items were written and how many were skipped.
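To illustrate which counters are involved, here is a small standalone sketch that summarizes the relevant step metrics. The enum mirrors a subset of the real Metric.MetricType constant names from the JSR-352 API (in a real job you would obtain these via StepContext#getMetrics()); the summarizing helper itself is hypothetical.

```java
import java.util.EnumMap;
import java.util.Map;

// Stand-in enum mirroring a subset of the real Metric.MetricType names,
// so the sketch runs without a batch runtime.
enum MetricType { READ_COUNT, WRITE_COUNT, FILTER_COUNT, WRITE_SKIP_COUNT }

class MetricsReport {
    // Renders the counters that distinguish "written", "filtered by the
    // processor", and "skipped via a skippable exception".
    static String summarize(Map<MetricType, Long> metrics) {
        return "read=" + metrics.getOrDefault(MetricType.READ_COUNT, 0L)
             + " written=" + metrics.getOrDefault(MetricType.WRITE_COUNT, 0L)
             + " filtered=" + metrics.getOrDefault(MetricType.FILTER_COUNT, 0L)
             + " write-skipped=" + metrics.getOrDefault(MetricType.WRITE_SKIP_COUNT, 0L);
    }
}
```

With a skippable exception, the gap between items read and items written shows up explicitly in the skip counter instead of being silently absorbed.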

    As examples are scarce, here is a related question on this topic: JSR 352 - Why does exception included in <skippable-exception-classes> stop the job