
Spring Batch: dynamic or rotating writer


I'm trying to implement the following.

Due to size constraints, I have to split my output file into chunks of, for example, 10k rows.

So I need to write the first 10k rows to "out1.csv", the next 10k to "out2.csv", and so on.

With a single output file, the batch:chunk setup with reader, processor, and writer is simple and direct.

The writer is registered as a stream in the batch:streams XML section inside the chunk, so it is opened before use and I avoid the "Writer must be open before it can be written to" exception.

I want an implementation that avoids this rigid, hard-coded configuration:

<batch:chunk reader="reader"  writer="compositeWriter" commit-interval="10000" processor-transactional="false">
    <batch:streams>
        <batch:stream ref="writer1" />
        <batch:stream ref="writer2" />
        <batch:stream ref="writer3" />
        <!-- ... -->
        <batch:stream ref="writer20" />
    </batch:streams>
</batch:chunk>

<bean id="writer1" class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
        <property name="resource" value="out1.csv" />
        ...
</bean>

<bean id="writer2" class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
        <property name="resource" value="out2.csv" />
        ...
</bean>

...
<!-- writer 20 -->

This assumes that 20 writers are enough. I'm looking for a way to create output writers dynamically (maybe programmatically), open them, and avoid the above exception.


Solution

  • Due to size reasons, I have to split my output file in, for example, 10k row chunks. So, I need to dump 10k in file "out1.csv", the next 10k in file "out2.csv", and so on.

    You seem to be using a CompositeItemWriter, but that is not the right tool here: it passes every item to all of its delegates instead of splitting the output across files. What you need is the MultiResourceItemWriter, which starts a new output resource once an item count limit is reached. In your case, configure a MultiResourceItemWriter and set itemCountLimitPerResource to 10000. You can also provide a ResourceSuffixCreator to customize the output file names (out1.csv, out2.csv, etc.).
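
    A rough sketch of what that configuration might look like, written in the same XML style as the question. It is untested and rests on several assumptions: the bean ids multiWriter and delegateWriter, the base resource file:out, and the PassThroughLineAggregator are placeholders, and com.example.CsvSuffixCreator is a hypothetical class you would write yourself, implementing ResourceSuffixCreator so that getSuffix(index) returns index + ".csv". Without a custom suffix creator, the default one should produce names like out.1, out.2.

```xml
<batch:chunk reader="reader" writer="multiWriter" commit-interval="10000" />

<!-- Rolls over to a new output file once 10000 items have been written -->
<bean id="multiWriter" class="org.springframework.batch.item.file.MultiResourceItemWriter">
    <!-- Base name only (assumed location); the suffix creator supplies the rest -->
    <property name="resource" value="file:out" />
    <property name="itemCountLimitPerResource" value="10000" />
    <property name="delegate" ref="delegateWriter" />
    <!-- Hypothetical class, not part of Spring Batch: implements ResourceSuffixCreator -->
    <property name="resourceSuffixCreator">
        <bean class="com.example.CsvSuffixCreator" />
    </property>
</bean>

<!-- No resource is set here: MultiResourceItemWriter assigns one per output file -->
<bean id="delegateWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <!-- Placeholder aggregator that writes item.toString(); replace with your own -->
    <property name="lineAggregator">
        <bean class="org.springframework.batch.item.file.transform.PassThroughLineAggregator" />
    </property>
</bean>
```

    With this setup there should be no need for a batch:streams section: the MultiResourceItemWriter is itself the step's writer, and it opens and closes the delegate as it switches files. Note that rollover is only checked at chunk boundaries, so keeping commit-interval equal to (or a divisor of) itemCountLimitPerResource is what keeps each file at exactly 10000 rows.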