java linux eclipse defunct

Defunct processes when Java starts a terminal command


I am trying to OCR some images with kraken. I prepared a console command for that. It was slow, so I combined it with GNU parallel.

find temp/ -name '*.tif' -or -name '*.jpg' | parallel -j4 kraken -i {} {}.html binarize segment ocr -h

It works fine when I run it in the terminal. When I start it from Java (in Eclipse), execution stops after 30 images. It does not terminate, and it leaves defunct processes behind.

String command = "find temp/ -name '*.tif' -or -name '*.jpg' | parallel -j4 kraken -i {} {}.html binarize segment ocr -h";
Process p = Runtime.getRuntime().exec(new String[]{"/bin/bash","-c",command});
boolean success = p.waitFor() == 0; // blocks until bash exits

I tried several configurations (more memory for both Eclipse and the execution, fewer threads), but nothing helped.

Does anyone have an idea how to avoid the defunct processes, or how the execution can be restarted?


Solution

  • Almost certainly, the problem is that you're not consuming the output of the process, causing its output buffer to fill and therefore the process to stall.

    Try:

    String command = "find temp/ -name '*.tif' -or -name '*.jpg' | parallel -j4 kraken -i {} {}.html binarize segment ocr -h";
    Process p = Runtime.getRuntime().exec(new String[]{"/bin/bash","-c",command});
    InputStream is = p.getInputStream();
    // is.skip(Long.MAX_VALUE);  Doesn't work
    while (is.read() != -1) { } // consume all process output
    p.waitFor();
    

    A complete solution would also consume the error stream (p.getErrorStream()), since a full error buffer stalls the process in the same way. This can be done by starting a separate thread that reads and discards everything written to the error stream.

    (Alternatively, you could redirect the output to /dev/null in the bash command itself, for example by appending > /dev/null 2>&1 to the parallel command.)
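
The two-stream approach described above could look roughly like the sketch below. It is only an illustration, not a tested fix for the kraken setup: the class name DrainingRunner and the helpers drain and run are made up for this example, and it assumes /bin/bash is available. The error stream is discarded on a helper thread while the calling thread discards standard output, so neither pipe buffer can fill up.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, named for this example only.
final class DrainingRunner {

    // Reads the stream until end-of-file and throws the bytes away.
    static void drain(InputStream in) {
        byte[] buffer = new byte[8192];
        try {
            while (in.read(buffer) != -1) {
                // nothing to do: the output is intentionally discarded
            }
        } catch (IOException e) {
            // the stream was closed early; there is nothing left to read
        }
    }

    // Runs the command line through bash, drains both output streams,
    // and returns the exit code of the shell.
    static int run(String command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("/bin/bash", "-c", command).start();

        // stderr is consumed on its own thread ...
        Thread errDrainer = new Thread(() -> drain(p.getErrorStream()));
        errDrainer.start();

        // ... while the calling thread consumes stdout.
        drain(p.getInputStream());

        int exitCode = p.waitFor();
        errDrainer.join();
        return exitCode;
    }

    public static void main(String[] args) throws Exception {
        String command = "find temp/ -name '*.tif' -or -name '*.jpg' | parallel -j4 kraken -i {} {}.html binarize segment ocr -h";
        System.out.println("exit code: " + run(command));
    }
}
```

If the output is never needed, a simpler variant is to call redirectErrorStream(true) on the ProcessBuilder before start(), which merges stderr into stdout so that a single read loop is enough and no extra thread is required.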