In my Java application, I need to copy the contents of one directory to another. Very rarely, the copyDirectory call hangs forever and the code after it never executes, which results in high CPU usage. I took jstack dumps of the application several times and found the same thread in the RUNNABLE state for a long time. Below is the stack trace of that thread.
"pool-2-thread-3" #17 prio=5 os_prio=0 tid=0x00007fab5585c000 nid=0xa81 runnable [0x00007fab0af6f000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.FileDispatcherImpl.size0(Native Method)
    at sun.nio.ch.FileDispatcherImpl.size(FileDispatcherImpl.java:84)
    at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:310)
    - locked <0x00000000c5f59728> (a java.lang.Object)
    at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:705)
    at org.apache.commons.io.FileUtils.doCopyFile(FileUtils.java:1147)
    at org.apache.commons.io.FileUtils.doCopyDirectory(FileUtils.java:1428)
    at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1389)
    at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1261)
    at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1230)
I tried copying the same files manually with a shell command, and they copied successfully. There is also one more thread that has been in the RUNNABLE state for a long time, with the following stack trace.
"pool-2-thread-52" #81581 prio=5 os_prio=0 tid=0x00007fab55951800 nid=0x5db runnable [0x00007faafb2f0000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.FileChannelImpl.position0(Native Method)
    at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:288)
    - locked <0x00000000c5ffde60> (a java.lang.Object)
    at sun.nio.ch.FileChannelImpl.transferFromFileChannel(FileChannelImpl.java:651)
    - locked <0x00000000c5ffde60> (a java.lang.Object)
    at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:708)
    at org.apache.commons.io.FileUtils.doCopyFile(FileUtils.java:1147)
    at org.apache.commons.io.FileUtils.doCopyDirectory(FileUtils.java:1428)
    at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1389)
    at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1261)
    at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1230)
I have no clue why the thread is stuck in that native call. Is this an environmental issue, or something related to the machine?
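I found the cause in IO-385.
I was using Apache commons-io version 2.4, which has a bug where FileUtils.doCopyFile can potentially loop forever. If transferFrom returns 0 (for example, because the source file shrank after its size was read), pos never advances and the following loop never exits, which would explain a thread that stays RUNNABLE and burns CPU: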
for (long count = 0L; pos < size; pos += output.transferFrom(input, pos, count)) {
    count = size - pos > 31457280L ? 31457280L : size - pos;
}
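
For comparison, here is a minimal sketch of the same chunked copy with a guard that stops as soon as transferFrom reports no progress. The class name SafeCopy and the method copyFile are hypothetical names used only for illustration, not part of commons-io. As far as I know, commons-io 2.5 and later contain a guard of this kind, so upgrading the library is probably the simpler fix; the sketch only shows the idea.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of commons-io.
class SafeCopy {
    // Same chunk size as in the snippet above: 30 MiB = 31457280 bytes.
    private static final long CHUNK = 30L * 1024 * 1024;

    // Copies src to dst in chunks and returns the number of bytes copied.
    static long copyFile(Path src, Path dst) throws IOException {
        try (FileChannel input = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel output = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = input.size(); // cached once, like in the original loop
            long pos = 0;
            while (pos < size) {
                long count = Math.min(CHUNK, size - pos);
                long copied = output.transferFrom(input, pos, count);
                if (copied == 0) {
                    // No progress (e.g. the source shrank after size() was read):
                    // stop instead of spinning forever.
                    break;
                }
                pos += copied;
            }
            return pos;
        }
    }
}
```

A caller could compare the return value with the expected file size and treat a shorter result as a failed copy, e.g. long copied = SafeCopy.copyFile(src, dst);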