hadoop, oozie, oozie-workflow

Oozie workflow throws a socket error but submits the workflow twice after 10 minutes


I am facing a very weird issue. I have a workflow XML that contains about 20 fork-join nodes, each containing 4-8 actions. When I submit this workflow, it waits for about 5-6 minutes and then throws

"Error: IO_ERROR : java.net.SocketException: Connection reset"

What actually happens in the background is that it submits one workflow after 10 minutes and another one after 12 minutes, so the workflow ends up being triggered twice.

I validated my XML and it returned "OK". Since the submission does not return a workflow ID, I am unable to debug it. To be honest, I am not sure where to even start debugging.

I have similar workflows with fewer forks (6), and they all work fine. I am not sure why this one causes all the trouble.


Solution

  • The logs did not provide any meaningful information, so I split my workflow into two XML files. I call the second workflow from the last action of the first workflow. It works well without any issues.
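
For anyone trying the same split, the last action of the first workflow can invoke the second one through Oozie's sub-workflow action. This is only a minimal sketch, not my actual file: the action name, the HDFS path, the `${nameNode}` property, and the `end`/`fail` transition targets are all placeholders that you would replace with your own.

```xml
<!-- Last action of the first workflow: runs the second workflow as a child job. -->
<!-- All names and paths here are hypothetical placeholders. -->
<action name="call-second-workflow">
    <sub-workflow>
        <!-- HDFS directory (or workflow.xml file) of the second workflow application -->
        <app-path>${nameNode}/user/${wf:user()}/apps/second-workflow</app-path>
        <!-- Pass the parent workflow's job configuration down to the child -->
        <propagate-configuration/>
    </sub-workflow>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

The parent action stays in the running state until the child workflow finishes, so the first workflow only completes after the second one does.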