javalinuxmultithreadingapache-commons-exec

Apache Commons Exec - sometimes a Thread can't open a local file in Linux


WARNING -- THE CAUSE IS NOT A MISSING FILE -- ALL THREADS ARE CALLING THE SAME SCRIPT FILE

I am starting 5-6 threads that call a local script in a Red Hat box.

I've noticed that sometimes, I get the following error message

couldn't read file "/home/leo/myScript.exp": no such file or directory

Obviously, all processes are executing the script, so it seems to be something related to [1] OS has some restriction on simultaneous processes that can run a script or access a file for reading or [2] Java is trying to perform some operation in some stream that is not ready (I was assuming that commons-exec would take care of this for me)

Here's the code

ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
CommandLine commandline = CommandLine.parse("/home/leo/myScript.exp");
DefaultExecutor exec = new DefaultExecutor();
PumpStreamHandler streamHandler = new PumpStreamHandler(outputStream);
exec.setStreamHandler(streamHandler);
try {
   exec.execute(commandline); <<< error happens here
}catch(IOException io) {
   throw new Exception("");
}

If the error is [1], then I'd like to know how to relax this restriction in the linux OS

If the error is [2], then I'd like to know how do I tell commons-exec to wait for the resource to be ready (in the worst case I'll just add some retries, but I think this is not very elegant)

If the error is something else, at least knowing the cause will be good enough for me to find some solution.

UPDATE - March 15th

Well, here's the thing.

The script is an expect script that uses a library to call a Java class.

One thing I've noticed is that the script goes well until it calls a java method that creates a database connection.

Because the number of threads is low (3~5) I don't think it's a problem in the database. Instead, it seems to me that something is blocking the script to be called while the java code is being called and/or while the java code is creating the database connection.

I am still trying to get the exact exception, but the expect script looks like this (kind of)

#!/opt/tclblend/bin/expect -f
set edfDir "/usr/local/nssa/bin/edf";
set env(LD_LIBRARY_PATH) "/opt/tclblend/lib/tcljava1.4.1"; # for tclBlend

## always use absolute paths
set env(TCL_CLASSPATH) "/home/leoks/EclipseIndigo/workspace2/xyzJavaWrapper/bin";
set env(CLASSPATH) {/home/leoks/EclipseIndigo/workspace2/xyzTomEE/lib/commons-logging-1.1.1.jar:/home/leoks/EclipseIndigo/workspace2/xyzConfiguration/lib/commons-configuration-1.9.jar:/home/leoks/EclipseIndigo/workspace2/xyzConfiguration/lib/commons-lang-2.4.jar:/home/leoks/EclipseIndigo/workspace2/xyz/3rdPartyJDBCJars/ojdbc6.jar:/home/leoks/EclipseIndigo/workspace2/xyzJavaWrapper/lib/commons-dbutils-1.5.jar:/home/leoks/EclipseIndigo/workspace2/xyzJavaWrapper/tcl/tcllib/xyzConfiguration.jar};

source $edfDir/lib/statics.tcl;
source $edfDir/lib/acclib.tcl; 

package require java

java::import com.abc.xyz.legacydriver.TCLDriverWrapper
java::import com.abc.xyz.legacydriver.LegacyDriverTaskInputData
java::import com.abc.xyz.legacydriver.LegacyDriverTaskEnum

set ticket  [ lindex $argv 0 ];
set inputData [ java::call com.abc.xyz.legacydriver.TCLDriverWrapper pullNextInputData $ticket ]

where pullNextInputData looks like

public static LegacyDriverTaskInputData pullNextInputData(String token) throws Exception {

    try {
        return pullNextInputDataImpl(token);
    } catch (Exception e) {
        e.printStackTrace();
        throw e;
    }
}

private static LegacyDriverTaskInputData pullNextInputDataImpl(String token) throws Exception {
    Connection conn = null;
    try{
        conn = new TCLDriverWrapper().getConnection();
        QueryRunner run = new QueryRunner();
        ResultSetHandler<LegacyDriverTaskInputData> rsh = new BeanHandler<LegacyDriverTaskInputData>(LegacyDriverTaskInputData.class);
        LegacyDriverTaskInputData inputData = run.query(conn,"select * from LegacyDriverTask where id = ?",rsh,Long.valueOf(token));
        return inputData;
    } catch (ClassNotFoundException e) {
        e.printStackTrace();
        throw e;
    } catch (SQLException e) {
        e.printStackTrace();
        throw e;
    } finally {
        DbUtils.close(conn);
    }
}

and getConnection() is just a regular driver instantiation code like (it uses apache dbutils)

private Connection getConnectionImpl() throws Exception{    
    Class.forName("driver name");
    Properties props = new Properties();
    props.put("user", UtilConf.getProperty("javawrapper.user"));
    props.put("password", UtilConf.getProperty("javawrapper.password"));
    return DriverManager.getConnection(UtilConf.getProperty("javawrapper.jdbc"), props);
}

As soon as I get a stacktrace, I'll put it here

UPDATE - March 16th

The stacktrace does not say much :-(

2016-03-17 01:49:10,034 INFO   [QProcessor] Threads started (ok=0 nok=0 wait=0) org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:153)
at com.ericsson.xyz.tomee.q.QWorker.onMessageImpl(QWorker.java:776)
at com.ericsson.xyz.tomee.q.QWorker.onMessage(QWorker.java:303)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.openejb.core.interceptor.ReflectionInvocationContext$Invocation.invoke(ReflectionInvocationContext.java:182)
at org.apache.openejb.core.interceptor.ReflectionInvocationContext.proceed(ReflectionInvocationContext.java:164)
at org.apache.openejb.monitoring.StatsInterceptor.record(StatsInterceptor.java:180)
at org.apache.openejb.monitoring.StatsInterceptor.invoke(StatsInterceptor.java:99)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.openejb.core.interceptor.ReflectionInvocationContext$Invocation.invoke(ReflectionInvocationContext.java:182)
at org.apache.openejb.core.interceptor.ReflectionInvocationContext.proceed(ReflectionInvocationContext.java:164)
at org.apache.openejb.core.interceptor.InterceptorStack.invoke(InterceptorStack.java:80)
at org.apache.openejb.core.stateless.StatelessContainer._invoke(StatelessContainer.java:212)
at org.apache.openejb.core.stateless.StatelessContainer.invoke(StatelessContainer.java:181)
at org.apache.openejb.core.ivm.EjbObjectProxyHandler.synchronizedBusinessMethod(EjbObjectProxyHandler.java:268)
at org.apache.openejb.core.ivm.EjbObjectProxyHandler$1.call(EjbObjectProxyHandler.java:253)
at org.apache.openejb.async.AsynchronousPool$AsynchronousCall.call(AsynchronousPool.java:110)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

Solution

  • I'd like to thank all the people who helped in this case.

    I'd delete this question but since I've put a bounty on it, I can't. (maybe when the bounty expires, if SO allows me to, I'll do it then)

    I've finally found a way to not have the problem, and it seems to be related to the way the application was allocating resources for the threads, so given the information provided in the question, I think it was not possible to figure out the root cause.

    Anyway, I'll try to describe here what happened.

    enter image description here (click to enlarge)

    The problem was I've noticed was in (5) (6) (7). But the cause was in (3).

    TomEE configuration for the thread pool that serves the @Asynchronous methods had fewer threads (my app is not using the default values) than necessary, so it seems some resource was being starved.

    Also the error message seems to have some bug that I haven't identified yet, so even the error message can be invalid in this case.

    Because the question had not all the variables necessary for its resolution, I'm providing a whole explanation here about the problem.

    Since this is a client code, I cannot make the code available, or provide all the information. Even package names had to be obfuscated.

    Again, I'd like to thank anyone who spent time trying to help here and I'd like to apologize for not providing all the necessary data.