java matlab 64-bit jna jvm-crash

JNA crashes with 64 bit libmx.so on Linux


I have a legacy Java application that uses JNA to interact with the Matlab C Matrix API. The method below loads a MAT-file:

/**
 * Load a MAT-file. Must be synchronized because of native libraries calls.
 *
 * @param file MAT file
 * @return map with the variables in the MAT file
 */
public static synchronized Map<String, MObj> load(final File file) {
    if (file == null) {
        throw new InvalidMatFileException(null, ERROR_MSG_FILE_NOT_FOUND);
    }
    if (!file.isFile()) {
        throw new InvalidMatFileException(file.getName(), ERROR_MSG_FILE_NOT_FOUND);
    }
    final Map<String, MObj> vars = new HashMap<String, MObj>();
    final Pointer matfile = MAT_LIB.matOpen(file.getAbsolutePath(), "r");
    if (matfile == null) {
        throw new InvalidMatFileException(file.getName(), ERROR_MSG_NOT_A_PROPER_MAT_FILE);
    }
    for (;;) {
        final PointerByReference name = new PointerByReference();
        final Pointer ar = MAT_LIB.matGetNextVariable(matfile, name);
        if (ar == null) {
            break;
        }
        final MObj obj = ToJava.convert(MX_LIB, ar);
        MX_LIB.mxDestroyArray(ar);
        vars.put(name.getValue().getString(0), obj);
    }
    MAT_LIB.matClose(matfile);
    return vars;
}

// The ToJava class has the following methods, which are called from the code above
/**
 * Convert a single mxArray
 *
 * @param mx MX library
 * @param ar mxArray
 * @return MObj instance
 */
public static MObj convert(MX mx, Pointer ar) {
    final ToJava tr = new ToJava(mx);
    tr.transform(ar);
    return tr.obj;
}

/**
 * Top-level transform
 *
 * @param ar mxArray
 */
private void transform(Pointer ar) {
    final ClassID classID = ClassID.valueOf(mx.mxGetClassID(ar));
    switch (classID) {

    case DOUBLE:
        transformDoubleArray(ar);
        break;

Inside the for loop, ToJava.convert calls mx.mxGetClassID(ar) under the hood. This is defined in the API as

mxClassID mxGetClassID(const mxArray *pm);

where mxClassID is an enum. This is mapped in JNA as

int mxGetClassID(Pointer ar);

This is the exact call that causes the crash below:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fbb7fcd45f0, pid=13363, tid=140445588428544
#
# JRE version: 6.0_45-b06
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.45-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libmx.so+0x615f0]  matrix::detail::noninlined::mx_array_api::mxGetClassID(mxArray_tag const*)+0x0
Register to memory mapping:

RAX=0x0000000000000000 is an unknown value
RBX=0x0000000000000008 is an unknown value
RCX=0x00007fbc04252a7a is an unknown value
RDX=0x00007fbc04252a7a is an unknown value
RSP=0x00007fbc09686888 is pointing into the stack for thread: 0x00007fbc04006800
RBP=0x00007fbc09686890 is pointing into the stack for thread: 0x00007fbc04006800
RSI=0x00007fbc04252a7a is an unknown value
RDI=0x0000000000000000 is an unknown value
R8 =0x00007fbc04252a7a is an unknown value
R9 =0x00007fbc04252a95 is an unknown value
R10=0x0000000000000000 is an unknown value
R11=0x00007fbb7fd46800: mxGetClassID+0 in /var/opt/Matlab_MCR/v91/bin/glnxa64/libmx.so at 0x00007fbb7fc73000
R12=0x00007fbc096869d0 is pointing into the stack for thread: 0x00007fbc04006800
R13=0x0000000000000008 is an unknown value
R14=0x0000000000000001 is an unknown value
R15=0x00007fbc096869b0 is pointing into the stack for thread: 0x00007fbc04006800

 Stack: [0x00007fbc09589000,0x00007fbc0968a000],  sp=0x00007fbc09686888,  free space=1014k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libmx.so+0x615f0]  matrix::detail::noninlined::mx_array_api::mxGetClassID(mxArray_tag const*)+0x0

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.sun.jna.Native.invokeInt(Lcom/sun/jna/Function;JI[Ljava/lang/Object;)I+0
j  com.sun.jna.Function.invoke([Ljava/lang/Object;Ljava/lang/Class;ZI)Ljava/lang/Object;+211
j  com.sun.jna.Function.invoke(Ljava/lang/reflect/Method;[Ljava/lang/Class;Ljava/lang/Class;[Ljava/lang/Object;Ljava/util/Map;)Ljava/lang/Object;+271
j  com.sun.jna.Library$Handler.invoke(Ljava/lang/Object;Ljava/lang/reflect/Method;[Ljava/lang/Object;)Ljava/lang/Object;+390
j  com.sun.proxy.$Proxy12.mxGetClassID(Lcom/sun/jna/Pointer;)I+16

To sum it up: so far, I have tried setting LD_LIBRARY_PATH as well as java.library.path, pointing both at the MCR locations on the file system:

/var/opt/Matlab_MCR/v91/runtime/glnxa64:/var/opt/Matlab_MCR/v91/bin/glnxa64:/var/opt/Matlab_MCR/v91/sys/os/glnxa64
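
As a sanity check, the values the JVM actually sees can be printed with a small helper like the one below. This is only a sketch: the class name LibraryPathReport is made up for illustration. It also prints jna.library.path, the system property that JNA itself consults when searching for native libraries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reports the library search settings visible to this JVM. */
final class LibraryPathReport {

    /** Collects the settings in a fixed order; a value is null when it is not set. */
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<String, String>();
        // Environment variable used by the dynamic linker on Linux.
        report.put("LD_LIBRARY_PATH", System.getenv("LD_LIBRARY_PATH"));
        // Property used by System.loadLibrary.
        report.put("java.library.path", System.getProperty("java.library.path"));
        // Property used by JNA when it looks up native libraries.
        report.put("jna.library.path", System.getProperty("jna.library.path"));
        return report;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Running it inside the same launch script as the application shows whether the three MCR directories really reach the process.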

Does anybody know how to troubleshoot this further?


Solution

  • The code in this question calls functions exposed by the Matlab Compiler Runtime (MCR) via JNA. It turns out that Matlab changed this interface for 64-bit architectures: the native API signatures now use the mwSize type, while our JNA mapping uses int. On a 64-bit machine mwSize is equivalent to size_t, which is 64 bits wide, whereas a Java int is always 32 bits. The JNA mapping therefore no longer matches the native API signatures, and after repeated calls through JNA the mismatch eventually crashes the JVM.

    The fix is to replace int with a 64-bit integer type on the Java side. This can be done by extending the abstract class IntegerType provided by JNA and using the subclass wherever the native signature expects mwSize.

    Reference article from Matlab: https://nl.mathworks.com/help/matlab/matlab_external/upgrading-mex-files-to-use-64-bit-api.html
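
To make the mismatch concrete, here is a minimal, JNA-free sketch. It assumes a little-endian 64-bit platform such as glnxa64 and a hypothetical dimensions array {2, 3}. The class and method names are illustrative only and are not part of JNA or the Matlab API. It simulates what native code reads when it expects 64-bit mwSize values (for example through a const mwSize * argument) but the Java side supplied 32-bit ints.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustration only: how 32-bit Java ints are misread as 64-bit mwSize values. */
final class MwSizeMismatchDemo {

    /** Lays out the values as a Java int[] would appear in native memory: 4 bytes each. */
    static ByteBuffer packAsInt32(int... dims) {
        ByteBuffer buf = ByteBuffer.allocate(dims.length * 4).order(ByteOrder.LITTLE_ENDIAN);
        for (int d : dims) {
            buf.putInt(d);
        }
        buf.flip();
        return buf;
    }

    /** Lays out the values as the native side expects them: 8 bytes each. */
    static ByteBuffer packAsInt64(long... dims) {
        ByteBuffer buf = ByteBuffer.allocate(dims.length * 8).order(ByteOrder.LITTLE_ENDIAN);
        for (long d : dims) {
            buf.putLong(d);
        }
        buf.flip();
        return buf;
    }

    /** Reads the first element the way code compiled against a 64-bit mwSize would. */
    static long readFirstAsMwSize(ByteBuffer buf) {
        return buf.getLong(0);
    }

    public static void main(String[] args) {
        // 32-bit layout: the first two ints are fused into one bogus 64-bit value.
        System.out.println(readFirstAsMwSize(packAsInt32(2, 3)));
        // 64-bit layout: the first dimension is read back correctly.
        System.out.println(readFirstAsMwSize(packAsInt64(2L, 3L)));
    }
}
```

With the 32-bit layout the native side would see a first dimension of 12884901890 (3 * 2^32 + 2) instead of 2. In a real mapping, an IntegerType subclass constructed with a size of Native.SIZE_T_SIZE bytes would presumably take the place of the 64-bit layout shown here.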