java project-panama java-ffm

Java Foreign Function & Memory API, problems with Locale


Long story short, I've been calling native LAMMPS through the new Foreign Function & Memory API (Project Panama) in JDK 22.

I have problems with the locale.

atof parses 2.2598258677677969 as 2 because, on my system, localeconv()->decimal_point is a comma (,).

I've tried running export LC_ALL=C in the terminal and adding the following to the native code itself:

#include <clocale>
setlocale(LC_ALL, "C");

But I haven't had any success so far. Unfortunately, I'm not familiar with the native environment.

My OS is Linux (Ubuntu).

To reproduce:

val l = Linker.nativeLinker()
val d = FunctionDescriptor.of(ValueLayout.JAVA_DOUBLE, ValueLayout.ADDRESS)
val atof = l.downcallHandle(l.defaultLookup().find("atof").get(), d)
val res = atof.invoke(Arena.global().allocateFrom("2.2598258677677969")) as Double
println(res)

On the native side, this:

    auto a = setlocale(LC_ALL, NULL);
    utils::logmesg(this, "old locale {}\n", a);

prints the following:

old locale LC_CTYPE=en_US.UTF-8;LC_NUMERIC=it_IT.UTF-8;LC_TIME=it_IT.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=it_IT.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=it_IT.UTF-8;LC_NAME=it_IT.UTF-8;LC_ADDRESS=it_IT.UTF-8;LC_TELEPHONE=it_IT.UTF-8;LC_MEASUREMENT=it_IT.UTF-8;LC_IDENTIFICATION=it_IT.UTF-8

Output of locale -a:

C C.utf8 POSIX de_AT.utf8 de_BE.utf8 de_CH.utf8 de_DE.utf8 de_IT.utf8 de_LI.utf8 de_LU.utf8 en_AG en_AG.utf8 en_AU.utf8 en_BW.utf8 en_CA.utf8 en_DK.utf8 en_GB.utf8 en_HK.utf8 en_IE.utf8 en_IL en_IL.utf8 en_IN en_IN.utf8 en_NG en_NG.utf8 en_NZ.utf8 en_PH.utf8 en_SG.utf8 en_US.utf8 en_ZA.utf8 en_ZM en_ZM.utf8 en_ZW.utf8 it_CH.utf8 it_IT.utf8


Solution

  • I can reproduce your issue if I set my LC_NUMERIC locale to it_IT.utf8:

    setlocale(LC_NUMERIC(), Arena.global().allocateFrom("it_IT.utf8"));
    var res = atof(Arena.global().allocateFrom("2.2598258677677969"));
    System.out.println(res); // 2.0
    

    Setting the LC_ALL environment variable to C, as you tried, should fix the issue and make atof return 2.259825867767797. The variable has to be visible to the JVM process itself: an export in one terminal only affects processes started from that shell. You can check that LC_ALL is set correctly by printing the result of System.getenv("LC_ALL") just before the call to atof.
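A minimal sketch of that check is below. It assumes the usual POSIX precedence (LC_ALL overrides LC_NUMERIC, which overrides LANG); the class and method names are made up for illustration. It only reports what the environment requests, not what a later setlocale call in native code may have changed.

```java
import java.util.Map;

public class LocaleEnvCheck {
    // Hypothetical helper: returns the locale name the environment requests
    // for LC_NUMERIC, checking LC_ALL first, then LC_NUMERIC, then LANG.
    public static String requestedNumericLocale(Map<String, String> env) {
        for (String name : new String[] {"LC_ALL", "LC_NUMERIC", "LANG"}) {
            String value = env.get(name);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        // Assumption: with nothing set, glibc falls back to the C locale.
        return "C";
    }

    public static void main(String[] args) {
        // Print the raw variables as seen by this JVM process.
        System.out.println("LC_ALL     = " + System.getenv("LC_ALL"));
        System.out.println("LC_NUMERIC = " + System.getenv("LC_NUMERIC"));
        System.out.println("LANG       = " + System.getenv("LANG"));
        System.out.println("requested numeric locale: "
                + requestedNumericLocale(System.getenv()));
    }
}
```

If the last line prints something like it_IT.UTF-8, the export did not reach the JVM, which often happens when the program is started from an IDE rather than from the shell where the variable was set.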


    As an alternative, you could explicitly set LC_ALL (or just LC_NUMERIC) to C at the start of your program, with a call to setlocale:

    setlocale(LC_ALL(), Arena.global().allocateFrom("C"));
    

    Note that the numeric values of the LC_ALL and LC_NUMERIC macros are implementation-defined, so for the above snippets I'm using a jextract-ed interface for the C standard library, which gives me the right values of LC_ALL and LC_NUMERIC for my platform. See the sample here.
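If setting up jextract is not an option, the same call can be sketched by hand with the Linker API. This is only an illustration, not the jextract-generated code: it hard-codes LC_NUMERIC as 1, which is an assumption that holds for glibc on Linux and may be wrong elsewhere, and the class and method names are invented for the example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class NumericLocaleFix {
    // ASSUMPTION: glibc on Linux, where <locale.h> defines LC_NUMERIC as 1.
    // Other C libraries may use a different value.
    static final int LC_NUMERIC = 1;

    private static final Linker LINKER = Linker.nativeLinker();

    // char *setlocale(int category, const char *locale);
    private static final MethodHandle SETLOCALE = LINKER.downcallHandle(
            LINKER.defaultLookup().find("setlocale").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.ADDRESS,
                    ValueLayout.JAVA_INT, ValueLayout.ADDRESS));

    // double atof(const char *nptr);
    private static final MethodHandle ATOF = LINKER.downcallHandle(
            LINKER.defaultLookup().find("atof").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_DOUBLE, ValueLayout.ADDRESS));

    // Switches LC_NUMERIC to "C" for the whole process, then parses with atof.
    public static double atofInCLocale(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment result = (MemorySegment) SETLOCALE.invokeExact(
                    LC_NUMERIC, arena.allocateFrom("C"));
            // setlocale returns NULL when the request cannot be honored.
            if (result.address() == 0L) {
                throw new IllegalStateException("setlocale(LC_NUMERIC, \"C\") failed");
            }
            return (double) ATOF.invokeExact(arena.allocateFrom(text));
        } catch (Throwable t) {
            throw new IllegalStateException("native call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(atofInCLocale("2.2598258677677969"));
    }
}
```

Keep in mind that setlocale changes process-wide state, so it is safest to call it once at startup, before other threads start parsing or formatting numbers in native code. Running with --enable-native-access=ALL-UNNAMED avoids the restricted-method warning on JDK 22.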

    To generate the interface, install jextract and Java 22, add the bin directory of both to your PATH, and then run the following 2 commands:

    $ jextract --output src -t libc --header-class-name LibC 'libc.h'
    $ javac -d classes src/**/*.java
    

    The libc.h file just contains #includes of all the C standard library header files, minus stdbit.h and stdckdint.h which are too new at the time of writing.

    That should generate a src directory with the extracted sources and write the compiled classes to a classes directory, which you can then add to the class path of your JVM application. The generated classes and functions can be imported with:

    import static libc.LibC.*;
    import libc.*;