java clang llvm project-panama

Jextract YARA headers throws unknown type name 'intmax_t'


I want to generate Java bindings for the /usr/include/yara.h header file using the https://github.com/openjdk/jextract tool.

From the README:

Jextract

jextract is a tool which mechanically generates Java bindings from a native library headers. This tools leverages the clang C API in order to parse the headers associated with a given native library, and the generated Java bindings build upon the Foreign Function & Memory API. The jextract tool was originally developed in the context of Project Panama (and then made available in the Project Panama Early Access binaries).

I can build and test the project, but jextract fails on the type definition of intmax_t when I run this command:

build/jextract/bin/jextract --source --output java-yara --target-package com.virustotal.yara /usr/include/yara.h
/usr/include/inttypes.h:290:8: error: unknown type name 'intmax_t'

I checked /usr/include/inttypes.h, and it does include the <stdint.h> header:

/*
 *  ISO C99: 7.8 Format conversion of integer types <inttypes.h>
 */

#ifndef _INTTYPES_H
#define _INTTYPES_H 1

#include <features.h>
/* Get the type definitions.  */
#include <stdint.h>

My fork at https://github.com/MaurizioCasciano/jextract/tree/unknown_type_name_intmax_t has a simple Taskfile that builds the project and reproduces the error, so you can run:

task build
task yara-extract

instead of the full commands, if you prefer.

I also tried adding the relevant include paths, but that didn't help:

-I /usr/include
-I /usr/include/yara
-I /usr/lib/llvm-15/lib/clang/15.0.6/include
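
For clarity, this is roughly how those paths were combined with the original command. The snippet below is only a sketch: `jextract_cmd` is a hypothetical helper that prints the command line (one `-I` per directory) instead of running it, so the exact arguments can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper (not part of jextract): print, but do not run,
# a jextract command line with one -I option per include directory.
jextract_cmd() {
    header=$1
    shift
    cmd="build/jextract/bin/jextract --source --output java-yara --target-package com.virustotal.yara"
    for dir in "$@"; do
        cmd="$cmd -I $dir"
    done
    # The header to parse goes last, after all options.
    printf '%s %s\n' "$cmd" "$header"
}

jextract_cmd /usr/include/yara.h \
    /usr/include \
    /usr/include/yara \
    /usr/lib/llvm-15/lib/clang/15.0.6/include
```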

What else am I missing for jextract to parse the typedef for 'intmax_t'?

Below you can see the tools I'm using.

Environment

$ hostnamectl
Operating System: Ubuntu 22.04.1 LTS              
          Kernel: Linux 5.15.0-52-generic
    Architecture: x86-64

$ java --version
openjdk 19 2022-09-20
OpenJDK Runtime Environment (build 19+36-2238)
OpenJDK 64-Bit Server VM (build 19+36-2238, mixed mode, sharing)

$ gradle --version

------------------------------------------------------------
Gradle 7.6
------------------------------------------------------------

Build time:   2022-11-25 13:35:10 UTC
Revision:     daece9dbc5b79370cc8e4fd6fe4b2cd400e150a8

Kotlin:       1.7.10
Groovy:       3.0.13
Ant:          Apache Ant(TM) version 1.10.11 compiled on July 10 2021
JVM:          19 (Oracle Corporation 19+36-2238)
OS:           Linux 5.15.0-52-generic amd64

$ llvm-config-15 --version
15.0.6

$ clang-15 --version
Ubuntu clang version 15.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Running Clang directly on /usr/include/yara.h works, and I can dump the corresponding abstract syntax tree (AST):

clang-15 -fsyntax-only -Xclang -ast-dump /usr/include/yara.h | grep intmax_t

and the intmax_t typedef is found as expected:

|-TypedefDecl 0x559220f80400 <line:72:1, col:18> col:18 referenced __intmax_t 'long'
|-TypedefDecl 0x559220f80470 <line:73:1, col:27> col:27 referenced __uintmax_t 'unsigned long'
|-TypedefDecl 0x55922102e2f0 <line:101:1, col:21> col:21 referenced intmax_t '__intmax_t':'long'
| `-TypedefType 0x55922102e2c0 '__intmax_t' sugar
|   |-Typedef 0x559220f80400 '__intmax_t'
|-TypedefDecl 0x55922102e380 <line:102:1, col:22> col:22 referenced uintmax_t '__uintmax_t':'unsigned long'
| `-TypedefType 0x55922102e350 '__uintmax_t' sugar
|   |-Typedef 0x559220f80470 '__uintmax_t'
|-FunctionDecl 0x55922102e788 <line:290:1, col:74> col:17 imaxabs 'intmax_t (intmax_t)' extern
| |-ParmVarDecl 0x55922102e6c0 <col:26, col:35> col:35 __n 'intmax_t':'long'
|-FunctionDecl 0x55922102ea68 <line:293:1, line:294:41> line:293:18 imaxdiv 'imaxdiv_t (intmax_t, intmax_t)' extern
| |-ParmVarDecl 0x55922102e8e0 <col:27, col:36> col:36 __numer 'intmax_t':'long'
| |-ParmVarDecl 0x55922102e958 <col:45, col:54> col:54 __denom 'intmax_t':'long'
|-FunctionDecl 0x559221040998 <line:297:1, /usr/include/x86_64-linux-gnu/sys/cdefs.h:79:54> /usr/include/inttypes.h:297:17 strtoimax 'intmax_t (const char *restrict, char **restrict, int)' extern
|-FunctionDecl 0x559221040cc8 </usr/include/inttypes.h:301:1, /usr/include/x86_64-linux-gnu/sys/cdefs.h:79:54> /usr/include/inttypes.h:301:18 strtoumax 'uintmax_t (const char *restrict, char **restrict, int)' extern
|-FunctionDecl 0x5592210410e8 </usr/include/inttypes.h:305:1, /usr/include/x86_64-linux-gnu/sys/cdefs.h:79:54> /usr/include/inttypes.h:305:17 wcstoimax 'intmax_t (const __gwchar_t *restrict, __gwchar_t **restrict, int)' extern
|-FunctionDecl 0x559221041428 </usr/include/inttypes.h:310:1, /usr/include/x86_64-linux-gnu/sys/cdefs.h:79:54> /usr/include/inttypes.h:310:18 wcstoumax 'uintmax_t (const __gwchar_t *restrict, __gwchar_t **restrict, int)' extern
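
Since the standalone compiler finds the typedef, one thing worth comparing is the header search path it uses. The following is a sketch, not a definitive diagnosis: `search_dirs` is a hypothetical helper that extracts the `#include <...>` directories from the verbose output of `clang -E -v`, and the `clang-15` call is skipped if that binary is not installed.

```shell
#!/bin/sh
# Hypothetical helper: read verbose clang output on stdin and print the
# directories listed between "#include <...> search starts here:" and
# "End of search list.", one per line, without leading spaces.
search_dirs() {
    sed -n '/^#include <\.\.\.> search starts here:/,/^End of search list\./{
        /^#include/d
        /^End of search list/d
        s/^ *//
        p
    }'
}

# Preprocess an empty C file verbosely; clang prints its search path on stderr.
if command -v clang-15 >/dev/null 2>&1; then
    clang-15 -E -x c -v /dev/null 2>&1 | search_dirs
fi
```

If the directory containing Clang's own stdint.h shows up here but is not available to jextract, that would point at the LLVM installation rather than at the YARA headers.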

Solution

  • As suggested by @Jorn Vernee, this issue was related to the LLVM and Clang installation on Ubuntu, which came from the APT packages available at https://apt.llvm.org/

    With the release archive published on GitHub (LLVM 15.0.6, the one downloaded by the Taskfile below), jextract ran without any problem.

    There are only some warnings for the long double native types being skipped.

    This is the Taskfile.yaml that downloads and extracts the LLVM+Clang archive and generates the YARA Java bindings inside a simple Maven module structure:

    version: 3
    
    dotenv: ['.env']
    
    env:
      LLVM_HOME: "./libs/clang_llvm"
      JTREG_HOME: "/usr/lib/jtreg"
    
    vars:
      PROJECT_ROOT:
        sh: echo $PWD
      YARA_JAVA_ROOT: "samples/yara/yara-java"
      YARA_JAVA_SRC: "{{.YARA_JAVA_ROOT}}/src/main/java"
      YARA_JAVA_RESOURCES: "{{.YARA_JAVA_ROOT}}/src/main/resources"
      YARA_JAVA_HEADERS: "{{.YARA_JAVA_RESOURCES}}/headers"
    
    tasks:
      setup:
        cmds:
          - mkdir -p libs/clang_llvm
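          # Download the archive; "|| true" keeps the task going if wget returns a non-zero status (e.g. the file is already there).
          - wget -nc -O libs/clang_llvm.tar.xz --show-progress https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.6/clang+llvm-15.0.6-x86_64-linux-gnu-ubuntu-18.04.tar.xz || true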
          - tar --strip-components=1 -xvf libs/clang_llvm.tar.xz -C libs/clang_llvm
      build:
        cmds:
          - gradle -Pjdk19_home="${JAVA_HOME}" -Pllvm_home="${LLVM_HOME}" clean verify
      test:
        cmds:
          - gradle -Pjdk19_home="${JAVA_HOME}" -Pllvm_home="${LLVM_HOME}" -Pjtreg_home="${JTREG_HOME}" jtreg
      yara-headers:
        cmds:
          - mkdir -p {{.YARA_JAVA_HEADERS}}
          - cp /usr/include/yara.h {{.YARA_JAVA_HEADERS}}
          - cp -r /usr/include/yara {{.YARA_JAVA_HEADERS}}
      yara-java:
        cmds:
          - task: yara-java-includes
          - sh -c "(
                    build/jextract/bin/jextract
                    -l yara
                    --source
                    --output {{.YARA_JAVA_SRC}}
                    --target-package com.virustotal.yara
                    {{.YARA_JAVA_HEADERS}}/yara.h
                  )"
      yara-java-includes:
        cmds:
          - mkdir -p {{.YARA_JAVA_RESOURCES}}
          - sh -c "(
                    build/jextract/bin/jextract
                    --dump-includes {{.YARA_JAVA_RESOURCES}}/includes.txt
                    {{.YARA_JAVA_HEADERS}}/yara.h
                  )"
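
As a quick sanity check after the setup task, it may help to confirm that the extracted tree contains Clang's builtin stdint.h. This is a minimal sketch under my own assumptions: `has_builtin_stdint` is a hypothetical helper, and it relies on the `lib/clang/<version>/include` layout of the official release archives.

```shell
#!/bin/sh
# Hypothetical check: does the given LLVM home contain Clang's builtin
# stdint.h under lib/clang/<version>/include? Returns 0 if found, 1 otherwise.
has_builtin_stdint() {
    llvm_home=$1
    for f in "$llvm_home"/lib/clang/*/include/stdint.h; do
        # If the glob matches nothing, $f is the literal pattern and -f fails.
        if [ -f "$f" ]; then
            echo "found: $f"
            return 0
        fi
    done
    echo "missing: no lib/clang/<version>/include/stdint.h under $llvm_home"
    return 1
}

# Defaults to the LLVM_HOME used in the Taskfile; never aborts the script.
has_builtin_stdint "${LLVM_HOME:-./libs/clang_llvm}" || true
```

Running the same function against the APT installation prefix gives a direct comparison between the two installations.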