linux tesseract appimage

How to use AppImageTool to create a package that runs on older Linux


I'm trying to use appimagetool (https://appimage.org/) to create a single-binary executable of the OCR program tesseract (https://github.com/tesseract-ocr). I have built tesseract on Ubuntu 19.10, and I want the executable to run on Ubuntu 14.01.

NOTE: I do not have control over the old version of Ubuntu, and I need features from a recent version of tesseract. I have already tried an existing AppImage of tesseract, and it fails in a similar way to what's detailed below.

Loosely following this tutorial (https://appiomatic.com/blog/creating-appimage-binary-manually-for-linux-from-your-app/), I created a tesseract.AppDir with the requisite layout:

tesseract.AppDir/AppRun
tesseract.AppDir/.DirIcon
tesseract.AppDir/tesseract.desktop
tesseract.AppDir/tesseract.png
tesseract.AppDir/usr
tesseract.AppDir/usr/bin
tesseract.AppDir/usr/bin/tesseract
tesseract.AppDir/usr/lib
tesseract.AppDir/usr/lib/libtesseract.so.5
tesseract.AppDir/usr/lib/libtesseract.so.5.0.0
...
tesseract.AppDir/usr/share
tesseract.AppDir/usr/share/tessdata
tesseract.AppDir/usr/share/tessdata/eng.traineddata
...
tesseract.AppDir/usr/share/tessdata/tessconfigs
...

And created the AppImage:

[Ubuntu 19.10]$ ~/Downloads/appimagetool-x86_64.AppImage tesseract.AppDir
appimagetool, continuous build (commit effcebc), build 2084 built on 2019-05-01 21:02:41 UTC
Using architecture x86_64
/home/kingsley/Software/Tesseract/tesseract/tesseract.AppDir should be packaged as Tesseract-OCR-x86_64.AppImage
Generating squashfs...
Parallel mksquashfs: Using 6 processors
Creating 4.0 filesystem on Tesseract-OCR-x86_64.AppImage, block size 131072.
[=======================================================================================================================|] 1921/1921 100%

Exportable Squashfs 4.0 filesystem, gzip compressed, data block size 131072
    compressed data, compressed metadata, compressed fragments, compressed xattrs
    duplicates are removed
Filesystem size 73511.40 Kbytes (71.79 Mbytes)
    30.95% of uncompressed filesystem size (237490.75 Kbytes)
Inode table size 5971 bytes (5.83 Kbytes)
    57.29% of uncompressed inode table size (10423 bytes)
Directory table size 1019 bytes (1.00 Kbytes)
    56.90% of uncompressed directory table size (1791 bytes)
Number of duplicate files found 0
Number of inodes 92
Number of files 78
Number of fragments 5
Number of symbolic links  3
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 11
Number of ids (unique uids + gids) 1
Number of uids 1
    root (0)
Number of gids 1
    root (0)
Embedding ELF...
Marking the AppImage as executable...
Embedding MD5 digest
Success

However, after I copied it to the older system it would not run, reporting that libpng16.so.16 was missing.

[Ubuntu14]$ ./Tesseract-OCR-x86_64.AppImage 
tesseract: error while loading shared libraries: libpng16.so.16: cannot open shared object file: No such file or directory

Further research led me to believe that I had to manually copy in all the dependencies.

So I ran ldd on the tesseract executable:

[Ubuntu 19.10]$ ldd LOCAL_INSTALL/bin/tesseract 
    linux-vdso.so.1 (0x00007fffd7937000)
    libtesseract.so.5 => not found
    liblept.so.5 => /usr/lib/x86_64-linux-gnu/liblept.so.5 (0x00007f44c03d3000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f44c03b0000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f44c01c2000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f44c01a8000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f44bffb7000)
    libpng16.so.16 => /usr/lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f44bff7d000)
    libjpeg.so.8 => /usr/lib/x86_64-linux-gnu/libjpeg.so.8 (0x00007f44bfef8000)
    libgif.so.7 => /usr/lib/x86_64-linux-gnu/libgif.so.7 (0x00007f44bfeed000)
    libtiff.so.5 => /usr/lib/x86_64-linux-gnu/libtiff.so.5 (0x00007f44bfe6c000)
    libwebp.so.6 => /usr/lib/x86_64-linux-gnu/libwebp.so.6 (0x00007f44bfc03000)
    libopenjp2.so.7 => /usr/lib/x86_64-linux-gnu/libopenjp2.so.7 (0x00007f44bfbad000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f44bfa5c000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f44bfa40000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f44c0706000)
    libzstd.so.1 => /usr/lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f44bf999000)
    liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f44bf972000)
    libjbig.so.0 => /usr/lib/x86_64-linux-gnu/libjbig.so.0 (0x00007f44bf764000)

I then copied all those shared libraries into tesseract.AppDir/usr/lib/ and rebuilt the AppImage.

Testing on Ubuntu 14 still failed:

[Ubuntu14]$ ./Tesseract-OCR-x86_64.AppImage 
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)

EDIT: I retried making the AppImage, adding the missing .so files one by one. Only when I finally copied in libc.so.6 did I get the segmentation fault. However, if I leave this library out, the executable fails with:

tesseract: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.22' not found (required by /tmp/.mount_Tesser6wDkZB/lib/liblept.so.5)

It seems that liblept.so.5 is the problem.
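
One way to check which glibc version a bundled library expects is to look for the GLIBC_x.y version tags embedded in the file, much as piping `strings` through `grep GLIBC_` would. The following is a minimal sketch of that heuristic, not an ELF parser; the function name `required_glibc` and the demo file are hypothetical, and the demo writes a small fake file rather than reading a real liblept.so.5.

```python
import os
import re
import tempfile

# Version tags look like GLIBC_2.22 or GLIBC_2.2.5; GLIBC_PRIVATE is ignored.
GLIBC_TAG = re.compile(rb"GLIBC_(\d+)\.(\d+)(?:\.(\d+))?")


def required_glibc(path):
    """Return the highest GLIBC_x.y tag found in the file as a tuple, or None."""
    with open(path, "rb") as fh:
        data = fh.read()
    versions = {
        tuple(int(part) for part in m.groups() if part is not None)
        for m in GLIBC_TAG.finditer(data)
    }
    return max(versions) if versions else None


# Demo on a stand-in file (hypothetical content, not a real shared library).
with tempfile.TemporaryDirectory() as tmpdir:
    fake_lib = os.path.join(tmpdir, "fake.so")
    with open(fake_lib, "wb") as fh:
        fh.write(b"\x00GLIBC_2.2.5\x00GLIBC_2.14\x00GLIBC_2.22\x00")
    highest = required_glibc(fake_lib)

print(highest)
```

Running `required_glibc` over every file in tesseract.AppDir/usr/lib/ and comparing the results with the glibc version on the target system would show which libraries ask for something newer than the target provides.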

Now I'm pretty much out of ideas.

  1. Is this not a use-case for AppImages?
  2. Is there a way to debug what's going wrong?
  3. Is there a tool that automatically finds the dependencies?
  4. Is Ubuntu 14.01 just too old a target, and should I give up and go back to using gocr?

Solution

  • Is this not a use-case for AppImages?

    It's a valid use case for sure.

    Is there a way to debug what's going wrong?

    Yes, you can use strace and the LD_DEBUG=libs environment variable to see what's being loaded. For more information about debugging AppImages check:

    Is there a tool that automatically finds the dependencies?

    Yes, please check https://github.com/AppImage/awesome-appimage#build-systems

    Which one you should use depends on whether your app can be built on an older stable system. If it can, use linuxdeploy; otherwise use appimage-builder. I would recommend reading this entry to decide which tool to use.
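
    To illustrate the kind of bookkeeping these tools automate, here is a minimal sketch that parses ldd output in the format shown in the question and separates resolved library paths from unresolved names. The helper `parse_ldd` and the `DEFAULT_SKIP` set are hypothetical; skipping libc.so.6 and the loader by default is an assumption of this sketch, motivated by the segmentation fault reported after bundling libc.so.6, and is not a rule taken from any of the tools above.

```python
import re

# Matches "libfoo.so.1 => /path/libfoo.so.1 (0x...)" and "libfoo.so.1 => not found".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(not found|\S+)")

# Assumption: the C library and the dynamic loader stay on the target system.
DEFAULT_SKIP = frozenset({"libc.so.6", "ld-linux-x86-64.so.2"})


def parse_ldd(output, skip=DEFAULT_SKIP):
    """Return (to_copy, missing): resolved paths and unresolved library names."""
    to_copy, missing = [], []
    for line in output.splitlines():
        m = LDD_LINE.match(line)
        if not m:
            continue  # lines without "=>", e.g. linux-vdso or the loader
        name, target = m.groups()
        if name in skip:
            continue
        if target == "not found":
            missing.append(name)
        else:
            to_copy.append(target)
    return to_copy, missing


# A few lines taken from the ldd output in the question.
sample = """\
    linux-vdso.so.1 (0x00007fffd7937000)
    libtesseract.so.5 => not found
    liblept.so.5 => /usr/lib/x86_64-linux-gnu/liblept.so.5 (0x00007f44c03d3000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f44bffb7000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f44c0706000)
"""
to_copy, missing = parse_ldd(sample)
print(to_copy)
print(missing)
```

    A sketch like this only lists direct results from one ldd run; it does not decide which libraries are safe to bundle for an older target, which is the part the dedicated build tools handle.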

    Is Ubuntu 14.01 just too old a target, and should I give up and go back to using gocr?

    Probably, you can use appimage-builder to build your AppImage on Ubuntu 20.04.