I'm trying to use appimagetool (https://appimage.org/) to create a single-binary executable of the OCR program tesseract (https://github.com/tesseract-ocr). I have built tesseract on Ubuntu 19.10, and I want the executable to run on Ubuntu 14.01.
NOTE: I do not have control over the old version of Ubuntu, and I need features from a recent version of tesseract. I have already tried an existing AppImage of tesseract, and it fails in a way similar to what is detailed below.
Loosely following this tutorial: https://appiomatic.com/blog/creating-appimage-binary-manually-for-linux-from-your-app/ I created a tesseract.AppDir with the requisite layout:
tesseract.AppDir/AppRun
tesseract.AppDir/.DirIcon
tesseract.AppDir/tesseract.desktop
tesseract.AppDir/tesseract.png
tesseract.AppDir/usr
tesseract.AppDir/usr/bin
tesseract.AppDir/usr/bin/tesseract
tesseract.AppDir/usr/lib
tesseract.AppDir/usr/lib/libtesseract.so.5
tesseract.AppDir/usr/lib/libtesseract.so.5.0.0
...
tesseract.AppDir/usr/share
tesseract.AppDir/usr/share/tessdata
tesseract.AppDir/usr/share/tessdata/eng.traineddata
...
tesseract.AppDir/usr/share/tessdata/tessconfigs
...
And created the AppImage:
[Ubuntu 19.10]$ ~/Downloads/appimagetool-x86_64.AppImage tesseract.AppDir
appimagetool, continuous build (commit effcebc), build 2084 built on 2019-05-01 21:02:41 UTC
Using architecture x86_64
/home/kingsley/Software/Tesseract/tesseract/tesseract.AppDir should be packaged as Tesseract-OCR-x86_64.AppImage
Generating squashfs...
Parallel mksquashfs: Using 6 processors
Creating 4.0 filesystem on Tesseract-OCR-x86_64.AppImage, block size 131072.
[=======================================================================================================================|] 1921/1921 100%
Exportable Squashfs 4.0 filesystem, gzip compressed, data block size 131072
compressed data, compressed metadata, compressed fragments, compressed xattrs
duplicates are removed
Filesystem size 73511.40 Kbytes (71.79 Mbytes)
30.95% of uncompressed filesystem size (237490.75 Kbytes)
Inode table size 5971 bytes (5.83 Kbytes)
57.29% of uncompressed inode table size (10423 bytes)
Directory table size 1019 bytes (1.00 Kbytes)
56.90% of uncompressed directory table size (1791 bytes)
Number of duplicate files found 0
Number of inodes 92
Number of files 78
Number of fragments 5
Number of symbolic links 3
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 11
Number of ids (unique uids + gids) 1
Number of uids 1
root (0)
Number of gids 1
root (0)
Embedding ELF...
Marking the AppImage as executable...
Embedding MD5 digest
Success
However, after I copied it to the older system, it would not run, reporting that libpng16.so.16 was missing.
[Ubuntu14]$ ./Tesseract-OCR-x86_64.AppImage
tesseract: error while loading shared libraries: libpng16.so.16: cannot open shared object file: No such file or directory
Further research led me to believe that I had to manually copy in all the dependencies. So I ran ldd on the tesseract executable:
[Ubuntu 19.10]$ ldd LOCAL_INSTALL/bin/tesseract
linux-vdso.so.1 (0x00007fffd7937000)
libtesseract.so.5 => not found
liblept.so.5 => /usr/lib/x86_64-linux-gnu/liblept.so.5 (0x00007f44c03d3000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f44c03b0000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f44c01c2000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f44c01a8000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f44bffb7000)
libpng16.so.16 => /usr/lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f44bff7d000)
libjpeg.so.8 => /usr/lib/x86_64-linux-gnu/libjpeg.so.8 (0x00007f44bfef8000)
libgif.so.7 => /usr/lib/x86_64-linux-gnu/libgif.so.7 (0x00007f44bfeed000)
libtiff.so.5 => /usr/lib/x86_64-linux-gnu/libtiff.so.5 (0x00007f44bfe6c000)
libwebp.so.6 => /usr/lib/x86_64-linux-gnu/libwebp.so.6 (0x00007f44bfc03000)
libopenjp2.so.7 => /usr/lib/x86_64-linux-gnu/libopenjp2.so.7 (0x00007f44bfbad000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f44bfa5c000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f44bfa40000)
/lib64/ld-linux-x86-64.so.2 (0x00007f44c0706000)
libzstd.so.1 => /usr/lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f44bf999000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f44bf972000)
libjbig.so.0 => /usr/lib/x86_64-linux-gnu/libjbig.so.0 (0x00007f44bf764000)
I then copied all those shared libraries into tesseract.AppDir/usr/lib/ and rebuilt the AppImage.
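The copy step described above can be sketched as a small filter over the ldd output. This is only an illustration, not a known-good recipe: the function name list_bundle_libs is hypothetical, and the exclusion list assumes that glibc's own libraries (libc, libm, libpthread, the ld-linux loader and similar) should be left to the host rather than bundled. It does not by itself make a binary built against a newer glibc run on an older one.

```shell
# Sketch: pick bundle-able libraries out of `ldd` output.
# Assumption: glibc-owned libraries stay on the host, so they are filtered out.

# Reads `ldd` output on stdin, prints one resolved library path per line.
# Lines such as "linux-vdso.so.1 (...)" or "=> not found" are ignored
# because they carry no absolute path after "=>".
list_bundle_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }' |
        grep -Ev '/(libc|libm|libpthread|libdl|librt|libresolv|libutil|libnsl)\.so\.[0-9]+$|/ld-linux[^/]*$'
}

# Hypothetical usage, with the paths from the post:
#   ldd LOCAL_INSTALL/bin/tesseract | list_bundle_libs |
#       xargs -I{} cp -L {} tesseract.AppDir/usr/lib/
```

Note that libtesseract.so.5 shows up as "not found" in the ldd listing, so it would still have to be copied by hand from the local install.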
Testing on Ubuntu 14 still failed:
[Ubuntu14]$ ./Tesseract-OCR-x86_64.AppImage
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
EDIT: I retried making the AppImage, adding the missing .so files one by one. The segmentation fault only appeared once I copied in libc.so.6. However, if I leave this library out, running the executable fails with:
tesseract: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.22' not found (required by /tmp/.mount_Tesser6wDkZB/lib/liblept.so.5)
It seems that liblept.so.5 is the problem.
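One way to check which glibc version a bundled library expects is to look at the versioned symbols it imports. The sketch below is a hedged illustration: max_glibc_version is a hypothetical helper name, and it assumes the symbol listing comes from objdump -T (binutils), which tags imported symbols with strings such as GLIBC_2.22.

```shell
# Sketch: report the highest GLIBC_x.y symbol version mentioned in
# `objdump -T` output read from stdin. Helper name is illustrative only.
max_glibc_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' |          # keep only the version tags
        sed 's/^GLIBC_//' |                 # strip the prefix, e.g. "2.22"
        sort -t. -k1,1n -k2,2n -k3,3n |     # numeric sort per dotted field
        tail -n 1                           # highest version is last
}

# Hypothetical usage, with the path from the post:
#   objdump -T tesseract.AppDir/usr/lib/liblept.so.5 | max_glibc_version
# Compare the result with the glibc on the target, e.g. `ldd --version`.
```

If the reported version is newer than the target's glibc, the library was linked against a newer glibc than the old system provides, which matches the error message above.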
Now I'm pretty much out of ideas.
Is this not a use case for AppImages?
It's a valid use case for sure.
Is there a way to debug what's going wrong?
Yes, you can use strace and the LD_DEBUG=libs environment variable to see what's being loaded. For more information about debugging AppImages check:
Is there a tool that automatically finds the dependencies?
Yes, please check https://github.com/AppImage/awesome-appimage#build-systems
Which one you should use depends on whether your app can be built on the oldest stable system you need to support. If the answer is yes, you can use linuxdeploy; otherwise you can use appimage-builder. I would recommend reading this entry to decide which tool to use.
Is Ubuntu 14.01 just too old a target, and should I give up and go back to using gocr?
Probably you can use appimage-builder to build your AppImage on Ubuntu 20.04.