embedded-linux yocto bitbake openembedded

Complex programs built with BitBake have different checksums depending on the build path; simple programs always yield the same checksum


Executive summary: Some binaries built for our Yocto images have different md5sums depending on the build path. The path does not affect the checksums of a minimal test project.

I am using the TI-supplied Yocto platform to build our product images. It is working fine. We have a couple of our proprietary software projects hooked up there as an additional layer to make our actual application image.

Recently we discovered that a couple of helper programs we are building for our application image yield different checksums depending on where they are built.

Example when building under /tmp/Project-xxxxx_Yocto and ~/repos/Project-xxxxx_Yocto:

user@MACHINE:/tmp/Project-xxxxx_Yocto$ md5sum ./workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/canbus-handler/git+AUTOINC+b99a8424b6-r0/packages-split/canbus-handler/usr/bin/canbus-handler
4f1b270a374c14bcd95d093095f4354d ./workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/canbus-handler/git+AUTOINC+b99a8424b6-r0/packages-split/canbus-handler/usr/bin/canbus-handler

user@MACHINE:~/repos/Project-xxxxx_Yocto$ md5sum ./workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/canbus-handler/git+AUTOINC+b99a8424b6-r0/packages-split/canbus-handler/usr/bin/canbus-handler
db35646094dc2f05022e012666973d7f ./workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/canbus-handler/git+AUTOINC+b99a8424b6-r0/packages-split/canbus-handler/usr/bin/canbus-handler

Ok, so next I wanted to create a minimal compilable debug project ( https://github.com/usvi/helloyoctoworld ) to demonstrate. Guess what? Checksums from this project are always the same; the path does not affect them!

user@MACHINE:/tmp/Project-xxxxx_Yocto$ md5sum ./workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/helloyoctoworld/git+AUTOINC+6716589062-r0/packages-split/helloyoctoworld/usr/bin/helloyoctoworld
40ae08fc09eb08ef8f519ee9312659c9  ./workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/helloyoctoworld/git+AUTOINC+6716589062-r0/packages-split/helloyoctoworld/usr/bin/helloyoctoworld

user@MACHINE:~/repos/Project-xxxxx_Yocto$ md5sum ./workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/helloyoctoworld/git+AUTOINC+6716589062-r0/packages-split/helloyoctoworld/usr/bin/helloyoctoworld
40ae08fc09eb08ef8f519ee9312659c9  ./workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/helloyoctoworld/git+AUTOINC+6716589062-r0/packages-split/helloyoctoworld/usr/bin/helloyoctoworld

So to reiterate: helloyoctoworld built in /tmp vs. built in /home/USER/repos

40ae08fc09eb08ef8f519ee9312659c9 vs. 40ae08fc09eb08ef8f519ee9312659c9

canbus-handler built in /tmp vs. built in /home/USER/repos

4f1b270a374c14bcd95d093095f4354d vs. db35646094dc2f05022e012666973d7f

So what is going on? How do I debug this? Ok, well maybe I'll first take the canbus-handler and peel stuff off to see when it starts to have stable checksums in both locations.
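Before peeling code off, it may help to see exactly which packaged files differ between the two build locations. The following is only a sketch: `list_differing` is a hypothetical helper name, and it assumes both trees have already been built and share the same relative layout.

```shell
# list_differing DIR_A DIR_B
# Print the relative path of every regular file that exists in both
# trees but whose contents differ. (Hypothetical helper, POSIX sh.)
list_differing() {
    ( cd "$1" && find . -type f | sort ) | while IFS= read -r rel; do
        [ -f "$2/$rel" ] || continue          # only compare files present in both trees
        cmp -s "$1/$rel" "$2/$rel" || printf '%s\n' "$rel"
    done
}

# Example call, using the paths from the question:
#   split=workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/canbus-handler/git+AUTOINC+b99a8424b6-r0/packages-split
#   list_differing "/tmp/Project-xxxxx_Yocto/$split" "$HOME/repos/Project-xxxxx_Yocto/$split"
```

Files that are byte-identical in both trees are not printed, so an empty result means the package contents match.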

EDIT1: I ran strings + diff on the binaries:

< /tmp/Project-xxxxx_Yocto/workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/canbus-handler/git+AUTOINC+deffada7ed-r0/recipe-sysroot/usr/include/nlohmann/json.hpp
---
> /home/USER/repos/Project-xxxxx_Yocto/workdir/arago-tmp-default-glibc/work/armv7at2hf-neon-oe-linux-gnueabi/canbus-handler/git+AUTOINC+deffada7ed-r0/recipe-sysroot/usr/include/nlohmann/json.hpp
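For reference, a comparison like the one above can be sketched as a small helper. `diff_strings` is a hypothetical name, and the sketch assumes `strings` from binutils and `mktemp` are available on the build host.

```shell
# diff_strings BINARY_A BINARY_B
# Diff the printable strings of two binaries. Exit status is that of
# diff: 0 if the string lists are identical, 1 if they differ.
diff_strings() {
    ds_a=$(mktemp) && ds_b=$(mktemp) || return 2
    strings "$1" > "$ds_a"
    strings "$2" > "$ds_b"
    diff "$ds_a" "$ds_b"
    ds_status=$?
    rm -f "$ds_a" "$ds_b"
    return "$ds_status"
}

# Example call (first argument is the /tmp build, second the home build):
#   diff_strings /tmp/Project-xxxxx_Yocto/path/to/canbus-handler \
#                "$HOME/repos/Project-xxxxx_Yocto/path/to/canbus-handler"
```

The `path/to` parts in the example call are placeholders for the long packages-split paths shown earlier.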

So, nlohmann is imprinting its header location into the actual binary. When I build the binary without Yocto, natively on the host machine with "make", the location is also there:

cannot use operator[] with
/usr/include/nlohmann/json.hpp
m_object != nullptr

EDIT2: Made self-contained test case: https://github.com/usvi/nlohmannjsontest

Results are strange:

janne@shell:/tmp$  git clone git@github.com:usvi/nlohmannjsontest.git
Cloning into 'nlohmannjsontest'...
remote: Enumerating objects: 18, done.
remote: Counting objects: 100% (18/18), done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 18 (delta 2), reused 18 (delta 2), pack-reused 0 (from 0)
Receiving objects: 100% (18/18), 204.74 KiB | 605.00 KiB/s, done.
Resolving deltas: 100% (2/2), done.
janne@shell:/tmp$ cd nlohmannjsontest/
janne@shell:/tmp/nlohmannjsontest$ make
g++  -I inc -Wno-deprecated  -c -o main_3.11.3.o src/main_3.11.3.cpp
g++   main_3.11.3.o   -o main_3.11.3
g++  -I inc -Wno-deprecated  -c -o main_2.1.1.o src/main_2.1.1.cpp
g++   main_2.1.1.o   -o main_2.1.1
strip main_3.11.3
strip main_2.1.1
strings main_3.11.3 | grep "json.hpp"
inc/nlohmann3.11.3/json.hpp
strings main_2.1.1 | grep "json.hpp"
inc/nlohmann2.1.1/json.hpp

So, the path is imprinted.
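The path survives `strip` because `strip` removes symbols and debug sections, not string constants in the program's data. The `m_object != nullptr` text next to the path suggests it comes from an `assert()`, which records the file name via `__FILE__`. That is an assumption here and was not checked against the header. A quick way to test any binary for an embedded path is a literal search. `embeds_path` below is a hypothetical helper name.

```shell
# embeds_path BINARY STRING
# Succeed (exit 0) if STRING occurs literally anywhere in BINARY.
embeds_path() {
    grep -F -q -- "$2" "$1"
}

# Example calls:
#   embeds_path main_3.11.3 'inc/nlohmann3.11.3/json.hpp' && echo "include path embedded"
#   embeds_path path/to/canbus-handler "$PWD" && echo "build directory embedded"
```

In the standalone test the path is relative because the Makefile passes `-I inc`. In the Yocto build the header is found through the absolute recipe-sysroot path, which would explain why the checksum changes with the build location there.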


Solution

  • This is going to be exactly the same stuff I told you over IRC, but at least here the answer will be archived.

    This is a classic reproducible builds (https://reproducible-builds.org) problem. There are most likely build paths in the generated output. It could also be some other source of non-determinism; until you've verified that it is build paths, you can't be sure.

    Lots of things to try:

    When you know what exactly is causing the difference, it's normally quite simple to fix.

    Some common problems:

    Worked example: packages using cython are non-reproducible. Diffoscope showed that there are two causes:

    1. the source package contains generated code which embeds build paths
    2. the binaries have strings and symbols containing the build path

    For (1) the references are in comments that are not needed after the build, so we can just strip them out. For (2) the string is the path to the original source file, and luckily the variable name is generated from the string value. So I implemented path remapping à la -fdebug-prefix-map here and remap S and B.
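To illustrate the idea of prefix remapping (not the actual cython change mentioned above), here is a minimal sketch that rewrites a build-specific prefix in text to a fixed one. `remap_prefix` is a hypothetical helper and the paths in the example are made up. In a recipe, the old prefixes would be the values of S and B, BitBake's source and build directories.

```shell
# remap_prefix OLD NEW
# Copy stdin to stdout, replacing every literal occurrence of OLD with NEW.
# Simplification: this replaces OLD anywhere in a line, not only at the
# start of a path. OLD and NEW must not contain backslashes, because
# awk -v interprets escape sequences.
remap_prefix() {
    awk -v old="$1" -v new="$2" '
        {
            if (old == "") { print; next }    # nothing to remap
            rest = $0
            out = ""
            while ((i = index(rest, old)) > 0) {
                out = out substr(rest, 1, i - 1) new
                rest = substr(rest, i + length(old))
            }
            print out rest
        }'
}

# Example (hypothetical paths):
#   remap_prefix /tmp/Project/work/foo/git /usr/src/debug/foo < generated.c > remapped.c
```

For compiled code, GCC does this kind of remapping itself: `-fdebug-prefix-map=old=new` covers debug information, and GCC 8 and later add `-fmacro-prefix-map` and `-ffile-prefix-map`, which also cover `__FILE__`. Whether the toolchain in a given Yocto release supports and passes these flags has to be checked for that release.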