c++ makefile bioinformatics samtools

How to build a simple main.cpp file using samtools C API


I am trying to compile (on Linux, using g++) a simple main.cpp program that uses the samtools C API (https://github.com/samtools/samtools), which I have downloaded into the same folder as main.cpp. I would like a very simple makefile that compiles main.cpp (and, eventually, the samtools code). However, as I have very little knowledge of makefiles, I am probably doing something wrong.

Here is my makefile:

SAMTOOLS=./samtools/
HTSLIB=${SAMTOOLS}htslib-1.9/

all: samtools htslib BAMCoverage

samtools: 
    ${MAKE} -C ${SAMTOOLS}

htslib: 
    ${MAKE} -C ${HTSLIB}

BAMCoverage: main.cpp
    g++ -I./ -I${SAMTOOLS} -I${HTSLIB} -g -O2 -Wall ./main.cpp -o ./BAMCoverage -lz -L${SAMTOOLS} -L${HTSLIB} -lbam  -lhts

And here is my cpp main:

#include "samtools/sam.h"

#include <string>
#include <iostream>

using namespace std;

int main (int argc, char *argv[]) { 
    string bam_file_path ("myfile.bam");
    bamFile bam_file = bam_open (bam_file_path.c_str (), "rb");
    if (bam_file == 0) {
        cerr << "Failed to open BAM file " << bam_file_path << endl;
        return 1;
    }
    bam_close (bam_file);

    return 0;
}

It compiles with no warnings when I run "make", but at run time it reports: "error while loading shared libraries: libhts.so.2: cannot open shared object file"

Any help is more than welcome! Thanks in advance.


Solution

  • This is not a problem with your makefile per se. Your makefile has some issues, but the problem you are running into is about how to link with shared libraries properly. In other words, if you ran the same commands from the shell instead of from a makefile, you would have the same problem.

    You should read the documentation for the linker's -L command-line option, and about the difference between link-time and run-time library locations.

    The -lfoo option tells the linker to link in a library named foo (that is, a file named libfoo.so or libfoo.a). The -Lsome/dir option tells the linker to add the directory some/dir to the list of places where it looks for such libraries.

    If the linker finds a static library libfoo.a then whatever parts of that library are needed to link your program will be included directly into your program. This makes your program larger but it means that at run-time nothing besides your program needs to be found.

    If the linker finds a shared library (also called a dynamic library) libfoo.so then the linker just puts a reference to the library name libfoo.so into your program (of course the details are more complex than this but that's the general idea). This makes your program smaller but it means that at runtime not only is your program needed but also the shared library is needed, or else your program can't run.

    This is called run-time linking, and the program that resolves all these shared references when you start your program is called the run-time linker. For very good reasons, the reference that the compile-time linker puts into your program lists only the name of the library, not the full path to it. That means that the run-time linker needs to know where to look to find the shared library.
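
To make the "name only, resolved at start-up" idea concrete, here is a minimal sketch that asks the run-time linker which shared objects it has mapped into the current process. It assumes Linux with glibc or musl, where dl_iterate_phdr is declared in <link.h>; the helper names collect_object and loaded_shared_objects are made up for this illustration.

```cpp
#include <link.h>    // dl_iterate_phdr, struct dl_phdr_info (Linux, glibc/musl)
#include <cstddef>
#include <string>
#include <vector>

// Callback invoked once per loaded object. Appends the object's name to the
// vector passed through `data`. The main executable typically reports an
// empty name, so empty names are skipped.
static int collect_object(struct dl_phdr_info* info, std::size_t /*size*/, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    if (info->dlpi_name != nullptr && info->dlpi_name[0] != '\0') {
        names->push_back(info->dlpi_name);
    }
    return 0;  // returning 0 tells dl_iterate_phdr to keep iterating
}

// Hypothetical helper: names of the shared objects currently loaded into
// this process, as reported by the run-time linker.
std::vector<std::string> loaded_shared_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_object, &names);
    return names;
}
```

Printing the result from main() in a dynamically linked program should list the locations the run-time linker actually chose for each library. A program that fails with "cannot open shared object file" never gets this far, because the lookup happens before main() runs.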

    The run-time linker looks in various places, which can be learned about by reading its documentation; on GNU/Linux for example the run-time linker is called ld.so so you can read the docs with man ld.so.
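
The lookup itself can be pictured as "try each directory in order and take the first hit". The following is only a simplified model of that idea, not what ld.so really does: the real run-time linker also consults the search path embedded in the program, LD_LIBRARY_PATH, a cache file and built-in default directories, in an order described in man ld.so. The names file_exists and resolve_library are hypothetical.

```cpp
#include <functional>
#include <string>
#include <vector>
#include <unistd.h>  // access(), F_OK (POSIX)

// True if something exists at `path` on the real filesystem.
bool file_exists(const std::string& path) {
    return access(path.c_str(), F_OK) == 0;
}

// Simplified model (an assumption for illustration, not ld.so's algorithm):
// return the first "<dir>/<soname>" for which `exists` is true, or an empty
// string if the library is found in none of the directories.
std::string resolve_library(
        const std::string& soname,
        const std::vector<std::string>& search_dirs,
        const std::function<bool(const std::string&)>& exists = file_exists) {
    for (const std::string& dir : search_dirs) {
        std::string candidate = dir;
        // Add a separator unless the directory already ends with one.
        if (!candidate.empty() && candidate.back() != '/') {
            candidate += '/';
        }
        candidate += soname;
        if (exists(candidate)) {
            return candidate;
        }
    }
    return std::string();  // not found: the program could not start
}
```

In this model, the error in the question corresponds to resolve_library("libhts.so.2", dirs) returning an empty string because the htslib build directory is not among dirs. The `exists` parameter is there only so the search can be tried out with a fake filesystem.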

    This is a complex subject, and the best way to make the library findable at run time depends a LOT on what your needs and requirements are.

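    If you just want to hard-code the directories to search when you link, you can embed a run-time search path (an "rpath") in your program by adding a -Wl,-rpath,some/dir option to your link line, one for each -L option. Use absolute paths here (GNU make's abspath function is used below), because a relative rpath is resolved against the directory the program is started from, not the directory it was built in:

    BAMCoverage: main.cpp
            g++ -I./ -I${SAMTOOLS} -I${HTSLIB} -g -O2 -Wall ./main.cpp -o ./BAMCoverage -lz -L${SAMTOOLS} -L${HTSLIB} -Wl,-rpath,$(abspath ${SAMTOOLS}) -Wl,-rpath,$(abspath ${HTSLIB}) -lbam  -lhts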
    

    This will work fine, as long as the SAMTOOLS and HTSLIB directories exist and still contain the correct shared libraries in them. Obviously that's a big limitation, but we can't guess what your ultimate requirements are.