linker-errors · netcdf · mpich · cime

Linking PnetCDF, NetCDF-C and NetCDF-Fortran libraries for an earth system model


I am trying to work on an earth system model and I am new to this. Currently, I am only trying to run a test case which is computationally less intensive.

My system is Ubuntu 20.04. I built the required libraries in the following order: mpich, pnetcdf, zlib, hdf5, netcdf-c, netcdf-fortran, lapack and blas. The versions (my GCC and gfortran version is 9.4.0) are mpich-3.3.1, pnetcdf-1.12.3, zlib-1.2.13, hdf5-1.10.5, netcdf-c-4.9.0, netcdf-fortran-4.6.0, and LAPACK/BLAS 3.11. For parallel I/O support I installed in the order pnetcdf, then hdf5, then netcdf-c, and finally netcdf-fortran. All the libraries and packages installed without any error, using the same compiler that I use for the model.

The issue I am facing now has to do with the linking of the libraries (pnetcdf, netcdf-c and netcdf-fortran), particularly their order, as indicated by the forum dedicated to the model. At the end of the model build, when it tries to create a single executable, it fails (collect2: error: ld returned 1 exit status). The following is the command that produces the errors:

mpif90 -o /home/ubuntuvm/projects/cesm/scratch/testrun11/bld/cesm.exe \
cime_comp_mod.o cime_driver.o component_mod.o component_type_mod.o \
cplcomp_exchange_mod.o map_glc2lnd_mod.o map_lnd2glc_mod.o \
map_lnd2rof_irrig_mod.o mrg_mod.o prep_aoflux_mod.o prep_atm_mod.o \
prep_glc_mod.o prep_ice_mod.o prep_lnd_mod.o prep_ocn_mod.o \
prep_rof_mod.o prep_wav_mod.o seq_diag_mct.o seq_domain_mct.o \
seq_flux_mct.o seq_frac_mct.o seq_hist_mod.o seq_io_mod.o \
seq_map_mod.o seq_map_type_mod.o seq_rest_mod.o t_driver_timers_mod.o \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -latm \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lice \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -llnd \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -locn \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lrof \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lglc \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lwav \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lesp \
-L../../gnu/mpich/nodebug/nothreads/mct/noesmf/c1a1l1i1o1r1g1w1e1/lib \
-lcsm_share -L../../gnu/mpich/nodebug/nothreads/lib -lpio -lgptl \
-lmct -lmpeu  -L/home/ubuntuvm/CESM/lib -lnetcdff \
-Wl,-rpath=/home/ubuntuvm/CESM/lib -lnetcdf -lm -lnetcdf -lhdf5_hl \
-lhdf5 -lpnetcdf -ldl -lm -lz -Wl,-rpath=/home/ubuntuvm/CESM/lib \
-lpnetcdf -L/usr/local/lib -llapack -L/usr/local/lib -lblas \
-L/home/ubuntuvm/CESM/lib -lpnetcdf  -L/home/ubuntuvm/CESM/lib

Below is part of the errors; libpio.a is a component library built before the above command runs:

/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_copy_att':
nf_mod.F90:(.text+0x31): undefined reference to `nfmpi_copy_att'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_def_var_md':
nf_mod.F90:(.text+0x3b5): undefined reference to `nfmpi_def_var'
/usr/bin/ld: nf_mod.F90:(.text+0x4fe): undefined reference to `nfmpi_def_var'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_def_dim':
nf_mod.F90:(.text+0xab9): undefined reference to `nfmpi_def_dim'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_redef':
nf_mod.F90:(.text+0xeb9): undefined reference to `nfmpi_redef'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_enddef':
nf_mod.F90:(.text+0xff0): undefined reference to `nfmpi_enddef'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimlen':
nf_mod.F90:(.text+0x115c): undefined reference to `nfmpi_inq_dimlen'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimname':
nf_mod.F90:(.text+0x14c2): undefined reference to `nfmpi_inq_dimname'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimid':
nf_mod.F90:(.text+0x1821): undefined reference to `nfmpi_inq_dimid'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_inq_varnatts_vid':
nf_mod.F90:(.text+0x1c24): undefined reference to `nfmpi_inq_varnatts'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_inq_vardimid_vid':
nf_mod.F90:(.text+0x1fcc): undefined reference to `nfmpi_inq_vardimid'

The libraries are linked as follows

-L/home/ubuntuvm/CESM_Library/lib -lnetcdff -lnetcdf \
-Wl,-rpath=/home/ubuntuvm/CESM_Library/lib -lnetcdf -lm -lnetcdf -lhdf5_hl \
-lhdf5 -lpnetcdf -ldl -lm -lz -Wl,-rpath=/home/ubuntuvm/CESM_Library/lib -lpnetcdf
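To convince myself how the order matters, I also put together a minimal self-contained example (the files provides.c and uses.c below are made up purely for illustration): with static archives, GNU ld scans left to right, so an archive can only resolve symbols that are already undefined at the point where the linker reaches it.

```shell
# Build a tiny static library and link against it in both orders.
cat > provides.c <<'EOF'
void provided(void) {}
EOF
cat > uses.c <<'EOF'
void provided(void);
int main(void) { provided(); return 0; }
EOF
gcc -c provides.c && ar rcs libprov.a provides.o

# Correct order: the object that needs the symbol comes before the archive.
gcc uses.c -L. -lprov -o good && echo "good: link ok"

# Wrong order: the archive is scanned before any symbol is undefined,
# so 'provided' is never pulled in and the link fails.
gcc -L. -lprov uses.c -o bad 2>/dev/null || echo "bad: undefined reference"
```

This is why, in the command above, -lpio has to come before -lpnetcdf, and why some libraries end up repeated on the line.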

What am I doing wrong here? I would be grateful for any suggestions regarding the order of the libraries and would be happy to provide any other details that might be required.


Solution

  • While it is difficult to reproduce this specific error, it looks like the issue is with the PnetCDF library, which does not seem to contain some of the Fortran interfaces (the `nfmpi_*` routines).
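One way to test this hypothesis (a diagnostic sketch; the path assumes the install prefix from the question) is to list the global symbols of the installed libpnetcdf and look for the `nfmpi_*` entries the linker complains about:

```shell
# If this prints nothing, libpnetcdf was built without its Fortran bindings,
# e.g. because configure did not find a usable MPI Fortran compiler.
nm -g --defined-only /home/ubuntuvm/CESM/lib/libpnetcdf.a | grep 'nfmpi_'
```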

    Here are brief instructions for setting up CESM on Ubuntu with OpenMPI, starting from scratch.

    Caveats: This example is for Ubuntu 22.04 (Jammy) with GCC 11.4, but it should work on all recent Ubuntu versions. Switching from OpenMPI to MPICH might require some changes.

    It is based on a very detailed post by Yonash Mersha; if some details below seem to be missing, compare with the instructions in that post.

    Assume the hostname is ubuntu-jammy.

    1. Install the necessary system packages and libraries
    apt-get install -y gcc g++ gfortran build-essential cmake git \
    subversion python-is-python3 perl vim python3-pip libxml2-utils unzip libopenmpi-dev
    
    2. For OpenMPI tests, configure the number of slots on your machine. Assume you have 8 CPU cores; this number is also used repeatedly below as the argument to make's -j option.
    echo "ubuntu-jammy slots=8" >> /etc/openmpi/openmpi-default-hostfile
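As a quick sanity check (assuming mpiexec here is the OpenMPI one), launching a trivial command on all slots should work without warnings:

```shell
# Starts 8 copies of hostname; complaints about oversubscription or missing
# slots point at a wrong hostfile path or a wrong hostname in it.
mpiexec -np 8 hostname
```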
    
    3. Download CESM and set up CIME
    mkdir CESM
    cd CESM
    git clone -b release-cesm2.1.3 https://github.com/ESCOMP/CESM.git my_cesm_sandbox
    mkdir ~/.cime
    export CIME_MODEL=cesm
    mkdir -p cesm/inputdata
    
    4. Some UCAR SVN servers use either self-signed certificates or certificates from an unknown CA, so edit my_cesm_sandbox/manage_externals/manic/repository_svn.py around line 267 to add an option that ignores the unknown CA. This might introduce a vulnerability, so think twice. Make the SVN command look like this:
    cmd = ['svn', 'checkout', '--non-interactive', '--trust-server-cert-failures=unknown-ca', '--quiet', url, repo_dir_path]
    
    5. Check out the model code
    ./manage_externals/checkout_externals
    

    Use ./manage_externals/checkout_externals -S to check that all the necessary codes have been delivered.

    6. Now build the I/O libraries and LAPACK. We shall assume that the libraries for CESM will be installed under $HOME/CESM/lib.
    CESM_LIB_DIR=$HOME/CESM/lib
    ZLIB=$CESM_LIB_DIR/zlib
    HDF5=$CESM_LIB_DIR/hdf5
    NETCDF=$CESM_LIB_DIR/netcdf
    PNETCDF=$CESM_LIB_DIR/pnetcdf
    LAPACK=$CESM_LIB_DIR/lapack
    export ZLIB HDF5 NETCDF PNETCDF LAPACK
    
    7. Download and unpack zlib, hdf5, netcdf-c, netcdf-fortran, pnetcdf and lapack

    8. Build zlib with the default settings. In the directory containing the zlib sources:

    ./configure --prefix=$ZLIB
    make -j 8
    make check
    make install
    
    9. Important: build HDF5 with parallel support. In the directory containing the HDF5 sources:
    CPPFLAGS="-I$ZLIB/include" LDFLAGS="-L$ZLIB/lib" \
    CC=mpicc CXX=mpicxx ./configure --prefix=$HDF5 --with-zlib=$ZLIB --enable-hl --enable-fortran --enable-parallel
    

    After configure check that parallel support is enabled:

            SUMMARY OF THE HDF5 CONFIGURATION
            =================================
    ...
    Features:
    ---------
                       Parallel HDF5: yes
    Parallel Filtered Dataset Writes: yes
                  Large Parallel I/O: yes
                  High-level library: yes
    ...
    

    Now build, check, and install.

    make -j8
    make -j8 check
    make install
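To double-check that the installed copy is really the parallel build, the h5pcc compiler wrapper (installed only by parallel HDF5 builds) can replay the configuration summary:

```shell
# Should report "Parallel HDF5: yes" among the printed settings.
$HDF5/bin/h5pcc -showconfig | grep -i parallel
```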
    
    10. Build the NetCDF C bindings with parallel NetCDF-4 support. In the directory containing the netcdf-c sources:
    CPPFLAGS="-I$HDF5/include -I$ZLIB/include" LDFLAGS="-L$HDF5/lib -L$ZLIB/lib" \
    CC=mpicc CXX=mpicxx ./configure --prefix=$NETCDF --disable-dap --enable-parallel4
    

    Make sure the correct options are enabled

    # NetCDF C Configuration Summary
    ==============================
    ...
    HDF5 Support:       yes
    NetCDF-4 API:       yes
    NC-4 Parallel Support:  yes
    ...
    

    Now build, check, and install

    make -j8
    make -j8 check
    make install
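The nc-config utility installed with netcdf-c can confirm the build options (flag names as of netcdf-c 4.x):

```shell
# Both should answer "yes" for a build usable by CESM's PIO with NetCDF-4.
$NETCDF/bin/nc-config --has-nc4
$NETCDF/bin/nc-config --has-parallel
```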
    
    11. Build the NetCDF Fortran bindings. In the directory containing the netcdf-fortran sources:
    CPPFLAGS="-I$NETCDF/include -I$HDF5/include -I$ZLIB/include" \
    FFLAGS="-fallow-argument-mismatch -fallow-invalid-boz" \
    LDFLAGS="-L$NETCDF/lib -L$HDF5/lib -L$ZLIB/lib" LD_LIBRARY_PATH="$NETCDF/lib:$LD_LIBRARY_PATH" \
    CC=mpicc CXX=mpicxx FC=mpifort ./configure --prefix=$NETCDF
    

    Check that Parallel options are configured as follows:

    # NetCDF Fortran Configuration Summary
    ==============================
    ...
    Parallel IO:                    yes
    NetCDF4 Parallel IO:            yes
    PnetCDF Parallel IO:            no
    ...
    

    Now build, check, and install

    make -j8 
    make -j8 check
    make install
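Likewise, nf-config from netcdf-fortran reports what the Fortran bindings were built against and confirms that the freshly installed copy, not a system one, is being picked up:

```shell
$NETCDF/bin/nf-config --version
$NETCDF/bin/nf-config --has-nc4   # should be "yes"
```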
    
    12. Build PnetCDF. Switch to the directory containing the pnetcdf sources:
    CPPFLAGS="-I$NETCDF/include -I$HDF5/include -I$ZLIB/include" \
    FFLAGS="-fallow-argument-mismatch -fallow-invalid-boz" \
    LDFLAGS="-L$NETCDF/lib -L$HDF5/lib -L$ZLIB/lib" \
    LD_LIBRARY_PATH="$NETCDF/lib:$LD_LIBRARY_PATH" \
    CC=mpicc CXX=mpicxx FC=mpifort ./configure --prefix=$PNETCDF --enable-shared --enable-fortran --enable-profiling --enable-large-file-test --with-netcdf4
    

    Run

    make -j8
    make -j8 tests
    make check
    make ptest
    make ptests
    make install
    

    ptest runs on 4 processes, so you should have at least that many cores. Do not worry if some of the ptests fail because your system does not have enough processors; some tests require 10 or more MPI processes.
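Given the original `undefined reference to nfmpi_*` errors, this is the point to verify that the Fortran interfaces actually made it into the installed library:

```shell
# A build configured with --enable-fortran defines the nfmpi_* entry points;
# a count of zero means PnetCDF must be reconfigured and rebuilt.
nm -g --defined-only $PNETCDF/lib/libpnetcdf.a | grep -c 'nfmpi_'
```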

    13. Build and install LAPACK

    Switch to the LAPACK source directory

    make -j8 blaslib
    make -j8 lapacklib
    mkdir -p $LAPACK/lib
    mv librefblas.a $LAPACK/lib/libblas.a
    mv liblapack.a $LAPACK/lib/liblapack.a
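A short check that the renamed archives export the expected entry points (dgesv and dgemm are standard LAPACK/BLAS routines; gfortran appends a trailing underscore to their symbol names):

```shell
nm -g $LAPACK/lib/liblapack.a | grep -i ' T dgesv_'
nm -g $LAPACK/lib/libblas.a   | grep -i ' T dgemm_'
```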
    
    14. Configure the machine. Copy the following text to ~/.cime/config_machines.xml:
    <?xml version="1.0"?>
    <config_machines version="2.0">
     <machine MACH="ubuntu-jammy">
        <DESC>
          Example port to Ubuntu Jammy linux system with gcc, netcdf, pnetcdf and openmpi
        </DESC>
        <NODENAME_REGEX>ubuntu-jammy</NODENAME_REGEX>
        <OS>LINUX</OS>
        <COMPILERS>gnu</COMPILERS>
        <MPILIBS>openmpi</MPILIBS>
        <PROJECT>none</PROJECT>
        <SAVE_TIMING_DIR> </SAVE_TIMING_DIR>
        <CIME_OUTPUT_ROOT>$ENV{HOME}/cesm/scratch</CIME_OUTPUT_ROOT>
        <DIN_LOC_ROOT>$ENV{HOME}/cesm/inputdata</DIN_LOC_ROOT>
        <DIN_LOC_ROOT_CLMFORC>$ENV{HOME}/cesm/inputdata/lmwg</DIN_LOC_ROOT_CLMFORC>
        <DOUT_S_ROOT>$ENV{HOME}/cesm/archive/$CASE</DOUT_S_ROOT>
        <BASELINE_ROOT>$ENV{HOME}/cesm/cesm_baselines</BASELINE_ROOT>
        <CCSM_CPRNC>$ENV{HOME}/cesm/tools/cime/tools/cprnc/cprnc</CCSM_CPRNC>
        <GMAKE>make</GMAKE>
        <GMAKE_J>8</GMAKE_J>
        <BATCH_SYSTEM>none</BATCH_SYSTEM>
        <SUPPORTED_BY>me@my.address</SUPPORTED_BY>
        <MAX_TASKS_PER_NODE>8</MAX_TASKS_PER_NODE>
        <MAX_MPITASKS_PER_NODE>8</MAX_MPITASKS_PER_NODE>
        <PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
        <mpirun mpilib="default">
          <executable>mpiexec</executable>
          <arguments>
            <arg name="ntasks"> -np {{ total_tasks }} </arg>
          </arguments>
        </mpirun>
        <module_system type="none" allow_error="true">
        </module_system>
        <environment_variables>
      <env name="NETCDF">$ENV{HOME}/CESM/lib/netcdf</env>
          <env name="PNETCDF">$ENV{HOME}/CESM/lib/pnetcdf</env>
          <env name="OMP_STACKSIZE">256M</env>
        </environment_variables>
        <resource_limits>
          <resource name="RLIMIT_STACK">-1</resource>
        </resource_limits>
      </machine>
    </config_machines>
    
    

    Validate the XML file

    xmllint --noout --schema $HOME/CESM/my_cesm_sandbox/cime/config/xml_schemas/config_machines.xsd $HOME/.cime/config_machines.xml
    
    15. Configure the compilers. Put the following text into ~/.cime/config_compilers.xml:
    <?xml version="1.0" encoding="UTF-8"?>
    <config_compilers version="2.0">
    
      <compiler>
            <LDFLAGS>
                    <append compile_threaded="true"> -fopenmp </append>
            </LDFLAGS>
            <FFLAGS>
                    <append>   -fallow-argument-mismatch -fallow-invalid-boz</append>
            </FFLAGS>
            <SFC>gfortran</SFC>
            <SCC>gcc</SCC>
            <SCXX>g++</SCXX>
            <MPIFC>mpifort</MPIFC>
            <MPICC>mpicc</MPICC>
            <MPICXX>mpicxx</MPICXX>
            <CXX_LINKER>FORTRAN</CXX_LINKER>
        <NETCDF_PATH>$ENV{HOME}/CESM/lib/netcdf</NETCDF_PATH>
        <PNETCDF_PATH>$ENV{HOME}/CESM/lib/pnetcdf</PNETCDF_PATH>
        <SLIBS>
                <append>-L $ENV{HOME}/CESM/lib/netcdf/lib -lnetcdff -lnetcdf -lm</append>
                    <append>-L $ENV{HOME}/CESM/lib/lapack/lib -llapack -lblas</append>
            </SLIBS>
    </compiler>
    
    </config_compilers>
    

    Validate the XML file

    xmllint --noout --schema $HOME/CESM/my_cesm_sandbox/cime/config/xml_schemas/config_compilers_v2.xsd $HOME/.cime/config_compilers.xml
    
    16. Create a new case, set it up, and build it.
    cd $HOME/CESM/my_cesm_sandbox
    cime/scripts/create_newcase --case mycase --compset X --res f19_g16
    cd mycase
    ./case.setup
    ./case.build
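If the build succeeds, the case is run through CIME's submit script; on a machine without a batch system, as configured above, it launches mpiexec directly:

```shell
./case.submit
# Run-time logs appear under the case run directory, e.g.
# $HOME/cesm/scratch/mycase/run/cesm.log.*
```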