How to install Bioconda packages


This is a generic formulation of a common question. I am trying to install package foo from Bioconda. I attempt:

conda install -c bioconda foo

However, this either fails with unsatisfiable dependencies or, if it does solve, leads to shared object errors at runtime.

Why doesn't this work and what is the correct solution?


Solution

  • Conda Forge must be prioritized

    The Bioconda channel builds packages with the Conda Forge channel prioritized. This requires users to likewise prioritize the Conda Forge channel when installing Bioconda packages (see Bioconda documentation). Channels given with -c are prioritized in the order they are listed, so conda-forge must come before bioconda. That is, the correct formulation for specifying channels at install time is:

    conda install -c conda-forge -c bioconda foo
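
Channels passed with -c are searched in addition to any channels already in the user's configuration. As a minimal sketch, assuming you want the solve restricted to exactly these two channels, conda's --override-channels flag can be added. The wrapper name bioconda_install is hypothetical and introduced only for illustration:

```shell
# Hypothetical helper (not part of conda or Bioconda): install packages using
# only conda-forge and bioconda, in that priority order.
# --override-channels makes conda ignore channels from .condarc and the defaults.
bioconda_install() {
  conda install --override-channels -c conda-forge -c bioconda "$@"
}
```

For example, bioconda_install foo would install the placeholder package foo from the question.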
    

    Additional Notes

    Global configuration and Anaconda

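    The Bioconda documentation recommends that users configure their channels globally to have Conda Forge prioritized. Because conda config --add prepends to the channel list, the channel added last ends up with the highest priority.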

    Global Bioconda configuration (not for Anaconda users)

    conda config --add channels defaults
    conda config --add channels bioconda
    conda config --add channels conda-forge
    conda config --set channel_priority strict
    

    However, note that this is fundamentally incompatible with the Anaconda base environment. Subsequent solves in an Anaconda base will be severely slowed, because the solver has to reconcile the large number of preinstalled Anaconda packages with the vast search space of Conda Forge. Users who intend to configure Conda Forge as their priority channel should install a base such as Miniforge, which is preconfigured for this setup.

    Anaconda users who do not want to replace their base can instead configure channels per environment by using the --env flag.

    Example: prioritize conda-forge in one environment

    conda create -n bioenv
    conda activate bioenv
    conda config --env --add channels bioconda
    conda config --env --add channels conda-forge
    conda config --env --set channel_priority strict
    

    All subsequent installations must be run with the environment activated for this environment-specific configuration to apply.
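
Whether the configuration is global or per environment, it can help to confirm the effective channel order before installing. The following is a rough sketch, not an official tool: the function name check_channel_order is hypothetical, and it assumes the channels appear by their short names (conda-forge, bioconda) rather than as full URLs.

```shell
# Hypothetical helper: reads channel names from stdin, one per line, highest
# priority first, and succeeds only if conda-forge appears before bioconda.
check_channel_order() {
  awk '
    $1 == "conda-forge" && !cf { cf = NR }   # first line naming conda-forge
    $1 == "bioconda"    && !bc { bc = NR }   # first line naming bioconda
    END { exit !(cf && bc && (cf < bc)) }    # exit 0 only if both found, in order
  '
}

# Only query conda if it is actually on PATH.
if command -v conda >/dev/null 2>&1; then
  # "conda config --show channels" prints a YAML list; strip the "  - " prefix.
  if conda config --show channels | sed -n 's/^ *- *//p' | check_channel_order; then
    echo "conda-forge is prioritized over bioconda"
  else
    echo "channel order needs fixing" >&2
  fi
fi
```

Run it with the target environment activated, since that is when conda takes the environment-specific settings into account.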

    Prefer YAML

    Ad hoc installations with conda install can lead to environments that are difficult to reproduce, and they also waste time on repeated solving. A better approach is to define environments using YAML files.

    Example: rnaseq.yaml

    name: rnaseq
    channels:
      - conda-forge
      - bioconda
      - nodefaults  ## ignore user settings
    dependencies:
      ## alignment/quantification
      - salmon
    
      ## R (CRAN)
      - r-base=4.3
      - r-tidyverse
      - r-magrittr
      - r-pheatmap
    
      ## R (Bioconductor)
      - bioconductor-deseq2
      - bioconductor-tximeta
      - bioconductor-tximport
    

    which can be used to create an environment with

    conda env create -n rnaseq -f rnaseq.yaml
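
When the YAML file changes later, the environment can be brought back in line with it rather than patched ad hoc. A minimal sketch, assuming the environment is addressed by name: the helper sync_env is hypothetical, and it simply chooses between conda env create and conda env update depending on whether the name already appears in conda env list.

```shell
# Hypothetical helper: create the environment from a YAML file if it does not
# exist yet, otherwise update the existing environment from the same file.
# Usage: sync_env <env-name> <yaml-file>
sync_env() {
  # The first column of "conda env list" holds the environment names.
  if conda env list | awk '{ print $1 }' | grep -qxF -- "$1"; then
    conda env update -n "$1" -f "$2"
  else
    conda env create -n "$1" -f "$2"
  fi
}
```

For the example above, this would be called as sync_env rnaseq rnaseq.yaml.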