python dependencies conda pdflatex bcftools

Dependencies of plot-vcfstats in a conda environment


I have a conda environment with several packages installed, including bcftools. I am using bcftools stats to generate statistics on my VCF files, and I then want to plot them with plot-vcfstats, which also ships with bcftools. However, this command depends on packages that were not installed along with bcftools in my conda env. This is the output I got when running plot-vcfstats:

Parsing bcftools stats output: test.txt
Plotting graphs: python3 plot.py
Traceback (most recent call last):
  File "plot.py", line 54, in <module>
    import matplotlib as mpl
ModuleNotFoundError: No module named 'matplotlib'
The command exited with non-zero status 256:
        python3 plot.py

matplotlib can easily be installed using conda, so I did this, but then got the following error:

Parsing bcftools stats output: test.txt
Plotting graphs: python3 plot.py
Neither pdflatex or tectonic were found in your PATH, impossible to create a PDF at /home/nick/miniconda3/envs/variantcallingpipeline/bin/plot-vcfstats line 112.
        main::error("Neither pdflatex or tectonic were found in your PATH, impossi"...) called at /home/nick/miniconda3/envs/variantcallingpipeline/bin/plot-vcfstats line 1934
        main::create_pdf(HASH(0x7fffc2f047b0)) called at /home/nick/miniconda3/envs/variantcallingpipeline/bin/plot-vcfstats line 73

However, I couldn't find an easy way to install pdflatex or tectonic, and there might be even more dependencies required. So I am wondering whether there is an easy way to install all required dependencies of plot-vcfstats (or of any tool), and whether this is possible entirely with conda.

Edit: I just tried to install pdflatex via:

conda install -c conda-forge texlive-core

And that changed the error to:

Parsing bcftools stats output: test.txt
Plotting graphs: python3 plot.py
Note: The xcolor.sty package not available, black and white tables only...

Creating PDF: pdflatex summary.tex >plot-vcfstats.log 2>&1
The command exited with non-zero status, please consult the output of pdflatex: .plot-vcfstats.log

 at /home/nick/miniconda3/envs/test/bin/plot-vcfstats line 112.
        main::error("The command exited with non-zero status, please consult the o"...) called at /home/nick/miniconda3/envs/test/bin/plot-vcfstats line 2206
        main::create_pdf(HASH(0x7fffd8101f90)) called at /home/nick/miniconda3/envs/test/bin/plot-vcfstats line 73

The log file contains the following:

This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018) (preloaded format=pdflatex)
 restricted \write18 enabled.

kpathsea: Running mktexfmt pdflatex.fmt
Can't locate mktexlsr.pl in @INC (@INC contains: /home/nick/miniconda3/envs/test/share/tlpkg /home/nick/miniconda3/envs/test/share/texmf-dist/scripts/texlive /home/nick/.t_coffee/perl/lib/perl5 /home/nick/.t_coffee/perl/lib/perl5 /home/nick/miniconda3/envs/test/lib/perl5/5.32/site_perl /home/nick/miniconda3/envs/test/lib/perl5/site_perl /home/nick/miniconda3/envs/test/lib/perl5/5.32/vendor_perl /home/nick/miniconda3/envs/test/lib/perl5/vendor_perl /home/nick/miniconda3/envs/test/lib/perl5/5.32/core_perl /home/nick/miniconda3/envs/test/lib/perl5/core_perl .) at /home/nick/miniconda3/envs/test/bin/mktexfmt line 23.
BEGIN failed--compilation aborted at /home/nick/miniconda3/envs/test/bin/mktexfmt line 25.
I can't find the format file `pdflatex.fmt'!


Solution

  • This seems like a mess. Some of the comments in this open issue imply that Conda's texlive-core package is broken, but it is not clear that there is an authoritative response there.

    On the osx-64 platform, I can get partial functionality with the following environment:

    mamba create -n vcfstats bcftools vcftools python matplotlib numpy tectonic
    

    and then running, with a test.vcf,

    conda activate vcfstats
    
    bcftools stats test.vcf > test.vchk
    
    plot-vcfstats -p outdir test.vchk
    ## this fails with complaints about pdflatex, but continuing...
    
    cd outdir
    tectonic summary.tex
    ## then renders the pdf without issue
    

    This failure could be specific to my setup, since there is already a pdflatex on my PATH, at /usr/local/bin/pdflatex, and the plot-vcfstats code prioritizes pdflatex if it is present. That is, it uses this code:

    my $engine = '';
    system('command -v pdflatex >/dev/null');
    if ($? == 0) { $engine = 'pdflatex'; }
    else
    {
       system('command -v tectonic >/dev/null');
       if ($? == 0) { $engine = 'tectonic'; }
    }
    

    For users who don't already have pdflatex on PATH, the Conda tectonic package alone should work out of the box. For example, this seems to be what Galaxy uses to run bcftools stats in its container.
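
To check which engine plot-vcfstats would pick in a given shell, the selection logic from the Perl snippet quoted above can be mirrored in a few lines of shell. This is only a minimal sketch and not part of bcftools: pick_tex_engine is a hypothetical helper name, and it assumes the check in your installed plot-vcfstats matches that Perl snippet.

```shell
# Hypothetical helper (not part of bcftools) that mirrors the engine check
# quoted from plot-vcfstats: pdflatex wins if it is anywhere on PATH,
# otherwise tectonic is used, otherwise no engine is available.
pick_tex_engine() {
    if command -v pdflatex >/dev/null 2>&1; then
        echo pdflatex
    elif command -v tectonic >/dev/null 2>&1; then
        echo tectonic
    else
        echo none
    fi
}

# Report the engine for the current shell (run this with the env activated).
echo "plot-vcfstats would pick: $(pick_tex_engine)"
```

If this reports pdflatex and `command -v pdflatex` points outside the conda environment (such as /usr/local/bin/pdflatex in the case above), plot-vcfstats will try that installation first. Running tectonic summary.tex by hand in the output directory then remains the fallback.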