python, anaconda, conda

Should conda or conda-forge be used for Python environments?


Conda and conda-forge are both Python package managers. What is the appropriate choice when a package exists in both repositories? Django, for example, can be installed with either, but the two versions differ in several dependencies (the conda-forge one has many more). Nothing explains these differences, not even a simple README.

Which one should be used? Conda or conda-forge? Does it matter?


Solution

  • The short answer is that, in my experience, it generally doesn't matter which you use, with one exception: if you work for a company with more than 200 employees, the default conda channel is not free as of 2020.

    The long answer:

    So conda-forge is an additional channel from which packages may be installed. In this sense, it is not any more special than the default channel, or any of the other hundreds (thousands?) of channels that people have posted packages to. You can add your own channel if you sign up at https://anaconda.org and upload your own Conda packages.

    Here we need to make a distinction that the phrasing of your question suggests isn't clear: conda is the cross-platform package manager, while conda-forge is a package channel. Anaconda Inc. (formerly Continuum Analytics), the main developer of the conda software, also maintains a separate channel of packages, called defaults, which is what you get when you type conda install packagename without changing any options.

    There are three ways to change which channels are used. The first two apply to a single install command, and the last one is persistent. The first is to specify a channel every time you install a package:

    conda install -c some-channel packagename
    

    Of course, the package has to exist on that channel. This gives some-channel the highest priority for that command, so packagename and any of its dependencies that some-channel provides will generally be installed from it. Alternatively, you can specify:

    conda install some-channel::packagename
    

    The package still has to exist on some-channel, but now only packagename will be pulled from some-channel. Any other packages needed to satisfy dependencies will be looked up in your default list of channels.
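
    One way to check where packages actually came from is to install with the channel::package form and then list the environment together with channel information. The following is a minimal sketch, not part of the original answer: install_and_inspect is a hypothetical helper name, and it assumes the target environment already exists.

```shell
# Hypothetical helper: install only PKG from CHANNEL into an existing
# environment, then list every installed package with its source channel.
# Arguments: $1 = environment name, $2 = channel, $3 = package
install_and_inspect() {
    conda install --yes --name "$1" "$2::$3" &&
    conda list --name "$1" --show-channel-urls
}

# Example call (the environment name "demo-env" is illustrative):
# install_and_inspect demo-env conda-forge django
```

    In the listing, the requested package should show conda-forge as its channel, while dependencies may show whichever configured channel satisfied them.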

    To see your channel configuration, you can write:

    conda config --show channels
    

    The third, persistent way is to change the channel list with conda config, which also controls the order in which channels are searched. You can write:

    conda config --add channels some-channel
    

    to add the channel some-channel to the top of the channels configuration list. This gives some-channel the highest priority. Priority determines (in part) which channel is selected when more than one channel has a particular package. To add the channel to the end of the list and give it the lowest priority, type

    conda config --append channels some-channel
    

    If you would like to remove the channel that you added, you can do so by writing

    conda config --remove channels some-channel
    

    See

    conda config -h
    

    for more options.
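
    These persistent settings are stored in a .condarc file (by default in your home directory). As a rough sketch of what that file could look like with conda-forge at the top of the list, the snippet below writes a sample to a scratch directory instead of touching your real configuration. The file location is illustrative, and the channel_priority line is an optional setting that the answer above does not cover, so treat it as an assumption about what you might want.

```shell
# Write a sample config to a scratch directory; ~/.condarc is left alone.
SCRATCH_DIR="$(mktemp -d)"
SAMPLE_CONDARC="$SCRATCH_DIR/condarc"

cat > "$SAMPLE_CONDARC" <<'EOF'
channels:
  - conda-forge
  - defaults
channel_priority: strict
EOF

# The first entry under "channels" has the highest priority.
cat "$SAMPLE_CONDARC"
```

    Listing conda-forge first mirrors what conda config --add channels conda-forge does to the channel order.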

    With all of that said, there are five main reasons to use the conda-forge channel instead of the defaults channel maintained by Anaconda:

    1. Packages on conda-forge may be more up-to-date than those on the defaults channel.
    2. There are packages on the conda-forge channel that aren't available from defaults.
    3. You would prefer to use a dependency such as openblas (from conda-forge) instead of mkl (from defaults).
    4. If you are installing a package that requires a compiled library (e.g., a C extension or a wrapper around a C library), installing all of the packages in an environment from a single channel may reduce the chance of incompatibilities, because the packages are then built against the same base C library (but this advice may be out of date or change in the future). For reference, see the conda-forge post on mixing channels.
    5. conda-forge is free to use even in large companies, while the default conda channel is not. See here.
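
    Reason 4 suggests keeping an environment on a single channel. A minimal sketch of one way to do that at creation time follows. make_forge_env is a hypothetical helper name. The --override-channels flag tells conda to ignore the configured channel list, so only the channel given on the command line is searched.

```shell
# Hypothetical helper: create an environment whose packages all come
# from conda-forge, ignoring any channels configured in .condarc.
# Arguments: $1 = environment name, remaining = packages to install
make_forge_env() {
    env_name="$1"
    shift
    conda create --yes --name "$env_name" \
        --channel conda-forge --override-channels "$@"
}

# Example call (the environment name "web" is illustrative):
# make_forge_env web django
```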