An introduction to Conda
Conda Intro
- What is Conda and why do we love it?
- Binder available
- Getting and installing Conda
- Creating and navigating environments
- Finding and installing packages
- An example of something we might want to install in our base environment
- Creating an environment and installing packages in one command
- Other Conda Resources
Unix
This is a page about Conda in the style of this site, but there is also excellent documentation available from the Conda developers and community here. Thank you, Conda team đ
What is Conda and why do we love it?
Conda is a package and environment manager that is by far the easiest way to handle installing most of the tools we want to use in bioinformatics. Being âconda-installableâ requires that someone (could be the developer, could be others) has gone through the trouble of making it that way, so not everything is available, but almost everything weâre likely to want to use is. Going hand-in-hand with making things easier to install is condaâs other value, that it handles different environments very nicely for us. Sometimes Program A
will depend on a specific version of Program B
. But then, Program C
will depend on a different version of Program B
, and this causes problems. Conda lets us easily create and manage separate environments to avoid these types of version conflicts, and automatically checks for us when we try to install something new (so we find out now, before we break something somewhere under the hood and have no idea what happened). The benefits go further, like really helping with reproducibility too, but letâs get into it!
NOTE: This page assumes already having some familiarity with working at the command line. If thatâs not the case yet, then consider running through the Unix crash course first đ
Binder available
Weâll most likely want to be doing this on our own system eventually, but if we just want a temporary system to run through this tutorial, we can open a Binder by clicking this badge â . After the screen loads, we can click the âTerminalâ icon under âOtherâ to launch our command-line environment:

Getting and installing Conda
Conda comes in two broad forms: Anaconda and Miniconda. Anaconda is large with lots of stuff, Miniconda is more lightweight and we install just what we want. For this reason, weâre gonna move forward with Miniconda. The download page for Miniconda is here, and we should pick the one appropriate for our operating system. Those labeled Miniconda2 are using python 2 as the base, while those labeled Miniconda3 are using python 3. Both can still build environments in the otherâs python version, but more than likely we should just take python 3.
Most likely, we will want Miniconda3, and for a 64-bit system. Choosing for our appropriate system here (working in the binder linked above), we are going to want a Linux version, so this is one way to download it to our system (from copying the link under âMiniconda3 Linux 64-bitâ:
curl -LO https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
NOTE:
Donât forget to get the link that works for your operating system, and modify the above as needed. E.g., if doing this on a mac, we would want the link under âMiniconda3 MacOSX 64-bit bashâ, and the command we would run would be:
curl -LO https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
Now we can install that by running it as a bash
command:
bash Miniconda3-latest-Linux-x86_64.sh
# again, adjusted if needed based on your system, e.g. if working on a Mac:
# bash Miniconda3-latest-MacOSX-x86_64.sh
We need to interact with the following during installation:
- We will be asked to review the licence agreement. After pressing return/enter, we can view it and hit the
space
bar to advance. At the bottom, we need to type inyes
to accept. - It will then ask if the location is ok, hitting enter will place it in our current home directory, which is usually great.
- At the end it will ask if we want to initialize Miniconda3. There might be reasons to not want to do this, but until we hit one of those reasons, it probably makes things easier to say
yes
to that.
Great! Now we have conda installed. We just need to reload our current shell session, which we can do like so:
source ~/.bashrc
And now we can see a (base)
at the start of our prompt, which tells us we are in the âbaseâ conda environment.
Creating and navigating environments
Here, âenvironmentsâ represent the the infrastructure our computer is currently operating in. This includes things like if certain variables exist and how they are set (like our PATH).
Base environment
The âbaseâ conda environment is, like it sounds, kind of our home base inside conda. We wouldnât want to install lots of complicated programs here, as the more things added, the more likely something is going to end up having a conflict. But the base environment is somewhere we might want to install smaller programs that we tend to use a lot (example below).
Making a new environment
The simplest way we can create a new conda environment is like so:
conda create -n new-env
Where the base command is conda create
, then we are specifying the name of our new environment with -n
(here ânew-envâ). It will check some things out and tell us where it is going to put it, when we hit y
and enter, it will be created.
Entering an environment
To enter that environment, we need to execute:
conda activate new-env
And now we can see our prompt has changed to have (new-env)
at the front, telling us we are in that environment.
If we had forgotten the name, or wanted to see all of our environments, we can do so with:
conda env list
Which will print out all of the available conda environments, and have an asterisk next to the one we are currently in.
Exiting an environment
We can exit whatever conda environment we are currently in by running:
conda deactivate
Making an environment with a specific python version
By default, the conda create
command will use the python version that the base conda environment is running. But we can specify a different one in the command if weâd like:
conda create -n python-v2.7 python=2.7
Breakdown
conda create
â this is our base command-n python-v2.7
â we are naming the environment âpython-v2.7âpython=2.7
â here we are specifying the python version to use within the environment
After entering y
to confirm and complete that, we can see this worked by checking our python version inside and out of these environments.
conda activate base
python --version
conda activate new-env
python --version # same as base
conda activate python-v2.7
python --version # different than base
If we were trying to use a program that was only available in python 2, we could create a conda environment that was prepared to work with it just like this without messing up all the things we use that depend on python 3! (Iâll note itâs kind of hard to appreciate this sort of thing unless youâve fought with the consequences before đ)
Removing an environment
And here is how we can remove an environment, by providing its name to the -n
flag:
conda deactivate # we can't be inside the environment we want to remove
conda env remove -n python-v2.7
Finding and installing packages
To install the packages (tools/programs) we want, we use the conda install
command. But we need to tell conda where to look for it (conda stores and looks for packages in different âchannelsâ), and we need to know what it is called in the conda system (usually this is just the program name we are familiar with). Letâs imagine we want to install the protein-coding gene-prediction program prodigal.
Letâs change into our ânew-envâ and check if prodigal
is there already:
conda activate new-env
prodigal -v
And we get a âcommand not foundâ error if this isnât installed on our system and accessible in our current environment. So letâs look at a common way to find it and install it through conda!
Searching for packages
The first thing I usually do, whether I know if a program is conda-installable or not, is just search in a web-browser for âconda installâ plus whatever program I am looking for. For example, searching for âconda install prodigalâ on google brings the top hit back as the prodigal package page on anaconda.org (which is where these are all hosted). Anaconda.org is also a great place to look if we are having trouble finding a program.
Installing packages
Looking at the prodigal package page, we can see there are instructions for installing through conda. These specify where they are specifying the appropriate channel to look in with the -c
flag. So we can just run this line to install prodigal
:
conda install -c bioconda prodigal
IMPORTANT NOTE
There is more on this below, but because itâs important, weâre going to note it here too. Despite what the install examples show on the anaconda pages like the one above, for many packages from bioconda to install properly, more channels need to be set in a specific hierarchy (e.g. see this in the bioconda documentation here. Some installs work fine regardless (as is the case withprodigal
here), but others will fail, and even worse, some will succeed but have more insidious issues. This is because even thoughprodigal
may be in the bioconda channel, some of its dependencies may be found elsewhere or in multiple places, and having those 3 channels set in the proper priority helps to ensure the correct versions of everything are being used. Since I use conda on different systems all the time, I find it much easier and safer just to run the install command specifiying all the channels needed for it in one line like so:
conda install -c conda-forge -c bioconda -c defaults prodigal
That way we get whatâs needed each time, and we donât have to worry about changing the underlying conda channel hierarchy on each system we use it on like the bioconda documentation demonstrates.
This prints out such as what and what versions of things are going to be installed, and where they are going to be installed (which will end with the name of the environment we created). After inputting y
and pressing return/enter, the program is installed in this environment, and we can check for prodigal
again, only this time successfully:
prodigal -v
And now since the program is installed in this environment, we get the version information printed out as we should. If we move out of this environment, it will not be found again (unless it is installed there too):
conda deactivate
prodigal -v
NOTE: Notice that the prodigal github page doesnât list the conda installation there! This is not all that uncommon, as developers arenât always the folks that make their programs conda-installable. Not seeing conda installation instructions on a particular programâs installation page, does not mean it is not available through conda. Be sure to look for it like shown above!
Uninstalling a package
If we wanted to remove prodigal, we would want to first be inside the environment that has it, so letâs switch back in to our new environment:
conda activate new-env
And then we would run:
conda uninstall prodigal
Once we confirm that with y
and hit return, prodigal
is once again gone:
prodigal -v
Installing a specific version of a package
If we wanted to specify a different version of prodigal
to install, we first need to know it exists in conda. We can see all the version types available like so:
conda search -c bioconda prodigal
This lists 2 different versions available (2.6.2 and 2.6.3, with 3 different âbuildsâ for each. A build here is when the program isnât changed, but how it was packaged for conda has changed. If we wanted to install v2.6.2 instead of v2.6.3 like we got above, we could specify it like so:
conda install -c conda-forge -c bioconda -c defaults prodigal=2.6.2
Just like above when we specified the python version we wanted, we can place an equals sign and then the version we want of any program we are installing.
Confirming that by entering y
and hitting return finishes the installation, and we can see we have this older version now:
prodigal -v
A note on channels
When we do something in conda, it automatically searches our stored channels (and it does so in a specific order). We had to specify -c bioconda
in the command above when we installed prodigal
because that channel wasnât yet in our stored channels. We can either set these ahead of time, or we can specify them when we install something. Most of the things we biologists will want to use are found in the bioconda channel. The prodigal package page instructions on anaconda.org instructions are actually not 100% ideal. Looking at the bioconda documentation we can see that we should specify our channel priority such that conda-forge
is searched before bioconda
which is searched before defaults
.
Even though it worked ok with prodigal
when we only gave it the -c bioconda
channel, other programs might run into an issue. So itâs best to follow the bioconda documentation and specify all the channels in the proper order. Here are two ways we can do that.
We can specify these channels in the installation command like so:
conda activate new-env # making sure we are in our new environment
conda install -c conda-forge -c bioconda -c defaults prodigal
(Though in this case it is going to ask us if we want to update the version we have to a newer version, we can just hit n
and return.)
As mentioned above, I prefer doing things this way so I donât have to ever bother with changing the stored channel configuration on any system I use conda on. Plus, this has the added benefit that I very frequently accidentally type out âconda-forageâ đ
Or we can set the channels ahead of time like the bioconda documentation demonstrates (when done this way, the last one we add has the highest priority):
conda config --add channels defaults # no need to worry about the warning
conda config --add channels bioconda
conda config --add channels conda-forge
And we can see our channel list and priority order with the following:
conda config --get channels
Now weâd be able to install things from bioconda without needing to specify the channels if we wanted, e.g.:
conda install prodigal
(Again this will ask us if we want to update. If we had the same version as the latest, it would just say it is installed alrady. We can just hit n
and return again.)
An example of something we might want to install in our base environment
As mentioned above, the base environment is somewhere we might want to install smaller programs that we tend to use a lot. For example, I have a collection of small scripts I use very frequently stored in a package here called bit
â for bioinformatics tools ÂŻ\_(ă)_/ÂŻ
Letâs switch back into our base environment, by deactivating the one we are currently in, and install it (the -y
flag makes is so we donât need to confirm anything):
conda deactivate
conda install -y -c conda-forge -c bioconda -c defaults -c astrobiomike bit
Breakdown
conda install
â our base command-y
â says not to ask us for any confirmation-c ...
â specifies the channelsbit
â this positional argument is specifying the program name as it is within conda
This might take about a minute (or it might fail, see below), once it is finished we can try one of them:
# if working in the binder, this random-assembly.fa file is there already
# if not, we can grab real quick with:
curl -L -o random-assembly.fa https://ndownloader.figshare.com/files/23842415
bit-summarize-assembly random-assembly.fa
Because I typically use this and other scripts in here all over the place (we can see them all by typing bit-
then hitting tab twice), it makes sense for me to have them in my base environment. But we wouldnât want to put too many large things into one environment unless it is needed, as it increases the likelihood of version conflicts.
Itâs possible that already conflicts with the version of python our base conda environment is using, in which case that may have failed. Due to my adding and adding things to that, Iâve not found that it is pretty restricted to which version of python3 it can use. In that case, I would install it in itâs own environment like weâll look at next.
Creating an environment and installing packages in one command
We can also create an environment and install tools at the same time. Here is creating an environment for GToTree and installing it at the same time:
conda create -n gtotree -c conda-forge -c bioconda -c defaults -c astrobiomike gtotree
Breakdown
conda create
â our base command-n gtotree
â creates the environment named âgtotreeâ-c ...
â specifies the channelsgtotree
â this positional argument is specifying the program name as it is within conda
When that is done, we can change into the environment, and see that it is there:
conda activate gtotree
GToTree -v
gtt-hmms
Again, it might be hard to appreciate if we havenât fought with installations, versions, and environments before, but Conda is stellar đ
Other Conda Resources
- Main Conda documentation
- Main Conda user guide
- Getting started tutorial from Conda folks
- Conda cheat sheet
- Anaconda.org (A good place to search for packages if a regular web-search is failing us)