Installing Python packages
pip
and conda
are the easiest ways of installing python packages and
programs as user. In both cases it is advised to use virtual environments to
separate between different workflows/projects. This makes it possible to have
multiple versions of the same package or application without problems of
conflicting dependencies.
Virtual environments
Virtual environments in Python are a nice way to compartmentalize package installation. You can have many virtual environment and we recommend that you at least have one for each disparate experiment. One additional benefit of this setup is that it allows other researchers to easily replicate your setup.
pip
is the main package installer for Python and included in every Python
installation. It is easy to use and can be combined with virtualenv
to manage
independent environments. These can contain different Python versions and packages.
In some cases, packages installed with pip
have problems with complex dependencies
and libraries. In this case, conda
is the better solution.
Setup and installation with pip
Users can install Python packages in a virtual Python environment. Here is how you create a virtual environment with Python:
# First load an appropriate Python module (use 'module avail Python/' to see all)
$ module load Python/3.8.6-GCCcore-10.2.0
# Create the virtual environment.
$ python -m venv my_new_pythonenv
# Activate the environment.
$ source my_new_pythonenv/bin/activate
# Install packages with pip. Here we install pandas.
$ python -m pip install pandas
After the analysis is finished the environment can be unloaded or deactivated using one of the two methods below.
Close the current terminal
Use the deactivate command
In a job script (described below), there is no need to deactivate as the environment is only active in the shell the job was running in.
For more information, have a look at the official
pip
and
virtualenv
documentations.
Note
When running software from your Python environment in a batch script, it is highly recommended to activate the environment only in the script (see below), while keeping the login environment clean when submitting the job, otherwise the environments can interfere with each other (even if they are the same).
Using the virtual environment in a batch script
In a batch script you will activate the virtual environment in the same way as above. You must just load the python module first:
# Set up job environment
set -o errexit # exit on any error
set -o nounset # treat unset variables as error
# Load modules
module load Python/3.8.6-GCCcore-10.2.0
# Set the ${PS1} (needed in the source of the virtual environment for some Python versions)
export PS1=\$
# activate the virtual environment
source my_new_pythonenv/bin/activate
# execute example script
python pdexample.py
Anaconda, Miniconda & Conda
Conda (either in the form of Miniconda or Anaconda) is a combination of a package manager like pip
and a virtual environment manager like venv
. You can use it to install and manage many python and non-python packages yourself.
See Installing software with Conda for details and examples.