Installing software with EasyBuild

The NRIS software team uses EasyBuild to install system-wide software and scientific applications on all Norwegian HPC systems. Users can install software with the same tool, and doing so is quite easy (hence the name), provided that an easyconfig file already exists for the package you are interested in.

With EasyBuild you might encounter a few different scenarios:

  1. An easyconfig already exists for my software and can be used as is.

  2. An easyconfig exists, but not for the software version or compiler toolchain that I need.

  3. An easyconfig does not exist for my software (not covered here, please refer to general EasyBuild tutorials).

In the following we cover the first two scenarios. We describe installations for a single user, which place the software in the user's home folder, as well as installations for an entire project or group, which place the software in a project folder that many can share.

About EasyBuild

EasyBuild is a software build and installation framework specifically targeting HPC systems, with focus on build automation, reproducibility, and automatic dependency resolution. It is fully compatible with the Lmod module system that is used on all our HPC clusters, and every EB installation will automatically generate a corresponding module file which allows you to load the software into your environment.

EasyBuild is very explicit in the build specifications for each part of a software’s chain of dependencies, and is very careful not to mix dependencies that are built using different compiler versions or toolchains. This is the reason for the long names, like rjags-4-12-foss-2021b-R-4.1.2, which means that the program rjags version 4-12 has been built using the foss-2021b toolchain for use with R version 4.1.2. This in turn means that all other dependencies that rjags might have, and anything that wants to use rjags as a dependency, must be compatible with these versions.
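The naming pattern can be illustrated with a small shell sketch. It assumes the usual layout of name, version, toolchain name, toolchain version and an optional version suffix; ec_filename is a hypothetical helper written for this page, not an EasyBuild command.

```shell
# ec_filename: hypothetical helper that composes an easyconfig file name as
# <name>-<version>-<toolchain name>-<toolchain version><versionsuffix>.eb
# Usage: ec_filename <name> <version> <toolchain name> <toolchain version> [versionsuffix]
ec_filename() {
    printf '%s-%s-%s-%s%s.eb\n' "$1" "$2" "$3" "$4" "${5:-}"
}

ec_filename rjags 4-12 foss 2021b -R-4.1.2   # prints rjags-4-12-foss-2021b-R-4.1.2.eb
ec_filename JAGS 4.3.0 foss 2021b            # prints JAGS-4.3.0-foss-2021b.eb
```

Going the other way (splitting a file name into its parts) is not reliable with simple string splitting, because versions such as 4-12 contain hyphens themselves.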

Installing software in your home folder

By default, EasyBuild will download, build and install software in a directory called .local/easybuild/ in your own $HOME folder, which means that only you will be able to find and use the package after it is installed. The following procedure applies equally well to all our current HPC systems.

The first thing we do is to load the EasyBuild module on a login node of your cluster of choice. Make sure that you have a clean environment with no other modules loaded, then search for available EasyBuild versions and load the latest version you can find:

$ module purge
$ module avail easybuild
----------------------- /cluster/modulefiles/all ------------------------
EasyBuild/4.9.0

$ module load EasyBuild/4.9.0

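Say you want to install the already mentioned rjags package. You first need to find out which easyconfigs are available on the system. This can be done with the --search-filename option, which lists the matching easyconfig file names (the related -S option, short for --search-short, prints the matching paths in a condensed form instead). To display all rjags versions that can be installed, use: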

$ eb --search-filename rjags
== found valid index for /cluster/software/EasyBuild/4.9.0/easybuild/easyconfigs, so using it...
 * rjags-4-6-intel-2017a-R-3.4.0.eb
 * rjags-4-6-intel-2017b-R-3.4.3.eb
 * rjags-4-8-foss-2018b-R-3.5.1.eb
 * rjags-4-9-foss-2019a-R-3.6.0.eb
 * rjags-4-10-foss-2019b.eb
 * rjags-4-10-foss-2020a-R4.0.0.eb
 * rjags-4-10-foss-2020b-R-4.0.3.eb
 * rjags-4-10-foss-2020b-R-4.0.4.eb
 * rjags-4-10-foss-2020b-R-4.0.5.eb
 * rjags-4-10-foss-2021a-R-4.1.0.eb
 * rjags-4-10-fosscuda-2020b-R-4.0.3.eb
 * rjags-4-10-fosscuda-2020b-R-4.0.4.eb
 * rjags-4-10-fosscuda-2020b-R-4.0.5.eb
 * rjags-4-12-foss-2021b-R-4.1.2.eb
 * rjags-4-13-foss-2022a-R-4.2.1.eb
 * rjags-4-13-foss-2022b-R-4.2.2.eb

Note: 1 matching archived easyconfig(s) found, use --consider-archived-easyconfigs to see them
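When the list is long, it can help to narrow it down to one toolchain. The following is a minimal sketch using standard grep and sed on a few lines copied from the output above; filter_toolchain is a hypothetical helper, not part of EasyBuild.

```shell
# A few lines copied from the search output above, used as sample input.
listing=' * rjags-4-10-fosscuda-2020b-R-4.0.3.eb
 * rjags-4-12-foss-2021b-R-4.1.2.eb
 * rjags-4-13-foss-2022a-R-4.2.1.eb
 * rjags-4-13-foss-2022b-R-4.2.2.eb'

# filter_toolchain: hypothetical helper that keeps only the easyconfigs for a
# given toolchain and strips the leading " * " bullet.
filter_toolchain() {
    # -F matches a fixed string; the surrounding dashes keep "foss" from
    # also matching "fosscuda".
    grep -F -- "-$1-" | sed 's/^ \* //'
}

printf '%s\n' "$listing" | filter_toolchain foss-2022a
# On a cluster: eb --search-filename rjags | filter_toolchain foss-2022a
```

With the sample input this prints only rjags-4-13-foss-2022a-R-4.2.1.eb.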

From this list we choose the rjags version mentioned above, i.e. rjags-4-12-foss-2021b-R-4.1.2.eb.

We advise doing the installation in three steps: first download the sources of your software, then do a dry run to check what will be installed, and finally do the full install.

Step 1: To fetch the source, run the following command:

$ eb rjags-4-12-foss-2021b-R-4.1.2.eb --fetch

This will download a tarball into your local EB directory .local/easybuild/sources:

$ ls $HOME/.local/easybuild/sources/r/rjags
rjags_4-12.tar.gz

Step 2: It is a good idea to perform a dry run first, to get an overview of what will be installed by the command you are planning to use:

$ eb rjags-4-12-foss-2021b-R-4.1.2.eb --dry-run

This will check that all the necessary dependencies are available in the current EB repository. If not, you will get an error message. It will also print the full list of dependencies, where everything marked with [x] means that the dependency is already satisfied, and won’t be re-installed. If you only want to see which extra dependencies are going to be installed:

$ eb rjags-4-12-foss-2021b-R-4.1.2.eb --missing-modules

Step 3: If the dry run was successful, you can perform the build with:

$ eb rjags-4-12-foss-2021b-R-4.1.2.eb --robot --parallel=2

where --robot means that EasyBuild should automatically resolve and install all necessary dependencies, and --parallel will set the number of CPU threads to use in the build. Hopefully, this returns successfully after a few minutes with a message like this:

== Build succeeded for 1 out of 1
== Temporary log file(s) /tmp/eb-BoOCuj/easybuild-CuSy5M.log* have been removed.
== Temporary directory /tmp/eb-BoOCuj has been removed.

Note

The default build will use all available cores, so please set the --parallel option to a more reasonable number to avoid clogging the login node.
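One way to pick a modest value is to cap the thread count. This is only a sketch: the cap of 4 (MAX_THREADS) is an arbitrary number chosen for illustration, not a site policy, and the script only prints the resulting command.

```shell
# Pick a modest number of build threads: at most MAX_THREADS (an arbitrary
# cap chosen for this sketch), and never more than the cores on the node.
MAX_THREADS=4
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
if [ "$cores" -lt "$MAX_THREADS" ]; then
    threads=$cores
else
    threads=$MAX_THREADS
fi

# Print the command rather than running it, so the value can be inspected first.
echo "eb rjags-4-12-foss-2021b-R-4.1.2.eb --robot --parallel=$threads"
```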

You can now confirm that the package has been installed under .local/easybuild/software:

$ ls $HOME/.local/easybuild/software/rjags/4-12-foss-2021b-R-4.1.2/rjags/
data  DESCRIPTION  help  html  INDEX  libs  Meta  NAMESPACE  R
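The three steps can be wrapped in a small shell function if you install several packages. This is a sketch under stated assumptions: install_ec and the EB_CMD variable are names introduced here for illustration, and the function simply stops at the first step that fails.

```shell
# EB_CMD is a variable introduced for this sketch; set EB_CMD=echo to print
# the commands instead of running them.
EB_CMD=${EB_CMD:-eb}

# install_ec: hypothetical helper that fetches the sources, does a dry run,
# and then builds with a limited number of threads (default 2).
# Usage: install_ec <easyconfig> [threads]
install_ec() {
    ec=$1
    threads=${2:-2}
    "$EB_CMD" "$ec" --fetch || return 1
    "$EB_CMD" "$ec" --dry-run || return 1
    "$EB_CMD" "$ec" --robot --parallel="$threads"
}
```

After loading the EasyBuild module, it would be called as install_ec rjags-4-12-foss-2021b-R-4.1.2.eb 2. Running the steps by hand remains the better choice the first time, since the dry-run output deserves a careful read.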

Using locally installed software

In the example above we installed the rjags code locally in our home folder under .local/easybuild/software. At the same time, EasyBuild created a new module file under .local/easybuild/modules:

$ ls $HOME/.local/easybuild/modules/all/rjags/
4-12-foss-2021b-R-4.1.2.lua

which can now be loaded and used alongside any other globally installed software. In order to do so you need to tell the module system to look for modules in this directory, which is done with the module use command:

$ module use $HOME/.local/easybuild/modules/all
$ module avail rjags
------------------------------------ .local/easybuild/modules/all -------------------------------------
    rjags/4-12-foss-2021b-R-4.1.2

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

You can now load and use the package just like any other module:

$ module load rjags/4-12-foss-2021b-R-4.1.2

For more information about the module system, please see: https://lmod.readthedocs.io/en/latest

If you are planning to install more than a few modules you will quickly run out of disk quota in the home folder. In general we recommend that you instead install in a project folder.

Installing software in a project folder

In order to install semi-globally under a project directory you should make the following changes to the above procedure.

$ my_path=/cluster/projects/nnXXXXk/easybuild
$ mkdir -p $my_path
$ eb rjags-4-12-foss-2021b-R-4.1.2.eb --prefix=$my_path --fetch
$ eb rjags-4-12-foss-2021b-R-4.1.2.eb --prefix=$my_path --dry-run
$ eb rjags-4-12-foss-2021b-R-4.1.2.eb --prefix=$my_path --robot --parallel=2

where XXXX is your project id number. The easybuild folder at the end of the path is a housekeeping tip and not strictly required. This gives the same path structure as in the home folder case, with the software and module files installed under /cluster/projects/nnXXXXk/easybuild.

Now the rjags installation is available to everyone associated with the nnXXXXk project, after typing:

$ module use /cluster/projects/nnXXXXk/easybuild/modules/all

The prefix option (in fact any command line option) can also be set via an environment variable like this:

$ export EASYBUILD_PREFIX=/cluster/projects/nnXXXXk/easybuild

and then you need not repeat the --prefix option on the command line.
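The variable name follows from the option name: drop the leading dashes, turn the rest into upper case, replace any remaining dashes with underscores, and put EASYBUILD_ in front. The sketch below shows that mapping; eb_env_name is a hypothetical helper written for this page.

```shell
# eb_env_name: hypothetical helper that maps a long command line option to
# the name of the corresponding EASYBUILD_* environment variable.
eb_env_name() {
    opt=${1#--}    # drop the leading dashes, if any
    printf 'EASYBUILD_%s\n' "$(printf '%s' "$opt" | tr 'a-z-' 'A-Z_')"
}

eb_env_name --prefix      # prints EASYBUILD_PREFIX
eb_env_name --parallel    # prints EASYBUILD_PARALLEL
```

Options given on the command line still take precedence over values set in the environment.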

Writing your own easyconfigs

Let’s say that for some reason you need to use rjags-4-12 with the foss/2022a toolchain, instead of 2021b which we already installed. A quick look into the EB repo tells you that this particular version is not available, which means that simply substituting the toolchain version will not work:

$ eb rjags-4-12-foss-2022a-R-4.1.2.eb --fetch
== Temporary log file in case of crash /tmp/eb-bliwuikc/easybuild-2a9dc17d.log
== found valid index for /cluster/software/EasyBuild/4.9.0/easybuild/easyconfigs, so using it...
ERROR: One or more files not found: rjags-4-12-foss-2022a-R-4.1.2.eb (search paths:
/cluster/software/EasyBuild/4.9.0/easybuild/easyconfigs)

In this case you need to write your own easyconfig file and use that for your EasyBuild installation. Depending on the package this can be either really simple or frustratingly complicated. This particular example is somewhere in between.

Be aware that our clone of the EB repo is only occasionally updated, so first check whether the exact version you are looking for has already become available in the central EB repo. If it has, you can simply download or copy the easyconfig file from there. Otherwise it’s a good idea to start from an easyconfig that is similar to the one you are trying to make. In our case we will copy the rjags-4-12-foss-2021b-R-4.1.2.eb file and work from there.

Step 1: Copy a similar easyconfig to somewhere in your $HOME, here a folder called eb-sandbox:

$ mkdir -p $HOME/eb-sandbox
$ cd $HOME/eb-sandbox
$ eb --copy-ec rjags-4-12-foss-2021b-R-4.1.2.eb

Step 2: Inspect the easyconfig file and check for dependencies; in this case there are two, R and JAGS. Next you need to check if any of the dependencies are available with the toolchain that you want. Here we see that R-4.2.1 is available with foss-2022a:

$ ls $EBROOTEASYBUILD/easybuild/easyconfigs/r/R/*foss-2022a*
/cluster/software/EasyBuild/4.9.0/easybuild/easyconfigs/r/R/R-4.2.1-foss-2022a.eb

A newer version of JAGS is also available, but if we assume that we need to stay at version 4.3.0, there is no easyconfig for it with the new toolchain:

$ ls $EBROOTEASYBUILD/easybuild/easyconfigs/j/JAGS/JAGS-4.3.0-foss-2022a*
No such file or directory

However, we do have a version with foss/2021b (which of course is the one used by our original rjags), so we’ll copy that one as well and adapt it to our target toolchain:

$ eb --copy-ec JAGS-4.3.0-foss-2021b.eb

You can see that this procedure quickly gets more complicated when you have to recursively update all dependencies of your original package, but thankfully in our case it stops here.
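With more than a couple of dependencies, the availability check from Step 2 can be scripted. The sketch below assumes the easyconfigs tree is laid out as first letter (lower case), then software name, as in the paths shown in Step 2. has_ec and report are hypothetical helpers, and the example runs against a throwaway directory so it can be tried anywhere.

```shell
# has_ec: hypothetical helper that lists easyconfigs for <software> whose
# file name contains <pattern>.
# Usage: has_ec <easyconfigs dir> <software> <pattern>
has_ec() {
    first=$(printf '%s' "$2" | cut -c1 | tr 'A-Z' 'a-z')
    find "$1/$first/$2" -name "$2-*$3*.eb" 2>/dev/null | sort
}

# Throwaway tree for the demonstration. On a cluster, point the helper at
# $EBROOTEASYBUILD/easybuild/easyconfigs instead.
tree=$(mktemp -d)
mkdir -p "$tree/r/R" "$tree/j/JAGS"
touch "$tree/r/R/R-4.2.1-foss-2022a.eb" "$tree/j/JAGS/JAGS-4.3.0-foss-2021b.eb"

# report: hypothetical helper that says whether a match exists in the demo tree.
report() {
    if [ -n "$(has_ec "$tree" "$1" "$2")" ]; then
        echo "$1 $2: found"
    else
        echo "$1 $2: missing, needs a new easyconfig"
    fi
}

report R foss-2022a
report JAGS 4.3.0-foss-2022a
rm -rf "$tree"
```

In the demo tree R is reported as found and JAGS 4.3.0 as missing, which mirrors the situation described in Step 2.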

Note

The toolchains do not have to match literally; they just must not conflict. For instance, foss/2022a includes the compiler GCCcore/11.3.0, so any easyconfig with the GCCcore-11.3.0 suffix would also be compatible. For more information about common toolchains: https://docs.easybuild.io/common-toolchains/

Step 3: Starting with your deepest dependency, edit your new easyconfigs and change the version specifications. In JAGS it’s just a matter of changing the toolchain to:

toolchain = {'name': 'foss', 'version': '2022a'}

If the software version is changed you also need to update the checksum line. For rjags you do exactly the same, but you should also update the versions of the dependencies. The JAGS entry does not have to be changed, because we’re still using version 4.3.0, but R has to be changed to version 4.2.1, which was the one we found in Step 2 to be available for our toolchain.

Step 4: Rename the easyconfig files to match the new versions:

$ mv JAGS-4.3.0-foss-2021b.eb JAGS-4.3.0-foss-2022a.eb
$ mv rjags-4-12-foss-2021b-R-4.1.2.eb rjags-4-12-foss-2022a-R-4.2.1.eb
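For a plain toolchain bump, the edit in Step 3 and the rename in Step 4 can be combined in a small script. This is a sketch, not an EasyBuild feature: retarget_ec is a hypothetical helper that writes a new copy (leaving the original in place) and only touches the toolchain line, so dependency versions and version suffixes still have to be edited by hand.

```shell
# retarget_ec: hypothetical helper that copies an easyconfig to a new file
# with the toolchain version replaced in the file name and on the toolchain
# line. It prints the name of the new file.
# Usage: retarget_ec <easyconfig> <old toolchain version> <new toolchain version>
retarget_ec() {
    src=$1; old=$2; new=$3
    # New file name: first occurrence of -<old> becomes -<new>
    dst=$(printf '%s' "$src" | sed "s/-$old/-$new/")
    # Only rewrite the version on the line that starts with "toolchain ="
    sed "/^toolchain *=/s/'version': '$old'/'version': '$new'/" "$src" > "$dst"
    echo "$dst"
}
```

Calling retarget_ec JAGS-4.3.0-foss-2021b.eb 2021b 2022a in eb-sandbox would write JAGS-4.3.0-foss-2022a.eb. For rjags the result would still carry the old R version in both the file name and the dependency list, so that file needs the manual changes described in Step 3.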

Our new easyconfigs should now look like this:

JAGS-4.3.0-foss-2022a.eb:

# This file is an EasyBuild reciPY as per https://github.com/easybuilders/easybuild
# Author: Pablo Escobar Lopez
# Swiss Institute of Bioinformatics
# Biozentrum - University of Basel

easyblock = 'ConfigureMake'

name = 'JAGS'
version = '4.3.0'

homepage = 'http://mcmc-jags.sourceforge.net/'
description = """JAGS is Just Another Gibbs Sampler.  It is a program for analysis
 of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation  """

toolchain = {'name': 'foss', 'version': '2022a'}

source_urls = [
    ('https://sourceforge.net/projects/mcmc-%(namelower)s/files/%(name)s/%(version_major)s.x/Source/', 'download')
]
sources = [SOURCE_TAR_GZ]
checksums = ['8ac5dd57982bfd7d5f0ee384499d62f3e0bb35b5f1660feb368545f1186371fc']

configopts = ' --with-blas="$LIBBLAS" --with-lapack="$LIBLAPACK"'


sanity_check_paths = {
    'files': ['bin/%(namelower)s', 'libexec/%(namelower)s-terminal', 'lib/libjags.%s' % SHLIB_EXT],
    'dirs': [],
}

modextrapaths = {
    'JAGS_INCLUDE': 'include/%(name)s',
    'JAGS_LIB': 'lib',
}

moduleclass = 'math'

rjags-4-12-foss-2022a-R-4.2.1.eb:

easyblock = 'RPackage'

name = 'rjags'
version = '4-12'
versionsuffix = '-R-%(rver)s'

homepage = 'https://cran.r-project.org/web/packages/rjags'
description = """The rjags package is an interface to the JAGS library."""

toolchain = {'name': 'foss', 'version': '2022a'}

source_urls = [
    'https://cran.r-project.org/src/contrib/',
    'https://cran.r-project.org/src/contrib/Archive/rjags/',
]
sources = ['%(name)s_%(version)s.tar.gz']
checksums = ['b91f34c3f9ebf78fa44bd661346fbb6f28a01693a7203ac133c98392dd273e10']

dependencies = [
    ('R', '4.2.1'),
    ('JAGS', '4.3.0'),
]

sanity_check_paths = {
    'files': [],
    'dirs': ['rjags'],
}

moduleclass = 'math'

Step 5: Build your new module while adding the current directory to --robot so that your new .eb files are picked up:

$ eb rjags-4-12-foss-2022a-R-4.2.1.eb --robot=. --parallel=2

Again, please don’t use too many --parallel threads on the login node!

Step 6: Load your shining new module:

$ module use $HOME/.local/easybuild/modules/all
$ module avail rjags
--------------------------- $HOME/.local/easybuild/modules/all ---------------------------
    rjags/4-12-foss-2021b-R-4.1.2    rjags/4-12-foss-2022a-R-4.2.1

$ module load rjags/4-12-foss-2022a-R-4.2.1

Note

In this particular case we could also have used the --try-* options to update the toolchain version without having to edit any easyconfigs by hand: https://docs.easybuild.io/using-easybuild/#tweaking_easyconfigs_using_try
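A sketch of what that could look like, using the --try-toolchain-version option together with --dry-run so that nothing is built until the plan has been inspected. try_toolchain and the EB_CMD variable are names introduced here for illustration, and whether the generated easyconfigs actually build is not guaranteed, in particular when a dependency such as R also needs a different version.

```shell
# EB_CMD is a variable introduced for this sketch; set EB_CMD=echo to print
# the command instead of running it.
EB_CMD=${EB_CMD:-eb}

# try_toolchain: hypothetical helper that previews a build of an existing
# easyconfig with a different toolchain version.
# Usage: try_toolchain <easyconfig> <toolchain version>
try_toolchain() {
    "$EB_CMD" "$1" --try-toolchain-version="$2" --robot --dry-run
}
```

For the example on this page the call would be try_toolchain rjags-4-12-foss-2021b-R-4.1.2.eb 2022a.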

For more information on how to write easyconfigs: https://easybuild.readthedocs.io/en/latest/Writing_easyconfig_files.html