Creating a Singularity Overlay with Custom Python Packages on LUMI-G

The LUMI-G high-performance computing (HPC) system provides prebuilt Singularity containers that come configured with scientific computing tools and libraries optimized for the LUMI-G infrastructure. To add a specific software stack, such as a machine learning framework, users can follow the installation instructions provided by LUMI; the guides for PyTorch and TensorFlow show how to install such stacks with EasyBuild recipes. After installation, the software is activated with the module load command, which also sets the $SIF environment variable to the path of the newly installed Singularity Image File (SIF).

The prebuilt containers on LUMI-G are comprehensive, but users may still need Python packages that are not included. This guide shows how to create a Singularity overlay that adds those packages on top of the existing container. The base image itself is never modified, which helps keep the environment reproducible.

Throughout the guide, the $WITH_CONDA environment variable is used to activate the Python environment inside the container, which is required before new Python packages can be installed. The variable is predefined by the container developers and is run as a command, as the scripts in this guide do.

Prerequisites

  • A Singularity image installed using EasyBuild on LUMI-G

  • A requirements.txt file listing the desired Python packages
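
Both prerequisites can be checked before starting. The following is a minimal sketch, not part of the LUMI tooling: check_prereqs is a hypothetical helper name, and it assumes it is run from the directory that holds requirements.txt after the container module has been loaded.

```shell
#!/bin/bash
# Sketch: verify the prerequisites before building the overlay.
# check_prereqs is a hypothetical helper, not provided by LUMI.
check_prereqs() {
    local status=0
    # $SIF is set by `module load` for the EasyBuild-installed container.
    if [ -z "${SIF:-}" ]; then
        echo "SIF is not set; load the container module first" >&2
        status=1
    elif [ ! -f "$SIF" ]; then
        echo "SIF points to a missing file: $SIF" >&2
        status=1
    fi
    # requirements.txt must exist in the current directory and be non-empty.
    if [ ! -s requirements.txt ]; then
        echo "requirements.txt is missing or empty in $PWD" >&2
        status=1
    fi
    return "$status"
}
```

Calling check_prereqs returns a non-zero status and prints a message for each missing prerequisite, so it can guard the later steps, for example check_prereqs && ./create_overlay.sh.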

Step 1: Prepare the requirements.txt File

Create a requirements.txt file that lists all the Python packages you need to add to the LUMI-G provided Singularity image. This file will be used to specify the dependencies that are not already included in the prebuilt image.

Example requirements.txt:

wandb
opencv-python
torchmetrics
albumentations
einops
pandas
matplotlib

Step 2: Create the install.sh Script

The install.sh script automates the installation of packages listed in requirements.txt. It sets up the environment and installs the packages into a staging directory that will later be converted into an overlay.

Script install.sh:

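#!/bin/bash

# Activate the container's Python environment. This runs before strict mode
# is enabled and is left unquoted on purpose so it is executed as a command.
$WITH_CONDA

# Print each command and stop on errors, unset variables, and pipeline failures.
set -xeuo pipefail

# Keep temporary build files in the working directory.
export TMPDIR="$PWD"

# Mirror the environment's path (without the leading "/") under staging/.
export INSTALLDIR="$PWD/staging/$(dirname "$(dirname "$(which python)")" | cut -c2-)"
export PYMAJMIN=$(python -c "import sys; print(f'{sys.version_info[0]}.{sys.version_info[1]}')")
# Let pip see the packages already installed into the staging directory.
export PYTHONPATH="$INSTALLDIR/lib/python$PYMAJMIN/site-packages"

pip install --no-cache-dir --prefix="$INSTALLDIR" --no-build-isolation -r requirements.txt

exit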

Make the script executable:

chmod u+x install.sh
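
The INSTALLDIR line in install.sh is the least obvious part of the script, so here is a small sketch of what it computes. staging_prefix is a hypothetical helper, and the interpreter path in the example is made up for illustration; the real path depends on the container. The idea, as far as the script shows, is that the staging directory mirrors the environment's absolute path, so that once the overlay is mounted on the container's root the new packages appear inside the existing environment.

```shell
#!/bin/bash
# Sketch: how install.sh derives the staging prefix from the interpreter path.
# staging_prefix is a hypothetical helper; the example path below is made up.
staging_prefix() {
    local python_path=$1
    local env_root
    # Strip "/bin/python" to get the environment root.
    env_root=$(dirname "$(dirname "$python_path")")
    # Drop the leading "/" so the path can be nested under staging/.
    printf 'staging/%s\n' "$(printf '%s' "$env_root" | cut -c2-)"
}

staging_prefix /opt/miniconda3/envs/pytorch/bin/python
# prints: staging/opt/miniconda3/envs/pytorch
```

install.sh additionally prefixes this with $PWD so that pip receives an absolute path.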

Step 3: Create the create_overlay.sh Script

The create_overlay.sh script runs install.sh inside the Singularity image, which fills the staging directory, and then packs that directory into a squashfs overlay file. The overlay file contains only the additional Python packages.

Script create_overlay.sh:

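#!/bin/bash
set -euo pipefail

# Run install.sh inside the container, then pack staging/ into the overlay.
singularity exec -B "$PWD" "$SIF" ./install.sh 1>&2
chmod -R 777 staging/
mksquashfs staging/ overlay.squashfs -no-xattrs -processors 8 1>&2
rm -rf staging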

Make this script executable:

chmod u+x create_overlay.sh

Step 4: Execute the Overlay Creation Script

Run the create_overlay.sh script to produce the overlay.squashfs file. This single file holds all of the added packages and can be copied and reused wherever the same container image is used on LUMI-G.

./create_overlay.sh
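
After the script finishes, a quick sanity check can confirm that a plausible overlay file was written. The sketch below is an assumption-laden convenience, not part of the original workflow: looks_like_squashfs is a hypothetical helper that only checks that the file is non-empty and starts with the 4-byte magic "hsqs" used by little-endian SquashFS images. It does not validate the contents.

```shell
#!/bin/bash
# Sketch: sanity-check the overlay file produced by create_overlay.sh.
# looks_like_squashfs is a hypothetical helper name.
looks_like_squashfs() {
    local file=$1
    # The file must exist and be non-empty.
    [ -s "$file" ] || return 1
    # Little-endian SquashFS images begin with the 4-byte magic "hsqs".
    [ "$(head -c 4 "$file" 2>/dev/null)" = "hsqs" ]
}
```

For example, looks_like_squashfs overlay.squashfs && echo "overlay looks valid" prints the message only when the check passes.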

Step 5: Modify Your Singularity Run Command with Overlay

To run your Singularity container on LUMI-G with the custom overlay, add the --overlay flag to your existing singularity command, before the image path. The flag takes the path to your overlay.squashfs file, which contains the additional Python packages you installed.

Here is an example of how such a command might look:

singularity exec --overlay path_to_overlay/overlay.squashfs $SIF bash -c '$WITH_CONDA; python my_script.py'

In this example, replace path_to_overlay/overlay.squashfs with the actual path to your overlay file, and my_script.py with the script or command line you intend to execute within the container. This modification ensures that your custom environment is active during the container’s runtime.
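
If the same command is used in several job scripts, it can be wrapped in a small function. This is a sketch under stated assumptions: run_with_overlay is a hypothetical helper name, it expects $SIF to be set by the container module, and it passes every argument after the overlay path on to python unchanged.

```shell
#!/bin/bash
# Sketch: run a Python script in the container with the overlay attached.
# run_with_overlay is a hypothetical wrapper around the command shown above.
run_with_overlay() {
    local overlay=$1
    shift
    if [ ! -f "$overlay" ]; then
        echo "overlay not found: $overlay" >&2
        return 1
    fi
    if [ -z "${SIF:-}" ]; then
        echo "SIF is not set; load the container module first" >&2
        return 1
    fi
    # The inner shell activates the environment, then runs python with the
    # remaining arguments passed through unchanged.
    singularity exec --overlay "$overlay" "$SIF" \
        bash -c '$WITH_CONDA; python "$@"' bash "$@"
}
```

A call such as run_with_overlay path_to_overlay/overlay.squashfs my_script.py then behaves like the command above, but fails early with a clear message when the overlay file or $SIF is missing.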