Migration to Saga
Before you move existing scripts, jobs, etc. to Saga, we recommend that you familiarise yourself with Saga by reading this page, working through the exercises listed and exploring Saga's capabilities by trying out new commands and options.
The major steps in migrating to Saga are:
- Getting an account and project quota on Saga.
- Becoming aware of the differences (disk quota, module system, job types, running jobs, how to get help, no backup currently).
- Transferring data, scripts, etc. from other machines to Saga.
- Modifying scripts & routines to match differences on Saga.
- Verifying that your jobs run efficiently and produce the same results as on other systems!
- Be patient with user support (firstname.lastname@example.org), but don't hesitate to ask questions!
Current (major) differences of Saga
- NO backup of any file system, neither $HOME, shared folders nor project folders. Please back up important data yourself until we've implemented it!
- NIRD storage is not mounted on Saga's login nodes!
- Quota/disk usage policies are not enforced yet! We strongly recommend not using more than the quota of 20 GiB on $HOME, or you may run into problems later on!
Getting familiar with Saga
Please read the introduction about Saga to learn about Saga's characteristics.
Account and projects
You need a user account (say USER) and a project account on Saga before you can start using it. Obtaining a user account and a project account on Saga works as for the other systems (except that for the Notur period 2019.2 one could not directly apply for compute time on Saga - instead you or your PI will be informed when you can start migrating to Saga). For details see Get User Account and Apply for e-infrastructure resources.
Use your favourite login tool, e.g., ssh or PuTTY, to log in to Saga with your user account. The machine name for Saga is saga.sigma2.no. An example command to log in with ssh is
ssh USER@saga.sigma2.no
For more information see Getting Started. Note, on Saga there is no support for remote desktops configured yet.
File systems and storage quota
Note! Currently (September 2019), there is no backup of any file system on Saga!
Users on Saga have access to the following file systems and usage policies:
|Filesystem|Access|Usage policy (quota, removal)|
|home directory|$HOME|20 GiB quota|
|user specific work folder|$USERWORK|any file older than 21/42 days will be removed automatically, 21 days when total disk usage (do|
|project folder||quota depending on grant|
|job scratch folder|$SCRATCH|specific to each running job, will be removed at end of job|
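To see which files in a work folder are old enough to be affected by the automatic removal, a sketch like the following may help. It assumes that file age is measured by modification time, which this page does not state, and the function name list_old_files is made up for illustration.

```shell
#!/bin/bash
# list_old_files DIR DAYS: print regular files under DIR whose modification
# time lies more than DAYS days in the past.
# Assumption: the automatic clean-up looks at modification time; this page
# does not say which timestamp is actually used.
list_old_files() {
    find "$1" -type f -mtime +"$2" -print
}

# On Saga one might call it like this (USERWORK is only set on the cluster):
if [ -n "${USERWORK:-}" ] && [ -d "${USERWORK}" ]; then
    list_old_files "$USERWORK" 21
fi
```

Files listed for 21 days are candidates under the stricter limit; use 42 for the more relaxed one.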
Check your disk usage with du -sh $HOME. Replace $HOME with any other folder you want to check.
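If you want to compare the usage with the 20 GiB quota automatically, something along these lines could work. This is only a sketch: the helper names usage_kib and quota_status are hypothetical, and the limit is simply the $HOME quota mentioned on this page converted to KiB.

```shell
#!/bin/bash
# Quota on $HOME as stated on this page: 20 GiB, expressed in KiB for du -sk.
HOME_QUOTA_KIB=$((20 * 1024 * 1024))

# usage_kib DIR: disk usage of DIR in KiB (-s summarises, -k selects KiB).
usage_kib() {
    du -sk "$1" 2>/dev/null | awk '{print $1}'
}

# quota_status DIR LIMIT_KIB: print "ok" or "over quota".
quota_status() {
    local used
    used=$(usage_kib "$1")
    if [ "${used:-0}" -gt "$2" ]; then
        echo "over quota"
    else
        echo "ok"
    fi
}
```

After sourcing these definitions, quota_status "$HOME" "$HOME_QUOTA_KIB" reports the state of your home directory. Note that du has to walk the whole folder, so this can take a while on large directory trees.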
Exercise 1: Check your disk usage of
Saga uses Lmod as its module system - the same as on Fram. One major difference from Abel's older module system is that it is case-insensitive (see Exercise 3 below). Use
module avail
to list all available modules. If you want to check which versions of a specific package are available, do
module avail SAM
which shows all packages whose name/version matches SAM.
Exercise 2: Play around with module avail to identify all packages built with the
Exercise 3: Try module avail NETCDF and module avail netCDF on both Saga and Abel and compare the differences.
You may notice that the name/version identifiers for modules on Saga differ from other systems, particularly if you're used to the module system on Abel. On Saga (and Fram) the version (the part following the /) encodes both the version of the software package and the tool chain (a compiler version and supporting libraries) used to build the package. The structure of these identifiers is the same on Saga and Fram. While the module system on Abel follows a similar concept (name/version) as identifier, the tool chain part usually contains only a brief keyword for the compiler being used.
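To make the structure of these identifiers concrete, here is a small sketch that splits one into its parts using shell parameter expansion. The function name split_module_id is made up, and the sketch assumes that the software version itself contains no hyphen, so that everything after the first hyphen is the tool chain.

```shell
#!/bin/bash
# split_module_id ID: split a "name/version" module identifier as used on
# Saga and Fram into package name, software version and tool chain.
# Assumption: the software version contains no hyphen, so everything after
# the first "-" in the version part is taken as the tool chain.
split_module_id() {
    local id=$1 name rest version toolchain
    name=${id%%/*}          # text before the first "/"
    rest=${id#*/}           # text after the first "/"
    version=${rest%%-*}     # up to the first "-"
    toolchain=${rest#*-}    # after the first "-"
    # No hyphen at all means there is no tool chain part.
    [ "$toolchain" = "$rest" ] && toolchain=""
    echo "name=$name version=$version toolchain=$toolchain"
}

split_module_id "SPAdes/3.12.0-foss-2018b"
# prints: name=SPAdes version=3.12.0 toolchain=foss-2018b
```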
Project accounts and compute hour usage
The commands projects and cost work in the same manner as on Fram and Abel. The former lists all project accounts (don't confuse them with your user account) you have access to. The latter provides details about allocated, used, running, pending and available cpu hours for your project accounts. For details of these commands check their usage information with projects --help and cost --help.
Exercise 4: Determine all projects you have access to on Saga and Abel/Fram.
Exercise 5: Check how many cpu hours are available in the project accounts you have access to.
Interactive jobs provide you with a shell prompt on one of the compute nodes where you can run any commands interactively. This is useful for several scenarios such as debugging issues, exploring/developing complex jobs composed of several steps, determining memory requirements, initial trials to build software packages, etc. Common to these scenarios is that they need more than one core, are compute-intensive and/or run for a longer time (longer than a few minutes) - in other words, activities which are forbidden on the login nodes (we monitor usage and automatically kill processes which exceed low thresholds and thereby violate the usage policy).
Once you have found working sequences of commands, for example, for complex jobs or for building software packages, it is highly recommended to put such sequences into a shell script and execute that as a batch job. An example command to submit an interactive job is
srun --account=nnXXXXk --time=00:01:00 --mem=1G --pty bash -i
The parameters before --pty bash -i could also be used to submit a batch job (often they are put into a shell script). On Abel you may have used the command qlogin to submit an interactive job.
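The relation between the two job types can be sketched as follows. The snippet only prints the two command lines instead of running them, so it works on any machine. The variable and function names are made up for illustration, and nnXXXXk is the same placeholder project account as above.

```shell
#!/bin/bash
# Resource parameters shared by both job types (values from the example
# above; nnXXXXk is a placeholder project account).
JOB_PARAMS="--account=nnXXXXk --time=00:01:00 --mem=1G"

# Print, rather than run, the two command lines.
interactive_cmd() { echo "srun $JOB_PARAMS --pty bash -i"; }
batch_cmd()       { echo "sbatch $JOB_PARAMS $1"; }

interactive_cmd
batch_cmd SCRIPTNAME
```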
Our documentation provides additional information about interactive jobs. You may also check the manual page for srun with the command
man srun
Exercise 6: Determine the compute node on which the job runs and list all modules which are loaded once the job has started.
Exercise 7: Play with the srun command, changing the wall time limit (parameter --time) and specifying that it should run on three nodes (parameter --nodes).
Exercise 8: What happens to your interactive job if your connection to Saga terminates? Does the job continue to run (as a batch job would) or does it terminate (as a shell would)?
Most production jobs are run as batch jobs. You specify how many resources the job needs, for how long, which project account it should be billed to, and what commands the job should run. After you have submitted a job, you don't need to do anything to launch it - it may start at any time once the resources it requests become available. If this sounds familiar to you as a user of Abel or Fram, then you're right.
However, there are a few changes you have to apply to your existing job scripts.
Porting job scripts from Abel
A typical job script on Abel looks like (taken from A Simple Serial Job)
#!/bin/bash

# Job name:
#SBATCH --job-name=YourJobname
#
# Project:
#SBATCH --account=YourProject
#
# Wall clock limit:
#SBATCH --time=hh:mm:ss
#
# Max memory usage:
#SBATCH --mem-per-cpu=Size

## Set up job environment:
source /cluster/bin/jobsetup
module purge   # clear any inherited modules
set -o errexit # exit on errors

## Copy input files to the work directory:
cp MyInputFile $SCRATCH

## Make sure the results are copied back to the submit directory:
chkfile MyResultFile

## Do some work:
cd $SCRATCH
YourCommands
This job ported to Saga would be (with some additions taken from Sample MPI Batch Script)
#!/bin/bash

# Job name:
#SBATCH --job-name=YourJobname
#
# Project:
#SBATCH --account=YourProject
#
# Wall clock limit:
#SBATCH --time=hh:mm:ss
#
# Max memory usage: Size is a number plus M (megabyte) or G (gigabyte), e.g., 3M or 5G
#SBATCH --mem-per-cpu=Size
#
# Number of tasks (cores): this is added to make it easier for you to do exercises
#SBATCH --ntasks=1

## Set up job environment: (this is done automatically behind the scenes)
## (make sure to comment '#' or remove the following line 'source ...')
# source /cluster/bin/jobsetup

module restore system   # instead of 'module purge' rather set module environment to the system default
module load SoftWare/Versions   # nb: 'Versions' is mandatory! There are no default versions of modules as on Abel!

# It is also recommended to list loaded modules, for easier debugging:
module list

set -o errexit # exit on errors
set -o nounset # Treat unset variables as errors (added for more easily discovering issues in your batch script)

## Copy input files to the work directory:
cp MyInputFile $SCRATCH

## Make sure the results are copied back to the submit directory (see Work Directory below):
# chkfile MyResultFile
# chkfile is replaced by 'savefile' on Saga
savefile MyResultFile

## Do some work:
cd $SCRATCH
YourCommands
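If you have many Abel job scripts, the mechanical part of the port could be done with sed, as in the following sketch. The function name port_abel_script is hypothetical, and the sketch covers only the three substitutions visible in the two scripts above (jobsetup, module purge, chkfile). It does not add versions to module load lines or the additional set -o nounset, so the result still needs a manual review.

```shell
#!/bin/bash
# port_abel_script FILE: print FILE with three substitutions applied:
#   - comment out 'source /cluster/bin/jobsetup'
#   - replace 'module purge' by 'module restore system'
#   - replace 'chkfile' by 'savefile'
# Only lines that start with the command (after optional whitespace) are
# changed; lines that are already commented out stay as they are.
port_abel_script() {
    sed -e 's|^\([[:space:]]*\)source /cluster/bin/jobsetup|\1# source /cluster/bin/jobsetup|' \
        -e 's|^\([[:space:]]*\)module purge|\1module restore system|' \
        -e 's|^\([[:space:]]*\)chkfile |\1savefile |' \
        "$1"
}
```

A possible use is port_abel_script abel_job.sh > saga_job.sh, which keeps the original script untouched.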
Exercise 9: Explore job limits (what amount of resources you can request and for how long) by adapting the above job script's walltime limit (--time), amount of memory (--mem-per-cpu) and number of cores (--ntasks). You can change the values in a script and submit it with just sbatch SCRIPTNAME or, if you don't want to edit the script for all trials, add the parameters to the sbatch command, for example,
sbatch --time=00:01:00 --ntasks=2 --mem-per-cpu=2G SCRIPTNAME
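Several trials can be generated in a loop. The sketch below only prints the sbatch command lines by default, so you can inspect them first. The function name submit_trials, the DRY_RUN switch and the chosen trial values are assumptions for illustration, not something Saga provides.

```shell
#!/bin/bash
# DRY_RUN=1 (the default here) only prints the command lines;
# set DRY_RUN=0 on Saga to actually submit the jobs.
DRY_RUN=${DRY_RUN:-1}

# submit_trials SCRIPT: one sbatch call per combination of trial values.
submit_trials() {
    local script=$1 ntasks mem
    for ntasks in 1 2 4; do
        for mem in 1G 2G; do
            if [ "$DRY_RUN" = "1" ]; then
                echo "sbatch --time=00:01:00 --ntasks=$ntasks --mem-per-cpu=$mem $script"
            else
                sbatch --time=00:01:00 --ntasks="$ntasks" --mem-per-cpu="$mem" "$script"
            fi
        done
    done
}

submit_trials SCRIPTNAME
```

Keep in mind that every submitted trial is billed to your project account, so a short wall time limit is a sensible choice for such experiments.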
Exercise 10: Add commands to your job script which print environment variables, print the current date (date), sleep for a while (sleep 300) or calculate the disk usage of various directories your job has access to (du -sh $HOME $USERWORK $SCRATCH).
Exercise 11: Add parameters to get notified when your job is started, ends,
#SBATCH --mail-type=ALL and
Transferring files to Saga
The recommended way to transfer files is rsync. In case a transfer is interrupted or when you have changed files at the origin, you can synchronise files on Saga by simply rerunning rsync. A typical rsync command line looks as follows
rsync -a -v all_my_scripts_for_paper_x YOUR_USERNAME@saga.sigma2.no:from_abel/.
The option -a instructs rsync to copy the whole directory tree starting with all_my_scripts_for_paper_x, and -v instructs rsync to be verbose, i.e., it will print what it is doing.
We recommend that you use a specific folder on Saga for files originating from other systems. That makes it easier to keep an original version and lowers the risk of overwriting other things you have on Saga.
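If you want to see how rsync behaves before touching real data, you can try it on local folders first. The following sketch uses temporary directories for both source and destination, so it needs no access to Saga. The folder names scripts and from_abel are only examples, and the copy step is skipped if rsync is not installed.

```shell
#!/bin/bash
# Build a small directory tree and copy it with the options -a -v.
# Source and destination are local temporary folders here; for a real
# transfer the destination would be YOUR_USERNAME@saga.sigma2.no:from_abel/.
demo=$(mktemp -d)
mkdir -p "$demo/src/scripts/A" "$demo/dest/from_abel"
touch "$demo/src/scripts/foo" "$demo/src/scripts/A/bar"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash on the source: the folder "scripts" itself is
    # created inside the destination.
    rsync -a -v "$demo/src/scripts" "$demo/dest/from_abel/."
    # Rerunning the same command only transfers what is missing or changed,
    # which is nothing at this point.
    rsync -a -v "$demo/src/scripts" "$demo/dest/from_abel/."
else
    echo "rsync is not installed on this machine"
fi
```

The verbose output of the second run lists no files, which is the behaviour you rely on when resuming an interrupted transfer.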
Exercise 12: On Abel or your own Linux-based machine create a sample directory tree (for example, by running mkdir -p rsync10/A; mkdir -p rsync10/B; touch rsync10/foo rsync10/A/bar rsync10/B/foobar) and rsync this to your $HOME on Saga.
Exercise 13: Rsync this to another directory you have access to, e.g., $USERWORK or a project directory.
Exercise 14: Rsync a larger directory tree to Saga, interrupt it (press Ctrl-C) and rerun the rsync command.
Installing software in folders accessible to users, e.g., $HOME, can be relatively easy with EasyBuild which is also used for system-wide installations. We illustrate how you can do this for the software package SPAdes for which Saga does not provide all the versions you would find on Abel (3.1.1, 3.5.0, 3.6.0, 3.7.0, 3.8.0, 3.9.0, 3.10.0, 3.10.1, 3.11.0, 3.11.1, 3.12.0, 3.13.0).
We are going to demonstrate how to install version 3.12.0 which is the last version before the current default on Abel.
On a login node, run screen -S spades_eb - see Exercise 16 below for working with screens. With screens you can detach from and reattach to a running session, which is particularly useful when your network connection to Saga could be lost. If that happens, you just have to log in to the machine where you started screen and reattach to the session. Without screens (or similar tools) your session would terminate, which means that your interactive job would terminate as well.
Next, start an interactive job (see details above) with
srun --account=nnXXXXk --time=08:00:00 --nodes=1 --ntasks-per-node=40 --mem=185G --pty bash -i
Restore a clean module environment
module restore system
Load the module for EasyBuild
module load EasyBuild/3.9.3
Download an easyconfig file for SPAdes 3.12.0
eb SPAdes-3.12.0-foss-2018b.eb --fetch
You may check which easyconfig files are available at EasyBuild config files and which have been downloaded to Saga at the local directory
Do a dry run
eb SPAdes-3.12.0-foss-2018b.eb --dry-run
Assuming that is successful, i.e., no errors are reported, build the software - this may take a very long time (hours), particularly when many dependencies are built.
eb SPAdes-3.12.0-foss-2018b.eb -r
When you run module avail SPAdes, it may not be shown yet. That's because the module system doesn't search your $HOME for modules. Do
module use $HOME/.local/easybuild/modules/all
Now you should see it
$ module avail SPAdes

------------- /cluster/home/YOUR_USERNAME/.local/easybuild/modules/all --------------
   SPAdes/3.12.0-foss-2018b

------------------------------ /cluster/modulefiles/all ------------------------------
   SPAdes/3.13.0-foss-2018b    SPAdes/3.13.1-GCC-8.2.0-2.31.1
...
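The module use command works by adding the given folder to the colon-separated MODULEPATH variable. A small sketch for checking whether your personal EasyBuild folder is already part of it could look like this. The function name in_modulepath is made up, and the path is the one used in the module use command above.

```shell
#!/bin/bash
# in_modulepath DIR: succeed if DIR is one of the colon-separated entries
# of MODULEPATH, the variable that 'module use' extends.
in_modulepath() {
    case ":${MODULEPATH:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

EB_MODULES="$HOME/.local/easybuild/modules/all"
if in_modulepath "$EB_MODULES"; then
    echo "already searched: $EB_MODULES"
else
    echo "not searched yet, run: module use $EB_MODULES"
fi
```

Because module use only affects the current shell, a check like this can be handy at the top of job scripts that depend on modules installed in your $HOME.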
Exercise 15: Install the easyconfig SAGE-6.4.eb. Load it and find out what it provides.
Exercise 16: Detach from the screen (type Ctrl-a d), list running screens with screen -list and reattach to screen spades_eb with the command screen -dR spades_eb.
Transferring files back home
Sometimes you may need to transfer files out of Saga, e.g., to your laptop or another server or cluster. The easiest way to do that is to use rsync again. If you can log in to the destination machine from Saga, then you can simply do
rsync -a -v my_folder_on_Saga YOUR_USERNAME_ON_YOUR_MACHINE@YOUR_MACHINE:from_saga
In case you cannot log in to your machine from Saga, you can initiate the transfer from your machine. On your machine do
rsync -a -v YOUR_USERNAME_ON_SAGA@saga.sigma2.no:my_folder_on_Saga from_saga