Questions and answers from HPC on-boarding Oct 2023, Day 2 (Oct 5)
Icebreaker
Which software would you like to run on our HPC machines?
LAMMPS
anvio workpackages, phyloflash
a python project
a python program that interacts with a JS client
trimAl
GTDB-Tk
R studio
Obitools 4
Accessing software
https://training.pages.sigma2.no/tutorials/hpc-intro/episodes/14-modules.html
(questions about previous days also welcome)
What if we need more than one software ? e.g Python and R
in this case we can load both as a module but we need to make sure they are compatible. more about it in a moment.
How do I know which software are not campatible?
This is very good question and something we must be very considerate. Because if the modules that are loaded together are not compatible this will cause issues
The main consideration is the compiler or the tool chain used. This we can see from the prefix(last part of the module name) of the module.
Another thing is when we load modules, it will load all the dependencies along side as well, e.g. sometimes we do not have to do anything extra
We will show more latter in the lesson
Still talking about more than one software, if both need to be in a container, can I download one container for one and install the other software inside the same container or should I use two different containers?
I would install both into the same container. Composing containers is not so easy but it is possible.
What is the difference between installing a software and loading a module?
Loading a module means that the software is installed as a module on the cluster, you can load them when you need the software, this is applicable to most cases, see here. In somecases, the software that you need may not be available as a module, then you can install it as a user. More on this is here
Why did you remove parts of the path? Did not understand that step and what it did/why it was done?
in practice we will not do this manually but we will let the
module
commands modify path. they will in effect pre-pend and modify PATH and thus make software visible or not visible. key point here is that linux will search through PATH from left to right.
How do I find more info about the module command, man module is not giving anything - Try
module --help
- also:module show SOMEMODULE
The
StdEnv
is a set of settings that are needed for the module system to work. Loading this will help the system what to do where to find the software when you ask quesitons likemodule avail
from the system.Do you normally make environments in saga to use software, like miniconda?
in this case I would module load the miniconda module and then use miniconda to manage dependencies
Can we use different toolchain for a job?
As far as I understand the module system (or EasyBuild which we use underneath) assumes that we have a consistent toolchain in our environment
To avoid conflict I would suggest to go for same toolchain
what is GCCcore ?
it is one of the “toolchains”. in other words it is a set of compilers that were used to build the software and dependencies. in this case it is the GNU C compiler. not sure what “core” is but it is probably a minimal version of the compiler suite.
core
concept is from easybuild to serve as base for different toolchain.thanks for clarification!
You can find more info about toolchain on easybuild documentation here
Is it better to use Conda for installing software or a Container?
both have advantages and disadvantages. I would start with Conda and only later consider a container. containerizing might be more work. but the advantage will be that it will be easier to move from one computer to another. but conda alone is probably simpler and it is good enough for most applications.
advanced answer: personally I use both using https://github.com/bast/singularity-conda (but I wrote it)
using a container can be a workaround around the problem of too many files or too long installation process
documentation on how to use Conda on HPC: https://documentation.sigma2.no/software/userinstallsw/conda.html
Is there a way to search for specific information about the software you have installed? -
module avail
will list all the software installed on the cluster -but information about what the software does? lmodule show software
, for eg :module show R/4.2.1-foss-2022a
will print alot of info about R and the version on screen - and also info about what the module do in terms of loading paths etc.try
module help
, for instancemodule help Python/3.11.3-GCCcore-12.3.0
How can a JS client interact with a Python program running on Saga? More concretely, I assume that a web API first authenticates incoming requests from the client and then takes some actions on the cluster, such as running the Python scripts or submitting a Slurm job. But I would greatly appreciate it if you could tell me a few things regarding such a specific way of real-time interaction with the HPC cluster, particularly about the authentication. I think this is an essential use case concerning the “Accessing the Software” part of the tutorial. Thank you!
This is not the way a HPC cluster is normally used. HPC systems are best when you can breakdown your analysis to smaller components and anaysise the components in the same time (parallel)
From your question it seems that you have an idea what you want as a function, but your technical requirement may have to be discussed further. For example JS clinets run on the browser and they do not directly communicte with back-end (e.g with server)
Regarding authentication. Setting up your own web application to authenticate against SAGA is not recommended. What could be an alternative is to use continuous integration(CI/CD) pipleines. But for this even you need to define your front end better.
Can you please give an example on what your intentions are .
Thank you for the response. A good example is a user study scenario where the user interact with the browser app. The app sends the input arrays to the Python program on the cluster, where (1) the data is stored and labelled, and (2) a trained model generates a novel output based on the user’s input, which is fetched back by the client program and shown to the user.
For this I do not think SAGA is the best place to host this. But we have other resources like the service platform.Please send us a mail and we can have discussion about this to how we can help you. May be we can have some pipline, e.g a Git ci/Cd where jobs are triggered on a HPC system.
As shown yesterday, the interaction with the cluster is not immediate, e.g you ask the schedular (SLURM) to exceute the jobs when resources are availble.So a user interface directly communicating might have to wait for some time, that can not be predefined (e.g. it depends on current load at the moment the job was sumbmitted).
Yes, that was precisely what I have been concerning about. Thank you for the explanation! I will surely contact you and we can discuss more specifically so I will not spam here more:) Again, thanks a lot!
Exercise
https://training.pages.sigma2.no/tutorials/hpc-intro/episodes/14-modules.html#id1
Status:
I am done: ooooo0oo
I need more time:
Not trying:
Compute and storage quota
https://training.pages.sigma2.no/tutorials/hpc-intro/episodes/compute-storage-quota.html
What does it mean to change the group using the command: chgrp?
on linux/unix system each file or folder has two ownership aspects. it belongs to a user and a group. it is possible to modify the ownership with chown or chgrp. did this clarify? (maybe I am answering out of context; I was in a different meeting)
Thank youf for the answer. The context is
dusage
and moving to the work directory, I am used to chmod and understand what it does, but not chgrp. Maybe you cloud provide a bit more clarification.with
chown
you can change ownership.chgrp
you can change ownership of the group part (ownership always has a user part and a group part)What if you do not have the permission to do that? I meant chmod not chown? (Difference between chmod and chgrp)?
You can change permission for files you own or part of the group (and the group permissions allow this)
One example for groups is projects. Where a set of group members can share a folder
What are the different groups? How do we know which one to change to?
try typing
groups
- this will list you all the groups that you are part ofbut what is fex the difference between username and username_g?
this is a very good question and I think the answer is that this is a technical choice made by the administrators to enable storage quota in the network file system. I don’t know otherwise why we needed two different groups:
user
anduser_g
. It confuses me as well and I write this as staff.
Is the directory /cluster/work/users/myuser empty automatically every once in a while? I think they mentioned something related to that. So I should move my stuff to the project directory if I want to keep the data?
yes, files and folders older than typically 21 days will get automatically deleted: “The $USERWORK autocleanup period is at least 21 days and up to 42 days if sufficient storage is available”. so this is a good place to run calculations and to generate temporary data that you don’t need to keep but all data that you want to keep for later you should move away to the project folder
Thanks! I will go right away through all of my stuff given I have never transferred anything yet. I have been using it for one month so I might be at risk =)
I will mention this in the next session.
we will try to communicate this better in future to avoid surprises. on the positive side: it can be a nice place also for experimentation where you don’t need to clean up because the system eventually will.
Good point!
So related to question 16, now my work directory is 364 GB big. Is it safe to transfer all to the project directory in one shot (say, with rsync)? Then I will clean up from the project directory
first step: check with
dusage
that you have enough quota for it in the project directorythen you can use rsync to transfer everything in one shot. even if the transfer gets interrupted, you can resume.
if you are sure that your connection does not get interrupted, you could also
mv
the folders but here I would perhaps do it in chunks and not in one shot.rsync
is safer w.r.t. connection loss.Understood! Thanks!
can you please show the project are to explain a use of groups
What is the use of group ownership. So we can share files with others
A group ownership is where you can share folders and files, for eg
nn9970k
project can have files withnn9970k
ownership under the project area.you can use groups to assign access to files
each user may belong to one primary group (
username_g
) and the project group herenn9970k
, you can trygroups
and find the groups you are inDoes it explains ?
Poll: Try dusage
Yes: oooooooo
No:
Poll: Try cost
Yes:oooo
No: