Job Types on Olivia
Olivia is designed for large-scale parallel jobs and GPU-accelerated workloads. With its high-performance compute nodes featuring 256 or 288 CPUs and substantial memory per node, Olivia is suited for computationally intensive applications that can scale across many cores.
The basic allocation units on Olivia are cpus, memory and GPUs, or whole nodes. The details about how the billing units are calculated can be found in Projects and accounting. Note that the number of GPUs is counted separately, not as part of the billing units.
Small
Allocation units: cpus and memory
Job Limits:
maximum 256 billing units
maximum 1 node
Maximum walltime: 7 days
Priority: normal
Available resources:
88 nodes with 256 AMD cpus and 741 GiB RAM
Parameter for sbatch/salloc:
--partition=small (can be omitted; small is the default)
Job Scripts: Small
This is the default job type, meant for CPU-only jobs that need less than a whole node. The partition is good for:
Memory-intensive applications requiring substantial RAM but fewer than 256 CPUs.
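As an illustration, a minimal job script for the small partition could look like the sketch below. The project account, job name, resource numbers and program name are placeholders, to be replaced with your own:

```shell
#!/bin/bash
#SBATCH --account=nnXXXXk        # placeholder: your project account
#SBATCH --job-name=small-test    # hypothetical job name
#SBATCH --partition=small        # optional; small is the default
#SBATCH --ntasks=32              # example: 32 CPUs on a single node
#SBATCH --mem-per-cpu=2G         # example memory request
#SBATCH --time=0-01:00:00        # walltime; max 7 days on small

set -o errexit                   # exit the script on errors

srun ./my_program                # placeholder for your application
```

The script is submitted with `sbatch`, e.g. `sbatch small_job.sh`.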
Large
Allocation units: whole nodes
Job Limits:
maximum 9 nodes
Maximum walltime: 7 days
Priority: normal
Available resources:
172 nodes with 256 AMD cpus and 741 GiB RAM
Parameter for sbatch/salloc:
--partition=large
Job Scripts: Large
This is meant for larger CPU-only jobs, needing at least one node. The partition is good for:
Large-scale parallel computations
Memory-intensive applications requiring substantial RAM and many CPUs
Jobs that can efficiently utilize many CPU cores
Scientific simulations requiring significant computational resources
Note that jobs are allocated whole nodes, regardless of what
--ntasks and/or --ntasks-per-node specify. If needed, a notice
about this is printed at submission. It can be suppressed by
setting the environment variable
SLURM_SUBMIT_SUPPRESS_NTASKS_WARNING or
SLURM_SUBMIT_SUPPRESS_WARNINGS to 1 (the latter suppresses all
submission warnings).
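Since large jobs are allocated whole nodes, a script typically requests full nodes and fills them with tasks. A sketch, assuming a hypothetical MPI program (account, node count and program name are placeholders):

```shell
#!/bin/bash
#SBATCH --account=nnXXXXk          # placeholder: your project account
#SBATCH --job-name=large-test      # hypothetical job name
#SBATCH --partition=large
#SBATCH --nodes=4                  # whole nodes; max 9 on large
#SBATCH --ntasks-per-node=256      # one task per CPU core on these nodes
#SBATCH --time=1-00:00:00          # walltime; max 7 days

set -o errexit                     # exit the script on errors

srun ./my_mpi_program              # placeholder MPI application
```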
Accel
Allocation units: cpus, memory and GPUs
Job Limits:
minimum 1 GPU
maximum 32 GPUs
Maximum walltime: 7 days
Priority: normal
Available resources: 76 nodes (max 60 per project) with 288 ARM64 cpus, 808 GiB RAM and 4 GH200 GPUs.
Parameter for sbatch/salloc:
--partition=accel, --gpus=N, --gpus-per-node=N or similar, with N being the number of GPUs
Job Scripts: Accel
Accel jobs give access to use the Grace Hopper nodes that combine ARM64 CPUs with NVIDIA GH200 GPUs. This is useful for AI/ML training, inference, and other GPU-accelerated applications.
Can be combined with --qos=devel to get higher priority, but then the maximum walltime (2 hours)
and resource limits of devel apply.
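A sketch of an accel job script requesting GPUs (the account, resource numbers and program name are placeholders, not recommendations):

```shell
#!/bin/bash
#SBATCH --account=nnXXXXk        # placeholder: your project account
#SBATCH --job-name=gpu-test      # hypothetical job name
#SBATCH --partition=accel
#SBATCH --gpus=2                 # between 1 and 32 GPUs
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16       # example CPU request
#SBATCH --mem=64G                # example memory request
#SBATCH --time=0-04:00:00        # walltime; max 7 days

set -o errexit                   # exit the script on errors

srun ./my_gpu_program            # placeholder GPU application
```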
Devel
Allocation units: cpus, memory and GPUs
Job Limits:
maximum 1152 billing units per job
maximum 16 GPUs per job
maximum 2304 billing units in use at the same time
maximum 32 GPUs in use at the same time
maximum 2 running jobs per user
Maximum walltime: 2 hours
Priority: high
Available resources: devel jobs can run on any node on Olivia
Parameter for sbatch/salloc:
--qos=devel
Job Scripts: Devel
This is meant for small, short development or test jobs. Devel jobs get higher priority so that they run as soon as possible. In return, there are limits on the size and number of devel jobs.
Can be combined with --partition=small, --partition=large or
--partition=accel to increase priority, while the maximum walltime
and job limits of devel apply.
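For example, a short GPU test on the Grace Hopper nodes could be sketched as below, combining --partition=accel with --qos=devel (account and program name are placeholders):

```shell
#!/bin/bash
#SBATCH --account=nnXXXXk        # placeholder: your project account
#SBATCH --job-name=devel-test    # hypothetical job name
#SBATCH --partition=accel        # or small / large
#SBATCH --qos=devel              # high priority; max 2 hours
#SBATCH --gpus=1
#SBATCH --time=0-00:30:00

set -o errexit                   # exit the script on errors

srun ./my_test_program           # placeholder test application
```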
If you have temporary development needs that cannot be fulfilled by the devel job type, please contact us at support@nris.no.