Projects and accounting
All jobs run in a project. Use
--account in job scripts to
select which project the job should run in. (The queue system calls
projects accounts.) Each project has a CPU hour quota, and when a
job runs, CPU hours are subtracted from the project quota. If there
are not enough hours left on the quota, the job will be left pending
with a reason indicating that the quota has been exceeded. (This does
not currently apply on Fram, but this will soon change.)
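For example, a minimal job script selecting a project could look like the sketch below. This is a configuration fragment, not a complete recipe: the project name nn1234k, the job name, the resource values and the program name are all placeholders, not values taken from this documentation.

```shell
#!/bin/bash
#SBATCH --account=nn1234k   # placeholder project name; use one of your projects
#SBATCH --job-name=myjob    # example job name
#SBATCH --time=01:00:00     # example wall time limit
#SBATCH --ntasks=1          # example task count

# Placeholder workload; a real script would start the actual program here
./myprogram
```

The --account line is the part that selects the project; the other directives are ordinary resource requests.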
To see which projects you have access to on a cluster, run the
projects command.
List available quota
The cost command gives an overview of the CPU hour quota. It can be
run in different ways:
```shell
# Show quota information for all projects you have access to
$ cost

# Show quota information for project YourProject
$ cost -p YourProject

# Add information about how much each user has run
$ cost --details
```
Run cost --man for other options and an explanation of the output.
The cost command only shows usage in the current allocation
period. Historical usage can be found here.
Do not ask for a lot more memory than you need
A job that asks for many CPUs but little memory will be billed for its number of CPUs, while a job that asks for a lot of memory but few CPUs will be billed for its memory requirement.
If you ask for much more memory than you need, your job will bill your project much more than you might expect, and it may also queue much longer than it would with a smaller memory request.
The term "CPU hour" above is an over-simplification. Jobs are accounted for their CPU, memory and GPU usage. (Currently, jobs on Fram are only accounted for their CPU usage, but this will change soon.)
Accounting is done in terms of billing units, and the quota is in billing unit hours. Each job is assigned a number of billing units based on the requested CPUs, memory and GPUs. The number that is subtracted from the quota is the number of billing units multiplied with the (actual) wall time of the job.
The number of billing units for a job is calculated as follows:
- Each requested CPU is given a cost of 1.
- The requested memory is given a cost based on a memory cost factor (see below).
- Each requested GPU is given a cost based on a GPU cost factor (see below).
- The number of billing units is the maximum of the CPU cost, memory cost and GPU cost.
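As a sketch, the rule above can be written out like this. The job sizes and the memory cost factor (0.25 units per GiB) are made-up example values, not the factors of any real partition; only the GPU factor of 6 is taken from the text below.

```shell
# Sketch of the billing-unit calculation; job numbers and the memory
# cost factor are made-up examples.
cpus=8        # requested CPUs: cost 1 each
mem_gib=60    # requested memory in GiB
gpus=0        # requested GPUs
gpu_factor=6  # Saga's current cost per GPU (from the text)

# Billing units = max(CPU cost, memory cost, GPU cost)
billing=$(awk -v c="$cpus" -v m="$mem_gib" -v g="$gpus" -v gf="$gpu_factor" \
  'BEGIN {
     cpu_cost = c
     mem_cost = m * 0.25          # assumed memory cost factor, units per GiB
     gpu_cost = g * gf
     max = cpu_cost
     if (mem_cost > max) max = mem_cost
     if (gpu_cost > max) max = gpu_cost
     printf "%g", max
   }')
echo "billing units: $billing"    # memory dominates here: 60 * 0.25 = 15
```

The quota is then charged this number of billing units multiplied by the job's actual wall time in hours.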
The memory cost factor varies between the partitions on the clusters:
For the bigmem partition on Saga, the factor is currently 0.1059915 units per GiB. This means that for a job requesting all memory on one of the "small" bigmem nodes, the memory cost is 40, while for a job requesting all memory on one of the large nodes, it is 320.
For all other partitions, the factor is set such that the memory cost of requesting all the memory on a node is the same as requesting all CPUs on the node (40 for normal nodes and 24 for the GPU nodes).
The GPU cost factor on Saga is currently 6, which means that a job requesting all 4 GPUs on a node will cost the same as a job requesting all CPUs on it. Note that this factor might increase in the future, because GPUs are more expensive than CPUs.
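A quick arithmetic check of the numbers above. The node memory sizes are not stated in the text; they are inferred here by dividing the stated memory costs by the bigmem cost factor.

```shell
# Back-of-the-envelope check of the stated cost factors.
# Node memory sizes are inferred from the costs, not given in the text.
small=$(awk 'BEGIN { printf "%.0f", 40  / 0.1059915 }')   # small bigmem node
large=$(awk 'BEGIN { printf "%.0f", 320 / 0.1059915 }')   # large bigmem node
gpu_cost=$((4 * 6))                                       # 4 GPUs at factor 6

echo "memory giving cost 40:  ~${small} GiB"
echo "memory giving cost 320: ~${large} GiB"
echo "cost of all 4 GPUs on a GPU node: ${gpu_cost} units (= 24 CPUs)"
```

The last line confirms the claim in the text: 4 GPUs at a factor of 6 cost 24 units, the same as the 24 CPUs on a GPU node.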