PyTorch on Olivia 

This guide family shows how to run PyTorch on Olivia in three ways:

The main focus is the HPC workflow: start on one GPU, then scale to multiple GPUs on one node, and then to multiple nodes.
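The progression from one GPU to multiple nodes can be illustrated with a minimal sketch, assuming the job is started with a launcher such as `torchrun`, which exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` to each process. The helper name `distributed_context` is hypothetical and is not taken from the guides. With the defaults below, the same script also runs unchanged as a plain single-GPU job.

```python
import os


def distributed_context(env=None):
    """Read the rank variables a launcher such as torchrun exports.

    Falls back to single-process defaults when the variables are absent,
    so the same script works for a single-GPU run.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", "0"))              # global index of this process
    local_rank = int(env.get("LOCAL_RANK", "0"))  # index on this node, typically used to pick the GPU
    world_size = int(env.get("WORLD_SIZE", "1"))  # total number of processes in the job
    return {
        "rank": rank,
        "local_rank": local_rank,
        "world_size": world_size,
        "distributed": world_size > 1,
    }
```

A training script could call `distributed_context()` once at startup and only set up distributed communication when `distributed` is true.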

Guide Structure 

Use the reference pages first:

Then follow the execution guides:

Performance Summary

This three-part guide walks you through scaling PyTorch training on Olivia’s GH200 GPUs:

The multi-GPU guides use FP16 mixed precision to improve training throughput.
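FP16 has a much narrower range than FP32, which is why mixed-precision training is commonly paired with loss scaling. The sketch below uses only the Python standard library to show the effect; it does not reproduce the guides' training code, and the value of `LOSS_SCALE` is an illustrative assumption. PyTorch's automatic mixed precision tooling provides a gradient scaler for this purpose, but whether the guides use it is not stated here.

```python
import struct


def to_fp16(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


LOSS_SCALE = 65536.0  # illustrative assumption, not a setting from the guides

tiny_grad = 1e-8                          # smaller than FP16 can represent
unscaled = to_fp16(tiny_grad)             # underflows to zero in FP16
scaled = to_fp16(tiny_grad * LOSS_SCALE)  # scaled value survives the FP16 round trip
recovered = scaled / LOSS_SCALE           # unscale again in higher precision

print(unscaled, recovered)
```

Without scaling, the small gradient is lost entirely; with scaling, it is recovered to within FP16's rounding error.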

Note

Key considerations for Olivia:

The login node is x86_64, while the GPU compute nodes are AArch64 (64-bit ARM).
Software and containers must therefore be built for ARM to run on the compute nodes.
Set up projects in project or work storage, not in your home directory.
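Because the login and compute nodes differ in architecture, it can help to check where a script is actually running before importing architecture-specific packages. The following is a minimal sketch using the standard library; the helper name `is_arm` and the set of accepted machine strings are assumptions for illustration.

```python
import platform

# Machine strings commonly reported for 64-bit ARM systems.
ARM_NAMES = {"aarch64", "arm64"}


def is_arm(machine=None):
    """Return True if the given (or current) machine string denotes 64-bit ARM."""
    machine = platform.machine() if machine is None else machine
    return machine.lower() in ARM_NAMES


if not is_arm():
    print(f"Running on {platform.machine()}, not ARM: this is probably not a GPU compute node.")
```

Running the check on the login node and again inside a job on a compute node should show the two different architectures described in the note.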