Description:
The training outline follows:
- Slurm Refresher
- How Slurm actually works
- How Slurm schedules jobs
- How long jobs wait, and how to schedule them more effectively
- How Slurm determines job priorities
- Key features
- Resource Management
- Running a job; job/step allocation
- Examples – GPUs (see the GPU job sketch after this outline)
- Examples – Job Arrays (see the job array sketch after this outline)
- Advanced Features
- Topology Aware Scheduling
- Job Sanity Check
- Job profiling
- Multithreading (SMT)
- Heterogeneous jobs
- Job Dependencies (see the dependency sketch after this outline)
- Chain Jobs
- Staging input before running, and storing outputs
- Master/Slave programs
- Submitting collections of programs (multi-prog; see the sketch after this outline)
- System information and job monitoring
- Checkpointing & Restart
- Use of SLURM API (plans to support this in the future on Pawsey systems)
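To give a flavour of the GPU examples topic, here is a minimal sketch of a GPU batch script; the partition name, GPU count, and executable are placeholders rather than Pawsey-specific settings:

  #!/bin/bash --login
  #SBATCH --job-name=gpu-example
  #SBATCH --partition=gpuq        # placeholder partition name
  #SBATCH --nodes=1
  #SBATCH --gres=gpu:1            # request a single GPU
  #SBATCH --time=00:10:00
  srun ./my_gpu_program           # placeholder executable

Submit it with: sbatch gpu-example.sh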
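A minimal job array sketch; the index range, input file naming, and program are illustrative only:

  #!/bin/bash --login
  #SBATCH --job-name=array-example
  #SBATCH --array=1-10            # ten array elements, indices 1..10
  #SBATCH --ntasks=1
  #SBATCH --time=00:05:00
  # SLURM_ARRAY_TASK_ID identifies which element of the array this task is
  srun ./process input_${SLURM_ARRAY_TASK_ID}.dat   # placeholder program and files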
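A minimal sketch of job dependencies and chain jobs; stage1.sh and stage2.sh are hypothetical batch scripts:

  # Submit the first job and capture its job ID (--parsable prints just the ID)
  jobid=$(sbatch --parsable stage1.sh)
  # Start the second job only after the first completes successfully
  sbatch --dependency=afterok:${jobid} stage2.sh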
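A minimal sketch of submitting a collection of programs with srun --multi-prog; the configuration file and program names are placeholders:

  # Contents of multi.conf (maps task ranks to programs; %t expands to the task rank):
  #   0     ./controller
  #   1-3   ./worker %t
  # Launch four tasks, each running the program listed for its rank:
  srun --ntasks=4 --multi-prog multi.conf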
Start: Tuesday, 13 July 2021 @ 09:00
End: Friday, 16 July 2021 @ 12:00
Duration: 12:00
Timezone: Perth
Prerequisites:
This training is targeted at users who have already used SLURM but whose needs go beyond simple batch files or small interactive jobs.
Organiser: Pawsey Supercomputing Research Centre
Contact: training@pawsey.org.au
Host institution: Pawsey Supercomputing Centre
Keywords: slurm, scheduler, supercomputer
Capacity: 16
Event type: Workshop
