Scheduling systems. SLURM example.

Lauri Anton

11.06.2009 HPC@UT

Example of a resource (old)

• Head nodes

• Compute nodes

• Ethernet

• InfiniBand

• Cooling units

• Gas fire suppression

Not in the picture:

• UPS

• Generator

Example of a resource (new)

Why scheduling system?

• Resources are limited:

  • CPU cores/cycles

  • RAM

  • Network bandwidth

  • Scratch space (/tmp)

• Many users with different needs

• Jobs need to be ordered

• Priorities

• Protecting users from themselves

• Protecting jobs from other jobs

Ordering jobs

• What jobs should be run first?

• Parallel jobs vs single-core

Historical overview of schedulers

• Pre-scheduler times

• PBS - Portable Batch System

• Torque + Maui

• Various grid middlewares

• SLURM

SLURM overview

Partitions, jobs, job steps
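How the three entities relate can be sketched as a small data model. This is a toy illustration only, not SLURM's API: a partition groups nodes and sets limits, a job is an allocation of nodes to a user for a limited time, and a job step is a set of tasks run inside a job. All class, field and method names below (`Partition`, `Job`, `JobStep`, `submit`, `add_step`) are hypothetical, and the sketch ignores whether nodes are already busy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JobStep:
    """A set of tasks launched inside an existing job (hypothetical model)."""
    step_id: int
    command: str
    tasks: int


@dataclass
class Job:
    """An allocation of nodes to one user for a limited time."""
    job_id: int
    user: str
    nodes: List[str]
    time_limit_minutes: int
    steps: List[JobStep] = field(default_factory=list)

    def add_step(self, command: str, tasks: int) -> JobStep:
        # Steps are numbered in the order they are created within the job.
        step = JobStep(step_id=len(self.steps), command=command, tasks=tasks)
        self.steps.append(step)
        return step


@dataclass
class Partition:
    """A logical group of nodes with its own limits."""
    name: str
    nodes: List[str]
    max_time_minutes: int
    jobs: List[Job] = field(default_factory=list)

    def submit(self, job_id: int, user: str, node_count: int,
               time_limit_minutes: int) -> Job:
        # Reject requests that exceed the partition's limits.
        if time_limit_minutes > self.max_time_minutes:
            raise ValueError("time limit exceeds partition maximum")
        if node_count > len(self.nodes):
            raise ValueError("not enough nodes in partition")
        # Toy allocation: hand out the first node_count nodes.
        job = Job(job_id, user, self.nodes[:node_count], time_limit_minutes)
        self.jobs.append(job)
        return job
```

For example, `Partition("main", ["node1", "node2", "node3"], 60).submit(1, "alice", 2, 30)` yields a job holding two nodes, and each `add_step` call on that job adds one job step.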

Aims of the scheduling policy in HPC@UT

• Long jobs have to be limited

• Short jobs should start quickly

• Parallel jobs should start quickly

• Large users should have lower priority (fairshare)
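The aims above can be sketched as a toy priority function. This is an illustration under stated assumptions, not the policy actually configured at HPC@UT and not SLURM's own priority formula: the weights, the 96-hour walltime cap, and the names `QueuedJob`, `priority` and `order_jobs` are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QueuedJob:
    name: str
    user: str
    cores: int
    walltime_hours: float  # requested run time


def priority(job: QueuedJob, usage_share: Dict[str, float],
             max_walltime_hours: float = 96.0) -> float:
    """Toy priority score; higher runs first. All weights are assumptions."""
    # Long jobs have to be limited: reject anything above the cap.
    if job.walltime_hours > max_walltime_hours:
        raise ValueError("requested walltime exceeds the limit")
    # Short jobs should start quickly: bonus shrinks with requested time.
    short_bonus = 1.0 - job.walltime_hours / max_walltime_hours
    # Parallel jobs should start quickly: flat bonus for multi-core jobs.
    parallel_bonus = 1.0 if job.cores > 1 else 0.0
    # Fairshare: users with a large share of recent usage score lower.
    fairshare = 1.0 - usage_share.get(job.user, 0.0)
    return 2.0 * short_bonus + 1.0 * parallel_bonus + 3.0 * fairshare


def order_jobs(jobs: List[QueuedJob],
               usage_share: Dict[str, float]) -> List[QueuedJob]:
    """Return the jobs sorted from highest to lowest priority."""
    return sorted(jobs, key=lambda j: priority(j, usage_share), reverse=True)


queue = [
    QueuedJob("a", user="heavy", cores=1, walltime_hours=48),
    QueuedJob("b", user="light", cores=16, walltime_hours=2),
    QueuedJob("c", user="light", cores=1, walltime_hours=2),
]
# Fraction of recent cluster usage per user (made-up numbers).
shares = {"heavy": 0.8, "light": 0.05}
ordered = [j.name for j in order_jobs(queue, shares)]
```

With these made-up numbers the short parallel job of the light user is ordered first and the long single-core job of the heavy user last. A real scheduler also has to account for which resources are free at the moment, which this sketch leaves out.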

Home reading

• https://computing.llnl.gov/linux/slurm/quickstart.html
