Many sites run workloads of varying importance. Some jobs must obtain resources immediately, while others are less sensitive to turnaround time but have an insatiable hunger for compute cycles, consuming every available cycle for years on end. These latter jobs often have turnaround times on the order of weeks or months. Cycle stealing, popularized by systems such as Condor, handles such situations well: it lets a system run low-priority, preemptible jobs whenever nothing more pressing is running. Such systems are often deployed on compute farms of desktops, where jobs must vacate whenever interactive use is detected.
Action | Flag | Details |
Cancel | -c | terminate and remove job from queue |
Checkpoint | -C | terminate and checkpoint job leaving job in queue |
Requeue | -R | terminate job leaving job in queue |
Resume | -r | resume suspended job |
Start (execute) | -x | start idle job |
Suspend | -s | suspend active job |
In general, users are allowed to suspend or terminate jobs they own. Administrators are allowed to suspend, terminate, resume, and execute any queued jobs.
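The ownership rule above can be sketched as a small permission check. This is an illustrative model only, not Moab code: the function name `may_perform` and the `is_admin` parameter are hypothetical, and it assumes that "terminate" covers the cancel, checkpoint, and requeue actions, since the table describes each of them as terminating the job.

```python
# Illustrative model of the permission rule; not part of Moab.
# Assumption: "terminate" covers the cancel, checkpoint, and requeue actions,
# since the table describes each of them as terminating the job.
OWNER_ACTIONS = {"suspend", "cancel", "checkpoint", "requeue"}
ADMIN_ACTIONS = OWNER_ACTIONS | {"resume", "start"}


def may_perform(action, user, job_owner, is_admin=False):
    """Return True if `user` may apply `action` to a job owned by `job_owner`."""
    if action not in ADMIN_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if is_admin:
        return True  # administrators may act on any queued job
    # Ordinary users may only suspend or terminate their own jobs.
    return user == job_owner and action in OWNER_ACTIONS
```

Under this model, a job owner can suspend or requeue their own job but cannot resume it, while an administrator can apply any of the six actions.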
It is important to note the rules of QoS-based preemption. Preemption occurs only when the following three conditions are satisfied:
Use of the preemption system need not be limited to controlling low-priority jobs. Other uses include optimistic scheduling and development job support.
Example:
----
PREEMPTPOLICY REQUEUE
QOSCFG[high] QFLAGS=PREEMPTOR
QOSCFG[med]
QOSCFG[low] QFLAGS=PREEMPTEE
----
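To make the effect of the flags in this example concrete, the following Python sketch enumerates which QoS may preempt which. It is a simplified model rather than Moab's implementation: the name `can_preempt` is hypothetical, and it assumes that only the PREEMPTOR and PREEMPTEE flags matter, ignoring any further conditions the scheduler applies.

```python
# Simplified model of the QFLAGS in the example; not Moab code.
# Assumption: only the PREEMPTOR/PREEMPTEE flags are checked; any further
# conditions the scheduler applies are ignored.
QOS_FLAGS = {
    "high": {"PREEMPTOR"},
    "med": set(),          # no preemption flags set
    "low": {"PREEMPTEE"},
}


def can_preempt(preemptor_qos, preemptee_qos, flags=QOS_FLAGS):
    """True if the first QoS is a preemptor and the second a preemptee."""
    return ("PREEMPTOR" in flags[preemptor_qos]
            and "PREEMPTEE" in flags[preemptee_qos])


pairs = [(a, b) for a in QOS_FLAGS for b in QOS_FLAGS if can_preempt(a, b)]
print(pairs)  # [('high', 'low')]
```

In this model, jobs in the med QoS neither preempt nor are preempted, and with PREEMPTPOLICY REQUEUE a preempted low job would be requeued rather than suspended or canceled.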
The Moab Cluster Manager™ graphical interface presents numerous configuration choices. For example, the PREEMPTOR and PREEMPTEE attributes can be set when a QoS is created.
Table 8.4.2.4 Resource Manager Preemption Constraints
Resource Manager | OpenPBS (2.3) | PBSPro (5.2) | Loadleveler (3.1) | LSF (5.2) | SGE (5.3) |
Cancel | yes | yes | yes | yes | ??? |
Requeue | yes | yes | yes | yes | ??? |
Suspend | yes | yes | yes | yes | ??? |
Checkpoint | (yes on IRIX) | (yes on IRIX) | yes | (OS dependent) | ??? |
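When choosing a preemption policy for a given resource manager, the table can be treated as a lookup. The sketch below transcribes it into Python; the helper names `support` and `unconditional_actions` are hypothetical, and cells shown as "???" are kept as `None` rather than guessed.

```python
# Values transcribed from Table 8.4.2.4; None marks the cells shown as "???".
RESOURCE_MANAGERS = ["OpenPBS", "PBSPro", "Loadleveler", "LSF", "SGE"]
TABLE = {
    "Cancel": ["yes", "yes", "yes", "yes", None],
    "Requeue": ["yes", "yes", "yes", "yes", None],
    "Suspend": ["yes", "yes", "yes", "yes", None],
    "Checkpoint": ["yes on IRIX", "yes on IRIX", "yes", "OS dependent", None],
}


def support(action, resource_manager):
    """Return the table entry, or None where the table shows '???'."""
    return TABLE[action][RESOURCE_MANAGERS.index(resource_manager)]


def unconditional_actions(resource_manager):
    """Actions the table lists as a plain 'yes' for this resource manager."""
    return [a for a in TABLE if support(a, resource_manager) == "yes"]
```

For example, `unconditional_actions("OpenPBS")` returns the cancel, requeue, and suspend actions, while checkpointing is reported separately because the table qualifies it as IRIX-only.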
See Also:
QOS Overview
Managing QOS Access