dc.description.abstract |
This thesis addresses the job scheduling problem for heterogeneous supercomputers that employ accelerators such as GPGPUs or co-processors. On homogeneous supercomputers, the problem of scheduling user jobs to the available resources is NP-hard. Heterogeneous systems make the scheduling problem combinatorially more difficult. In this thesis, we aim to (i) design a new class of scheduling algorithms for state-of-the-art heterogeneous supercomputers, (ii) implement these scheduling algorithms as ready-to-use open-source plugin software, and (iii) demonstrate the effectiveness of these algorithms by emulating real-life usage. We propose four different models to solve the scheduling problem on heterogeneous supercomputers. In the first model, we formulate a simple co-allocation problem that does not take topology into consideration. In the second model, we formulate the problem as an auction problem and automatically generate multiple bids for each job by assuming a one-dimensional system topology. In the third model, we support moldable jobs that may request a range of resources. In the fourth model, we also consider topology-aware scheduling for hierarchical fat-tree interconnection architectures. All of these models are formulated as integer programming problems and are solved periodically at each scheduling step. We use existing workloads to test the performance of our scheduling algorithms and also develop our own workload generator that generates realistic workloads for heterogeneous systems. The tests show that our algorithms perform better than the traditional backfilling algorithm in terms of system utilization, average job waiting time and/or job fragmentation. |
|