Grid scheduling
Effective scheduling in the Grid environment is a complex problem which is not fully and efficiently solved in nowadays production systems. Finding of optimal solution is an NP complete problem which is practically intractable for larger sets of jobs.
Therefore, a good approach is to search for suboptimal solutions, where sufficiently efficient algorithms exist. However, production schedulers usually use very simple algorithms based on (priority) queues. Due to the limited and often imprecise information used by these algorithms their efficiency is often very low especially when sufficient QoS is required (deadlines, co-allocation of resources, advanced reservation). Therefore, in our research we apply advanced scheduling techniques based on schedule optimization.
![]() |
| Fig. 1 Queue vs. schedule-based scheduling |
The schedule is a data structure that, unlike to the queue, specifies when and on which machines (CPUs) the jobs will be executed. This information is known in advance in contrast to the queue where such information is only available at the moment when the job is selected and submitted onto the machine (See Fig. 1). Using the schedule we are capable to analyze the Grid’s behavior in advance. Ideally, we are able to manage advanced reservation or machine co-allocation easily. We can also realize which jobs will not meet their deadlines, etc. Such schedule can be further optimized to improve the schedule quality.
On the other hand, schedule construction requires reasonably precise information concerning job and machine parameters. Moreover, schedule-based techniques are usually applied for static problems, i.e., they do not expect any changes during the schedule construction process such as new job arrivals, machine startups/failures, etc. This is not the case for the real Grid environment: usually the precise knowledge concerning the job runtime is not available, jobs and machines may fail down or appear suddenly. In Grid — due to the Grid dynamic behavior — such changes must be reflected by the scheduling algorithm. We have to use job runtime estimations (precise runtime is not realistic assumption), which can be specified e.g., by a user or they can be somehow approximated by e.g., statistical analysis of already computed jobs. Another possibility is to assign jobs into standard classes such as “long”, “normal” or “short”.
In case of machine restart/failure we have to re-schedule suitable jobs to balance the load or remove jobs from failed machine’s schedule and place them into remaining machines’ schedules respectively. Moreover we have to be reasonably fast.
Applied solution techniques
![]() |
| Fig. 2 Tabu search algorithm |
In our research we focus on the design and implementation of various scheduling techniques, namely those based on local search and scheduling policies. Policies are used to build initial schedule while local search, e.g., well known Tabu search algorithm (See Fig. 2) is used for further schedule optimization according to applied optimization criteria.
These heuristics, originally designed for static problems, were modified to efficiently reflect the dynamic Grid behavior. Following list represents some of the proposed approaches:
- incremental approach — it is used to minimize the number of changes in the current schedule when some new event such as job arrival or machine restart/failure appears. The new solution is based on the previously computed schedule, which we reuse and locally modify to come up with the new, up-to-date schedule. This incremental approach allows us to come up with new solution in a significantly shorter time comparing to the total schedule re-computation.
- anytime approach — it is used to guarantee that the optimizing procedure can be stopped at any time during its execution. This is very important to be able to flexibly react on incoming events (e.g., new job arrival). Both the policy and Tabu search are designed so that the constructed schedule is always executable (after each iteration). Therefore, if prompt decision is required
these algorithms can be stopped at any time and the resulting schedule can be used immediately. - local changes — imprecision of the current schedule, caused by the dynamic Grid behavior, is solved by local changes. This helps us to minimize the number of necessary computational steps and also helps us to keep the existing properties of the schedule unchanged as much as possible (existing advanced reservations, co-allocations, etc. should remain unchanged).
- optimization — incremental and anytime approach allows us to periodically optimize the schedule through the local search-based algorithms. It improves the quality (QoS, machine usage, etc.) of the resulting schedule significantly comparing to the situation when only the simple scheduling policy is applied.
Grid scheduling simulator “Alea”
To test the proposed algorithms we developed a Grid scheduling simulator allowing controllable and repeatable testing of existing and newly proposed algorithms (both queue and schedule-based are supported). The solution is able to deal with the common problems of the job scheduling in Grids like heterogeneity of jobs and resources, and dynamic runtime changes such as arrivals of new jobs. The Alea is based on the GridSim simulation toolkit written in Java which we extended to provide a simulation environment that supports simulations of varying Grid scheduling problems such as parallel job support or new extensible centralized scheduler entity. Through the tests we implemented various optimization criteria such as makespan, tardiness, deadlines, machine usage, etc. We also implemented various queue-based algorithms such as FCFS, Earliest Deadline First, EASY Backfilling, Flexible Backfilling and schedule-based algorithms such as those used to build the initial schedule (Earliest Gap-Earliest Deadline First (EG-EDF)) and those designed to optimize it (Tabu search, Hill climbing, Simulated annealing).
More details as well as sources and latest distribution of Alea can be found here.
Scheduling with virtual machines
Virtualization of Grid resources is a new popular and successful direction which aims to solve many current problems related not only to Grid scheduling. Today, various users’ requirements can be more or less fulfilled by introducing different queues, job priorities, advanced reservations and so on. However, without the ability to preempt running jobs a scheduling system can hardly guarantee immediate execution of interactive or high priority jobs. While current scheduling systems may provide support for job preemption, it is limited to the provided support of underlying operating systems. Nowadays job preemption is usually achieved by completely suspending the job or reducing its priority. Even in such situation still the high priority job may be slowed down by a preempted job e.g., by swapping. Condor support job preemption due to recompilation of applications, which substitutes all I/O system calls with variants that allows to performing the preemption safely. Complete suspend might also have serious impacts on parallel jobs or processes communicating over the network. With virtual machine technology, which provides strong isolation of processes running within different virtual machines, better preemption without unpredictable performance losses can be achieved.
Magrathea is a scheduling and resource management system which can be used to support scheduling on virtual machines. Here the preemption is allowed by running two separate virtual machines with the same environment within each node. One of them is used for normal jobs and the other one for high-priority jobs which are allowed to de facto preempt normal jobs by minimizing their resources to some given low amount, so that almost all resources are then available for the high priority job. Still, the low priority job is kept running (with minimal resources) to prevent its re-submission. The virtual machine used for normal jobs can be used for backfilling when there is no high-priority job running on the
other machine. Moreover, virtualization allows us, if necessary, to migrate the whole virtual machine including the job onto different node which allows us to introduce job migration that would be
otherwise problematic.
Scheduling systems
In current production environment of the METACenter, which is the national Grid infrastructure, the PBSpro scheduling system is used. Our goal is to study and extend other scheduling systems, to allow their future application in the METACenter environment. Currently, we study the Simple Linux Utility for Resource Management (SLURM), which we plan to extend
with our own scheduling algorithms.
Links
- METACenter — national Grid infrastructure.
- Sitola — Laboratory of Advanced Network Technologies.
- Super Computing Center Brno — department providing support for scientific computing and enabling the development of computing and storage facilities of the Masaryk University.
- Hana Rudová - publications — the list of interesting papers concerning the timetabling and scheduling.
- Dalibor Klusáček - publications — the list of papers focusing on the Grid scheduling.
- Miroslav Ruda - publications — the list of papers focusing on the scheduling and virtualization.
Contact:
Hana Rudová, Dalibor Klusáček
{hanka,xklusac}(at)fi.muni.cz






