Abstract

In our work, we address the problem of workload distribution within a computational grid. In this scenario, users submit jobs to local high-performance computing (HPC) systems, which are in turn interconnected so that jobs can be exchanged with other sites. Providers can avoid executing jobs locally by offering them to other HPC sites. In our implementation, this distribution decision is made by a fuzzy system controller whose parameters can be adjusted to establish different exchange behaviors.

In such a system, HPC sites can benefit only if the workload is portioned equitably (not necessarily equally) among all participants. However, each site egoistically strives to minimize only its own jobs' response times, regularly at the expense of other sites.

This scenario is particularly well suited to a competitive coevolutionary algorithm: the fuzzy systems of the participating HPC sites are modeled as species that evolve in separate populations while competing within a commonly shared ecosystem.

Using real workload traces and grid setups, we show that opportunistic cooperation leads to significant improvements for each HPC site as well as for the overall system.
