Fault-tolerant parallel scheduling of arbitrary length jobs on a shared channel



Kowalski, Dariusz R ORCID: 0000-0002-1316-7788, Mirek, JD and Wong, Prudence WH ORCID: 0000-0001-7935-7245
(2019) Fault-tolerant parallel scheduling of arbitrary length jobs on a shared channel. Lecture Notes in Computer Science.

1710.07380v2.pdf - Submitted Version

Abstract

We study the problem of scheduling jobs on fault-prone machines that communicate via a shared channel, also known as a multiple-access channel. There are $n$ jobs of arbitrary length to be scheduled on $m$ identical machines, $f$ of which may be crashed by an adversary. A machine can use the channel, which has no collision detection, to inform the other machines that a job has been completed. Performance is measured by the total number of available machine steps during the whole execution. Our goal is to study the impact of preemption (i.e., interrupting the execution of a job and resuming it later on the same or a different machine) and of failures on the work performance of job processing. The novelty is the ability to identify the features that determine the complexity (difficulty) of the problem. We show that the problem becomes difficult when preemption is not allowed, by proving corresponding lower and upper bounds, the latter via algorithms that attain them. We also prove that randomization helps even more, but only against a non-adaptive adversary; in the presence of a more severe adaptive adversary, randomization does not help in any setting. Our work extends previous work that focused on settings including scheduling on a multiple-access channel without machine failures, with complete information about failures, or with incomplete information about failures (as in this work) but with unit-length jobs and, hence, without considering preemption.
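
The two model ingredients named in the abstract, channel feedback without collision detection and work counted as available machine steps, can be illustrated with a minimal sketch. This is not the paper's algorithm or its exact accounting. The function names, the `crash_round` mapping, and the convention that a machine crashed in round r is unavailable from round r onward are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


def channel_feedback(transmitters: Set[int]) -> Optional[int]:
    """Return the id of the transmitting machine if exactly one transmits.

    Without collision detection, listeners cannot tell silence (no
    transmitter) from a collision (two or more), so both yield None.
    """
    if len(transmitters) == 1:
        return next(iter(transmitters))
    return None


def total_work(m: int, crash_round: Dict[int, int], rounds: int) -> int:
    """Count available machine steps over `rounds` rounds.

    Assumption: machine i contributes one step in every round t with
    t < crash_round[i]; machines absent from crash_round never crash.
    """
    work = 0
    for t in range(rounds):
        # Machines still operational in round t.
        alive = sum(1 for i in range(m) if crash_round.get(i, rounds) > t)
        work += alive
    return work


if __name__ == "__main__":
    # Four machines, five rounds; machine 1 crashes in round 2, machine 3 in round 0.
    print(total_work(4, {1: 2, 3: 0}, 5))
    print(channel_feedback({2}), channel_feedback(set()), channel_feedback({1, 2}))
```

Under these assumptions, a crashed machine stops contributing to the work, and a completion announcement is heard only in a round where exactly one machine transmits.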

Item Type: Article
Uncontrolled Keywords: cs.DC
Depositing User: Symplectic Admin
Date Deposited: 21 Nov 2017 10:15
Last Modified: 29 Sep 2020 08:20
URI: http://livrepository.liverpool.ac.uk/id/eprint/3012361