10
Synthesis and stochastic assessment of schedules for lacquer production H.C. Bohnenkamp 1 H. Hermanns 2,1 R. Klaren 1 A. Mader 1 Y.S. Usenko 1 1 Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands 2 Department of Computer Science, Saarland University, D-66123 Saarbr¨ ucken, Germany Abstract The M modeling language pairs modeling features from stochastic process algebra and from timed and proba- bilistic automata with light-weight notations such as excep- tion handling. It is supported by the M tool, which fa- cilitates the execution and evaluation of M specifica- tions by means of the discrete event simulation engine of the tool. This paper describes the application of M- , M and to a highly nontrivial case. We in- vestigate the eect of faulty behavior on a hard real-time scheduling problem from the domain of lacquer produc- tion. The scheduling problem is first solved using the timed model-checker U. The resulting schedules are then em- bedded in a M failure model of the lacquer produc- tion line, and analyzed with the discrete event simulator of . This approach allows one to assess the quality of the schedules with respect to timeliness, utilization of re- sources, and sensitivity to dierent assumptions about the reliability of the production line. 1. Introduction Scheduling problems are an important area of research, where typically strict response requirements with limited resources have to be met. Their solution is usually based on techniques from classical scheduling theory, which is a well-established branch of operations research [7]. Indus- trial scheduling problems, e.g. for pharmaceutical manu- facturing lines, are highly complex, and only heuristic so- lutions are used in practice. More recently, the application of search techniques from model checking and mixed integer linear programming have been proposed as a complementary approach to schedule This work was supported by the NWO-DFG bilateral project Vali- dation of Stochastic Systems (VOSS), the STW/PROGRESS project TES.4999 (HaaST), the EU IST project AMETIST, and the NWO Vernieuwingsimpuls project 016.023.010 (VoPaD). synthesis. The potential advantage of this approach lies in its simplicity and genericity. Where classical scheduling so- lutions often rely on restricting assumptions that may not be fulfilled in practice, state space exploration techniques are generally applicable, and very robust under variation in the problem parameters. In the model checking approach scheduling feasibility is expressed as temporal or real-time property (“The production can be finished by friday after- noon.”), and a model checker is used to check whether this property holds for a reachable state in a model of the over- all system behavior. The European AMETIST [3] project is a joint initia- tive from leading research groups on timed systems, and four industrial partners. One of the main strands of activi- ties of AMETIST is to advance the power of schedulabil- ity analysis by tackling challenging industrial case studies provided by the industrial partners. One particularly chal- lenging case - focusing on scheduling in the domain of lac- quer production - originates from AXXOM AG, a German software company specialized in supply chain optimization. The scheduling synthesis for lacquer production has been successfully tackled using the real-time model-checker U- . However, this analysis had to ignore two parameters of the original problem specification as provided by AXXOM. These two parameters types relate to quantitative stochas- tic influences due to failures, repairs, cleaning periods and other unforeseeable (and thus unplannable) events. This paper reports on our activities to incorporate these types of parameters into an a posteriori analysis of the schedules generated with U [16]. To do so, we re- fine the timed automata model of the production units into a stochastic timed automata model in order to faithfully rep- resent the stochastic perturbations, as described by the so- far not considered specification parameters. The assessment of the schedules is then done on the basis of the estimates obtained by discrete-event simulation. Technically, we use the process algebra-based formal- ism M to specify the production units, as well as the schedules. The M models are evaluated by means of Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

[IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

  • Upload
    ys

  • View
    212

  • Download
    0

Embed Size (px)

Citation preview

Page 1: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

Synthesis and stochastic assessment of schedules for lacquer production∗

H.C. Bohnenkamp1 H. Hermanns2,1 R. Klaren1 A. Mader1 Y.S. Usenko1

1Faculty of Electrical Engineering, Mathematics and Computer Science,University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands

2Department of Computer Science, Saarland University, D-66123 Saarbrucken, Germany

Abstract

The M modeling language pairs modeling featuresfrom stochastic process algebra and from timed and proba-bilistic automata with light-weight notations such as excep-tion handling. It is supported by the M tool, which fa-cilitates the execution and evaluation of M specifica-tions by means of the discrete event simulation engine of theM tool. This paper describes the application of M-, M and M to a highly nontrivial case. We in-vestigate the effect of faulty behavior on a hard real-timescheduling problem from the domain of lacquer produc-tion. The scheduling problem is first solved using the timedmodel-checker U. The resulting schedules are then em-bedded in a M failure model of the lacquer produc-tion line, and analyzed with the discrete event simulator ofM. This approach allows one to assess the quality ofthe schedules with respect to timeliness, utilization of re-sources, and sensitivity to different assumptions about thereliability of the production line.

1. IntroductionScheduling problems are an important area of research,where typically strict response requirements with limitedresources have to be met. Their solution is usually basedon techniques from classical scheduling theory, which is awell-established branch of operations research [7]. Indus-trial scheduling problems, e.g. for pharmaceutical manu-facturing lines, are highly complex, and only heuristic so-lutions are used in practice.

More recently, the application of search techniques frommodel checking and mixed integer linear programming havebeen proposed as a complementary approach to schedule

∗ This work was supported by the NWO-DFG bilateral project Vali-dation of Stochastic Systems (VOSS), the STW/PROGRESS projectTES.4999 (HaaST), the EU IST project AMETIST, and the NWOVernieuwingsimpuls project 016.023.010 (VoPaD).

synthesis. The potential advantage of this approach lies inits simplicity and genericity. Where classical scheduling so-lutions often rely on restricting assumptions that may notbe fulfilled in practice, state space exploration techniquesare generally applicable, and very robust under variation inthe problem parameters. In the model checking approachscheduling feasibility is expressed as temporal or real-timeproperty (“The production can be finished by friday after-noon.”), and a model checker is used to check whether thisproperty holds for a reachable state in a model of the over-all system behavior.

The European AMETIST [3] project is a joint initia-tive from leading research groups on timed systems, andfour industrial partners. One of the main strands of activi-ties of AMETIST is to advance the power of schedulabil-ity analysis by tackling challenging industrial case studiesprovided by the industrial partners. One particularly chal-lenging case - focusing on scheduling in the domain of lac-quer production - originates from AXXOM AG, a Germansoftware company specialized in supply chain optimization.The scheduling synthesis for lacquer production has beensuccessfully tackled using the real-time model-checker U-. However, this analysis had to ignore two parameters ofthe original problem specification as provided by AXXOM.These two parameters types relate to quantitative stochas-tic influences due to failures, repairs, cleaning periods andother unforeseeable (and thus unplannable) events.

This paper reports on our activities to incorporate thesetypes of parameters into an a posteriori analysis of theschedules generated with U [16]. To do so, we re-fine the timed automata model of the production units into astochastic timed automata model in order to faithfully rep-resent the stochastic perturbations, as described by the so-far not considered specification parameters. The assessmentof the schedules is then done on the basis of the estimatesobtained by discrete-event simulation.

Technically, we use the process algebra-based formal-ism M to specify the production units, as well as theschedules. The M models are evaluated by means of

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 2: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

M, a very flexible multi-formalism/multi-solution per-formance evaluation environment, which offers a powerfulsimulator and statistical engines to evaluate different typesof stochastic systems. The M tool M is used toconnect M to M.

In summary, this paper offers the following tangiblecontributions. (i) It shows how schedules for this type ofschedulability problem can be synthesized using U.(ii) It shows the application of M to a highly non-trivial example. (iii) It shows that the integration of M- and M, as advertised in [5], is a sustainable linkto carry out a combined qualitative and quantitative analy-sis. (iv) It provides a methodology to fruitfully complement(rather en vogue) timed automata-based analysis with (fairlywell-established) stochastic simulation analysis. It is out ofour imagination that any other related technique can be ap-plied as fruitful as M in this context.

Organization of the paper. Section 2 discusses the method-ology and tools used to carry out this case study. Section 3introduces the scheduling problem and how it is modeled inM. Results of our analysis are presented in Section 4.Section 5 concludes the paper.

2. Methodology2.1. Scheduling synthesis based on timed au-

tomata model checking.

The synthesis of schedules can be seen as a special case ofcontrol synthesis [13]. It was first introduced by Fehnker[9], and Abdeddaım and Maler [1]. In general, a modelclass suitable for scheduler synthesis must provide the pos-sibility to represent timing information in addition to theusual state/action/synchronisation mechanism. The under-lying framework used in this paper is the one of timed au-tomata as introduced by Alur and Dill [2] and representedin the model checker U. In addition to traditional au-tomata there are clocks, whose valuations progress withtime. Clocks can be reset and be used as guards for transi-tions and in state invariants. In general, the semantic modelof a timed automaton is an infinite state space. The regionautomaton construction [2], however, shows that this infi-nite state space can be mapped to an automaton with a fi-nite number of equivalence classes (regions) and the finite-state model checking approach can be applied to the so con-structed finite region automaton.

Scheduling synthesis based on timed automata makesuse of the search strategies of model checking. First, amodel of the overall, uncontrolled system behavior has tobe constructed In our case here the model consists of allpossible production steps (of all orders) possible at everymoment. Feasibility is formulated as a real-time property(“The production is finished by friday evening”). The modelchecker searches the reachable state space for a state where

this property holds. If it has found one, it provides a diag-nostic trace. The diagnostic trace contains a sequence of ac-tions and delays from the initial state to the state found. Thestart of a processing step is encoded as an action and can befound in the diagnostic trace together with the timing infor-mation. This is enough to extract a feasible schedule from adiagnostic trace.

The advantage of this approach is its robustness againstchanges in the setting of parameters, due to the fact thattimed automata provides a very general model class. Thedisadvantage lies in the well-known state-space explosionproblem. For interesting cases the model checking approachas described above does not terminate. The way out is to addheuristics, or features of schedules, that reduce the searchspace to a size that can be traversed more easily. In Sec-tion 3.1 we discuss the heuristics used for the case studypresented here.

2.2. M

We introduce the language features of M by a smallexample; we refer to [8] for an exhaustive language intro-duction. Figure 1 depicts a fragment of a M specifica-tion, describing a cashier in a discount market environment.All numbers refer to this figure.

The M language allows one to specify processes(1), and to compose them in parallel using a ’par ’ opera-tor (2). Processes can manipulate data variables by assign-ments (3). Data variables are typed and must be declared,and the point of declaration (4) determines their scope. Inparticular, they may be local to a process (not shown in thisexample), or global, in which case they are shared betweenall processes. Standard data types in M are bool, intand float. A particular type of variable which can be de-clared is the clock type (4). Clocks can be read like an ordi-nary float variable, but advance their value linearly to sys-tem time. All clocks run at the same speed. Clocks can onlybe set to zero. The language provides generic constructs tosample values from a set of predefined probability distribu-tions (5). For instance, ’xd = U[2.0]’ assigns a sample fromthe uniform distribution on the interval [10, 20] to the vari-able ’xd’. Other types of distributions are, e.g., Exponen-tial(rate) and Normal(mean,var).

Apart from manipulating data, processes can interactwith other parallel processes (or the environment) by meansof actions (6). Their occurrence within a process can beguarded by a ’when(.)’ clause (7), specifying a Boolean en-abledness condition. In particular, the boolean expression ina ’when(.)’ clause may refer to clock values. In that case, anaction may be enabled as soon as the when() condition be-comes true (and no other action becomes enabled earlier).Processes in the body of a ’par ’ (2) construct perform ac-tions and assignments independently from each other, ex-

2

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 3: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

exception no price;clock x, y;floatxd;action get prod , cash, set price;

par {:: Arrivals() ;:: Queue(N) ;:: Cashier()

}. . .

process Cashing() {{= xd = U [10, 20] , x = 0 =};when(x ≥ xd)

cash}. . .

process Cashier() {do {

:: try {get prod palt {

:49: Cashing(): 1: throw(no price)

}}catch no price {

y = 0 ;when(y ≥ 120)

alt{:: set price;:: free;

};Cashing()

}} }. . .

(2) Parallel Composition

(3) Assignments

(4) Declarations

(5) Sampling

(14) Loop

(13) Prob. Choice

(12) Nondet. Choice

(11) Process Call(6) Actions

(7) Guards

(9) Try Block

(8) Exception

(1) Process Definitions

(10) Exception Handler

Figure 1. A M specification

cept that common (non-local) actions need to be executedsynchronously, a la CSP [11].

M provides means to raise (8) exceptions in side atry block (9) and handle them (10). Exceptions must be de-clared (4). When an exception is raised (8), process con-trol is handed over to the exception handler (10). Another,standard way of handing over process control is by a sim-ple process call (11). Upon termination of the called pro-cess, the calling process gains back control, like in an ordi-nary procedure call.

The ’alt’ construct (12) is used to specify choice be-tween different possible behaviors. In general, it is allowedto make this choice nondeterministically. In this case thechoice is between the execution of action set price andaction free. A variant thereof is the ’palt ’ construct (13),which provides a weighted probabilistic choice, where foreach weight has the form :w:, with w a positive real number.A ’palt ’ must always be preceded by an action. The ’do’keyword (14) indicates a repetitive behavior. Upon termi-nation of the body of this construct, the body is restarted,until a ’break’ is encountered (not used in the example).

The combination of when() and alt has the features of atest-and-set operation, and does therefore enable the imple-mentation of locks and semaphores via shared variables. InSection 3.2 we make extensive use of this feature.

2.3. M and M

The schedules assessed in this paper have been analyzedby means of the M tool environment M and theperformance evaluation environment M. In this sectionwe discuss these two tools.

M. M is a performance evaluation tool en-vironment developed at the University of Illinois atUrbana-Champaign, USA. M supports multiple in-put formalisms and several evaluation approaches for thesemodels. Figure 2 shows an overview over the M ar-

chitecture. Atomic models are specified in one of the avail-able input formalisms. Atomic models can be composedby means of state-variable sharing, yielding so called com-posed models. Along with an atomic or composed model,the user specifies a reward model, which defines a re-ward structure on the overall model. On top of a rewardmodel, the tool provides support to define experiment se-ries, called Studies, in which the user defines the set of inputparameters for which the composed model should be eval-uated. Each combination of input parameters defines aso-called experiment. Before analyzing the experiments, asolution method has to be selected: M offers a pow-erful discrete-event simulator, and, for Markov models,explicit state-space generators and numerical solution algo-rithms. It is possible to analyze transient and steady-statereward models.

M currently supports four input formalisms:Bucket and Balls (an input formalism for Markov Chains),SAN (Stochastic Activity Networks) [14, 15], and PEPA(a Markovian Stochastic process algebra) [10]. Re-cently, the M modeling language has been integratedinto the M framework.

M. In order to facilitate the analysis of M mod-els, we have developed the prototype tool M [6]. Theenormous expressiveness of M implies that no genericanalysis algorithm is at hand. Instead, M aims at sup-porting a variety of analysis algorithms tailored to the va-riety of analyzable submodels. The philosophy behind M- is to connect M to existing tools, rather than re-implementing existing analysis algorithms anew.

To complement the qualitative analysis of M speci-fications using Cwe started joint efforts with the Mdevelopers [5] to link to the powerful solution techniques ofM for quantitative assessment. The main objective wasto simulate M models by means of the M dis-tributed discrete-event simulator, because a stochastic sim-ulator can cope with one of the largest class of models ex-

3

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 4: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

Mobius Libraries

Out

put

Con

trol

GeneratedC++ files

Mobius GUIs

Simulator

MoDeST Spec

Motor

Compiler/Linker

Solver

Composer

Studies

Rewards

Editor

MoDeSTAtomicModel

Figure 2. M Architecture with M

pressible in M. A single semantic concept which issupported in M can not be dealt with in M: non-determinism. We address this restriction below.

M and M. The integration of M intoM is done by means of M. For this integra-tion, M has been augmented with a M-to-C++compiler. From a user-perspective, the M atomicmodel interface to design M specifications is an or-dinary text editor. Whenever a new version of the Mspecification is saved to disk the M tool is called auto-matically in order to regenerate all C++ files (cf. Figure 2).Additionally, a supporting C++ library has been writ-ten for M, which contains two components: first, avirtual machine responsible for the execution of the M- model, and second, an interface to the simulator ofM.

As with all other types of atomic models of M isit possible to define reward models and studies on top ofM models. The state variables which are accessiblefor reward specification are the global variables of the M- specification. Additionally, it is possible to declare con-stants in the M specification as extern, meaning thatthese constants are actually input parameters of the model,pre-set according to the specified study.

Due to the possibility to specify non-Markov and non-homogeneous stochastic processes, only simulation is cur-rently supported as a suitable evaluation approach for M- models within M. While it is in principle possi-ble to identify sublanguages of M corresponding toMarkov chain models, this has not been implemented inM yet.

Nondeterminism. In languages like M, specify-ing nondeterministic behavior is a vital concept and abasic requisite for compositionality. However, in stochas-tic evaluation of systems, nondeterminism can not bedealt with, since every possible state-change or time de-lay must be (stochastically) described. Since simula-tion is a form of stochastic evaluation, this would implythat a simulation model must not contain nondetermin-ism as well. However, it is in fact possible to take a

more relaxed view here. Nondeterminism in simula-tion occurs if two (or more) events are scheduled forexecution at the same time. There are two cases to dis-tinguish: (i), the order of execution matters; (ii), the orderdoes not matter. The first case occurs if the two events ma-nipulate the same data, or are in a direct choice, suchthat the execution of one event would disable the respec-tive other. If none of these dependencies exist betweenthe two events, the second case holds and the simula-tor can just fire both in arbitrary order.

In the present case study, nondeterminism is heavily usedas a convenient modeling means, and the possibility thatnondeterminism of the first kind occurs can not be excluded.We will comment on this problem in Section 3.2.

3. ModelingThis section describes the case study considered in this pa-per, and how it was modeled with the tools in M.The case study originates from AXXOM AG (Munich, Ger-many) within the framework of the IST project AMETIST[3]. The focus of this case study is on synthesis of opti-mal schedules for a lacquer production plant, where varioustypes of costs have to be optimized, e.g. for storage, ma-terial, delay, etc. In a first version of the problem we ad-dressed only schedulability, i.e. the question whether thereexist feasible schedules.

The description provided by AXXOM contains an ab-stract view on a pipeless production plant for lacquers ofdifferent kind. The plant comprises several resources (mix-ing vessels, filling lines, etc) which are used in differentways to produce different kinds of lacquer. These productsneed to be produced according to product orders which ar-rive at specified start times (from the customers), and pos-sess due dates at which the production needs to be finished(and delivered to the customer).

Concretely, there are 29 product orders (jobs) that haveto be delivered by their due dates and can not start beforethe earliest start time specified. Each product is based onone of three recipes (uni, metallic, and bronze lacquers).Each recipe gives information on the sequence of produc-tion steps, the type of resource that is needed for a produc-tion step, and the timing behavior, e.g. the processing timewith a certain resource, offset times, that specify the timeinterval between succeeding production steps, etc. See Fig-ure 3 for a graphical representation of the three recipes. Thedifference of the scheduling problem here to a classical job-shop problem is twofold. (i) There are multiple resources,e.g. 3 mixing vessels of the same type, and, (ii) there aretiming constraints between processing steps, e.g. upper andlower bounds on the start time of a production step with re-spect to the start or end time of the previous production step.

The resources available for production are: 2 mixingvessels for uni-lacquers, 3 mixing vessels for metallic and

4

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 5: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

metallic bronce uni

mixing vessel metal

dose spinner

lab

filling station

bronce mixer

dose spinner bronce

disperging line

disperser

wait

arbitrary,

if not specifiedsynchronize

mixing vessel uni

[2,4]

17.63

5.18

7.35

6.58

22.04

[2,4]

17.63

5.18

7.35

27.73

22.04

[6,6] 11.75 5.88

[0,4]

[2,4]

11.02

5.18

7.35

23.95

25.69

[6,6]

48.98

26.44

Figure 3. Recipes for different types of lacquerswith timing information.

bronze lacquers, 2 filling stations, 1 disperser, 2 dose spin-ners, a special bronze dose-spinner, a bronze mixer, a dis-persing line, and a lab with unlimited availability.

In the description provided by AXXOM, each of the re-sources comes along with two stochastic parameters, so-called performance and availability factors. Both are realnumbers from the interval (0, 1). A performance factor pindicates that the average fraction of time the resource isnot operational due to unforeseeable or unplannable circum-stances, such as resource breakdowns, is 1 − p. The avail-ability factors instead can best be considered as a meansto abstract from system details. An availability factor a de-scribes the fraction of operational time in which the re-source can actually be used because necessary human sup-port is present. A more detailed model could replace thisfactor by specifying the working hours (mon-fri, 9 a.m. to5 p.m., for instance) explicitly. Thus, the availability factorcan be influenced by increasing the person power, while theperformance factor can not be influenced. Apart from thesetwo factors, the description provided by AXXOM does notcontain information about the frequency of resource breaks,the duration of down-times or repair-times, etc.

3.1. The U model and experiments

U [16] is a tool for modeling, simulation and verifi-cation of timed automata. To tackle the principle schedulersynthesis problem for the case study, we defined a (tem-plate) automaton for each of the recipes, including free pa-

rameters for earliest start time and due date. An order isthen an instantiation of a recipe with these dates. The sys-tem consists of the parallel composition of the 29 instantia-tions of the recipes. Resources are modeled as integer vari-ables shared between all the order automata. Each reciperequires one or two clocks measuring the durations of theprocessing steps and the intervals between subsequent pro-cessing steps. We dealt with the processing times in two dif-ferent ways (giving two different models). In the first casewe ignored the performance and availability factors. In thesecond case, following the suggestions of the case studyprovider, we extended the processing times by the corre-sponding factors, e.g., if a resource has a performance of50% the production times on this resource are doubled.

The property to prove was whether all automata repre-senting the orders can reach their final state, indicating thatthey are ready before the due date. In the first place themodel checking runs suffered from state space explosionand further techniques (heuristics) had to be added. Aftersome experimentation, the following heuristics were em-ployed to generate successful runs:

1. start of the orders of one type (uni, metal, bronze) inorder of their due dates;

2. orders of the same type do not overtake each other;

3. non-laziness (active), if possible, i.e. a order does notwait for a continuously unused resource, and takes itonly after a (useless) waiting period. In the model suchbehavior leads to a deadlock and will be not considered(backtracking);

4. greedy strategy (alternatively to non-laziness): if a re-source is available it is taken immediately - note thatthis strategy possibly excludes good schedules, but afeasible schedule found with this technique is still avalid one.

Several successful runs on models with the heuristics abovewere found with a random-depth first version of the veri-fier of U, terminating within a few seconds (whereasthe classical depth-first did not terminate). The diagnos-tic traces leading to the successful state provided the (fea-sible) schedules. For the work presented here, 20 sched-ules were synthesized, each of them meeting all 29 individ-ual due dates. Ten synthesized schedules are based on theplain processing times, not taking performance and avail-ability factors into account. The other ten are overapprox-imated in the sense that, processing times are extended bythe performance and availability factors as follows. Insteadof scheduling a job j on a resource R for processing time P,it is scheduled for P/(p ·a) time units, where p and a are theabove factors for this particular resource. This way of over-approximating was suggested (and is practiced) by the casestudy provider lacking other possibilities for dealing withstochastic parameters.

5

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 6: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

3.2. Modeling the case in M

Ten of the schedules generated by model checking withU contain overapproximations in the sense that it isassumed that the machine is reserved for a longer periodthan needed, in order to compensate the possible delays dueto the breaks and repairs, and other unplanned influences.Since this way of overapproximation tries to compensate forrandom effects with a fixed amount of time, there is an obvi-ous risk that the additionally assigned time may not suffice,and that a job may therefore miss its due date in reality. In-tuitively, this risk is even higher for the schedules computedwithout overapproximations.

This is the phenomenon we are interested to study witha more faithful model of the resources, where the stochas-tic perturbations as described by the performance and avail-ability factor are modeled explicitly. To this end, we exe-cute the scheduled jobs on realistic machine models withstochastically distributed break and repair times. As a re-sult of our analysis, we obtain the probabilities for each jobto be finished on time. This analysis is performed for eachschedule, thus enabling us to compare the schedules in viewof this kind of robustness criterion. Note that all the synthe-sized schedules are perfect in the hard real-time interpre-tation, but they are expected to differ in the more realisticstochastic interpretation.

Structure of the model. The M model is composed ofseveral parallel processes. Each scheduled job (modeling aproduct order) is a process and each machine (modeling aresource) is a parallel process as well. All jobs and machinesare independent, i.e. they do not communicate among them-selves, but jobs do communicate (synchronously) with ma-chines, and vice versa (see Figure 4). Each job process may

Job 1 Job 2 Job 29

ABF1 ABF2

...jobs...

...machines... MVM3

work

lock

unlock

done

error

Figure 4. Structure of the M model.

lock, unlock a machine, or give it a command to work. A ma-chine process can produce a done or error action, so that thejob process knows the result of performing a task.

Performance and availability factors. We intend to faith-fully model the fact that the resources are not always avail-able. This is achieved by a model where each resource maybreak and then can be repaired. For a specific resource, weuse performance and availability factors p and a as an in-dication how often the breaks occur and how long it takesto repair it. The first question which arises in this context is

whether a different treatment is needed for each of the fac-tors. After studying the provided documentation, we cameto the conclusion that the availability factor is conditionalw.r.t the performance factor, so that by multiplying thesetwo numbers we get the real performability factor of themachine, e.g. the fraction of time during which the machineworks “full speed”, on average.

With this decision, there is a close correspondence to thestandard definition of availability given below:

p · a = mean up time (MUT)MUT+ mean time to repair (MTTR)

The case study documentation of a resource only containsthe parameters p and a, but does not specify a quantity thatcould be interpreted as MUT or MTTR. For example, if a re-source on average breaks every half a day and needs half aday to be repaired, it has the same factor p · a as the ma-chine which on average breaks every week and needs an-other week to be repaired. To study the influence of thismissing information, and to discuss its effect with the casestudy providers, we incorporate a pace parameter into themodel that indicates how frequently events happen in thesystem. The pace may be viewed as the reciprocal of thebasic time unit of our model, where the basic time unit isMUT+MTTR, so pace−1 = MUT+MTTR. We can thus de-rive MUT = pace−1 ·p·a, and MTTR = pace−1 ·(1−p·a). Thefinal question that has to be answered before fixing the pa-rameters of a resource model is how to represent these MUTand MTTR stochastically. Lacking any other specific infor-mation we chose to represent these delays by exponentiallydistributed random variables. Note that exponential distri-butions can be interpreted as the best guess (in the senseof maximizing the entropy) if only the mean of a distribu-tions is available.

Resource model. In Figure 5 a schematic process for a re-source (a filling line) is presented. This process waits fora command to start working. This happens when the vari-able ABF1 work is put to a non-zero value by a parallel pro-cess. This value represents the time during which the ma-chine has to work. At this point we have already drawna sample from the exponential distribution with the rater = 1/MUT = pace/(p·a), where p = 0.75 and a = 0.86 arethe two factors for the machine under consideration. Thisgives an average time of 1/r for the machine to break. Ifthis time is larger than the time to work, then the processwaits till the end of work is reached and then informs theother process that the job is done (act done ABF1). Other-wise, the process waits till the time to break and draws asample from the exponential distribution to know the repairtime. The mean value for this distribution is pace/(1− p ·a).After waiting for this time to repair, the process goes back tothe working state. In the working state the work of the ma-chine can be interrupted if the global time gets larger than

6

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 7: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

1 process ABF1_machine() // Filling line 12 { clock w,y;3 float br,r,work,deadline;4 br=Exponential(pace/0.645) //next break time56 do{::7 when (ABF1_work >0) act_work_ABF18 {= work=ABF1_work , ABF1_work=0,9 deadline=ABF1_deadline ,

10 w=0 //current working time11 =};12 do13 {::when (cjobs>=deadline) act_error_ABF114 {= ABF1_done=2, br-=w =}; break15 ::when (br>=work && w>=work) act_done_ABF116 {= ABF1_done=1, br-=w =}; break17 ::when (br<work && w>=br) act_break_ABF118 {= work-=br, y=0,19 r=Exponential(pace/(1-0.645)) =};20 when(y==r) act_repaired_ABF121 {= w=0, br=Exponential(pace/0.645) =}22 } } }

Figure 5. A M model of a resource.

the job deadline. We note here that this does not happen ifthe machine is being repaired.

Job model. The schedules derived from the synthesis withU contain the precise times when each particular taskof each job should start and when it finishes. There is someflexibility how to interpret these schedules in an environ-ment where machines are occupied in a stochastic man-ner. In particular, a task may block a machine for less thanthe overapproximated time it has assigned, due to absenceof breakdowns, for instance. This means that subsequenttasks can use the slack in the schedule, which is not appar-ent in the hard real-time interpretation. We therefore take aloose interpretation of the synthesized schedules. We inter-pret them as a totally ordered list of tasks, but only take intoaccount the time when the whole job (the first task of thatjob) is scheduled to start, and also the time frame (the earli-est starttime and the deadline) for the job execution. Whenexecuting a job we start a subsequent task immediately af-ter the previous task has finished (if a machine is available),which means that we can even be ahead of the schedule be-ing executed.

We use the global clock to know when to start the job(starttime parameter), and to know whether we have met thedeadline or not at the end (of the job). A job process tries tolock a mixing vessel and then proceed to perform the tasksas they are shown on Figure 3. For each task it first locks therequired machine, and then commands it to work. Then itawaits till the task is finished, and proceeds to the next task.At the end the job process compares the current time withthe job deadline, and in this way determines if the deadlinehas been missed or not.

Synchronization. Apart from using action-based syn-chronization, another way to synchronize two processesin M is by means of a shared variable and thewhen construction. The statement when (ABF1 work>0)

1 //a filling line (and the mixing vessel) for 232 alt{3 ::when(ABF1_lock==0) {= ABF1_lock=1,4 ABF1_deadline=deadline -2, ABF1_work=23 =};5 when(ABF1_done >0)6 {= ABF1_done=0, ABF1_lock=0 =}7 ::when(ABF2_lock==0) {= ABF2_lock=1,8 ABF2_deadline=deadline -2, ABF2_work=23 =};9 when(ABF2_done >0)

10 {= ABF2_done=0, ABF2_lock=0 =}11 }

Figure 6. Implementation of locks in M

act work ABF1 means that the process waits till the valueof the shared variable ABF1 work becomes larger than0, and then executes act work ABF1. In this way an-other process just needs to assign a value bigger than0 to ABF1 work, and the waiting process has to set itback to 0 afterwards. A symmetric synchronization isneeded to make the synchronization in the other direc-tion. By this kind of synchronization a data value ispassed from one process to the other using the shared vari-able. Most of the synchronizations appearing in this casestudy use this feature.

Locks One of the ways to implement the locking mecha-nisms is based on test-and-set primitives [4, Page 43]. It hasto do with the fact that the when construction is not an ac-tion, but a guard. This means that there is no interleaving be-tween the when construction and the following action. Thefollowing action can have an assignment that performs theset part of test-and-set. As an example we show a part of thejob process (Figure 6): Here the job tries to acquire the lockof one of the two filling lines. In case ABF1 lock==0 it setsthis lock (ABF1 lock=1) and no other job can access thevalue of ABF1 lock in the mean time. After that the worktime and the deadline are set for ABF1, which is a commandfor the ABF1 process to get out from the idle state. The or-der of the assignments is not important because all of themare executed in one atomic action.

It is important to mention here that this locking mech-anism does not prevent a malicious or an incorrectly spec-ified job from breaking the exclusivity constraint. A care-ful inspection of our model shows that, in fact, task execu-tions are atomic even in this case, but, for example, if a jobwants to lock a resource for two consecutive tasks, this can-not be guaranteed. Action synchronization with parameter-ized actions could provide a solution in this case.

It is also worthwhile to mention that the entire specifica-tion makes excessive use of nondeterminism induced by thealt-construct, but also by the interleaving semantics of thepar-construct. This nondeterminism is a modeling meansrather than a semantic entity, because the specification in-cludes a schedule (generated by U) which orders thetasks and jobs in a specific way, with jobs being bound tomachines. Therefore, the nondeterminism is mostly “sched-

7

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 8: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

uled away”, except for some cases, e.g. where schedulingdecisions have to be made among multiple (identical) ma-chines. We take a pragmatic approach to this issue, basedon the discussion in Section 2.3. We apply a solution basedon Bernoulli’s principle of insufficient reason, which statesthat all events over a sample space should have the sameprobability unless there is evidence to the contrary. Practi-cally, this means that the nondeterminism is actually treatedas a probabilistic choice with a uniform distribution overthe different alternatives, which is a maximum-entropy so-lution. It is realized by a simulator which is not biased, suchthat the different possibilities of choice are, on average, cho-sen with the same frequency.

4. Results

4.1. Experiments

Our primary goal is to evaluate the degree to which theschedules can tolerate the delays due to the breaks and re-pairs of the machines. We do evaluation by means of simu-lation, e.g. executing the schedules for many times and com-puting the average values of performance variables. We de-fine the following performance variables. First of all we tryto estimate the number of jobs meeting the deadlines (e.g.finishing before the due day). This number ranges from 0to 29, so we are interested in the mean value as well as inthe distribution of this performance variable among the val-ues from 0 to 29. Another criterion is the success rate foreach individual job. Here we want to estimate the probabil-ity of a particular job (with the number from 1 to 29) to fin-ish before the deadline, for each particular schedule.

Following a suggestion of Kim Larsen [12], we are alsoinvestigating the total time that each particular machine isworking, is being repaired, and is idle. We also estimatethese for each of the schedules. We evaluated 20 schedules,10 of them based on the plain processing times, 10 basedon overapproximations, i.e. the extended processing times.We first experimented with a wide range of paces, but feed-back from AXXOM made us focus on only two paces, cor-responding to basic time units of 32 and 100 hours. Basedon similar feedback we also distinguish two different dead-line behaviors. We originally considered it wise to give upperforming a job once it is known for sure that it will missits due date. This makes resources available for future jobs,which can profit from this by starting tasks ahead of sched-ule. However, the case study providers considered it inter-esting, but unrealistic to abort a lacquer production with anunfinished product. Therefore we also consider all sched-ules by not giving up producing products even though theirdeadline has already passed. In total we simulated 80 exper-iments, each of which was executed for 20000 times.

The experiments were run on 4 Pentium 4 PC’s. A totalof 80 experiments were completed in 11 hours. The fastest

machine (a PC running Linux with a Intel P4 3GHz CPU)completed one experiment in about 20 minutes. A runningsimulation took about 7MB of memory. As a result we ob-tained 80 files with the values of the performance variables.Postprocessing of these files was performed with pythonscripts and SQL queries to obtain the mean values and confi-dence intervals of the performance variables for each sched-ule. We further used GNU Plot and an office package to pro-duce charts from the obtained data.

4.2. Evaluation

The experiments carried out with M and M pro-vide interesting insight into the scheduling problem. Thefirst observation is that, both for “normal” and overapprox-imated schedule types, the probability distribution of thenumber of jobs finished on-time does not differ much fromone particular schedule to another. This can be seen fromFigure 7, where the significant portion of the probabilitydistribution of the number of successful jobs is drawn forall 20 schedules. The chart on the left corresponds to the 10normal schedules, and the chart on the right corresponds 10overapproximated schedules. Each of the charts contains 4clusters of lines, corresponding to the 2 paces and 2 kindsof deadline behaviors.

We recall that according to suggestions of the case studyproviders, overapproximated schedules were supposed to bemore robust with respect to stochastic perturbations. How-ever, as it turns out from the experiment, the normal sched-ules outperform the overapproximated ones, which is rathersurprising at first sight. It can be explained by the fact that inthe overapproximated schedules some jobs are scheduled tostart late, because earlier jobs need to be scheduled longer.As a consequence, later jobs have less time remaining be-fore the deadline. Note that in our model the jobs are notable to profit from slacks in a schedule, while the tasks are.As we can also see from the charts, the probability mass isshifted towards a lower number of jobs in the case whenthe job execution is continued, even if the deadline has al-ready been missed. This can be explained by the fact thatthe “struggling” jobs continue to occupy resources, with-out improving their chance to reach the deadlines. It is alsovisible from Figure 7 that decreasing the pace (i.e., increas-ing both MTTR and MUT) makes the plot look steeper andthinner.

The probabilities of meeting the deadline for each indi-vidual job (for the case with pace 0.01 and without respect-ing the deadlines) are shown in the Figure 8. Here only theoverapproximated schedules (11-20) are shown, becausethe differences among the schedules is slightly more sig-nificant in this case.

The jobs on the chart are ordered by their (earliest) start-ing time. This chart shows that most of the jobs with laterstarting times have less chance to finish on-time than the

8

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 9: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

16 17 18 19 20 21 22 23 24 25 26 27 28 29

w/o deadlines, pace 0.01

w deadlines, pace 0.01

w deadlines, pace 0.0316

w/o deadlines, pace 0.0316

Job success ratios, schedules without overapproximation

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

16 17 18 19 20 21 22 23 24 25 26 27 28 29

w/o deadlines, pace 0.01

w deadlines, pace 0.0316

w deadlines, pace 0.01

w/o deadlines, pace 0.0316

Job success ratios, schedules with overapproximation

Figure 7. Success probabilities for a number of jobs.

jobs that can start earlier. This can be explained by the in-crease of stochastic perturbation of the model as time in-creases. Another observation concerns jobs of the sametype. If two jobs of the same job type have close earlieststarting time, one of them may have less chances to fin-ish on-time, depending on the particular schedule (and theavailable resources). For instance, job 14 has only about40% chance in the first three schedules, but about 73% bythe following two. The situation is entirely reversed for job18. A careful inspection of the schedules 11-15 reveals thatthe first three schedules launch job 18 before job 14, andthe next two do the reverse. One can exploit this observa-tion, e.g. if job 14 is a product for a more important cus-tomer than job 18, the schedules 14 and 15 are more prefer-able than schedules 11-13. Similar observations, though lesssignificant, can also be made across jobs of different type,for example, jobs 20 and 19, or jobs 10 and 20.

As far as the average machine utilization/repair times areconcerned, the results show that they do not differ at allfrom one schedule to another. Actually, this was to be ex-pected, because the same amount of work is being done byall schedules and because the breaking behaviors of the ma-chines is the same across the schedules.

5. Conclusions and Future WorkIn this paper we have described a methodology to syn-thesize schedules, and to assess these schedules by meansof simulation. We have used the tool U and well-established model-checking techniques for the synthesispart, and M/M/M for the simulation. Themain contribution of this paper lies in the combination ofreal-time model-checking techniques and stochastic evalu-ation of results obtained by these techniques. To the bestof our knowledge, this combination is unique. The M-/M tandem provides excellent means to perform

this kind of analysis, due to the fact that M has strongroots in both worlds. Being an automata-based, composi-tional, easily extensible and modifiable formalism, M- enables a natural integration of the synthesized sched-ules and machine models into the simulation environmentof M, in particular relative to more classical queuing-oriented tools such as QNAP.

The results of our simulation enabled us to detect whichschedules are better and where the bottlenecks are (whichjobs are risky), and to give feedback to the case studyproviders. We can also study which parameters influencethe results and which do not. It became apparent that a naıveway to deal with performance and availability factors (i.e.overallocating processing times) as suggested and practicedby the case study provider does not lead to better results.

5.1. M and M

This case study is the first touchstone for the M-/M tool tandem. Before the final evaluation of thepresented model could take place, some practical obsta-cles had to be overcome. In the first version of the M-/M compiler, the generated C++ code was enor-mous: several 100.000 lines of code for a 700-line Mspecification). Compilation of this code was quite dif-ficult, since the used C++ compiler (g++ 3.3) uses avery memory intensive optimization algorithm. Actu-ally, compilation was often infeasible doe to memoryconstraints. Since then the translation has been opti-mized so that the generated files became more than 10times smaller.

Currently simulating M models in the M sim-ulator is somewhat inefficient since the generated files donot contain information about the interdependencies be-tween actions and state variables. As a result the entire eventqueue needs to be re-examined after each action firing. Re-cent results from the same model including this dependency

9

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE

Page 10: [IEEE First International Conference on the Quantitative Evaluation of Systems, 2004. QEST 2004. Proceedings. - Enschede, Netherlands (2004.11.12-2004.11.12)] First International Conference

Figure 8. Success probabilities of the individual jobs.

information, generated with a development version of thecompiler, indicate that the simulation can be done about 8times faster and takes 3 times less memory.

In the future we plan to support more features for processparameterization and modularization. The current modelcould be substantially improved so that we do not repeatsimilar fragments of process definitions. One promisingway to arrive there is based on an implementation of actionsynchronization augmented with value passing and match-ing mechanisms.

5.2. Model improvements

For a more consistent evaluation of the schedules a morerealistic machine model is desirable. Mean up time, meantime to repair, factors like overall working time, workingtime since last repair, working time since last stop, over-all idle time, idle time since last stop, etc., influence thebreaking/repair time distributions of a machine. This infor-mation was not available, which forced us to assume simpleexponential distributions as failure distributions, and vary-ing the MUT and MTTR behavior with simple pace param-eters. Our results are as good as these assumptions. To im-prove the results, a more sophisticated statistical analysis ofa machine’s breakdown behavior is needed, which is in theinterest of the case study providers. Incorporating a moredetailed description of a machines failure behavior into theM model is not expected to pose problems.

The model can also be improved in a different directionby incorporating cost aspects. Using a machine can havea cost per time unit. Each repair could have an additionalcost (also per time unit). Missing a job deadline also has acost (depends on job type and the delay time). Meeting adeadline has a reward (constant per job type). We can alsohave storage costs if we finish earlier than the deadline (pertime unit). In addition there can be costs for using the sameequipment for different types of jobs, because the machinesmust be cleaned, etc. All these kind of costs can be conve-

niently incorporated into the model by means of the rate/im-pulse reward models that can be specified in M.

References[1] Y. Abdeddaım and O. Maler. Job-Shop Scheduling using

Timed Automata. In Proc. CAV’01, 2001.[2] R. Alur and D. Dill. A theory of timed automata. TCS,

138:183–335, 1994.[3] AMETIST, IST project ist-2001-35304.http://ametist.cs.utwente.nl/.

[4] M. Ben-Ari. Principles of Concurrent and Distributed Pro-gramming. Prentice Hall international series in computingscience. Prentice Hall, 1990.

[5] H. Bohnenkamp, T. Courtney, D. Daly, S. Derisavi, H. Her-manns, J.-P. Katoen, V. V. Lam, and W. H. Sanders. On in-tegrating the Mobius and MoDeST modeling tools. In Proc.DSN’03. IEEE Computer Society, 2003.

[6] H. Bohnenkamp, H. Hermanns, J.-P. Katoen, and R. Klaren.The MoDeST modeling tool and its implementation. InTOOLS 2003, LNCS 2794. Springer, 2003.

[7] G. Butazzo. Hard Real-Time Computing Systems. Pre-dictable Scheduling Algorithms and Applications. KluwerAcademic Publishers, 1997.

[8] P. R. D’Argenio, H. Hermanns, J.-P. Katoen, , and R. Klaren.MoDeST - a modelling and description language for stochas-tic timed systems. In Proc. PAPM-ProbmiV’01, LNCS 2165,pages 87–104. Springer, 2001.

[9] A. Fehnker. Scheduling a Steel Plant with Timed Automata.In Proc. RTCSA’99. IEEE Computer Society Press, 1999.

[10] J. Hillston. A Compositional Approach to Performance Mod-elling. PhD thesis, University of Edinburgh, 1994.

[11] C. Hoare. Communicating Sequential Processes. Series inComputer Science. Prentice-Hall International, 1985.

[12] K. G. Larsen. Personal communications, Dec. 2003.[13] O. Maler, A. Pnueli, and J. Sifakis. On the synthesis of

discrete controllers for timed systems. In Proc. STACS’95,LNCS 900. Springer, 1995.

[14] J. F. Meyer, A. Movaghar, and W. H. Sanders. Stochastic ac-tivity networks: structure, behavior and application. In Proc.Int. Workshop on Timed Petri Nets, pages 106–115, 1985.

[15] A. Movaghar and J. F. Meyer. Performability modeling withstochastic activity networks. In Proc. Real-Time SystemsSymposium, 1984.

[16] UPPAAL home page. http://www.uppaal.com.

10

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST’04) 0-7695-2185-1/04 $ 20.00 IEEE