    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. X,NO. DAY, MONTH 2012 1

    Systematic Elaboration of Scalability Requirements through Goal-Obstacle Analysis

    Leticia Duboc
    Dept. of Computer Science
    State University of Rio de Janeiro
    Brazil 20550 900
    Email: [email protected]

    Emmanuel Letier
    Dept. of Computer Science
    University College London
    United Kingdom WC1E 6BT
    Email: [email protected]

    David S. Rosenblum
    School of Computing
    National University of Singapore
    Singapore 117417
    Email: [email protected]

    Abstract: Scalability is a critical concern for many software systems. Despite the recognized importance of considering scalability from the earliest stages of development, there is currently little support for reasoning about scalability at the requirements level. This paper presents a goal-oriented approach for eliciting, modeling and reasoning about scalability requirements. The approach consists of systematically identifying scalability-related obstacles to the satisfaction of goals; assessing the likelihood and severity of these obstacles; and generating new goals to deal with them. The result is a consolidated set of requirements in which important scalability concerns are anticipated through the precise, quantified specification of scaling assumptions and scalability goals. The paper presents results from applying the approach to a complex, large-scale financial fraud detection system.

    Index Terms: D.2.1 Requirements/Specifications, D.2.0.a Analysis, D.2.8.b Performance measures, D.2.10.h Quality analysis and evaluation, Goal-oriented requirements engineering, KAOS, Scalability

    1 INTRODUCTION

    SCALABILITY is widely regarded as one of the most important qualities of software-intensive systems [1], yet fundamental principles for developing scalable systems are still poorly understood. As with other non-functional system qualities, a system's ability to scale is strongly dependent on decisions taken at the requirements and architecture levels [2], [3]. Failure to consider scalability adequately in these stages may result in systems that are impossible or too costly to change if scalability problems become apparent during testing or system usage.

    Current technologies, such as virtualization and cloud computing, may help to achieve scalability in software systems. However, a system whose architecture fails to support some of its scalability goals cannot be made scalable simply by employing virtualization and clouds for its implementation. In order to take advantage of these technologies, one must understand and define the system's scalability goals precisely. An a priori analysis of scalability goals prevents system analysts from being drawn prematurely into discussions about technology and gives them more freedom to explore, negotiate and fulfill these goals. There is therefore a strong need for systematic techniques to reason about scalability during the early stages of development.

    In our previous studies of scalability, we defined scalability as the ability of a system to maintain the satisfaction of its quality goals to levels that are acceptable to its stakeholders when characteristics of the application domain and the system design vary over expected operational ranges [4]. A system's scalability is therefore relative to other primary quality goals for the system. For example, the scalability of a Web search engine may be characterized by its ability to return search results in a time that remains almost instantaneous (a performance goal) when the number of simultaneous queries and the number of pages indexed increase. Scalability, however, is not limited to performance goals. The scalability of an air traffic control system may be characterized by its ability to keep safe separation distances between airplanes (a safety goal) when the number of airplanes in the airspace managed by the system increases. As this latter example illustrates, performance is only one of the many possible indicators of scalability; others include reliability, availability, dependability and security [4].

    Defining the scalability requirements of a complex system is not trivial. Many issues arise during their elaboration [5]:

    • how to specify scalability requirements that are precise and testable so that they provide adequate inputs for architectural design and analysis;

    • how to identify and manage scalability-related risks early in the development process, and in particular how to make sure one has identified all application domain quantities whose variations could have an impact on the system's scalability;

    • how to deal with the inevitable unknowns, uncertainties and disagreements about expected ranges of variations for these domain quantities;

    • how to ensure that the specified scalability requirements are neither too strong nor too weak with respect to business goals, to expected ranges of variations in the application domain, and to capabilities of current technologies; and

    • how to find adequate tradeoffs between scalability requirements and other qualities, in particular those related to cost and development time.

    Existing requirements engineering techniques provide little concrete support to address these specific issues. User stories and use-case based approaches to requirements engineering overlook scalability concerns and other non-functional requirements altogether. A few industry white papers provide high-level guidelines for writing quantifiable scalability requirements, but no precise foundations nor any guidance to deal with the above issues [6], [7]. Goal-oriented approaches such as KAOS [8], i* [8], [9], and Problem Frames [10] provide generic support for elaborating, structuring and analyzing software requirements, including both functional and non-functional requirements, but no specific support to deal with scalability concerns. A rudimentary application of a goal-oriented approach that would consist of, say, specifying a high-level goal named Scalable System and gradually refining it into more precise requirements is not satisfying because of the lack of a precise definition for the scalability goal and the lack of support for refining it into testable scalability requirements.

    Dedicated requirements engineering techniques exist for other important categories of non-functional requirements such as safety [11], [12] and security [13], [14], [15]; however, up to now no specific attention has been given to scalability requirements. Surprisingly, scalability is not even included in some taxonomies of non-functional requirements in industrial standards or in the software engineering literature [8], [16], [17], [18], [19]. As a consequence, scalability requirements are often overlooked or defined in vague terms at best [5].

    This paper presents a systematic method for elaborating and analyzing scalability requirements, and it presents the results of a case study in which we applied the method to a complex, large-scale financial fraud detection system. The method has been formulated in the context of the KAOS goal-oriented requirements engineering method [8], and it relies on KAOS for the elaboration of goal models, for managing conflicts between goals, and for evaluating and selecting among alternative system designs. We extend the conceptual framework of KAOS with precise characterizations of the concepts of scalability goals, scalability requirements and scaling assumptions that are essential for reasoning about scalability at the requirements level. We also extend the KAOS model elaboration process with concrete support for elaborating and analyzing such concepts.

    Our approach deals with scalability concerns during the goal-obstacle analysis steps of the goal-oriented elaboration process [20]. The basic idea consists in systematically identifying and resolving potential variations in domain quantities that will cause goals to fail; we will define such conditions as scalability obstacles. The aim of a scalability obstacle analysis is to anticipate scalability-related risks at requirements elaboration time, when there is more freedom for finding adequate ways to deal with them, rather than discovering them during system development or, worse, experiencing unanticipated scalability failures at runtime. Existing obstacle analysis techniques [20] are not sufficient for handling scalability obstacles because they support only generic analyses in which the specifics of scalability obstacles are not considered. As a result, scalability obstacles are likely to be overlooked during obstacle identification (as they were, for example, in a previous obstacle analysis of the London Ambulance Service case study [20]). We present new, more specific heuristics, formal patterns, and model transformation tactics for identifying and resolving scalability obstacles.

    2 BACKGROUND

    2.1 The Intelligent Enterprise Framework

    Our presentation relies on examples of a real-world system, Searchspace's Intelligent Enterprise Framework (IEF), that we used to develop and validate our techniques.

    IEF is a platform used by banks and other financial institutions for processing large volumes of financial transaction data, and notifying users (such as bank employees) of potentially fraudulent behavior within the data. IEF has important scalability requirements due to the recent widespread consolidation in the banking and finance sector. The rate of electronic transactions has increased dramatically, analytical methods have become more complex, and system performance expectations have become higher. Over the last 10 years, the number of installations of the company's software went from a handful of deployments in a small number of countries to many hundreds of deployments, spread worldwide. The volume of data processed increased from a single deployment peak of about 3 million transactions per day to over 90 million transactions for a recent deployment, coupled with an increase in the complexity of the processing performed.

    The system uses a number of analytical methods for recognizing fraudulent transactions. For simplicity, we consider here only the comparison of transactions delivered by a bank against known fraudulent patterns, which may be done continuously or overnight. In the former, transactions are streamed by the bank system into the IEF and processed as soon as they arrive. In the latter, transactions are provided periodically and processed in data batches. The choice of approach depends on the type of fraud being addressed and the upstream banking processes. Alerts about potentially fraudulent transactions are generated by the IEF and then investigated by the bank, which will dismiss or confirm the frauds and then take appropriate measures.

    2.2 Goal-Oriented Requirements Engineering

    Goal orientation is a paradigm for the elicitation, evaluation, elaboration, structuring, documentation and analysis of software requirements. The KAOS approach is composed of a modeling language and a constructive elaboration method supported by informal heuristics and formal techniques [8]. The modeling language combines multiple system views for modeling goals, domain objects, agents and their responsibilities, agent operations, and system behaviors through scenarios and state machines. The method consists of a series of steps for elaborating and analyzing the different views. In this section, we focus on presenting the key concepts that are used in the paper. Further details about the KAOS modeling language and elaboration method can be found in van Lamsweerde's book [8].

    A goal is a prescriptive statement of intent that a system should satisfy through the cooperation of agents. Agents are active system components, such as humans, hardware devices and software components, that are capable of performing operations that modify system state in order to satisfy goals. The word system refers to the composition of the software under development and its environment. Goals range from high-level business objectives whose satisfaction involves multiple agents (such as minimizing loss due to financial fraud and maintaining the bank's reputation), to fine-grained technical properties involving fewer agents (such as processing all transaction batches within eight hours).

    Unlike goals, which are prescriptive, domain properties are descriptive statements that are always true in the application domain, irrespective of agents' behaviors. Physical laws are typical examples of domain properties. Domain hypotheses are also descriptive statements about the application domain but, unlike domain properties, are considered to be more uncertain and subject to change.

    In a goal model, goals are organized in AND/OR refinement structures. An AND-refinement relates a goal to a set of subgoals; this means that satisfying all subgoals in the refinement is a sufficient condition in the domain for satisfying the goal. OR-refinement links relate a goal to a set of alternative AND-refinements; this means that each AND-refinement represents an alternative way to satisfy the parent goal.
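
    The refinement semantics described above can be sketched as a small recursive check. This is only an illustrative sketch: the `Goal` class, its `refinements` field and the `satisfied` function are hypothetical names introduced here, not part of KAOS or of any KAOS tool, and leaf goals are simply looked up in a set of names assumed to be satisfied.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    # Each inner list is one AND-refinement (a set of subgoals);
    # several inner lists together form an OR-refinement (alternatives).
    refinements: list = field(default_factory=list)


def satisfied(goal, satisfied_leaves):
    """Return True if satisfaction of the goal can be derived from the leaves.

    A leaf goal is satisfied if its name is listed. A refined goal is
    satisfied if all subgoals of at least one alternative AND-refinement
    are satisfied.
    """
    if not goal.refinements:
        return goal.name in satisfied_leaves
    return any(
        all(satisfied(sub, satisfied_leaves) for sub in alternative)
        for alternative in goal.refinements
    )


# Hypothetical goal G with two alternatives: (A and B) or (C).
g = Goal("G", [[Goal("A"), Goal("B")], [Goal("C")]])
print(satisfied(g, {"A", "B"}), satisfied(g, {"A"}))
```

    Because an AND-refinement is only a sufficient condition, a `False` result means that satisfaction cannot be derived from the listed leaves, not that the parent goal is necessarily violated.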

    The KAOS elaboration process consists in constructing a system's goal model both top-down (by asking HOW questions to refine goals into subgoals) and bottom-up (by asking WHY questions to identify parent goals). During the goal refinement process, goals are decomposed into subgoals so that each subgoal requires the cooperation of fewer agents for its satisfaction; the refinement process stops when it reaches subgoals that can be assigned as the responsibility of single agents. In order to be assignable to an agent, a goal must be realizable by the agent. Realizability here is a formal condition requiring that the goal definition corresponds to a transition relation between the agent's monitored variables (its input) and controlled variables (its internal and output variables) [21]. A goal assigned to the software under development is called a software requirement. A goal assigned to an agent in the environment is called a domain expectation. The term domain assumption is used to refer to both domain expectations and domain hypotheses.

    Fig. 1. Partial Model for a Previous Version of the IEF [5].

    As an example, Figure 1 shows a portion of the goal model we have elaborated for a previous version of the IEF system that supported batch processing only [5]. Graphically, goals are depicted as parallelograms, and AND-refinement links are represented as arrows from subgoals to a parent goal. Agents are depicted as hexagons, and a link between an agent and a goal means that the agent is responsible for the goal. The top of the figure shows three stakeholder goals for this system, namely Avoid [Loss Due to Fraud], Maintain [Bank Reputation], and Avoid [Inconvenience to Account Holder]. The goal Achieve [Fraud Detected and Resolved] contributes to these three goals and is refined into the goal Achieve [Alerts Generated], requiring that fraud alerts be generated for potentially fraudulent bank transactions, and the expectation Achieve [Alerts Investigated and Acted Upon Quickly], under the responsibility of a human Fraud Investigator, requiring that generated alerts either be confirmed as fraudulent and resolved, or cleared as false alerts. The goal Achieve [Alerts Generated] is itself refined into an expectation on the Bank System to provide the batches of transactions data, and a requirement on the Alert Generator software component to process the batches of transactions overnight.

    2.3 Specifying Goals

    Each goal has a specification including its pattern (e.g., Achieve, Maintain, Avoid), name and natural language definition. A goal also may be specified as belonging to one or more categories (e.g., functional, performance, safety) and be given an optional formal definition expressed in Metric Linear Temporal Logic [22], which is used to disambiguate natural language definitions and to allow automated reasoning on goal models. In this paper, we will use the following standard temporal logic operators:

    $\Box P$ (P is true in all future states)

    $\Diamond P$ (P is true in some future state)

    $\Diamond_{\le d} P$ (P holds in some future state that is less than d time units from the current state)
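
    To make the operators above concrete, the following sketch evaluates them over a finite, timestamped trace. It is an approximation under stated assumptions rather than a definition of the logic: the trace representation (a list of `(timestamp, state)` pairs), the function names, the inclusion of the current state among the "future" states, and the treatment of the boundary case (a state exactly d time units away counts) are all choices made here for illustration.

```python
def always(trace, i, p):
    """'Always', restricted to a finite trace: p holds in every state from index i on."""
    return all(p(state) for _, state in trace[i:])


def eventually(trace, i, p):
    """'Eventually': p holds in some state at or after index i."""
    return any(p(state) for _, state in trace[i:])


def eventually_within(trace, i, d, p):
    """Bounded 'eventually': p holds in some state at or after index i
    whose timestamp is at most d time units after the timestamp at i."""
    t0 = trace[i][0]
    return any(p(state) for t, state in trace[i:] if t - t0 <= d)


# Hypothetical trace: hours since end of day, and whether the batch is processed.
trace = [(0, {"processed": False}), (5, {"processed": False}), (7, {"processed": True})]
done = lambda state: state["processed"]
print(eventually_within(trace, 0, 8, done), eventually_within(trace, 0, 6, done))
```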

    The natural language and formal definitions of a goal define what it means for the goal to be satisfied in an absolute sense. Partial levels of goal satisfaction are specified by extending goal specifications with domain-specific quality variables and objective functions [23]. Mathematically, quality variables correspond to probabilistic random variables. The specification of objective functions and quality variables extends the goal model with a probabilistic layer that can be used to compute the impact of alternative goal refinements and responsibility assignments on the degrees of goal satisfaction.

    As an example, the goal Achieve [Batch Processed Overnight] has the following specification.

    Goal Achieve [Batch Processed Overnight]
    Category Performance goal
    Definition Batches of bank transactions provided by the bank system should be processed in less than 8 hours. Processing consists in generating alerts about transactions in the batch that match known fraudulent patterns stored in the system.
    Formal Spec $(\forall b : \mathrm{Batch})\ (\mathrm{EndOfDay} \land b.\mathrm{Submitted} \Rightarrow \Diamond_{\le 8\,\mathrm{hours}}\, b.\mathrm{Processed})$
    Quality Variable procTime: Batch $\to$ Time
      Definition The time to process the full batch of transactions after the end of the day;
      Sample Space The set of daily batches of bank transactions submitted by the bank system.
    Objective Functions At least 9 out of 10 batches should be processed within 8 hours, and all batches should be processed within 9.6 hours.

      Name                                   Definition                  Target
      % of Batches Processed in 8 hours      Pr(procTime $\le$ 8h)       90%
      % of Batches Processed in 9.6 hours    Pr(procTime $\le$ 9.6h)     100%

    Responsibility Alert Generator

    The natural language and formal definitions state that this goal is satisfied in an absolute sense if every batch of transactions submitted to the IEF is processed in less than 8 hours. The quality variable procTime is a random variable whose sample space is the set of daily batches submitted to the IEF and whose value denotes the time to process a batch. The first objective function states that the system should be designed to maximize the probability that a batch is processed in less than 8 hours.

    Fig. 2. Obstacles to the Expectation Achieve [Transactions Submitted in Daily Batches].

    The final column specifies the target to be achieved for each objective function; at least 90% of the batches should be processed within 8 hours, and there is a tolerance of 20% in the processing time (i.e., 9.6 hours) for the remaining batches. The goal is assigned to the Alert Generator agent, a subsystem of the IEF.
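
    As a rough illustration of how the two objective functions could be checked against observations, the sketch below estimates each probability as an empirical frequency over a sample of processing times. The function names and the sample data are hypothetical; only the 8-hour and 9.6-hour thresholds and the 90% and 100% targets come from the specification above, and using a relative frequency as an estimate of Pr is an assumption of this sketch.

```python
def objective_values(proc_times_hours):
    """Estimate both objective functions from observed processing times (in hours)."""
    n = len(proc_times_hours)
    within_8 = sum(t <= 8.0 for t in proc_times_hours) / n
    within_9_6 = sum(t <= 9.6 for t in proc_times_hours) / n
    return within_8, within_9_6


def meets_targets(proc_times_hours, target_8=0.90, target_9_6=1.00):
    """Compare the estimated objective values with their targets."""
    within_8, within_9_6 = objective_values(proc_times_hours)
    return within_8 >= target_8 and within_9_6 >= target_9_6


# Hypothetical sample: nine batches take 7 hours, one takes 9 hours.
sample = [7.0] * 9 + [9.0]
print(objective_values(sample), meets_targets(sample))
```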

    2.4 Goal-Obstacle Analysis

    Goals, requirements and assumptions elaborated during a requirements elaboration process are often too ideal because they fail to take into consideration exceptional conditions in the application domain that may violate them. The principle of obstacle analysis is to take a pessimistic view of the model elaborated so far by systematically considering how the actual system might deviate from the model.

    An obstacle to some goal is an exceptional condition that prevents the goal from being satisfied [20], [24]. An obstacle O is said to obstruct a goal G in some domain characterized by a set of domain properties Dom if, and only if:

    • the obstacle entails the negation of the goal in the domain (i.e., $O, \mathit{Dom} \models \lnot G$); and

    • the obstacle is not inconsistent in the domain (i.e., $\mathit{Dom} \not\models \lnot O$).
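
    The two conditions above can be checked mechanically in a simplified setting. The sketch below works on a propositional abstraction, with formulas represented as Python predicates over truth assignments and entailment decided by enumerating all assignments. This abstraction is an assumption made for illustration only: KAOS goals and obstacles are temporal logic formulas, and the atoms and formulas in the example are hypothetical.

```python
from itertools import product


def entails(premises, conclusion, atoms):
    """True if every assignment satisfying all premises also satisfies the conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True


def obstructs(obstacle, goal, dom, atoms):
    """Check both obstruction conditions for a propositional obstacle and goal."""
    negated_goal = lambda v: not goal(v)
    negated_obstacle = lambda v: not obstacle(v)
    entails_negation = entails([obstacle] + dom, negated_goal, atoms)
    domain_consistent = not entails(dom, negated_obstacle, atoms)
    return entails_negation and domain_consistent


# Hypothetical example: a corrupted batch is never usable (domain property).
atoms = ["submitted", "corrupted", "usable"]
dom = [lambda v: not (v["corrupted"] and v["usable"])]
goal = lambda v: v["submitted"] and v["usable"]
corrupted = lambda v: v["corrupted"]
print(obstructs(corrupted, goal, dom, atoms))
```

    An obstacle that contradicts the domain properties (for example, "corrupted and usable" in this toy domain) satisfies the first condition vacuously but fails the second, which is why both conditions are needed.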

    Graphically, obstacles are depicted as reverse parallelograms and related to the goals they obstruct by negated arrows. Obstacles can be refined into sub-obstacles. The resulting model corresponds to a goal-anchored form of risk model. For example, Figure 2 shows the obstacle Daily Transactions Batch Not Submitted obstructing the goal Achieve [Transactions Submitted in Daily Batches] and its OR-refinement into sub-obstacles No Batch Submitted, Submitted Batch is Corrupted and Wrong Batch Submitted.

    Starting from an idealized goal model, a goal-obstacle analysis consists of

    1) identifying as many obstacles as possible by systematically considering all leaf goals and assumptions in the goal graph;

    2) assessing the relative importance of the identified obstacles in terms of their likelihood and impacts on top-level goals;

    3) resolving the more critical obstacles by modifying existing goals, requirements and assumptions, or by introducing new ones so as to prevent, reduce or tolerate the obstacles.

    For example, identifying the obstacle that the Bank System could provide a corrupted batch of transactions will generate additional requirements on the Alert Generator. This identify-assess-resolve loop is repeated until all remaining obstacles are considered acceptable without further resolution.

    3 SPECIFYING SCALABILITY REQUIREMENTS

    According to our definition of scalability, the elements required to define what it means for a system to be scalable are (1) the quality goals of the system; (2) the characteristics of the application domain and system design that are expected to vary, and their ranges; and (3) the acceptable levels of quality goal satisfaction under these variations. In the KAOS framework, these elements can be represented as follows:

    1) Quality goals correspond to goals whose specification includes domain-dependent objective functions.

    2) Expected variations of characteristics in the application domain correspond to a new kind of domain assumption that we call a scaling assumption.

    3) The acceptable levels of quality goal satisfaction under these variations are specified as target values for the quality goal objective functions. We refer to quality goals that are constrained by scaling assumptions as scalability goals.

    Note that, even though scalability is highly influenced by the system design and implementation, goals and assumptions related to scalability should be described in terms of domain phenomena only and should not refer to design or implementation concerns. Goals will impose constraints on the system design by making reference to assumptions about variations in the domain. Stating the constraints in terms of domain phenomena not only avoids over-specification, but also gives system designers more freedom in choosing the design that will satisfy these goals best. We describe these elements and their roles in more detail in the following sections.

    3.1 Specifying Scaling Assumptions

    We define a scaling assumption as a domain assumption specifying how certain characteristics in the application domain are expected to vary over time and across deployment environments. The latter dimension is especially important in the context of product families. The specification of scaling assumptions should therefore be defined in terms of the following items:

    1) one or more domain quantities whose expected variations are defined in the assumption;

    2) the periods of time and classes of system instances over which the assumption is defined; and

    3) the range of values each quantity is expected to assume for each system class over each period of time.

    For example, the scaling assumption Expected Batch Size Evolution presented below specifies how the number of transactions in daily batches submitted to the IEF is expected to vary over time and across bank categories.

    Assumption Expected Batch Size Evolution
    Category Scaling assumption
    Definition From 2011 to 2015, daily batches are expected to contain up to the following numbers of transactions for different bank categories:¹

      Bank      2011          until 2013     until 2015
      small     10,000        15,000         20,000
      medium    1 million     1.2 million    1.8 million
      large     50 million    55 million     60 million
      merger    80 million    85 million     95 million
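
    One way to make such an assumption checkable is to encode it as data. The sketch below stores the bounds from the assumption above in a dictionary and looks up the bound that applies to a given bank category and year. The names `EXPECTED_BATCH_SIZE`, `max_batch_size` and `within_assumption` are hypothetical, and reading the three columns as consecutive periods ending in 2011, 2013 and 2015 is an interpretation made for this sketch.

```python
FIRST_YEAR = 2011

# Upper bounds on daily batch size, keyed by bank category and then by the
# last year of each period (illustrative values copied from the assumption).
EXPECTED_BATCH_SIZE = {
    "small":  {2011: 10_000,     2013: 15_000,     2015: 20_000},
    "medium": {2011: 1_000_000,  2013: 1_200_000,  2015: 1_800_000},
    "large":  {2011: 50_000_000, 2013: 55_000_000, 2015: 60_000_000},
    "merger": {2011: 80_000_000, 2013: 85_000_000, 2015: 95_000_000},
}


def max_batch_size(category, year):
    """Return the assumed bound for the first period that covers the year."""
    if year < FIRST_YEAR:
        raise ValueError(f"no scaling assumption covers {year}")
    periods = EXPECTED_BATCH_SIZE[category]
    for end_year in sorted(periods):
        if year <= end_year:
            return periods[end_year]
    raise ValueError(f"no scaling assumption covers {year}")


def within_assumption(category, year, batch_size):
    """True if an observed batch size respects the scaling assumption."""
    return batch_size <= max_batch_size(category, year)


print(max_batch_size("medium", 2012), within_assumption("large", 2015, 61_000_000))
```

    Raising an error outside 2011 to 2015 reflects the point made in this section: without a scaling assumption there is no assumed bound at all, so the lookup has nothing to return.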

    Our definition of scaling assumption covers the case where variations of domain quantities refer to a single period of time and a unique class of system instances. For example, an alternative, simpler scaling assumption on the batch size could be the following:

    Assumption Expected Batch Size Evolution
    Category Scaling assumption
    Definition Over the next five years, daily batches of transactions for all banks are expected to vary between 10,000 and 95 million transactions.

    Following the KAOS classification of properties (Section 2.2), a scaling assumption could either be a descriptive scaling hypothesis, in which case the assumption is assumed to hold without being prescribed on any system agent, or a prescriptive scaling expectation, in which case some agent would be expected to restrict its behavior so as to ensure that the scaling assumption is satisfied. Scaling assumptions are not, however, software requirements, as they describe properties that are assumed to be true in the application domain rather than properties that must be enforced by the software-to-be.

    Scaling assumptions can be specified with varying levels of precision. For some systems, specifying orders of magnitude for some of the domain quantities may be sufficient to inform the early design decisions adequately at the requirements and software architecture levels. For others, one may need more precise figures, such as the ones in the above examples. In some cases, it may be useful to distinguish between the average and peak values of some of the domain quantities. The specification of a scaling assumption might even refer to the full probability distribution of a domain quantity under consideration (e.g., the distribution of the number of transactions in daily batches and the expected evolution in the distribution's parameters or moments over time).

    The ranges of values referenced in scaling assumptions must be elicited and negotiated with project stakeholders. These may be obtained by analyzing characteristics of the existing system, extrapolating characteristics from related systems, making predictions about future system evolutions, and eliciting and negotiating the ranges with stakeholders. A variety of requirements elicitation techniques is available to support such activities [8]. Applying such techniques to elicit these numbers of course remains difficult, due to uncertainties about future behaviors and divergences of opinion among system stakeholders. But surfacing the uncertainties and divergences explicitly during requirements elaboration is essential to an adequate analysis of a system's scalability requirements. A systematic approach for identifying what scaling assumptions need to be elicited, such as the scalability obstacle analysis technique to be described in Section 4, greatly facilitates this process.

    1. These numbers are fictitious and used for illustration only.

    A key feature of our approach is that the values to be estimated in scaling assumptions denote measurable, domain-specific quantities whose validity eventually can be checked in the running system. This contrasts with other quantitative techniques for specifying non-functional requirements that rely on estimating subjective quantities that have no domain-specific meaning [25], [26], [27], [28], [29] and can therefore never be validated empirically.

    3.2 The Role of Scaling Assumptions

    Like other kinds of domain assumptions, scaling assumptions support the refinement of goals towards requirements and expectations that can be satisfied by a single agent [8], [30]. Semantically, scaling assumptions express constraints on the ranges of values that domain quantities are expected to assume over specified time periods and classes of system instances. Logically, the absence of a scaling assumption for a domain quantity means that there is no assumed constraint on its possible values; that is, its value at any point in time potentially could be infinite.

    To illustrate this, consider again the goal Achieve [Batch Processed Overnight] in Section 2.3. In Figure 1, we had assigned that goal as the responsibility of the Alert Generator agent. The goal is indeed realizable by the agent, as it is defined entirely in terms of quantities that are monitored and controlled by that agent. Realizability implicitly assumes that agents have infinite resource capacities to perform the operations required to satisfy their goals. In practice, however, all agents have finite capacities that may be insufficient for satisfying the goal, which is a form of scalability obstacle. In this example, it will be impossible for the Alert Generator to guarantee the satisfaction of the goal for any size of daily batches to be processed. To obtain a goal whose satisfaction can be guaranteed by the Alert Generator alone, the goal Achieve [Batch Processed Overnight] needs to be refined into a scaling assumption on the size of daily batches, such as one of those given in the previous section, and a goal requiring daily batches to be processed within 8 hours only for batches satisfying the scaling assumption. Using the scaling assumption Expected Batch Size Evolution in Section 3.1, we obtain the following subgoal:

    Goal Achieve [Batch Processed Overnight Under Expected Batch Size Evolution]
    Category Performance goal, scalability goal
    Definition At the end of each day, the batch of transactions submitted by the bank should be processed in less than 8 hours, provided that the batch size does not exceed the bounds stated in the scaling assumption Expected Batch Size Evolution.

    This goal can now be satisfied by an Alert Generator with sufficiently large but nevertheless finite capacities (i.e., finite processing speed, memory and storage capacities).
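
    The effect of conditioning the goal on the scaling assumption can be sketched as follows. The function names are hypothetical, and the throughput estimate assumes, purely for illustration, that transactions are processed at a uniform rate over the 8-hour window; the 95 million bound is the largest figure in the Expected Batch Size Evolution assumption.

```python
def goal_violated(batch_size, proc_time_hours, size_bound, deadline_hours=8.0):
    """The scalability goal is violated only when a batch that respects the
    scaling assumption is not processed in less than the deadline."""
    respects_assumption = batch_size <= size_bound
    return respects_assumption and proc_time_hours >= deadline_hours


def required_throughput(size_bound, deadline_hours=8.0):
    """Transactions per second needed to finish the largest assumed batch in
    time, assuming a uniform processing rate (a simplification)."""
    return size_bound / (deadline_hours * 3600)


bound = 95_000_000
print(goal_violated(70_000_000, 7.5, bound))    # within bound, on time
print(goal_violated(120_000_000, 12.0, bound))  # bound exceeded: outside the goal's scope
print(round(required_throughput(bound)))
```

    A batch that exceeds the bound does not violate this goal; it violates the scaling assumption instead, which is the situation a scalability obstacle analysis is meant to surface.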

    3.3 Specifying Scalability Goals

    We define a scalability goal as a goal whose definition and required levels of goal satisfaction (as specified by its objective functions) make explicit reference to one or more scaling assumptions. The goal Achieve [Batch Processed Overnight Under Expected Batch Size Evolution] is a scalability goal because its definition refers to the scaling assumption Expected Batch Size Evolution.

    Our definition is deliberately at odds with a common intuition about scalability that consists of viewing a system as scalable if it can handle any increased workload in a cost-effective way [31]. In this more common view, the satisfaction of a scalability goal should not be restricted to bounds explicitly stated in scaling assumptions. Our definition, in contrast, takes the view that expected variations of relevant domain characteristics should always be stated explicitly in order to adequately inform a cost-effective design strategy.

    The specification of a scalability goal involves describing how the required level of goal satisfaction is allowed to vary when domain characteristics vary within the bounds specified in scaling assumptions. This goal specification takes different forms, depending on whether the goal satisfaction level is required to stay the same or is allowed to vary for different values of the domain characteristics. For example, for an air traffic control system, the required safety level would remain the same when the number of airplanes simultaneously present in a region increases, whereas for an online store, one may accept system performance to slow as the number of simultaneous requests increases.

    A scalability goal with fixed objectives is one in which the same level of goal satisfaction must be maintained under the full range of variations specified in the scaling assumptions. These are goals for which the goal definition, objective functions and target values are the same across the whole range of values considered in the scaling assumptions.

    A scalability goal with varying objectives is a scalability goal whose required level of goal satisfaction is not the same under the whole range of variations specified in the scaling assumptions. For such goals, the goal definition, objective function definitions, or target values differ across the range of values considered in the scaling assumptions.
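
The distinction between fixed and varying objectives can be illustrated with a small sketch. It assumes that a scaling assumption is represented as a numeric range and that a goal's required target is a function of the domain characteristic; the class names, goal names, and all numbers are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ScalingAssumption:
    """Bounds on one domain characteristic (hypothetical representation)."""
    name: str
    lower: float
    upper: float

    def covers(self, value: float) -> bool:
        return self.lower <= value <= self.upper


@dataclass(frozen=True)
class ScalabilityGoal:
    """A goal whose required target refers to a scaling assumption."""
    name: str
    assumption: ScalingAssumption
    target: Callable[[float], float]  # domain value -> required target

    def required_target(self, value: float) -> float:
        if not self.assumption.covers(value):
            raise ValueError(f"{value} is outside {self.assumption.name}")
        return self.target(value)

    def has_fixed_objectives(self, samples: list[float]) -> bool:
        """True if the target is identical for every sampled value in range."""
        return len({self.required_target(v) for v in samples}) == 1


# Fixed objectives: the same safety level whatever the number of airplanes.
airplanes = ScalingAssumption("Expected Airplanes In Region", 0, 50)
safety = ScalabilityGoal("Maintain [Safe Separation]", airplanes, lambda n: 1.0)

# Varying objectives: the allowed response time (seconds) grows with load.
requests = ScalingAssumption("Expected Simultaneous Requests", 0, 10_000)
response = ScalabilityGoal(
    "Achieve [Page Served Quickly]", requests,
    lambda n: 1.0 if n <= 1_000 else 3.0,
)

print(safety.has_fixed_objectives([0, 25, 50]))
print(response.has_fixed_objectives([500, 5_000]))
```

Values outside the stated bounds are rejected rather than silently accepted, which mirrors the view that the goal is only specified within its scaling assumption.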

    Scalability goals also can be extended to specify graceful degradation of goal satisfaction when the system operates outside its nominal scaling range. This can be done by specifying additional scaling assumptions, or additional scaling ranges within scaling assumptions, characterizing possible values for the domain characteristics outside of the initial scaling assumptions that the system must still be able to react to. More than one level of degradation could be specified if needed. Such cases are typically handled during the obstacle analysis steps of the goal-oriented requirements elaboration process that will be presented in Section 4.

    Finally, scalability goals can be specified at different levels in the goal refinement graph. Following the KAOS definitions, a scalability requirement is a scalability goal assigned to an agent in the software-to-be.

    4 SCALABILITY OBSTACLE ANALYSIS

    This section describes the process of elaborating requirements models that include specifications of scalability goals and scaling assumptions.

    As mentioned earlier, what it means for a system to be scalable is relative to other quality goals for the system. Identifying the primary quality goals of a system is therefore a prerequisite to the precise specification of its scalability goals. Our process for analyzing a system's scalability and elaborating scalability goals assumes that the other functional and quality goals of the system have been elaborated following a systematic goal-oriented requirements elaboration process [8].

    As was illustrated in Section 3.2, during incremental elaboration of requirements models, initial specifications of goals and assignments of goals to agents are typically idealized, in the sense that they are generally elaborated without explicit consideration of the scaling characteristics in the application domain and the limited capacities of agents for fulfilling the goals assigned to them.

    Starting from idealized goals and goal assignments is beneficial as it avoids premature compromises based on implicit and possibly incorrect perceptions of what might be possible to achieve [32]. However, later on in the requirements elaboration process, it is necessary to take into account what can be achieved realistically by the system agents and, if needed, modify the specifications of goals, requirements and expectations accordingly.

    Therefore, a natural way to deal with scalability during requirements elaboration consists of handling scalability concerns during the goal-obstacle analysis steps of the goal-oriented method. Starting from an initial goal model, our process for elaborating scalability requirements consists of the following activities:

    1) systematically identifying scalability obstacles that may obstruct the satisfaction of the goals, requirements and expectations elaborated so far;

    2) assessing the likelihood and criticality of those obstacles; and

    3) resolving the obstacles by modifying existing goals, requirements and expectations, or generating new ones so as to prevent, reduce or mitigate the obstacles.

    The resolution of scalability obstacles leads to the identification of scaling assumptions and scalability goals introduced in the previous section.

    4.1 Identifying Scalability Obstacles

    We define a scalability obstacle to some goal G as a condition that prevents the goal from being satisfied when the load imposed by the goal on agents involved in its satisfaction exceeds the capacity of the agents.

    In this definition, we define the concepts of goal load and agent capacity to denote domain-specific measures intended to characterize the amount of work needed to satisfy the goal and the amount of resources available to the agent to satisfy the goal, respectively. To identify concrete scalability obstacles, one must instantiate these concepts to domain-specific measures appropriate to the goals and agents under consideration.

    This definition suggests the following heuristic for identifying a scalability obstacle from a goal: for each goal, identify what defines the goal load, what agent resources are involved in its satisfaction, and consider an obstacle of the form Goal Load Exceeds Agent Capacity. This heuristic extends the existing catalog of obstacle generation heuristics in [8], [20].

    For example, consider again the goal Achieve [Batch Processed Overnight] assigned to the Alert Generator agent. The goal load in this case is the number of transactions in the batches to be processed, and the agent capacity is the Alert Generator's throughput measured as the number of transactions it can process per second. A scalability obstacle to this goal is therefore the condition Batch Size Exceeds Alert Generator Processing Speed. This obstacle's name, which refers to a goal load and agent capacity expressed in different units, is extended with a more precise obstacle definition, namely that the batch size exceeds the number of transactions that can be processed overnight given the alert generator's throughput, thus comparing terms expressed in the same unit.
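
The unit conversion behind the precise obstacle definition can be sketched as follows. This is a minimal illustration, assuming throughput is given in transactions per second and the overnight window in hours; the function names and all numeric values are hypothetical and are not taken from the IEF case study.

```python
def overnight_capacity(throughput_tps: float, window_hours: float) -> float:
    """Transactions the agent can process within the window (same unit as batch size)."""
    return throughput_tps * window_hours * 3600


def goal_load_exceeds_capacity(goal_load: float, agent_capacity: float) -> bool:
    """Obstacle condition of the form 'Goal Load Exceeds Agent Capacity'."""
    return goal_load > agent_capacity


# Illustrative numbers only: 500 transactions/second over an 8-hour window.
capacity = overnight_capacity(throughput_tps=500, window_hours=8)

print(capacity)
print(goal_load_exceeds_capacity(20_000_000, capacity))
print(goal_load_exceeds_capacity(10_000_000, capacity))
```

Converting the throughput into a number of transactions per overnight window is what allows the batch size and the agent capacity to be compared directly.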

    As another example, consider in Figure 1 the goal Achieve [Alerts Investigated and Acted Upon Quickly] assigned to human Fraud Investigators. This goal requires every generated alert either to be confirmed and resolved or to be cleared by a fraud investigator within a day of the alert being generated. The goal load in this case is the number of alerts generated per day, while the agent capacity is the number of alerts the team of fraud investigators is able to investigate per day. The scalability obstacle to this goal is Number of Alerts Exceeds Fraud Investigators' Capacity.


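    TABLE 1
    Scalability Obstruction Patterns

    Goal | Domain Property | Obstacle
    G : $\Box P$ | $\Box(P \rightarrow \mathit{Load}_G \leq \mathit{Capacity}_{Ag})$ | $\Diamond(\mathit{Load}_G > \mathit{Capacity}_{Ag})$
    G : $\Box(SA \rightarrow P)$ | $\Box(P \rightarrow \mathit{Load}_G \leq \mathit{Capacity}_{Ag})$ | $\Diamond(SA \wedge \mathit{Load}_G > \mathit{Capacity}_{Ag})$

    TABLE 2
    Pattern of Scalability Obstruction to Multiple Goals

    Goals | Domain Property | Obstacle
    $G_1, \ldots, G_n$ | $G_1 \wedge \ldots \wedge G_n \rightarrow (\mathit{Load}_{G_1} + \ldots + \mathit{Load}_{G_n} \leq \mathit{Capacity}_{Ag})$ | $\Diamond(\mathit{Load}_{G_1} + \ldots + \mathit{Load}_{G_n}) > \mathit{Capacity}_{Ag}$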
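
The obstruction argument behind the single-goal pattern can be spelled out as a short derivation. This is a sketch that assumes the pattern is stated in linear temporal logic, with $\Box$ read as "always" and $\Diamond$ as "eventually"; the step labels are added for illustration.

```latex
\begin{align*}
&\Box P
  && \text{(goal } G\text{)}\\
&\Box(P \rightarrow \mathit{Load}_G \leq \mathit{Capacity}_{Ag})
  && \text{(domain property)}\\
\Rightarrow\quad &\Box(\mathit{Load}_G \leq \mathit{Capacity}_{Ag})
  && \text{(from the two lines above)}\\
&\Diamond(\mathit{Load}_G > \mathit{Capacity}_{Ag})
  \;\equiv\; \neg\Box(\mathit{Load}_G \leq \mathit{Capacity}_{Ag})
  && \text{(obstacle)}
\end{align*}
```

Under these assumptions, the obstacle together with the domain property entails $\neg\Box P$, which is what it means for the obstacle to obstruct the goal.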

    4.2 Assessing Scalability Obstacles

    Once potential scalability obstacles have been identified, their criticality and likelihood should be assessed. As for other kinds of obstacles, the criticality of a scalability obstacle depends on its impact on higher-level goals and the importance of those goals.

    A lightweight technique to support this process consists of using a standard qualitative risk analysis matrix in which the likelihood of a scalability obstacle is estimated on a qualitative scale from Very Unlikely to Almost Certain and its criticality estimated on a scale from Insignificant to Catastrophic [8]. The goal refinement graph helps in estimating obstacle criticality by allowing one to follow goal-refinement links upward to identify all high-level goals affected by an obstacle.
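
A qualitative risk matrix of this kind can be sketched in a few lines. Only the end points of the two scales come from the text; the intermediate levels, the multiplicative score, and the threshold of 8 are assumptions made for illustration.

```python
# End points are from the text; intermediate levels are assumed.
LIKELIHOOD = ["Very Unlikely", "Unlikely", "Possible", "Likely", "Almost Certain"]
CRITICALITY = ["Insignificant", "Minor", "Moderate", "Major", "Catastrophic"]


def risk_score(likelihood: str, criticality: str) -> int:
    """Rank product on a 1..25 scale (a common convention, assumed here)."""
    return (LIKELIHOOD.index(likelihood) + 1) * (CRITICALITY.index(criticality) + 1)


def needs_resolution(likelihood: str, criticality: str, threshold: int = 8) -> bool:
    """Separate obstacles to resolve from those whose risk can be ignored."""
    return risk_score(likelihood, criticality) >= threshold


print(risk_score("Likely", "Major"))
print(needs_resolution("Likely", "Major"))
print(needs_resolution("Unlikely", "Minor"))
```

The threshold would in practice be agreed with stakeholders; its only role here is to show how the matrix supports the triage described above.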

    If there exists a quantitative goal model relating quality variables on software requirements and domain expectations to higher-level goals [23], then this model can be used to perform a more detailed quantitative analysis of the impact of obstacles on higher-level goals. However, developing such quantitative models requires more effort, and a detailed quantitative analysis is not always needed, as the main objective of the obstacle assessment step is to separate scalability obstacles for which resolutions need to be sought from those whose risk is so low that they can safely be ignored.

    4.3 Resolving Scalability Obstacles

    Scalability obstacles whose combined likelihood and criticality are considered serious enough must be resolved. The obstacle resolution process comprises two activities: the generation of alternative resolutions and the selection of resolutions among generated alternatives. Only the generation step is considered here; the selection among alternatives can be performed using qualitative or quantitative techniques covered elsewhere [8], [18], [23].

    Ideally, obstacle resolutions should be defined independently of any technology adopted to develop the system. Doing so allows the requirements analyst to explore a broader range of resolutions and to negotiate goals with the stakeholders. For example, the resolution adapt agent capacity at runtime according to load, described in Section 4.3.3, predicts the future value of the goal load at runtime and adjusts the agent capacity accordingly. Later in the development lifecycle, this resolution could be implemented using, say, a cloud, in which the agents would be deployed as cloud-based services that can be replicated to support a growing load. It is important to note, however, that the decision of which agents would be cloud-based and which requirements they would be responsible for is a subsequent activity to our method and should not be considered in the generation of alternative resolutions described in this section.
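
The resolution adapt agent capacity at runtime according to load can be sketched independently of any deployment technology. The forecast rule, the headroom factor, and the function names are assumptions made for illustration; a real system would substitute its own predictor and capacity model.

```python
import math


def predict_next_load(history: list[float]) -> float:
    """Naive forecast: last observation plus the most recent growth, if any.

    Assumes a non-empty history; the rule itself is a placeholder.
    """
    if len(history) < 2:
        return history[-1]
    return history[-1] + max(0.0, history[-1] - history[-2])


def instances_needed(predicted_load: float, capacity_per_instance: float,
                     headroom: float = 0.2) -> int:
    """Number of agent instances whose combined capacity covers the forecast."""
    required = predicted_load * (1 + headroom)
    return max(1, math.ceil(required / capacity_per_instance))


# Illustrative load history, e.g. transactions per night in arbitrary units.
forecast = predict_next_load([100.0, 150.0])
print(forecast)
print(instances_needed(forecast, capacity_per_instance=100.0))
```

The sketch covers only the sub-tactic that increases the number of agent instances; increasing the capacity of a single instance would replace the second function with a sizing rule for that instance.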

    In KAOS, obstacle resolution tactics are model transformation operators that resolve an obstacle by introducing new goals, assumptions, and responsibility assignments in the model, or by modifying existing ones. We have extended the existing catalog of obstacle resolution tactics [20] with new, specialized tactics to deal with scalability obstacles. Our tactics have been developed by systematically considering specializations of the eight general obstacle resolution strategies in [20]: goal substitution, agent substitution, obstacle prevention, goal weakening, obstacle reduction, goal restoration, obstacle mitigation, and do-nothing.

    The benefit of these tactics is that they encode expert knowledge about how to deal with scalability risks in a form that allows requirements analysts to explore systematically, during requirements elaboration, the range of alternative ways to deal with such risks.

    Table 3 provides an overview of the scalability obstacle resolution tactics. The first and second columns show the general obstacle resolution categories and descriptions, as defined by Lamsweerde and Letier [20]. The third column shows the specialized tactics and sub-tactics for resolving scalability obstacles.

    A complete, formal description of these tactics can be found in Duboc [33]. This section presents the main idea of each tactic. For each of the general obstacle resolution strategies, we briefly discuss its relevance to resolving scalability obstacles and, where appropriate, define new model transformation tactics that are specific to the resolution of scalability obstacles. The Do-Nothing strategy consists simply of leaving the obstacle unresolved and will not be considered further. This tactic should be applied only when the likelihood and criticality of the obstacle are considered low.

    TABLE 3
    Scalability Obstacle Resolution Tactics

    Strategy: Goal Substitution
    Short Description: Resolve the obstacle by finding alternative goals in which the obstructed goal is no longer needed
    Scalability Obstacle Resolution Tactic: Can be applied without specialization

    Strategy: Agent Substitution
    Short Description: Resolve the obstacle by changing the responsibility assignment for the obstructed goal
    Scalability Obstacle Resolution Tactic:
      - Transfer goal to non-overloaded agent
      - Automate responsibility of overloaded human agent
      - Split goal load among multiple agents
          - Split goal load into subtasks
          - Split goal load by case

    Strategy: Obstacle Prevention
    Short Description: Introduce new goals and assumptions to prevent the obstacle from occurring
    Scalability Obstacle Resolution Tactic:
      - Introduce scaling assumption
      - Introduce scalability obstacle prevention goal
          - Set agent capacity according to load
              - Set agent capacity upfront to worst-case load
              - Adapt agent capacity at runtime according to load
                  - Increase number of agent instances
                  - Increase the capacity of the agent instance
          - Limit goal load according to agent capacity
              - Limit goal load according to fixed agent capacity
                  - Distribute goal load over time
              - Limit goal load according to varying agent capacity

    Strategy: Obstacle Reduction
    Short Description: Introduce new goals to reduce the obstacle's likelihood
    Scalability Obstacle Resolution Tactic: Influence goal load distribution

    Strategy: Goal Weakening
    Short Description: Change the definition of an obstructed goal to make it more liberal so that the obstruction disappears
    Scalability Obstacle Resolution Tactic:
      - Weaken goal definition with scaling assumption
      - Weaken goal definition by strengthening scaling assumption
      - Weaken goal objective function
          - Relax real-time requirement
          - Relax required level of satisfaction

    Strategy: Goal Restoration
    Short Description: Introduce a new goal to restore the satisfaction of the obstructed goal when violated
    Scalability Obstacle Resolution Tactic: Can be applied without specialization

    Strategy: Obstacle Mitigation
    Short Description: Introduce a new goal to mitigate the consequences of an obstacle when it occurs
    Scalability Obstacle Resolution Tactic: Can be applied without specialization

    Strategy: Do-Nothing
    Short Description: Leave the obstacle unresolved
    Scalability Obstacle Resolution Tactic: No specialization needed

    4.3.1 Goal Substitution

    A first effective way to resolve an obstacle to some goal is to eliminate the obstacle completely by finding an alternative to the obstructed goal. For scalability obstacles, the principle is to try to find an alternative way to satisfy the system's higher-level goals in which the scalability risk no longer exists.

    Example. Consider the scalability obstacle Batch Size Exceeds Alert Generator's Processing Speed obstructing the goal Achieve [Batch Processed Overnight]. An application of the goal substitution strategy in this case is to design a system in which transactions are processed continuously, 24 hours per day, rather than in overnight batches. This is in fact a strategy that has been developed for a later version of the IEF, which now supports both modes of operation.

    4.3.2 Agent Substitution

    This tactic consists of eliminating the possibility for an obstacle to occur by assigning the responsibility for the obstructed goal to another agent. The following specializations of this tactic are useful for resolving scalability obstacles.

    Transfer goal to non-overloaded agent: This tactic consists of transferring responsibility for one or more of the goals $G_i$ assigned to an overloaded agent Ag to an alternative agent Ag' with bigger capacity or smaller load. Possible choices for the alternative agent can be explored systematically by considering all agents


    $\Box(\mathit{Load}_G \leq K)$ limiting the goal load value at any given time below K. This tactic also has two specializations:

    Limit goal load according to fixed agent capacity: For this tactic, K is a constant value representing a fixed agent capacity. One typical tactic to satisfy this subgoal is to distribute the goal load over time; for example, for a student enrollment system, different registration dates can be set for different student groups.

    Limit goal load according to varying agent capacity: This tactic is the dynamic counterpart of the previous one, where K is a time-varying variable. The principle here is to monitor or predict variations in the agent capacity at runtime and dynamically change the limit on the maximal goal load based on these variations.

    Example. To illustrate the use of these tactics for the exploration of alternative system designs, consider again the obstacle Batch Size Exceeds Alert Generator Processing Speed and its prevention through the goal Avoid [Batch Size Exceeds Alert Generator Processing Speed]. We consider each of the above goal refinement tactics in turn.

    1) The tactic set agent capacity upfront to worst-case load refines the goal into a scaling assumption Batch Size Below Max (where Max is a constant value) and a subgoal Maintain [Alert Generator Processing Speed Above Max], such that the alert generator would have a fixed capacity designed to deal with the largest possible batch size.

    2) The tactic adapt agent capacity at runtime according to load refines this goal into the subgoals Maintain [Accurate Batch Size Prediction] and Maintain [Alert Generator Processing Speed Above Predicted Batch Size]. Satisfaction of these adaptation goals can be assigned as the responsibility of human agents who have to manually upgrade the system, but they also could be partly automated by developing a self-managed system that would monitor and predict batch sizes and automatically allocate more resources as needed.

    3) The tactic limit goal load according to fixed agent capacity refines the goal into a requirement Maintain [Alert Generator Processing Speed Above K] (where K is a constant value; this can be translated to a hardware requirement on machines running the IEF) and a subgoal Maintain [Submitted Batch Size Below K] requiring the size of the submitted batch to remain within the alert generator capacity. There are in turn different ways in which this goal could be satisfied. It could, for example, be assigned as the responsibility of the Bank System, or we could introduce an additional agent between the Bank System and Alert Generator responsible for truncating batches of transactions that are too large, and for rescheduling the remaining transactions to a later time or redirecting them to another system.

    4) The tactic limit goal load according to varying agent capacity is the dynamic counterpart of the above refinement, in which the limit on the batch size submitted to the Alert Generator would vary at runtime based on observations of the actual processing speed of the Alert Generator (as opposed to a fixed processing speed that was estimated and defined once when the system was deployed).

    These alternative refinements would be modeled as OR-refinement links in the goal graph structure. Each alternative would have to be evaluated in terms of cost, risks, and its contribution to high-level goals. Such an evaluation would then be used to decide which alternative or combination of alternatives to select for implementation.
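
The two load-limiting refinements discussed above can be sketched as follows. The sketch assumes an intermediate agent that truncates an oversized batch and defers the remainder, and derives the varying limit from an observed throughput; the function names and numbers are hypothetical.

```python
def limit_batch(batch: list, max_size: int) -> tuple[list, list]:
    """Split a submitted batch into the part processed now and the deferred rest."""
    if max_size < 0:
        raise ValueError("max_size must be non-negative")
    return batch[:max_size], batch[max_size:]


def current_limit(observed_tps: float, window_hours: float) -> int:
    """Varying-capacity variant: derive the limit K from observed throughput."""
    return int(observed_tps * window_hours * 3600)


# Fixed capacity: K is a constant chosen once (illustrative value).
accepted, deferred = limit_batch(list(range(10)), max_size=6)
print(accepted, deferred)

# Varying capacity: K is recomputed from runtime observations.
k = current_limit(observed_tps=2.0, window_hours=1.0)
accepted_now, deferred_now = limit_batch(list(range(10_000)), max_size=k)
print(k, len(accepted_now), len(deferred_now))
```

What happens to the deferred transactions (rescheduling or redirection to another system) is left open here, as it is in the text.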

    4.3.4 Obstacle Reduction

    Obstacle reduction consists of introducing a new goal that aims to reduce the likelihood of the obstacle occurring. Typically, this technique plays on factors that encourage or dissuade human agents to behave in certain ways. For scalability obstacles, one common obstacle reduction tactic is to influence the distribution of goal load. Consider, for example, an e-commerce website wishing to announce a sale. It can send notification e-mails to registered customers on different dates rather than all on the same date, assuming that customers are more likely to shop as soon as they receive the e-mail. Another typical example is the tactic used in some large office buildings to reduce morning congestion in their elevators by spreading out employees' start times on different floors.
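
The e-mail example can be sketched as a simple round-robin schedule. The function name, the customer identifiers, and the dates are hypothetical; the sketch only shows how spreading notifications over several dates bounds the number sent on any one day.

```python
from datetime import date, timedelta


def stagger(customers: list[str], start: date, days: int) -> dict[date, list[str]]:
    """Assign customers to consecutive notification dates in round-robin order."""
    if days <= 0:
        raise ValueError("days must be positive")
    schedule: dict[date, list[str]] = {
        start + timedelta(days=i): [] for i in range(days)
    }
    for index, customer in enumerate(customers):
        schedule[start + timedelta(days=index % days)].append(customer)
    return schedule


plan = stagger(["a", "b", "c", "d", "e"], start=date(2012, 1, 2), days=2)
for day, recipients in plan.items():
    print(day.isoformat(), recipients)
```

Whether this actually spreads the shopping load depends on the behavioral assumption stated in the text, namely that customers tend to shop soon after receiving the e-mail.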

    4.3.5 Goal Weakening

    Sometimes a goal is found to be obstructed by an obstacle simply because the initial goal definition is too strong. In this case, one way to resolve the obstruction is to deidealize the goal, that is, to change the goal definition to make it more liberal so that the obstruction is no longer present. This process is supported by a set of formal and informal goal transformation tactics [20]. We have defined the following new transformation tactics for resolving scalability obstacles through goal weakening:

    Weaken goal definition with scaling assumption: This tactic transforms an obstructed goal into a more liberal goal, requiring the original goal to be satisfied only when its load does not exceed the agent capacity. It should be applied when a goal G that does not refer to any scaling assumption is obstructed by a scalability obstacle Goal Load Exceeds Agent Capacity. In this tactic, the original goal is transformed into the goal named G When Goal Load Does Not Exceed Agent Capacity, where the name corresponds to a scaling assumption. The formal transformation pattern associated with this tactic is given by the first row in Table 4. It states that a goal of the form $\Box P$ obstructed by a scalability obstacle $\Diamond(\mathit{Load}_G > \mathit{Capacity}_{Ag})$ is transformed into a goal of the form $\Box(\mathit{Load}_G \leq \mathit{Capacity}_{Ag} \rightarrow P)$.

    Note that the definition of the resulting goal is the same as the definition of the subgoal that would be obtained by applying the tactic introduce scaling assumption. The difference is in terms of how the weakening is propagated, as we illustrate on the example below.

    TABLE 4
    Goal Deidealization Patterns for Scalability Obstacles

    Goal | Scalability Obstacle | Deidealized Goal
    G : $\Box P$ | $\Diamond(\mathit{Load}_G > \mathit{Capacity}_{Ag})$ | $\Box(\mathit{Load}_G \leq \mathit{Capacity}_{Ag} \rightarrow P)$
    G : $\Box(SA \rightarrow P)$ | $\Diamond(SA \wedge \mathit{Load}_G > \mathit{Capacity}_{Ag})$ | $\Box(SA \wedge \mathit{Load}_G \leq \mathit{Capacity}_{Ag} \rightarrow P)$

    Weaken goal definition by strengthening the scaling assumption: This tactic has the same objective as the one above, but it should be applied when the obstructed goal already refers to a scaling assumption, that is, when a scalability goal G When SA referring to a scaling assumption SA is obstructed by an additional scalability obstacle. Applying this tactic transforms the original goal into the goal G When Goal Load Is Within SA and Does Not Exceed Agent Capacity. The formal transformation pattern associated with this tactic is given by the second row in Table 4.

    Once a goal definition has been weakened with one of the two tactics above, the change needs to be propagated along the refinement links in the goal graph following the general procedure described by Lamsweerde and Letier [20]. In the case of these tactics, the propagation typically is performed by weakening the definitions of parent goals using the same patterns in Table 4.

    Weaken goal objective function: This tactic involves weakening the goal's objective function or required levels of satisfaction so that it requires less agent capacity for its satisfaction. For Achieve goals of the form $\Box(C \rightarrow \Diamond_{\leq d}\, T)$, a specialized tactic for goal weakening is to relax real-time requirement by increasing the time d for achieving the condition T. The relax required level of satisfaction specialization involves relaxing the minimum value to be achieved in order for the goal objective function to be maximized (i.e., the target values of the goal's objective function).

    Example. Applying the tactic weaken goal definition with scaling assumption to the goal Achieve [Batch Processed Overnight] generates the weaker goal Achieve [Batch Processed Overnight When Batch Size Does Not Exceed Alert Generator's Speed]. This change would be propagated upward in the goal model of Figure 1 to yield the goal Achieve [Alert Generated When Batch Size Does Not Exceed Alert Generator's Speed]. The purpose of considering such a tactic is to explore systematically the space of possible resolutions. In this example, it is likely that such resolutions will not be acceptable to system stakeholders. Consider also the goal Achieve [Batch Processed Overnight Under Expected Batch Size Evolution] (specified in Section 3.2). This goal is specified with the following fixed objective function:

    Objective Functions At least 9 out of 10 batches should be processed within 8 hours, and all batches should be processed within 9.6 hours.

    Name | Definition | Target
    % of Batches Processed in 8 hours | $\Pr(\mathit{processingTime} \leq 8\mathrm{h} \mid \mathit{ExpectedBatchSizeEvolution})$ | 90%
    % of Batches Processed in 9.6 hours | $\Pr(\mathit{processingTime} \leq 9.6\mathrm{h} \mid \mathit{ExpectedBatchSizeEvolution})$ | 100%

    Assume that the scalability obstacle Expected Batch Size Evolution Exceeds Alert Generator Processing Speed obstructing this goal is considered likely to occur. In order to resolve the obstacle, the tactic relax real-time requirement would weaken the goal by increasing the expected processing time from 8 to 10 hours. Alternatively, the tactic relax required level of satisfaction would weaken the goal by allowing for less than 90% of the batches to be processed in 8 hours.
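
The effect of relaxing the required level of satisfaction can be checked against observed processing times. This sketch estimates the probabilities in the objective functions as empirical fractions, which is an assumption; the sample data and the relaxed 80% target are invented for illustration.

```python
def fraction_within(processing_hours: list[float], limit_hours: float) -> float:
    """Empirical share of batches processed within the given time limit."""
    within = sum(1 for hours in processing_hours if hours <= limit_hours)
    return within / len(processing_hours)


def objectives_met(processing_hours: list[float],
                   targets: list[tuple[float, float]]) -> bool:
    """Check every (time limit in hours, required fraction) pair."""
    return all(fraction_within(processing_hours, limit) >= required
               for limit, required in targets)


# Targets stated in the goal: 90% within 8 hours, 100% within 9.6 hours.
ORIGINAL_TARGETS = [(8.0, 0.90), (9.6, 1.00)]
# Hypothetical weakening via 'relax required level of satisfaction'.
RELAXED_TARGETS = [(8.0, 0.80), (9.6, 1.00)]

# Invented processing times (hours) for ten batches.
observed = [7.0, 7.5, 7.9, 8.0, 7.2, 6.8, 7.7, 7.4, 9.0, 9.5]

print(fraction_within(observed, 8.0))
print(objectives_met(observed, ORIGINAL_TARGETS))
print(objectives_met(observed, RELAXED_TARGETS))
```

With these invented data, the original targets are missed while the relaxed ones are met, which is the trade-off a stakeholder would be asked to accept when this tactic is chosen.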

    4.3.6 Goal Restoration and Obstacle Mitigation

    In some cases, it may be impossible or too costly to guarantee that all scalability obstacles will be avoided. One should therefore consider how to tolerate a scalability obstacle or mitigate its consequences when it occurs. Two general tactics here are goal restoration and obstacle mitigation.

    In goal restoration, a new goal of the form $\Box(\neg P \rightarrow \Diamond P)$ is added to the model to eventually restore the condition P of the obstructed goal. In obstacle mitigation, a new goal is added to attenuate the consequences of an obstacle occurrence. This typically involves ensuring the satisfaction of a weaker version of the obstructed goal or of an ancestor of the obstructed goal. Consider, for example, a call center with the goal Achieve [Calls Answered Quickly] stating that calls should be answered in less than 3 minutes. If in a given day, the call center receives an unusually high number of calls, this goal is obstructed by the scalability obstacle Number of Calls Exceeds Customer Service Team Capacity. A tactic to mitigate this obstacle might be to create a new goal Achieve [Return Call] which ensures a weaker version of the obstructed goal stating that a call must be taken in less than 3 minutes or else the details of the customer should be recorded and the call returned in less than half an hour.


    5 CASE STUDY

    We have applied the scalability obstacle analysis techniques described in the previous section to the elaboration of the scalability requirements of a major redesign of the IEF system. One of the main objectives of this redesign was to extend the existing batch-based system with functionality to process transactions in a continuous stream.

    This case study follows a previous case study reported by Duboc et al. [5] in which we elaborated a KAOS goal model for the batch-based version of IEF. The first case study occurred before our development of the scalability obstacle analysis techniques. The elaboration of the goal model for the new system and its subsequent scalability obstacle analysis were performed by the first author at the invitation of the IEF project manager after the success of the first case study. Having worked at Searchspace for three and a half years, she had good knowledge of the application domain and previous system versions.

    Unlike a retrospective study of a fully implemented system with well understood requirements, this study involved a system under development, with all the complexities normally present during the initial elaboration of a system's requirements. Stakeholders were busy with their own daily tasks and had limited time for requirements elicitation and validation; knowledge about the system was spread across individuals; many assumptions originated from experience with previous projects that may not have been valid for the new system; documents were contradictory and incomplete; and many requirements had no clear rationale other than being present in older versions of the IEF.

    Our case study lasted for a period of approximately 30 working days, including interviews, brainstorming sessions, modeling, data characterization, and document writing and reviewing. During the first two days, the analyst built an initial high-level model based on her own domain knowledge to serve as a starting point for discussions. The next 25 days were spent elaborating the initial, idealized goal model and applying the scalability obstacle identification and assessment activities described in the previous sections. The last five days were spent exploring obstacle resolutions and producing the final documentation.

    The initial goal model was elaborated from a number of sources: the goal model for the previous batch-based system, a business requirements specification that already had been written for the new system, and information gathered from interviews with project stakeholders during the goal model elaboration. The initial business requirements specification document contained mostly functional requirements specified through use cases, and a single unjustified estimate of the number of transactions the system should be able to process per day. The IEF developers judged this document insufficient to inform the software design adequately.

    The full goal model that we elaborated cannot be shown here due to its size and the sensitivity of the information it contains. Sanitized extracts from the model can be found in Duboc [33]. A portion of the goal model on which we illustrate the scalability obstacle analysis techniques is shown in Figures 5 to 7 in Appendix A.

    5.1 Identifying Scalability Obstacles

    We followed the process defined in Section 4.1 to identify scalability obstacles systematically from goals assigned to agents. Table 5 illustrates the result of this process by showing the scalability obstacles generated from some of the responsibility assignments included in the goal model of Appendix A.

    In this table, the Alert Generator agent is shown to be responsible for three goals: Achieve [Batch Processed Overnight], Achieve [Real-time Transactions Processed Instantaneously] and Achieve [Fraud Patterns Tested on Past Transactions]. The first goal is similar to the one defined in Section 2.2, the second goal bounds the processing time for transactions that require immediate clearing, and the third goal requires that every new fraud detection rule should be tested against past transactions.

    For these three goals, we have defined the measure of the goal load to be the number of transaction checks (txn check), where a transaction check corresponds to the application of a single fraud detection rule to a single transaction. The load imposed by these goals on the Alert Generator is therefore proportional to the number of transactions to be checked and the number of fraud detection rules to be applied on each transaction. We thereby obtained the scalability obstacle Number of Txn Checks Exceeds Alert Generator Processing Speed obstructing these three goals. In defining this scalability obstacle, we further took into consideration the fact that not all fraud detection rules take the same time to compute. For example, fraud detection rules that consider historical information about a set of related accounts take much longer to process than simple comparisons of transactions against blacklisted account numbers.

    Other domain characteristics, such as the percentage of new accounts in a daily batch or stream, also influence the goal load and therefore were included in the definition of this scalability obstacle. Thus, unlike our simple illustrative example in Section 4.1 where the load of the goal Achieve [Batch Processed Overnight] was a simple scalar value (the number of transactions), in our full model, this load is defined as a vector composed of several domain characteristics. Significant knowledge about the application domain and fraud detection rules was necessary to identify such additional characteristics.
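
The move from a scalar load to a load built from several characteristics can be sketched as follows. The rule classes, their counts, and the relative costs are hypothetical; the sketch only illustrates how the number of transaction checks and the differing cost of rules combine into a load measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleClass:
    """A group of fraud detection rules with similar cost (hypothetical)."""
    name: str
    rule_count: int
    cost_per_check: float  # relative processing cost of one transaction check


def txn_checks(transactions: int, rule_classes: list[RuleClass]) -> int:
    """One check is one rule applied to one transaction."""
    return transactions * sum(rc.rule_count for rc in rule_classes)


def weighted_load(transactions: int, rule_classes: list[RuleClass]) -> float:
    """Checks weighted by the relative cost of each rule class."""
    return transactions * sum(rc.rule_count * rc.cost_per_check
                              for rc in rule_classes)


rules = [
    RuleClass("blacklist comparison", rule_count=10, cost_per_check=1.0),
    RuleClass("related-accounts history", rule_count=2, cost_per_check=25.0),
]

print(txn_checks(1_000, rules))
print(weighted_load(1_000, rules))
```

Further characteristics mentioned in the text, such as the percentage of new accounts, would add further components to this measure; they are omitted here because the text does not say how they enter the load.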

    As shown in Table 5, the obstacle identification process also led us to identify scalability obstacles to agents other than the Alert Generator. In particular, the number of generated alerts and the time needed to handle an alert are important factors whose variations could prevent Fraud Investigators from satisfying their goals.

    In our model, the Data Storage agent is separate from the Alert Generator because it corresponds to a given domain component the Alert Generator must interact with. From the goals shown in Table 5, we generated the scalability obstacle Amount of Data Exceeds Data Storage Space whose definition refers to domain quantities such as the number and size of transactions and generated alerts.

    TABLE 5
    Scalability Obstacles to IEF Goals

    Agent               Goal                                                        Scalability Obstacle
    Alert Generator     Achieve [Batch Processed Overnight]                         Number of Txn Checks Exceeds Alert Generator Processing Speed
    Alert Generator     Achieve [Real-time Transactions Processed Instantaneously]  Number of Txn Checks Exceeds Alert Generator Processing Speed
    Alert Generator     Achieve [Fraud Patterns Tested on Past Transactions]        Number of Txn Checks Exceeds Alert Generator Processing Speed
    Data Store          Achieve [Incoming Transactions Stored]                      Amount of Data Exceeds Data Storage Space
    Data Store          Achieve [Alerts Stored]                                     Amount of Data Exceeds Data Storage Space
    Data Store          Achieve [Fraud Detection Rules Stored]                      Amount of Data Exceeds Data Storage Space
    Fraud Investigator  Achieve [Alerts Investigated and Acted Upon Quickly]        Number of Alerts Exceeds Fraud Investigator Speed

    Our systematic identification of scalability obstacles led us to uncover 13 important domain characteristics that have an impact on the system's scalability. All these characteristics except one had been overlooked in the initial business requirements specification, and their impact on scalability had never been made fully explicit to the main system stakeholders. The benefit of exposing these scaling characteristics became apparent when eliciting scaling assumptions on these variables and discovering that stakeholders had diverging views about the ranges of values such variables may take, and therefore about the software scalability requirements that depend on these assumptions.

    Two important observations can be made from our experience in identifying scalability obstacles for the IEF system:

    1) The completeness of the generated scalability obstacles is relative to the completeness of the goal model. An important concern when identifying scalability obstacles is to try to identify all domain quantities whose variations have an impact on the system's ability to satisfy its goals. The systematic process in which scalability obstacles are generated for each agent and every goal assigned to it ensures completeness of the identified obstacles with respect to specified software requirements and domain expectations. By completeness here, we mean that a scalability obstacle of the form Goal Load Exceeds Agent Capacity has been considered for all leaf goals and agent resources that have been identified in the model. This completeness is of course relative to the quality of the initial goal model; if some important agents, requirements or domain expectations are missing, it is likely that their scalability obstacles will be overlooked as well.

    2) Detailed domain knowledge is needed to define fine-grained scalability obstacles. Defining what constitutes the load of a particular goal was not always straightforward and could not be derived simply from the goal definition alone. It often required a good knowledge of the application domain and of the factors affecting the computational complexities of the functions to be performed for satisfying the goals. This was illustrated above for the goals assigned to the Alert Generator. In our example, it required good knowledge of the complexities of the different alert generation rules used in fraud detection systems. When this knowledge is not available, scalability obstacles still can be identified from the goal definition, but the scalability obstacles as defined remain much more coarse-grained. For example, without expert knowledge, our scalability obstacles to the Alert Generator's goals would have been defined in terms of numbers of transactions only, as was done in Section 4.1. Similarly, the scalability obstacles we have identified for the goals assigned to the Fraud Investigator agents remain quite coarse-grained, because during our case study we lacked access to the knowledge of what domain characteristics affect the speed at which alerts are investigated. A benefit of our scalability obstacle identification process is that it structures the elicitation activities that need to be performed for identifying detailed scalability obstacles.
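The systematic generation step described in the first observation can be sketched as a loop over agents and their leaf goals. This is only an illustration under simplifying assumptions: the function name and the dictionaries are hypothetical, and the load and capacity labels are keyed per agent (as in Table 5) rather than per goal.

```python
def generate_scalability_obstacles(goals_by_agent, load_by_agent, capacity_by_agent):
    """Return one (agent, goal, obstacle) triple per leaf goal assigned to each agent.

    Each obstacle follows the pattern 'Goal Load Exceeds Agent Capacity'.
    """
    obstacles = []
    for agent, goals in goals_by_agent.items():
        obstacle = f"{load_by_agent[agent]} Exceeds {capacity_by_agent[agent]}"
        for goal in goals:
            obstacles.append((agent, goal, obstacle))
    return obstacles


# Agents, goals, and obstacle wording as listed in Table 5.
goals_by_agent = {
    "Alert Generator": [
        "Achieve [Batch Processed Overnight]",
        "Achieve [Real-time Transactions Processed Instantaneously]",
        "Achieve [Fraud Patterns Tested on Past Transactions]",
    ],
    "Data Store": [
        "Achieve [Incoming Transactions Stored]",
        "Achieve [Alerts Stored]",
        "Achieve [Fraud Detection Rules Stored]",
    ],
    "Fraud Investigator": ["Achieve [Alerts Investigated and Acted Upon Quickly]"],
}
load_by_agent = {
    "Alert Generator": "Number of Txn Checks",
    "Data Store": "Amount of Data",
    "Fraud Investigator": "Number of Alerts",
}
capacity_by_agent = {
    "Alert Generator": "Alert Generator Processing Speed",
    "Data Store": "Data Storage Space",
    "Fraud Investigator": "Fraud Investigator Speed",
}

obstacles = generate_scalability_obstacles(goals_by_agent, load_by_agent, capacity_by_agent)
for agent, goal, obstacle in obstacles:
    print(f"{agent}: {goal} <- {obstacle}")
```

The sketch also makes the completeness caveat visible: an agent or goal missing from `goals_by_agent` yields no obstacle at all.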

    5.2 Assessing Scalability Obstacles and Eliciting Scaling Assumptions

    Once identified, scalability obstacles need to be assessed in terms of their likelihood and criticality, so that decisions can be made about which obstacles need to be resolved. Since obstacle analysis is an iterative process, scalability obstacles need to be reassessed after their resolutions. During this reassessment, obstacle likelihoods and criticality are also used to guide the selection among alternative resolutions.

    In a first assessment, we considered all identified scalability obstacles equally likely to happen. Since we had no information about the potential scaling characteristics of the various domain quantities, we pessimistically assumed that they all could rise to levels that would lead to violations of idealized goals.


    TABLE 6
    Criticality of Scalability Obstacles

    Scalability Obstacle                                             Likelihood  Criticality
    Number of Txn Checks Exceeds Alert Generator Processing Speed    High        High
    Amount of Data Exceeds Data Storage Space                        High        High
    Number of Alerts Exceeds Fraud Investigator Speed                High        Moderate

    The criticality of the identified scalability obstacles was estimated by the first author and project manager by identifying which higher-level goals would be violated if the obstacles were to occur. Table 6 shows this qualitative assessment for three obstacles.
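The criticality estimate described above amounts to walking the goal model upward from an obstructed goal. The following sketch assumes the refinement links are available as a simple child-to-parents mapping; the two higher-level goal names are hypothetical placeholders, since the IEF goal graph is not reproduced here.

```python
def goals_at_risk(obstructed_goal: str, parents: dict[str, list[str]]) -> list[str]:
    """Follow refinement links bottom-up and return every ancestor of the obstructed goal."""
    seen: list[str] = []
    frontier = [obstructed_goal]
    while frontier:
        goal = frontier.pop()
        for parent in parents.get(goal, []):
            if parent not in seen:
                seen.append(parent)
                frontier.append(parent)
    return seen


# Hypothetical fragment of a goal model: leaf goal -> parent goals.
parents = {
    "Achieve [Batch Processed Overnight]": ["Achieve [Fraud Detected Promptly]"],      # hypothetical parent
    "Achieve [Fraud Detected Promptly]": ["Avoid [Financial Loss Due to Fraud]"],      # hypothetical parent
}

print(goals_at_risk("Achieve [Batch Processed Overnight]", parents))
```

An analyst would then judge criticality from how important the returned higher-level goals are to the business.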

    After this first assessment, we started eliciting scaling assumptions for all domain quantities involved in the scalability obstacles and updated the goal model by applying the tactic "introduce scaling assumption" to the obstructed goals. This tactic generated new obstacles that had to be assessed as well. We did not yet consider other scalability obstacle resolution tactics, because we felt that we needed first to understand the scaling assumptions of the variables involved before investigating other resolution tactics.

    For example, the goal Achieve [Batch Processed Overnight] was refined into the scaling assumption and scalability requirement shown in Figure 4. The figure also shows new obstacles to the scaling assumption and the scalability requirement. Assessing the likelihoods of these two obstacles consists of (1) asking how likely it is that the number of transaction checks in a batch will exceed the assumed maximum value, and (2) how likely it is that the batch will not be processed in time despite the fact that the number of transaction checks is below the assumed maximum value, respectively.

    Fig. 4. Obstacles to a Scaling Assumption and a Scalability Requirement.
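The distinction between the two obstacles can be made concrete with a small sketch. The label for the obstacle to the scaling assumption is a hypothetical name chosen for illustration, and treating a load equal to the assumed maximum as within the assumption is also an assumption of this sketch.

```python
from typing import Optional

# Hypothetical label for the obstacle to the scaling assumption.
ASSUMPTION_OBSTACLE = "Nbr Txn Checks Exceeds Assumed Max"
# Obstacle to the scalability requirement.
REQUIREMENT_OBSTACLE = "Batch NOT Processed In Time When Nbr Txn Checks Below Assumed Max"


def occurring_obstacle(nbr_txn_checks: int, assumed_max: int,
                       processed_in_time: bool) -> Optional[str]:
    """Classify one observed batch against the scaling assumption and the requirement."""
    if nbr_txn_checks > assumed_max:
        # The domain scaled beyond what was assumed; the requirement is not to blame.
        return ASSUMPTION_OBSTACLE
    if not processed_in_time:
        # The load was within the assumed range, yet the software missed its deadline.
        return REQUIREMENT_OBSTACLE
    return None


print(occurring_obstacle(120, assumed_max=100, processed_in_time=False))
print(occurring_obstacle(80, assumed_max=100, processed_in_time=False))
print(occurring_obstacle(80, assumed_max=100, processed_in_time=True))
```

Separating the two cases this way shows why the first likelihood is a question for domain experts while the second is a question for developers.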

    We also asked developers to estimate how feasible it would be to develop a system that would meet the scalability requirements given these scaling assumptions. When there was a perceived risk that the scalability requirements may not be achievable, this signaled the need to investigate additional resolutions for the scalability obstacles to these requirements. The exploration of these resolutions will be discussed in the next section.

    From our experience in assessing scalability obstacles, we can make the following observations.

    1) Assessing scalability obstacle likelihoods and eliciting scaling assumptions are strongly related activities. The standard identify-assess-resolve loop of obstacle analysis assumes that obstacle likelihoods can be assessed before deciding on their resolution. This proved to be impossible for the first assessment of scalability obstacles, because assessing how likely a scalability obstacle is to occur requires knowledge of the scaling characteristics of the domain quantities concerned by the obstacles. For this reason, we deviated from the standard obstacle analysis process, and combined the tasks of assessing scalability obstacles and eliciting scaling assumptions as described above.

    2) Another difficulty we faced was that the capacity of some agents is still unknown when the obstacle likelihood needs to be assessed. Indeed, one of the purposes of the requirements engineering process is to decide what the required capacities for these agents should be. Such analysis should take into consideration the varying likelihood of scalability obstacles to different envisioned agent capacities. For example, the scalability obstacle Batch NOT Processed In Time When Nbr Txn Checks Below Assumed Max in Figure 4 required asking IEF developers to assess the likelihood of this obstacle occurring for feasible Alert Generator capacities (i.e., its processing speed). However, we feel that a technique for more thorough analysis of the variation of obstacle likelihood with respect to the variation of feasible agent capacities is needed, and this is a subject of future research.

    3) Eliciting and reaching agreements on the projected values for domain quantities in the scaling assumptions proved to be the most difficult and time consuming tasks of our case study. This was due to the uncertainties and divergence of opinions among stakeholders about future values for some of the domain characteristics, and to large variations in the customer base that led people to make different (often implicit) assumptions. There was also a lack of awareness by non-software engineers of the impact that some of the projected values would have on the software design, leading sometimes to infeasible or extremely costly requirements due to unrealistically high scaling characteristics. To help resolve these divergences, we created a simple quantitative goal model [23] relating projected values for the domain quantities involved in scaling assumptions (e.g., the expected growth rate in the number of accounts and number of transactions per account) to scalability requirements for the software system (e.g., in terms of throughput and storage). This model was used during stakeholder meetings to help reach agreement on the scaling assumptions.

    4) The qualitative assessment of the scalability obstacles' criticality was straightforward and aided by the goal model structure. This simply consisted of following goal-refinement links bottom-up from the obstructed goal to higher-level business goals.

    5) Qualitative vs. quantitative assessment of scalability obstacles. Our assessment of scalability obstacles remained entirely qualitative. A finer-grained quantitative assessment of the likelihood of violations of scaling assumptions was felt to be too uncertain to be of value. Accurate quantitative assessments of the likelihood of meeting the scalability requirements are also difficult to obtain. These potentially could be obtained by analyzing models of the intended software architecture or by testing system prototypes, but such analyses were not performed. A qualitative assessment of scalability obstacles was sufficient for our purposes of identifying what the important scalability obstacles are and how these can be resolved through the introduction of new goals and requirements. We note that similar observations about the limitation of quantitative assessments and benefits of qualitative assessments have been made about the use of fault trees for safety-critical systems [11]. However, our case study did not involve detailed comparisons of alternative system designs for which a more precise, quantitative assessment of scalability obstacles might be needed. This is an area where further research is required.
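The kind of simple quantitative model mentioned in the third observation, relating projected domain quantities to throughput and storage requirements, might look like the sketch below. The formulas (compound annual growth, a fixed batch window, a fixed retention period) and all numbers are assumptions made for illustration; they are not the model used in the case study.

```python
def projected_accounts(current_accounts: float, annual_growth_rate: float, years: int) -> float:
    """Number of accounts after compound annual growth (assumed growth model)."""
    return current_accounts * (1.0 + annual_growth_rate) ** years


def required_throughput(accounts: float, txns_per_account_per_day: float,
                        rules_per_txn: int, batch_window_hours: float) -> float:
    """Transaction checks per second needed to finish a daily batch within the window."""
    daily_checks = accounts * txns_per_account_per_day * rules_per_txn
    return daily_checks / (batch_window_hours * 3600.0)


def required_storage_bytes(accounts: float, txns_per_account_per_day: float,
                           bytes_per_txn: int, retention_days: int) -> float:
    """Storage needed to keep every transaction for the retention period."""
    return accounts * txns_per_account_per_day * bytes_per_txn * retention_days


# Two hypothetical stakeholder views of growth over five years.
for label, growth in [("conservative", 0.05), ("optimistic", 0.30)]:
    accounts = projected_accounts(2_000_000, growth, years=5)
    checks_per_s = required_throughput(accounts, 3, rules_per_txn=50, batch_window_hours=6)
    storage = required_storage_bytes(accounts, 3, bytes_per_txn=500, retention_days=365)
    print(f"{label}: {accounts:,.0f} accounts, {checks_per_s:,.0f} checks/s, {storage / 1e12:.2f} TB")
```

Running both projections side by side is one way to show stakeholders how strongly a disagreement about growth rates propagates into the software requirements.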

    5.3 Resolving Scalability Obstacles

    For resolving scalability obstacles, we have used the catalog of scalability obstacle resolution tactics in Section 4.3 to explore systematically some potential resolutions of a few key scalability obstacles that were judged to be critical. Some of the resolutions we identified correspond to design decisions that had been taken independently of our study by Searchspace for the new system version. Other resolutions correspond to alternative and additional requirements that had not been considered or had been considered but not retained. For the case study, we focussed on generating a wide range of resolutions for the identified scalability obstacles. We did not perform detailed evaluations of the generated alternatives in order to guide the selection of the preferred ones.

    The process of generating resolutions to scalability obstacles consists of systematically considering the potential application of each resolution tactic in Table 3 to each obstacle. To illustrate this process, Tables 7 and 8 show the set of model transformations that can be applied to resolve two of the obstacles listed in Table 5. Some of the goal specifications and goal graphs obtained by these model transformations can be found in Duboc [33]. The identify-assess-resolve loop of obstacle analysis is iterative. Once a resolution tactic has been selected, one needs to identify its obstacles and explore new obstacle resolution tactics. In Tables 7 and 8, we only discuss resolutions for the original obstacles.
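The systematic exploration described above is essentially a cross product of obstacles and resolution strategies, which can be laid out as a worksheet for the analyst to fill in. The sketch below uses the strategy names that head the rows of Tables 7 and 8 and the obstacles of Table 5; the worksheet structure itself is a hypothetical aid, not part of the published approach.

```python
from itertools import product

# Obstacles from Table 5.
OBSTACLES = [
    "Number of Txn Checks Exceeds Alert Generator Processing Speed",
    "Amount of Data Exceeds Data Storage Space",
    "Number of Alerts Exceeds Fraud Investigator Speed",
]

# Resolution strategies as named in the rows of Tables 7 and 8.
STRATEGIES = [
    "Goal Substitution",
    "Agent Substitution",
    "Obstacle Prevention",
    "Obstacle Reduction",
    "Goal Weakening",
    "Goal Restoration and Obstacle Mitigation",
]


def resolution_worksheet(obstacles, strategies):
    """One empty entry per (obstacle, strategy) pair, so that no combination is skipped."""
    return {pair: None for pair in product(obstacles, strategies)}


worksheet = resolution_worksheet(OBSTACLES, STRATEGIES)
# An entry is filled in once the analyst has considered the pair, even if the answer is "none".
worksheet[(OBSTACLES[0], "Goal Substitution")] = "none: obstructed goals are essential"

pending = [pair for pair, outcome in worksheet.items() if outcome is None]
print(f"{len(worksheet)} combinations, {len(pending)} still to consider")
```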

    A few observations can be made from our experience in resolving scalability obstacles for the IEF system:

    1) We felt that the biggest benefit of the catalog of obstacle resolution tactics was in encouraging a systematic exploration of a wide range of alternative options for resolving a scalability obstacle. Although we did not do this during this case study, we believe that such a catalog can be used during meetings with stakeholders, domain experts, and system designers to facilitate the exploration of creative design solutions to envisioned scalability risks. Further studies, such as those explored by Maiden and Robertson [35], would be needed to investigate the effectiveness of scalability resolution tactics in this setting.

    2) A question we faced was at which level of detail to specify the models generated by the application of the resolution tactics. In practice, when exploring alternative resolutions, we did not specify all alternative models explicitly, as the enunciation of the applied resolution tactic is in most cases enough to understand what the model would be. We felt it would be sufficient to specify the actual requirements and domain assumption only once an alternative was selected. We did not specify detailed formal specifications for the resulting new goals because this was not felt to be of benefit to the project. However, the formal descriptions of the model transformations were helpful to clarify and disambiguate what each resolution tactic could achieve.

    5.4 Summary

    Our case study produced valuable results to Searchspace. Project managers easily understood and accepted the basic concepts of our approach and were particularly impressed by the ability of the model to reveal the impact of distinct projected scaling bounds on leaf goals and to expose conflicting requirements. The goal model provided a rationale for low-level requirements and, in particular, it allowed a more precise and justifiable characterization of scaling ranges and objective functions. Some scaling bounds happened to be more flexible than originally stated, while others were found to be more rigid. One of the tools that enabled us to elaborate goals and estimate bounds more precisely was to get an explicit statement of all underlying assumptions about the application domain. The final model contains a number of goals that had been overlooked in Searchspace's initial requirements specification and replaces others that were idealized in that specification. The approach also helped to document and resolve numerous stakeholder disagreements. The goal-obstacle analysis process allowed us to expose a whole range of potential system failures, not only scalability-related ones. Searchspace has used the goal model to support the development of the new version of the IEF. Furthermore, the model can be used by them to aid a number of other activities, such as the derivation of test cases, and the derivation of hardware requirements for their different clients based on their expected scaling characteristics.

    TABLE 7
    Resolving the Scalability Obstacle Nbr of Txn Checks Exceeds Alert Generator Processing Speed

    Resolution Strategy: Model Transformations

    Goal Substitution:
      none, because the obstructed goals are essential and have no alternatives.

    Agent Substitution:
      - Transfer goal to non-overloaded agent: Responsibility for the goal Achieve [Fraud Patterns Tested on Past Transactions] can be transferred from the Alert Generator to a Sandbox agent providing a separate environment for testing new fraud detection rules.
      - Split goal load among multiple agents: none; we could identify no subtasks or sub-cases for the obstructed goals that could be offloaded to other agents.

    Obstacle Prevention:
      - Introduce scaling assumption: This tactic generates a series of scaling assumptions (on the numbers of transactions in a batch, distinct business entities in a batch, transactions per second in the transactions stream, distinct business entities per second in the transactions stream, transactions in testing data, distinct business entities in testing data, fraudulent patterns, fraudulent patterns added in the last day) together with the subgoals obtained by application of the refinement pattern.
      - Introduce scalability obstacle prevention goal: This tactic generates the new goal Avoid [Nbr of Txn Checks Exceeds Alert Generator Processing Speed]. Alternative refinements for this goal are as follows:
          - Set agent capacity according to load: The tactic "set agent capacity upfront to worst-case load" refines the goal into the assumption Assumed Worst-Case Nbr of Txn Checks and goal Maintain [Alert Generator's Speed above Worst-Case Nbr of Txn Checks], while the tactic "adapt agent capacity at runtime according to load" refines it into the subgoals Maintain [Accurate Prediction on Max Nbr of Txn Checks] and Maintain [Alert Generator's Speed above Predicted Max Nbr of Txn Checks].
          - Limit goal load according to agent capacity: Applying the tactic "limit goal load according to fixed agent capacity" imposes fixed limits on the number of transactions that a bank can provide in its daily batches and continuous stream, while the tactic "limit goal load according to varying agent capacity" imposes similar limits but with bounds that can vary over time based on the capacities of the Alert Generator. Alternative ways of implementing these tactics have been described in Section 4.3.3.

    Obstacle Reduction:
      - Influence goal load distribution: Pricing incentives can be introduced to influence the goal load (for example, by setting up tariffs that will influence the time at which transactions are submitted).

    Goal Weakening:
      - "Weaken goal definition with scaling assumption" and "weaken goal definition by strengthening scaling assumption" consist in revisiting the scaling assumptions elicited during the scalability obstacles assessment step (see Section 5.2) so as to weaken the requirements on the Alert Generator.
      - Weaken goal objective function: This tactic consists in modifying the Alert Generator's requirements by relaxing its real-time requirement, relaxing its required levels of satisfaction, or both, as illustrated in Section 4.3.5.

    Goal Restoration and Obstacle Mitigation:
      These tactics led us to specify goals to be satisfied when the Alert Generator fails to satisfy the obstructed goals. These include the requirements to make generated alerts available to a Fraud Investigator even if a batch has not been processed entirely, and to finish processing an uncompleted batch as soon as possible. For the real-time transaction processing, it includes defining how to clear real-time transactions if the Alert Generator fails and how eventually to check for frauds in transactions that have been cleared during the Alert Generator's failure if it were to happen.

    TABLE 8
    Resolving the Scalability Obstacle Amount of Data Exceeds Data Storage Space

    Resolution Strategy: Model Transformations

    Goal Substitution:
      none, although some of the goals obtained through goal weakening (see below) were first identified when looking for goal substitutions.

    Agent Substitution:
      - Transfer goal to non-overloaded agent: Responsibilities for storing the transactions and alert data can be transferred to an external data store.
      - Split goal load among multiple agents: This tactic consists in splitting the Data Store agent into multiple components, each of which can be responsible for storing part of the data. However, deciding how to organize the Data Store into multiple components is a design decision rather than a requirements one. The scalability requirements define the constraints that the Data Store must satisfy without prescribing what its architecture should be.

    Obstacle Prevention:
      - Introduce scaling assumption: This tactic introduces scaling assumptions on the number and size of transactions, generated alerts, and stored fraud patterns, together with the corresponding scalability subgoals.
      - Introduce scalability obstacle prevention goal: This tactic generates the new goal Avoid [Amount of Data Exceeds Data Storage Space] for which we generate the following alternative refinements:
          - Set agent capacity according to load: The tactic "set agent capacity upfront to worst-case load" is not applicable because the worst-case load in this case is infinite if we assume no end to the system's lifetime. The tactic "adapt agent capacity at runtime according to load" suggests that we plan for a gradual increase of the storage space during the system's lifetime.
          - Limit goal load according to agent capacity: This tactic suggests increasing the alerting threshold so as to limit the number of generated alerts when storage space is insufficient.

    Obstacle Reduction:
      - Influence goal load distribution: Influencing the number of bank transactions is out of scope and reducing this number would be against the company's interest. The number of false alerts could be reduced by influencing the behaviors of account holders (e.g., by allowing and encouraging them to provide information to the bank to reduce the risk of false alerts).

    Goal Weakening:
      - "Weaken goal definition with scaling assumption" and "weaken goal definition by strengthening scaling assumption" consist in revising the scaling assumptions and have not been considered.
      - Weaken goal objective function: The data storage goal can be weakened by limiting the amount of time during which transactions and alerts need to remain stored. This is one of the tactics implemented in the system. Many factors influence when transactions and alert data can be removed, notably legal requirements on the minimum time during which transactions and alert data must remain accessible.

    Goal Restoration and Obstacle Mitigation:
      If the obstacle cannot be completely avoided, back-up plans may consist of having reserved storage space always on stand-by, erasing older data to allow for new data to be stored, or stopping all processing until the obstacle is resolved.

    6 RELATED WORK

    Scalability is commonly discussed in the context of specific technologies. Virtualization, for example, has been used for years, and a number of studies have reported on the scalability of systems running in virtual environments [36], [37], [38]. Earlier, the advent of grid computing was driven by the need to share resources, and sometimes to achieve high-performance, in collaborative problem-solving strategies which emerged in industry, science, and engineering [39]. Currently, cloud computing mechanisms, such as cloudbursting, allow companies to maintain th