Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Stakeholders’ agreement with D-‐Case
Yutaka Matsuno University of Tokyo
JST CREST DEOS Project [email protected]‐tokyo.ac.jp
Contents
• Achieving Open Systems Dependability • Our Approach – Using Assurance Cases in both design and operaJonal phases
• Assurance Cases and D-‐Case • Example Use Case: Web Server System • D-‐Case Tool Demo • Conclusion
Contents
• Achieving Open Systems Dependability • Our Approach – Using Assurance Cases in both design and operaJonal phases
• Assurance Cases and D-‐Case • Example Use Case: Web Server System • D-‐Case Tool Demo • StandardizaJon and Future Plan • Conclusion
How to Achieve Dependability of Open System
• ConvenJonal Methods such as Formal Methods are limited for Open System – They assume ``closed systems assumpJon”
• Best way we can do is – Stakeholders together with experts argue dependability of the system with evidences, and try to make agreement that the system is dependable
Open Systems Dependability
• All stakeholders must communicate each other and agree on dependability of the system
• System must provide evidence for the agreement
System Product Provider �
Provide �
System Provider
End User
Authority�
Service Provider
Use �
Dependability Agreement
Contents
• Achieving Open Systems Dependability • Our Approach – Using Assurance Cases in both design and operaJonal phases
• Assurance Cases • Example Use Case: Web Server System • D-‐Case Tool Demo • Conclusion
Our Approach
• Assurance is a key concept • Assurance Cases seem a good framework • Based on Assurance Cases, we have started to develop D-‐Case, which is a method for dependability agreement among stakeholders
Current Status of Assurance Cases
• Assurance Case is used in a few specific system domains – Not applied to IT-‐Based social systems, enterprise architecture, …
• Assurance Case is used mostly in development phases – It is difficult to assure that system behaves as intended in run-‐Jme, especially for failure miJgaJon behaviors
Ideas
• Extend well developed Process (e.g., TOGAF) with Assurance Case
• Use Assurance Cases in both development and operaJonal phases – In development, consider operaJonal environment and risks as much as possible
– In operaJonal, always monitore that dependability is sustained or not
Contents
• Achieving Open Systems Dependability • Our Approach – Using Assurance Cases in both design and operaJonal phases
• Assurance Cases and D-‐Case • Example Use Case: Web Server System • D-‐Case Tool Demo • Conclusion
Assurance Case A documented body of evidence that provides a convincing and valid argument that a system is adequately dependable for a given applicaJon in a given environment.
Goal
Evidence
Evidence
Evidence
Argument Structure
Eg. System is Safe
Eg: FTA(Fault Tree Analysis) Result
Brief History • “Case” is one of words in courts • Recognized a_er serious incidents in UK – Piper Alpha North Sea Oil(167 death, 1988) – Clapham JuncJon rail crash (35 death, 1988)
• Not only following a procedure, but arguing why the procedure makes the system safe, based on evidence
• Widely required for regulaJon in UK, and now worldwide – EUROCONTROL, Rail Yellow Book, MoD Defence Standard 00-‐56, and
– ISO 26262: FuncJonal Safety for Automobile
GSN (Goal Structuring NotaJon): A Graphical NotaJon
• Developed by Tim Kelly and his colleagues in University of York – [1] The Goal Structuring NotaJon -‐ A Safety Argument NotaJon, T P Kelly, R A Weaver in Proceedings of Workshop on Assurance Cases, 2004
– Simple, Panerns, and Modules
• Another one is Claim, Argument and Evidence (CAE)
GSN Main nodes
Goal Strategy
Context Evidence
Claim to be argued: Eg: System is safe
RaJonale for decomposing a goal Eg. Argue for each idenJfied fault
Final object For supporJng A goal Eg: FTA results
Environmental InformaJon for arguing goals Eg. System environment, list of all idenJfied faults
GSN Example (from [1])
System cantolerate single
componentfailures
Argument byelimination of all
hazards
Fault Treefor Hazard
H1
Goal Solution Strategy
All IdentifiedSystemHazards
ContextUndeveloped Goal
(to be developed further)
Figure 4- Principal Elements of the Goal Structuring Notation
When the elements of the GSN are linked together
in a network they are described as a ‘goal structure’. The principal purpose of any goal structure is to show how goals (claims about the system) are successively broken down into sub-goals until a point is reached where claims can be supported by direct reference to available evidence (solutions). As part of this decomposition, using the GSN it is also possible to make clear the argument strategies adopted (e.g. adopting a quantitative or qualitative approach), the rationale for the approach and the context in which goals are stated (e.g. the system scope or the assumed operational role).
Figure 5 shows an example goal structure. In this structure, as in most, there exist ‘top level’ goals –
statements that the goal structure is designed to support. In this case, “C/S (Control System) Logic is fault free”, is the (singular) top level goal. Beneath the top level goal or goals, the structure is broken down into sub-goals, either directly or, as in this case, indirectly through a strategy. The two argument strategies put forward as a means of addressing the top level goal in Figure 5 are “Argument by satisfaction of all C/S (Control System) safety requirements”, and, “Argument by omission of all identified software hazards”. These strategies are then substantiated by five sub-goals. At some stage in a goal structure, a goal statement is put forward that need not be broken down and can be clearly supported by reference to some evidence. In this case, the goal “Unintended Closing of press after PoNR (Point of No Return) can only occur as a result of component failure”, is supported by direct reference to the solutions, “Fault tree cutsets …” and “Hazard Directed Testing Results”.
Within Europe, GSN has been adopted by a growing number of companies within safety-critical industries (such as aerospace, railways and defence) for the presentation of safety arguments within safety cases. The following list includes some of the applications of GSN to date: x Eurofighter Aircraft Avionics Safety Justification x Hawk Aircraft Safety Justification x U.K. Ministry of Defence Site Safety
Justifications
G1
C/S Logic is fault free
S1
Argument bysatisfaction of all C/Ssafety requirements
S2
Argument by omissionof all identified softwarehazards
C1
Identifiedsoftware hazards
G2
Press controls being'jammed on' will causepress to halt
G3
Release of controls prior to presspassing physical PoNR willcause press operation to abort
G4
C/S fails safe (halts) on, andannunciates (by soundingklaxon), all single componentfailures
Sn1
Black BoxTest Results
G5
'Failure1' transition of PLCstate machine includesBUTTON_IN remaining true
G7
'Abort' transition of PLCstate machine includesBUTTON_IN going FALSE
Sn2
C/S StateMachine
G8
Unintended opening of press(after PoNR) can only occuras a result of componentfailure
G9
Unintended closing of presscan only occur as a result ofcomponent failure
Sn3
Fault tree analysiscutsets for event'Hand trapped in
press due tocommand error'
Sn4
Hazarddirected test
results
Figure 5 – An Example Goal Structure
D-‐Case
• We add monitoring node as a sub-‐class of evidence
Goal Strategy
Context Evidence
Claim to be argued: Eg: System is safe
RaJonale for decomposing a goal Eg. Argue for each idenJfied fault
Final object For supporJng A goal Eg: FTA results
Environmental InformaJon for arguing goals Eg. System environment, list of all idenJfied faults
Monitoring
Evidences obtained from run-‐Jme system Eg. System Logs by monitoring,
Contents
• Achieving Open Systems Dependability • Our Approach – Using Assurance Cases in both design and operaJonal phases
• Assurance Cases and D-‐Case • Example Use Case: Web Server System • D-‐Case Tool Demo • Conclusion
Example Use Case: Simple Web Server System
Requirement ・Maximum Access Number: 2500 Jmes/minute ・Response Time is within 3 seconds ・Recovery for one failure is within 5 minutes ….
Service Risk Analysis Result ・Too Many Access from Users ・Response Time Delay ・Memory Leak, …
NetworkClient
Client
Client WebServer
ApplicationServer
DatabaseServer
OperatorConsole
D-‐Case Basic Structure (TentaJve)
Dependability Goal
Failure Response Argument
Change AccommodaJon
Argument
Failure DetecJon Argument
Failure MiJgaJon AcJon
Argument
Requirement Service Risk
Contents
• Achieving Open Systems Dependability • Our Approach – Using Assurance Cases in both design and operaJonal phases
• Assurance Cases and D-‐Case • Example Use Case: Web Server System • D-‐Case Tool Demo • Conclusion
D-‐Case Editor hnp://www.il.is.s.u-‐tokyo.ac.jp/deos/dcase/
• Parameterized Panern Library
• Monitoring RunJme system
• Consistency Checking by an proof assistant – Google ``D-‐Case/Agda”
Will be open source in next March
Run-‐Time Monitoring Demo in Failure Response Cycle
DemonstraJon at ET2011, Yokohama, Japan, Nov 2011
Demo System Overview
Network
Client
Client
Client
Web Server
ApplicaJon Server
Database Server
Monitor
D-‐Case Editor
DEOS Project © 2011 Japan Science and Technology Agency
2012/1/10 Page15
Ordinary operation is the state where the system is operating with values within the ranges agreed upon by the stakeholders. We refer to these ranges as “in-operation ranges”. The change accommodation cycle should run in parallel with ordinary operation, so that improvement of the system is performed while providing services. Also, the failure response cycle preferably runs in parallel with ordinary operation. In case the system detects a sign of a failure before it occurs, it might be able to prevent it from occurring. Even in the case where a failure has occurred, the system may run in a degraded mode. There also are cases where the system must be fully stopped after a severe failure.
In ordinary operation, the following actions are taken:
recording the system’s operation states and inspecting this record periodically to detect failure signs,
cleaning systems to prevent system aging, rehearsing failures to improve detectability of failure signs and preventive actions, reviewing processes periodically to improve them, educating and training staff,
etc.
3.1.2. Failure Response Cycle
The failure response cycle is the cycle to quickly respond to failures to minimize damage. In the DEOS process, failures are defined as the deviation of services and functions from the acceptable operation ranges that have been agreed upon among the stakeholders. The failure response cycle starts from the ordinary operation state, and consists of the failure prevention phase, the responsive action phase, the cause analysis phase, and the accountability achievement phase. The first three phases are not always executed in series in this order, and often are executed collaboratively in parallel.
The failure prevention phase is a phase for the following: when a failure is predicted before it
occurs, or the possibility of a failure increases, an action is taken to prevent such a failure from
Fig.4. DEOS Process
Conclusion
• Stakeholder’s agreement with D-‐Case • StandardizaJon – ISO, Open Group – OMG • Assuring Dependability of Consumer Device (Mr. Akira Ohata)
• Current Topics – D-‐Case Monitoring ImplementaJon – RelaJonship to Requirement Analysis result