
Safety-Critical Embedded Systems

Lecture 1: Introduction

[email protected]


Lecture outline

• Course information
  – Examination: project
• What is a "safety-critical embedded system"?
  – Embedded systems
  – Real-time systems
  – Safety-critical systems
• Fundamental concepts of dependability
  – The "dependability" concept
  – Threats: fault, error, failure
  – Attributes: reliability, availability


Course information

• Contact
  – Paul Pop, course leader and examiner
  – Email: [email protected]
  – Phone: 4525 3732
  – Office: building 322, office 228
• Webpage
  – All the information is on CampusNet


Course information, cont.

• Textbook: Israel Koren and C. Mani Krishna, Fault-Tolerant Systems, Morgan Kaufmann
• Full text available online, see the link on CampusNet


Course information, cont.

• Lectures
  – Language: English
  – 12 lectures
  – Lecture notes: available on CampusNet as a PDF file the day before
  – Dec. 1 is used for the project
  – Two invited lectures, from Novo Nordisk and Danfoss
• Examination
  – Project: 70% report + 30% presentation
• 7.5 ECTS points


Project

• Milestones
  – End of September: group registration and topic selection
    • Email to [email protected]
  – End of October: project report draft
    • Upload draft to CampusNet
  – End of November: report submission
    • Upload final report to CampusNet
  – Last lecture: project presentation and oral opposition
    • Upload presentation to CampusNet


Project, cont.

• Project registration
  – E-mail Paul Pop, [email protected]
    • Subject: 02228 registration
    • Body:
      – Name of student #1, student ID
      – Name of student #2, student ID
      – Project title
      – Project details
• Notes
  – Groups of max. 3 persons
  – Registration is followed by project approval


Project, cont.

• Topic categories
  1. Literature survey
     • See the "references" and "further reading" in the course literature
  2. Tool case-study
     • Select a commercial or research tool and use it on a case study
  3. Software implementation
     • Implement a technique, e.g., an error detection or fault-tolerance technique (see the sketch after this list)
• Suggested topics available on CampusNet
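As a feel for the scale of a software-implementation project, here is a minimal sketch (not from the slides) of one classic fault-tolerance technique, triple modular redundancy (TMR): a bitwise majority vote over three redundant copies of a result masks a fault in any single copy.

```c
#include <stdio.h>

/* TMR majority voter: each output bit takes the value that at least
 * two of the three redundant inputs agree on, so a corrupted bit in
 * any single copy is outvoted and masked. */
static unsigned majority3(unsigned a, unsigned b, unsigned c)
{
    return (a & b) | (a & c) | (b & c);
}

int main(void)
{
    unsigned copy1 = 0xC3, copy2 = 0xC7, copy3 = 0xC3;  /* copy2 has a bit flip */

    printf("voted result: 0x%02X\n", majority3(copy1, copy2, copy3)); /* 0xC3 */
    return 0;
}
```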


Project, cont.

• Examples of last years' projects
  – ARIANE 5: Flight 501 Failure
  – Hamming Correcting Code Implementation in a Transmitting System
  – Application of Fault Tolerance to a Wind Turbine
  – Guaranteed Service in Fault-Tolerant Network-on-Chip
  – Fault-tolerant digital communication
  – Resilience in Mobile Multi-hop Ad-hoc Networks
  – Fault-tolerant ALU
  – Reliable message transmission in CAN, TTP and FlexRay


Project deliverables

1. Literature survey
   – Written report
     • Structure: title, authors; abstract; introduction; body; conclusions; references
2. Tool case-study
   – Case-study files
   – Report: document your work
3. Software implementation
   – Source code with comments
   – Report: document your work

Deadline for draft: end of October

Deadline for final version: end of November


Project presentation & opposition

• Poster presentation of the project
  – 15 min. + 5 min. questions
• Note!
  – During the presentation you might be asked general questions that relate to any course topic

Deadline: last lecture


Embedded systems

• Computing systems are everywhere
• Most of us think of "desktop" computers
  – PCs
  – Laptops
  – Mainframes
  – Servers
• But there's another type of computing system
  – Far more common...


Embedded systems, cont.

• Embedded computing systems
  – Computing systems embedded within electronic devices
  – Hard to define: nearly any computing system other than a desktop computer
  – Billions of units produced yearly, versus millions of desktop units
  – Perhaps 50 per household and per automobile


What is an embedded system?

• Definition
  – An embedded system is a special-purpose computer system that is part of a larger system, which it controls.
• Notes
  – A computer is used in such devices primarily as a means to simplify the system design and to provide flexibility.
  – Often the user of the device is not even aware that a computer is present.


Characteristics of embedded systems

• Single-functioned
  – Dedicated to performing a single function
• Complex functionality
  – Often have to run sophisticated algorithms or multiple algorithms
  – Examples: cell phone, laser printer
• Tightly constrained
  – Low cost, low power, small, fast, etc.
• Reactive and real-time
  – Continually react to changes in the system's environment
  – Must compute certain results in real time, without delay
• Safety-critical
  – Must not endanger human life or the environment


Functional vs. non-functional requirements

• Functional requirements
  – Output as a function of input
• Non-functional requirements
  – Time required to compute the output
  – Reliability, availability, integrity, maintainability, dependability
  – Size, weight, power consumption, etc.


Real-time systems

• Time
  – The correctness of the system behavior depends not only on the logical results of the computations, but also on the time at which these results are produced.
• Real
  – The reaction to outside events must occur during their evolution. The system time must be measured using the same time scale used for measuring time in the controlled environment.
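To make the timing dimension concrete, here is a minimal sketch (not from the slides, assuming POSIX clock_gettime; read_sensor, compute and actuate are hypothetical placeholders) of a control task that treats a late result as incorrect, even when its value is right:

```c
#include <stdio.h>
#include <time.h>

#define DEADLINE_NS 10000000LL          /* 10 ms deadline for the reaction */

/* Hypothetical placeholders for the controlled environment. */
static int  read_sensor(void)    { return 42; }
static int  compute(int input)   { return 2 * input; }
static void actuate(int output)  { (void)output; }

static long long elapsed_ns(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1000000000LL + (b.tv_nsec - a.tv_nsec);
}

int main(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    int output = compute(read_sensor());
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* A logically correct value delivered after the deadline is still
     * a timing failure in a real-time system. */
    if (elapsed_ns(start, end) > DEADLINE_NS) {
        fprintf(stderr, "deadline miss: timing failure\n");
        return 1;                       /* a real system would enter a safe state */
    }
    actuate(output);
    return 0;
}
```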


Safety-critical systems

• Definitions
  – Safety is a property of a system that will not endanger human life or the environment.
  – A safety-related system is one by which the safety of the equipment or plant is ensured.
• A safety-critical system is
  – a safety-related system, or
  – a high-integrity system


System integrity

• Definition
  – The integrity of a system is its ability to detect faults in its own operation and to inform the human operator.
• Notes
  – The system will enter a failsafe state if faults are detected
  – High-integrity system
    • Failure could result in large financial loss
    • Examples: telephone exchanges, communication satellites


Failsafe operation

• Definition
  – A system is failsafe if it adopts "safe" output states in the event of failure and inability to recover.
• Notes
  – Example of failsafe operation
    • Railway signaling system: failsafe corresponds to all the lights on red
  – Many systems are not failsafe
    • Fly-by-wire system in an aircraft: the only safe state is on the ground
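The railway example can be sketched as code (a minimal illustration, not from the slides; the signal array and fault flag are hypothetical): on an unrecoverable fault, the controller drives every output to its safe state instead of continuing.

```c
#include <stdbool.h>
#include <stdio.h>

enum signal_state { RED, YELLOW, GREEN };

#define NUM_SIGNALS 8
static enum signal_state signals[NUM_SIGNALS];   /* hypothetical outputs */

/* Failsafe reaction: force every signal to red, the safe output state. */
static void enter_failsafe(void)
{
    for (int i = 0; i < NUM_SIGNALS; i++)
        signals[i] = RED;
}

static void control_step(bool fault_detected, bool recovered)
{
    if (fault_detected && !recovered) {
        enter_failsafe();                /* fault + no recovery: go safe */
        return;
    }
    /* ... normal signaling logic would run here ... */
}

int main(void)
{
    control_step(true, false);           /* simulate an unrecoverable fault */
    printf("signal 0 is %s\n", signals[0] == RED ? "RED" : "not RED");
    return 0;
}
```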


Preliminary topics

• Fundamental concepts of dependability
• Means of achieving dependability
• Hazard and risk analysis
• Reliability analysis
• Hardware redundancy
• Information and time redundancy
• Software redundancy
• Checkpointing
• Fault-tolerant networks


Dependability: an integrating concept

• Dependability is a property of a system that justifies placing one's reliance on it.
• Attributes: availability, reliability, safety, confidentiality, integrity, maintainability
• Means: fault prevention, fault tolerance, fault removal, fault forecasting
• Threats: faults, errors, failures


Threats: faults, errors & failures

• Fault: the cause of an error (and failure)
• Error: an unintended internal state of a subsystem
• Failure: a deviation of the actual service from the intended service


Threats: faults, errors & failures, cont.

• Fault
  – Physical defect, imperfection, or flaw that occurs within some hardware or software component
  – Examples
    • Shorts between electrical conductors
    • Physical flaws or imperfections in semiconductor devices
    • Program loop that, when entered, can never be exited
  – The primary cause of an error (and, perhaps, a failure)
    • Does not necessarily lead to an error, e.g., a bit in memory flipped by radiation
      – can cause an error if the next operation on the memory cell is "read"
      – causes no error if the next operation on the memory cell is "write"
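The bit-flip example, traced as a minimal sketch (not from the slides): the flipped bit is the fault; whether it becomes an error depends on whether the cell is read or overwritten next.

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t cell = 0x0F;    /* memory cell holding a correct value */

    cell ^= 0x01;           /* fault: radiation flips bit 0 */
    cell = 0xAA;            /* next operation is a write: the fault is
                               overwritten before being observed -- no error */

    cell ^= 0x01;           /* a second fault flips bit 0 again */
    printf("read 0x%02X, expected 0xAA\n", cell);
                            /* next operation is a read: the wrong value
                               enters the computation -- an error */
    return 0;
}
```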


Threats: faults, errors & failures, cont.

• Error
  – An incorrect internal state of a computer
    • A deviation from accuracy or correctness
  – Example
    • A physical short results in a line in the circuit being permanently stuck at logic 1. The physical short is a fault in the circuit. If the line is required to transition to logic 0, the value on the line will be in error.
  – The manifestation of a fault
  – May lead to a failure, but does not have to


Threats: faults, errors & failures, cont.

• Failure
  – Denotes a deviation between the actual service and the specified or intended service
  – Example
    • A line in a circuit is responsible for turning a valve on or off: logic 1 turns the valve on and logic 0 turns it off. If the line is stuck at logic 1, the valve is stuck on. As long as the user wants the valve on, the system functions correctly. However, when the user wants the valve off, the system experiences a failure.
  – The failure is an event (i.e., it occurs at some time instant, if ever) caused by an error
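The stuck-at-1 valve example from this and the previous slide, traced end to end as a minimal sketch (not from the slides; the line driver and valve are hypothetical): the short is the fault, the wrong line value is the error, and the valve staying on when commanded off is the failure.

```c
#include <stdio.h>
#include <stdbool.h>

static bool stuck_at_1 = true;      /* the fault: a short holds the line at 1 */

/* The driver tries to set the line; the fault overrides the wanted value. */
static int drive_line(int wanted)
{
    return stuck_at_1 ? 1 : wanted;
}

int main(void)
{
    /* User wants the valve ON: the line should be 1 and is 1.
     * The fault is present, but there is no error and no failure. */
    printf("want ON : line = %d -> OK\n", drive_line(1));

    /* User wants the valve OFF: the line should be 0 but stays 1.
     * The wrong line value is the error; the valve staying on is the
     * failure -- actual service deviates from intended service. */
    int line = drive_line(0);
    if (line != 0)
        printf("want OFF: line = %d -> FAILURE (valve stuck on)\n", line);
    return 0;
}
```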


Three-universe model

1. Physical universe: where faults occur
   – Physical entities: semiconductor devices, mechanical elements, displays, printers, power supplies
   – A fault is a physical defect or alteration of some component in the physical universe
2. Informational universe: where errors occur
   – Units of information: bits, data words
   – An error has occurred when some unit of information becomes incorrect
3. External (user's) universe: where failures occur
   – The user sees the effects of faults and errors
   – A failure is any deviation from the desired or expected behavior


Causes of faults

• Problems at any stage of the design process can result in faults within the system.


Causes of faults, cont.

• Specification mistakes
  – Incorrect algorithms, architectures, or hardware/software design specifications
    • Example: the designer of a digital circuit incorrectly specified the timing characteristics of some of the circuit's components
• Implementation mistakes
  – Implementation: the process of turning the hardware and software designs into physical hardware and actual code
  – Poor design, poor component selection, poor construction, software coding mistakes
    • Examples: a software coding error; a printed circuit board constructed such that adjacent lines of a circuit are shorted together


Causes of faults, cont.

• Component defects
  – Manufacturing imperfections, random device defects, component wear-out
  – The most commonly considered causes of faults
    • Examples: bonds breaking within the circuit, corrosion of the metal
• External disturbances
  – Radiation, electromagnetic interference, operator mistakes, environmental extremes, battle damage
    • Example: lightning


Failure modes

• Failure domain
  – Value failures: incorrect value delivered at the interface
  – Timing failures: right result at the wrong time (usually late)
• Failure consistency
  – Consistent failures: all nodes see the same, possibly wrong, result
  – Inconsistent failures: different nodes see different results
• Failure consequences
  – Benign failures: essentially loss of utility of the system
  – Malign failures: significantly more than loss of utility of the system; catastrophic, e.g., an airplane crash
• Failure oftenness (failure frequency and persistence)
  – Permanent failure: the system ceases operation until it is repaired
  – Transient failure: the system continues to operate
    • Frequently occurring transient failures are called intermittent


Failure modes, cont.

• Consistent failures
  – Fail-silent: the system produces correct results or remains quiet (no delivery)
  – Fail-crash: the system produces correct results or stops quietly
  – Fail-stop: the system produces correct results or stops (and the stop is made known to others)
• Inconsistent failures
  – Two-faced failures, malicious failures, Byzantine failures
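Fail-stop behavior is often approximated by duplication with comparison, sketched minimally below (not from the slides; notify_others is a hypothetical placeholder): run the computation twice, deliver the result only if both copies agree, and otherwise make the stop known and halt.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical: announce to other nodes that this node has stopped. */
static void notify_others(void) { fprintf(stderr, "node stopped\n"); }

static int compute(int x) { return x * x; }     /* the service */

static int fail_stop_compute(int x)
{
    int r1 = compute(x);
    int r2 = compute(x);        /* duplicated execution as a self-check */

    if (r1 != r2) {
        notify_others();        /* fail-stop: the stop is made known */
        exit(EXIT_FAILURE);     /* never deliver a possibly wrong result */
    }
    return r1;
}

int main(void)
{
    printf("%d\n", fail_stop_compute(7));
    return 0;
}
```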


Dependability attributes

• Availability: readiness for correct service
• Reliability: continuity of correct service
• Safety: absence of catastrophic consequences on the user(s) and the environment
• Confidentiality: absence of unauthorized disclosure of information
• Integrity: absence of improper system alterations
• Maintainability: ability to undergo modifications and repairs
• Security: the concurrent existence of (a) availability for authorized users only, (b) confidentiality, and (c) integrity, with 'improper' taken as meaning 'unauthorized'
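To give one attribute a quantitative flavor (a standard worked example, not from the slides): steady-state availability is commonly written as A = MTTF / (MTTF + MTTR), the long-run fraction of time the system is ready for correct service. A system with a mean time to failure of 1,000 hours and a mean time to repair of 1 hour thus has A = 1000 / 1001 ≈ 99.9%.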