#ESCBOS #ESCBOS
Integrating Safety in Silicon: Failsafe cells for IoT Designs
Jonny Doin – GridVortex
http://www.funfix.com/Gallery/Images/lg_Rock-Climbing-in-Talkeetna-Alaska.jpg
Agenda
• Safety: What is Safety?
• Failure: What constitutes Failure?
• Failsafe / Failsafe cell Design
• LTspice as a system modeling tool
• Modeling the Firmware/Hardware interfaces
• Simulating Software failure at the interface
• Circuit behavior under failure scenarios
• Final thoughts
Safety: What is Safety?
A Safe System is one that exhibits:
• Deterministic responses
  ➢ Controlled behaviors for all inputs
  ➢ Outputs that are never placed in a hazardous state
http://large.stanford.edu/publications/coal/references/hvistendahl/images/f1big.jpg
Safety: What is Safety? (2)
REALITY: ALL SYSTEMS WILL FAIL!
http://stat.ks.kidsklik.com/statics/files/2012/10/13496768121110667387.jpg
Safety: What is Safety? (3)
In the real world, systems are always connected to other systems.
Hazardous output states must be qualified from the point of view of the downstream (external) systems.
https://www.engineerjobs.co.uk/images/industry-sectors/img_60_instrumentation.jpg
Failure
Failure is a malfunction of the system, or a deviation from its designed behavior.
In any system, such a deviation anywhere in the processing chain can lead to system failure.
http://photos1.blogger.com/blogger/4548/1285/1600/Matrix%20System%20Failure.jpg
Failsafe Design
Failsafe design can be “costly” in system resources.
For example, achieving functional safety in microcontrollers may require fully redundant processors running in lockstep mode.
http://img.deusm.com/designnews/2011/09/233762/114610_803509.jpg
Example: Cortex-R4 in Lockstep
Failsafe Design (2)
One example where cost is paramount is IoT chips, designed in mature processes (e.g. 180nm) with mixed-signal circuitry.
These designs usually have small, low-cost processor cores, such as an ARM Cortex-M0.
A hybrid failsafe approach can be beneficial in many of these IoT cases.
[Diagram blocks: ARM Cortex-M0; GPIOs; Controlled Subsystem (actuators, power)]
Failsafe Design (3)
Designs can handle system failures at the critical interfaces, by identifying signal state failure and ensuring a known system state.
This design pattern is recursive, i.e., it can be applied to subsystems down to the smallest modules, to ensure that the whole system fails in a safe mode.
[Diagram blocks: Complex Control System; Critical Interface; Controlled Subsystem (actuators, motors)]
Failsafe cell design
The design case we’ll look into is a hybrid IoT application chip, with an integrated Cortex-M0.
The design goals are:
• Firmware failure detection
• Safe reboot of the CPU
• Safe drive logic for no loss of control
[Diagram blocks: ARM Cortex-M0; GPIOs; Failsafe Logic; CONTROL I/Os; Controlled Subsystem (actuators, power)]
Failsafe cell design (2)
Failsafe cells use dynamic signals as control commands, or use encoded states.
Signals that are “frozen” at ‘0’ or ‘1’, or illegal states, indicate a failed software control function.
The failsafe logic takes over and guarantees failsafe behavior.
[Diagram blocks: ARM Cortex-M0; GPIOs; Failsafe Logic; CONTROL I/Os; Controlled Subsystem (actuators, power)]
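The "frozen signal" idea above can be sketched as a small behavioral model. This is only an illustration, not the circuit from the talk: the function names, the sample-based timeout, and the "SAFE" output value are all assumptions made for the example.

```python
def detect_frozen(samples, timeout):
    """Flag a control signal as failed once it has gone `timeout`
    consecutive samples without toggling (hypothetical model)."""
    flags = [False]          # nothing can be concluded from the first sample
    idle = 0                 # samples seen since the last edge
    for prev, cur in zip(samples, samples[1:]):
        idle = 0 if cur != prev else idle + 1
        flags.append(idle >= timeout)
    return flags


def failsafe_output(command, failed, safe_state="SAFE"):
    """Pass the firmware command through while the signal is alive;
    force the (assumed) safe state once a failure is flagged."""
    return safe_state if failed else command


# A healthy firmware keeps toggling; a crashed one freezes at '1'.
healthy = [0, 1] * 8
crashed = [0, 1, 0, 1, 1, 1, 1, 1, 1]
print(any(detect_frozen(healthy, timeout=3)))
print(detect_frozen(crashed, timeout=3))
```

The detector only needs the signal to keep moving; it does not care whether the signal froze at '0' or at '1', which matches the slide's definition of a failed software control function.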
Failsafe cell design (3)
The failsafe cell can be a digital function that validates the control states, or a detector for invalid steady-state control signals.
Failsafe circuitry contains hardwired logic that takes control and guarantees essential behaviors, such as a basic control loop and failsafe responses.
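As a sketch of the "digital function that validates the control states", the example below decodes a command carried on two lines. The complementary two-wire encoding is an assumption chosen for illustration; the slides do not say which encoding the actual design uses.

```python
# Hypothetical encoding (not from the slides): the command is sent on two
# lines that must always be complementary. Equal levels on both lines can
# only come from a stuck or uninitialized port, so they are treated as illegal.
VALID_STATES = {(1, 0): "ON", (0, 1): "OFF"}
FAILSAFE_STATE = "FAILSAFE"


def decode_command(line_a, line_b):
    """Return the decoded command, or the failsafe state for illegal codes."""
    return VALID_STATES.get((line_a, line_b), FAILSAFE_STATE)


for a, b in [(1, 0), (0, 1), (0, 0), (1, 1)]:
    print((a, b), decode_command(a, b))
```

With this kind of encoding, a GPIO port that resets to all zeros or all ones can never be mistaken for a valid command.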
LTspice as a System tool
LTspice is a fast and accurate circuit simulation tool.
Used as a circuit simulator, LTspice can predict actual behavior with high precision.
Modeling the interaction of firmware and analog hardware at the design stage is a powerful capability.
[Plot: LTspice transient simulation, 130.5 ms to 136.5 ms. Traces: V(adc_val), V(adc_in), V(vip), V(isr_block)]
LTspice as a System tool (2)
LTspice allows modeling mixed-signal systems, including the interaction of firmware behavior with analog hardware:
• Behavioral sources (B)
• Digital Gate primitives (Axxx)
• Hierarchical subcircuits
• Waveform and data file generators
Modelling system interfaces
Designing the Fw/Hw interface as a failsafe node has a number of advantages:
• Implementation decoupling of Firmware and Hardware
• Addresses CPU failure
• Lower cost of implementation
Modelling system interfaces (2)
Some examples of system interfaces that provide failsafe functions in control circuitry and at the Firmware / Hardware interface:
• Failsafe “Passive” drivers
• AC coupled commands
• Failsafe “ON” actuators
Example: Failsafe “passive”
Output analog drivers can be designed to fail in high-impedance mode.
Example: Failsafe “passive” (2)
The 2 analog outputs are buffered with failsafe drivers that go high impedance when VCC is lost
Example: Failsafe “passive” (3)
Each output is buffered and isolated with 2 transistors.
When VCC fails, the transistors cut off, with very high impedance.
The output current source then sees a 68 kΩ resistor, which drives the output voltage to 6.8 V and brings the output to 100%.
This failsafe guarantees the downstream system is ON, even on loss of control.
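The slide gives the resistor value and the resulting voltage but not the source current. A quick Ohm's law check, shown below, suggests a current of about 100 µA; that figure is inferred from the two stated values and is not quoted in the talk.

```python
# Values stated on the slide
R_FAILSAFE_OHMS = 68e3   # resistor seen by the output current source
V_OUT_FAILSAFE = 6.8     # output voltage in the failed (high-impedance) state

# Inferred, not stated on the slide: I = V / R
implied_current_a = V_OUT_FAILSAFE / R_FAILSAFE_OHMS
print(f"Implied source current: {implied_current_a * 1e6:.1f} uA")
```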
Example: AC-coupled commands
On a firmware failure, toggling signals will stop at VCC or GND. AC-coupled commands can detect such firmware failures.
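A rough discrete-time model shows why AC coupling detects a stalled signal. The sketch below passes the command through a first-order high-pass filter (the coupling capacitor) and a peak detector with a slow decay. The component time constants, the 0.3 V threshold, and the 1 kHz toggle rate are assumptions picked for the example, not values from the talk.

```python
import math


def ac_coupled_alive(x, dt, rc_hp=1e-3, tau_env=2e-3, threshold=0.3):
    """Return a per-sample 'alive' flag for the command waveform `x`.

    High-pass: y[n] = a * (y[n-1] + x[n] - x[n-1]) blocks any DC level.
    Envelope:  peak detector that decays with time constant tau_env.
    """
    a = rc_hp / (rc_hp + dt)
    decay = math.exp(-dt / tau_env)
    y, env, prev = 0.0, 0.0, x[0]
    alive = []
    for v in x:
        y = a * (y + v - prev)
        prev = v
        env = max(abs(y), env * decay)
        alive.append(env > threshold)
    return alive


def toggle_then_stick(vcc, dt, f_toggle, t_run, t_total, stuck_level):
    """Square wave for t_run seconds, then frozen at stuck_level."""
    n_run, n_total = round(t_run / dt), round(t_total / dt)
    half = round(1 / (2 * f_toggle * dt))   # samples per half period
    return [
        (vcc if (n // half) % 2 else 0.0) if n < n_run else stuck_level
        for n in range(n_total)
    ]


dt = 1e-5
for level in (3.3, 0.0):   # firmware freezes the pin at VCC or at GND
    x = toggle_then_stick(3.3, dt, 1000, 10e-3, 25e-3, level)
    alive = ac_coupled_alive(x, dt)
    print(level, alive[900], alive[-1])
```

Because the high-pass stage removes the DC level, the detector gives the same verdict whether the pin froze at VCC or at GND; only continued toggling keeps the envelope above the threshold.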
Example: Failsafe “ON”
A firmware failure will keep the actuator ON. The firmware commands are designed to turn it OFF.
Firmware control Loop: Servo DAC
The PWM value is set to 50% when the error is zero.
Positive errors make the PWM duty cycle greater than 50%, driving the net integrated voltage “down”.
Negative errors set duty cycles below 50%, driving the net integrated voltage “up”.
Delays in the Firmware control loop can adversely affect the output correction.
We can simulate the effects of interrupts causing long control loop latencies.
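The duty-cycle rule above can be written as a short sketch. Here the error is taken as measured voltage minus setpoint, the integrator is an idealized inverting one, and the PWM is represented by its average duty; the gain, slew rate, and voltages are illustrative assumptions, not parameters from the talk.

```python
def pwm_duty(error, gain=2.0):
    """Map loop error to duty cycle: 50% at zero error, clamped to [0, 1].
    `gain` (duty per volt) is an assumed value."""
    return min(1.0, max(0.0, 0.5 + gain * error))


def integrator_step(v, duty, slew=100.0, dt=1e-3):
    """Idealized inverting integrator: duty above 50% drives v down,
    duty below 50% drives v up. `slew` is in volts per second."""
    return v - slew * (duty - 0.5) * dt


setpoint = 1.25      # assumed target voltage
v = 1.00             # start below the setpoint (negative error)
for _ in range(100):
    error = v - setpoint
    v = integrator_step(v, pwm_duty(error))
print(f"final voltage: {v:.4f} V")
```

With these numbers each loop iteration removes a fixed fraction of the remaining error, so the voltage settles smoothly on the setpoint and the duty cycle returns to 50%.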
Detail: firmware interference
• For comparison, we removed the PWM from the control loop: direct interrupt-driven GPIO mode instead of Servo PWM mode
• Simulated perturbation: interrupt blocking time delays the GPIO control loop
• Any firmware latency directly affects the output stability
• Hard realtime requirements for direct GPIO control loop
[Plot: LTspice transient simulation, 130.5 ms to 136.5 ms. Traces: V(adc_val), V(adc_in), V(vip), V(isr_block)]
Detail: firmware interference (2)
• Control loop via PWM as Servo drive
• Same delays caused by interrupt blocking time, delaying the PWM error update
• PWM servo maintains DC voltage, with minor error deviations
• Soft realtime requirements for PWM Servo control loop
• Accept soft timing failures from Firmware operation
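The contrast between the two modes can be reproduced with a toy model. The sketch below is not the LTspice simulation from the talk: it uses an idealized integrator, represents the PWM by its average duty, and assumes a 2 ms interrupt blocking window during which the firmware cannot update its output. All numeric values are illustrative.

```python
def worst_deviation(mode, steps=2000, dt=1e-5, setpoint=1.25,
                    slew=100.0, gain=2.0, block=(1000, 1200)):
    """Worst |v - setpoint| from the start of the blocking window onward.

    mode "gpio": firmware drives the pin fully high or low each loop pass.
    mode "pwm":  firmware updates a duty cycle; the PWM hardware keeps
                 running at the last duty while the firmware is blocked.
    """
    v = setpoint + 0.010      # small initial offset so both loops do work
    u = 0.5                   # drive level: 0..1, 0.5 holds the voltage
    worst = 0.0
    for n in range(steps):
        blocked = block[0] <= n < block[1]
        if not blocked:       # firmware only runs outside the ISR block
            error = v - setpoint
            if mode == "gpio":
                u = 1.0 if error > 0 else 0.0
            else:
                u = min(1.0, max(0.0, 0.5 + gain * error))
        v -= slew * (u - 0.5) * dt      # inverting integrator
        if n >= block[0]:
            worst = max(worst, abs(v - setpoint))
    return worst


gpio = worst_deviation("gpio")
pwm = worst_deviation("pwm")
print(f"GPIO mode worst deviation: {gpio * 1e3:.2f} mV")
print(f"PWM mode worst deviation:  {pwm * 1e3:.2f} mV")
```

In this model the GPIO pin stays at a rail for the whole blocking window, so the integrator ramps at full slew, while the held PWM duty sits close to 50% and the voltage barely moves. That is the behavior the bullets describe as hard versus soft realtime requirements.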
Final thoughts
• Failsafe design is an essential part of Embedded Systems
• On ultralow cost IoT systems, functional safety can be hard to achieve
• Failsafe cells operate at the interfaces of the control chain
• Implementation cost is very attractive, enabling use of low-end processors
• Simple Mixed-Signal techniques can be used in failsafe cells