Evaluation and Validation Peter Marwedel TU Dortmund, Informatik 12 Germany Graphics: © Alexandra...

Preview:

Citation preview

Evaluation and Validation

Peter MarwedelTU Dortmund, Informatik 12

Germany

Gra

phic

s: ©

Ale

xand

ra N

olte

, Ges

ine

Mar

wed

el, 2

003

These slides use Microsoft clip arts. Microsoft copyright restrictions apply. 2012 年 01 月 12 日

marwedel
According to our timing, lectures should hardly have more than 40 slides. Therefore, some leftover slides are placed at the end of this file.

- 2 - p. marwedel, informatik 12, 2011

Structure of this course

2:Specification

3: ES-hardware

4: system software (RTOS, middleware, …)

8:Test

5: Evaluation & Validation (energy, cost, performance, …)

7: Optimization

6: Application mapping

App

licat

ion

Kno

wle

dge Design

repositoryDesign

Numbers denote sequence of chapters

- 3 - p. marwedel, informatik 12, 2011

Real-time calculus (RTC)/Modular performance analysis (MPA)

1

2

3

p 2p 3p

u

l1

2

3

p 2p 3p

u

l

p-J p+J

periodic event stream periodic event stream with jitter

Thiele et al. (ETHZ): Extended networkcalculus: Arrival curves describe the maximum and minimum number of events arriving in some time interval .Examples:

p pp-J P+J

- 4 - p. marwedel, informatik 12, 2011

RTC/MPA: Service curves

Service curves u resp. ℓ describe the maximum and minimum service capacity available in some time interval

s

p

bandwidth b

TDMA bus

ps p-s p+s

b s

2p

ul

Example:

- 5 - p. marwedel, informatik 12, 2011

RTC/MPA: Workload characterization

u resp. ℓ describe the maximum and minimumservice capacity required as a function of the number e of events. Example:

1 2 3

4

8

12

16

e

WCET=4

BCET=3

u

l(Defined only for an integer number of events)

- 6 - p. marwedel, informatik 12, 2011

RTC/MPA: Workload required for incoming stream

uuu

Incoming workload

lll

Upper and lower bounds on the number of events

ulu

1)(

lul 1

)(

- 7 - p. marwedel, informatik 12, 2011

RTC/MPA: System of real time components

ul ,

RTC

RTC’

','

u

RTC”

','ul

ul ,

Incoming event streams and available capacityare transformed by real-time components:

Theoretical results allow the computation of properties of outgoing streams

- 8 - p. marwedel, informatik 12, 2011

RTC/MPA: Transformation of arrival and service curves

Resulting arrival curves:

Remaining service curves:

Where: )()(inf ugutftgf

tu

0

)()(inf ugutftgfu

0

0' luu

0' ull

)()(sup ugutftgftu

0

)()(sup ugutftgfu

0

- 9 - p. marwedel, informatik 12, 2011

RTC/MPA: Remarks

Details of the proofs can be found in relevant references

Results also include bounds on buffer sizes and on maximum latency.

Theory has been extended into various directions,e.g. for computing remaining battery capacities

- 10 - p. marwedel, informatik 12, 2011

Application: In-Car Navigation System

Car radio with navigation systemUser interface needs to be responsiveTraffic messages (TMC) must be processed in a timely waySeveral applications may execute concurrently

© Thiele, ETHZ

- 11 - p. marwedel, informatik 12, 2011

System Overview

NAV RAD

MMI

DB

Communication

© Thiele, ETHZ

- 12 - p. marwedel, informatik 12, 2011

NAV RAD

MMI

DB

Communication

Use case 1: Change Audio Volume

< 200 ms

< 50 m

s

© Thiele, ETHZ

- 13 - p. marwedel, informatik 12, 2011

Use case 1: Change Audio Volume

© Thiele, ETHZ

CommunicationResourceDemand

- 14 - p. marwedel, informatik 12, 2011

NAV RAD

MMI

DB

Communication

< 200 ms

Use case 2: Lookup Destination Address

© Thiele, ETHZ

- 15 - p. marwedel, informatik 12, 2011

Use case 2: Lookup Destination Address

© Thiele, ETHZ

- 16 - p. marwedel, informatik 12, 2011

NAV RAD

MMI

DB

Communication

Use case 3: Receive TMC Messages

< 1000 ms

© Thiele, ETHZ

- 17 - p. marwedel, informatik 12, 2011

Use case 3: Receive TMC Messages

© Thiele, ETHZ

- 18 - p. marwedel, informatik 12, 2011

Proposed Architecture Alternatives

NAV RAD

MMI

22 MIPS

11 MIPS113 MIPS

NAV RAD

MMI

22 MIPS

11 MIPS113 MIPS

RAD

260 MIPS

NAV

MMI

22 MIPS

RAD

130 MIPS

MMI

NAV

113 MIPS

MMI

260 MIPS

RAD

NAV

72 kbps

72 kbps 57 kbps

72 k

bp

s

72 k

bp

s

(A)

(E)(D)(C)

(B)

© Thiele, ETHZ

- 19 - p. marwedel, informatik 12, 2011

Step 1: Environment (Event Steams)

Event Stream Model

e.g. Address Lookup(1 event / sec)

u

[s]

[events]

1

1

© Thiele, ETHZ

- 20 - p. marwedel, informatik 12, 2011

Step 2: Architectural Elements

Event Stream Model

e.g. Address Lookup(1 event / sec)

Resource Model

e.g. unloaded RISC CPU(113 MIPS) l=u

u

l

[s]

[s]

[MIPS]

[events]

1

1

1

113

© Thiele, ETHZ

- 21 - p. marwedel, informatik 12, 2011

Step 3: Mapping / Scheduling

Rate Monotonic Scheduling(Pre-emptive fixed priority scheduling):

Priority 1: Change Volume (p=1/32 s)

Priority 2: Address Lookup (p=1 s)

Priority 3: Receive TMC (p=6 s)

© Thiele, ETHZ

- 22 - p. marwedel, informatik 12, 2011

Step 4: Performance Model

CPU1 BUS CPU3CPU2

Change Volume

Receive TMC

NAV RAD

MMI

Address Lookup

MMI NAV RAD

© Thiele, ETHZ

- 23 - p. marwedel, informatik 12, 2011

Step 5: Analysis

CPU1 BUS CPU3CPU2

Change Volume

Receive TMC

NAV RAD

MMI

Address Lookup

MMI NAV RAD

© Thiele, ETHZ

- 24 - p. marwedel, informatik 12, 2011

Analysis – Design Question 1

How do the proposed system architecturescompare in respect to end-to-end delays?

© Thiele, ETHZ

- 25 - p. marwedel, informatik 12, 2011

End-to-end delays: (A) (E)(D)(C)(B)

Analysis – Design Question 1

0

50

100

Vol Key 2 Audio [ms]

0

50

100

Address Lookup [ms]0

500

1000

1500

TMC Decode [ms]

© Thiele, ETHZ

0

20

40

60

Vol Vis. 2 Audio [ms]

- 26 - p. marwedel, informatik 12, 2011

Analysis – Design Question 2

How robust is architecture A?

Where is the bottleneck of this architecture?

NAV RAD

MMI

22 MIPS

11 MIPS113 MIPS

72 kbps

(A)

© Thiele, ETHZ

- 27 - p. marwedel, informatik 12, 2011

TMC delay vs. MMI processor speed:

Analysis – Design Question 2

NAV RAD

MMI

22 MIPS

11 MIPS113 MIPS

72 kbps

(A)

26.4 MIPS

© Thiele, ETHZ

- 28 - p. marwedel, informatik 12, 2011

Conclusions

Easy to construct models (~ half day)

Evaluation speed is fast and linear to model complexity(~ 1s per evaluation)

Needs little information to construct early models(Fits early design cycle very well)

Even though involved mathematics is very complex, the method is easy to use (Language of engineers)

© Thiele, ETHZ

- 29 - p. marwedel, informatik 12, 2011

Evaluation of designsaccording to multiple objectives

Different design objectives/criteria are relevant: Average performance Worst case performance Energy/power consumption Thermal behavior Reliability Electromagnetic compatibility Numeric precision Testability Cost Weight, robustness, usability, extendibility, security,

safety, environmental friendliness

- 30 - p. marwedel, informatik 12, 2011

Energy- and power models

Power/energy models becoming increasingly important

due to mobile computing,

since energy availability becomes more relevant due to increased performance, and

due to environmental issues.

PdtE

- 31 - p. marwedel, informatik 12, 2011

Energy models

Tiwari (1994): Energy consumption within processors

Simunic (1999): Using values from data sheets. Allows modeling of all components, but not very precise.

Russell, Jacome (1998): Measurements for 2 fixed configurations

Steinke et al., UniDo (2001): mixed model using measurements and prediction

CACTI [Jouppi, 1996]: Predicted energy consumption of caches

Wattch [Brooks, 2000]: Power estimation at the architectural level, without circuit or layout

- 32 - p. marwedel, informatik 12, 2011

Steinke’s & Knauer’s model

E.g.: ATMEL board with ARM7TDMI and ext. SRAM

DataMemory

InstructionMemory

IAddr

VDD

ALU

Multi-plier

BarrelShifter

Register File

Instr. Decoder& Control Logic

Inst

rImm

RegValue

Reg#

Opcode

ARM7DAddr

mA mA

Data

Instr

Etotal = Ecpu_instr + Ecpu_data + Emem_instr + Emem_data

- 33 - p. marwedel, informatik 12, 2011

Example:Instruction dependent costs in the CPU

Ecpu_instr = MinCostCPU(Opcodei) + FUCost(Instri-1,Instri)

1 * w(Immi,j) + ß1 * h(Immi-1,j, Immi,j) +

2 * w(Regi,k) + ß2 * h(Regi-1,k, Regi,k) +

3 * w(RegVali,k) + ß3 * h(RegVali-1,k, RegVali,k) +

4 * w(IAddri) + ß4 * h(IAddri-1, IAddri)

Cost for a sequence of m instructions

w: number of ones;h: Hamming distance;FUCost: cost of switching functional units, ß: determined through experiments

- 34 - p. marwedel, informatik 12, 2011

Hamming Distance between adjacent addresses is playing major role

h-costs, address bus, CPU + memory current

110

115

120

125

130

135

140

145

150

0 5 10 15 20

Hamming-Distance

Cu

rren

t [m

A]

Read

Write

- 35 - p. marwedel, informatik 12, 2011

Results

It is not important, which address bit is set to ‘1‘

The number of ‘1‘s in the address bus is irrelevant

The cost of flipping a bit on the address bus is independent of the bit position.

It is not important, which data bit is set to ‘1‘

The number of ‘1‘s on the data bus has a minor effect (3%)

The cost of flipping a bit on the data bus is independent of the bit position.

- 36 - p. marwedel, informatik 12, 2011

CACTI model

Comparison with SPICE

Cache model used

http://research.compaq.com/wrl/people/jouppi/CACTI.html

- 37 - p. marwedel, informatik 12, 2011

Access times and energy consumption increase with the size of the memory

Example (CACTI Model): "Currently, the size of some applications is doubling every 10 months" [STMicroelectronics, Medea+ Workshop, Stuttgart, Nov. 2003]

- 38 - p. marwedel, informatik 12, 2011

Influence of the associativity

Parameters different from previous slides

[P. Marwedel et al., ASPDAC, 2004]

- 39 - p. marwedel, informatik 12, 2011

Evaluation of designsaccording to multiple objectives

Different design objectives/criteria are relevant: Average performance Worst case performance Energy/power consumption Thermal behavior Reliability Electromagnetic compatibility Numeric precision Testability Cost Weight, robustness, usability, extendibility, security,

safety, environmental friendliness

- 40 - p. marwedel, informatik 12, 2011

Thermal models

Thermal models becoming increasingly important

since temperatures become more relevant due to increased performance, and

since temperatures affect

• usability and

• dependability.

- 41 - p. marwedel, informatik 12, 2011

Model components

Thermal conductance reflects the amount of heat transferred through a plate (made from that material) of area A and thickness L when the temperatures at the opposite sides differ by one Kelvin.

The reciprocal of thermal conductance is called thermal resistance.

Thermal resistances add up like electrical resistances

Masses storing heat correspond to capacitors

Thermal modeling typically uses equivalent electrical models and employs well-known techniques for solving electrical network equations

- 42 - p. marwedel, informatik 12, 2011

Results of simulationsbased on thermal models (1)

Source: http://www.coolingzone.com/Guest/News/NL_JUN_2001/Campi/Jun_Campi_2001.html

Encapsulated cryptographic coprocessor:

- 43 - p. marwedel, informatik 12, 2011

Results of simulationsbased on thermal models (2)

Source: http://www.flotherm.com/applications/app141/hot_chip.pdf

Microprocessor

- 44 - p. marwedel, informatik 12, 2011

Summary

Thiele’s real-time calculus (RTC)/MPA

• Using bounds on the number of events in input streams

• Using bounds on available processing capability

• Derives bounds on the number of events in output streams

• Derives bound on remaining processing capability, buffer sizes, …

• Examples demonstrate design procedure based on RTC

Energy and power consumption Thermal behavior

- 45 - p. marwedel, informatik 12, 2011

Some more details

- 46 - p. marwedel, informatik 12, 2011

Analysis – Design Question 3

Architecture D is chosen for further investigation.

How should the processors be dimensioned?

RAD

130 MIPS

MMI

NAV

113 MIPS72 k

bp

s(D)

© Thiele, ETHZ

- 47 - p. marwedel, informatik 12, 2011

RAD

130 MIPS

MMI

NAV

113 MIPS

72 k

bp

s

33 MIPS29 MIPS

Analysis – Design Question 3

dmax = 200

dmax = 1000

dmax = 50

dmax = 200

© Thiele, ETHZ

- 48 - p. marwedel, informatik 12, 2011

Hamming Distance between adjacent values on the data bus is important

h-costs, data bus, CPU+memory currents

110

115

120

125

130

135

140

145

150

0 5 10 15 20

Hamming-Distance

Cu

rre

nt

[mA

]

Read

Write

- 49 - p. marwedel, informatik 12, 2011

Other costs

Ecpu_data = 5 * w(DAddri) + ß5 * h(DAddri-1, DAddri) + 6 * w(Datai) + ß6 * h(Datai-1, Datai)

Emem_instr = MinCostMem(InstrMem,Word_widthi) + 7 * w(IAddri) + ß7 * h(IAddri-1, IAddri) + 8 * w(IDatai) + ß8 * h(IDatai-1, IDatai)

Emem_data = MinCostMem (DataMem, Direction, Word_widthi) + 9 * w(DAddri) + ß9 * h(DAddri-1, DAddri)

+ 10 * w(Datai) + ß10 * h(Datai-1, Datai)

Recommended