
Impact of Power-Management Granularity on the Energy-Quality Trade-off for Soft and Hard Real-Time Applications

International Symposium on System-on-Chip, 2008

A. Milutinovic, K. Goossens, and G.J.M. Smit

Advisor: Shiann-Rong Kuang

Speaker: Hao-Yi Jheng (鄭浩逸)

2009.2.26


Outline

  Introduction
  Application model
  Work and slack
  Policy
  Conservativeness and Granularity
  Experimental Results
  Conclusions

Application model

The paper evaluates two power-management policies at a number of different granularities on an MPEG4 application, in terms of energy and quality (deadline misses).

Granularity (N): how often the operating point changes, expressed as the number of frames between changes.

Hard real-time applications: no frame may miss its deadline, so they use conservative power management.

Soft real-time applications: a limited number of frames may miss their deadlines, so they can use non-conservative power management.

Work and slack

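Work ($w_i$): the number of processor cycles needed for frame $i$.

Actual execution time: $acet_i = w_i / f_i$, where $f_i$ is the clock frequency used for frame $i$.

Relative deadline: $acet_i \le T = 1 / f_{FR}$, where $T$ is the deadline and $f_{FR}$ is the frame rate.

A relative deadline miss means that frame $i$ on its own runs past the deadline.

Relative slack ($r$): $r_i = T - acet_i$

Absolute deadline: $\sum_{j=0}^{i} acet_j \le (i+1) T$

An absolute deadline miss means that the cumulative execution time of frames 0 to $i$ exceeds the total deadline.

Absolute slack ($s$): $s_i = (i+1) T - \sum_{j=0}^{i} acet_j$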
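The work and slack definitions can be sketched in a few lines of Python. This is an illustration only: the function name `slack_profile` and the example numbers are hypothetical, and it assumes frames are numbered from 0, so that frame i is due at (i+1)T.

```python
from itertools import accumulate

def slack_profile(work, freq, frame_rate):
    """Return (relative slack r_i, absolute slack s_i) for each frame.

    work: cycles per frame (w_i); freq: clock frequency per frame in Hz (f_i);
    frame_rate: frames per second, so the deadline is T = 1 / frame_rate.
    """
    T = 1.0 / frame_rate
    acet = [w / f for w, f in zip(work, freq)]   # acet_i = w_i / f_i
    relative = [T - a for a in acet]             # r_i = T - acet_i
    cumulative = accumulate(acet)                # sum of acet_j for j = 0..i
    # Assumption: frame i is due at (i + 1) * T (frames numbered from 0).
    absolute = [(i + 1) * T - c for i, c in enumerate(cumulative)]
    return relative, absolute

# Hypothetical workload: three frames at a fixed 100 MHz, 25 fps.
work = [2.0e6, 5.0e6, 2.0e6]
freq = [1.0e8, 1.0e8, 1.0e8]
relative, absolute = slack_profile(work, freq, frame_rate=25)

relative_misses = [r < 0 for r in relative]   # frame alone exceeds T
absolute_misses = [s < 0 for s in absolute]   # cumulative time exceeds (i+1)T
```

In this example the second frame misses its relative deadline, but the slack left by the first frame absorbs the overrun, so no absolute deadline is missed.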


Conservative Policy

Conservative power-management policy: introduces no additional deadline misses compared with operating at $f_{max}$.

Non-conservative power-management policy: some frames may miss their deadlines.

Policy

Perfect predictor policy (non-conservative): accurately predicts the workload of the next N frames and scales the frequency to the average needed for those frames.

$f_{avg}^{i} = \left( \sum_{j=0}^{N-1} w_{i \cdot N + j} \right) / (N T)$, for group $i$

Proven slack policy (conservative): the proven slack is the cumulative slack of the frames before the group. The policy assumes that the next N frames all require the worst-case work, but uses all the proven slack of the previous group to reduce the processor frequency.

$f_{max}^{i} = N \cdot \max_{j \ge 0}(w_j) / (N T + s_{i-1})$, for group $i$
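A minimal sketch of the two group frequencies, for illustration only. The function names, the `period` parameter (the deadline T) and the example workload are hypothetical. The worst-case work is passed in as a parameter, and the proven slack is taken as 0 for the first group, which is an assumption.

```python
def perfect_predictor_frequency(work, group, n, period):
    """Average frequency (Hz) for group `group`, given the known work of its n frames."""
    frames = work[group * n:(group + 1) * n]   # w_{i*N+j} for j = 0..N-1
    return sum(frames) / (n * period)

def proven_slack_frequency(worst_case_work, n, period, proven_slack):
    """Frequency (Hz) that finishes n worst-case frames within n*T plus the
    slack already proven by the previous group."""
    return n * worst_case_work / (n * period + proven_slack)

# Hypothetical workload: cycles per frame, 25 fps, groups of N = 2 frames.
work = [2.0e6, 3.0e6, 1.0e6, 2.0e6]
period = 1.0 / 25

f_avg_0 = perfect_predictor_frequency(work, group=0, n=2, period=period)
f_avg_1 = perfect_predictor_frequency(work, group=1, n=2, period=period)

# Assumption: no slack has been proven before the first group.
f_ps_first = proven_slack_frequency(max(work), n=2, period=period, proven_slack=0.0)
# With 20 ms of proven slack the same worst-case group can run slower.
f_ps_later = proven_slack_frequency(max(work), n=2, period=period, proven_slack=0.02)
```

The sketch shows why the proven-slack policy stays conservative: it never relies on a prediction, only on slack that has already been accumulated.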


Experimental Results (1/5)

An MPEG4 decoder running on an ARM946 at 86 MHz, at 25 frames per second (fps) and a resolution of 176×144 pixels.

Experimental Results (2/5)

Energy savings relative to operating at $f_{max}$ are around 30% for granularities of 1-128 frames.

The cost for the power management is 2%.

Above 128 frames, the energy of the proven-slack policy rises linearly.

Experimental Results (3/5)

The proven-slack policy cannot always exploit the accumulated slack.

Average slack: $\frac{1}{S} \sum_{i=0}^{S-1} s_i$, for a sequence of S frames

Worst-case slack: $\max_{0 \le i \le S-1} s_i$, for a sequence of S frames
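The two slack statistics can be computed directly from a list of per-frame absolute slack values. The sketch below is illustrative: the function name `slack_statistics` and the sample values are hypothetical, and the worst case is taken as the maximum, following the slide's formula.

```python
def slack_statistics(absolute_slack):
    """Return (average slack, worst-case slack) over a sequence of S frames."""
    s = list(absolute_slack)
    average = sum(s) / len(s)   # (1/S) * sum of s_i
    worst_case = max(s)         # Max of s_i, as written on the slide
    return average, worst_case

# Hypothetical absolute slack values in seconds for S = 3 frames.
average, worst_case = slack_statistics([0.02, 0.01, 0.03])
```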

Experimental Results (4/5)

Perfect predictor policy: a 95% quality improvement costs only 3% additional energy. The optimum is 13000 mJ.

Experimental Results (5/5)


Many frames can be processed in the range of 240-250 MHz.


Conclusions

1. A long tail in the work distribution results in a steep quality improvement: from almost 0% to almost 100% at an additional energy cost of only 3%.

2. The proven-slack policy offers 100% quality at only 0.3% more energy than the perfect-predictor policy, which is a theoretical upper bound and is hard to achieve in practice.

3. The energy of the policies increases by only 2% when increasing the granularity to 128 frames.

Conclusions

Non-conservative

Conservative

Tardiness = (sum of frame delay times / number of frames) / deadline
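As a worked example of the tardiness metric, the sketch below assumes that a frame's delay time is how far past its deadline the frame finishes, and zero when it finishes on time. That reading, the function name `tardiness` and the sample times are assumptions, not taken from the slides.

```python
def tardiness(finish_times, deadlines, period):
    """(sum of frame delay times / number of frames) / deadline.

    Assumption: delay time = max(0, finish time - deadline) per frame.
    `period` is the per-frame deadline T used for normalization.
    """
    delays = [max(0.0, f - d) for f, d in zip(finish_times, deadlines)]
    return (sum(delays) / len(delays)) / period

# Hypothetical run at 25 fps: only the second frame finishes late, by 20 ms.
period = 0.04
deadlines = [0.04, 0.08, 0.12]
finish_times = [0.03, 0.10, 0.12]
result = tardiness(finish_times, deadlines, period)
```

Here the average delay is 20 ms over three frames, which normalized by the 40 ms deadline gives a tardiness of about 0.17.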

[Equation: FRV, defined over N frames from the actual frame rate $fps_{act}^{i}$ of each frame and the target frame rate $fps_{target}$]

Comparison


Progress report


Advisor: Shiann-Rong Kuang

Speaker: Hao-Yi Jheng

2009.2.23

Outline

  How to choose voltage/frequency level
  Adaptive Inter-compensation
  Adaptive Experimental Result
  Future Work

How to choose voltage/frequency level


5.83 3.57 1.16 1.52 1.30 0.08 0.97

Why inter-compensation is needed

Inter-compensation PID

Adaptive inter-compensation

If (the previous frame's predicted cycle number is more cycles)
    the current frame's predicted voltage level decreases by one
else
    the current frame's predicted voltage does not change

If( ) = 2000

else = 27000

[Equation: PID inter-compensation, with terms $K_p$, $T_I$ and $T_D$]

Inter-compensation


Experimental Result

Energy (e+08)

         No-inter   2000       27000      adaptive
API_00   2.13389    1.89694    2.10778    1.98991
API_01   1.41421    1.18232    1.25112    1.23007
API_02   2.57939    2.20497    2.34232    2.29719
API_03   1.65572    1.4108     1.49139    1.45527
API_04   2.20379    1.88178    2.06792    1.99084
API_05   1.24353    1.04672    1.16125    1.11097

FRV

         No-inter   2000       27000      adaptive
API_00   66.2636    32.0008    76.9116    39.8287
API_01   35.9665    8.86423    0.541534   0.281196
API_02   24.9081    6.53828    1.00831    1.28403
API_03   41.9968    12.2053    0.341697   1.0757
API_04   18.3523    7.35752    3.91522    1.03591
API_05   25.4673    26.3545    1.5618     3.66423

Future Work

We need hardware GM and RM cycle numbers to verify the experimental results.

The driver needs to support dumping the GM and RM cycle numbers for prediction.
