Array Operation Synthesis to Optimize Data Parallel Programs

Department of Computer Science,

National Tsing-Hua University

Student: Gwan-Hwan Hwang

Advisor: Dr. Jenq Kuen Lee

Array Operation Synthesis on Distributed-memory Machines

Department of Computer Science, National Tsing-Hua University

Gwan-Hwan Hwang, Ph.D.

Research Interests and Key Issues

Compiler Optimization for Parallel Computations on Distributed & Shared Memory Machines

• Communication Code for Block-Cyclic Distribution of HPF (IPPS'98)

• Array Operation Synthesis for Intrinsic Array Functions (JPDC, ACM PPoPP'95, ICPP'96)

• Automatic Alignment for Data Parallel Languages (LCPC'97)

Concurrent Testing

• Reachability Testing of Concurrent Programs (IJSEKE'95, APSEC'93)

Parallel Object Program Model & Heterogeneous Computing

• Java-Based Network Computing Environment

• Transparent Parallel Computing Environment (Ongoing)

Outline of Presentation

• Fortran 90 Intrinsic Array Operations

• Array Operation Synthesis (AOS)

• SYNTOOL

• Apply AOS to Shared-Memory Machines

• Apply AOS to Distributed-Memory Machines

• Integrate AOS with Automatic Data Alignment

• Conclusion and Future Work

Intrinsic Array Operations

• Provided by modern programming languages, e.g. Fortran 90, High Performance Fortran (HPF), HPF2, Fortran 97, APL, MATLAB, Mathematica, NESL, C*

• Used in engineering and scientific applications

• Facilitate compiler analysis for optimization

• Support parallel execution and portability

Intrinsic Array Operations (Cont'd)

• Array Operations Provided by Fortran 90, HPF.

• Examples: CSHIFT, TRANSPOSE, MERGE, EOSHIFT, RESHAPE, SPREAD, Section Move, Where Constructs, Reductions.

Array A:
 1  2  3  4
 5  6  7  8
 9 10 11 12
13 14 15 16

B=CSHIFT(A,1,1)

Array B:
 5  6  7  8
 9 10 11 12
13 14 15 16
 1  2  3  4

C=TRANSPOSE(B)

Array C:
 5  9 13  1
 6 10 14  2
 7 11 15  3
 8 12 16  4
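The CSHIFT and TRANSPOSE example on this slide can be reproduced with a small sketch. This is an illustrative emulation in Python on nested lists, not Fortran; the helper names `cshift_dim1` and `transpose` are hypothetical, and `cshift_dim1` assumes the shift runs along the first dimension, so that B(I,J) = A(I+1,J) with wrap-around.

```python
# Illustrative emulation of two Fortran 90 intrinsics on nested lists.
# The helper names are hypothetical; they are not part of any library.

def cshift_dim1(a, shift):
    """Circular shift along the first dimension: b[i][j] = a[(i + shift) mod n][j]."""
    n = len(a)
    return [list(a[(i + shift) % n]) for i in range(n)]


def transpose(a):
    """Transpose: b[i][j] = a[j][i]."""
    return [list(col) for col in zip(*a)]


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]

B = cshift_dim1(A, 1)   # corresponds to B=CSHIFT(A,1,1)
C = transpose(B)        # corresponds to C=TRANSPOSE(B)

for row in C:
    print(row)
```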

Consecutive Array Expressions

• Array Expression

C=EOSHIFT(MERGE(RESHAPE(S,/N,N/),A+B,T),1,0,1)

• Consecutive Array Operations

FXP=CSHIFT(F1,1,+1)
FXM=CSHIFT(F1,1,-1)
FYP=CSHIFT(F1,2,+1)
FYM=CSHIFT(F1,2,-1)
FDERIV=ZXP*(FXP-F1)+ZXM*(FXM-F1)+ZYP*(FYP-F1)+ZYM*(FYM-F1)

Classification of Array Operations

• Model Array Operations by Data Access Functions (DAF)

                   Single-Clause   Multiple-Clause
Single-Source      Type 1          Type 3
Multiple-Source    Type 2          Type 4

Data Access Functions

• Represent Array Operations by Mathematical Functions

• Model Array Operations by Data Access Functions (DAF), classified as Single-Source or Multiple-Source and as Single-Clause or Multiple-Clause

Type 1: Single-source Single-clause Data Access Function

• One Source Array

• One Data Access Pattern

B=TRANSPOSE(A)

Array A:
 5  6  7  8
 9 10 11 12
13 14 15 16
 1  2  3  4

Array B:
 5  9 13  1
 6 10 14  2
 7 11 15  3
 8 12 16  4

Data Access Function is B(I,J)=A(J,I)

Type 2: Multiple-source Single-clause Data Access Function

• Multiple Source Arrays

• One Data Access Pattern

R=MERGE(T,F,M)

Array T    Array F    Array M    Array R
1 1 1      2 2 2      T F T      1 2 1
1 1 1      2 2 2      T T F      1 1 2
1 1 1      2 2 2      F F T      2 2 1

Data Access Function is

$$
R(I,J) = \mathcal{F}\big(T(I,J),\, F(I,J),\, M(I,J)\big)
\quad\text{where}\quad
\mathcal{F}(x,y,z) =
\begin{cases}
x & \text{if } z = \text{True} \\
y & \text{if } z = \text{False}
\end{cases}
$$
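As a rough illustration of a multiple-source, single-clause access pattern, the MERGE example can be written as one element-wise selection over three source arrays. The sketch below is plain Python rather than Fortran; `merge` is a hypothetical helper, and the mask values are the ones read off the slide's Array M.

```python
# Hypothetical helper emulating MERGE(T, F, M) on nested lists.

def merge(t, f, m):
    """Element-wise select: r[i][j] = t[i][j] if m[i][j] is true, else f[i][j]."""
    return [[tv if mv else fv for tv, fv, mv in zip(t_row, f_row, m_row)]
            for t_row, f_row, m_row in zip(t, f, m)]


T = [[1] * 3 for _ in range(3)]
F = [[2] * 3 for _ in range(3)]
M = [[True, False, True],
     [True, True, False],
     [False, False, True]]

R = merge(T, F, M)
for row in R:
    print(row)
```

Every element of R is produced by the same rule, which is why MERGE has a single clause even though it reads three arrays.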

Type 3: Single-source Multiple-clause Data Access Function

• Single Source Array

• Multiple Data Access Patterns

B=CSHIFT(A,1,1)

Array A           Array B
 1  2  3  4        5  6  7  8
 5  6  7  8        9 10 11 12
 9 10 11 12       13 14 15 16
13 14 15 16        1  2  3  4

Data Access Function is

$$
B(I,J) =
\begin{cases}
A(I+1,\,J) & \mid\ \phi(/I,J/,\ /1:3:1,\ 1:4:1/) \\
A(I-3,\,J) & \mid\ \phi(/I,J/,\ /4:4:1,\ 1:4:1/)
\end{cases}
$$

where $\phi(\cdot,\cdot)$ is a segmentation descriptor.
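One way to read a multiple-clause data access function is as a list of (index mapping, segmentation descriptor) pairs, where the descriptor decides which clause applies to a given (I,J). The following sketch assumes that reading; the names `in_segment`, `CLAUSES`, and `apply_daf` are hypothetical, and indices are kept 1-based to match the slide.

```python
# A sketch of evaluating a multiple-clause DAF. All names are illustrative.

def in_segment(point, ranges):
    """Segmentation descriptor test: each coordinate lies in its l:u:s range."""
    return all(l <= p <= u and (p - l) % s == 0
               for p, (l, u, s) in zip(point, ranges))


# B=CSHIFT(A,1,1) on a 4x4 array, written as two clauses (1-based indices).
CLAUSES = [
    (lambda i, j: (i + 1, j), [(1, 3, 1), (1, 4, 1)]),  # B(I,J) = A(I+1,J)
    (lambda i, j: (i - 3, j), [(4, 4, 1), (1, 4, 1)]),  # B(I,J) = A(I-3,J)
]


def apply_daf(a, clauses, n):
    """Fill an n x n result by applying the first clause whose descriptor holds."""
    b = [[None] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            for source_index, ranges in clauses:
                if in_segment((i, j), ranges):
                    si, sj = source_index(i, j)
                    b[i - 1][j - 1] = a[si - 1][sj - 1]
                    break
    return b


A = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
B = apply_daf(A, CLAUSES, 4)
for row in B:
    print(row)
```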

Type 4: Multiple-source Multiple-clause Data Access Function

• Multiple Source Arrays

• Multiple Data Access Patterns

No Fortran 90 array operation belongs to Type 4, but the synthesis of multiple array operations may derive a Type 4 data access function.

Straightforward Compilation

• Translate each operation into a parallel loop

B=CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1)

EOSHIFT:
FORALL (I=1:N:1; J=1:N:1)
  IF (1<=I<=N-1) and (1<=J<=N) THEN T1(I,J)=A(I+1,J)
  ELSE T1(I,J)=0
ENDFORALL

TRANSPOSE:
FORALL (I=1:N:1; J=1:N:1)
  T2(I,J)=T1(J,I)
ENDFORALL

CSHIFT:
FORALL (I=1:N:1; J=1:N:1)
  IF (1<=I<=N-1) and (1<=J<=N) THEN B(I,J)=T2(I+1,J)
  ELSE B(I,J)=T2(I-N+1,J)
ENDFORALL
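The cost of the straightforward translation is easier to see when each loop is written as its own pass. The sketch below mirrors the three loops in Python, assuming a square N x N array stored as nested lists; the function names are hypothetical, and the two temporaries `t1` and `t2` correspond to T1 and T2.

```python
# A sketch of straightforward compilation: one full pass and one temporary
# array per operation. Function names are illustrative only.

def eoshift(a, n):
    """T1(I,J) = A(I+1,J) for I <= N-1, and 0 for the last row."""
    return [[a[i + 1][j] if i < n - 1 else 0 for j in range(n)] for i in range(n)]


def transpose(a, n):
    """T2(I,J) = T1(J,I)."""
    return [[a[j][i] for j in range(n)] for i in range(n)]


def cshift(a, n):
    """B(I,J) = T2(I+1,J), wrapping the last row around to the first."""
    return [[a[(i + 1) % n][j] for j in range(n)] for i in range(n)]


def straightforward(a):
    n = len(a)
    t1 = eoshift(a, n)      # first temporary
    t2 = transpose(t1, n)   # second temporary
    return cshift(t2, n)


A = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
for row in straightforward(A):
    print(row)
```

Each pass touches all N*N elements and writes a full intermediate array, which is the overhead that synthesis aims to remove.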

Array Operation Synthesis

• Construct the Parse Tree of Array Expression

• Represent Array Operations by Mathematical Functions (DAF)

B=CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1)

Parse tree (root to leaf): CSHIFT, TRANSPOSE, EOSHIFT

CSHIFT:

$$
B(I,J) =
\begin{cases}
T2(I+1,\,J) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\
T2(I-N+1,\,J) & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/)
\end{cases}
$$

TRANSPOSE:

$$
T2(I,J) = T1(J,I)
$$

EOSHIFT:

$$
T1(I,J) =
\begin{cases}
A(I+1,\,J) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\
0 & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/)
\end{cases}
$$

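Array Operation Synthesis (Cont'd)

Synthesis of two functions: substituting the TRANSPOSE function $T2(I,J) = T1(J,I)$ into the CSHIFT function gives

CSHIFT + TRANSPOSE:

$$
B(I,J) =
\begin{cases}
T1(J,\,I+1) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\
T1(J,\,I-N+1) & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/)
\end{cases}
$$

EOSHIFT:

$$
T1(I,J) =
\begin{cases}
A(I+1,\,J) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\
0 & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/)
\end{cases}
$$

CSHIFT + TRANSPOSE + EOSHIFT:

$$
B(I,J) =
\begin{cases}
A(J+1,\,I+1) & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N/) \wedge \phi(/J,\,I+1/,\ /1:N-1,\ 1:N/) \\
0 & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N/) \wedge \phi(/J,\,I+1/,\ /N:N,\ 1:N/) \\
A(J+1,\,I-N+1) & \mid\ \phi(/I,J/,\ /N:N,\ 1:N/) \wedge \phi(/J,\,I-N+1/,\ /1:N-1,\ 1:N/) \\
0 & \mid\ \phi(/I,J/,\ /N:N,\ 1:N/) \wedge \phi(/J,\,I-N+1/,\ /N:N,\ 1:N/)
\end{cases}
$$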

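Synthesis of two Data Access Functions

• Substitution (a term-rewriting-like method)

Having two Data Access Patterns:

$$
T(i_1,\ldots,i_n) = S\big(f_1(i_1,\ldots,i_n),\ f_2(i_1,\ldots,i_n),\ \ldots,\ f_m(i_1,\ldots,i_n)\big)
$$

$$
S(i_1,\ldots,i_m) = Q\big(h_1(i_1,\ldots,i_m),\ h_2(i_1,\ldots,i_m),\ \ldots,\ h_p(i_1,\ldots,i_m)\big) \ \mid\ \gamma
$$

where $\gamma = \phi(/i_1,\ldots,i_m/,\ /l_1:u_1:s_1,\ \ldots,\ l_m:u_m:s_m/)$.

The Synthesized Data Access Pattern is:

$$
T(i_1,\ldots,i_n) = Q\big(g_1(i_1,\ldots,i_n),\ g_2(i_1,\ldots,i_n),\ \ldots,\ g_p(i_1,\ldots,i_n)\big) \ \mid\ \gamma'
$$

where

$$
g_k(i_1,\ldots,i_n) = h_k\big(f_1(i_1,\ldots,i_n),\ \ldots,\ f_m(i_1,\ldots,i_n)\big), \quad 1 \le k \le p
$$

$$
\gamma' = \phi(/f_1(i_1,\ldots,i_n),\ \ldots,\ f_m(i_1,\ldots,i_n)/,\ /l_1:u_1:s_1,\ \ldots,\ l_m:u_m:s_m/)
$$

If the clause defining T carries its own segmentation descriptor, that descriptor is conjoined with $\gamma'$ in the synthesized clause.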

Synthesis of two DAFs (Cont'd)

• For example, one DAF defines T1(i,j) from A, and a second DAF defines B(i,j) from T1 and T.

• By the substitution rule, B(i,j) is expressed directly in terms of A and T, guarded by the conjunction of the two segmentation descriptors.

[Equations on this slide are not recoverable from the extracted text.]
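The substitution rule can be sketched as function composition plus a conjunction of guards. The example below is an assumption-laden illustration rather than the slide's own example: it represents a single-clause DAF as a pair (index mapping, ranges), uses the hypothetical names `in_segment` and `synthesize`, and composes a transpose with a one-row shift for N = 4.

```python
# A sketch of synthesizing two single-clause DAFs by substitution.
# outer:  T(i) = S(f(i)) guarded by descriptor gamma_t
# inner:  S(i) = Q(h(i)) guarded by descriptor gamma_s
# result: T(i) = Q(h(f(i))) guarded by gamma_t(i) and gamma_s(f(i))

def in_segment(point, ranges):
    """Each coordinate must lie in its inclusive l:u:s range."""
    return all(l <= p <= u and (p - l) % s == 0
               for p, (l, u, s) in zip(point, ranges))


def synthesize(outer, inner):
    f, gamma_t = outer
    h, gamma_s = inner

    def g(*idx):
        # Composed index mapping h(f(idx)).
        return h(*f(*idx))

    def guard(*idx):
        # The inner descriptor is tested on the substituted indices f(idx).
        return in_segment(idx, gamma_t) and in_segment(f(*idx), gamma_s)

    return g, guard


N = 4
transpose_daf = (lambda i, j: (j, i), [(1, N, 1), (1, N, 1)])        # T2(I,J) = T1(J,I)
shift_daf = (lambda i, j: (i + 1, j), [(1, N - 1, 1), (1, N, 1)])    # T1(I,J) = A(I+1,J)

g, guard = synthesize(transpose_daf, shift_daf)
print(g(2, 3), guard(2, 3), guard(2, 4))
```

Here `g(2, 3)` gives the element of A that T2(2,3) reads, and `guard` reports whether that clause applies at all.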

Synthesis of two DAFs (Cont'd)

• For example, when S has x clauses that each reference T, and T has y clauses, substituting every clause of T into every clause of S yields a synthesized S with one clause per pair (1,1), (1,2), ..., (x,y).

[Figure: general multiple-clause forms of S and T and the synthesized result; formulas not recoverable from the extracted text.]

Code Generation for Synthesized Data Access Function

$$
B(I,J) =
\begin{cases}
A(J+1,\,I+1) & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N/) \wedge \phi(/J,\,I+1/,\ /1:N-1,\ 1:N/) \\
0 & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N/) \wedge \phi(/J,\,I+1/,\ /N:N,\ 1:N/) \\
A(J+1,\,I-N+1) & \mid\ \phi(/I,J/,\ /N:N,\ 1:N/) \wedge \phi(/J,\,I-N+1/,\ /1:N-1,\ 1:N/) \\
0 & \mid\ \phi(/I,J/,\ /N:N,\ 1:N/) \wedge \phi(/J,\,I-N+1/,\ /N:N,\ 1:N/)
\end{cases}
$$

Code Generation:

FORALL (I=1:N:1; J=1:N:1)
  IF φ(/I,J/,/1:N-1,1:N/) ∧ φ(/J,I+1/,/1:N-1,1:N/) THEN B(I,J)=A(J+1,I+1)
  IF φ(/I,J/,/1:N-1,1:N/) ∧ φ(/J,I+1/,/N:N,1:N/) THEN B(I,J)=0
  IF φ(/I,J/,/N:N,1:N/) ∧ φ(/J,I-N+1/,/1:N-1,1:N/) THEN B(I,J)=A(J+1,I-N+1)
  IF φ(/I,J/,/N:N,1:N/) ∧ φ(/J,I-N+1/,/N:N,1:N/) THEN B(I,J)=0
ENDFORALL
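A quick way to gain confidence in the synthesized loop is to compare it against the three-pass translation on small inputs. The sketch below does that in Python under the assumption of square N x N nested lists; `reference` and `synthesized` are hypothetical names, and the 1-based subscripts of the slide are converted to 0-based list indices in the comments.

```python
# A sketch comparing the synthesized single loop with the three-pass version.

def reference(a):
    """EOSHIFT, then TRANSPOSE, then CSHIFT, each as a separate pass."""
    n = len(a)
    t1 = [[a[i + 1][j] if i < n - 1 else 0 for j in range(n)] for i in range(n)]
    t2 = [[t1[j][i] for j in range(n)] for i in range(n)]
    return [[t2[(i + 1) % n][j] for j in range(n)] for i in range(n)]


def synthesized(a):
    """Single loop over (I,J) with no temporaries, following the four clauses."""
    n = len(a)
    b = [[0] * n for _ in range(n)]
    for i in range(1, n + 1):          # 1-based I, as on the slide
        for j in range(1, n + 1):      # 1-based J
            if j == n:
                value = 0                      # both zero clauses
            elif i <= n - 1:
                value = a[j][i]                # A(J+1, I+1) in 0-based form
            else:
                value = a[j][i - n]            # A(J+1, I-N+1) in 0-based form
            b[i - 1][j - 1] = value
    return b


A = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
print(synthesized(A) == reference(A))
```

The synthesized version reads A directly and writes B once per element, so both temporaries and two of the three passes disappear.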

Code Generation for Synthesized Data Access Function (Cont'd)

After Optimization

$$
B(I,J) =
\begin{cases}
A(J+1,\,I+1) & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N-1/) \\
0 & \mid\ \phi(/I,J/,\ /1:N-1,\ N:N/) \\
A(J+1,\,I-N+1) & \mid\ \phi(/I,J/,\ /N:N,\ 1:N-1/) \\
0 & \mid\ \phi(/I,J/,\ /N:N,\ N:N/)
\end{cases}
$$

[Figure: the N x N index space of B, with both axes marked at 1, N-1, and N, divided into the four regions above.]

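Optimization

• Simplifying the ranges at compilation time instead of runtime

• Optimization process:

Normalize:

$$
\phi(/\ldots,\ 3I+5,\ \ldots/,\ /\ldots,\ 4:100:5,\ \ldots/) \ \Rightarrow\ \phi(/\ldots,\ I,\ \ldots/,\ /\ldots,\ 3:28:5,\ \ldots/)
$$

Intersection for each dimension:

$$
\phi(/\ldots,\ I,\ \ldots/,\ /\ldots,\ 7:200:5,\ \ldots/) \wedge \phi(/\ldots,\ I,\ \ldots/,\ /\ldots,\ 5:100:6,\ \ldots/) \ \Rightarrow\ \phi(/\ldots,\ I,\ \ldots/,\ /\ldots,\ 17:77:30,\ \ldots/)
$$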
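Both optimization steps reduce to arithmetic on strided ranges l:u:s. The sketch below is one possible way to compute them, assuming positive strides and a positive coefficient on the index variable; the function names `normalize` and `intersect` are hypothetical and are not taken from SYNTOOL.

```python
# A sketch of the two range simplifications on inclusive l:u:s triples.
from math import gcd


def normalize(a, b, r):
    """Return the range of I such that a*I + b lies in r (assumes a > 0)."""
    l, u, s = r
    step = s // gcd(a, s)
    lo = -(-(l - b) // a)          # ceil((l - b) / a)
    hi = (u - b) // a              # floor((u - b) / a)
    # The first solution, if any, is within one period of lo.
    for i in range(lo, min(hi, lo + step - 1) + 1):
        if (a * i + b - l) % s == 0:
            return (i, i + (hi - i) // step * step, step)
    return None                    # empty range


def intersect(r1, r2):
    """Return the strided range common to r1 and r2, or None if empty."""
    (l1, u1, s1), (l2, u2, s2) = r1, r2
    step = s1 * s2 // gcd(s1, s2)  # least common multiple of the strides
    lo, hi = max(l1, l2), min(u1, u2)
    start = l1 + -(-(lo - l1) // s1) * s1   # first element of r1 that is >= lo
    for x in range(start, min(hi, start + step - 1) + 1, s1):
        if (x - l2) % s2 == 0:
            return (x, x + (hi - x) // step * step, step)
    return None


print(normalize(3, 5, (4, 100, 5)))
print(intersect((7, 200, 5), (5, 100, 6)))
```

Because both results depend only on constants known to the compiler, the guards in the generated loop can be replaced by fixed loop bounds instead of being tested at runtime.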