JAEA-Testing 2011-005

Evaluation of Fundamental Performance of JAEA's New Supercomputer System
(原子力機構の新大型計算機システムにおける基本性能の評価)

Kensaku SAKAMOTO, Futoshi SHIMIZU, Takuya TSURUOKA, Toshiyuki NEMOTO and Naota ISHIKAWA
(坂本 健作, 清水 大志, 鶴岡 卓哉, 根本 俊行, 石川 直太)

Center for Computational Science & e-Systems (システム計算科学センター)
Japan Atomic Energy Agency (日本原子力研究開発機構)

November 2011




Evaluation of Fundamental Performance of JAEA's New Supercomputer System

Kensaku SAKAMOTO, Futoshi SHIMIZU, Takuya TSURUOKA*, Toshiyuki NEMOTO* and Naota ISHIKAWA*

Center for Computational Science & e-Systems
Japan Atomic Energy Agency
Tokai-mura, Naka-gun, Ibaraki-ken

(Received August 30, 2011)

A new supercomputer system was deployed at JAEA in March 2010. The system is mainly composed of a large-scale Linux cluster (PRIMERGY BX900, 200 Tflops) and a precursor system of RIKEN's K computer (FX1, 12 Tflops). The former delivers a high-performance parallel computing environment, and the latter delivers an environment for developing application codes for the K computer. In this report we present the results of an evaluation of the fundamental performance of the system. The HPCC benchmark suite was run, together with independent codes that measure floating-point performance, memory bandwidth and interconnect bandwidth. Analyzing these results yielded many findings that are useful to users of BX900 and FX1.

Keywords: Supercomputer, Fundamental Performance, Benchmark

* RIST (Research Organization for Information Science and Technology)


Contents

1 Introduction
2 Outline of the system
  2.1 Hardware construction of BX900
  2.2 Hardware construction of FX1
  2.3 Hardware construction of storage
3 Performance evaluation by HPCC benchmarks
  3.1 Outline of HPCC benchmarks
    3.1.1 Test programs
    3.1.2 Rules
  3.2 Measured results
    3.2.1 BX900
    3.2.2 FX1
  3.3 Summary
4 Tunings for BX900 and FX1
  4.1 Profiler
  4.2 Automatic tuning of memory access
    4.2.1 Efficacy for original version of STREAM benchmark
    4.2.2 Efficacy for HPCC version of STREAM benchmark
  4.3 Manual tuning of memory access
  4.4 Efficacy for NPB 2.3 FT benchmark
  4.5 Summary
5 Performance evaluation by IOR benchmarks
  5.1 Outline of IOR benchmarks
    5.1.1 Key parameters
    5.1.2 Option parameters for MPIIO
  5.2 Measured results
    5.2.1 POSIX
    5.2.2 MPIIO
    5.2.3 Local cache
  5.3 Summary
    5.3.1 Choice of file system
    5.3.2 POSIX vs MPIIO
    5.3.3 Buffer size
6 Performance evaluation of the interconnect
  6.1 Specification of the interconnect
    6.1.1 Construction and Nodebind
    6.1.2 Communication method
  6.2 Fundamental performance of MPI communication
    6.2.1 Measuring method
    6.2.2 Measured results
  6.3 RDMA
    6.3.1 Measuring method
    6.3.2 Measured results
  6.4 Communications between intra-CPUs, inter-CPUs and inter-nodes
    6.4.1 Measuring method
    6.4.2 Measured results
  6.5 Communications between chassis
    6.5.1 Point-to-point communication
    6.5.2 Collective communication
  6.6 Summary
7 Concluding Remarks
Acknowledgements
References
Appendix A HPCC benchmark results on FX1
Appendix B Fundamental performance of MPI communication
Appendix C RDMA vs NORDMA
Appendix D Overlapping communications with calculations
Appendix E Peer-to-peer communications
Appendix F Memory bandwidth
Appendix G Flat-MPI vs Hybrid

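1 Introduction

In March 2010 JAEA deployed a new supercomputer system. Its two main components are PRIMERGY BX900, a large-scale Linux cluster with a theoretical peak performance of 200 Tflops, and FX1, a system with a theoretical peak performance of 12 Tflops. BX900 provides a high-performance parallel computing environment, and FX1 provides an environment for developing application codes for the K computer. To help users make effective use of BX900 and FX1, we evaluated the fundamental performance of both machines.

Section 2 outlines the system. Section 3 evaluates fundamental performance with the HPCC benchmarks. Sections 4 to 6 cover memory access, I/O and the interconnect. Section 7 gives concluding remarks.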

2 Outline of the system

The system as a whole has a theoretical peak performance of 214 Tflops, about 56 TB of memory and about 1.2 PB of storage. Fig. 1 shows the outline of the system.

2.1 Hardware construction of BX900

The core of the system is PRIMERGY BX900 (hereafter BX900), a large-scale Linux cluster with a theoretical peak performance of 200 Tflops and 50 TB of total memory.

Each node has two Xeon X5570 processors (quad-core) and 24 GB of DDR3 SDRAM (some nodes have 12 GB or 48 GB). The theoretical peak performance of a node is 93.76 Gflops (2.93 GHz × 4 floating-point operations per clock × 4 cores × 2 processors), and its memory bandwidth is 51.2 GB/s. The nodes are connected by InfiniBand (IB) 4x QDR.
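The per-node peak quoted for BX900 (2.93 GHz × 4 × 4 × 2 = 93.76 Gflops) and the 51.2 GB/s memory bandwidth can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and the byte-per-flop ratio is a derived figure that the report itself does not state.

```python
# Per-node figures quoted for BX900 in Section 2.1.
CLOCK_GHZ = 2.93        # Xeon X5570 clock frequency
FLOPS_PER_CLOCK = 4     # floating-point operations per clock per core
CORES_PER_CPU = 4
CPUS_PER_NODE = 2
MEM_BW_GBS = 51.2       # memory bandwidth per node in GB/s


def node_peak_gflops():
    """Theoretical peak of one node in Gflops."""
    return CLOCK_GHZ * FLOPS_PER_CLOCK * CORES_PER_CPU * CPUS_PER_NODE


def bytes_per_flop():
    """Memory bandwidth available per peak floating-point operation."""
    return MEM_BW_GBS / node_peak_gflops()


if __name__ == "__main__":
    print(f"node peak: {node_peak_gflops():.2f} Gflops")
    print(f"byte/flop: {bytes_per_flop():.3f}")
```

A ratio well below 1 byte per flop is one reason why bandwidth-bound tests such as STREAM are reported separately from HPL and DGEMM.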

Fig. 1 Outline of the system

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

2.2 Hardware construction of FX1

FX1 is a system with a theoretical peak performance of 12 Tflops. Each node has one SPARC64 VII processor (quad-core) and 16 GB of memory. The nodes are connected by InfiniBand 4x DDR with FBB (Full Bisectional Bandwidth). FX1 serves as the environment for developing and tuning application codes for the K computer.

2.3 Hardware construction of storage

The storage system consists of two I/O nodes (SPARC Enterprise M9000), 36 ETERNUS DX80 storage units and one ETERNUS4000 model 400 unit. The disks are configured as RAID6 and provide about 1.2 PB in total (Fig. 2). The I/O nodes are connected to the compute nodes by InfiniBand and serve a shared file system through Parallelnavi SRFS. The DX80 units use SAS disks (1 TB, 7200 rpm). On the I/O nodes the local file system is Sun QFS (Fig. 3).

Fig. 2 (diagram: the 119 BX900 chassis, the InfiniBand switches, the two M9000 I/O nodes and the 36 ETERNUS DX80 storage units, annotated with transfer rates)

Fig. 3 (diagram: FC connections from the two M9000 I/O nodes to ETERNUS DX80 units 1 to 36 and to the ETERNUS4000 Model 400, with the QFS file systems built on them)

3 Performance evaluation by HPCC benchmarks

The seven test programs of the HPCC (High Performance Computing Challenge) benchmark 1) were run on BX900 and FX1. This section outlines HPCC, presents the measured results and summarizes them.

3.1 Outline of HPCC benchmarks

HPCC was developed by J. Dongarra of the University of Tennessee and co-workers as a set of test programs for evaluating the fundamental performance of a computer system. It consists of three compute tests (HPL (High-Performance Linpack), DGEMM and FFTE), two memory tests (STREAM and Random Access) and two communication tests (PTRANS and b_eff) 2). Some of the original test programs are written in Fortran, but in HPCC all of them are written in C and combined into a single program.

3.1.1 Test programs

A system is built from cores, CPUs and nodes joined by an interconnect, and its performance differs at each of these levels. HPCC therefore measures in three modes: across all nodes together (G: Global system performance), on a single node (SN), and on all processes independently and at the same time (EP: Embarrassingly Parallel).

3.1.1.1 HPL

HPL solves a dense system of linear equations by LU decomposition and measures floating-point performance. It is the MPI-parallel version of Linpack and is the program used for the TOP500 ranking. Results are reported in Tflops. One value is measured, in the G mode.
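The operation that HPL times, solving a dense linear system by LU decomposition, can be illustrated on a small matrix. The following is a minimal sketch in pure Python with partial pivoting. It is not HPL's algorithm (HPL is a blocked, MPI-parallel implementation), and the function name `lu_solve` is hypothetical.

```python
def lu_solve(a, b):
    """Solve a x = b by LU decomposition with partial pivoting.

    `a` is a list of rows and `b` a list of right-hand-side values.
    The inputs are copied, so the caller's data is left unchanged.
    """
    n = len(a)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry of column k to row k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k below the diagonal; the multiplier is an entry of L.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            a[i][k] = m
            for j in range(k + 1, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution with the upper triangle U.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


if __name__ == "__main__":
    print(lu_solve([[2.0, 1.0, 1.0], [4.0, -6.0, 0.0], [-2.0, 7.0, 2.0]],
                   [5.0, -2.0, 9.0]))
```

The elimination loop is where almost all of the floating-point work is done, which is why HPL results track peak arithmetic performance more closely than the memory or communication tests do.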

3.1.1.2 DGEMM

DGEMM measures the floating-point performance of matrix-matrix multiplication. Results are reported in Gflops (Giga floating-point operations per second). Two values are measured, in the SN and EP modes.

3.1.1.3 STREAM

STREAM measures memory bandwidth with four simple vector operations: Copy, Scale, Add and Triad. Results are reported in GB/s (Giga Bytes per second). Eight values are measured: the four operations in each of the SN and EP modes.

3.1.1.4 PTRANS

PTRANS (Parallel matrix TRANSpose) transposes a matrix distributed over the parallel processes and measures the data transfer performance of the network. Results are reported in GB/s. One value is measured, in the G mode.

3.1.1.5 Random Access

Random Access measures the rate at which randomly chosen memory locations can be updated. Results are reported in Gups (Giga updates per second). Three values are measured, in the SN, EP and G modes.

3.1.1.6 FFTE

FFTE measures the floating-point performance of a Fourier transform. It is measured in the SN, EP and G modes. Results are reported in Gflops.

3.1.1.7 b_eff

b_eff measures communication latency and bandwidth. It runs in the G mode and consists of two tests, Ring and PingPong. Latency is measured with 8 B messages and bandwidth with 2 MB messages. The Ring test is run in two orderings, natural and random. The PingPong test exchanges messages between two cores. b_eff reports six values: (1) Ring (natural order) latency, (2) Ring (natural order) bandwidth, (3) Ring (random order) latency, (4) Ring (random order) bandwidth, (5) PingPong latency and (6) PingPong bandwidth. Latency is reported in microseconds and bandwidth in GB/s.

3.1.2 Rules

HPCC defines two kinds of runs, Baseline Runs and Optimized Runs. The measurements in this report are baseline runs, with tuning limited to compiler optimization.

3.1.2.1 Baseline runs

In a baseline run the test programs are used as distributed. Tuning is limited to compiler optimization options and to the choice of BLAS and MPI libraries.

3.1.2.2 Optimized runs

In an optimized run the source code of some of the test programs may also be modified.

3.2 Measured results

The HPCC BMT test programs were run on BX900 and on FX1 with 8 to 512 parallel processes.

3.2.1 BX900

Table 1 shows the HPCC BMT results on BX900, and Fig. 4 to Fig. 7 plot the results of HPL, PTRANS, Random Access and FFTE against the number of parallel processes. At 512 parallel processes the HPL efficiency relative to the theoretical peak was 82.2% *1.

For comparison, Table 2 shows the results published on the HPCC web site 1) for the Intel Endeavor cluster (hereafter Endeavor), and Table 3 shows the results on BX900 with the Intel compiler, the compiler used on Endeavor. In STREAM, the result on BX900 with the Fujitsu compiler (3.5 GB/s) was about 20% lower than on Endeavor (4.2 GB/s). This is examined in Section 4.2.2.

*1 With all 17,072 cores of BX900, HPL reached 191.4 Tflops against a theoretical peak of 200.0 Tflops, an efficiency of 95.66%.
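The full-system HPL efficiency in the footnote (17,072 cores, 191.4 Tflops measured, 95.66%) can be reproduced from the per-core figures of Section 2.1. The sketch below is a worked check, not part of the original report. The names are hypothetical, and the per-core peak of 2.93 GHz × 4 operations per clock is taken from the node specification.

```python
CORES = 17072            # all BX900 cores, from the footnote
CLOCK_GHZ = 2.93         # from Section 2.1
FLOPS_PER_CLOCK = 4      # from Section 2.1
HPL_TFLOPS = 191.4       # measured HPL result, from the footnote


def peak_tflops(cores=CORES):
    """Theoretical peak in Tflops for the given number of cores."""
    return cores * CLOCK_GHZ * FLOPS_PER_CLOCK / 1000.0


def efficiency_percent(measured_tflops, peak):
    """Measured performance as a percentage of the theoretical peak."""
    return 100.0 * measured_tflops / peak


if __name__ == "__main__":
    peak = peak_tflops()
    print(f"peak       : {peak:.2f} Tflops")
    print(f"efficiency : {efficiency_percent(HPL_TFLOPS, peak):.2f} %")
```

The quoted 95.66% follows from the unrounded peak of about 200.08 Tflops. Dividing by the rounded 200.0 Tflops would give 95.70% instead.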

Table 1 HPCC BMT results on JAEA's BX900 with the Fujitsu compiler

[Table body not recoverable from this extraction. Rows: 8, 16, 32, 64, 128, 256 and 512 parallel processes, each with communication method rdma and nordma. Columns: G-HPL (TFlops), G-PTRANS (GB/s), G-RandomAccess original version (Gups), G-RandomAccess SANDIA_OPT2 version (Gups), G-FFTE (GFlops), EP-STREAM Sys (*) (GB/s), EP-STREAM Triad (GB/s), EP-DGEMM (GFlops), RandomRing Bandwidth (GB/s), RandomRing Latency (us).]

System: Fujitsu BX900 (512 cores / 128 CPUs); chip: Xeon X5570 2.93 GHz; memory: DDR3-1066; SMT ON, Turbo Mode OFF; interconnect: QDR InfiniBand, 2 sockets/node, Fat Tree; OS: Red Hat EL 5; compiler: Fujitsu C/C++, options -Kfast -SSL2; MPI: Fujitsu Parallelnavi Language Package; HPCC Ver. 1.3.1.
(*) EP-STREAM Sys is calculated as (EP-STREAM Triad) × (number of parallel processes).

Fig. 4 HPL results on BX900

Fig. 5 PTRANS results on BX900

Fig. 6 Random Access results on BX900

Fig. 7 FFTE results on BX900

Table 2 HPCC BMT results of the Intel Endeavor cluster (from the HPCC web site 1))

Table 3 HPCC BMT results on BX900 with the Intel compiler

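3.2.2 FX1

Table 4 shows the HPCC BMT results on FX1, and Fig. 8 to Fig. 11 plot the results of HPL, PTRANS, Random Access and FFTE. At 512 parallel processes the HPL efficiency relative to the theoretical peak was 75.5% *2.

For the FFTE and DGEMM entries of Table 4, see also Appendix A.

*2 On the whole FX1 system, HPL reached 11.6 Tflops against a theoretical peak of 12.9 Tflops, an efficiency of 90.37%.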

Table 4 HPCC BMT results on FX1

Fig. 8 HPL results on FX1

Fig. 9 PTRANS results on FX1

Fig. 10 Random Access results on FX1

Fig. 11 FFTE results on FX1

3.3 Summary

Fig. 12 compares the results of BX900 and FX1 with those of the previous system (Altix3700Bx2). Each item is scaled to 100.

Comparing BX900 and FX1, the largest differences appear in STREAM and FFTE. STREAM is examined in Section 4.2.1 and FFTE in Section 4.3.

Fig. 12 Comparison of HPCC BMT results at 512 parallel processes (flat MPI) for BX900, FX1 and Altix3700Bx2 (chart scaled 0 to 100 on each axis; axes and the values shown with them: G-HPL (4.90 Tflops), G-RandomAccess (1.54 Gups), G-PTRANS (112.0 GB/s), EP-STREAM (4.36 GB/s), EP-DGEMM (11.1 Gflops), G-FFTE (275.5 Gflops), RandomRing Bandwidth (0.359 GB/s), RandomRing Latency (5.18 usec))

4 Tunings for BX900 and FX1

The test programs of Section 3.2 showed that some results on BX900 and FX1 fall short of what the hardware configuration suggests. This section examines them with the Fujitsu profiler and evaluates two kinds of tuning: tuning by compiler optimization options, and manual tuning of the test programs.

4.1 Profiler

The Fujitsu profiler is available on both BX900 and FX1. Profile data is collected while the program runs with the fpcoll command, and the collected data is analyzed and displayed with the fprof command. The profiler reports eight kinds of information, including MPI library information and hardware information such as CPU clock counts and cache behavior.

4.2 Automatic tuning of memory access

In the HPCC BMT on BX900, the STREAM result was about 20% lower than on Endeavor (3.5 GB/s vs 4.2 GB/s). To find the cause, we ran the original STREAM benchmark, on which the HPCC version is based, in its C and Fortran versions, each as a thread-parallel and as an MPI-parallel program. The compiler optimization option -Kntst turned out to be the key difference between the Endeavor results (Intel compiler) and the BX900 results (Fujitsu compiler).

4.2.1 Efficacy for original version of STREAM benchmark

Table 5 shows the results of the original STREAM benchmark on BX900. For the Fortran version, the compiler optimization option -Kntst was effective in both the thread-parallel and the MPI-parallel programs. For the C version, the Intel compiler reached 4.2 GB/s per core with OpenMP parallelization, a level that the Fujitsu compiler did not reach in the thread-parallel program.

Table 6 shows the results of the original STREAM benchmark on FX1. For the Fortran version, -Kntst was again effective in both the thread-parallel and the MPI-parallel programs. For the C version, unlike the Fortran version, -Kntst made no difference.
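The four STREAM operations named in Section 3.1.1.3 (Copy, Scale, Add, Triad) and the way a GB/s figure is derived from them can be sketched as follows. This is an illustration only: pure Python is far too slow to measure hardware bandwidth, the function names are hypothetical, and the byte counts assume 8-byte double-precision elements with two arrays touched by Copy and Scale and three by Add and Triad, as in the usual STREAM accounting.

```python
import time

BYTES_PER_DOUBLE = 8


def copy(a, c):
    for i in range(len(a)):
        c[i] = a[i]


def scale(c, b, q):
    for i in range(len(c)):
        b[i] = q * c[i]


def add(a, b, c):
    for i in range(len(a)):
        c[i] = a[i] + b[i]


def triad(b, c, a, q):
    for i in range(len(b)):
        a[i] = b[i] + q * c[i]


def bandwidth_gbs(n, arrays_touched, seconds):
    """GB/s for a kernel that reads or writes `arrays_touched` arrays of n doubles."""
    return arrays_touched * BYTES_PER_DOUBLE * n / seconds / 1e9


def run_stream(n=100_000, q=3.0):
    """Run the four kernels once each; return the rates and a[0] as a checksum."""
    a, b, c = [1.0] * n, [2.0] * n, [0.0] * n
    kernels = [
        ("Copy", 2, lambda: copy(a, c)),
        ("Scale", 2, lambda: scale(c, b, q)),
        ("Add", 3, lambda: add(a, b, c)),
        ("Triad", 3, lambda: triad(b, c, a, q)),
    ]
    rates = {}
    for name, arrays, kernel in kernels:
        t0 = time.perf_counter()
        kernel()
        elapsed = time.perf_counter() - t0
        rates[name] = bandwidth_gbs(n, arrays, elapsed)
    return rates, a[0]


if __name__ == "__main__":
    rates, checksum = run_stream()
    for name, rate in rates.items():
        print(f"{name:5s} {rate:8.4f} GB/s")
```

Every kernel writes one full array, so options that change how stores are issued affect all four results.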

Table 5 Results of the original STREAM benchmark on BX900

Table 6 Results of the original STREAM benchmark on FX1

4.2.2 Efficacy for HPCC version of STREAM benchmark

Table 7 shows the STREAM results of the HPCC BMT on BX900, the case discussed in Section 3.2.1. Two compiler optimization options are involved: -Kntst, which makes the compiler use store instructions that bypass the cache when writing array data, and -Krestp=all. With -Krestp=all added, EP-STREAM Triad rose from about 3.5 GB/s to about 4.3 GB/s.

Table 7 STREAM results of the HPCC BMT on BX900 (MPI-parallel, C version)

Parallel  Comm.    Compiler options (1) *1      Compiler options (2) *2
                   Chassis  EP-STREAM Triad     Chassis  EP-STREAM Triad
                            (GB/s)                       (GB/s)
32        rdma     1        3.4739              -        -
32        nordma   1        3.5016              1        4.3316
128       rdma     3        3.4708              -        -
128       nordma   3        3.4692              1        4.3572
512       rdma     5        3.4726              4        4.3595
512       nordma   4        3.5477              4        4.3424

*1 Fujitsu compiler options (1): -K options fast, packed, ntst; SSL2
*2 Fujitsu compiler options (2): -K options fast, packed, ntst, restp=all; SSL2
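The gain from the second option set in Table 7 can be quantified directly from the table. The sketch below assumes that the EP-STREAM Triad entries read as GB/s with the decimal point after the first digit (for example 3.5016 and 4.3316), which matches the 3.5 GB/s and 4.2 GB/s figures quoted in the text. Only rows with both measurements are included, and the names are hypothetical.

```python
# EP-STREAM Triad in GB/s: (option set 1, option set 2), keyed by
# (parallel processes, communication method). Values read from Table 7.
TRIAD_GBS = {
    (32, "nordma"): (3.5016, 4.3316),
    (128, "nordma"): (3.4692, 4.3572),
    (512, "rdma"): (3.4726, 4.3595),
    (512, "nordma"): (3.5477, 4.3424),
}


def gain_percent(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return 100.0 * (after / before - 1.0)


def gains(table=TRIAD_GBS):
    """Gain of option set 2 over option set 1 for every row of the table."""
    return {key: gain_percent(before, after) for key, (before, after) in table.items()}


if __name__ == "__main__":
    for (procs, method), g in gains().items():
        print(f"{procs:4d} {method:7s} +{g:.1f} %")
```

Under these assumptions every row improves by between 22% and 26%, which is enough to close the roughly 20% gap to Endeavor reported in Section 3.2.1.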

4.3 Manual tuning of memory access

As seen in Section 3.2, the FFTE results on both BX900 and FX1 were low relative to the hardware capability. This section analyzes the cause and describes the manual tuning that was applied.

The analysis used FFTE at 32 parallel processes and started from the original FFTE source (Fortran version). Three code regions were modified, marked (1), (2) and (3) in Fig. 13 and Fig. 14.

In region (1), the original version passes over the data of array A twice. The modified version restructures the loop, using the work array C, so that the same data is not accessed again.

In region (2), which operates on a three-dimensional array, the original version makes three passes over array A. The modified version makes one pass. The blocking size is 16.

In region (3), which operates on a two-dimensional array, the original version makes two passes over array A. The modified version makes one pass. On FX1, whose cache differs from that of BX900, changing the blocking size from 16 to 4 gave a further improvement.

Table 8 shows the results of the original and modified versions with the Fujitsu compiler on BX900 and FX1, and the results of the modified version on BX900 with the other compiler used.

[Fig 13 listing: source code not recoverable from the extracted text]

Fig 13 FFTEUuml|YacuteTHORNaumlIacutecurrenUcirc^_Z

[Fig 14 listing: source code not recoverable from the extracted text]

Fig 14 FFTEiacuteaumlIacutecurrenUcirc^_Z

Table 8 FFTE Ver 4.1 Full and kernel (32 processes, 180MB/Proc)

[Table body not recoverable from the extracted text]

UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq

[Fig 15 listing: source code not recoverable from the extracted text]

Fig 15 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- cfftz)

[Fig 16 listing: source code not recoverable from the extracted text]

Fig 16 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- fftz2)


UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc

Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT

5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

In one measurement, each process issues (blockSize ÷ transferSize × segmentCount) read/write calls, and the total file size is (blockSize × segmentCount × number of MPI processes). For example, with transferSize=1kB, blockSize=1MiB (10^6 Bytes is written MB and 2^20 Bytes is written MiB), segmentCount=1, and 32 MPI processes, a testFile of blockSize × segmentCount × number of processes = 32MiB is created, and each process accesses its own 1MiB region 1024 times in 1KB units.

[transferSize] Number of bytes transferred by one read/write. It must be a multiple of sizeof(long long int), and blockSize must be a multiple of it.

[blockSize] Number of bytes handled by each process. Here 1MiB.
[segmentCount] Number of segments in the file. Default is 1.
[randomOffset] Set to 1 for random access. Default is 0.
[fsync] Perform fsync after writing. Default is 0.
[filePerProc] Each process accesses its own file. Valid for POSIX only. Default is 0.
[useFileView] Use MPI_File_set_view() with MPI_File_write() and MPI_File_read(). Cannot be combined with random access. Default is 0.
[collective] Use MPI_File_write_at_all() and MPI_File_read_at_all(). Cannot be combined with random access. Default is 0.

Fig 18 IOR YacircueZ$
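The counting rule in Fig 18 can be checked with a short Python sketch. The function name `ior_totals` and the constants are illustrative assumptions and are not part of IOR itself. The sketch takes 1 kB to mean 1024 bytes, which is what the stated count of 1024 transfers per 1 MiB block implies.

```python
KIB = 1024       # assumption: "1kB" in the example means 1024 bytes
MIB = 2 ** 20    # 1 MiB = 2^20 bytes, as defined in the text


def ior_totals(block_size, transfer_size, segment_count, num_procs):
    """Return (read/write calls per process, total file size in bytes).

    Hypothetical helper that mirrors the formulas quoted in Fig 18:
      calls     = blockSize / transferSize * segmentCount
      file size = blockSize * segmentCount * number of MPI processes
    """
    if block_size % transfer_size != 0:
        raise ValueError("blockSize must be a multiple of transferSize")
    calls_per_proc = (block_size // transfer_size) * segment_count
    file_size = block_size * segment_count * num_procs
    return calls_per_proc, file_size


# Example from the text: transferSize=1kB, blockSize=1MiB,
# segmentCount=1, 32 MPI processes.
calls, size = ior_totals(1 * MIB, 1 * KIB, 1, 32)
print(calls, "calls per process;", size // MIB, "MiB file")
```

With these inputs the sketch gives 1024 calls per process and a 32 MiB file, matching the example in Fig 18.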


512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

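Table 11 MPIIO Info keys and values

key                   value                description
cb_buffer_size        4194304              buffer size (bytes) for collective buffering
cb_nodes              (value not legible)  number of aggregator processes for collective buffering
ind_rd_buffer_size    4194304              buffer size (bytes) for independent reads
ind_wr_buffer_size    524288               buffer size (bytes) for independent writes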

52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

[Fig 19 diagram labels: ETERNUS DX80, 970TB; IO servers (SPARC Enterprise M9000); SRFS; Read and Write paths]

Fig 19 ReadWriteumlcopy

Table 12 (POSIX)

[Table body not recoverable from the extracted text]

Table 13 (MPIIO)

[Table body not recoverable from the extracted text]

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

[Fig 20 plots: Performance [MiB/s] against TransferSize [B] (1024 to 1048576) for sequential and random access on home and work; series SingleFile and FilePerProc, Read and Write, N=32, 256, 1024]

Fig 20 POSIX

[Fig 21 plots: Performance [MiB/s] against TransferSize [B] (1024 to 1048576) for sequential and random access on home and work; series use view and collective, N=32, 256]

Fig 21 MPIIO


Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`

[Fig 22 diagram labels: ETERNUS DX80 1-18 (dx01 to dx18), DX80 01 to DX80 18, with data_01 to data_05, home, and work, and IO server 1 (io1); ETERNUS DX80 19-36 (dx19 to dx36), DX80 19 to DX80 36, with data_06 to data_10, data, sysdata, and work, and IO server 2 (io2); SRFS]

Fig 22


Table 15 File systems (SRFS and QFS) and their capacities

SRFS name   QFS name      Unit capacity   Total capacity              DAU size   Remarks
data_01     QFS_data_01   2109888 MB      2109888 × 36 = 72.4TB       1024k
data_02     QFS_data_02   2109888 MB      2109888 × 36 = 72.4TB       1024k
data_03     QFS_data_03   2109888 MB      2109888 × 36 = 72.4TB       1024k
data_04     QFS_data_04   2109888 MB      2109888 × 36 = 72.4TB       1024k
data_05     QFS_data_05   3695488 MB      3695488 × 36 = 126.9TB      1024k
home        QFS_home      3695488 MB      3695488 × 36 = 126.9TB      512k
work                      524288 MB                                              io2

SRFS name   QFS name      Unit capacity   Total capacity              DAU size   Remarks
data_06     QFS_data_06   2109888 MB      2109888 × 36 = 72.4TB       1024k
data_07     QFS_data_07   2109888 MB      2109888 × 36 = 72.4TB       1024k
data_08     QFS_data_08   2109888 MB      2109888 × 36 = 72.4TB       1024k
data_09     QFS_data_09   2109888 MB      2109888 × 36 = 72.4TB       1024k
data_10     QFS_data_10   3695488 MB      3695488 × 36 = 126.9TB      1024k
data        QFS_data      3695488 MB      3695488 × 24 = 84.6TB       1024k
sysdata     QFS_sysdata   3695488 MB      3695488 × 12 = 42.3TB       16k
work        QFS_work      524288 MB       524288 × 144 = 72TB         1024k      dx01 to dx36
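The total capacities in Table 15 can be reproduced with a short Python sketch. It assumes that "TB" in the table means 2^20 MB (a binary unit); this is an inference from the arithmetic and is not stated in the report. The helper name `total_tb` and the `rows` dictionary are illustrative.

```python
MB_PER_TB = 1024 * 1024  # assumption: the table's "TB" is 2**20 MB


def total_tb(unit_mb, units):
    """Total capacity in TB for `units` units of `unit_mb` MB each."""
    return unit_mb * units / MB_PER_TB


# (unit capacity in MB, number of units), as listed in Table 15
rows = {
    "data_01": (2109888, 36),
    "home": (3695488, 36),
    "data": (3695488, 24),
    "sysdata": (3695488, 12),
    "work": (524288, 144),
}

totals = {name: round(total_tb(mb, n), 1) for name, (mb, n) in rows.items()}
for name, tb in totals.items():
    print(f"{name}: {tb} TB")
```

Under this assumption data_01 comes to about 72.4 TB, home to about 126.9 TB, and work to 72 TB. These match the digit strings printed in the table once a decimal point is restored.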


6 Network

This chapter describes the BX900 network and its measured performance.

6.1 Network configuration

This section describes the BX900 network hardware configuration, the communication methods and node allocation.

6.1.1 Configuration and node allocation

BX900 uses InfiniBand QDR in a Fat Tree configuration. One chassis holds 18 nodes and two switches called Leaf switches, and each node is connected to each Leaf switch by one QDR link. Each node therefore has two QDR links (18 QDR ports × 2 switches = 36 in total), giving each node a bandwidth of up to 8 GB/s (4 GB/s × 2).

For communication between chassis there are nine switches called Spine switches above the Leaf switches, and every Leaf switch is connected to every Spine switch by QDR. One chassis therefore uses 9 × 2 = 18 QDR links towards the Spine switches. Seen from the Leaf switches, there are 36 downlinks and 18 uplinks, so the uplink ratio is 50%; this is called 50% blocking. On BX900, 50% blocking affects communication between chassis. The BX900 network configuration is shown in Fig. 23, and the connection between the Spine and Leaf switches and the nodes of a chassis in Fig. 24.

The n-th node of a chassis uses the n-th Spine switch (n ≤ 9) or the (n−9)-th Spine switch (n > 9). When a node exchanges data with a node in another chassis, two QDR links (one per Leaf switch) are used between the Spine switch and the two Leaf switches, and two QDR links between the Leaf switches and the node. The path between the Spine switch and the Leaf switches, however, is shared with one other node of the same chassis. Performance therefore differs, for example, between the case where only node 1 communicates and the case where nodes 1 and 10 communicate at the same time. Communication within a chassis is non-blocking.

The way nodes are allocated to each job therefore affects communication performance.

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

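[Figure: chassis 1 to chassis 119; chassis 1 contains the nodes bx0001, bx0002, ..., bx0018 and the switches Leaf(1-1) and Leaf(1-2); chassis 119 contains the nodes bx2125, bx2126 and the switches Leaf(119-1) and Leaf(119-2); the Leaf switches connect to Spine(1), Spine(2), ..., Spine(9)]

Fig. 23 BX900 network configuration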

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq

[Figure: Spine switches SW 1 to SW 9, the Leaf switches of one chassis, and the nodes 1 to 18 of that chassis (1 chassis: 18 nodes)]

Fig. 24 Connection between nodes and switches
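The link counts quoted for one chassis (18 nodes, two Leaf switches, nine Spine switches, 4 GB/s per QDR link) can be put together in a short sketch. It only restates those figures as arithmetic; the constant and function names are hypothetical and not part of any BX900 software.

```python
# Figures quoted in the text for one BX900 chassis.
NODES_PER_CHASSIS = 18
LEAF_SWITCHES_PER_CHASSIS = 2
SPINE_SWITCHES = 9
QDR_LINK_GB_S = 4.0  # bandwidth of one QDR link as quoted (4 GB/s)


def node_bandwidth_gb_s() -> float:
    """Each node has one QDR link to each Leaf switch of its chassis."""
    return QDR_LINK_GB_S * LEAF_SWITCHES_PER_CHASSIS


def downlinks_per_chassis() -> int:
    """Links from the Leaf switches down to the nodes."""
    return NODES_PER_CHASSIS * LEAF_SWITCHES_PER_CHASSIS


def uplinks_per_chassis() -> int:
    """Links from the Leaf switches up to the Spine switches."""
    return SPINE_SWITCHES * LEAF_SWITCHES_PER_CHASSIS


def uplink_ratio() -> float:
    """Uplink capacity relative to downlink capacity (0.5 means 50% blocking)."""
    return uplinks_per_chassis() / downlinks_per_chassis()


if __name__ == "__main__":
    print(node_bandwidth_gb_s(), downlinks_per_chassis(),
          uplinks_per_chassis(), uplink_ratio())
```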

6.1.2 Communication methods

For IB communication BX900 offers a method that uses RDMA (Remote Direct Memory Access), called the RDMA method below, and a method that does not use RDMA, called the NORDMA method below. The default on BX900 is RDMA. With RDMA the IB Host Channel Adapter (HCA) transfers the data without going through the CPU, which makes zero copy (1) possible. With NORDMA the CPU copies the data.

Both RDMA and NORDMA have two transfer methods, the Normal Send method (2) and the Send on Request method (3), and switch between them according to the message size. The switching thresholds for RDMA and NORDMA are shown in Table 16: below the threshold the Normal Send method is used, and at or above it the Send on Request method. The relation between the method, the message size and the number of processes is shown in Table 17 and Table 18. The threshold can be changed with MP_MAX_NORMAL_SEND.


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the transfer method

                       RDMA           RDMA            NORDMA
Number of processes    up to 1024     1025 or more    all
Threshold              32768 bytes    16384 bytes     524288 bytes

Table 17 RDMA method: Normal Send method and Send on Request method

Number of processes    0 to under 16 KB    16 KB to under 32 KB    32 KB or more
1 to 1024              Normal Send         Normal Send             Send on Request
1025 to 2048           Normal Send         Send on Request         Send on Request
2049 or more           NORDMA is used      Send on Request         Send on Request

Table 18 NORDMA method: Normal Send method and Send on Request method

Number of processes    0 to under 512 KB   512 KB or more
all                    Normal Send         Send on Request
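The thresholds of Table 16 can be written as a small lookup. This is a sketch under stated assumptions, not the MPI library's own logic: the function names are hypothetical, the thresholds are read as 32 KB and 16 KB for RDMA and 512 KB for NORDMA, and the special entry of Table 17 for 2049 or more processes is not modelled.

```python
KB = 1024


def switch_threshold(mode: str, nprocs: int) -> int:
    """Message size in bytes at which Send on Request replaces Normal Send."""
    if mode == "NORDMA":
        return 512 * KB  # 524288 bytes, any number of processes
    if mode == "RDMA":
        # 32768 bytes up to 1024 processes, 16384 bytes for 1025 or more.
        return 32 * KB if nprocs <= 1024 else 16 * KB
    raise ValueError(f"unknown mode: {mode!r}")


def transfer_method(mode: str, nprocs: int, nbytes: int) -> str:
    """Transfer method for one message according to the Table 16 thresholds."""
    if nbytes < switch_threshold(mode, nprocs):
        return "Normal Send"
    return "Send on Request"


if __name__ == "__main__":
    for args in [("RDMA", 512, 20 * KB), ("RDMA", 2048, 20 * KB),
                 ("NORDMA", 512, 20 * KB)]:
        print(args, "->", transfer_method(*args))
```

The same 20 KB message is sent with Normal Send on 512 processes and with Send on Request on 2048 processes, which is the kind of switch that shows up as a step in the RDMA timing curves.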

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Data size step

Parallelism   Allreduce, Reduce, Bcast, SendRecv   Other MPI functions
256           32 bytes                             128 bytes
512           32 bytes                             128 bytes
1024          32 bytes                             256 bytes
2048          32 bytes                             2048 bytes
4096          32 bytes                             2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD, ierr)
        ttime1 = MPI_WTIME()                    ! measurement start
        call MPI_ALLGATHER(a, nelem, MPI_DOUBLE_PRECISION,
     &                     b, nelem, MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD, ierr)
        ttime2 = MPI_WTIME()                    ! measurement end
        ttime0 = ttime2 - ttime1
c       longest time over all processes, collected on rank 0
        call MPI_REDUCE(ttime0, ttime, 1, MPI_DOUBLE_PRECISION,
     &                  MPI_MAX, 0, MPI_COMM_WORLD, ierr)
        if (myid.eq.0) then
          max_time = max(max_time, ttime)
          min_time = min(min_time, ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig. 25 Program (excerpt)
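The bookkeeping in Fig. 25 takes, for every repetition, the longest time over all processes (MPI_REDUCE with MPI_MAX) and then keeps the maximum, minimum and sum of those values on rank 0. The sketch below mimics this on plain lists, without MPI. The `skip` parameter, which drops leading warm-up repetitions, is a hypothetical addition and not part of the listing.

```python
from typing import Sequence, Tuple


def summarize(times: Sequence[Sequence[float]],
              skip: int = 0) -> Tuple[float, float, float]:
    """Return (max, min, mean) of the per-repetition slowest-process times.

    times[i][r] is the elapsed time of repetition i on process r.
    """
    # Equivalent of MPI_REDUCE(..., MPI_MAX, ...) for each repetition.
    slowest = [max(per_rank) for per_rank in times[skip:]]
    if not slowest:
        raise ValueError("no repetitions left to summarize")
    return max(slowest), min(slowest), sum(slowest) / len(slowest)


if __name__ == "__main__":
    sample = [[1.0, 2.0], [3.0, 1.0]]
    print(summarize(sample))
```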


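6.2.2 Results

Of the measurements described so far, the results for 512 parallel are shown in Fig. 26 and Fig. 27 and those for 1024 parallel in Appendix B. Each graph is for one MPI function and degree of parallelism and is plotted against the data size.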

[Figure: six graphs for 512 parallel: Allreduce, Reduce, Reduce_scatter, Allgather, Gather and Allgatherv; horizontal axis from 0 to 80000; two series per graph, rdma and nordma]

Fig. 26 MPI functions (512 parallel, 1)

[Figure: eight graphs for 512 parallel: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (Rank0 to Rank511); horizontal axis from 0 to 80000; two series per graph, rdma and nordma]

Fig. 27 MPI functions (512 parallel, 2)


brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq

[Figure: RDMA and NORDMA compared for each MPI function at 256, 512, 1024, 2048 and 4096 parallel]

Fig. 28 RDMA and NORDMA (small data: 0 KB to 8 KB)


63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq

      call MPI_BARRIER(MPI_COMM_WORLD, ierr)
      ttime5 = MPI_WTIME()                      ! measurement start
c     start the non-blocking transfers
      call mpi_isend(a, NMB, mpi_DOUBLE_PRECISION, ir1, 1,
     &               MPI_COMM_WORLD, ireq(1), ierr)
      call mpi_irecv(b, NMB, mpi_DOUBLE_PRECISION, il1, 1,
     &               MPI_COMM_WORLD, ireq(2), ierr)
      call mpi_isend(d, NMB, mpi_DOUBLE_PRECISION, il2, 1,
     &               MPI_COMM_WORLD, ireq(3), ierr)
      call mpi_irecv(c, NMB, mpi_DOUBLE_PRECISION, ir2, 1,
     &               MPI_COMM_WORLD, ireq(4), ierr)
c ---------------------------------------------------------------
      ttime7 = MPI_WTIME()                      ! measurement start
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1.d0
          b1(i,j) = 1.d0
          c1(i,j) = 0.d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                      ! measurement end
c ---------------------------------------------------------------
      call mpi_waitall(4, ireq, iastatus, ierr)
      ttime6 = MPI_WTIME()                      ! measurement end
      ttimeDD = ttime8 - ttime7                 ! computation time
      ttimeal = ttime6 - ttime5                 ! total measured time
      ttimesd = ttimeal - ttimeDD               ! total minus computation

Fig. 29 Program for running communication and computation at the same time
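The last three assignments of Fig. 29 reduce to one subtraction: the time that remains once the computation time is removed from the total. The sketch below repeats that step and adds one hypothetical measure, the percentage of a separately measured communication time that was hidden behind the computation. That percentage is an illustration only and is not claimed to be the formula used in the report.

```python
def exposed_communication_time(total: float, compute: float) -> float:
    """Total time minus computation time (ttimesd = ttimeal - ttimeDD)."""
    return total - compute


def hidden_percent(comm_alone: float, total: float, compute: float) -> float:
    """Hypothetical measure: share of the stand-alone communication time
    (comm_alone) that did not show up as extra elapsed time."""
    if comm_alone <= 0.0:
        raise ValueError("comm_alone must be positive")
    exposed = exposed_communication_time(total, compute)
    return (1.0 - exposed / comm_alone) * 100.0


if __name__ == "__main__":
    # Made-up timings in seconds, for illustration only.
    print(exposed_communication_time(1.5, 1.2))
    print(hidden_percent(1.0, 1.5, 1.2))
```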


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


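6.4 Communication within a CPU, between CPUs and between nodes

One-to-one communication performance between cores was measured with a program for communication within a CPU, between the CPUs of a node and between nodes. The data path differs in the three cases, and so does the performance.

6.4.1 Method

The one-to-one program transfers data between two cores. Rank 0 sends data of the given size 100 times to each of the other ranks in turn, and the time is measured for each rank. Part of the program is shown in Fig. 30.

The measurement was run with 24 parallel and 128 MB of data. A BX900 node has 2 CPUs with 4 cores each, so communication with ranks up to 3 stays within the CPU, communication with ranks up to 7 is with the other CPU of the same node, and communication with higher ranks is between nodes. The assignment of MPI ranks to the core IDs of a node is shown in Fig. 31.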

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD, ierr)
        ttime1 = MPI_WTIME()
c       rank 0 sends NNN messages to rank IIRANK
        do 120 II = 1, NNN
          if (myid.eq.0)
     &      call MPI_SEND(a, nsize, MPI_DOUBLE_PRECISION,
     &                    IIRANK, II, MPI_COMM_WORLD, ierr)
          if (myid.eq.IIRANK)
     &      call MPI_RECV(a, nsize, MPI_DOUBLE_PRECISION,
     &                    0, II, MPI_COMM_WORLD, istatus, ierr)
  120   continue
        ttime2 = MPI_WTIME()
c       store the elapsed time and the partner rank
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig. 30 One-to-one communication program (excerpt)
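The elapsed time stored in timew by the loop of Fig. 30 covers NNN messages, so a bandwidth figure follows from message size × repetitions ÷ elapsed time. The sketch below shows that conversion. The function name is hypothetical, and 1 MB is taken as 1024 × 1024 bytes, an assumption since the report's unit convention is not stated.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def bandwidth_mb_s(message_bytes: int, repetitions: int,
                   elapsed_s: float) -> float:
    """Average one-way bandwidth in MB/s over all repetitions."""
    if elapsed_s <= 0.0:
        raise ValueError("elapsed_s must be positive")
    return message_bytes * repetitions / BYTES_PER_MB / elapsed_s


if __name__ == "__main__":
    # 128 MB sent 100 times; the elapsed time of 4 s is made up.
    print(bandwidth_mb_s(128 * BYTES_PER_MB, 100, 4.0))
```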

NODE
  CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
  CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig. 31 Assignment of MPI ranks within a node
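The assignment in Fig. 31 puts ranks 0 to 3 on the even core IDs of CPU0 and ranks 4 to 7 on the odd core IDs of CPU1. The sketch below encodes that mapping and classifies a pair of ranks as sharing a CPU, sharing a node or being on different nodes. It assumes that ranks fill the nodes in consecutive blocks of eight. The function names are hypothetical.

```python
CORES_PER_CPU = 4
CPUS_PER_NODE = 2
CORES_PER_NODE = CORES_PER_CPU * CPUS_PER_NODE


def core_id(rank: int) -> int:
    """Core ID within the node for a rank, following Fig. 31."""
    local = rank % CORES_PER_NODE
    cpu = local // CORES_PER_CPU   # 0 for CPU0, 1 for CPU1
    slot = local % CORES_PER_CPU   # position within the CPU
    return 2 * slot + cpu          # even IDs on CPU0, odd IDs on CPU1


def locality(rank_a: int, rank_b: int) -> str:
    """Classify the path between two ranks (assumes block placement)."""
    if rank_a // CORES_PER_NODE != rank_b // CORES_PER_NODE:
        return "between nodes"
    cpu_a = (rank_a % CORES_PER_NODE) // CORES_PER_CPU
    cpu_b = (rank_b % CORES_PER_NODE) // CORES_PER_CPU
    return "within a CPU" if cpu_a == cpu_b else "between CPUs in a node"


if __name__ == "__main__":
    print([core_id(r) for r in range(8)])
    print(locality(0, 3), "/", locality(0, 5), "/", locality(0, 8))
```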

The default on BX900 is the Fujitsu compiler with Fujitsu MPI, but the following three combinations were also measured:

Intel compiler + Fujitsu MPI
Intel compiler + Intel MPI
Fujitsu compiler + Open MPI

With Fujitsu MPI and Intel MPI the MPI processes are assigned to the cores as in Fig. 31, whereas with Open MPI ranks 0, 1, 2, 3, 4, 5, 6, 7 are assigned to core IDs 0, 1, 2, 3, 4, 5, 6, 7.

6.4.2 Results

The results of the measurements described so far are given in Appendix E. They show the following.

(1) With the Fujitsu compiler and Fujitsu MPI, the communication performance was 3260 MB/s within a CPU, 3750 MB/s between the CPUs of a node and 5350 MB/s between nodes. Communication within a CPU was the slowest and communication between nodes the fastest.

(2) With the Intel compiler and Fujitsu MPI, communication within a CPU and between the CPUs of a node was about the same as with the Fujitsu compiler, but communication between nodes fell to 4750 MB/s.

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

[Figure: data transfer between processes 0 to 7]

Fig. 32 One-to-one data transfer

Table 20 Communication performance (within a chassis)

                               Flat MPI    Hybrid (72×2)   Hybrid (36×4)   Hybrid (18×8)
Average per process (MB/s)     723.8       1447.4          2894.2          5346.7
Per node (MB/s)                5790.6      5789.9          5788.4          5346.7
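The two rows of Table 20 are linked by the number of MPI processes per node: 8 for flat MPI and 4, 2 and 1 for the hybrid runs with 2, 4 and 8 threads. The sketch below checks this, reading the per-process figures as 723.8, 1447.4, 2894.2 and 5346.7 MB/s. The function name and the tolerance of 1 MB/s are assumptions made for this illustration.

```python
CORES_PER_NODE = 8


def per_node_mb_s(per_process_mb_s: float, threads_per_process: int) -> float:
    """Node total = per-process average x MPI processes per node."""
    processes_per_node = CORES_PER_NODE // threads_per_process
    return per_process_mb_s * processes_per_node


# (threads per process, per-process MB/s, per-node MB/s) read from Table 20.
TABLE_20 = [
    (1, 723.8, 5790.6),
    (2, 1447.4, 5789.9),
    (4, 2894.2, 5788.4),
    (8, 5346.7, 5346.7),
]

if __name__ == "__main__":
    for threads, per_process, per_node in TABLE_20:
        print(threads, per_node_mb_s(per_process, threads), per_node)
```

In every column the node total stays near 5.3 to 5.8 GB/s; only the way it is divided among processes changes.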

Table 21 Communication performance (between chassis, flat MPI)

        144 MPI                       192 MPI                       224 MPI
Node    Average     IB contention    Average     IB contention    Average     IB contention
        (MB/s)      with other node  (MB/s)      with other node  (MB/s)      with other node
1       724.6       no               467.0       yes              467.1       yes
2       724.4       no               467.0       yes              467.1       yes
3       724.6       no               467.0       yes              467.1       yes
4       724.2       no               724.3       no               467.0       yes
5       724.3       no               724.4       no               467.1       yes
6       724.2       no               724.0       no               723.7       no
7       724.1       no               723.9       no               724.6       no
8       724.4       no               724.0       no               724.5       no
9       724.9       no               724.3       no               724.0       no
10      -           -                467.0       yes              467.0       yes
11      -           -                467.0       yes              467.2       yes
12      -           -                466.9       yes              467.1       yes
13      -           -                -           -                467.1       yes
14      -           -                -           -                467.0       yes

Flat MPI runs with 192 parallel and 224 parallel were also measured, using 96 processes per chassis on 2 chassis and 112 processes per chassis on 2 chassis respectively. Together with the 144-parallel measurement, the average process performance of each node of the sending chassis is shown in Table 21.

With 224 parallel, for example, there are nodes whose IB links contend with those of another node on the sending side (50% blocking), and the performance of these nodes drops. As the network configuration of Fig. 24 shows, nodes 1 and 10, 2 and 11, 3 and 12 and so on contend with each other, so the performance of these nodes is lower. Nodes 15, 16, 17 and 18 are not used, however, so nodes 6, 7, 8 and 9, which would contend with them, perform as well as in the 144 MPI case. Communication from nodes 6 to 9 is therefore non-blocking. With 192 parallel, likewise, nodes 4 to 9 have no contending node in use, so their performance is high.
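The pattern in Table 21 follows from a simple pairing rule: node n and node n+9 of a chassis share uplinks, so a node slows down only when its partner is also in use. The sketch below applies that rule to the three flat MPI runs (9, 12 and 14 nodes per chassis). The function names are hypothetical, and the pairing is taken from the description of Fig. 24.

```python
from typing import Iterable, List

NODES_PER_CHASSIS = 18
SPINE_SWITCHES = 9


def partner(node: int) -> int:
    """Node of the same chassis that shares uplinks with `node` (1-based)."""
    if not 1 <= node <= NODES_PER_CHASSIS:
        raise ValueError("node must be between 1 and 18")
    return node + SPINE_SWITCHES if node <= SPINE_SWITCHES else node - SPINE_SWITCHES


def contended_nodes(nodes_in_use: Iterable[int]) -> List[int]:
    """Nodes whose partner is also in use, in ascending order."""
    used = set(nodes_in_use)
    return sorted(n for n in used if partner(n) in used)


if __name__ == "__main__":
    for count in (9, 12, 14):  # 144, 192 and 224 MPI on 2 chassis
        print(count, contended_nodes(range(1, count + 1)))
```

For 12 nodes per chassis the rule marks nodes 1 to 3 and 10 to 12, and for 14 nodes it marks nodes 1 to 5 and 10 to 14, which matches the nodes listed with contention in Table 21.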

Table 22 Communication performance (between chassis, hybrid)

        Hybrid (112×2)                Hybrid (56×4)                 Hybrid (28×8)
Node    Average     IB contention    Average     IB contention    Average     IB contention
        (MB/s)      with other node  (MB/s)      with other node  (MB/s)      with other node
1       933.9       yes              1856.7      yes              3640.8      yes
2       933.7       yes              1856.4      yes              3691.9      yes
3       934.6       yes              1856.6      yes              3653.6      yes
4       934.7       yes              1857.0      yes              3645.7      yes
5       934.2       yes              1865.1      yes              3664.6      yes
6       1438.5      no               2860.6      no               5380.8      no
7       1448.8      no               2911.8      no               5377.9      no
8       1450.0      no               2867.6      no               5378.9      no
9       1449.5      no               2910.0      no               5378.4      no
10      933.7       yes              1879.7      yes              3640.8      yes
11      934.0       yes              1877.5      yes              3671.5      yes
12      934.3       yes              1865.1      yes              3655.9      yes
13      934.8       yes              1881.1      yes              3645.7      yes
14      933.8       yes              1869.4      yes              3699.1      yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq

Table 23 MPI_Alltoall results

[Table, printed rotated: MPI_Alltoall results for cases A to D. Flat MPI runs (512 MB): 144 mpi, 448 mpi, 504 mpi, 2048 mpi and 3072 mpi. Hybrid runs (1024 MB): 68, 120, 132 and 432 mpi × 2 omp; 30, 36, 126, 180 and 612 mpi × 4 omp; 17, 90 and 357 mpi × 8 omp. Several configurations were measured a second or third time for cases B, C and D only.]

brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

ucircuumlyacutethorn

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net/

Appendix A FX1 HPCC BMT

[Rotated page: notes on G-FFTE (compile option -DHPCC_FFT_235) and on the size parameter for EP-DGEMM, followed by Table A-1, the FX1 HPCC BMT results by compile option and size parameter]

Appendix B MPI functions by degree of parallelism

Each graph is for one MPI function and degree of parallelism and is plotted against the data size.

1024 parallel

[Figure: six graphs for 1024 parallel: Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv and Gather; horizontal axis from 0 to 80000; two series per graph, rdma and nordma]

Fig. B-1 MPI functions (1024 parallel, 1)

[Fig. B-2 panels, each comparing the rdma and Nordma series: Gatherv (1024), Scatter (1024), Scatterv (1024), Alltoall (1024), Alltoallv (1024), Bcast (1024), Scan (1024), SendRecv (1024, Rank0 > Rank1023)]

Fig. B-2 MPI routines (1024, part 2)

C  RDMA vs. NORDMA

[Table C-1 (8KB to 78KB)]

[Table C-2 (80KB to 584KB)]


D

W Oslash5

[Table D-1: Flat MPI (8mpi)]

[Fig. D-1 panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2]

Fig. D-1 Flat MPI (8mpi)

[Table D-2: Hybrid 1 (4mpi x 8omp)]

[Fig. D-2 panels: rdma (4mpi x 8omp)-1, nordma (4mpi x 8omp)-1]

Fig. D-2 Hybrid 1 (4mpi x 8omp) (1)

[Fig. D-3 panels: rdma (4mpi x 8omp)-2, nordma (4mpi x 8omp)-2, rdma (4mpi x 8omp)-3, nordma (4mpi x 8omp)-3]

Fig. D-3 Hybrid 1 (4mpi x 8omp) (2)

[Table D-3: Hybrid 2 (4mpi x 8omp)]

[Fig. D-4 panels: rdma (4mpi x 8omp)-1, nordma (4mpi x 8omp)-1]

Fig. D-4 Hybrid 2 (4mpi x 8omp) (1)

[Fig. D-5 panels: rdma (4mpi x 8omp)-2, nordma (4mpi x 8omp)-2, rdma (4mpi x 8omp)-3, nordma (4mpi x 8omp)-3]

Fig. D-5 Hybrid 2 (4mpi x 8omp) (2)


E 1oacute 1

Fig. E-1 Results with the Fujitsu compiler and Fujitsu MPI

nmb = 128 intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MB/s
RANK = 2 TIME = 39197 sec Ave Speed = 32655168 MB/s
RANK = 3 TIME = 39212 sec Ave Speed = 32643219 MB/s
RANK = 4 TIME = 34341 sec Ave Speed = 37272765 MB/s
RANK = 5 TIME = 34175 sec Ave Speed = 37453885 MB/s
RANK = 6 TIME = 34086 sec Ave Speed = 37552470 MB/s
RANK = 7 TIME = 34329 sec Ave Speed = 37286226 MB/s
RANK = 8 TIME = 24021 sec Ave Speed = 53287420 MB/s
RANK = 9 TIME = 23794 sec Ave Speed = 53794939 MB/s
RANK = 10 TIME = 23805 sec Ave Speed = 53770236 MB/s
RANK = 11 TIME = 23798 sec Ave Speed = 53786914 MB/s
RANK = 12 TIME = 23942 sec Ave Speed = 53462490 MB/s
RANK = 13 TIME = 23931 sec Ave Speed = 53488046 MB/s
RANK = 14 TIME = 23950 sec Ave Speed = 53443826 MB/s
RANK = 15 TIME = 24066 sec Ave Speed = 53186052 MB/s
RANK = 16 TIME = 23798 sec Ave Speed = 53786327 MB/s
RANK = 17 TIME = 23802 sec Ave Speed = 53775886 MB/s
RANK = 18 TIME = 23821 sec Ave Speed = 53733399 MB/s
RANK = 19 TIME = 23794 sec Ave Speed = 53795866 MB/s
RANK = 20 TIME = 23939 sec Ave Speed = 53468837 MB/s
RANK = 21 TIME = 23938 sec Ave Speed = 53470775 MB/s
RANK = 22 TIME = 23939 sec Ave Speed = 53470333 MB/s
RANK = 23 TIME = 23942 sec Ave Speed = 53462889 MB/s
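Each rank's line in the log above reports an elapsed time and an average speed for the same 128MB transfer pattern, so the product of the two fields should be nearly identical for every rank. The following is a minimal sketch of that consistency check on four lines of Fig. E-1, using the TIME and Ave Speed fields exactly as printed. The regular expression and the names `parse_ranks` and `product_spread` are illustrative assumptions and are not part of the original test program.

```python
import re

# Matches one per-rank line of the log; "MB/?s" accepts both "MBs" and "MB/s".
LINE = re.compile(
    r"RANK\s*=\s*(\d+)\s+TIME\s*=\s*(\d+)\s+sec\s+Ave\s+Speed\s*=\s*(\d+)\s+MB/?s"
)

# Four lines copied from Fig. E-1 (ranks 1, 4, 8 and 9).
SAMPLE = """\
RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MBs
RANK = 4 TIME = 34341 sec Ave Speed = 37272765 MBs
RANK = 8 TIME = 24021 sec Ave Speed = 53287420 MBs
RANK = 9 TIME = 23794 sec Ave Speed = 53794939 MBs
"""


def parse_ranks(text):
    """Return a list of (rank, time, speed) integer tuples found in the log text."""
    return [(int(r), int(t), int(s)) for r, t, s in LINE.findall(text)]


def product_spread(rows):
    """Relative spread of time*speed across ranks (0 means perfectly constant)."""
    products = [t * s for _, t, s in rows]
    return (max(products) - min(products)) / min(products)


rows = parse_ranks(SAMPLE)
spread = product_spread(rows)
print(len(rows), f"{spread:.1e}")
```

A small spread indicates that every rank moved the same total data volume, so differences in Ave Speed between ranks reflect differences in elapsed time only.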

Fig. E-2 Results with the Intel compiler and Fujitsu MPI

nmb = 128 intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39370 sec Ave Speed = 32511775 MB/s
RANK = 2 TIME = 39475 sec Ave Speed = 32425231 MB/s
RANK = 3 TIME = 39415 sec Ave Speed = 32474846 MB/s
RANK = 4 TIME = 34392 sec Ave Speed = 37217925 MB/s
RANK = 5 TIME = 34543 sec Ave Speed = 37055373 MB/s
RANK = 6 TIME = 34250 sec Ave Speed = 37372742 MB/s
RANK = 7 TIME = 34709 sec Ave Speed = 36878286 MB/s
RANK = 8 TIME = 26967 sec Ave Speed = 47465876 MB/s
RANK = 9 TIME = 26964 sec Ave Speed = 47471018 MB/s
RANK = 10 TIME = 26955 sec Ave Speed = 47486289 MB/s
RANK = 11 TIME = 26947 sec Ave Speed = 47499771 MB/s
RANK = 12 TIME = 26957 sec Ave Speed = 47483647 MB/s
RANK = 13 TIME = 27020 sec Ave Speed = 47373054 MB/s
RANK = 14 TIME = 26948 sec Ave Speed = 47498023 MB/s
RANK = 15 TIME = 27174 sec Ave Speed = 47104542 MB/s
RANK = 16 TIME = 26956 sec Ave Speed = 47485054 MB/s
RANK = 17 TIME = 26945 sec Ave Speed = 47504899 MB/s
RANK = 18 TIME = 26946 sec Ave Speed = 47502360 MB/s
RANK = 19 TIME = 26967 sec Ave Speed = 47466158 MB/s
RANK = 20 TIME = 26948 sec Ave Speed = 47499767 MB/s
RANK = 21 TIME = 26982 sec Ave Speed = 47438204 MB/s
RANK = 22 TIME = 27087 sec Ave Speed = 47254830 MB/s
RANK = 23 TIME = 27004 sec Ave Speed = 47399960 MB/s

Fig. E-3 Results with the Fujitsu compiler and OpenMPI

nmb = 128 intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes                NODE    CPU   coreid
Rank = 0                                               bx2050  CPU0  0
all time
RANK = 1 TIME = 38110 sec Ave Speed = 33587213 MB/s    bx2050  CPU1  1
RANK = 2 TIME = 28656 sec Ave Speed = 44667752 MB/s    bx2050  CPU0  2
RANK = 3 TIME = 37959 sec Ave Speed = 33720175 MB/s    bx2050  CPU1  3
RANK = 4 TIME = 28658 sec Ave Speed = 44664664 MB/s    bx2050  CPU0  4
RANK = 5 TIME = 37829 sec Ave Speed = 33836073 MB/s    bx2050  CPU1  5
RANK = 6 TIME = 28641 sec Ave Speed = 44691880 MB/s    bx2050  CPU0  6
RANK = 7 TIME = 37892 sec Ave Speed = 33779924 MB/s    bx2050  CPU1  7
RANK = 8 TIME = 35140 sec Ave Speed = 36425620 MB/s    bx2051  CPU0  0
RANK = 9 TIME = 35017 sec Ave Speed = 36554163 MB/s    bx2051  CPU1  1
RANK = 10 TIME = 38916 sec Ave Speed = 32891010 MB/s   bx2051  CPU0  2
RANK = 11 TIME = 45416 sec Ave Speed = 28184186 MB/s   bx2051  CPU1  3
RANK = 12 TIME = 35172 sec Ave Speed = 36392511 MB/s   bx2051  CPU0  4
RANK = 13 TIME = 35053 sec Ave Speed = 36516468 MB/s   bx2051  CPU1  5
RANK = 14 TIME = 35200 sec Ave Speed = 36363472 MB/s   bx2051  CPU0  6
RANK = 15 TIME = 35009 sec Ave Speed = 36561987 MB/s   bx2051  CPU1  7
RANK = 16 TIME = 22996 sec Ave Speed = 55661441 MB/s   bx2052  CPU0  0
RANK = 17 TIME = 35587 sec Ave Speed = 35968575 MB/s   bx2052  CPU1  1
RANK = 18 TIME = 35587 sec Ave Speed = 35968151 MB/s   bx2052  CPU0  2
RANK = 19 TIME = 35712 sec Ave Speed = 35842525 MB/s   bx2052  CPU1  3
RANK = 20 TIME = 22996 sec Ave Speed = 55661978 MB/s   bx2052  CPU0  4
RANK = 21 TIME = 36856 sec Ave Speed = 34729458 MB/s   bx2052  CPU1  5
RANK = 22 TIME = 22994 sec Ave Speed = 55666382 MB/s   bx2052  CPU0  6
RANK = 23 TIME = 22991 sec Ave Speed = 55673257 MB/s   bx2052  CPU1  7


F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq

Fig. F-1 Program (Flat MPI)

(main)
      n = nmb*1024*1024/8          ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                    ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
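The ave_spd expression in the listing above appears to divide the bytes moved per copy (nsize elements of 8 bytes each) by the time per copy (timew divided by NNN repetitions), then convert to MB/s by dividing by 1024 twice. The following is a minimal sketch of that arithmetic, assuming this reading of the expression. The function name `average_copy_speed` and the 4.0-second elapsed time are hypothetical values chosen for illustration, not measurements from the report.

```python
def average_copy_speed(n_elements, elapsed_s, repetitions, bytes_per_element=8):
    """Average copy speed in MB/s (1 MB = 1024*1024 bytes).

    Mirrors the assumed form of ave_spd:
    (n_elements * 8) / (elapsed_s / repetitions) / 1024 / 1024
    """
    bytes_per_copy = n_elements * bytes_per_element
    seconds_per_copy = elapsed_s / repetitions
    return bytes_per_copy / seconds_per_copy / 1024.0 / 1024.0


# Array length as in the listing: nmb MB of 8-byte reals, with nmb = 128.
nmb = 128
n = nmb * 1024 * 1024 // 8

# Hypothetical timing: 100 copies taking 4.0 s in total, i.e. 0.04 s per 128MB copy.
speed = average_copy_speed(n, elapsed_s=4.0, repetitions=100)
print(f"{speed:.1f} MB/s")
```

With these illustrative inputs each copy moves 128MB in 0.04 s, so the result is about 3200 MB/s.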

Fig. F-2 Program (OpenMP)

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8          ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
         write(6,*) 'Array allocation failed'
         stop
      end if
!$OMP end parallel
      nsize = n                    ! size
c     NNN : Repetition number of times
      NNN = 100
!$OMP parallel private(i),shared(nsize)
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd),
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 Memory bandwidth (Appendix F). [Remainder of caption and all table values are illegible in the source extraction.]

Appendix G Flat-MPI vs Hybrid

[The Japanese text of this appendix is illegible in the source extraction. The legible fragments indicate that it compares MPI performance on BX900 between flat MPI (1024 processes and 128 processes) and hybrid execution (128 processes × 8 threads), for message sizes of 8 bytes to 1 Kbyte and 256 Kbytes to 1 Mbyte. RDMA and NORDMA are also mentioned.]

Table G-1 MPI [illegible] (flat MPI and hybrid, 8 to 1024 B) (1). [Table values are not recoverable from the source extraction.]

Table G-2 MPI [illegible] (flat MPI and hybrid, 8 to 1024 B) (2). [Table values are not recoverable from the source extraction.]

Table G-3 MPI [illegible] (MPI and hybrid, 256 to 1024 KB). [Table values are not recoverable from the source extraction.]



A new supercomputer system was deployed at JAEA in March 2010. The system consists mainly of a large-scale Linux cluster system (PRIMERGY BX900, 200 Tflops) and a lead-off system for RIKEN's K computer (FX1, 12 Tflops). The former delivers a high-performance parallel computing environment, and the latter an application code development environment for the K computer. In this report, we present the results of an evaluation of the fundamental performance of the system. We ran the HPCC benchmark suite as well as independent codes that measure floating-point performance, memory bandwidth, and interconnect bandwidth. Analysis of these results yielded many useful findings for users of BX900 and FX1.

Keywords: Supercomputer, Fundamental Performance, Benchmark

*2 RIST (Research Organization for Information Science and Technology)


Contents

1 Introduction
2 Outline of the system
  2.1 Hardware construction of BX900
  2.2 Hardware construction of FX1
  2.3 Hardware construction of Storage
3 Performance evaluation by HPCC benchmarks
  3.1 Outline of HPCC benchmarks
    3.1.1 Test programs
    3.1.2 Rules
  3.2 Measured results
    3.2.1 BX900
    3.2.2 FX1
  3.3 Summary
4 Tunings for BX900 and FX1
  4.1 Profiler
  4.2 Automatic tuning of memory access
    4.2.1 Efficacy for Original version of STREAM benchmark
    4.2.2 Efficacy for HPCC version of STREAM benchmark
  4.3 Manual tuning of memory access
  4.4 Efficacy for NPB 2.3 FT benchmark
  4.5 Summary
5 Performance evaluation by IOR benchmarks
  5.1 Outline of IOR benchmarks
    5.1.1 Key parameters
    5.1.2 Option parameters for MPIIO
  5.2 Measured results
    5.2.1 POSIX
    5.2.2 MPIIO
    5.2.3 Local cache
  5.3 Summary
    5.3.1 Choice of file system
    5.3.2 POSIX vs. MPIIO
    5.3.3 Buffer size
6 Performance evaluation of the interconnect
  6.1 Specification of the interconnect
    6.1.1 Construction and Nodebind
    6.1.2 Communication method
  6.2 Fundamental performance of MPI communication
    6.2.1 Measuring method
    6.2.2 Measured results
  6.3 RDMA
    6.3.1 Measuring method
    6.3.2 Measured results
  6.4 Intra-CPU, inter-CPU, and inter-node communications
    6.4.1 Measuring method
    6.4.2 Measured results
  6.5 Communications between chassis
    6.5.1 Point-to-point communication
    6.5.2 Collective communication
  6.6 Summary
7 Concluding Remarks
Acknowledgements
References
Appendix A HPCC benchmark results on FX1
Appendix B Fundamental performance of MPI communication
Appendix C RDMA vs. NORDMA
Appendix D Overlapping communications with calculations
Appendix E Peer-to-peer communications
Appendix F Memory bandwidth
Appendix G Flat-MPI vs. Hybrid


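1 Introduction

In March 2010 (Heisei 22), the Japan Atomic Energy Agency (JAEA) replaced its supercomputer system, which had a total peak performance of 15.3 Tflops, with a new system whose main components are a large-scale Linux cluster (PRIMERGY BX900) with a total peak performance of 200 Tflops and a prototype of the next-generation supercomputer (FX1) with a total peak performance of 12 Tflops. FX1 is intended for developing and tuning applications in preparation for the next-generation supercomputer. This report evaluates the fundamental performance of PRIMERGY BX900 and FX1 so that users can exploit both systems effectively.

Chapter 2 gives an overview of the system. Chapter 3 evaluates the systems with the HPCC benchmarks, which are widely used for evaluating supercomputers. Chapters 4 to 6 examine memory access, file I/O and the interconnect (communication) in turn. Chapter 7 summarizes the results.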

2 System overview

The new system consists of a large-scale parallel computing unit, a next-generation prototype unit and shared storage. In total it provides a peak performance of 214 Tflops, about 56 TB of memory and about 1.2 PB of disk storage. An overview of the system is shown in Fig. 1.

2.1 Large-scale parallel computing unit

The large-scale parallel computing unit is a PRIMERGY BX900 (hereafter BX900), a Linux cluster with a total peak performance of 200 Tflops and a total memory of 50 TB.

BX900 is a high-density blade server whose 10U chassis holds up to 18 server blades and up to 8 connection blades. Each node has two Xeon X5570 (quad-core) processors and 24 GB of memory (DDR3 SDRAM; some nodes have 12 GB or 48 GB). The peak performance of a node is 93.76 Gflops (2.93 GHz × 4 floating-point operations per clock × 4 cores × 2 processors) and its memory bandwidth is 51.2 GB/s. The nodes are connected by InfiniBand (IB) 4x QDR, which provides a bidirectional bandwidth of 8 GB/s.
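The node figures quoted for BX900 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the memory-bandwidth part assumes three DDR3-1066 channels per processor with 8 bytes per transfer, which is the usual configuration of the Xeon X5570 but is not stated explicitly in this report.

```python
def node_peak_gflops(clock_ghz, flops_per_clock, cores_per_cpu, cpus_per_node):
    """Theoretical peak of one node in Gflops."""
    return clock_ghz * flops_per_clock * cores_per_cpu * cpus_per_node


def node_memory_bandwidth_gbs(mega_transfers_per_s, bytes_per_transfer,
                              channels_per_cpu, cpus_per_node):
    """Theoretical memory bandwidth of one node in GB/s."""
    return (mega_transfers_per_s * bytes_per_transfer
            * channels_per_cpu * cpus_per_node) / 1000.0


# BX900 node: 2.93 GHz x 4 flops/clock x 4 cores x 2 processors
bx900_peak = node_peak_gflops(2.93, 4, 4, 2)

# Assumption: DDR3-1066 (1066.67 MT/s), 8 bytes/transfer, 3 channels per CPU
bx900_bw = node_memory_bandwidth_gbs(1066.67, 8, 3, 2)

print(f"peak per node  : {bx900_peak:.2f} Gflops")
print(f"peak per core  : {bx900_peak / 8:.2f} Gflops")
print(f"memory bandwidth: {bx900_bw:.1f} GB/s")
```

Under these assumptions the results agree with the 93.76 Gflops and 51.2 GB/s given in the text.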


Fig. 1 System overview

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

22 []fIumlIacuteIcircIumlETHNtildeJ []fIumlIacuteFX1L13FX1P12Tflops =NOLgYPWMacirc0 46TBnotGYZ$13gtlaquoq

1iumlIumlQR SPARC64SLYccedilIumlfPT^_ccedil-Wacirc0 16GB8ampGqUgraveD|UCPUVBccedil^ccedil`LJSCPbrvbarsectAuml|eacuteIumlOslashWCqiumlIuml FBBLFull Bisectional BandwidthPccedil`X| InfiniBand 4xDDRLDAgravez 2GBsPgtYZGqiumlIumleacute|O=f7YOtilde IcircIumlETHNtildegtO=GVBOumltimesLAumla$fY`accedilPyBGmWgt [

brvbarUcircXOcirc$FG]EumlIgrave^_ZOslash5^=_wXM

`notGq FX1 gtFX1 CcedilEgraveordfxfgh$WXpound`acC2012 b

=Gxfgh$yBzD^|13~h5Pq 23 OgraveOacuteOcircOtildeYOumltimesIcircIumlETHNtildeJ OgraveOacuteOcircOtildeYOumltimes IO iumlIumlSPARC Enterprise M90002 6Ocirc$OcircOtildeYETERNUS DX80366$OcircOtildeYETERNUS4000 model40016gtJeumlnq


MOcircOtildeY12 12PBLRAID6Pgtlaquosect13gt 576GBs IO LgYPnotGLFig 2PqIO iumlIuml InfiniBand nota13Parallelnavi SRFSbrvbarsectciumlIumlcdYoacuteCaidefgCq Ocirc$OcircOtildeYOumltimeslaquoDsect 466Za SASOcircOtildeYL1TB7200rpmP8ampG

qmndOcircOtildeY RAID6 gt^UcircCOumltimesgt`ZagGmWgt IO AumlFUcirchpoundqXIOiumlIuml_aumla13Sun QFSBCLFig 3Pq

[Fig. 2: diagram of the storage connection. The 119 chassis of the large-scale parallel computing unit are connected through QDR InfiniBand switches (IB SW) to the two I/O nodes (M9000), and the I/O nodes are connected to the disk storage (ETERNUS DX80 × 36). Transfer rates marked in the figure: 72 GB/s (total of the two I/O nodes) on the InfiniBand side and 57.6 GB/s (storage total) on the storage side.]

Fig. 2 Storage system configuration and transfer rates

[Fig. 3: diagram of the Fibre Channel (FC) connections between the two I/O nodes (M9000) and the disk storage. ETERNUS DX80 (36 units, 1 TB disks) is attached at 57.6 GB/s (144 FC links) and ETERNUS4000 Model 400 (1 unit) at 3.2 GB/s (8 FC links); the volumes are served as QFS file systems.]

Fig. 3


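3 Performance evaluation by HPCC benchmarks

BX900 and FX1 were measured with the seven benchmark programs of HPCC (High Performance Computing Challenge) 1). This chapter outlines HPCC, presents the measured results and compares the systems.

3.1 Outline of HPCC benchmarks

The HPCC benchmark is a suite of benchmark programs for evaluating supercomputers, developed by a group led by J. Dongarra of the University of Tennessee in the United States. Instead of rating a supercomputer by a single figure, it characterizes the system from several viewpoints. Three programs measure computational performance (HPL (High-Performance Linpack), DGEMM and FFTE), two measure memory access (STREAM and Random Access), and two measure communication (PTRANS and b_eff) 2).

The original versions of these programs are written in Fortran or C, whereas HPCC provides them as a single program written in C. HPCC results have been collected since 2003 and around 300 have been registered, so the results can be compared with those of other systems.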

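3.1.1 Benchmark programs

A supercomputer is built from nodes, each containing one or more CPUs, connected by a network, and its performance differs depending on whether a core, a CPU, a node or the whole system is considered. HPCC therefore measures in three modes: using all nodes (G: Global system performance), using a single process (SN: Single Environment), and running independent copies on all processes at once (EP: Embarrassingly Parallel).

3.1.1.1 HPL

HPL measures the computational performance of the system by solving a system of linear equations with LU decomposition. It has long been used as a standard benchmark; it is the MPI parallel version of Linpack, the benchmark used for the TOP500 ranking of supercomputers. The result is reported in Tflops. One measurement is made, using all nodes (G).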


3.1.1.2 DGEMM

DGEMM measures the floating-point performance of a node through double-precision matrix-matrix multiplication. This operation is the computational core of Linpack, and supercomputer vendors usually supply a tuned implementation in their numerical library, so the result reflects the library that is linked. The result is reported in Gflops (Giga floating-point operations per second). Two measurements are made: SN and EP.

3.1.1.3 STREAM

STREAM measures the sustainable data transfer rate between memory and CPU within a node. It consists of four operations, Copy, Scale, Add and Triad, and the memory bandwidth is reported in GB/s (Giga Bytes per second). Eight measurements are made: the four operations (Copy, Scale, Add, Triad) in each of the SN and EP modes.

3.1.1.4 PTRANS

PTRANS (Parallel matrix TRANSpose) transposes a matrix distributed over the parallel processes and thereby measures the communication capacity of the network. The result is reported in GB/s. One measurement is made, using all nodes (G).

3.1.1.5 Random Access

Random Access measures the rate at which randomly chosen locations of a large table in memory can be updated. Each update touches only 8 bytes of data. Within a node the updates go to local memory; across nodes they are carried out through MPI. The result is reported in Gups (Giga updates per second). Three measurements are made: SN, EP and all nodes (G).

3.1.1.6 FFTE

FFTE measures floating-point performance through the fast Fourier transform. Like HPL, it has long been used as a standard benchmark. Measurements are made in the SN, EP and all-node (G) modes.
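As an illustration of the four STREAM operations listed in 3.1.1.3, the sketch below applies Copy, Scale, Add and Triad to small Python lists and converts an elapsed time into a bandwidth figure. It is not the HPCC code: the function names are hypothetical, and the bytes-per-iteration table follows the counting convention of the original STREAM benchmark for 8-byte elements (two arrays touched for Copy and Scale, three for Add and Triad).

```python
# Bytes moved per loop iteration for 8-byte (double precision) elements.
BYTES_PER_ITERATION = {"copy": 16, "scale": 16, "add": 24, "triad": 24}


def run_stream_kernels(a, b, c, scalar):
    """Apply the four STREAM kernels in order, updating the lists in place."""
    n = len(a)
    for i in range(n):                 # Copy:  c = a
        c[i] = a[i]
    for i in range(n):                 # Scale: b = scalar * c
        b[i] = scalar * c[i]
    for i in range(n):                 # Add:   c = a + b
        c[i] = a[i] + b[i]
    for i in range(n):                 # Triad: a = b + scalar * c
        a[i] = b[i] + scalar * c[i]


def bandwidth_gb_per_s(kernel, n, seconds):
    """Bandwidth in GB/s for one pass of `kernel` over n elements."""
    return BYTES_PER_ITERATION[kernel] * n / seconds / 1e9


a, b, c = [1.0] * 4, [2.0] * 4, [0.0] * 4
run_stream_kernels(a, b, c, scalar=3.0)
print(a, b, c)
print(bandwidth_gb_per_s("triad", 10**6, 0.01), "GB/s")
```

A real measurement uses arrays much larger than the cache and takes the best time over several repetitions; the tiny lists here only show what each kernel computes.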


FFTE results are reported in Gflops.

3.1.1.7 b_eff

b_eff measures the communication latency and the communication bandwidth of the network. The smaller the latency and the larger the bandwidth, the better the communication performance. b_eff uses all nodes (G) and consists of two programs, Ring and PingPong. Latency is measured by transferring 8-byte data and bandwidth by transferring 2 MB data.

In Ring, all MPI processes are arranged in a ring and each process communicates with its neighbours. Two arrangements are measured: one in which the processes are placed in their natural order and one in which they are placed in random order. In PingPong, data is sent back and forth between two cores, and the measurement is repeated for different pairs of cores.

b_eff reports six results: ① Ring (natural order) latency, ② Ring (natural order) bandwidth, ③ Ring (random order) latency, ④ Ring (random order) bandwidth, ⑤ PingPong latency and ⑥ PingPong bandwidth. Latency is reported in microseconds and bandwidth in GB/s.

3.1.2 Execution modes

HPCC defines two kinds of run, the baseline run (Baseline Runs) and the optimized run (Optimized Runs). The measurements in this report are baseline runs, tuned through the compiler only.

3.1.2.1 Baseline run

A baseline run uses the source code as distributed. Only the BLAS library and the MPI library may be chosen by the user, together with the compiler and its optimization options.

3.1.2.2 Optimized run

An optimized run allows parts of the programs to be replaced or tuned for the target system, in addition to what is permitted in a baseline run.


3.2 Measured results

The HPCC BMT programs were run on BX900 and on FX1 with 8 to 512 parallel processes.

3.2.1 BX900

The HPCC BMT results on BX900 are listed in Table 1, and the results of HPL, PTRANS, Random Access and FFTE are plotted in Fig. 4 to Fig. 7. The plots show how each result scales with the number of parallel processes up to 512. The efficiency of HPL relative to the theoretical peak was 82.2% (= 4.902/5.96 × 100) 1).

For comparison with Table 1, Table 2 shows the results published on the HPCC web site 1) for the Intel Endeavor cluster (hereafter Endeavor), whose hardware configuration is nearly the same. BX900 falls below Endeavor in several items. Table 3 shows the BX900 results obtained with the Intel compiler, the same compiler as used on Endeavor; they are about the same as those obtained with the Fujitsu compiler. In particular, the STREAM result of BX900 is about 20% lower than that of Endeavor (3.5 GB/s against 4.2 GB/s). This point is examined in 4.2.2.

1 With 17072 cores of BX900, HPL achieved 191.4 Tflops against a theoretical peak of 200.0 Tflops, an efficiency of 95.66%.
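The efficiency figures for HPL are simple ratios of measured to theoretical performance. The sketch below reproduces them; the function name is hypothetical, and the per-core peak of 11.72 Gflops is an assumption derived from the BX900 node specification (93.76 Gflops for 8 cores) rather than a number quoted in the footnote.

```python
def efficiency_percent(measured_tflops, peak_tflops):
    """Efficiency of a run relative to the theoretical peak, in percent."""
    return measured_tflops / peak_tflops * 100.0


# 512-parallel run: 4.902 Tflops measured against 5.96 Tflops
eff_512 = efficiency_percent(4.902, 5.96)

# Full-system run: 17072 cores, assumed 11.72 Gflops per core
peak_full = 17072 * 11.72 / 1000.0          # Tflops
eff_full = efficiency_percent(191.4, peak_full)

print(f"512 parallel : {eff_512:.1f} %")
print(f"17072 cores  : peak {peak_full:.2f} Tflops, {eff_full:.2f} %")
```

Using the unrounded peak of about 200.08 Tflops gives the 95.66% stated in the footnote; dividing by the rounded 200.0 Tflops would give 95.70%.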

Table 1 HPCC BMT results on JAEA's BX900 with the Fujitsu compiler

[The table values are not legible in this copy. Rows: 8, 16, 32, 64, 128, 256 and 512 parallel processes, each with the rdma and nordma communication methods. Columns: G-HPL (TFlops), G-PTRANS (GB/s), G-RandomAccess original version (Gups), G-RandomAccess SANDIA_OPT2 version (Gups), G-FFTE (GFlops), EP-STREAM Sys (*) (GB/s), EP-STREAM Triad (GB/s), EP-DGEMM (GFlops), RandomRing Bandwidth (GB/s), RandomRing Latency (usec).]

Fujitsu BX900 (512 cores / 128 CPUs); Chip: Xeon X5570 2.93 GHz, DDR3-1066, SMT-ON, Turbo Mode-OFF; Interconnect: QDR InfiniBand, 2 sockets/node, Fat Tree; OS: Red Hat EL 5; Compiler: Fujitsu C/C++; Options: -Kfast -SSL2; MPI: Fujitsu Parallelnavi Language Package; HPCC Ver. 1.3.1
(*) The EP-STREAM Sys value is computed as (EP-STREAM Triad) × number of parallel processes.

Fig. 4 BX900 HPL

Fig. 5 BX900 PTRANS

Fig. 6 BX900 Random Access

Fig. 7 BX900 FFTE

Table 2 HPCC results of the Intel Endeavor cluster (HPCC Web 1))

Table 3 HPCC BMT results on BX900 with the Intel compiler

322 FX1 FX1 HPCC BMT Table 4 HPL

PTRANS Random Access FFTE Fig 8 Fig 11512

HPL755(=377650100)2

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037

Table 4 HPCC BMT results on FX1

Fig. 8 FX1 HPL

Fig. 9 FX1 PTRANS

Fig. 10 FX1 Random Access

Fig. 11 FX1 FFTE

33 HPCCYUgraveWH Uklq BX900W FX1WrsndWK(Altix3700Bx2)Wrs

Fig 12GqmhgtTWXpoundDqshy 100WCLRandom Ring LatencyPqRandom Ring Latency13BX900ordf`ccedil^fCmWordfcurrencq

BX900 W FX1 rsGWSTREAM W FFTE necircOatildeC atildeC eacuteIumlrvCDordfmacrdnDqSTREAM

421FFTE 43aringNGq XAltix3700Bx2 Random AccessordfEordfAltix3700Bx2W

plusmngtegraveBCD HPCC BMTeacuteYacute~cdOcirc$Y|5ordf+Xpoundsect8YrsGmW13XqAltix3700Bx2 atildeC FFTEEordfmdoacute OslashszligWuumlwdn

q

[Fig. 12: chart comparing BX900, FX1 and Altix3700Bx2 at 512 parallel processes (Flat MPI) on a scale of 0 to 100. Items, with the values printed in the chart labels: G-HPL (4.90 Tflops), G-RandomAccess (1.54 Gups), G-PTRANS (112.0 GB/s), EP-STREAM (4.36 GB/s), EP-DGEMM (11.1 Gflops), G-FFTE (275.5 Gflops), RandomRing Bandwidth (0.359 GB/s), RandomRing Latency (5.18 usec).]

Fig. 12 Comparison of the HPCC BMT results of each system

4 BX900W FX1TUacuteUcirc 32 c^_Zgt BX900 W FX1 IcircIumlETHNtildeoumlccedilYWiCXordfWnDmWQR^_aZegraveBC(Xcurrenyen

5poundDqeumldfeaZUuml^13~brvbarTUacuteUcirc]Ucirc7^_ZTUacuteUcirc

EacuteDq 41 ^_aZ

BX900 W FX1 gtegraveBCD^_aZWQRUgraveyenXgtlaquosectoacute ^_ZOslash5^_a|Ocirc$divideC^_ZOslash5cedildivideOcirc$divideC

^_Z=)yen5PmWordf13q ^_aZyBGUgrave^_Z_IumlYacuteh)JGDHfe

a|YUuml^13~UKtl_trtGq[ _ZOslash5_IumlYacute

h fpcoll fIumlgt=GqTcedildivideCDeacuteaTHORN|Ocirc$ fprof fIumlgtdivideCaring`Gq^_aZgtPAElig 8q|gtlaquoq

^_a|Ocirc$divideAElig AElig ^_ AElig MPIZaZ|ordm)AElig f`AElig IcircIumlETHNtilde$AElig fZAElig UcircfIumlAElig

mPIcircIumlETHNtilde$AEliggtCPUlt=gtOaringaeligccedil13

hccedil`|WOcirc$FGszligbordfoacute WXq

42 feaZbrvbar|YTUacuteUcirc BX900HPCC BMT^_Z STREAMordfotildeb

gtlaquo EndeavorWrsC(35GBsacute42GBs)W 2ethMEshygtlaquopoundDmWETHNtilde5poundDqETHNtildeBDUcirc HPCCpSTREAMUuml|YacuteTHORN(CUcirc amp FortranUcirc)7ccedilIumlEumlIgraveUcircWMPIEumlIgraveUcircgtlaquoq[13BGbrvbarPmdegn BX900 feaZUuml^13~UKntst brvbarsectCDmWcdEndeavorfeaZ(IntelUgrave)W BX900feaZ(QRUgrave)igraveiacuteWuumlwdnq


421 Uuml|YacuteTHORN STREAM^_Zbrvbarszligagraveaacute BX900Uuml|YacuteTHORN STREAM^_Z Table 5Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntst brvbarordfmacrdnDqDotildeCUuml^13~gtOcirc$|sectumlGaringaeligccedil13hyBCXbrvbarPGDHoacute WX DO^SgtOcirc$sup1yBordfXmWordfdcXoumlEacuteegraveBGuecircgtlaquosectPgtXoumlordfUcirc

GordflaquoqUgraveDfeaZordfoacute WGDO ^SgtsectumlOcirc$ordf^^shypoundZYeumlnoumlEacutegtlaquoq CUcircoumlIntelfeaZXd OpenMPEumlIgraveUgraveUgravegt 42GBscoreordfdegdnDordfQRfeaZgt7ccedilIumlEumlIgraveUcirc OpenMP EumlIgravecd=EumlIgrave^regCbrvbarP]C Intel feaZWbWXpoundDqQRfeaZgtlaquo CD7ccedilIumlEumlIgraveUcircAgraveogravebrvbarpoundOcirc$copytimes]AgravebTUacuteUcircAgraveograveordflaquoWnq [ FX1Uuml|YacuteTHORN STREAM^_Z Table 6Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntstbrvbarordfmacrdnDqXTriad (132GBs)CPUQRordfWebgtUumlCshy(135GBs)CPUWotildeEacutegtlaquoq CUcircoumlordf FortranUcircUKntstUuml^13~XCWotildeiCqFX1 CfeaZ BX900W+XsectUKntstUuml^13~ordfcentsup3CXqDHFX1 CfeaZbrvbarTUacuteUcircordf$currengtlaquopoundD9ordfAumlq

Table 5 BX900 STREAM results

Table 6 FX1 STREAM results

422 HPCC BMT^_Z STREAMbrvbarszligagraveaacute 321gtdegdnD BX900 HPCC BMT^_Z STREAM

Table 7GqmmgtUKntstcopyIgraveOcirc$ecircEacutearingaeligccedil13heumlX storeampegraveBGmWUKrestp=all(a$XsectordfXmW)WCDTUacuteUcirc5PfeaZUuml^13~gtlaquoq UgraveDyacutegtgtlaquoordf _ZOslash53iumlIuml XC

iumlIumlSgt|Yo CPU]o CoreiDHEumlIgrave] Agravejatildedn~nfeaZUuml^13~brvbar^UcircCXq

STREAMbrvbarPX CoreIumliX2gtZX|Yordf5n DO^gtfeaZUuml^13~UKntst GmWgt|Y+13mWordfcurrencpoundDq

Table 7 STREAM in HPCC BMT (MPI parallel version, C version)

                      Compiler options (1)               Compiler options (2)
Parallel  Comm.    Chassis  EP-STREAM Triad (GB/s)    Chassis  EP-STREAM Triad (GB/s)
   32     rdma        1            3.4739                -              -
   32     nordma      1            3.5016                1            4.3316
  128     rdma        3            3.4708                -              -
  128     nordma      3            3.4692                1            4.3572
  512     rdma        5            3.4726                4            4.3595
  512     nordma      4            3.5477                4            4.3424

(1) Fujitsu compiler options: -Kfast,packed,ntst -SSL2
(2) Fujitsu compiler options: -Kfast,packed,ntst,restp=all -SSL2
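The effect of the second option set in Table 7 can be expressed as a relative gain in EP-STREAM Triad. The sketch below is a small illustration using the nordma rows, which have values for both option sets; the variable names are hypothetical, and the values assume that the decimal point of each entry follows the first digit (for example 35016 read as 3.5016 GB/s).

```python
# (parallel processes) -> (Triad with options (1), Triad with options (2)), GB/s
triad_nordma = {
    32: (3.5016, 4.3316),
    128: (3.4692, 4.3572),
    512: (3.5477, 4.3424),
}

# Relative gain of option set (2) over option set (1), in percent
gain_percent = {
    procs: (after / before - 1.0) * 100.0
    for procs, (before, after) in triad_nordma.items()
}

for procs, gain in gain_percent.items():
    print(f"{procs:4d} parallel: +{gain:.1f} %")
```

With these values the gain stays between 20% and 30% at every process count, which is consistent with the gap of roughly 20% to Endeavor (3.5 GB/s against 4.2 GB/s) discussed in the text.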


43 hhbrvbar|YTUacuteUcirc 32 gtdcWXpoundDbrvbarPBX900 W FX1 FFTE ^_Z

acircIcircIumlETHNtilde13yacuteordfmacrdnDqmETHNtildeDH5pound

DhhW13gtszlig0Gq 13gt FFTE 32EumlIgraveOslash5CDWecircoacuteG ethBX900

gt 12FX1 gt 16Wgteumlqpoundordm)1|YEcirc2OgtlaquopoundD9ordfAumlqmgtFFTEUuml|YacuteTHORN(FortranUcirc)l3CUuml|YacuteTHORN(13aumlWoacuterGoumlUcircWdivideoslash)WaumlIacutecurren4C)JCDaumlacircoacuteGaumlZh5poundDiacuteegraveBC

hbrvbarpound^UcircCDordm)GmWbrvbarsectBX900 W FX1 oacuteG5g5poundDqUcircpaumlbrvbar5poundDh 1QRfeaZW Endeavor egraveBeumlnDafeaZWrs5PDHW 2auml13IacutecurrenTUacuteUcircbrvbar6)BbordfaumlIacutecurrenthorn7wXmWagraveaacuteGDHgtlaquoq UcircaumlIacutecurren 38plusmnUcircWmndaumlZhCDiacute

Ucircn~n Fig 13W Fig 14GWoIacutecurreniacuteEcircu0Gq Fig 13W Fig 14OgravegtCDIacutecurrenOO=5pound^gtlaquoqUuml|YacuteTHORN

gtcopyIgrave APXYZWWWWOcirc$Yordf 2ugt5nsectZgtXcpoundDDHiacutegt)9BcopyIgrave CY_W^curreneth5poundcopyIgrave APXYZWWWWYordfZXbrvbarPCDq)9BcopyIgrave CY_YZCXcpoundD=Ocirc$sectumlG^gtYordfZXpoundDoumlUcircordfecircDHgtlaquoq

Fig 13W Fig 14OacutegtCDIacutecurren3[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 3ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 3gtXC 6Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquosectaringaeligccedil13h-

a5brvbarpoundltG_ccedilY-a5^_ZUgrave)acircregmacr= 16 gteumlnq

Fig 13W Fig 14OcircgtCDIacutecurren2[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 2ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 2gtXC 4Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquoqFX1 gtBX900rufntildeDsectaringaeligccedilaringh-a5ordfgteumlmWcdmO=)t_ccedilY-a5 16cd 4^regGmWgtzyacuteordfmacrdnDq

Table 8BX900W FX1QRfeaZegraveBCDUcircWaumlbrvbarEuml BX900 afeaZegraveBCDaumlbrvbarGq

[Fig. 13: Fortran source listing of the original version of FFTE, with the loops ①, ② and ③ marked. The listing is not legible in this copy.]

Fig. 13 FFTE program, original version

[Fig. 14: Fortran source listing of the modified version of FFTE, with the loops ①, ② and ③ marked. The listing is not legible in this copy.]

Fig. 14 FFTE program, modified version

Table 8 FFTE Ver. 4.1 Full and kernel (32 parallel, 180 MB/Proc)

[The table values are not legible in this copy.]

UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq

[Fig. 15: Fortran source listing of the NPB2.3 FT program (original version, cfftz). The listing is not legible in this copy.]

Fig. 15 NPB2.3 FT program (original version, cfftz)

[Fig. 16: Fortran source listing of the NPB2.3 FT program (original version, fftz2). The listing is not legible in this copy.]

Fig. 16 NPB2.3 FT program (original version, fftz2)

UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc

Table 9 NAS PARALLEL Benchmarks Ver. 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver. 2.3 FT

5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

In one test, the number of read/write calls issued by each task is (blockSize ÷ transferSize × segmentCount), and the file size is (blockSize × segmentCount × number of MPI processes). For example, with transferSize = 1 KiB, blockSize = 1 MiB (here 10^6 bytes is written MB and 2^20 bytes is written MiB), segmentCount = 1 and 32 MPI processes, a testFile of blockSize × segmentCount × number of processes = 32 MiB is created, and each task accesses its own 1 MiB region in 1024 transfers of 1 KiB, either sequentially or randomly.

[transferSize] Number of bytes transferred by one read/write call. It must be a multiple of sizeof(long long int) and a divisor of blockSize.

[blockSize] Number of bytes each task is responsible for. 1 MiB here.

[segmentCount] Number of segments. Default 1.

[randomOffset] If 1, access is random. Default 0.

[fsync] Call fsync after writing. Default 0.

[filePerProc] Each task accesses a separate file. Effective for POSIX only. Default 0.

[useFileView] Use MPI_File_set_view() with MPI_File_write() and MPI_File_read(). Cannot be combined with random access. Default 0.

[collective] Use MPI_File_write_at_all() and MPI_File_read_at_all(). Cannot be combined with random access. Default 0.

Fig 18 IOR parameters
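As a rough check of the sizing rule given with the IOR parameters (calls per task = blockSize ÷ transferSize × segmentCount; file size = blockSize × segmentCount × number of MPI processes), the following Python sketch reproduces the 32-process example. The function name ior_sizes and the constants KIB and MIB are illustrative assumptions and are not part of IOR.

```python
KIB = 1024          # 2^10 bytes
MIB = 1024 * KIB    # 2^20 bytes


def ior_sizes(block_size, transfer_size, segment_count, num_procs):
    """Return (read/write calls per task, total file size in bytes).

    Hypothetical helper: it only mirrors the arithmetic stated in the text.
    """
    if transfer_size <= 0 or block_size % transfer_size != 0:
        raise ValueError("transferSize must be a positive divisor of blockSize")
    calls_per_task = (block_size // transfer_size) * segment_count
    file_size = block_size * segment_count * num_procs
    return calls_per_task, file_size


# Example from the text: transferSize = 1 KiB, blockSize = 1 MiB,
# segmentCount = 1, 32 MPI processes.
calls, size = ior_sizes(1 * MIB, 1 * KIB, 1, 32)
print(calls, "calls per task,", size // MIB, "MiB file")
```

With these inputs each task issues 1024 calls and the shared file is 32 MiB, matching the example above.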


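5.1.2 MPIIO-related option parameters

In IOR, MPIIO hints can be set through an Info object. The Info keys that are valid on BX900 and their values are listed in Table 11 (the values shown are the defaults).

Table 11 MPIIO Info keys and values

key                  value     description
cb_buffer_size       4194304   buffer size used for collective buffering (bytes)
cb_nodes                       number of processes that perform collective buffering
ind_rd_buffer_size   4194304   buffer size used for independent reads (bytes)
ind_wr_buffer_size   524288    buffer size used for independent writes (bytes)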


52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

Fig 19 Read/Write diagram (components shown: ETERNUS DX80 storage, 970 TB; I/O servers, SPARC Enterprise M9000; SRFS)


Table 12 (POSIX): I/O performance for transferSize = 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288 and 1048576 bytes with 32, 256 and 1024 processes


Table 13 (MPIIO): I/O performance for transferSize = 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288 and 1048576 bytes with 32 and 256 processes


521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`


Fig 20 POSIX (panels: home and work, Write and Read, sequential and random access; x-axis: TransferSize [B], 1024 to 1048576; y-axis: Performance [MiB/s]; series: SingleFile and FilePerProc for N = 32, 256 and 1024)


Fig 21 MPIIO (panels: home and work, sequential and random access; x-axis: TransferSize [B], 1024 to 1048576; y-axis: Performance [MiB/s]; series: use view and collective for N = 32 and 256)


Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`


Fig 22 File system layout (I/O server 1 (io1): ETERNUS DX80 units 1 to 18 (dx01 to dx18), SRFS file systems data_01 to data_05, home and work; I/O server 2 (io2): ETERNUS DX80 units 19 to 36 (dx19 to dx36), SRFS file systems data_06 to data_10, data, sysdata and work)


Table 15 File system configuration

SRFS file system  QFS file system  Unit size    Total capacity             DAU
data_01           QFS_data_01      2109888 MB   2109888 × 36 = 72.4 TB     1024k
data_02           QFS_data_02      2109888 MB   2109888 × 36 = 72.4 TB     1024k
data_03           QFS_data_03      2109888 MB   2109888 × 36 = 72.4 TB     1024k
data_04           QFS_data_04      2109888 MB   2109888 × 36 = 72.4 TB     1024k
data_05           QFS_data_05      3695488 MB   3695488 × 36 = 126.9 TB    1024k
home              QFS_home         3695488 MB   3695488 × 36 = 126.9 TB    512k
work                               524288 MB    (see io2)

SRFS file system  QFS file system  Unit size    Total capacity             DAU
data_06           QFS_data_06      2109888 MB   2109888 × 36 = 72.4 TB     1024k
data_07           QFS_data_07      2109888 MB   2109888 × 36 = 72.4 TB     1024k
data_08           QFS_data_08      2109888 MB   2109888 × 36 = 72.4 TB     1024k
data_09           QFS_data_09      2109888 MB   2109888 × 36 = 72.4 TB     1024k
data_10           QFS_data_10      3695488 MB   3695488 × 36 = 126.9 TB    1024k
data              QFS_data         3695488 MB   3695488 × 24 = 84.6 TB     1024k
sysdata           QFS_sysdata      3695488 MB   3695488 × 12 = 42.3 TB     16k
work              QFS_work         524288 MB    524288 × 144 = 72 TB       1024k  (built from dx01 to dx36)
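The capacity column of Table 15 can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes that the table's "TB" is obtained by dividing megabytes by 1024 × 1024 (binary prefixes), which is an inference from the printed totals rather than something the report states. The name total_tb is hypothetical.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary prefixes (1 TB = 1024 * 1024 MB)


def total_tb(unit_size_mb, count):
    """Total capacity in TB for `count` units of `unit_size_mb` megabytes."""
    return unit_size_mb * count / MB_PER_TB


# (unit size in MB, number of units) pairs as they appear in Table 15
rows = {
    "data_01": (2109888, 36),
    "home": (3695488, 36),
    "data": (3695488, 24),
    "sysdata": (3695488, 12),
    "work": (524288, 144),
}

for name, (unit_mb, count) in rows.items():
    print(f"{name:8s} {unit_mb} MB x {count:3d} = {total_tb(unit_mb, count):6.1f} TB")
```

Under this assumption the five rows evaluate to 72.4, 126.9, 84.6, 42.3 and 72.0 TB, in agreement with the table.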


6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 uses InfiniBand QDR in a Fat Tree configuration. One chassis holds 18 nodes. Each chassis contains two switches called Leaf switches, and each Leaf switch is connected to every node in the chassis by one QDR link. With two Leaf switches, each node therefore has two QDR links (18 × 2 = 36 QDR links per chassis), which gives each node a maximum bandwidth of 8 GB/s (4 GB/s × 2).

For communication between chassis, nine switches called Spine switches sit above the Leaf switches, and every Leaf switch is connected to every Spine switch by QDR. One chassis therefore has 9 × 2 = 18 QDR links to the Spine switches. Seen from the Leaf switches there are 36 downlinks and 18 uplinks, so the uplink ratio is 50%; this is referred to as 50% blocking. Fig 23 outlines the BX900 network. Because of the 50% blocking, communication between chassis on BX900 can be affected by contention. Fig 24 outlines how the nodes in a chassis are connected to the Spine and Leaf switches. The n-th node in a chassis uses the n-th Spine switch (n ≤ 9) or the (n − 9)-th Spine switch (n > 9). When a node in one chassis receives data from a node in another chassis, two QDR links (1 × 2 switches) are used from the Spine switch to the two Leaf switches, and two QDR links from the Leaf switches to the node. The links from the Spine switches to the Leaf switches, however, are shared with the other nodes in the chassis. As a result, for example, performance differs between the case where only node 1 receives data and the case where nodes 1 and 10 receive data at the same time. Communication within a chassis is non-blocking.

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

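Fig 23 BX900 network configuration (chassis 1 to chassis 119; chassis n has two Leaf switches, Leaf(n-1) and Leaf(n-2), connecting its nodes, e.g. bx0001 to bx0018 in chassis 1 and bx2125, bx2126 onward in chassis 119; Spine switches Spine(1) to Spine(9))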
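The link counts behind the 50% figure can be worked through in a short Python sketch. It assumes what the text and Fig 23 and Fig 24 indicate: 18 nodes and 2 Leaf switches per chassis, 9 Spine switches, one QDR link (4 GB/s) from each node to each Leaf switch, node names numbered consecutively from bx0001, and the n-th node of a chassis using Spine switch n (n ≤ 9) or n − 9 (n > 9). All function and constant names are hypothetical.

```python
NODES_PER_CHASSIS = 18
LEAF_PER_CHASSIS = 2
SPINE_SWITCHES = 9
QDR_GB_PER_S = 4.0  # per-link bandwidth quoted in the text


def chassis_links():
    """Return (downlinks, uplinks) of one chassis, counted in QDR links."""
    downlinks = NODES_PER_CHASSIS * LEAF_PER_CHASSIS  # node <-> Leaf
    uplinks = SPINE_SWITCHES * LEAF_PER_CHASSIS       # Leaf <-> Spine
    return downlinks, uplinks


def spine_for_node(n):
    """Spine switch used by the n-th node (1-based) of a chassis."""
    if not 1 <= n <= NODES_PER_CHASSIS:
        raise ValueError("node position must be in 1..18")
    return n if n <= SPINE_SWITCHES else n - SPINE_SWITCHES


def chassis_of(node_number):
    """Chassis index (1-based) of node bx<node_number>, assuming consecutive numbering."""
    return (node_number - 1) // NODES_PER_CHASSIS + 1


down, up = chassis_links()
print("downlinks", down, "uplinks", up, "uplink ratio", up / down)
print("per-node bandwidth", QDR_GB_PER_S * LEAF_PER_CHASSIS, "GB/s")
print("nodes 1 and 10 share Spine", spine_for_node(1), spine_for_node(10))
print("bx2125 is in chassis", chassis_of(2125))
```

Under these assumptions nodes n and n + 9 of the same chassis map to the same Spine switch, which is why simultaneous traffic from such a pair is the case most exposed to the 50% blocking.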

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq


Fig 24 Relation between nodes and switches (labels: Spine switches SW 1 to SW 9; Leaf switches; chassis; nodes 1 to 18, with 1 chassis = 18 nodes, drawn as two rows 1 to 9 and 10 to 18)

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the transfer method

Mode          RDMA          RDMA           NORDMA
Parallelism   1 to 1024     1025 or more   -
Threshold     32768 bytes   16384 bytes    524288 bytes

Table 17 RDMA mode: switching between the Normal Send method and the Send on Request method

Message size             0 to less than 16 KB   16 KB to less than 32 KB   32 KB or more
Parallelism 1 to 1024    Normal Send            Normal Send                Send on Request
Parallelism 1025 to 2048 Normal Send            Send on Request            Send on Request
Parallelism 2049 or more becomes NORDMA         Send on Request            Send on Request

Table 18 NORDMA mode: switching between the Normal Send method and the Send on Request method

Message size             0 to less than 512 KB   512 KB or more
All parallelism          Normal Send             Send on Request
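The threshold rule of Table 16 (below the threshold the Normal Send method is used, at or above it the Send on Request method) can be expressed as a small lookup. The Python sketch below is illustrative only: send_method is a hypothetical name, it models Table 16 alone, and it does not attempt to reproduce the special entry that Table 17 shows for a parallelism of 2049 or more.

```python
def send_method(mode, parallelism, message_bytes):
    """Pick the transfer method from the default thresholds in Table 16.

    mode          -- "RDMA" or "NORDMA"
    parallelism   -- number of parallel processes in the job
    message_bytes -- message size in bytes
    """
    if mode == "RDMA":
        threshold = 32768 if parallelism <= 1024 else 16384
    elif mode == "NORDMA":
        threshold = 524288
    else:
        raise ValueError("mode must be 'RDMA' or 'NORDMA'")
    return "Normal Send" if message_bytes < threshold else "Send on Request"


for case in [("RDMA", 512, 20000), ("RDMA", 2048, 20000), ("NORDMA", 512, 20000)]:
    print(case, "->", send_method(*case))
```

A 20000-byte message therefore changes method in RDMA mode when the job grows past 1024 processes, while in NORDMA mode it stays on Normal Send.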

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Ocirc$acirc

EumlIgrave AllreduceReduceBcastSendRecv 13 MPIatilde 256 32eacutea` 128eacutea` 512 yacute 128eacutea`

1024 yacute 256eacutea` 2048 yacute 2048eacutea` 4096 yacute 2048eacutea`

      do 1000 III = 1, NNN
         call MPI_BARRIER(MPI_COMM_WORLD, ierr)
         ttime1 = MPI_WTIME()            ! measurement start
         call MPI_ALLGATHER(a, nelem, MPI_DOUBLE_PRECISION,
     &                      b, nelem, MPI_DOUBLE_PRECISION,
     &                      MPI_COMM_WORLD, ierr)
         ttime2 = MPI_WTIME()            ! measurement end
         ttime0 = ttime2 - ttime1
!        the slowest rank's elapsed time is collected on rank 0
         call MPI_REDUCE(ttime0, ttime, 1, MPI_DOUBLE_PRECISION,
     &                   MPI_MAX, 0, MPI_COMM_WORLD, ierr)
         if (myid.eq.0) then
!           rank 0 accumulates maximum, minimum and sum over repetitions
            max_time = max(max_time, ttime)
            min_time = min(min_time, ttime)
            all_time = all_time + ttime
         end if
 1000 continue

Fig 25 Measurement program (excerpt)
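The bookkeeping in the Fig 25 loop (take the maximum elapsed time over ranks for each repetition, then track the maximum, minimum and sum of those values on rank 0) can be mimicked without MPI. The Python sketch below is a stand-in under that reading of the listing: the nested list replaces the per-rank timers, max() plays the role of MPI_REDUCE with MPI_MAX, and the name summarize and the sample numbers are hypothetical.

```python
def summarize(per_rank_times):
    """per_rank_times[i][r] is the elapsed time of repetition i on rank r.

    Returns (max_time, min_time, mean_time) of the per-repetition maxima.
    """
    if not per_rank_times:
        raise ValueError("at least one repetition is required")
    max_time = float("-inf")
    min_time = float("inf")
    all_time = 0.0
    for rank_times in per_rank_times:
        ttime = max(rank_times)          # stand-in for MPI_REDUCE(..., MPI_MAX, ...)
        max_time = max(max_time, ttime)
        min_time = min(min_time, ttime)
        all_time += ttime
    return max_time, min_time, all_time / len(per_rank_times)


# Three repetitions on two ranks (made-up values, in seconds)
sample = [[1.0, 1.5], [0.5, 0.75], [2.0, 1.0]]
print(summarize(sample))
```

Reducing with the maximum means each repetition is charged with the time of its slowest rank, which is the quantity that matters for a collective operation.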


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

Fig 26 MPI functions (512 processes, part 1) (panels: Allreduce, Reduce, Reduce_scatter, Allgather, Gather, Allgatherv; series: rdma and nordma; x-axis: message size, 0 to 80000)


Fig 27 MPI functions (512 processes, part 2) (panels: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank 0 to Rank 511); series: rdma and nordma; x-axis: message size, 0 to 80000)


brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq


Fig 28 RDMA and NORDMA (small data: 0 KB to 8 KB) (rows: 256, 512, 1024, 2048 and 4096 processes; scale: 0% to 100% for each MPI function)


63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq


      call MPI_BARRIER(MPI_COMM_WORLD, ierr)
      ttime5 = MPI_WTIME()               ! measurement start (total)
!     post the non-blocking sends and receives
      call mpi_isend(a, NMB, mpi_DOUBLE_PRECISION, ir1, 1,
     &               MPI_COMM_WORLD, ireq(1), ierr)
      call mpi_irecv(b, NMB, mpi_DOUBLE_PRECISION, il1, 1,
     &               MPI_COMM_WORLD, ireq(2), ierr)
      call mpi_isend(d, NMB, mpi_DOUBLE_PRECISION, il2, 1,
     &               MPI_COMM_WORLD, ireq(3), ierr)
      call mpi_irecv(c, NMB, mpi_DOUBLE_PRECISION, ir2, 1,
     &               MPI_COMM_WORLD, ireq(4), ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()               ! measurement start (computation)
!$OMP parallel do
      do j = 1, NMM
         do i = 1, NMM
            a1(i,j) = 1.d0
            b1(i,j) = 1.d0
            c1(i,j) = 0.d0
         end do
      end do
!     matrix product c1 = a1 * b1, executed while the messages are in flight
!$OMP parallel do
      do j = 1, NMM
         do k = 1, NMM
            do i = 1, NMM
               c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
            end do
         end do
      end do
      ttime8 = MPI_WTIME()               ! measurement end (computation)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4, ireq, iastatus, ierr)
      ttime6 = MPI_WTIME()               ! measurement end (total)
      ttimeDD = ttime8 - ttime7          ! computation time
      ttimeal = ttime6 - ttime5          ! total elapsed time
      ttimesd = ttimeal - ttimeDD        ! elapsed time not covered by computation

Fig 29 WO=5PDH^_Z
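The three timers at the end of the Fig 29 listing are related by ttimesd = ttimeal - ttimeDD, i.e. the part of the elapsed time that the computation did not cover. The Python sketch below illustrates that relation. The second function goes one step further and is purely an assumption of this sketch, not a formula taken from the report: given a separately measured communication-only time, it expresses how much of that time was hidden behind the computation, as a percentage. All names and sample values are hypothetical.

```python
def exposed_time(total, compute):
    """Elapsed time not covered by computation (ttimesd = ttimeal - ttimeDD)."""
    return total - compute


def hidden_percent(comm_only, total, compute):
    """Assumed metric: share of the communication-only time hidden by computation."""
    if comm_only <= 0:
        raise ValueError("comm_only must be positive")
    exposed = max(0.0, exposed_time(total, compute))
    hidden = max(0.0, comm_only - exposed)
    return 100.0 * hidden / comm_only


# Made-up timings in seconds: communication alone takes 2.0, computation 4.0,
# and running both together takes 5.0.
print(exposed_time(5.0, 4.0))          # time left over after the computation
print(hidden_percent(2.0, 5.0, 4.0))   # share of the communication that was hidden
```

With these sample values 1.0 s of communication remains exposed, so half of the communication time was overlapped with computation under the assumed definition.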


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD, ierr)
        ttime1 = MPI_WTIME()
        do 120 II = 1, NNN
          if (myid.eq.0)
     &      call MPI_SEND(a, nsize, MPI_DOUBLE_PRECISION,
     &                    IIRANK, II, MPI_COMM_WORLD, ierr)
          if (myid.eq.IIRANK)
     &      call MPI_RECV(a, nsize, MPI_DOUBLE_PRECISION,
     &                    0, II, MPI_COMM_WORLD, istatus, ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig. 30 One-to-one communication program (excerpt)
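The loop in Fig. 30 stores, for each destination rank, the elapsed time of NNN sends of nsize double-precision values. A transfer rate can be derived from that time as sketched below. The conversion is an assumption (the report does not show it), as are the function name, the use of 10^6 bytes per MB, and the example numbers.

```python
BYTES_PER_DOUBLE = 8  # size of one MPI_DOUBLE_PRECISION element


def transfer_rate_mb_per_s(nsize, nnn, elapsed_s):
    """Transfer rate in MB/s for nnn messages of nsize doubles each.

    Assumes 1 MB = 10**6 bytes; elapsed_s plays the role of timew(IC).
    """
    total_bytes = nsize * BYTES_PER_DOUBLE * nnn
    return total_bytes / elapsed_s / 1.0e6


# Hypothetical case: 128 MB messages sent 100 times in 4.0 s.
rate = transfer_rate_mb_per_s(nsize=16_000_000, nnn=100, elapsed_s=4.0)
print(f"{rate:.1f} MB/s")
```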


NODE
  CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
  CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig. 31 Assignment of MPI ranks to cores
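The rank-to-core layout of Fig. 31 can be written as a small formula. The sketch below is only an illustration: the function names are hypothetical, and the rule that even core IDs belong to CPU0 and odd core IDs to CPU1 is read off the figure rather than stated by the report.

```python
def core_id_fig31(rank):
    """Core ID for a node-local rank 0..7 under the Fig. 31 layout.

    Ranks 0-3 take the even core IDs (0, 2, 4, 6) and
    ranks 4-7 take the odd core IDs (1, 3, 5, 7).
    """
    return 2 * (rank % 4) + rank // 4


def cpu_of_core(core_id):
    """CPU socket of a core, assuming even IDs on CPU0 and odd IDs on CPU1."""
    return core_id % 2


for rank in range(8):
    core = core_id_fig31(rank)
    print(f"rank {rank} -> core ID {core} on CPU{cpu_of_core(core)}")
```

Under this layout a message from rank 0 to ranks 1 to 3 stays inside one CPU, while a message from rank 0 to ranks 4 to 7 crosses to the other CPU of the same node.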

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

Intel compiler + Fujitsu MPI, Intel compiler + Intel MPI, Fujitsu compiler + Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) Fujitsu compiler + Fujitsu MPI: the transfer rate is 3260 MB/s within a CPU, 3750 MB/s between CPUs in the same node, and 5350 MB/s between nodes; communication within a CPU is the slowest and communication between nodes is the fastest.

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


13

6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

(Diagram: data transfer among processes 0 to 7.)

Fig. 32 One-to-one data transfer pattern


Table 20 Transfer rate (within a chassis)

                            Flat MPI   Hybrid (72x2)   Hybrid (36x4)   Hybrid (18x8)
Per process (MB/s)          723.8      1447.4          2894.2          5346.7
Total per node (MB/s)       5790.6     5789.9          5788.4          5346.7
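The two rows of Table 20 are related by the number of MPI processes placed on each 8-core node. The check below is a sketch under two assumptions: the table entries are read with one decimal place (723.8 MB/s and so on), and every process on a node communicates at the same rate. The names are illustrative.

```python
CORES_PER_NODE = 8  # BX900 node: 2 CPUs x 4 cores

# case -> (OpenMP threads per process, per-process rate in MB/s,
#          per-node total in MB/s as listed in Table 20)
TABLE_20 = {
    "flat MPI":      (1, 723.8, 5790.6),
    "hybrid (72x2)": (2, 1447.4, 5789.9),
    "hybrid (36x4)": (4, 2894.2, 5788.4),
    "hybrid (18x8)": (8, 5346.7, 5346.7),
}


def node_total(threads_per_process, rate_per_process):
    """Per-node total, assuming one thread per core and equal rates."""
    processes_per_node = CORES_PER_NODE // threads_per_process
    return processes_per_node * rate_per_process


for case, (threads, rate, listed) in TABLE_20.items():
    print(f"{case:14s} computed {node_total(threads, rate):7.1f}  listed {listed:7.1f}")
```

The computed and listed totals agree to within rounding, which supports reading the second row as the sum over the processes of one node.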

Table 21 Transfer rate (between chassis, flat MPI)

        144 MPI                 192 MPI                 224 MPI
Node    Rate     IB shared      Rate     IB shared      Rate     IB shared
        (MB/s)   with another   (MB/s)   with another   (MB/s)   with another
                 node                    node                    node
 1      724.6    no             467.0    yes            467.1    yes
 2      724.4    no             467.0    yes            467.1    yes
 3      724.6    no             467.0    yes            467.1    yes
 4      724.2    no             724.3    no             467.0    yes
 5      724.3    no             724.4    no             467.1    yes
 6      724.2    no             724.0    no             723.7    no
 7      724.1    no             723.9    no             724.6    no
 8      724.4    no             724.0    no             724.5    no
 9      724.9    no             724.3    no             724.0    no
10      -        -              467.0    yes            467.0    yes
11      -        -              467.0    yes            467.2    yes
12      -        -              466.9    yes            467.1    yes
13      -        -              -        -              467.1    yes
14      -        -              -        -              467.0    yes
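The effect visible in the 192 MPI column of Table 21 can be quantified with a short calculation. This is a sketch only: the values are the table entries read with one decimal place (an assumption about the extracted digits), and the grouping into nodes that do or do not share an IB link follows the flag column as interpreted here.

```python
# Per-process rates (MB/s) in the 192 MPI column of Table 21.
shared = [467.0, 467.0, 467.0, 467.0, 467.0, 466.9]    # nodes 1-3 and 10-12
unshared = [724.3, 724.4, 724.0, 723.9, 724.0, 724.3]  # nodes 4-9


def mean(values):
    return sum(values) / len(values)


ratio = mean(shared) / mean(unshared)
print(f"shared / unshared = {ratio:.3f}")
```

On these numbers a process on a node that shares an IB link reaches a little under two thirds of the rate of a process on a node that does not.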

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq


Table 22 Transfer rate (between chassis, hybrid)

        Hybrid (112x2)          Hybrid (56x4)           Hybrid (28x8)
Node    Rate     IB shared      Rate     IB shared      Rate     IB shared
        (MB/s)   with another   (MB/s)   with another   (MB/s)   with another
                 node                    node                    node
 1      933.9    yes            1856.7   yes            3640.8   yes
 2      933.7    yes            1856.4   yes            3691.9   yes
 3      934.6    yes            1856.6   yes            3653.6   yes
 4      934.7    yes            1857.0   yes            3645.7   yes
 5      934.2    yes            1865.1   yes            3664.6   yes
 6      1438.5   no             2860.6   no             5380.8   no
 7      1448.8   no             2911.8   no             5377.9   no
 8      1450.0   no             2867.6   no             5378.9   no
 9      1449.5   no             2910.0   no             5378.4   no
10      933.7    yes            1879.7   yes            3640.8   yes
11      934.0    yes            1877.5   yes            3671.5   yes
12      934.3    yes            1865.1   yes            3655.9   yes
13      934.8    yes            1881.1   yes            3645.7   yes
14      933.8    yes            1869.4   yes            3699.1   yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq


Table 23 MPI_Alltoall under different conditions

vUacuteSTUcircR
vRcopyUumlYacutebrvbarTHORN
szligvcopyUumlYacutebrvbaragrave
pvRaacuteacircatildeNR
+aumleAumlovqaringaeligPgR

Columns: measurement case; message size (MB); entries for conditions A, B, C and D; ratios to condition A.

144mpi                   512   3R6 0 11 14 118 0 145 9 3 60 R 138 6 36 0 13 87 100 1 31 124 1 25
  2nd run (B, C, D only) 512   3R 60 111 4 118 0 148 2 11 16 R 146 9 63 0 14 75 100 1 33 132 1 32
  3rd run (B, C, D only) 512   3R 60 111 4 29 0 14 81 714 13 R 153 3 29 0 14 21 100 1 33 138 1 28
448mpi                   512   14R 40 142 6 414 0 182 4 21 272 1 R 170 3 78 0 16 32 100 1 28 119 1 14
504mpi                   512   14R 45 142 7 415 8 180 4 27 242 6 R 160 5 79 0 14 33 100 1 26 112 1 00
2048mpi                  512   37R 69 182 1 298 8 24 93 465 6 R 250 1 465 6 25 04 100 1 37 137 1 38
  2nd run                512   37R 69 166 7 2211 6 262 5 67 564 6 R 264 7 298 8 25 04 100 1 57 159 1 50
  3rd run (B, C, D only) 512   37R 69 166 7 2211 6 262 1 69 524 9 R 269 3 6 298 8 24 59 100 1 57 162 1 47
3072mpi                  512   69R 56 258 7 28 399 9 27 34 5371 54 R 286 3 9 507 7 26 44 100 1 06 111 1 02
68mpi×2omp               1024  3R5 7 12 10 11 170 11 80 91 9 R 122 3 28 5 12 21 100 0 97 101 1 01
120mpi×2omp              1024  9R3 3 12 89 215 0 127 0 12 191 6 R 130 4 47 5 12 94 100 0 99 101 1 00
132mpi×2omp              1024  8R4 1 13 79 216 5 116 1 15 211 6 R 119 5 48 3 11 38 100 0 84 087 0 83
432mpi×2omp              1024  17R 64 162 4 119 8 16 28 1925 43 R 161 2 1 129 0 16 18 100 1 00 099 1 00
30mpi×4omp               1024  4R 38 598 115 0 674 12 13 R 683 27 5 68 2 100 1 13 114 1 14
  2nd run (B, C, D only) 1024  4R3 8 59 8 115 0 678 12 13 R 690 27 5 70 0 100 1 13 115 1 17
36mpi×4omp               1024  6R 30 603 118 0 684 11 16 R 697 63 0 70 3 100 1 13 116 1 17
  2nd run (B, C, D only) 1024  6R3 0 60 3 29 0 68 8 714 13 R 702 29 0 71 0 100 1 14 116 1 18
126mpi×4omp              1024  16R 39 822 3 415 8 805 27 242 6 R 738 1 79 0 71 5 100 0 98 090 0 87
180mpi×4omp              1024  18R 50 925 615 0 892 40 283 2 R 776 127 5 75 3 100 0 96 084 0 81
  2nd run (B, C, D only) 1024  18R 50 925 712 9 861 29 352 6 R 808 109 0 76 8 100 0 93 087 0 83
612mpi×4omp              1024  45R 68 110 2 349 0 10 34 525 9 R 111 4 349 0 10 46 100 0 94 101 0 95
  2nd run (B, C, D only) 1024  45R 68 110 2 2810 9 920 57 634 9 R 960 5 358 7 83 4 100 0 83 087 0 76
17mpi×8omp               1024  3R 57 338 1 117 0 430 9 19 R 432 28 5 43 3 100 1 27 128 1 28
90mpi×8omp               1024  15R 60 504 109 0 44 9 243 8 R 471 109 0 47 1 100 0 89 093 0 94
357mpi×8omp              1024  53R 67 731 418 7 69 2 576 3 R 711 418 7 72 9 100 0 95 097 1 00
  2nd run (B, C, D only) 1024  53R 67 731 3410 5 642 57 685 3 R 628 6 438 3 60 2 100 0 88 086 0 82
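The last four entries of each row of Table 23 appear to be the values for conditions A to D divided by the value for condition A. The sketch below checks this for the first 144mpi row. It assumes that the digit groups of that row read as 11.14 (A), 14.59 (B), 13.86 (C) and 13.87 (D) and that the trailing groups read as 1.00, 1.31, 1.24 and 1.25; both readings are assumptions about where the decimal points fall, and the variable names are illustrative.

```python
# Values for conditions A-D in the first 144mpi row (assumed decimal placement).
values = {"A": 11.14, "B": 14.59, "C": 13.86, "D": 13.87}

# Ratio of each condition to the single-job condition A, rounded to 2 places.
ratios = {name: round(v / values["A"], 2) for name, v in values.items()}
print(ratios)
```

A ratio above 1 means the condition gave a larger value than the single-job measurement A for the same case.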


brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

ucircuumlyacutethorn

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net


A

FX1

HPC

C B

MT

G-F

FTEfeaZUuml^13~

-DH

PCC

_FFT

_235GmWgt

FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave

EP-

DG

EM

M-a5^regbrvbarpoundaringaeligccedil13h-a5

(15

MB

Cor

e)UacuteCordfzyacuteCq

Tabl

e A

-1feaZUuml^13~]-a5^regCD

FX1

HPC

C B

MT

UgraveccedilegravekeacuteUgraveccedilknUgraveccedilplusmnsup2sup3

~~regreg

Ugraveccedilmmnecircecirckccedilnecircfrac12

regReeumlgecirckccedilnecircfrac12

ndegplusmnsup2

ecirckccedilpUgraveecircfrac12frac12plusmnsup2sup3degmiddot

plusmnsup2ordmdegsup2sup1plusmnsup2sup3degmiddot

eacuteplusmn~+igraveiacute8

nmcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

reg~

reg~v

reg

sup2sup3plusmn

qvqyuqou

ovwyy

qvquzqq

vxzwo

qvzz

vyoy

zzv|w|

qvuqqy

uvuyx

qyu

sup2sup3plusmnqvqy|zx

ovwz

qvquyxz

vxuo

qvzuw

vyowx

zzv|z

qvxx|y

yvox

qwy

sup2sup3plusmn

qvxoquq

uvoxzo

qvoqqz

yyvuox

w|vyzx

vyoxx

zzvzyq

qv|qq

vqy

yy

sup2sup3plusmnqvxoqwwq

uvqwxx

qvooqzu

yyvxuwo

w|vyyq

vyouu

zzvzzx

qvux|

oqvz|

xw

sup2sup3plusmn

|vwq|qqq

|uvwz

qvqoxxw

qv|

o|uuv|q

vyxy

zzv||wq

qvqwzuw

oqvywyu

sup2sup3plusmn|vw||wqq

|vyz

qvuxu

yyyvuwwu

o|uyvyw

vy|q

zzv|uy

qvxwzqu

ovxuo

zq|z

sup2sup3plusmn

|vwuzqq

uuvxqwq

qvqoxxw

wwyv|zq

o|uv|q

vy|ox

xvuuwu

qvowox

oqvowuyq

sup2sup3plusmn|vwoquxqq

|yvzyxu

qvuzou

yvqwq|

o|uyvzxu

vy|qw

xvuuz

qvqwuy

ovuz

|y

uxqqq

wxqqq

|xqqqq

|qqqq

w

frac14icircRfrac14Rplusmndegreg

szligsup3degicircRRszligcedilszligiumliumlRethvRntildeRfrac14degicircRccedilogravemicroplusmnregntildemicro~sup1Oslashsup3sup2moRenecircfrac12oacuteregplusmnAumlocircotildeg

egravekeacuteOslashfrac14knicircccedilpegravekszligszligOslashmmnOslash|xRccedilpOslashpOslashfrac14knRccedilpegravekszligszligOslashfrac12ecircfrac12eacuteeacuteszlignRocircotilde

mszligpecircmicircccedilpeacutefrac14UgraveOslashOslashyunRocircotilde

frac12kicircRRkplusmnplusmnplusmnraquodegReacutemiddotplusmnmiddotRkplusmn~plusmnmiddot

-

moRexoszligcedilowszligkoumlg

szligdegicircRkszligyudividentildeRvxUgraveegraveparantildeRpposlashueuqUgravecedilregedegsup2plusmngntildeRo|vxUgravecedilregesup3plusmnregsup2RugraveRnecircfrac12gg

Cuacute uacuteucircicircuumlTyacutethornYacuteN0ucircicircuumlTyacutethornYacuteN0gt

|

xo

xo

egravekszligszligRethvicircRethov|vo

eeumlgicircecirckccedilnecircfrac12RregR0Eeecirckccedilnecircfrac12Rndegplusmnsup2gIcirc-RJ13NOIPQ

~~icircRppRmicrodegdegugraveplusmnsup2ntildeRoreg~cedilsup2ntildeRmplusmnRn


B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

(Six panels, each comparing rdma and nordma at 1024 processes: Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv, Gather.)

Fig. B-1 MPI functions (1024 processes, part 1)


(Eight panels, each comparing rdma and nordma at 1024 processes: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank 0 to Rank 1023).)

Fig. B-2 MPI functions (1024 processes, part 2)


13

C

RD

MAoacute

NO

RD

MA

Ta

ble

C-13UVOcirc$

(8K

Bccedil

78K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

quw-

uqzy-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

~plusmnreg

plusmn

sup2~

sup2~

sup2cedil~raquo

sup2~Oslashreg~plusmn

middotplusmnsup1

middotplusmnsup1raquo

~plusmnraquo

~plusmn

~plusmn

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

plusmnraquo

Tabl

e C

-2UVOcirc$

(80

KBccedil

584K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

sup2~Oslashreg~plusmn

plusmn

~plusmnreg

middotplusmnmiddotsup1

~plusmnraquo

~plusmn

sup2~

sup2cedil~raquo

sup2~

middotplusmnsup1raquo

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

~plusmn

plusmnraquo

Zordf

100M

RD

MAAgravejordfnotyWXq

0CXM

NO

RD

MA

AgravejordfnotyWXq


D

W Oslash5

Tabl

e D

-1Zccedil`

MPI

8mpi

qzvquecirciumlq|yvwuecirciumlquvzuecirciumlquovu|oecirciumlqxvwoqecirciumlquyvxqoecirciumlquyvxoecirciumlqu

qwvo

qzvzzecirciumlq|vuecirciumlquwvyecirciumlquovu|oecirciumlqxvwuecirciumlquyvuzoecirciumlquxvy|ecirciumlqu

yqyv|

owvz|oecirciumlq|yvwyyecirciumlquvxzecirciumlquvqxecirciumlquov|uecirciumlquyvuecirciumlquccedilxv|xecirciumlq

ccedilyvq

ozvqoecirciumlq|vxxecirciumlquwvux|ecirciumlquvyzoecirciumlquovxoecirciumlquyvuuqecirciumlquccedilvyecirciumlq|

ccedilwuvy

ovq|ecirciumlquyvzz|ecirciumlquwvq|qecirciumlquovu|qecirciumlqxvw|wecirciumlquyvuyoecirciumlquyvywecirciumlqu

yquv

ovquyecirciumlquvuyqecirciumlquwvxqecirciumlquovu|qecirciumlqxvwecirciumlquyvuwqecirciumlquxvzxecirciumlqu

xx|vw

|ovq|zecirciumlquyvwzuecirciumlquvz||ecirciumlquvyoecirciumlquovwecirciumlquyvuzecirciumlquccedilovozecirciumlq|

ccediloyvy

|ovquecirciumlquv|qecirciumlquwvuoyecirciumlquvyy|ecirciumlquovowyecirciumlquyvuecirciumlquccedilvxwecirciumlq|

ccedilovz

uzvwoecirciumlq|vuxyecirciumlquwvu|uecirciumlquovu|ecirciumlqxvw|ecirciumlquyvuuxecirciumlquxvwwuecirciumlqu

yqovy

uzvywecirciumlq|yvzuzecirciumlquvzoyecirciumlquovuwecirciumlqxvwqyecirciumlquyvuecirciumlquyv|yecirciumlqu

yxvu

xzvqecirciumlq|vuoqecirciumlquwv|wqecirciumlquvyyecirciumlquovowecirciumlquyvuuzecirciumlquccedilvozecirciumlq|

ccedil|vx

xzvyzzecirciumlq|voxuecirciumlquwvouecirciumlquvqqecirciumlquovxuecirciumlquyvuuyecirciumlquccediluv|yecirciumlq|

ccedilu|v

yzvuecirciumlq|vxzecirciumlquwvxecirciumlquovuyecirciumlqxvwecirciumlquyvuxecirciumlquxvyzqecirciumlqu

xw|vw

yzvz|oecirciumlq|yvwzoecirciumlquvwwxecirciumlquovuzecirciumlqxvwyecirciumlquyvuyyecirciumlquyvuqwecirciumlqu

yuxv|

zvwqecirciumlq|v|uzecirciumlquwv||yecirciumlquvyxwecirciumlquovozecirciumlquyvuzecirciumlquccedilyvwuecirciumlq|

ccedilywv

ovqq|ecirciumlquyvzzyecirciumlquvzzzecirciumlquvyzecirciumlquovqoecirciumlquyvuzecirciumlquccedil|vozxecirciumlq|

ccedil|ovz

zvuzecirciumlq|vowqecirciumlquwvoxxecirciumlquovoqqecirciumlqxuvxwecirciumlquyvuqecirciumlquvwu|ecirciumlqu

zovy

zvwecirciumlq|vyxecirciumlquwvuecirciumlquovqzzecirciumlqxuvxoecirciumlquyvuyzecirciumlquvu|ecirciumlqu

wvw

qwv|xzecirciumlq|vzecirciumlquwvywecirciumlquvxyoecirciumlquovqwuecirciumlquyvuecirciumlquccedilovqyecirciumlquccedilov

qzvuuuecirciumlq|v|yecirciumlquwvqecirciumlquovuzecirciumlqxvwqqecirciumlquyvuzoecirciumlquyvqoecirciumlqu

y|vx

oovouwecirciumlquwvqqecirciumlquzvoywecirciumlquovuqwecirciumlqxvyxxecirciumlquyvuwecirciumlquuvzoxecirciumlqu

uwvq

owvwuecirciumlq|vq|qecirciumlquvzqzecirciumlquv|uuecirciumlquzvqoecirciumlq|yvu|ecirciumlquccedilxvyxqecirciumlq|

ccedilyuv|

yvz||ecirciumlq|v|oecirciumlquwvuuecirciumlquvqyecirciumlqxovuowecirciumlqxyvu||ecirciumlquovozecirciumlqxoxwvy

ovouecirciumlquvuxecirciumlquwv|zecirciumlquovu|oecirciumlqxvwxyecirciumlquyvux|ecirciumlquxvzoecirciumlqu

xoyvo

|ov|qoecirciumlquv|zqecirciumlquwvyzoecirciumlquv|oecirciumlquwvu|uecirciumlq|yvu|ecirciumlquccedilov|uecirciumlquccediloqxv

|v|qecirciumlq|vqecirciumlquvxzecirciumlquvxqecirciumlquovq|ecirciumlquyvuzecirciumlquccedilv|zoecirciumlq|

ccedil|vu

uwvxy|ecirciumlq|wvzoyecirciumlquzv|ecirciumlquovuozecirciumlqxvuoqecirciumlquyvwqecirciumlquuvuoecirciumlqu

xoxvw

uovquecirciumlquvwecirciumlquwvqecirciumlquovuouecirciumlqxvyxqecirciumlquyvuzoecirciumlquxvwoecirciumlqu

xy|vx

xzvux|ecirciumlq|vxwecirciumlquwvx|ecirciumlquvx|oecirciumlquovoqoecirciumlquyvu|qecirciumlquccedilzvzoecirciumlq|ccediloquvz

xwvqqecirciumlq|vuuwecirciumlquwv|owecirciumlquvzuzecirciumlquov|wyecirciumlquyvxy|ecirciumlquccedil|vywuecirciumlq|

ccediluv|

yzvqecirciumlq|zv|w|ecirciumlquovq|qecirciumlqxovuxecirciumlqxvyecirciumlquyvzqecirciumlquuvqecirciumlqu

uyuvq

yovqyuecirciumlquv|quecirciumlquwv|ywecirciumlquovuyuecirciumlqxwvqxwecirciumlquyvxwoecirciumlquyvecirciumlqu

xwzvx

ovoooecirciumlquvzwwecirciumlquzvqzzecirciumlquvo|ecirciumlqxovuoecirciumlqxyvyouecirciumlquovecirciumlqxooqqvy

ovqoecirciumlquvqyoecirciumlquwvo|ecirciumlquvy|ecirciumlquovoecirciumlquyvuxecirciumlquccediluvxzecirciumlq|

ccediluvz

zvy|ecirciumlq|wvoqqecirciumlquzvqyecirciumlquov|uqecirciumlqxyvw|ecirciumlquyvxyecirciumlquuv||ecirciumlqu

uzovo

zvyzoecirciumlq|vqwecirciumlquwvoecirciumlquovqzwecirciumlqxuvuwwecirciumlquyvuzyecirciumlquvwqyecirciumlqu

yxvy

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg


(Four panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2.)

Fig. D-1 Flat MPI, 8mpi


Tabl

e D

-2Icirca|ccedil`eacuteYacute~

1

4mpij

8om

p

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

(Two panels: rdma (4mpi×8omp)-1, nordma (4mpi×8omp)-1.)

Fig. D-2 Hybrid type 1, 4mpi×8omp (1)


[Fig. D-3 Hybrid 1, 4mpi × 8omp (2). Panels: rdma (4mpi × 8omp)-2; nordma (4mpi × 8omp)-2; rdma (4mpi × 8omp)-3; nordma (4mpi × 8omp)-3.]


Table D-3 Hybrid 2 (4mpi × 8omp)

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

[Fig. D-4 Hybrid 2, 4mpi × 8omp (1). Panels: rdma (4mpi × 8omp)-1; nordma (4mpi × 8omp)-1.]


[Fig. D-5 Hybrid 2, 4mpi × 8omp (2). Panels: rdma (4mpi × 8omp)-2; nordma (4mpi × 8omp)-2; rdma (4mpi × 8omp)-3; nordma (4mpi × 8omp)-3.]


E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

nmb = 128   intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK =  1 TIME = 3.9305 sec Ave Speed = 3256.6145 MB/s
RANK =  2 TIME = 3.9197 sec Ave Speed = 3265.5168 MB/s
RANK =  3 TIME = 3.9212 sec Ave Speed = 3264.3219 MB/s
RANK =  4 TIME = 3.4341 sec Ave Speed = 3727.2765 MB/s
RANK =  5 TIME = 3.4175 sec Ave Speed = 3745.3885 MB/s
RANK =  6 TIME = 3.4086 sec Ave Speed = 3755.2470 MB/s
RANK =  7 TIME = 3.4329 sec Ave Speed = 3728.6226 MB/s
RANK =  8 TIME = 2.4021 sec Ave Speed = 5328.7420 MB/s
RANK =  9 TIME = 2.3794 sec Ave Speed = 5379.4939 MB/s
RANK = 10 TIME = 2.3805 sec Ave Speed = 5377.0236 MB/s
RANK = 11 TIME = 2.3798 sec Ave Speed = 5378.6914 MB/s
RANK = 12 TIME = 2.3942 sec Ave Speed = 5346.2490 MB/s
RANK = 13 TIME = 2.3931 sec Ave Speed = 5348.8046 MB/s
RANK = 14 TIME = 2.3950 sec Ave Speed = 5344.3826 MB/s
RANK = 15 TIME = 2.4066 sec Ave Speed = 5318.6052 MB/s
RANK = 16 TIME = 2.3798 sec Ave Speed = 5378.6327 MB/s
RANK = 17 TIME = 2.3802 sec Ave Speed = 5377.5886 MB/s
RANK = 18 TIME = 2.3821 sec Ave Speed = 5373.3399 MB/s
RANK = 19 TIME = 2.3794 sec Ave Speed = 5379.5866 MB/s
RANK = 20 TIME = 2.3939 sec Ave Speed = 5346.8837 MB/s
RANK = 21 TIME = 2.3938 sec Ave Speed = 5347.0775 MB/s
RANK = 22 TIME = 2.3939 sec Ave Speed = 5347.0333 MB/s
RANK = 23 TIME = 2.3942 sec Ave Speed = 5346.2889 MB/s


Fig E-2 afeaZQR MPIbrvbar

nmb = 128   intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK =  1 TIME = 3.9370 sec Ave Speed = 3251.1775 MB/s
RANK =  2 TIME = 3.9475 sec Ave Speed = 3242.5231 MB/s
RANK =  3 TIME = 3.9415 sec Ave Speed = 3247.4846 MB/s
RANK =  4 TIME = 3.4392 sec Ave Speed = 3721.7925 MB/s
RANK =  5 TIME = 3.4543 sec Ave Speed = 3705.5373 MB/s
RANK =  6 TIME = 3.4250 sec Ave Speed = 3737.2742 MB/s
RANK =  7 TIME = 3.4709 sec Ave Speed = 3687.8286 MB/s
RANK =  8 TIME = 2.6967 sec Ave Speed = 4746.5876 MB/s
RANK =  9 TIME = 2.6964 sec Ave Speed = 4747.1018 MB/s
RANK = 10 TIME = 2.6955 sec Ave Speed = 4748.6289 MB/s
RANK = 11 TIME = 2.6947 sec Ave Speed = 4749.9771 MB/s
RANK = 12 TIME = 2.6957 sec Ave Speed = 4748.3647 MB/s
RANK = 13 TIME = 2.7020 sec Ave Speed = 4737.3054 MB/s
RANK = 14 TIME = 2.6948 sec Ave Speed = 4749.8023 MB/s
RANK = 15 TIME = 2.7174 sec Ave Speed = 4710.4542 MB/s
RANK = 16 TIME = 2.6956 sec Ave Speed = 4748.5054 MB/s
RANK = 17 TIME = 2.6945 sec Ave Speed = 4750.4899 MB/s
RANK = 18 TIME = 2.6946 sec Ave Speed = 4750.2360 MB/s
RANK = 19 TIME = 2.6967 sec Ave Speed = 4746.6158 MB/s
RANK = 20 TIME = 2.6948 sec Ave Speed = 4749.9767 MB/s
RANK = 21 TIME = 2.6982 sec Ave Speed = 4743.8204 MB/s
RANK = 22 TIME = 2.7087 sec Ave Speed = 4725.4830 MB/s
RANK = 23 TIME = 2.7004 sec Ave Speed = 4739.9960 MB/s


Fig E-3 QRfeaZOpenMPIbrvbar

nmb = 128   intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
                                                          NODE    CPU   coreid
Rank = 0                                                  bx2050  CPU0  0
all time
RANK =  1 TIME = 3.8110 sec Ave Speed = 3358.7213 MB/s    bx2050  CPU1  1
RANK =  2 TIME = 2.8656 sec Ave Speed = 4466.7752 MB/s    bx2050  CPU0  2
RANK =  3 TIME = 3.7959 sec Ave Speed = 3372.0175 MB/s    bx2050  CPU1  3
RANK =  4 TIME = 2.8658 sec Ave Speed = 4466.4664 MB/s    bx2050  CPU0  4
RANK =  5 TIME = 3.7829 sec Ave Speed = 3383.6073 MB/s    bx2050  CPU1  5
RANK =  6 TIME = 2.8641 sec Ave Speed = 4469.1880 MB/s    bx2050  CPU0  6
RANK =  7 TIME = 3.7892 sec Ave Speed = 3377.9924 MB/s    bx2050  CPU1  7
RANK =  8 TIME = 3.5140 sec Ave Speed = 3642.5620 MB/s    bx2051  CPU0  0
RANK =  9 TIME = 3.5017 sec Ave Speed = 3655.4163 MB/s    bx2051  CPU1  1
RANK = 10 TIME = 3.8916 sec Ave Speed = 3289.1010 MB/s    bx2051  CPU0  2
RANK = 11 TIME = 4.5416 sec Ave Speed = 2818.4186 MB/s    bx2051  CPU1  3
RANK = 12 TIME = 3.5172 sec Ave Speed = 3639.2511 MB/s    bx2051  CPU0  4
RANK = 13 TIME = 3.5053 sec Ave Speed = 3651.6468 MB/s    bx2051  CPU1  5
RANK = 14 TIME = 3.5200 sec Ave Speed = 3636.3472 MB/s    bx2051  CPU0  6
RANK = 15 TIME = 3.5009 sec Ave Speed = 3656.1987 MB/s    bx2051  CPU1  7
RANK = 16 TIME = 2.2996 sec Ave Speed = 5566.1441 MB/s    bx2052  CPU0  0
RANK = 17 TIME = 3.5587 sec Ave Speed = 3596.8575 MB/s    bx2052  CPU1  1
RANK = 18 TIME = 3.5587 sec Ave Speed = 3596.8151 MB/s    bx2052  CPU0  2
RANK = 19 TIME = 3.5712 sec Ave Speed = 3584.2525 MB/s    bx2052  CPU1  3
RANK = 20 TIME = 2.2996 sec Ave Speed = 5566.1978 MB/s    bx2052  CPU0  4
RANK = 21 TIME = 3.6856 sec Ave Speed = 3472.9458 MB/s    bx2052  CPU1  5
RANK = 22 TIME = 2.2994 sec Ave Speed = 5566.6382 MB/s    bx2052  CPU0  6
RANK = 23 TIME = 2.2991 sec Ave Speed = 5567.3257 MB/s    bx2052  CPU1  7
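The logs in Figs. E-1 to E-3 report, for each rank, the time and average speed of 128 MB transfers originating at rank 0. The source of that test program is not reproduced in this appendix, so the following is only a minimal sketch of how such a measurement could be written. The program name, the repetition count `nrep`, the one-way send followed by a small acknowledgement, and the rank-by-rank ordering are all assumptions made for illustration.

```fortran
program rank0_to_all
  use mpi
  implicit none
  integer, parameter :: nmb  = 128   ! message size in MB, as printed in the logs
  integer, parameter :: nrep = 100   ! hypothetical repetition count (assumption)
  integer :: ierr, rank, nprocs, n, dest, irep, ack
  integer :: status(MPI_STATUS_SIZE)
  real(8), allocatable :: buf(:)
  real(8) :: t1, t2, timew, ave_spd

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  n = nmb*1024*1024/8                ! number of 8-byte elements in one message
  allocate(buf(n))
  buf = 0.0d0
  if (rank == 0) buf = 1.0d0
  ack = 0

  ! Rank 0 measures one destination rank at a time.
  do dest = 1, nprocs - 1
     call MPI_BARRIER(MPI_COMM_WORLD, ierr)
     if (rank == 0) then
        t1 = MPI_WTIME()
        do irep = 1, nrep
           call MPI_SEND(buf, n, MPI_DOUBLE_PRECISION, dest, 0, &
                         MPI_COMM_WORLD, ierr)
        end do
        ! The acknowledgement ensures every message has arrived
        ! before the clock is stopped.
        call MPI_RECV(ack, 1, MPI_INTEGER, dest, 1, &
                      MPI_COMM_WORLD, status, ierr)
        t2 = MPI_WTIME()
        timew   = t2 - t1
        ave_spd = dble(nmb)*dble(nrep)/timew      ! MB/s
        print '(a,i3,a,f8.4,a,f10.4,a)', ' RANK = ', dest, &
              ' TIME = ', timew, ' sec Ave Speed = ', ave_spd, ' MB/s'
     else if (rank == dest) then
        do irep = 1, nrep
           call MPI_RECV(buf, n, MPI_DOUBLE_PRECISION, 0, 0, &
                         MPI_COMM_WORLD, status, ierr)
        end do
        call MPI_SEND(ack, 1, MPI_INTEGER, 0, 1, MPI_COMM_WORLD, ierr)
     end if
  end do

  call MPI_FINALIZE(ierr)
end program rank0_to_all
```

Built with an MPI compiler wrapper (for example `mpif90`) and launched on 24 processes, this sketch prints one line per destination rank in the same layout as the logs. The values depend on the interconnect and process placement.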


13

F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig F-1 ^_ZugraveuacuteLFlat MPIP

(main)
      n = nmb*1024*1024/8          ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                    ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
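The listing in Fig. F-1 is an excerpt of the flat MPI copy test. As a minimal sketch, the same measurement can be written as a self-contained serial program. Here `system_clock` stands in for `MPI_WTIME`, the array size is reduced to 16 MB so the run finishes quickly, and the program name is hypothetical. These are assumptions for illustration and not part of the original test. The speed formula is the one in the figure: bytes copied per repetition, divided by the time per repetition, converted to MB/s.

```fortran
program copy_bandwidth
  implicit none
  integer, parameter :: nmb = 16     ! array size in MB (the report's input is 128)
  integer, parameter :: nnn = 100    ! repetition count, NNN in Fig. F-1
  integer :: n, i, ii, ierr1, ierr2
  integer(8) :: t1, t2, rate
  real(8), allocatable :: a(:), b(:)
  real(8) :: timew, ave_spd

  n = nmb*1024*1024/8                ! number of 8-byte elements
  allocate(a(n), stat=ierr1)
  allocate(b(n), stat=ierr2)
  if (ierr1 /= 0 .or. ierr2 /= 0) stop 'Array allocation failed'

  do i = 1, n
     a(i) = 1.0d0
     b(i) = 0.0d0
  end do

  call system_clock(t1, rate)        ! rate = clock ticks per second
  do ii = 1, nnn
     call atob(a, b, n)
  end do
  call system_clock(t2)

  timew   = dble(t2 - t1)/dble(rate)
  ! bytes per copy / seconds per copy, converted to MB/s
  ave_spd = (dble(n)*8.0d0)/(timew/dble(nnn))/1024.0d0/1024.0d0

  print '(a,f10.4,a)', ' TIME      = ', timew,   ' sec'
  print '(a,f12.4,a)', ' Ave Speed = ', ave_spd, ' MB/s'
  print *, 'check b(n) =', b(n)      ! use the result so the copy is not dead code

contains

  subroutine atob(a, b, n)
    integer, intent(in)  :: n
    real(8), intent(in)  :: a(n)
    real(8), intent(out) :: b(n)
    integer :: i
    do i = 1, n
       b(i) = a(i)
    end do
  end subroutine atob

end program copy_bandwidth
```

The reported speed counts only the bytes read from `a`. If the store to `b` is also counted as memory traffic, the sustained bandwidth is correspondingly higher.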


Fig F-2 ^_ZugraveuacuteLOpenMPP

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8          ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
         write(6,*) 'Array allocation failed'
         stop
      end if
!$OMP end parallel
      nsize = n                    ! size
c     NNN : Repetition number of times
      NNN = 100
!$OMP parallel private(i),shared(nsize)
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
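In the OpenMP listing of Fig. F-2 every thread owns a private pair of arrays (declared `threadprivate`) and times its own copy loop. The following minimal sketch reproduces that structure in a self-contained form. Instead of `threadprivate`, it places the arrays in a subroutine called from the parallel region, which also makes them private to each thread. That substitution, the reduced 16 MB array size, the barrier before timing, and the names `copy_bandwidth_omp` and `run_copy` are assumptions made for illustration.

```fortran
program copy_bandwidth_omp
  use omp_lib
  implicit none
  integer, parameter :: nmb = 16     ! per-thread array size in MB (report: 128)
  integer, parameter :: nnn = 100    ! repetition count, NNN in Fig. F-2
  integer :: n

  n = nmb*1024*1024/8                ! number of 8-byte elements per array

  !$omp parallel shared(n)
  call run_copy(n, nnn)              ! each thread copies its own arrays
  !$omp end parallel

contains

  subroutine run_copy(n, nrep)
    integer, intent(in) :: n, nrep
    real(8), allocatable :: a(:), b(:)   ! local, hence private to the calling thread
    real(8) :: ttime1, ttime2, timew, ave_spd
    integer :: i, ii, ierr1, ierr2

    allocate(a(n), stat=ierr1)
    allocate(b(n), stat=ierr2)
    if (ierr1 /= 0 .or. ierr2 /= 0) stop 'Array allocation failed'
    do i = 1, n
       a(i) = 1.0d0
       b(i) = 0.0d0
    end do

    !$omp barrier
    ttime1 = omp_get_wtime()
    do ii = 1, nrep
       call atob(a, b, n)
    end do
    ttime2 = omp_get_wtime()

    timew   = ttime2 - ttime1
    ave_spd = (dble(n)*8.0d0)/(timew/dble(nrep))/1024.0d0/1024.0d0

    !$omp critical
    print '(a,i3,a,f8.4,a,f12.4,a,f4.1)', ' thread ', omp_get_thread_num(), &
          ' TIME = ', timew, ' sec Ave Speed = ', ave_spd, &
          ' MB/s  b(n) = ', b(n)
    !$omp end critical
  end subroutine run_copy

  subroutine atob(a, b, n)
    integer, intent(in)  :: n
    real(8), intent(in)  :: a(n)
    real(8), intent(out) :: b(n)
    integer :: i
    do i = 1, n
       b(i) = a(i)
    end do
  end subroutine atob

end program copy_bandwidth_omp
```

Compiled with OpenMP enabled (for example `gfortran -fopenmp`), the thread count is chosen with `OMP_NUM_THREADS`. Running with 1, 2, 3, 4 and 8 threads gives the per-thread speeds whose sum can be compared with the single-thread result.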


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo (flat MPI and hybrid, 8 to 1024 B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo (flat MPI and hybrid, 8 to 1024 B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo (MPI and hybrid, 256 to 1024 KB)



JAEA-Testing 2011-005

Evaluation of Fundamental Performance of JAEA's New Supercomputer System

Kensaku SAKAMOTO, Futoshi SHIMIZU, Takuya TSURUOKA*2, Toshiyuki NEMOTO*2 and Naota ISHIKAWA*2

Center for Computational Science & e-Systems

Japan Atomic Energy Agency, Tokai-mura, Naka-gun, Ibaraki-ken

(Received August 30, 2011)

A new supercomputer system was deployed at JAEA in March 2010. The system is mainly composed of a large-scale Linux cluster system (PRIMERGY BX900, 200 Tflops) and a lead-off system for RIKEN's K computer (FX1, 12 Tflops), whose purposes are to deliver a high-performance parallel computing environment and an application code development environment for the K computer, respectively. In this report we present the results of an evaluation of the fundamental performance of the system. The HPCC benchmark suite was run, together with independent codes that measure floating-point performance, memory bandwidth and interconnect bandwidth. Many useful findings for users of BX900 and FX1 were obtained by analyzing these results.

Keywords: Supercomputer, Fundamental Performance, Benchmark

*2 RIST (Research Organization for Information Science and Technology)


Contents

1 Introduction
2 Outline of the system
  2.1 Hardware construction of BX900
  2.2 Hardware construction of FX1
  2.3 Hardware construction of Storage
3 Performance evaluation by HPCC benchmarks
  3.1 Outline of HPCC benchmarks
    3.1.1 Test programs
    3.1.2 Rules
  3.2 Measured results
    3.2.1 BX900
    3.2.2 FX1
  3.3 Summary
4 Tunings for BX900 and FX1
  4.1 Profiler
  4.2 Automatic tuning of memory access
    4.2.1 Efficacy for Original version of STREAM benchmark
    4.2.2 Efficacy for HPCC version of STREAM benchmark
  4.3 Manual tuning of memory access
  4.4 Efficacy for NPB 2.3 FT benchmark
  4.5 Summary
5 Performance evaluation by IOR benchmarks
  5.1 Outline of IOR benchmarks
    5.1.1 Key parameters
    5.1.2 Option parameters for MPIIO
  5.2 Measured results
    5.2.1 POSIX
    5.2.2 MPIIO
    5.2.3 Local cache
  5.3 Summary
    5.3.1 Choice of file system
    5.3.2 POSIX vs. MPIIO
    5.3.3 Buffer size
6 Performance evaluation of the interconnect
  6.1 Specification of the interconnect
    6.1.1 Construction and Nodebind
    6.1.2 Communication method
  6.2 Fundamental performance of MPI communication
    6.2.1 Measuring method
    6.2.2 Measured results
  6.3 RDMA
    6.3.1 Measuring method
    6.3.2 Measured results
  6.4 Communications between intra-CPUs, inter-CPUs and inter-nodes
    6.4.1 Measuring method
    6.4.2 Measured results
  6.5 Communications between chassis
    6.5.1 Point-to-point communication
    6.5.2 Collective communication
  6.6 Summary
7 Concluding Remarks
Acknowledgements
References
Appendix A HPCC benchmark results on FX1
Appendix B Fundamental performance of MPI communication
Appendix C RDMA vs. NORDMA
Appendix D Overlapping communications with calculations
Appendix E Peer-to-peer communications
Appendix F Memory bandwidth
Appendix G Flat-MPI vs. Hybrid

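1 Introduction

In March 2010, JAEA replaced its previous supercomputer system (theoretical peak performance 15.3 Tflops) with a new system composed mainly of a large-scale Linux cluster (PRIMERGY BX900) with a theoretical peak performance of 200 Tflops and a lead-off system for the next-generation supercomputer (FX1) with a theoretical peak performance of 12 Tflops. BX900 provides a high-performance parallel computing environment, and FX1 provides an environment for developing application codes for the next-generation supercomputer (the K computer). This report presents the fundamental performance of PRIMERGY BX900 and FX1, measured with benchmark programs, as a reference for users of the system.

Section 2 outlines the system. Section 3 evaluates it with the HPCC benchmarks, which are widely used for supercomputers. Sections 4 to 6 deal with tuning of memory access, file I/O and the interconnect. Section 7 gives concluding remarks.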

2 Outline of the system

The system comprises several subsystems, among them the large-scale parallel computing unit and the next-generation code development unit. Its total theoretical peak performance is 214 Tflops, its total memory capacity is 56 TB, and its shared storage capacity is 1.2 PB. Fig. 1 shows an outline of the system.

2.1 Hardware construction of BX900

The large-scale parallel computing unit is a PRIMERGY BX900 (hereafter BX900), a large-scale Linux cluster with a theoretical peak performance of 200 Tflops and a total memory capacity of 50 TB.

A BX900 chassis is 10U high and houses up to 18 server blades and up to 8 connection blades. Each server blade (node) carries two Xeon X5570 processors (quad-core) and 24 GB of memory (DDR3 SDRAM; some nodes carry 12 GB or 48 GB). The theoretical peak performance per node is 93.76 Gflops (2.93 GHz × 4 floating-point operations per clock × 4 cores × 2 processors), and the memory bandwidth is 51.2 GB/s. The nodes are connected by InfiniBand (IB) 4x QDR (hereafter QDR) with a bandwidth of 8 GB/s, which allows fast, low-latency data communication.
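The per-node figures quoted for BX900 follow from simple arithmetic. The sketch below reproduces them under two assumptions: that each core completes 4 double-precision floating-point operations per clock (which the product 2.93 GHz × 4 × 4 × 2 in the text implies), and that each processor drives three DDR3-1066 memory channels. The channel count is not stated in the text and is chosen only because it reproduces the quoted 51.2 GB/s. The function names are illustrative.

```python
def node_peak_gflops(clock_ghz, flops_per_clock, cores_per_cpu, cpus_per_node):
    """Theoretical peak performance of one node in Gflops."""
    return clock_ghz * flops_per_clock * cores_per_cpu * cpus_per_node


def node_memory_bandwidth_gbs(mega_transfers_per_s, bytes_per_transfer,
                              channels_per_cpu, cpus_per_node):
    """Theoretical memory bandwidth of one node in GB/s."""
    bytes_per_s = (mega_transfers_per_s * 1e6 * bytes_per_transfer
                   * channels_per_cpu * cpus_per_node)
    return bytes_per_s / 1e9


# Values from the text: 2.93 GHz, 4 flops per clock, 4 cores, 2 processors.
bx900_peak = node_peak_gflops(2.93, 4, 4, 2)

# Assumption: DDR3-1066 performs 3200/3 (about 1066.7) mega-transfers per
# second of 8 bytes each, with 3 channels per processor.
bx900_bandwidth = node_memory_bandwidth_gbs(3200 / 3, 8, 3, 2)

print(f"peak per node: {bx900_peak:.2f} Gflops")
print(f"memory bandwidth per node: {bx900_bandwidth:.1f} GB/s")
```

Dividing the node peak by the 8 cores gives 11.72 Gflops per core, the reference point for per-core results such as EP-DGEMM.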

Fig. 1 Outline of the system

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

2.2 Hardware construction of FX1

The next-generation code development unit is an FX1 (hereafter FX1) system with a theoretical peak performance of 12 Tflops and a total memory capacity of 4.6 TB.

Each node carries one SPARC64 VII processor (quad-core) and 16 GB of memory. A system controller (JSC) between the CPU and the memory provides high memory bandwidth. The nodes are connected by a full bisectional bandwidth (FBB) network of InfiniBand 4x DDR (2 GB/s).

FX1 is used to develop application codes for the K computer, which is scheduled to start operation in 2012.

2.3 Hardware construction of Storage

The storage consists of two I/O nodes (SPARC Enterprise M9000), 36 ETERNUS DX80 disk arrays and one ETERNUS4000 model 400 disk array.

The total disk capacity is 1.2 PB (RAID6), and the peak I/O performance of the system is 57.6 GB/s (Fig. 2). The I/O nodes are connected to the compute nodes through InfiniBand, and the file system is shared by Parallelnavi SRFS. Each disk array holds 46 SAS disks (1 TB, 7200 rpm) configured as RAID6. The local file system of the I/O nodes is Sun QFS (Fig. 3).
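The aggregate I/O figure can be cross-checked against the Fibre Channel (FC) link counts that survive in the labels of Fig. 3. The sketch below assumes those labels read 57.6 GB/s over 144 FC links for the 36 ETERNUS DX80 units and 3.2 GB/s over 8 FC links for the ETERNUS4000 (the decimal points are a reading of the figure labels, not a stated specification). The per-link rate is derived from these numbers rather than taken from a data sheet.

```python
import math


def per_link_gbs(aggregate_gbs, n_links):
    """Average bandwidth carried by one link, in GB/s."""
    return aggregate_gbs / n_links


# Figures as read from Fig. 3 (assumed decimal placement).
dx80_link = per_link_gbs(57.6, 144)   # 36 ETERNUS DX80 units, 144 FC links
e4k_link = per_link_gbs(3.2, 8)       # ETERNUS4000 model 400, 8 FC links

# 144 links spread over 36 disk arrays.
links_per_dx80 = 144 // 36

print(f"DX80 per-link rate: {dx80_link:.2f} GB/s")
print(f"ETERNUS4000 per-link rate: {e4k_link:.2f} GB/s")
print(f"FC links per DX80: {links_per_dx80}")
print("consistent:", math.isclose(dx80_link, e4k_link))
```

Both groups of links work out to the same per-link rate, which suggests the 57.6 GB/s peak is simply the sum of the FC link rates on the disk side.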

Fig. 2 Connection configuration and transfer rates of the storage (large-scale parallel computing unit, 119 chassis; InfiniBand QDR switches; I/O nodes, M9000 × 2; storage, ETERNUS DX80 × 36, 57.6 GB/s on the disk side)

Fig. 3 Connection between the I/O nodes (M9000, QFS) and the disk arrays: ETERNUS DX80 (36 units of 1 TB disks, 144 FC links, 57.6 GB/s) and ETERNUS4000 model 400 (1 unit, 8 FC links, 3.2 GB/s)

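3 Performance evaluation by HPCC benchmarks

The seven test programs of the HPCC (High Performance Computing Challenge) 1) benchmark were run on BX900 and FX1. This section outlines HPCC and then presents and compares the results.

3.1 Outline of HPCC benchmarks

The HPCC benchmark is a set of benchmark programs for supercomputers developed by J. Dongarra of the University of Tennessee and others. It characterizes a system by several kinds of performance rather than by a single number.

Computational performance is measured by three programs (HPL (High-Performance Linpack), DGEMM and FFTE), memory access performance by two (STREAM and Random Access), and communication performance by two (PTRANS and b_eff) 2).

The original versions of these programs are written in Fortran or C, whereas in HPCC they are written in C and combined into a single program. HPCC has been in use since 2003 and about 300 results have been registered, so results can be compared with those of many other systems.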

3.1.1 Test programs

A supercomputer system is built from cores, CPUs and nodes joined by an interconnect, and performance can be measured at each of these levels. HPCC therefore measures in three modes: global (G: Global system performance), in which all processes cooperate across nodes; single (SN: Single Environment), in which a single process runs; and embarrassingly parallel (EP: Embarrassingly Parallel), in which every process runs the same computation independently.

3.1.1.1 HPL

HPL measures floating-point performance by solving a dense system of linear equations by LU decomposition. It is the MPI-parallel version of Linpack and is the program used for the TOP500 ranking of supercomputers. The unit is Tflops. It is measured in one mode, global (G).
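To make the HPL measurement concrete, the following sketch solves a small dense system by Gaussian elimination with partial pivoting (the serial idea behind LU-based solvers) and converts a problem size and run time into a performance figure. The operation count 2/3 n³ + 3/2 n² is the one conventionally used when reporting HPL results. The function names are illustrative, and the pure-Python solver is far too slow to serve as a benchmark.

```python
def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    a = [row[:] for row in a]  # work on copies, leave the inputs untouched
    b = b[:]
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from the rows below the pivot.
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def hpl_flop_count(n):
    """Operation count conventionally credited to an HPL run of order n."""
    return (2.0 / 3.0) * n ** 3 + 1.5 * n ** 2


def hpl_tflops(n, seconds):
    """Performance in Tflops for a run of order n that took `seconds`."""
    return hpl_flop_count(n) / seconds / 1e12


print(lu_solve([[2, 1], [1, 3]], [3, 5]))
```

Because the count grows as n³ while the data grows as n², large problem sizes keep the processors busy relative to communication, which is why HPL usually reaches a high fraction of the theoretical peak.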

3.1.1.2 DGEMM

DGEMM measures the floating-point performance of double-precision matrix-matrix multiplication within a node. It is the kernel of Linpack, and on most supercomputers a version tuned for the system is provided as a library. Because it involves no communication between nodes, it evaluates the computing performance of the node itself. The unit is Gflops (Giga floating-point operations per second). It is measured in two modes, single (SN) and embarrassingly parallel (EP).

3.1.1.3 STREAM

STREAM measures the sustainable bandwidth between the memory and the CPU within a node. It consists of four operations, Copy, Scale, Add and Triad, and evaluates the memory access performance of a node. The unit is GB/s (Giga Bytes per second). It yields eight values: the four operations in each of the two modes, single (SN) and embarrassingly parallel (EP).

3.1.1.4 PTRANS

PTRANS (Parallel matrix TRANSpose) measures the communication performance of the system by transposing a matrix distributed over the processes. The unit is GB/s. It is measured in one mode, global (G).

3.1.1.5 Random Access

Random Access measures the rate at which randomly chosen locations of a large table in memory are updated. Each update acts on 8 bytes of data. It evaluates the memory update performance within a node and, through MPI, across nodes. The unit is Gups (Giga updates per second). It is measured in three modes, single (SN), embarrassingly parallel (EP) and global (G).

3.1.1.6 FFTE

FFTE measures the floating-point performance of the fast Fourier transform. Like HPL, it is widely used as a kernel of application programs. It is measured in the single (SN), embarrassingly parallel (EP) and global (G) modes. The unit is Gflops.

3.1.1.7 b_eff

b_eff measures the communication latency and the communication bandwidth between processes. It is measured in the global (G) mode and consists of two tests, Ring and PingPong. Latency is measured by sending 8 B of data and bandwidth by sending 2 MB of data.

In Ring, the processes form a ring and each communicates with its neighbours. Two orderings are used: the natural order of the MPI ranks and a random order.

In PingPong, a pair of cores exchange messages while the other cores are idle, and the measurement is repeated over pairs of cores.

b_eff reports six results: (1) Ring (natural order) latency, (2) Ring (natural order) bandwidth, (3) Ring (random order) latency, (4) Ring (random order) bandwidth, (5) PingPong latency and (6) PingPong bandwidth. Latency is given in microseconds and bandwidth in GB/s.

3.1.2 Rules

HPCC defines two kinds of runs, baseline runs (Baseline Runs) and optimized runs (Optimized Runs). In this report, baseline runs were performed, with tuning by the compiler only.

3.1.2.1 Baseline runs

In baseline runs, tuning by the compiler and the use of libraries provided for the system are allowed. For example, tuned BLAS and MPI libraries may be used, but the source code may not be modified. The versions of the compiler and libraries used and the compile options used must be reported.

3.1.2.2 Optimized runs

In optimized runs, tuning by modifying the source code is permitted.
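The four STREAM operations described in Section 3.1.1.3 can be sketched as follows. This is a minimal illustration of the kernels and of how bytes moved are converted to GB/s, not the benchmark itself: Python lists are orders of magnitude slower than the compiled STREAM code, so the absolute rates are meaningless. The array length, the scalar and the function name are arbitrary choices for illustration.

```python
import time


def stream_kernels(n=100_000, scalar=3.0):
    """Run Copy, Scale, Add and Triad once each; return rates and arrays."""
    a = [1.0] * n
    b = [2.0] * n
    c = [0.0] * n
    word = 8  # bytes per double-precision element
    rates = {}

    def timed(name, arrays_touched, kernel):
        start = time.perf_counter()
        kernel()
        elapsed = max(time.perf_counter() - start, 1e-12)
        # Bytes moved = arrays read or written x elements x bytes per element.
        rates[name] = arrays_touched * n * word / elapsed / 1e9  # GB/s

    def copy():
        c[:] = a

    def scale():
        b[:] = [scalar * x for x in c]

    def add():
        c[:] = [x + y for x, y in zip(a, b)]

    def triad():
        a[:] = [y + scalar * z for y, z in zip(b, c)]

    timed("Copy", 2, copy)    # c = a
    timed("Scale", 2, scale)  # b = scalar * c
    timed("Add", 3, add)      # c = a + b
    timed("Triad", 3, triad)  # a = b + scalar * c
    return rates, (a, b, c)


if __name__ == "__main__":
    rates, _ = stream_kernels()
    for name, gbs in rates.items():
        print(f"{name}: {gbs:.3f} GB/s")
```

Copy and Scale touch two arrays per element while Add and Triad touch three, which is why the byte counts differ between the kernels.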

3.2 Measured results

The HPCC BMT programs were run on BX900 and FX1 with 8 to 512 parallel processes.

3.2.1 BX900

Table 1 shows the HPCC BMT results on BX900. Fig. 4 to Fig. 7 plot the HPL, PTRANS, Random Access and FFTE results against the number of processes. With 512 processes, the HPL efficiency relative to the theoretical peak performance was 82.2% (= 4.902/5.96 × 100) *1.
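The efficiency figure is the ratio of measured to theoretical peak performance. A minimal sketch, assuming the 512-process values read as 4.902 Tflops measured against 5.96 Tflops peak, and the whole-system values in the footnote read as 191.4 Tflops against 200.0 Tflops (the function name is illustrative):

```python
def efficiency_percent(measured_tflops, peak_tflops):
    """Measured performance as a percentage of the theoretical peak."""
    return measured_tflops / peak_tflops * 100.0


# HPL on 512 processes of BX900.
bx900_512 = efficiency_percent(4.902, 5.96)

# Whole BX900 system. The report quotes 95.66%, presumably computed from
# unrounded values; the rounded inputs used here give a slightly different
# last digit.
bx900_full = efficiency_percent(191.4, 200.0)

print(f"512 processes: {bx900_512:.1f}%")
print(f"whole system:  {bx900_full:.1f}%")
```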

For comparison with Table 1, Table 2 shows the results of the Intel Endeavor cluster (hereafter Endeavor), whose hardware configuration is similar, taken from the HPCC web site 1). Table 3 shows the results obtained on BX900 with the Intel compiler. A clear difference appears in STREAM (3.5 GB/s versus 4.2 GB/s); it is discussed in Section 4.2.2.

*1 On the whole BX900 system with 17,072 cores, 191.4 Tflops was measured against a theoretical peak performance of 200.0 Tflops, an efficiency of 95.66%.

Table 1 HPCC BMT results on JAEA's BX900 with the Fujitsu compiler

Columns: number of processes (8, 16, 32, 64, 128, 256, 512); communication method (rdma, nordma); G-HPL [TFlops]; G-PTRANS [GB/s]; G-RandomAccess, original version and SANDIA_OPT2 version [Gups]; G-FFTE [GFlops]; EP-STREAM Sys (*) [GB/s]; EP-STREAM Triad [GB/s]; EP-DGEMM [GFlops]; RandomRing Bandwidth [GB/s]; RandomRing Latency [usec]

Conditions: Fujitsu BX900 (512 cores, 128 CPUs); Chip: Xeon X5570 2.93 GHz, DDR3-1066, SMT ON, Turbo Mode OFF; Interconnect: QDR InfiniBand, 2 sockets/node, Fat Tree; OS: Red Hat EL 5; Compiler: Fujitsu C/C++, options -Kfast -SSL2; MPI: Fujitsu Parallelnavi Language Package; HPCC Ver. 1.3.1

(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × (number of processes).

Fig. 4 BX900 HPL

Fig. 5 BX900 PTRANS

Fig. 6 BX900 Random Access

Fig. 7 BX900 FFTE

Table 2 Intel Endeavor cluster (HPCC Web 1))

Table 3 Intel BX900 HPCC BMT


322 FX1 FX1 HPCC BMT Table 4 HPL

PTRANS Random Access FFTE Fig 8 Fig 11512

HPL755(=377650100)2

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037

Table 4 FX1 HPCC BMT

Fig. 8 FX1 HPL

Fig. 9 FX1 PTRANS

Fig. 10 FX1 Random Access

Fig. 11 FX1 FFTE

33 HPCCYUgraveWH Uklq BX900W FX1WrsndWK(Altix3700Bx2)Wrs

Fig 12GqmhgtTWXpoundDqshy 100WCLRandom Ring LatencyPqRandom Ring Latency13BX900ordf`ccedil^fCmWordfcurrencq

BX900 W FX1 rsGWSTREAM W FFTE necircOatildeC atildeC eacuteIumlrvCDordfmacrdnDqSTREAM

421FFTE 43aringNGq XAltix3700Bx2 Random AccessordfEordfAltix3700Bx2W

plusmngtegraveBCD HPCC BMTeacuteYacute~cdOcirc$Y|5ordf+Xpoundsect8YrsGmW13XqAltix3700Bx2 atildeC FFTEEordfmdoacute OslashszligWuumlwdn

q

Fig. 12 HPCC BMT comparison of BX900, FX1, and Altix3700Bx2 at 512 processes (Flat MPI), plotted on a 0 to 100 scale. [Chart not recoverable from the extracted text. Axis labels as printed, with decimal points lost in extraction: G-HPL (490 Tflops), G-RandomAccess (154 Gups), G-PTRANS (1120 GB/s), EP-STREAM (436 GB/s), EP-DGEMM (111 Gflops), G-FFTE (2755 Gflops), RandomRing Bandwidth (0359 GB/s), RandomRing Latency (518 usec).]


4 BX900W FX1TUacuteUcirc 32 c^_Zgt BX900 W FX1 IcircIumlETHNtildeoumlccedilYWiCXordfWnDmWQR^_aZegraveBC(Xcurrenyen

5poundDqeumldfeaZUuml^13~brvbarTUacuteUcirc]Ucirc7^_ZTUacuteUcirc

EacuteDq 41 ^_aZ

BX900 W FX1 gtegraveBCD^_aZWQRUgraveyenXgtlaquosectoacute ^_ZOslash5^_a|Ocirc$divideC^_ZOslash5cedildivideOcirc$divideC

^_Z=)yen5PmWordf13q ^_aZyBGUgrave^_Z_IumlYacuteh)JGDHfe

a|YUuml^13~UKtl_trtGq[ _ZOslash5_IumlYacute

h fpcoll fIumlgt=GqTcedildivideCDeacuteaTHORN|Ocirc$ fprof fIumlgtdivideCaring`Gq^_aZgtPAElig 8q|gtlaquoq

^_a|Ocirc$divideAElig AElig ^_ AElig MPIZaZ|ordm)AElig f`AElig IcircIumlETHNtilde$AElig fZAElig UcircfIumlAElig

mPIcircIumlETHNtilde$AEliggtCPUlt=gtOaringaeligccedil13

hccedil`|WOcirc$FGszligbordfoacute WXq

42 feaZbrvbar|YTUacuteUcirc BX900HPCC BMT^_Z STREAMordfotildeb

gtlaquo EndeavorWrsC(35GBsacute42GBs)W 2ethMEshygtlaquopoundDmWETHNtilde5poundDqETHNtildeBDUcirc HPCCpSTREAMUuml|YacuteTHORN(CUcirc amp FortranUcirc)7ccedilIumlEumlIgraveUcircWMPIEumlIgraveUcircgtlaquoq[13BGbrvbarPmdegn BX900 feaZUuml^13~UKntst brvbarsectCDmWcdEndeavorfeaZ(IntelUgrave)W BX900feaZ(QRUgrave)igraveiacuteWuumlwdnq


421 Uuml|YacuteTHORN STREAM^_Zbrvbarszligagraveaacute BX900Uuml|YacuteTHORN STREAM^_Z Table 5Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntst brvbarordfmacrdnDqDotildeCUuml^13~gtOcirc$|sectumlGaringaeligccedil13hyBCXbrvbarPGDHoacute WX DO^SgtOcirc$sup1yBordfXmWordfdcXoumlEacuteegraveBGuecircgtlaquosectPgtXoumlordfUcirc

GordflaquoqUgraveDfeaZordfoacute WGDO ^SgtsectumlOcirc$ordf^^shypoundZYeumlnoumlEacutegtlaquoq CUcircoumlIntelfeaZXd OpenMPEumlIgraveUgraveUgravegt 42GBscoreordfdegdnDordfQRfeaZgt7ccedilIumlEumlIgraveUcirc OpenMP EumlIgravecd=EumlIgrave^regCbrvbarP]C Intel feaZWbWXpoundDqQRfeaZgtlaquo CD7ccedilIumlEumlIgraveUcircAgraveogravebrvbarpoundOcirc$copytimes]AgravebTUacuteUcircAgraveograveordflaquoWnq [ FX1Uuml|YacuteTHORN STREAM^_Z Table 6Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntstbrvbarordfmacrdnDqXTriad (132GBs)CPUQRordfWebgtUumlCshy(135GBs)CPUWotildeEacutegtlaquoq CUcircoumlordf FortranUcircUKntstUuml^13~XCWotildeiCqFX1 CfeaZ BX900W+XsectUKntstUuml^13~ordfcentsup3CXqDHFX1 CfeaZbrvbarTUacuteUcircordf$currengtlaquopoundD9ordfAumlq

Table 5 BX900 STREAM

Table 6 FX1 STREAM

422 HPCC BMT^_Z STREAMbrvbarszligagraveaacute 321gtdegdnD BX900 HPCC BMT^_Z STREAM

Table 7GqmmgtUKntstcopyIgraveOcirc$ecircEacutearingaeligccedil13heumlX storeampegraveBGmWUKrestp=all(a$XsectordfXmW)WCDTUacuteUcirc5PfeaZUuml^13~gtlaquoq UgraveDyacutegtgtlaquoordf _ZOslash53iumlIuml XC

iumlIumlSgt|Yo CPU]o CoreiDHEumlIgrave] Agravejatildedn~nfeaZUuml^13~brvbar^UcircCXq

STREAMbrvbarPX CoreIumliX2gtZX|Yordf5n DO^gtfeaZUuml^13~UKntst GmWgt|Y+13mWordfcurrencpoundDq

Table 7 HPCC BMT STREAM (MPI parallel version, C version)

                            Compile options (1) (*1)          Compile options (2) (*2)
Parallel   Communication    [label lost]  EP-STREAM Triad     [label lost]  EP-STREAM Triad
count      mode                           (GB/s)                            (GB/s)
  32       rdma             1             3.4739              -             -
  32       nordma           1             3.5016              1             4.3316
 128       rdma             3             3.4708              -             -
 128       nordma           3             3.4692              1             4.3572
 512       rdma             5             3.4726              4             4.3595
 512       nordma           4             3.5477              4             4.3424

(*1) Fujitsu compiler options (1): -Kfast,packed,ntst -SSL2
(*2) Fujitsu compiler options (2): -Kfast,packed,ntst,restp=all -SSL2
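To make the effect of the two option sets in Table 7 easier to read, the ratio of the Triad bandwidths can be computed directly. The sketch below uses the nordma rows of Table 7; the decimal points are placed on the assumption that the values are a few GB/s per process, and the names `rows` and `gain_percent` are illustrative only.

```python
# EP-STREAM Triad (GB/s) for the nordma rows of Table 7:
# parallel count -> (compile options (1), compile options (2)).
# Decimal placement is an assumption (the extracted text lost the points).
rows = {
    32: (3.5016, 4.3316),
    128: (3.4692, 4.3572),
    512: (3.5477, 4.3424),
}


def gain_percent(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after / before - 1.0) * 100.0


for n_procs, (opt1, opt2) in rows.items():
    print(f"{n_procs:4d} processes: {opt1:.4f} -> {opt2:.4f} GB/s "
          f"({gain_percent(opt1, opt2):+.1f}%)")
```

Under that reading, the second option set is faster by roughly a fifth to a quarter at every parallel count, which suggests the difference comes from the generated code rather than from the interconnect mode.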


43 hhbrvbar|YTUacuteUcirc 32 gtdcWXpoundDbrvbarPBX900 W FX1 FFTE ^_Z

acircIcircIumlETHNtilde13yacuteordfmacrdnDqmETHNtildeDH5pound

DhhW13gtszlig0Gq 13gt FFTE 32EumlIgraveOslash5CDWecircoacuteG ethBX900

gt 12FX1 gt 16Wgteumlqpoundordm)1|YEcirc2OgtlaquopoundD9ordfAumlqmgtFFTEUuml|YacuteTHORN(FortranUcirc)l3CUuml|YacuteTHORN(13aumlWoacuterGoumlUcircWdivideoslash)WaumlIacutecurren4C)JCDaumlacircoacuteGaumlZh5poundDiacuteegraveBC

hbrvbarpound^UcircCDordm)GmWbrvbarsectBX900 W FX1 oacuteG5g5poundDqUcircpaumlbrvbar5poundDh 1QRfeaZW Endeavor egraveBeumlnDafeaZWrs5PDHW 2auml13IacutecurrenTUacuteUcircbrvbar6)BbordfaumlIacutecurrenthorn7wXmWagraveaacuteGDHgtlaquoq UcircaumlIacutecurren 38plusmnUcircWmndaumlZhCDiacute

Ucircn~n Fig 13W Fig 14GWoIacutecurreniacuteEcircu0Gq Fig 13W Fig 14OgravegtCDIacutecurrenOO=5pound^gtlaquoqUuml|YacuteTHORN

gtcopyIgrave APXYZWWWWOcirc$Yordf 2ugt5nsectZgtXcpoundDDHiacutegt)9BcopyIgrave CY_W^curreneth5poundcopyIgrave APXYZWWWWYordfZXbrvbarPCDq)9BcopyIgrave CY_YZCXcpoundD=Ocirc$sectumlG^gtYordfZXpoundDoumlUcircordfecircDHgtlaquoq

Fig 13W Fig 14OacutegtCDIacutecurren3[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 3ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 3gtXC 6Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquosectaringaeligccedil13h-

a5brvbarpoundltG_ccedilY-a5^_ZUgrave)acircregmacr= 16 gteumlnq

Fig 13W Fig 14OcircgtCDIacutecurren2[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 2ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 2gtXC 4Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquoqFX1 gtBX900rufntildeDsectaringaeligccedilaringh-a5ordfgteumlmWcdmO=)t_ccedilY-a5 16cd 4^regGmWgtzyacuteordfmacrdnDq

Table 8BX900W FX1QRfeaZegraveBCDUcircWaumlbrvbarEuml BX900 afeaZegraveBCDaumlbrvbarGq

Fig. 13 FFTE source listing, original version [listing not recoverable from the extracted text]

Fig. 14 FFTE source listing, modified version [listing not recoverable from the extracted text]

Table 8 FFTE Ver. 4.1, Full and kernel (32 parallel, 180MB/Proc) [table body not recoverable from the extracted text]

UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq

Fig. 15 NPB2.3 FT benchmark source listing (original, cfftz) [listing not recoverable from the extracted text]

Fig. 16 NPB2.3 FT benchmark source listing (original, fftz2) [listing not recoverable from the extracted text]

UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc

Table 9 NAS PARALLEL Benchmarks Ver. 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver. 2.3 FT

5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark 3) is an I/O benchmark code written in C that was used in the ASCI Purple project (version 2.10.2 is used here). It can perform parallel I/O through POSIX (write(), read()), MPIIO (MPI_File_write_at(), MPI_File_read_at()), and other interfaces. 511 [heading text not recoverable]

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

The number of read/write calls is (blockSize ÷ transferSize × segmentCount), and the file size is (blockSize × segmentCount × number of MPI processes). For example, with transferSize=1kB, blockSize=1MiB (here 10^6 Bytes is written MB and 2^20 Bytes is written MiB), segmentCount=1, and 32 MPI processes, a testFile of blockSize × segmentCount × number of processes = 32MiB is created, and each process reads or writes its 1MiB region in 1024 transfers of 1KB each.

[transferSize] Number of bytes transferred by a single read/write call. It must be a multiple of sizeof(long long int) and a divisor of blockSize.

[blockSize] Number of bytes accessed by each process. Default 1MiB.

[segmentCount] Number of segments. Default 1.

[randomOffset] Random access when set to 1. Default 0.

[fsync] Perform fsync after writing. Default 0.

[filePerProc] Use a separate file for each process. Effective with POSIX. Default 0.

[useFileView] Use MPI_File_set_view() with MPI_File_write() and MPI_File_read(). Cannot be combined with random access. Default 0.

[collective] Use MPI_File_write_at_all() and MPI_File_read_at_all(). Cannot be combined with random access. Default 0.
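The relation between the IOR parameters and the resulting I/O volume can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name `ior_sizes` is hypothetical, and the "1kB" of the worked example in the text is taken to mean 1024 bytes, since that is what yields 1024 transfers per 1 MiB block.

```python
KIB = 1024
MIB = 1024 ** 2


def ior_sizes(transfer_size: int, block_size: int,
              segment_count: int, n_procs: int) -> tuple[int, int]:
    """Return (transfers per process, total file size in bytes).

    transfers per process = blockSize / transferSize * segmentCount
    file size             = blockSize * segmentCount * number of MPI processes
    """
    if transfer_size <= 0 or block_size % transfer_size != 0:
        raise ValueError("transferSize must be a positive divisor of blockSize")
    transfers = (block_size // transfer_size) * segment_count
    file_size = block_size * segment_count * n_procs
    return transfers, file_size


# Worked example from the text: 1 KiB transfers, 1 MiB blocks, 1 segment, 32 processes.
transfers, file_size = ior_sizes(1 * KIB, 1 * MIB, 1, 32)
print(f"{transfers} transfers per process, {file_size // MIB} MiB file")
```

Sweeping `transfer_size` over powers of two from 1024 to 1048576 while holding `block_size` fixed changes only the number of calls, not the file size, so a sweep of that kind isolates the per-call overhead.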

Fig 18 IOR YacircueZ$


512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

Table 11 MPIIO Info object keys and values

key                   value
cb_buffer_size        4194304
cb_nodes              [not recoverable]
ind_rd_buffer_size    4194304
ind_wr_buffer_size    524288

[The description column of this table is not recoverable from the extracted text.]

52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

Fig. 19 Read/Write [diagram not recoverable from the extracted text; surviving labels: ETERNUS DX80 (970TB), IO, SPARC Enterprise M9000, SRFS, Read, Write]

Table 12 Measurement results (POSIX)
[Table body not recoverable: digits were lost in extraction. Rows: transferSize from 1024 to 1048576 bytes in powers of two, for 32, 256, and 1024 parallel processes. Columns: Write and Read for each measurement condition.]

Table 13 Measurement results (MPIIO)
[Table body not recoverable: digits were lost in extraction. Rows: transferSize from 1024 to 1048576 bytes in powers of two, for 32 and 256 parallel processes. Columns: Write and Read for each measurement condition.]

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

Fig. 20 POSIX [plot not recoverable from the extracted text. x-axis: TransferSize [B], 1024 to 1048576; y-axis: Performance [MiB/s], ticks 1 to 16384; series: sequential and random, SingleFile and FilePerProc, Read and Write, N = 32, 256, 1024; panel labels: home, work.]

Fig. 21 MPIIO [plot not recoverable from the extracted text. x-axis: TransferSize [B], 1024 to 1048576; y-axis: Performance [MiB/s], ticks 0.25 to 4096; series: sequential and random, use view, collective, N = 32, 256; panel labels: home, work.]

Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`

JAEA-Testing 2011-005

[Fig. 22: storage and file system configuration diagram. Recoverable labels: ETERNUS DX80 units 1-18 (dx01 to dx18; DX8001 ... DX8018) under IO 1 (io1), each labeled with the SRFS file systems data_01, data_02, home, work and data_03, data_04, data_05, work; ETERNUS DX80 units 19-36 (dx19 to dx36; DX8019 ... DX8036) under IO 2 (io2), each labeled with data_06, data_07, data (or sysdata), work and data_08, data_09, data_10, work.]


Table 15 File systems and capacities

SRFS file system   QFS file system   Unit size     Total capacity              DAU     Remarks
data_01            QFS_data_01       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_02            QFS_data_02       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_03            QFS_data_03       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_04            QFS_data_04       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_05            QFS_data_05       3695488 MB    3695488 × 36 = 126.9 TB     1024k
home               QFS_home          3695488 MB    3695488 × 36 = 126.9 TB     512k
work                                 524288 MB                                         [remark illegible; refers to io2]

SRFS file system   QFS file system   Unit size     Total capacity              DAU     Remarks
data_06            QFS_data_06       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_07            QFS_data_07       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_08            QFS_data_08       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_09            QFS_data_09       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_10            QFS_data_10       3695488 MB    3695488 × 36 = 126.9 TB     1024k
data               QFS_data          3695488 MB    3695488 × 24 = 84.6 TB      1024k
sysdata            QFS_sysdata       3695488 MB    3695488 × 12 = 42.3 TB      16k     [illegible]
work               QFS_work          524288 MB     524288 × 144 = 72 TB        1024k   dx01 to dx36
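The totals in Table 15 can be checked by multiplying the unit size by the number of units and converting MB to TB. The sketch below assumes the binary convention 1 TB = 1024 × 1024 MB, which is an assumption on our part; it reproduces the table's totals once the decimal points lost in extraction are restored (for example 72.4 TB rather than "724TB"). The function name and row grouping are illustrative only.

```python
# Assumed conversion: 1 TB = 1024 * 1024 MB (binary units).
MB_PER_TB = 1024 * 1024


def total_capacity_tb(unit_mb: int, units: int) -> float:
    """Total capacity in TB of `units` volumes of `unit_mb` MB each."""
    return unit_mb * units / MB_PER_TB


# (file systems, unit size in MB, number of units) as listed in Table 15.
ROWS = [
    ("data_01..04, data_06..09", 2109888, 36),
    ("data_05, data_10, home", 3695488, 36),
    ("data", 3695488, 24),
    ("sysdata", 3695488, 12),
    ("work", 524288, 144),
]

if __name__ == "__main__":
    for name, unit_mb, units in ROWS:
        print(f"{name}: {total_capacity_tb(unit_mb, units):.1f} TB")
```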


6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 InfiniBand QDRegraveBCD Fat Treeicircgtlaquoq113aelig13S 18iumlIumlAEligicircgtlaquoordf113aelig13 LeafWdividenaccedil7Iumlordf 2Igravecenttimeseumln1accedil7IumlcdoiumlIuml QDR ordf 1 7gtYZeumlnqaccedil7Iuml 2 Igravelaquogt1 iumlIuml 2 QDR ordfYZeumln(QDR Iacutegt 18 j2Igravegt 36WX)qoiumlIuml`ZaringbrvbarsectT 8GBs (4GBsj2) AAEligpoundq 13aelig13 5PDHLeafaccedilyacute7 SpineWdividenaccedilordf

9iexcllaquosecto LeafaccedilGu Spineaccedil QDRgtYZeumlnqbrvbarpound 113aelig13cd SpineIacutegt 9j2Igrave=18 QDRordfegraveBeumlnq QDRordf 5PoumlLeaf accedil3xmacrDownlink ordf 36 Uplink ordf 18 Xgt Uplinkshy 50WXsectmn 50_ccedilaringWdividenotgtqBX900gta$fY`Ecircfrac34h Fig 23GqBX900gt 50_ccedilaring13aelig13ocircotilde gtthornordf]GqSpineLeafaccedilW13aelig13SoiumlIumlYZGEcircfrac34h Fig 24Gqmmgt13aelig13S nAumliumlIuml nAuml(niquest9)UgraveD nU9Auml(nAgrave9) SpineaccedilegraveBGqlaquo13aelig13laquoiumlIumlordfcent13aelig13iumlIumlcdOcirc$ltAacuteoumlSpineaccedilcd Leafaccedil(2Igrave) QDR2(1j2Igrave)ordfegraveBeumln(2Igrave)LeafaccedilcdiumlIumlQDR2gtYZeumlnqDotildeCmSpineaccedilcd Leaf accedilordmAcirc13aelig13SPAtildeWiumlIumlWnoteumlnqmDHyenwiumlIumlTEacuteOcirc$lt GoumlWiumlIuml 1 W 10 ordflt GoumlWgtn~n FAringordf+XqX13aelig13Sotilde gtiuml_ccedilaring

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

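[Fig. 23: BX900 network configuration. Chassis 1 (nodes bx0001, bx0002, ..., bx0018; switches Leaf(1-1) and Leaf(1-2)) through chassis 119 (nodes bx2125, bx2126, ...; switches Leaf(119-1) and Leaf(119-2)), all connected to Spine(1), Spine(2), ..., Spine(9).]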

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq


[Fig. 24: relationship between nodes and switches. A Spine switch sits above the Leaf switches of one chassis (1 chassis: 18 nodes). Nodes 1-9 and nodes 10-18 are drawn in two rows under switches SW 1 to SW 9, so that node n and node n+9 attach to the same switch SW n.]
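The link counts quoted in Section 6.1.1 (18 nodes per chassis, 2 planes, 9 Spine switches, 36 downlinks and 18 uplinks, 4 GB/s per QDR link) can be tied together with a few lines of arithmetic. This is only a sketch of that bookkeeping; the constant and function names are ours, not the report's.

```python
# Figures quoted in the text for one BX900 chassis.
NODES_PER_CHASSIS = 18
PLANES = 2                 # two Leaf switch planes per chassis
SPINE_SWITCHES = 9
QDR_LINK_GB_PER_S = 4.0    # per QDR link, as in "8GB/s (4GB/s x 2)"


def downlinks_per_chassis() -> int:
    """QDR links from the nodes of one chassis to its Leaf switches."""
    return NODES_PER_CHASSIS * PLANES


def uplinks_per_chassis() -> int:
    """QDR links from the Leaf switches of one chassis to the Spine switches."""
    return SPINE_SWITCHES * PLANES


def uplink_ratio() -> float:
    """Uplink capacity as a fraction of downlink capacity."""
    return uplinks_per_chassis() / downlinks_per_chassis()


def node_bandwidth_gb_per_s() -> float:
    """Aggregate bandwidth of one node over both planes."""
    return QDR_LINK_GB_PER_S * PLANES


if __name__ == "__main__":
    print(downlinks_per_chassis(), uplinks_per_chassis(), uplink_ratio())
    print(node_bandwidth_gb_per_s())
```

The 0.5 ratio is the "50" figure that the text attaches to traffic leaving a chassis: only half of the node-side link capacity is available toward the Spine switches.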

6.1.2 Communication methods

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Method-switching thresholds

Mode         RDMA           RDMA            NORDMA
Processes    up to 1024     1025 or more    any
Threshold    32768 bytes    16384 bytes     524288 bytes

Table 17 RDMA mode: switching between the Normal Send method and the Send on Request method

Message size          0 to under 16 KB    16 KB to under 32 KB    32 KB or more
1 to 1024 processes   Normal Send         Normal Send             Send on Request
1025 to 2048 processes Normal Send        Send on Request         Send on Request
2049 or more processes (switches to NORDMA) Send on Request       Send on Request

Table 18 NORDMA mode: switching between the Normal Send method and the Send on Request method

Message size          0 to under 512 KB    512 KB or more
All process counts    Normal Send          Send on Request

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq
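The selection rule summarized in Tables 16 to 18 can be written as a small lookup. The sketch below uses only the thresholds of Table 16 and assumes that a message strictly smaller than the threshold uses Normal Send and anything at or above it uses Send on Request; that boundary handling is our reading of the tables. The special entry for 2049 or more processes in Table 17 is not modeled. The function name `send_method` is hypothetical and does not belong to any MPI library.

```python
# Thresholds in bytes, as listed in Table 16.
THRESHOLD_RDMA_UP_TO_1024_PROCS = 32768
THRESHOLD_RDMA_FROM_1025_PROCS = 16384
THRESHOLD_NORDMA = 524288


def send_method(mode: str, nprocs: int, nbytes: int) -> str:
    """Return the send method implied by Table 16 for one message.

    mode   -- "RDMA" or "NORDMA"
    nprocs -- number of MPI processes in the job
    nbytes -- message size in bytes
    """
    if mode == "NORDMA":
        threshold = THRESHOLD_NORDMA
    elif mode == "RDMA":
        if nprocs <= 1024:
            threshold = THRESHOLD_RDMA_UP_TO_1024_PROCS
        else:
            threshold = THRESHOLD_RDMA_FROM_1025_PROCS
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumed boundary: sizes below the threshold use Normal Send.
    return "Normal Send" if nbytes < threshold else "Send on Request"


if __name__ == "__main__":
    for args in [("RDMA", 512, 20000), ("RDMA", 2048, 20000), ("NORDMA", 512, 20000)]:
        print(args, "->", send_method(*args))
```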


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Data size step

Parallelism   Allreduce, Reduce, Bcast, SendRecv   Other MPI functions
256           32 bytes                             128 bytes
512           same as above                        128 bytes
1024          same as above                        256 bytes
2048          same as above                        2048 bytes
4096          same as above                        2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()                  ! measurement start
        call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                     b,nelem,MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD,ierr)
        ttime2 = MPI_WTIME()                  ! measurement end
        ttime0 = ttime2 - ttime1
!       take the slowest process as the time of this iteration
        call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                  MPI_COMM_WORLD,ierr)
        if(myid.eq.0) then
          max_time = max(max_time,ttime)
          min_time = min(min_time,ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig. 25 Program (excerpt)


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Fig. 26: MPI functions (512-way parallel, part 1). Six panels, each comparing the rdma and nordma series: Allreduce, Reduce, Reduce_scatter, Allgather, Gather, Allgatherv. Axis values are not recoverable from the extraction.]


[Fig. 27: MPI functions (512-way parallel, part 2). Eight panels, each comparing the rdma and nordma series: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank 0 to Rank 511). Axis values are not recoverable from the extraction.]


brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq


[Fig. 28: comparison of RDMA and NORDMA for small data sizes (0 KB to 8 KB). Chart contents are not recoverable from the extraction.]


63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq


      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()                    ! measurement start
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,1,
     &               MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,1,
     &               MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,1,
     &               MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,1,
     &               MPI_COMM_WORLD,ireq(4),ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()                    ! measurement start
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1.d0
          b1(i,j) = 1.d0
          c1(i,j) = 0.d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                    ! measurement end
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()                    ! measurement end
      ttimeDD = ttime8 - ttime7               ! computation time
      ttimeal = ttime6 - ttime5               ! total elapsed time
      ttimesd = ttimeal - ttimeDD             ! total minus computation

Fig. 29 Program for overlapping communication with computation
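The three time differences at the end of the listing in Fig. 29 can be restated outside Fortran as follows. `overlap_times` mirrors the variables ttimeDD, ttimeal and ttimesd from the listing. `hidden_fraction` is a hypothetical helper added for illustration only: it assumes a separately measured communication-only time and is not the overlap formula used in the report, which is not recoverable from the extracted text.

```python
from typing import Tuple


def overlap_times(t5: float, t6: float, t7: float, t8: float) -> Tuple[float, float, float]:
    """Return (compute, total, exposed) from the four timestamps in Fig. 29.

    compute -- ttimeDD = ttime8 - ttime7 (matrix product only)
    total   -- ttimeal = ttime6 - ttime5 (isend/irecv through waitall)
    exposed -- ttimesd = ttimeal - ttimeDD (time not covered by computation)
    """
    compute = t8 - t7
    total = t6 - t5
    exposed = total - compute
    return compute, total, exposed


def hidden_fraction(exposed: float, comm_alone: float) -> float:
    """Hypothetical metric: share of a communication-only time that was hidden.

    comm_alone is assumed to come from a separate run without computation.
    """
    if comm_alone <= 0:
        raise ValueError("comm_alone must be positive")
    return max(0.0, 1.0 - exposed / comm_alone)


if __name__ == "__main__":
    compute, total, exposed = overlap_times(0.0, 10.0, 1.0, 9.0)
    print(compute, total, exposed, hidden_fraction(exposed, 8.0))
```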


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()
!       rank 0 sends to rank IIRANK, NNN times
        do 120 II = 1, NNN
          if(myid.eq.0)
     &      call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                    IIRANK,II,MPI_COMM_WORLD,ierr)
          if(myid.eq.IIRANK)
     &      call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                    0,II,MPI_COMM_WORLD,istatus,ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig. 30 One-to-one communication program (excerpt)


NODE
  CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
  CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig. 31 MPI rank assignment to cores
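The placement in Fig. 31 determines which kind of path a transfer from rank 0 takes. The sketch below encodes the rank-to-core mapping read from the figure and assumes that ranks fill nodes in consecutive blocks of 8 and that even core IDs belong to CPU0 and odd core IDs to CPU1, as drawn. The names are illustrative.

```python
# Rank -> core ID within a node, as read from Fig. 31.
CORE_OF_LOCAL_RANK = {0: 0, 1: 2, 2: 4, 3: 6, 4: 1, 5: 3, 6: 5, 7: 7}
RANKS_PER_NODE = 8  # 2 CPUs x 4 cores


def cpu_of_core(core_id: int) -> int:
    """CPU socket of a core: even IDs on CPU0, odd IDs on CPU1 (per Fig. 31)."""
    return core_id % 2


def path_from_rank0(rank: int) -> str:
    """Classify the path used when rank 0 communicates with `rank`."""
    if rank // RANKS_PER_NODE != 0:
        return "inter-node"
    core = CORE_OF_LOCAL_RANK[rank % RANKS_PER_NODE]
    if cpu_of_core(core) == cpu_of_core(CORE_OF_LOCAL_RANK[0]):
        return "intra-CPU"
    return "inter-CPU"


if __name__ == "__main__":
    for r in (1, 3, 4, 7, 8, 24):
        print(r, path_from_rank0(r))
```

With this mapping, ranks 1 to 3 share a CPU with rank 0, ranks 4 to 7 sit on the other CPU of the same node, and ranks 8 and above are on other nodes.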

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

[Fig. 32: one-to-one data transfer pattern, drawn for processes 0 to 7: processes 0 to 3 send data to processes 4 to 7.]


Table 20 Results (within one chassis)

Pattern                   MPI only       Hybrid (72×2)   Hybrid (36×4)   Hybrid (18×8)
Bandwidth per process     723.8 MB/s     1447.4 MB/s     2894.2 MB/s     5346.7 MB/s
Total per node            5790.6 MB/s    5789.9 MB/s     5788.4 MB/s     5346.7 MB/s
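The two rows of Table 20 are related by the number of MPI processes per node: 144 cores in a chassis of 18 nodes gives 8, 4, 2 and 1 processes per node for the four patterns. The sketch below checks that relation. It reads the extracted digits with one decimal place (for example "7238MBs" as 723.8 MB/s), which is an assumption, although the cross-check itself supports it; the names are ours.

```python
NODES_PER_CHASSIS = 18

# pattern -> (MPI processes in the chassis, MB/s per process, reported MB/s per node)
TABLE_20 = {
    "MPI only":      (144, 723.8, 5790.6),
    "hybrid (72x2)": (72, 1447.4, 5789.9),
    "hybrid (36x4)": (36, 2894.2, 5788.4),
    "hybrid (18x8)": (18, 5346.7, 5346.7),
}


def node_total(processes: int, per_process_mb_s: float) -> float:
    """Per-node bandwidth implied by the per-process bandwidth."""
    processes_per_node = processes // NODES_PER_CHASSIS
    return per_process_mb_s * processes_per_node


if __name__ == "__main__":
    for name, (procs, per_proc, reported) in TABLE_20.items():
        print(f"{name}: computed {node_total(procs, per_proc):.1f}, reported {reported}")
```

The computed and reported totals agree to within rounding of the per-process values, and only the 8-thread pattern, with a single communicating process per node, falls noticeably below the others.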

Table 21 Results (between chassis, MPI only)

        144 MPI                 192 MPI                 224 MPI
Node    MB/s per   IB link      MB/s per   IB link      MB/s per   IB link
        process    shared       process    shared       process    shared
1       724.6      no           467.0      yes          467.1      yes
2       724.4      no           467.0      yes          467.1      yes
3       724.6      no           467.0      yes          467.1      yes
4       724.2      no           724.3      no           467.0      yes
5       724.3      no           724.4      no           467.1      yes
6       724.2      no           724.0      no           723.7      no
7       724.1      no           723.9      no           724.6      no
8       724.4      no           724.0      no           724.5      no
9       724.9      no           724.3      no           724.0      no
10      -          -            467.0      yes          467.0      yes
11      -          -            467.0      yes          467.2      yes
12      -          -            466.9      yes          467.1      yes
13      -          -            -          -            467.1      yes
14      -          -            -          -            467.0      yes

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq


Table 22 Results (between chassis, hybrid)

        Hybrid (112×2)          Hybrid (56×4)           Hybrid (28×8)
Node    MB/s per   IB link      MB/s per   IB link      MB/s per   IB link
        process    shared       process    shared       process    shared
1       933.9      yes          1856.7     yes          3640.8     yes
2       933.7      yes          1856.4     yes          3691.9     yes
3       934.6      yes          1856.6     yes          3653.6     yes
4       934.7      yes          1857.0     yes          3645.7     yes
5       934.2      yes          1865.1     yes          3664.6     yes
6       1438.5     no           2860.6     no           5380.8     no
7       1448.8     no           2911.8     no           5377.9     no
8       1450.0     no           2867.6     no           5378.9     no
9       1449.5     no           2910.0     no           5378.4     no
10      933.7      yes          1879.7     yes          3640.8     yes
11      934.0      yes          1877.5     yes          3671.5     yes
12      934.3      yes          1865.1     yes          3655.9     yes
13      934.8      yes          1881.1     yes          3645.7     yes
14      933.8      yes          1869.4     yes          3699.1     yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq


Ta

ble

23iacuteXCcedilEgrave

MPI

_allt

oall

vUacuteSTUcircR

vRcopyUumlYacutebrvbarTHORN

szligvcopyUumlYacutebrvbaragrave

pvRaacuteacircatildeNR

+aumleAumlovqaringaeligPgR

Oslash 5 Iuml

copy Igrave - a 5 (MB)

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

R

R

szligR

pR

144m

pi

512

3R6

0 11

14

118

0

145

9 3

60 R

138

6

36

0 13

87

100

1

31

124

1

25

2plusmn

(BC

DEacute

) 51

2 3R

60

111

4

118

0

148

2 11

16 R

146

9

63

0 14

75

100

1

33

132

1

32

3plusmn

(BC

DEacute

) 51

2 3R

60

111

4

29

0 14

81

714

13 R

153

3

29

0 14

21

100

1

33

138

1

28

448m

pi

512

14R

40

142

6

414

0

182

4 21

272

1 R

170

3

78

0 16

32

100

1

28

119

1

14

504m

pi

512

14R

45

142

7

415

8

180

4 27

242

6 R

160

5

79

0 14

33

100

1

26

112

1

00

2048

mpi

51

2 37R

69

182

1

298

8 24

93

465

6 R

250

1

465

6 25

04

100

1

37

137

1

38

2plusmn

51

2 37R

69

166

7

2211

6

262

5 67

564

6 R

264

7

298

8 25

04

100

1

57

159

1

50

3plusmn

(BC

DEacute

) 51

2 37R

69

166

7

2211

6

262

1 69

524

9 R

269

3 6

298

8 24

59

100

1

57

162

1

47

3072

mpi

51

2 69R

56

258

7 28

399

9 27

34

5371

54 R

286

3 9

507

7 26

44

100

1

06

111

1

02

68m

pij

2om

p 10

24

3R5

7 12

10

11

170

11

80

91

9 R

122

3

28

5 12

21

100

0

97

101

1

01

120m

pij

2om

p 10

24

9R3

3 12

89

215

0

127

0 12

191

6 R

130

4

47

5 12

94

100

0

99

101

1

00

132m

pij

2om

p 10

24

8R4

1 13

79

216

5

116

1 15

211

6 R

119

5

48

3 11

38

100

0

84

087

0

83

432m

pij

2om

p 10

24

17R

64

162

4

119

8 16

28

1925

43 R

161

2 1

129

0 16

18

100

1

00

099

1

00

30m

pitimes4

omp

1024

4R

38

598

115

0

674

12

13 R

683

27

5 68

2

100

1

13

114

1

14

2plusmn

(BC

DEacute

) 10

24

4R3

8 59

8

115

0

678

12

13 R

690

27

5 70

0

100

1

13

115

1

17

36m

pitimes4

omp

1024

6R

30

603

118

0

684

11

16 R

697

63

0 70

3

100

1

13

116

1

17

2plusmn

(BC

DEacute

) 10

24

6R3

0 60

3

29

0 68

8

714

13 R

702

29

0 71

0

100

1

14

116

1

18

126m

pitimes4

omp

1024

16R

39

822

3

415

8

805

27

242

6 R

738

1

79

0 71

5

100

0

98

090

0

87

180m

pitimes4

omp

1024

18R

50

925

615

0

892

40

283

2 R

776

127

5 75

3

100

0

96

084

0

81

2plusmn

(BC

DEacute

) 10

24

18R

50

925

712

9

861

29

352

6 R

808

109

0 76

8

100

0

93

087

0

83

612m

pitimes4

omp

1024

45R

68

110

2

349

0 10

34

525

9 R

111

4

349

0 10

46

100

0

94

101

0

95

2plusmn

(BC

DEacute

) 10

24

45R

68

110

2

2810

9

920

57

634

9 R

960

5

358

7 83

4

100

0

83

087

0

76

17m

pitimes8

omp

1024

3R

57

338

1

117

0

430

9

19 R

432

28

5 43

3

100

1

27

128

1

28

90m

pitimes8

omp

1024

15R

60

504

109

0 44

9

243

8 R

471

109

0 47

1

100

0

89

093

0

94

357m

pitimes8

omp

1024

53R

67

731

418

7 69

2

576

3 R

711

418

7 72

9

100

0

95

097

1

00

2plusmn

(BC

DEacute

) 10

24

53R

67

731

3410

5

642

57

685

3 R

628

6

438

3 60

2

100

0

88

086

0

82

brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq

7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

References

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net

A

FX1

HPC

C B

MT

G-F

FTEfeaZUuml^13~

-DH

PCC

_FFT

_235GmWgt

FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave

EP-

DG

EM

M-a5^regbrvbarpoundaringaeligccedil13h-a5

(15

MB

Cor

e)UacuteCordfzyacuteCq

Tabl

e A

-1feaZUuml^13~]-a5^regCD

FX1

HPC

C B

MT

UgraveccedilegravekeacuteUgraveccedilknUgraveccedilplusmnsup2sup3

~~regreg

Ugraveccedilmmnecircecirckccedilnecircfrac12

regReeumlgecirckccedilnecircfrac12

ndegplusmnsup2

ecirckccedilpUgraveecircfrac12frac12plusmnsup2sup3degmiddot

plusmnsup2ordmdegsup2sup1plusmnsup2sup3degmiddot

eacuteplusmn~+igraveiacute8

nmcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

reg~

reg~v

reg

sup2sup3plusmn

qvqyuqou

ovwyy

qvquzqq

vxzwo

qvzz

vyoy

zzv|w|

qvuqqy

uvuyx

qyu

sup2sup3plusmnqvqy|zx

ovwz

qvquyxz

vxuo

qvzuw

vyowx

zzv|z

qvxx|y

yvox

qwy

sup2sup3plusmn

qvxoquq

uvoxzo

qvoqqz

yyvuox

w|vyzx

vyoxx

zzvzyq

qv|qq

vqy

yy

sup2sup3plusmnqvxoqwwq

uvqwxx

qvooqzu

yyvxuwo

w|vyyq

vyouu

zzvzzx

qvux|

oqvz|

xw

sup2sup3plusmn

|vwq|qqq

|uvwz

qvqoxxw

qv|

o|uuv|q

vyxy

zzv||wq

qvqwzuw

oqvywyu

sup2sup3plusmn|vw||wqq

|vyz

qvuxu

yyyvuwwu

o|uyvyw

vy|q

zzv|uy

qvxwzqu

ovxuo

zq|z

sup2sup3plusmn

|vwuzqq

uuvxqwq

qvqoxxw

wwyv|zq

o|uv|q

vy|ox

xvuuwu

qvowox

oqvowuyq

sup2sup3plusmn|vwoquxqq

|yvzyxu

qvuzou

yvqwq|

o|uyvzxu

vy|qw

xvuuz

qvqwuy

ovuz

|y

uxqqq

wxqqq

|xqqqq

|qqqq

w

frac14icircRfrac14Rplusmndegreg

szligsup3degicircRRszligcedilszligiumliumlRethvRntildeRfrac14degicircRccedilogravemicroplusmnregntildemicro~sup1Oslashsup3sup2moRenecircfrac12oacuteregplusmnAumlocircotildeg

egravekeacuteOslashfrac14knicircccedilpegravekszligszligOslashmmnOslash|xRccedilpOslashpOslashfrac14knRccedilpegravekszligszligOslashfrac12ecircfrac12eacuteeacuteszlignRocircotilde

mszligpecircmicircccedilpeacutefrac14UgraveOslashOslashyunRocircotilde

frac12kicircRRkplusmnplusmnplusmnraquodegReacutemiddotplusmnmiddotRkplusmn~plusmnmiddot

-

moRexoszligcedilowszligkoumlg

szligdegicircRkszligyudividentildeRvxUgraveegraveparantildeRpposlashueuqUgravecedilregedegsup2plusmngntildeRo|vxUgravecedilregesup3plusmnregsup2RugraveRnecircfrac12gg

Cuacute uacuteucircicircuumlTyacutethornYacuteN0ucircicircuumlTyacutethornYacuteN0gt

|

xo

xo

egravekszligszligRethvicircRethov|vo

eeumlgicircecirckccedilnecircfrac12RregR0Eeecirckccedilnecircfrac12Rndegplusmnsup2gIcirc-RJ13NOIPQ

~~icircRppRmicrodegdegugraveplusmnsup2ntildeRoreg~cedilsup2ntildeRmplusmnRn

B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

1024EumlIgrave

Fig. B-1 MPI [...] (1024 [...], 1). Panels: Allreduce(1024), Reduce(1024), Reduce_scatter(1024), Allgather(1024), Allgatherv(1024), Gather(1024). Each panel compares rdma with nordma.

Fig. B-2 MPI [...] (1024 [...], 2). Panels: Gatherv(1024), Scatter(1024), Scatterv(1024), Alltoall(1024), Alltoallv(1024), Bcast(1024), Scan(1024), SendRecv(1024, Rank0 -> Rank1023). Each panel compares rdma with nordma.

13

C

RD

MAoacute

NO

RD

MA

Ta

ble

C-13UVOcirc$

(8K

Bccedil

78K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

quw-

uqzy-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

~plusmnreg

plusmn

sup2~

sup2~

sup2cedil~raquo

sup2~Oslashreg~plusmn

middotplusmnsup1

middotplusmnsup1raquo

~plusmnraquo

~plusmn

~plusmn

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

plusmnraquo

Tabl

e C

-2UVOcirc$

(80

KBccedil

584K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

sup2~Oslashreg~plusmn

plusmn

~plusmnreg

middotplusmnmiddotsup1

~plusmnraquo

~plusmn

sup2~

sup2cedil~raquo

sup2~

middotplusmnsup1raquo

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

~plusmn

plusmnraquo

Zordf

100M

RD

MAAgravejordfnotyWXq

0CXM

NO

RD

MA

AgravejordfnotyWXq

D

W Oslash5

Tabl

e D

-1Zccedil`

MPI

8mpi

qzvquecirciumlq|yvwuecirciumlquvzuecirciumlquovu|oecirciumlqxvwoqecirciumlquyvxqoecirciumlquyvxoecirciumlqu

qwvo

qzvzzecirciumlq|vuecirciumlquwvyecirciumlquovu|oecirciumlqxvwuecirciumlquyvuzoecirciumlquxvy|ecirciumlqu

yqyv|

owvz|oecirciumlq|yvwyyecirciumlquvxzecirciumlquvqxecirciumlquov|uecirciumlquyvuecirciumlquccedilxv|xecirciumlq

ccedilyvq

ozvqoecirciumlq|vxxecirciumlquwvux|ecirciumlquvyzoecirciumlquovxoecirciumlquyvuuqecirciumlquccedilvyecirciumlq|

ccedilwuvy

ovq|ecirciumlquyvzz|ecirciumlquwvq|qecirciumlquovu|qecirciumlqxvw|wecirciumlquyvuyoecirciumlquyvywecirciumlqu

yquv

ovquyecirciumlquvuyqecirciumlquwvxqecirciumlquovu|qecirciumlqxvwecirciumlquyvuwqecirciumlquxvzxecirciumlqu

xx|vw

|ovq|zecirciumlquyvwzuecirciumlquvz||ecirciumlquvyoecirciumlquovwecirciumlquyvuzecirciumlquccedilovozecirciumlq|

ccediloyvy

|ovquecirciumlquv|qecirciumlquwvuoyecirciumlquvyy|ecirciumlquovowyecirciumlquyvuecirciumlquccedilvxwecirciumlq|

ccedilovz

uzvwoecirciumlq|vuxyecirciumlquwvu|uecirciumlquovu|ecirciumlqxvw|ecirciumlquyvuuxecirciumlquxvwwuecirciumlqu

yqovy

uzvywecirciumlq|yvzuzecirciumlquvzoyecirciumlquovuwecirciumlqxvwqyecirciumlquyvuecirciumlquyv|yecirciumlqu

yxvu

xzvqecirciumlq|vuoqecirciumlquwv|wqecirciumlquvyyecirciumlquovowecirciumlquyvuuzecirciumlquccedilvozecirciumlq|

ccedil|vx

xzvyzzecirciumlq|voxuecirciumlquwvouecirciumlquvqqecirciumlquovxuecirciumlquyvuuyecirciumlquccediluv|yecirciumlq|

ccedilu|v

yzvuecirciumlq|vxzecirciumlquwvxecirciumlquovuyecirciumlqxvwecirciumlquyvuxecirciumlquxvyzqecirciumlqu

xw|vw

yzvz|oecirciumlq|yvwzoecirciumlquvwwxecirciumlquovuzecirciumlqxvwyecirciumlquyvuyyecirciumlquyvuqwecirciumlqu

yuxv|

zvwqecirciumlq|v|uzecirciumlquwv||yecirciumlquvyxwecirciumlquovozecirciumlquyvuzecirciumlquccedilyvwuecirciumlq|

ccedilywv

ovqq|ecirciumlquyvzzyecirciumlquvzzzecirciumlquvyzecirciumlquovqoecirciumlquyvuzecirciumlquccedil|vozxecirciumlq|

ccedil|ovz

zvuzecirciumlq|vowqecirciumlquwvoxxecirciumlquovoqqecirciumlqxuvxwecirciumlquyvuqecirciumlquvwu|ecirciumlqu

zovy

zvwecirciumlq|vyxecirciumlquwvuecirciumlquovqzzecirciumlqxuvxoecirciumlquyvuyzecirciumlquvu|ecirciumlqu

wvw

qwv|xzecirciumlq|vzecirciumlquwvywecirciumlquvxyoecirciumlquovqwuecirciumlquyvuecirciumlquccedilovqyecirciumlquccedilov

qzvuuuecirciumlq|v|yecirciumlquwvqecirciumlquovuzecirciumlqxvwqqecirciumlquyvuzoecirciumlquyvqoecirciumlqu

y|vx

oovouwecirciumlquwvqqecirciumlquzvoywecirciumlquovuqwecirciumlqxvyxxecirciumlquyvuwecirciumlquuvzoxecirciumlqu

uwvq

owvwuecirciumlq|vq|qecirciumlquvzqzecirciumlquv|uuecirciumlquzvqoecirciumlq|yvu|ecirciumlquccedilxvyxqecirciumlq|

ccedilyuv|

yvz||ecirciumlq|v|oecirciumlquwvuuecirciumlquvqyecirciumlqxovuowecirciumlqxyvu||ecirciumlquovozecirciumlqxoxwvy

ovouecirciumlquvuxecirciumlquwv|zecirciumlquovu|oecirciumlqxvwxyecirciumlquyvux|ecirciumlquxvzoecirciumlqu

xoyvo

|ov|qoecirciumlquv|zqecirciumlquwvyzoecirciumlquv|oecirciumlquwvu|uecirciumlq|yvu|ecirciumlquccedilov|uecirciumlquccediloqxv

|v|qecirciumlq|vqecirciumlquvxzecirciumlquvxqecirciumlquovq|ecirciumlquyvuzecirciumlquccedilv|zoecirciumlq|

ccedil|vu

uwvxy|ecirciumlq|wvzoyecirciumlquzv|ecirciumlquovuozecirciumlqxvuoqecirciumlquyvwqecirciumlquuvuoecirciumlqu

xoxvw

uovquecirciumlquvwecirciumlquwvqecirciumlquovuouecirciumlqxvyxqecirciumlquyvuzoecirciumlquxvwoecirciumlqu

xy|vx

xzvux|ecirciumlq|vxwecirciumlquwvx|ecirciumlquvx|oecirciumlquovoqoecirciumlquyvu|qecirciumlquccedilzvzoecirciumlq|ccediloquvz

xwvqqecirciumlq|vuuwecirciumlquwv|owecirciumlquvzuzecirciumlquov|wyecirciumlquyvxy|ecirciumlquccedil|vywuecirciumlq|

ccediluv|

yzvqecirciumlq|zv|w|ecirciumlquovq|qecirciumlqxovuxecirciumlqxvyecirciumlquyvzqecirciumlquuvqecirciumlqu

uyuvq

yovqyuecirciumlquv|quecirciumlquwv|ywecirciumlquovuyuecirciumlqxwvqxwecirciumlquyvxwoecirciumlquyvecirciumlqu

xwzvx

ovoooecirciumlquvzwwecirciumlquzvqzzecirciumlquvo|ecirciumlqxovuoecirciumlqxyvyouecirciumlquovecirciumlqxooqqvy

ovqoecirciumlquvqyoecirciumlquwvo|ecirciumlquvy|ecirciumlquovoecirciumlquyvuxecirciumlquccediluvxzecirciumlq|

ccediluvz

zvy|ecirciumlq|wvoqqecirciumlquzvqyecirciumlquov|uqecirciumlqxyvw|ecirciumlquyvxyecirciumlquuv||ecirciumlqu

uzovo

zvyzoecirciumlq|vqwecirciumlquwvoecirciumlquovqzwecirciumlqxuvuwwecirciumlquyvuzyecirciumlquvwqyecirciumlqu

yxvy

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

Fig. D-1 [...] MPI, 8mpi. Panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2.

Tabl

e D

-2Icirca|ccedil`eacuteYacute~

1

4mpij

8om

p

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Fig. D-2 [...] 1, 4mpi × 8omp (1). Panels: rdma (4mpi × 8omp)-1, nordma (4mpi × 8omp)-1.

Fig. D-3 [...] 1, 4mpi × 8omp (2). Panels: rdma (4mpi × 8omp)-2, nordma (4mpi × 8omp)-2, rdma (4mpi × 8omp)-3, nordma (4mpi × 8omp)-3.

Tabl

e D

-3Icirca|ccedil`eacuteYacute~

2

4mpij

8om

p

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

Fig. D-4 [...] 2, 4mpi × 8omp (1). Panels: rdma (4mpi × 8omp)-1, nordma (4mpi × 8omp)-1.

Fig. D-5 [...] 2, 4mpi × 8omp (2). Panels: rdma (4mpi × 8omp)-2, nordma (4mpi × 8omp)-2, rdma (4mpi × 8omp)-3, nordma (4mpi × 8omp)-3.

E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

nmb = 128  intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK =  1  TIME = 39305 sec  Ave Speed = 32566145 MB/s
RANK =  2  TIME = 39197 sec  Ave Speed = 32655168 MB/s
RANK =  3  TIME = 39212 sec  Ave Speed = 32643219 MB/s
RANK =  4  TIME = 34341 sec  Ave Speed = 37272765 MB/s
RANK =  5  TIME = 34175 sec  Ave Speed = 37453885 MB/s
RANK =  6  TIME = 34086 sec  Ave Speed = 37552470 MB/s
RANK =  7  TIME = 34329 sec  Ave Speed = 37286226 MB/s
RANK =  8  TIME = 24021 sec  Ave Speed = 53287420 MB/s
RANK =  9  TIME = 23794 sec  Ave Speed = 53794939 MB/s
RANK = 10  TIME = 23805 sec  Ave Speed = 53770236 MB/s
RANK = 11  TIME = 23798 sec  Ave Speed = 53786914 MB/s
RANK = 12  TIME = 23942 sec  Ave Speed = 53462490 MB/s
RANK = 13  TIME = 23931 sec  Ave Speed = 53488046 MB/s
RANK = 14  TIME = 23950 sec  Ave Speed = 53443826 MB/s
RANK = 15  TIME = 24066 sec  Ave Speed = 53186052 MB/s
RANK = 16  TIME = 23798 sec  Ave Speed = 53786327 MB/s
RANK = 17  TIME = 23802 sec  Ave Speed = 53775886 MB/s
RANK = 18  TIME = 23821 sec  Ave Speed = 53733399 MB/s
RANK = 19  TIME = 23794 sec  Ave Speed = 53795866 MB/s
RANK = 20  TIME = 23939 sec  Ave Speed = 53468837 MB/s
RANK = 21  TIME = 23938 sec  Ave Speed = 53470775 MB/s
RANK = 22  TIME = 23939 sec  Ave Speed = 53470333 MB/s
RANK = 23  TIME = 23942 sec  Ave Speed = 53462889 MB/s

Fig E-2 afeaZQR MPIbrvbar

nmb = 128  intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK =  1  TIME = 39370 sec  Ave Speed = 32511775 MB/s
RANK =  2  TIME = 39475 sec  Ave Speed = 32425231 MB/s
RANK =  3  TIME = 39415 sec  Ave Speed = 32474846 MB/s
RANK =  4  TIME = 34392 sec  Ave Speed = 37217925 MB/s
RANK =  5  TIME = 34543 sec  Ave Speed = 37055373 MB/s
RANK =  6  TIME = 34250 sec  Ave Speed = 37372742 MB/s
RANK =  7  TIME = 34709 sec  Ave Speed = 36878286 MB/s
RANK =  8  TIME = 26967 sec  Ave Speed = 47465876 MB/s
RANK =  9  TIME = 26964 sec  Ave Speed = 47471018 MB/s
RANK = 10  TIME = 26955 sec  Ave Speed = 47486289 MB/s
RANK = 11  TIME = 26947 sec  Ave Speed = 47499771 MB/s
RANK = 12  TIME = 26957 sec  Ave Speed = 47483647 MB/s
RANK = 13  TIME = 27020 sec  Ave Speed = 47373054 MB/s
RANK = 14  TIME = 26948 sec  Ave Speed = 47498023 MB/s
RANK = 15  TIME = 27174 sec  Ave Speed = 47104542 MB/s
RANK = 16  TIME = 26956 sec  Ave Speed = 47485054 MB/s
RANK = 17  TIME = 26945 sec  Ave Speed = 47504899 MB/s
RANK = 18  TIME = 26946 sec  Ave Speed = 47502360 MB/s
RANK = 19  TIME = 26967 sec  Ave Speed = 47466158 MB/s
RANK = 20  TIME = 26948 sec  Ave Speed = 47499767 MB/s
RANK = 21  TIME = 26982 sec  Ave Speed = 47438204 MB/s
RANK = 22  TIME = 27087 sec  Ave Speed = 47254830 MB/s
RANK = 23  TIME = 27004 sec  Ave Speed = 47399960 MB/s

Fig E-3 QRfeaZOpenMPIbrvbar

nmb = 128  intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes                   NODE    CPU   coreid
Rank = 0                                                  bx2050  CPU0  0
all time
RANK =  1  TIME = 38110 sec  Ave Speed = 33587213 MB/s    bx2050  CPU1  1
RANK =  2  TIME = 28656 sec  Ave Speed = 44667752 MB/s    bx2050  CPU0  2
RANK =  3  TIME = 37959 sec  Ave Speed = 33720175 MB/s    bx2050  CPU1  3
RANK =  4  TIME = 28658 sec  Ave Speed = 44664664 MB/s    bx2050  CPU0  4
RANK =  5  TIME = 37829 sec  Ave Speed = 33836073 MB/s    bx2050  CPU1  5
RANK =  6  TIME = 28641 sec  Ave Speed = 44691880 MB/s    bx2050  CPU0  6
RANK =  7  TIME = 37892 sec  Ave Speed = 33779924 MB/s    bx2050  CPU1  7
RANK =  8  TIME = 35140 sec  Ave Speed = 36425620 MB/s    bx2051  CPU0  0
RANK =  9  TIME = 35017 sec  Ave Speed = 36554163 MB/s    bx2051  CPU1  1
RANK = 10  TIME = 38916 sec  Ave Speed = 32891010 MB/s    bx2051  CPU0  2
RANK = 11  TIME = 45416 sec  Ave Speed = 28184186 MB/s    bx2051  CPU1  3
RANK = 12  TIME = 35172 sec  Ave Speed = 36392511 MB/s    bx2051  CPU0  4
RANK = 13  TIME = 35053 sec  Ave Speed = 36516468 MB/s    bx2051  CPU1  5
RANK = 14  TIME = 35200 sec  Ave Speed = 36363472 MB/s    bx2051  CPU0  6
RANK = 15  TIME = 35009 sec  Ave Speed = 36561987 MB/s    bx2051  CPU1  7
RANK = 16  TIME = 22996 sec  Ave Speed = 55661441 MB/s    bx2052  CPU0  0
RANK = 17  TIME = 35587 sec  Ave Speed = 35968575 MB/s    bx2052  CPU1  1
RANK = 18  TIME = 35587 sec  Ave Speed = 35968151 MB/s    bx2052  CPU0  2
RANK = 19  TIME = 35712 sec  Ave Speed = 35842525 MB/s    bx2052  CPU1  3
RANK = 20  TIME = 22996 sec  Ave Speed = 55661978 MB/s    bx2052  CPU0  4
RANK = 21  TIME = 36856 sec  Ave Speed = 34729458 MB/s    bx2052  CPU1  5
RANK = 22  TIME = 22994 sec  Ave Speed = 55666382 MB/s    bx2052  CPU0  6
RANK = 23  TIME = 22991 sec  Ave Speed = 55673257 MB/s    bx2052  CPU1  7

13

F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq

Fig F-1 ^_ZugraveuacuteLFlat MPIP

(main)
      n = nmb*1024*1024/8        ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                  ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8d0)/(timew/dfloat(NNN))
     &          /1024d0/1024d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
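The averaging step in this listing can be written out as a worked formula. This is a sketch that assumes ave_spd divides the bytes copied per call (nsize × 8) by the mean time per call (timew / NNN) and then divides by 1024 twice to convert to MB; the symbols are the variable names of the listing, and nmb = 128 and NNN = 100 are the values given there.

```latex
\[
n = \frac{\mathrm{nmb}\cdot 1024\cdot 1024}{8}
  = \frac{128\cdot 2^{20}}{8} = 2^{24} = 16{,}777{,}216
  \quad\text{elements of 8 bytes each}
\]
\[
\mathrm{ave\_spd}
  = \frac{8\,n}{(\mathrm{timew}/\mathrm{NNN})\cdot 1024^{2}}
  = \frac{128\cdot \mathrm{NNN}}{\mathrm{timew}}
  = \frac{12800}{\mathrm{timew}}\ \ \mathrm{MB/s}
\]
```

Under this reading only the 8 bytes of a(i) are counted per element; the matching 8-byte store to b(i) is not added, so the value is a copy rate rather than the total memory traffic.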


Fig. F-2 Test program (OpenMP)

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8        ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
         write(6,*) 'Array allocation failed'
         stop
      end if
!$OMP end parallel
      nsize = n                  ! size
c     NNN Repetition number of times
      NNN = 100
!$OMP parallel private(i) shared(nsize)
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

sup2~ yntildeouu |ntildewzz |ntildeuou ovou ontildezx ontildeqyw ovw| ovzz |vqsup2~ xuntildeww ntilde|o ntildeuo qvzz |ntildeqo |ntildey qvz vu vsup2~ wyntildeu| oqntildexx oontildeqyy qvz |ntildezwy untildeuyw qvwz vq vuwsup2~ ontildequwntildexy ountildexxy oxntildeuy qvzu untildezqq xntildewyw qvwu vz vyu

sup2~ yntildeouu zou zo qvzu ontildeqqq ontildeq qvzw qvzo qvzxsup2~ xuntildeww ontildeo ontildezy qvwz ontildew ntildeuuz qv| qvz qvzsup2~ wyntildeu| ntildexuo ntildey| qvz ntildeyzz |ntildeuqw qvz qvzu qvwosup2~ ontildequwntildexy |ntildex untildeqox qvzu |ntildexww untilde|| qvw ovqx qvz

sup2~Oslashreg~plusmn yntildeouu yntildeuw| wontildexoz qvzu xntildeuo |ntildeoqw ovoq |vqq |vx|sup2~Oslashreg~plusmn xuntildeww ou|ntildewx ountildeqo qvw| uyntildeqzq yyntildewxo qvyz |vo vyosup2~Oslashreg~plusmn wyntildeu| oontildezoo yzntildeuz qvz yntildeqz zxntildeox qvo |voy vw|sup2~Oslashreg~plusmn ontildequwntildexy w|ntildeyox |yontildeuq| qvw wntildexo o|ntildexqz qvo |v| vz|

middotplusmnsup1 yntildeouu |ntildezw ||ntilde|x| qvzz oontildexww oxntildeyzu qvu vwx vo|middotplusmnsup1 xuntildeww y|ntildex yyntildeyy qvzx ontildez|x untilde|x qvuy vww ovuomiddotplusmnsup1 wyntildeu| wzntildewy z|ntildeyux qvzy |qntildeyx yntildeyqu qvuy vz ov|zmiddotplusmnsup1 ontildequwntildexy oqntildeux| o|untildeqox qvzq uqntildeyxx wwntildez| qvuy vzy ovxo

middotplusmnsup1raquo yntildeouu yntildew|u ||ntildeu ovww ntildewxy oxntildeyuw uvyy qvwy voumiddotplusmnsup1raquo xuntildeww wxntildeoz yuntildeo| ov|| wxntildex|| untildeuy ovwq ovqq ov|xmiddotplusmnsup1raquo wyntildeu| ooxntildeuw z|ntildezyu ov| z|ntildeyow yntildexuz ov|z ov| ov|zmiddotplusmnsup1raquo ontildequwntildexy o|ntildexx o|untildeuqz ovq zyntildexoq wntildeuuy ovoq ovu| ovxu

Ugraveplusmnsup1 yntildeouu |zntildeyu| oqntildexwz |vu uontildeuyq ountildewy vz qvzy qvoUgraveplusmnsup1 xuntildeww untildewxx ntildeqo voy uzntildeoy| yntildewo ovw| qvz qvw|Ugraveplusmnsup1 wyntildeu| xxntildewoq zntildewyu ovw xyntildewuu |yntildexqx ovxy qvzw qvwUgraveplusmnsup1 ontildequwntildexy yntildewxy |ntildexxw ovy yuntildewzx uyntildeoq ovuo qvz qvwo

Ugraveplusmnsup1raquo yntildeouu uqntildeq|u oqntildeyxq |vy uontildey| ountildez|z vz qvzy qvoUgraveplusmnsup1raquo xuntildeww uwntilde|u ntildeqz vo uzntildew yntildewx ovw| qvzw qvw|Ugraveplusmnsup1raquo wyntildeu| xyntildezy zntildez|y ovww xntildeo| |yntildeuwq ovx qvzw qvwUgraveplusmnsup1raquo ontildequwntildexy y|ntildeo| |ntildeyxq ovyw yxntilde|yw uyntildeqx ovuo qvz qvwo

~plusmn yntildeouu untildeyw wntildezu uvo uuntildeo wntildeuq| xvx qvzy ovq~plusmn xuntildeww xqntildeqyx qntilde|xq vuy xontildexwy untildezux vq qvz qvw~plusmn wyntildeu| xwntildeq wntildeux vqy yqntildeox |untildeu ov| qvzw qvw~plusmn ontildequwntildexy yyntildequq |yntilde|q| ovw ywntildeqx uuntildeuyy ovx| qvz qvw

~plusmnraquo yntildeouu zntilde|q zntildeqq ovqu zntildexq wntildeu|w ovo| qvzw ovq~plusmnraquo xuntildeww oyntildeu| qntilde|wy qvwo oyntildeoz untildezuy qvyx ovq qvw~plusmnraquo wyntildeu| |ntildewo wntildeuxu qvw ntildewz |untilde qvyy ovq qvw~plusmnraquo ontildequwntildexy zntildezy |yntildexo| qvw zntildezxq uuntildeu qvy ovqq qvw

plusmn yntildeouu yxntildewu wzntildeq| qv| ontildeyoz yntildezz qvwq |vq |v|qplusmn xuntildeww ountildeqoz oxntilde|o qvwo |zntildexu xxntilde|ux qv |vo vxplusmn wyntildeu| ow|ntildeuwy untildeyu qvw xntildeoyu untildezu| qvy |vo |vqqplusmn ontildequwntildexy u|ntildewo zxntildezqo qvw xntildezy zuntildeoy qvwq |vo |vo

plusmnraquo yntildeouu x|ontildeyx wyntildeqoz yvow o|ntildeqox owntilde|xu vu |vww uvyzplusmnraquo xuntildeww yozntildeyz| oz|ntildeuwu |vq oyntildeyw ywntildeyw v| |vwo vwplusmnraquo wyntildeu| yw|ntildewxu wwntildeooz v| owyntilde|q| oqxntildeyw ovy |vy vplusmnraquo ontildequwntildexy |yntildezo uqxntildeoyq ovw oontildexo ouontildezzx ovuz |vuw vwx

~plusmnreg yntildeouu uzu y| qvw uwo xzw qvwq ovq| ovqy~plusmnreg xuntildeww zq| ontildeoz qvy wo| ontildexuy qvx| ovoo qv~plusmnreg wyntildeu| ontilde|q ontildeyq qvw ontildeoxo ntildeoqz qvxx ovo| qvz~plusmnreg ontildequwntildexy ontildezqu ntilde|ux qvwo ontildeuxu ntildeyx qvxu ov|o qvww

~plusmn yntildeouu ountildewqx ountildez|x qvzz ontildeqq owntildeuw| ovou qvq qvwo~plusmn xuntildeww zntildeuy zntildewxz qvzw uqntildex|o uwntildexyw qvw| qv qvyo~plusmn wyntildeu| uuntildeq uuntildeo qvzz xyntildeyoq yzntildexzw qvwo qvw qvyu~plusmn ontildequwntildexy qntildeqqx qntildeu|q qvzz ntildeo zqntilde|ux qvwo qvzy qvw

sup2cedil~raquo yntildeouu x w qvz zo ux vq| qvw ov|sup2cedil~raquo xuntildeww ox o| qv| o|w o| qvyx qvzo qvwosup2cedil~raquo wyntildeu| oo |o qvu owz w| qvy qvzo qvwsup2cedil~raquo ontildequwntildexy ou wz qvu | |x qvyx qvz qvwo

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

sectWOcirceg

ow$XAcircsup3deg owsup3degIcircwsup3





Contents

1 Introduction --- 1
2 Outline of the system --- 1
  2.1 Hardware construction of BX900 --- 1
  2.2 Hardware construction of FX1 --- 2
  2.3 Hardware construction of Storage --- 2
3 Performance evaluation by HPCC benchmarks --- 5
  3.1 Outline of HPCC benchmarks --- 5
    3.1.1 Test programs --- 5
    3.1.2 Rules --- 7
  3.2 Measured results --- 8
    3.2.1 BX900 --- 8
    3.2.2 FX1 --- 14
  3.3 Summary --- 18
4 Tunings for BX900 and FX1 --- 20
  4.1 Profiler --- 20
  4.2 Automatic tuning of memory access --- 20
    4.2.1 Efficacy for Original version of STREAM benchmark --- 21
    4.2.2 Efficacy for HPCC version of STREAM benchmark --- 23
  4.3 Manual tuning of memory access --- 24
  4.4 Efficacy for NPB2.3 FT benchmark --- 28
  4.5 Summary --- 31
5 Performance evaluation by IOR benchmarks --- 33
  5.1 Outline of IOR benchmarks --- 33
    5.1.1 Key parameters --- 33
    5.1.2 Option parameters for MPIIO --- 34
  5.2 Measured results --- 35
    5.2.1 POSIX --- 38
    5.2.2 MPIIO --- 38
    5.2.3 Local cache --- 38
  5.3 Summary --- 41
    5.3.1 Choice of file system --- 41
    5.3.2 POSIX vs. MPIIO --- 41
    5.3.3 Buffer size --- 41
6 Performance evaluation of the interconnect --- 44
  6.1 Specification of the interconnect --- 44
    6.1.1 Construction and Nodebind --- 44
    6.1.2 Communication method --- 46
  6.2 Fundamental performance of MPI communication --- 48
    6.2.1 Measuring method --- 48
    6.2.2 Measured results --- 49
  6.3 RDMA --- 53
    6.3.1 Measuring method --- 53
    6.3.2 Measured results --- 55
  6.4 Intra-CPU, inter-CPU and inter-node communications --- 56
    6.4.1 Measuring method --- 56
    6.4.2 Measured results --- 57
  6.5 Communications between chassis --- 58
    6.5.1 Point-to-point communication --- 58
    6.5.2 Collective communication --- 62
  6.6 Summary --- 64
7 Concluding Remarks --- 65
Acknowledgements --- 65
References --- 65
Appendix A HPCC benchmark results on FX1 --- 66
Appendix B Fundamental performance of MPI communication --- 67
Appendix C RDMA vs. NORDMA --- 69
Appendix D Overlapping communications with calculations --- 70
Appendix E Peer-to-peer communications --- 76
Appendix F Memory bandwidth --- 79
Appendix G Flat-MPI vs. Hybrid --- 85

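1 Introduction

In March 2010 the Japan Atomic Energy Agency (JAEA) renewed its supercomputer system, moving from a total theoretical peak performance of 15.3 Tflops to a new system built around a large-scale Linux parallel cluster with a total theoretical peak performance of 200 Tflops (PRIMERGY BX900) and a 12 Tflops system for developing applications for the next-generation supercomputer (FX1).

This report evaluates the fundamental performance of PRIMERGY BX900 and FX1 so that users can run their programs on the two systems efficiently.

Section 2 outlines the system. Section 3 evaluates basic performance with the HPCC benchmarks. Sections 4 to 6 cover tuning, I/O performance and the interconnect. Section 7 gives concluding remarks.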

2 Outline of the system

The system consists of three parts: the large-scale parallel computation unit, the FX1 unit and the storage unit. Taken together it provides a total theoretical peak performance of 214 Tflops, a total main memory capacity of 56 TB and a total disk capacity of 1.2 PB. An outline of the system is shown in Fig. 1.

2.1 Hardware construction of BX900

The large-scale parallel computation unit is a PRIMERGY BX900 (hereafter BX900), a Linux cluster with a total theoretical peak performance of 200 Tflops and a total main memory capacity of 50 TB.

A BX900 chassis is 10U high and holds up to 18 server blades (nodes) and up to 8 connection blades. Each node carries two Xeon X5570 (quad-core) processors and 24 GB of memory (DDR3 SDRAM; some nodes have 12 GB or 48 GB). The theoretical peak performance of a node is 93.76 Gflops (2.93 GHz × 4 operations per cycle × 4 cores × 2 processors), and its memory bandwidth is 51.2 GB/s. The nodes are connected by InfiniBand (IB) 4x QDR, which has a theoretical transfer rate of 8 GB/s.
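The per-node peak quoted above follows directly from the processor parameters. The short Python sketch below reproduces that arithmetic and also derives the ratio of memory bandwidth to peak floating-point rate. The variable and function names are illustrative, and the 51.2 GB/s figure is assumed here to be the theoretical memory bandwidth of one node.

```python
# Node-level figures for BX900 as quoted in the text; names are illustrative.
CLOCK_GHZ = 2.93        # Xeon X5570 clock frequency
FLOPS_PER_CYCLE = 4     # double-precision operations per core per cycle
CORES_PER_CPU = 4
CPUS_PER_NODE = 2
NODE_MEM_BW_GBS = 51.2  # assumed: theoretical memory bandwidth per node


def node_peak_gflops(clock_ghz, flops_per_cycle, cores_per_cpu, cpus_per_node):
    """Theoretical peak of one node in Gflops."""
    return clock_ghz * flops_per_cycle * cores_per_cpu * cpus_per_node


peak = node_peak_gflops(CLOCK_GHZ, FLOPS_PER_CYCLE, CORES_PER_CPU, CPUS_PER_NODE)
bytes_per_flop = NODE_MEM_BW_GBS / peak

print(f"node peak      : {peak:.2f} Gflops")
print(f"bytes per flop : {bytes_per_flop:.3f}")
```

A ratio of roughly half a byte per floating-point operation is one way to see why the memory-bound tests in Section 3 (STREAM, FFTE) behave differently from HPL and DGEMM.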

Fig. 1 Outline of the system

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

2.2 Hardware construction of FX1

The FX1 unit (hereafter FX1) has a total theoretical peak performance of 12 Tflops and a total main memory capacity of 4.6 TB.

Each node carries one SPARC64 VII (quad-core) processor and 16 GB of memory. The chipset that connects memory and CPU (JSC) gives the node a high memory bandwidth. The nodes are connected with FBB (Full Bisectional Bandwidth) over InfiniBand 4x DDR (theoretical transfer rate 2 GB/s). FX1 is positioned as a platform for preparing applications for the next-generation supercomputer expected in 2012.

2.3 Hardware construction of Storage

The storage unit consists of two I/O nodes (SPARC Enterprise M9000), 36 ETERNUS DX80 disk arrays and one ETERNUS4000 model 400.

The total disk capacity is 1.2 PB (RAID6), and the I/O performance is 57.6 GB/s (Fig. 2). The I/O nodes are attached to InfiniBand, and the compute nodes access the files through Parallelnavi SRFS. Each disk array of the storage unit holds 46 SAS disks (1 TB, 7200 rpm). The disks are configured as RAID6, and the file systems on the I/O nodes are built with Sun QFS (Fig. 3).
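The aggregate rates in Fig. 2 and Fig. 3 can be cross-checked against the number of Fibre Channel (FC) links. The sketch below is a minimal consistency check, assuming the figure labels read 57.6 GB/s over 144 FC links for the DX80 arrays, 3.2 GB/s over 8 FC links for the ETERNUS4000, and 72 GB/s for the two I/O nodes together. The constant names are illustrative.

```python
# Link counts and totals as read from the storage figures; names are illustrative.
DX80_FC_LINKS = 144        # FC links to the 36 ETERNUS DX80 arrays
DX80_TOTAL_GBS = 57.6      # aggregate transfer rate quoted for those links
E4000_FC_LINKS = 8         # FC links to the ETERNUS4000 model 400
IO_NODE_TOTAL_GBS = 72.0   # aggregate InfiniBand rate quoted for the two I/O nodes

per_link_gbs = DX80_TOTAL_GBS / DX80_FC_LINKS      # rate implied per FC link
e4000_gbs = E4000_FC_LINKS * per_link_gbs          # expected rate of 8 such links
bottleneck_gbs = min(IO_NODE_TOTAL_GBS, DX80_TOTAL_GBS)

print(f"per FC link        : {per_link_gbs:.2f} GB/s")
print(f"ETERNUS4000 (8 FC) : {e4000_gbs:.1f} GB/s")
print(f"end-to-end limit   : {bottleneck_gbs:.1f} GB/s")
```

Under these assumptions each FC link carries 0.4 GB/s, the 8-link figure of 3.2 GB/s is consistent with the 144-link figure, and the storage side rather than the InfiniBand side sets the 57.6 GB/s limit.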

[Figure: the large-scale parallel computation unit (119 chassis) is connected over InfiniBand QDR to the I/O nodes (two M9000 nodes; transfer rate 72 GB/s for the two nodes in total), which are connected to the storage unit (36 ETERNUS DX80 arrays; transfer rate 57.6 GB/s for the storage in total).]

Fig. 2 Storage connection configuration and transfer rates

[Figure: connections between the two I/O nodes (M9000) and the storage. Each I/O node has Fibre Channel links FC(1) to FC(42). The 36 ETERNUS DX80 arrays (1 TB disks) are reached at 57.6 GB/s over 144 FC links, and the ETERNUS4000 Model 400 (one unit) at 3.2 GB/s over 8 FC links. The file systems are built with QFS.]

Fig. 3

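3 Performance evaluation by HPCC benchmarks

The seven tests of HPCC (High Performance Computing Challenge) 1) were run on BX900 and FX1. This section first outlines HPCC and then presents and compares the results.

3.1 Outline of HPCC benchmarks

HPCC was developed by J. Dongarra and colleagues at the University of Tennessee as a suite of benchmark tests for supercomputers. A single benchmark characterizes a system from one side only, so the suite is intended to evaluate a supercomputer system from several sides.

The suite consists of three tests of computational performance, HPL (High-Performance Linpack), DGEMM and FFTE; two tests of memory performance, STREAM and Random Access; and two tests of interconnect performance, PTRANS and b_eff 2). The original sources of the individual tests are written in Fortran or C, but in HPCC they are written in C and combined into a single program. HPCC has been available since 2003, and about 300 results have been registered, so a measured system can be compared with other systems.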

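3.1.1 Test programs

A supercomputer system is hierarchical: cores form a CPU, CPUs form a node, and nodes are joined by an interconnect into the system. Performance therefore has to be examined at more than one level. HPCC runs its tests in three modes: across all nodes as one system (G: Global system performance), on a single process (SN: Single Environment) and on all processes independently at the same time (EP: Embarrassingly Parallel).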

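3.1.1.1 HPL test

HPL measures floating-point performance by solving a dense system of linear equations with LU decomposition. This kind of computation is the core of many numerical programs. HPL is also the MPI parallel version of Linpack that is used for the TOP500 list as a measure of supercomputer performance. The result depends strongly on the problem size, which is limited by the memory of the nodes used. Results are reported in Tflops. One mode is measured: global (G), across all nodes used.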
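To make the HPL measurement concrete, the following Python sketch solves a small dense system with partial pivoting and converts the elapsed time into a floating-point rate. It is a minimal single-process illustration, not the HPL code: HPL is a blocked, MPI-parallel implementation. The function names are illustrative, and the operation count 2/3 n^3 + 3/2 n^2 is the one conventionally credited to an n × n solve.

```python
import random
import time


def lu_solve(a, b):
    """Solve a x = b by Gaussian (LU-style) elimination with partial pivoting."""
    n = len(a)
    a = [row[:] for row in a]   # work on copies so the inputs stay intact
    b = b[:]
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry of column k up.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    # Back substitution on the upper-triangular part.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def hpl_flop_count(n):
    """Operation count conventionally credited to an n x n solve."""
    return (2.0 / 3.0) * n ** 3 + 1.5 * n ** 2


n = 60
rng = random.Random(0)
a = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
b = [rng.uniform(-0.5, 0.5) for _ in range(n)]

start = time.perf_counter()
x = lu_solve(a, b)
elapsed = max(time.perf_counter() - start, 1e-9)

# Largest component of the residual a x - b, as a correctness check.
residual = max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n))
gflops = hpl_flop_count(n) / elapsed / 1e9
print(f"n={n}  residual={residual:.2e}  rate={gflops:.4f} Gflops")
```

The rate printed by an interpreted loop is of course far below hardware peak. The point is only how a time and a problem size turn into the Tflops figures reported in the tables.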

3.1.1.2 DGEMM test

DGEMM measures the floating-point performance within a node by multiplying double-precision matrices. Matrix multiplication is the kernel of Linpack and is used through libraries in many supercomputer programs, so the quality of the library matters. The test evaluates the computational performance of a node in the HPCC benchmark (BMT). Results are reported in Gflops (Giga floating-point operations per second). Two modes are measured: single (SN) and embarrassingly parallel (EP).

3.1.1.3 STREAM test

STREAM measures the sustained bandwidth between memory and CPU within a node. It applies simple vector operations to arrays that are too large to stay in cache. The test consists of four operations, Copy, Scale, Add and Triad, and evaluates the memory performance of a node. Results are reported in GB/s (Giga Bytes per second). Eight values are measured: the four operations (Copy, Scale, Add, Triad) in each of the single (SN) and embarrassingly parallel (EP) modes.

3.1.1.4 PTRANS test

PTRANS (Parallel matrix TRANSpose) transposes a matrix distributed over the parallel processes and thereby measures the communication capacity of the interconnect. Results are reported in GB/s. One mode is measured: global (G), across all nodes used.

3.1.1.5 Random Access test

Random Access measures the rate at which randomly selected locations of a large table in memory can be updated. A characteristic of the test is that a single update touches only 8 bytes of data. Within a node it measures memory update performance, and across nodes it measures update performance through MPI. Results are reported in Gups (Giga updates per second). Three modes are measured: single (SN), embarrassingly parallel (EP) and global (G).

3.1.1.6 FFTE test

FFTE measures the performance of the fast Fourier transform. Like HPL, it represents a computation that appears in many numerical programs. In the single (SN) and embarrassingly parallel (EP) modes the data are handled as a two-dimensional array. In the global (G) mode they are handled as a three-dimensional array distributed over the processes, and the transpositions between the transform steps require communication among all processes. Results are reported in Gflops.

3.1.1.7 b_eff test

b_eff measures the communication latency and the communication bandwidth of the interconnect. It is run in the global (G) mode and consists of two kinds of test, Ring and PingPong. Latency is measured with 8-byte messages and bandwidth with 2 MB messages.

In Ring, the MPI processes are arranged in a ring and each process communicates one-to-one with its neighbours. Two arrangements are used: a ring in natural (rank) order and a ring in random order. In PingPong, a message is sent back and forth between two cores, and the test is repeated over pairs of cores.

The measured items of b_eff are six: (1) Ring (natural order) latency, (2) Ring (natural order) bandwidth, (3) Ring (random order) latency, (4) Ring (random order) bandwidth, (5) PingPong latency and (6) PingPong bandwidth. Latency is reported in microseconds (usec) and bandwidth in GB/s.

3.1.2 Rules

HPCC defines two kinds of run, baseline runs and optimized runs. The measurements in this report are baseline runs, with tuning limited to what the compiler provides.

3.1.2.1 Baseline runs

In a baseline run the source program must not be modified. Tuning is limited to compiler options and to the choice of libraries: the BLAS library and the MPI library may be the optimized ones supplied for the system. Results obtained with different compilers or different compiler options therefore remain baseline results.

3.1.2.2 Optimized runs

In an optimized run, designated parts of the source program may be replaced with tuned code. The changes have to be described when the results are submitted to the HPCC site, where they are listed together with the baseline results.

3.2 Measured results

On BX900 and FX1 the HPCC BMT tests were run at 8 to 512 parallel processes.

3.2.1 BX900

The HPCC BMT results for BX900 are listed in Table 1, and the scalability of HPL, PTRANS, Random Access and FFTE is plotted in Fig. 4 to Fig. 7. The plots show how far each test scales up to 512 parallel processes. The HPL result at 512 parallel processes corresponds to 82.2% (= 4.902/5.96 × 100) of the theoretical peak performance 1.

For comparison with Table 1, Table 2 lists the HPCC results of a system with a similar hardware configuration, the Intel Endeavor cluster (hereafter Endeavor), taken from the HPCC web site 1). BX900 is on a par with Endeavor in most tests but falls below it in some. Endeavor uses the Intel compiler, so Table 3 lists results obtained on BX900 with the Intel compiler in place of the Fujitsu compiler. In particular STREAM, which is measured within a node, is about 20% lower on BX900 than on Endeavor (3.5 GB/s versus 4.2 GB/s). This point is examined in 4.2.2.

1 On the whole BX900 system (17,072 cores), HPL reached 191.4 Tflops against a theoretical peak performance of 200.0 Tflops, an efficiency of 95.66%.
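The efficiency figures above are ratios of measured HPL performance to theoretical peak. The sketch below recomputes them in Python, assuming a per-core peak of 2.93 GHz × 4 operations per cycle and reading the parenthetical for the 512-process run as 4.902 Tflops over 5.96 Tflops. The function name is illustrative.

```python
def efficiency_percent(measured_tflops, peak_tflops):
    """HPL efficiency: measured performance as a percentage of theoretical peak."""
    return 100.0 * measured_tflops / peak_tflops


CORE_PEAK_GFLOPS = 2.93 * 4                     # 11.72 Gflops per Xeon X5570 core

# Whole system: 17,072 cores, 191.4 Tflops measured (footnote 1).
full_peak_tflops = 17072 * CORE_PEAK_GFLOPS / 1000.0
full_eff = efficiency_percent(191.4, full_peak_tflops)

# 512-process run: figures as quoted in the text.
run512_eff = efficiency_percent(4.902, 5.96)

print(f"full system peak : {full_peak_tflops:.2f} Tflops")
print(f"full system eff. : {full_eff:.2f} %")
print(f"512-process eff. : {run512_eff:.1f} %")
```

The whole-system peak comes out at about 200.08 Tflops, which is why the footnote's 95.66% is slightly lower than 191.4/200.0 would give.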

Table 1 HPCC BMT results on JAEA's BX900 with the Fujitsu compiler

Parallel  Comm.   G-HPL      G-PTRANS  G-RandomAccess  G-RandomAccess  G-FFTE    EP-STREAM  EP-STREAM  EP-DGEMM  RandomRing  RandomRing
count     method                       (original)      (SANDIA_OPT2)             Sys (*)    Triad                Bandwidth   Latency
                  (TFlops)   (GB/s)    (Gups)          (Gups)          (GFlops)  (GB/s)     (GB/s)     (GFlops)  (GB/s)      (usec)
8         rdma    0.0813151  2.9330    0.015112        0.089847        5.5533    28.074     3.5093     11.0184   1.11715     1.268
8         nordma  0.0812261  2.9544    0.015504        0.090139        5.5757    27.813     3.4767     11.0239   1.10535     1.278
16        rdma    0.1579130  5.1054    0.023885        0.151071        9.4469    55.514     3.4697     11.1145   0.92418     1.493
16        nordma  0.1536650  4.9108    0.022227        0.133443        9.7508    55.568     3.4730     11.1123   0.78781     3.245
32        rdma    0.3224930  9.0716    0.030751        0.242906        18.6357   111.164    3.4739     11.1010   0.52206     1.944
32        nordma  0.3234940  8.4422    0.035001        0.237182        19.0981   112.052    3.5016     11.1003   0.51949     5.010
64        rdma    0.6201260  15.0637   0.032786        0.378769        36.9923   222.203    3.4719     11.0924   0.41924     3.156
64        nordma  0.6098710  14.6868   0.076346        0.388266        38.1397   222.600    3.4781     11.0887   0.44542     6.527
128       rdma    1.2062700  26.3037   0.037005        0.518656        73.0757   444.264    3.4708     11.1007   0.38892     4.635
128       nordma  1.2265800  30.2225   0.073652        0.616532        74.2721   444.063    3.4692     11.1005   0.41617     8.290
256       rdma    2.4166200  58.3348   0.029897        0.538761        139.3770  888.453    3.4705     11.0910   0.35835     5.937
256       nordma  2.3638900  57.5198   0.076556        0.800348        148.4550  895.037    3.4962     11.0879   0.40323     10.372
512       rdma    4.6846200  111.9940  0.016391        0.262220        268.5910  1777.951   3.4726     11.1147   0.32838     7.113
512       nordma  4.9017400  109.9150  0.082799        1.537610        275.5280  1816.433   3.5477     11.1119   0.35721     12.790

System: Fujitsu BX900 (512 cores, 128 CPUs). Chip: Xeon X5570 2.93 GHz. Memory: DDR3-1066. SMT: ON. Turbo Mode: OFF. Interconnect: QDR InfiniBand, 2 sockets/node, Fat Tree. OS: Red Hat EL 5. Compiler: Fujitsu C/C++. Options: -Kfast -SSL2. MPI: Fujitsu Parallelnavi Language Package. HPCC version: V1.3.1.
(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × (parallel count).

Fig. 4 HPL on BX900

Fig. 5 PTRANS on BX900

Fig. 6 Random Access on BX900

Fig. 7 FFTE on BX900

Table 2 Intel Endeavor cluster (HPCC Web 1))

Table 3 HPCC BMT results on BX900 with the Intel compiler

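3.2.2 FX1

The HPCC BMT results for FX1 are listed in Table 4, and the scalability of HPL, PTRANS, Random Access and FFTE is plotted in Fig. 8 to Fig. 11. The HPL result at 512 parallel processes corresponds to 75.5% (= 3.776/5.0 × 100) 2.

For Table 4, see also Appendix A regarding FFTE and DGEMM.

2 FX1, 1200: 12.9 Tflops, 11.6 Tflops, 90.37%.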

Table 4 HPCC BMT results on FX1

Fig. 8 HPL on FX1

Fig. 9 PTRANS on FX1

Fig. 10 Random Access on FX1

Fig. 11 FFTE on FX1

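3.3 Summary

Fig. 12 compares the results of BX900 and FX1 with those of the former system (Altix3700Bx2). Each item is scaled so that the best value is 100 (for Random Ring Latency, the smallest value). The comparison shows that BX900 is at or near the top in most items.

Comparing BX900 and FX1, differences that depend on memory bandwidth are seen in STREAM and FFTE. STREAM is discussed in detail in 4.2.1 and FFTE in 4.3.

The Random Access value of Altix3700Bx2 is high, but the HPCC BMT version used on Altix3700Bx2 differs from the one used here in how the data are handled, so the values cannot be compared directly. The high FFTE value of Altix3700Bx2 is attributed to the characteristics of that system.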

[Figure: chart comparing BX900, FX1 and Altix3700Bx2 at 512 parallel processes (Flat MPI) on a scale of 0 to 100. Items and the values shown with them: G-HPL (4.90 Tflops), G-RandomAccess (1.54 Gups), G-PTRANS (112.0 GB/s), EP-STREAM (4.36 GB/s), EP-DGEMM (11.1 Gflops), G-FFTE (275.5 Gflops), Random Ring Bandwidth (0.359 GB/s), Random Ring Latency (5.18 usec).]

Fig. 12 Comparison of HPCC BMT results

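4 Tunings for BX900 and FX1

The measurements in 3.2 showed that some results on BX900 and FX1 fall short of what the hardware configuration would suggest. These cases were examined with the Fujitsu profiler. This section describes tuning by compiler options and tuning by modifying the source program.

4.1 Profiler

The profiler used on BX900 and FX1 is a Fujitsu product. It collects profile data while the program runs and analyzes the collected data after the run, so the parts of a program that dominate the run time can be identified.

To use the profiler, the program is built as usual with the Fujitsu compiler and its tuning options. The execution information is collected with the fpcoll command, and the collected profile data are then analyzed and displayed with the fprof command. The profiler provides the following eight kinds of information.

Profile data collection information, elapsed time information, user-specified interval information, MPI library elapsed time information, cost information, hardware monitor information, call information and source information.

The hardware monitor information covers CPU-level quantities such as the computation rate and cache misses, and it is needed to judge memory access and data transfer efficiency.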

4.2 Automatic tuning of memory access

In the HPCC BMT on BX900, STREAM was about 20% lower than on Endeavor (3.5 GB/s versus 4.2 GB/s), as noted in 3.2.1. To find the cause, the original STREAM programs (C version and Fortran version) were examined in their thread-parallel and MPI-parallel forms. As described below, the results improved with the BX900 compiler option -Kntst, so the difference is attributed to the difference between the compiler used on Endeavor (Intel) and the compiler used on BX900 (Fujitsu).

4.2.1 Efficacy for Original version of STREAM benchmark

Table 5 shows the results of the original STREAM on BX900.

For the Fortran version, both the thread-parallel version and the MPI-parallel version improved with the compiler option -Kntst. The option is effective for DO loops whose data are not reused, because the stored data no longer pass through the cache. It is less effective when the stored data are used again. It is also the kind of option that the compiler cannot apply safely by itself to every DO loop.

For the C version, the Intel compiler reaches 4.2 GB/s per core even in the OpenMP thread-parallel version, whereas the Fujitsu compiler does not reach the level of the Intel compiler in the thread-parallel version. With the Fujitsu compiler, tuning in the thread-parallel version is therefore limited.

Table 6 shows the results of the original STREAM on FX1.

For the Fortran version, both the thread-parallel version and the MPI-parallel version improved with -Kntst. The Triad value (13.2 GB/s per CPU) is close to the value published by Fujitsu on the web (13.5 GB/s per CPU). For the C version, -Kntst was not as effective as for the Fortran version. On FX1 the C compiler, unlike the one on BX900, does not benefit from -Kntst, so the tuning done by the FX1 C compiler probably leaves room for improvement.
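For reference, the four STREAM kernels and the way a bandwidth figure is derived from them can be sketched as follows. This is a plain Python illustration with illustrative names, not the STREAM source. The byte counts follow the usual STREAM convention of two arrays for Copy and Scale and three arrays for Add and Triad.

```python
import time

WORD_BYTES = 8  # one double-precision value


def run_stream(n=100_000, scalar=3.0):
    """Run Copy, Scale, Add and Triad once; return the arrays and GB/s per kernel."""
    a = [1.0] * n
    b = [2.0] * n
    c = [0.0] * n
    rates = {}

    def record(name, arrays_touched, start):
        elapsed = max(time.perf_counter() - start, 1e-9)
        rates[name] = arrays_touched * WORD_BYTES * n / elapsed / 1e9

    t = time.perf_counter()
    for i in range(n):              # Copy:  c = a
        c[i] = a[i]
    record("Copy", 2, t)

    t = time.perf_counter()
    for i in range(n):              # Scale: b = s * c
        b[i] = scalar * c[i]
    record("Scale", 2, t)

    t = time.perf_counter()
    for i in range(n):              # Add:   c = a + b
        c[i] = a[i] + b[i]
    record("Add", 3, t)

    t = time.perf_counter()
    for i in range(n):              # Triad: a = b + s * c
        a[i] = b[i] + scalar * c[i]
    record("Triad", 3, t)

    return a, b, c, rates


a, b, c, rates = run_stream()
for name, gbs in rates.items():
    print(f"{name:5s} {gbs:8.4f} GB/s")
```

In an interpreted language the loop overhead dominates, so the printed rates say nothing about memory hardware. The compiled C and Fortran versions discussed above are what actually stress the memory system.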

Table 5 STREAM on BX900

Table 6 STREAM on FX1

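4.2.2 Efficacy for HPCC version of STREAM benchmark

Table 7 shows the STREAM results of the HPCC BMT on BX900 obtained in 3.2.1 together with results obtained with tuned compiler options. The tuned options add -Kntst, which makes the compiler use store instructions that do not go through the cache for array data, and -Krestp=all, which tells the compiler that pointers do not overlap.

The runs differ in the number of chassis over which the nodes were allocated, because the nodes available at run time were not the same. Since STREAM is measured within a node, with one process per core, the same compiler options can be compared across these runs regardless of the parallel count and the communication method.

As with the original STREAM, the test is dominated by DO-type loops that stream through memory without reusing data, and the results show that the option -Kntst raises the memory performance.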

Table 7 STREAM results in the HPCC BMT (MPI-parallel version, C version)

                   Compiler options (1) *1            Compiler options (2) *2
Parallel  Comm.    Chassis  EP-STREAM Triad (GB/s)    Chassis  EP-STREAM Triad (GB/s)
count     method
32        rdma     1        3.4739                    -        -
32        nordma   1        3.5016                    1        4.3316
128       rdma     3        3.4708                    -        -
128       nordma   3        3.4692                    1        4.3572
512       rdma     5        3.4726                    4        4.3595
512       nordma   4        3.5477                    4        4.3424

*1 Fujitsu compiler options (1): -Kfast,packed,ntst -SSL2
*2 Fujitsu compiler options (2): -Kfast,packed,ntst,restp=all -SSL2
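One common explanation for a gain of this size is write-allocate traffic: with ordinary stores, a cache line is read from memory before it is overwritten, so Triad moves four words per iteration although STREAM credits only three. The Python sketch below works through that accounting and compares the resulting upper bound with the gain seen in Table 7 at 512 parallel processes (rdma). That write-allocate traffic is the mechanism behind the improvement is an assumption made here for illustration. The text itself only ties the gain to the store-related compiler option.

```python
WORD_BYTES = 8  # one double-precision value


def triad_bytes_per_iteration(write_allocate):
    """Memory traffic of one Triad iteration, a[i] = b[i] + s * c[i]."""
    loads = 2 * WORD_BYTES                      # b[i] and c[i]
    store = WORD_BYTES                          # a[i]
    allocate_read = WORD_BYTES if write_allocate else 0
    return loads + store + allocate_read


counted = triad_bytes_per_iteration(write_allocate=False)  # what STREAM credits
moved = triad_bytes_per_iteration(write_allocate=True)     # with write-allocate
upper_bound_gain = moved / counted

# Table 7, 512 parallel, rdma: options (1) versus options (2).
observed_gain = 4.3595 / 3.4726

print(f"bytes credited per iteration : {counted}")
print(f"bytes moved with allocate    : {moved}")
print(f"upper bound on the gain      : {upper_bound_gain:.3f}")
print(f"observed gain                : {observed_gain:.3f}")
```

The observed gain of about 1.26 stays below the bound of 4/3, which is at least consistent with this reading.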

4.3 Manual tuning of memory access

As shown in 3.2, the FFTE results on BX900 and FX1 fell short of what the hardware performance would suggest. To find the cause, the program was examined with the profiler and then tuned by hand.

The profiler results for FFTE runs at 32 parallel processes on both BX900 and FX1 indicated that memory access is very likely the bottleneck. Starting from the original FFTE source (Fortran version), a modified version was prepared in which the parts identified with the profiler were rewritten, and the original and the modified versions were run on BX900 and FX1 with varied compiler options. The aims were (1) to compare the Fujitsu compiler with the Intel compiler used on Endeavor, and (2) to confirm the effect of tuning by modifying the source in addition to tuning by the compiler.

Three parts were modified. They are shown for the original version in Fig. 13 and for the modified version in Fig. 14, and each is outlined below.

Part (1) in Fig. 13 and Fig. 14 is a loop that performs a computation on the arrays. In the original version the data of the arrays are accessed twice. In the modified version the loop is processed block by block through the work array C, so that the arrays are accessed once while the block is in cache.

Part (2) in Fig. 13 and Fig. 14 is the transposition of a three-dimensional array. In the original version the array A is accessed three times, whereas in the modified version it is accessed once. The loop nest is sixfold instead of threefold because each dimension is processed in blocks, and the block size is a parameter of the program, set to 16.

Part (3) in Fig. 13 and Fig. 14 is the transposition of a two-dimensional array. In the original version the array A is accessed twice, whereas in the modified version it is accessed once. The loop nest is fourfold instead of twofold because each dimension is processed in blocks. On FX1 the cache behaves differently from that of BX900, and reducing the block size of this part from 16 to 4 improved the performance.

Table 8 shows the results on BX900 and FX1 for the original and the modified versions with the Fujitsu compiler, and on BX900 for the modified version with the Intel compiler.
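The block-size parameter mentioned above can be illustrated with a cache-blocked transposition. The Python sketch below is not the FFTE code. It is a generic blocked transpose of a row-major two-dimensional array, with an illustrative function name and a `block` argument playing the role of the block size that was changed from 16 to 4 on FX1.

```python
def transpose_blocked(a, n_rows, n_cols, block=16):
    """Transpose a flat row-major n_rows x n_cols array, one tile at a time.

    The two outer loops walk over tiles of size block x block and the two
    inner loops walk inside one tile, so a twofold loop nest becomes fourfold.
    """
    out = [0.0] * (n_rows * n_cols)
    for ib in range(0, n_rows, block):
        for jb in range(0, n_cols, block):
            for i in range(ib, min(ib + block, n_rows)):
                for j in range(jb, min(jb + block, n_cols)):
                    out[j * n_rows + i] = a[i * n_cols + j]
    return out


def transpose_naive(a, n_rows, n_cols):
    """Reference transposition with a plain twofold loop nest."""
    return [a[i * n_cols + j] for j in range(n_cols) for i in range(n_rows)]


rows, cols = 5, 7
matrix = [float(k) for k in range(rows * cols)]
for block in (4, 16):
    same = transpose_blocked(matrix, rows, cols, block) == transpose_naive(matrix, rows, cols)
    print(f"block={block:2d}  matches naive transpose: {same}")
```

The result does not depend on the block size. Only the order of memory accesses changes, which is why the best value can differ between machines with different cache organizations.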

[Figure: Fortran source listing with the modified parts (1), (2) and (3) marked.]

Fig. 13 Program of the modified parts, FFTE original version

[Figure: Fortran source listing with the modified parts (1), (2) and (3) marked.]

Fig. 14 Program of the modified parts, FFTE modified version

Table 8 FFTE Ver 4.1, Full and kernel (32 parallel, 180 MB/Proc)

On BX900, the original version with the Fujitsu compiler, the modified version with the Fujitsu compiler and the modified version with the Intel compiler perform at almost the same level. The modification has little effect, and the difference between the compilers is small. Reducing the block size of part (3) in Fig. 14 (block size: 16 to 4) does not help on BX900, which suggests that the cache is already used effectively there.

On FX1, the modified version with the Fujitsu compiler is clearly faster than the original version with the Fujitsu compiler, and the improvement is larger than on BX900. Comparing the original and the modified versions part by part, the loop of part (1) in Fig. 14 improves by a factor of 1.6 on BX900 but by a factor of 4.0 on FX1. Parts (2) and (3) in Fig. 14, the array transpositions, change by factors of 1.1 and 0.8 on BX900, that is, they do not improve, whereas on FX1 they improve by factors of 3.6 and 1.8 (or 1.6).

These results show that the compiler tuning on FX1 is not yet at the level of BX900, and that on FX1 manual tuning is needed to make use of the memory bandwidth.

4.4 Efficacy for NPB2.3 FT benchmark

To check whether the tuning that worked for STREAM in 4.2 and for FFTE in 4.3 also works for other programs, a further program was examined.

The program used is the FT test of NPB (NAS Parallel Benchmarks) 2.3, which, like FFTE in 4.3, computes FFTs with the Stockham algorithm. As for FFTE, a modified version was prepared from the original NPB2.3 FT source (Fortran version, MPI-parallel) by rewriting the parts that dominate the run time, and the original and the modified versions were run with varied compiler options. The problem size was class C ((X, Y, Z) = (512, 512, 512)).

Two parts were modified. They are shown in Fig. 15 and Fig. 16. Both are loops in which array data are stored after a single use, so the option -Kntst can be expected to be effective, and the effect of this option was measured. The block size was set to 16 and to 64.
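As background for the FT test, the following Python sketch shows a textbook radix-2 Stockham autosort FFT. It is not the NPB or FFTE code, and the function names are illustrative. It only shows the defining property of the Stockham scheme: each pass reads one buffer and writes the other in its final order, so no bit-reversal step is needed, at the price of streaming through the whole array on every pass.

```python
import cmath


def stockham_fft(x):
    """Radix-2 Stockham autosort FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    src = [complex(v) for v in x]
    dst = [0j] * n
    length = n      # length of the sub-transforms in the current pass
    stride = 1      # number of interleaved sub-sequences
    while length > 1:
        half = length // 2
        for p in range(half):
            w = cmath.exp(-2j * cmath.pi * p / length)   # twiddle factor
            for q in range(stride):
                a = src[q + stride * p]
                b = src[q + stride * (p + half)]
                dst[q + stride * (2 * p)] = a + b
                dst[q + stride * (2 * p + 1)] = (a - b) * w
        src, dst = dst, src          # ping-pong between the two buffers
        length = half
        stride *= 2
    return src


def naive_dft(x):
    """Direct O(n^2) discrete Fourier transform, used only as a reference."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]


signal = [complex(k % 3, -k) for k in range(8)]
fast = stockham_fft(signal)
slow = naive_dft(signal)
max_err = max(abs(f - s) for f, s in zip(fast, slow))
print(f"largest difference from the direct DFT: {max_err:.2e}")
```

Because every pass touches the full input and output arrays, the run time of such a transform on large data is governed largely by memory traffic, which is the aspect the tuning in this section addresses.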

JAEA-Testing 2011-005

13

|ucp|t^ro__tAacute^OOOsOwAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoSp]|trFRFY]p^top]qrsYtpYop]qrsJJDp_Q|^bbqpc^ta|rotpbctcb|urc`Q^uptatar^]|tbtarp|t]|tbccbwOa^qrR^boocbtoabccbw`Et^b|rtabtFhPU`Ar_pcrobqq^SJJDItpo]rc_pcJJDOtarbccbw|tur^^t^bq^ruwobqq^SJJDI^taEorttpMbUrttpUQOarcrUQ^tarbs^|bq|rp_U_pcbwo|ur|rtobqq`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY^]q^o^tpr^oq|r[qpubq`a[^oq|r[]^_`a[^trrc^OOO^O~OqOsO^rccOr^trrc__tuqpoO__tuqpo]b]bcbrtrcAacute__tuqpohdnO__tuqpo]bhddAtilde auml^_ccedilY-a5Kd]bcbrtrcAacute__tuqpohKdO__tuqpo]bhKXAtilde]bcbrtrcAacutehiOhmKPAtildep|uqrop]qrssOwO|p|uqr]cro^^prqOrqrOrq]^r^psAacute__tuqpo]bOAtildeOwAacute__tuqpo]bOAtildeO|AacuteAtildeobqqUHExE^tAacute^rccAtildesAacutevOvAtildehAacuteK`MMOM`MMAtildewAacutevOvAtildehAacuteK`MMOM`MMAtilde|AacutevAtildehAacuteK`MMOM`MMAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoHrc_pcprbc^btp_tartpoabJJD`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYobqq]^xubcc^rcAacute]^xopxpcqO^rccAtilderqh]^xt^rAacuteAtildep^hKOKMMM^hY^pqhKOOPobqq__tPAacute^OqOOO__tuqpoO__tuqpo]bO|OsOwAtilde^_Aacuteq`r`AtildeptpKdMobqq__tPAacute^OqfKOOO__tuqpoO__tuqpo]bO|OwOsAtilderpptpKXMoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoSp]wRtpQ`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYKdMopt^|rp~hKOp^hKO__tuqposAacute^O~AtildehwAacute^O~Atilde aumlIacutecurrenLfgO=PrprpKXMopt^|rrpobqq]^xubcc^rcAacute]^xopxpcqO^rccAtilderqrh]^xt^rAacuteAtilderq]hrqrYrqobqqUHExSpxcbAacuteUHExSCUUxTCBVLOrO^rccAtilde^_Aacuter`r`MAtildetarc^trAacutedOWAtilde[Gqb]rD^ruw|u`__tPAacutero`Atildev[Orq]r^_obqqUHExJ^bq^rAacute^rccAtildetp]

Fig 15 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- cfftz)


|ucp|t^r__tPAacute^OqOOOwOwKO|OsOwAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoHrc_pctarVYta^trcbt^pp_tarropbc^btp_tartpoabJJD`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY^]q^o^tpr^trrc^OOqOOOwOwKOKOq^Oq~OqO|O^O~O^KKO^KPO^PKO^PPp|uqrop]qrs|OsOwO|KOsKKOsPK^r^p|AacuteAtildeOsAacutewKOAtildeOwAacutewKOAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYort^^t^bq]bcbrtrc`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYKhePqhPWWAacuteqYKAtildeq^hPWWAacuteYqAtildeq~hPWq|hq^fKp^hMOq^YK^KKh^WqfK^KPh^KKfK^PKh^Wq~fK^PPh^PKfq^_Aacute^`r`KAtildetar|Kh|Aacute|f^Atilderqr|Khop~Aacute|Aacute|f^AtildeAtilder^_oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoDa^qpp]^rotpc^buqr`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYphMOqYKp~hKOwsKKhsAacute~O^KKfAtildesPKhsAacute~O^KPfAtildewAacute~O^PKfAtildehsKKfsPK aumlIacutecurrenLOO=PwAacute~O^PPfAtildeh|KWAacutesKKYsPKAtilderprprpcrt|cr

Fig 16 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- fftz2)


UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc


Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT


5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

The number of read/write calls issued by one process is (blockSize ÷ transferSize × segmentCount), and the file size is (blockSize × segmentCount × number of MPI processes). For example, with transferSize=1kB, blockSize=1MiB (10^6 Bytes is written MB and 2^20 Bytes is written MiB), segmentCount=1 and 32 MPI processes, a testFile of blockSize × segmentCount × number of processes = 32MiB is created, and each process transfers its 1MiB region in 1KB units, 1024 times.

[transferSize] Size in bytes of one read/write transfer. It must be a multiple of sizeof(long long int), and blockSize must be a multiple of it.

[blockSize] Size in bytes of the contiguous region accessed by each process. It is 1MiB here.
[segmentCount] Number of segments in the file. The default is 1.
[randomOffset] Set to 1 for random access. The default is 0.
[fsync] Call fsync after a POSIX write. The default is 0.
[filePerProc] Use a separate file for each process. Valid for POSIX only. The default is 0.

[useFileView] Use MPI_File_set_view() with MPI_File_write() and MPI_File_read(). Cannot be combined with random access. The default is 0.

[collective] Use MPI_File_write_at_all() and MPI_File_read_at_all(). Cannot be combined with random access. The default is 0.

Fig 18 IOR YacircueZ$
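The size relations given for the IOR parameters can be checked with a short calculation. The following is a minimal Python sketch, not part of IOR itself: the helper name `ior_sizes` and its return keys are hypothetical, and it assumes that 1KB means 1024 bytes, as implied by the worked example in which a 1MiB block is covered by 1024 transfers.

```python
def ior_sizes(transfer_size, block_size, segment_count, num_procs):
    """Return the per-process transfer count and the aggregate file size.

    Hypothetical helper for illustration only. It reproduces the
    arithmetic described in the text and does not call IOR.
    """
    if block_size % transfer_size != 0:
        raise ValueError("blockSize must be a multiple of transferSize")
    # Each process issues blockSize / transferSize calls per segment.
    transfers_per_proc = (block_size // transfer_size) * segment_count
    # Every process contributes blockSize bytes to every segment.
    file_size = block_size * segment_count * num_procs
    return {"transfers_per_proc": transfers_per_proc, "file_size": file_size}


KIB = 1024
MIB = 1024 * 1024

example = ior_sizes(transfer_size=1 * KIB, block_size=1 * MIB,
                    segment_count=1, num_procs=32)
print(example["transfers_per_proc"])   # 1024
print(example["file_size"] // MIB)     # 32
```

With the values of the example in the text (1KB transfers, 1MiB blocks, one segment, 32 processes) the sketch gives 1024 transfers per process and a 32MiB file.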


512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

Table 11 MPIIO Info UumlYacuteNtildeY` key value

cb_buffer_size 4194304 $amp()

+

cb_nodes -012345

6789lt

=gt-012A

B

ind_rd_buffer_size 4194304 ACDEFGF$amp()+

ind_wr_buffer_size 524288 ACDH+GF$amp()+


52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

[Diagram: ETERNUS DX80 storage (970TB), IO servers (SPARC Enterprise M9000) and the SRFS components, with the Read and Write paths marked.]

Fig 19 ReadWriteumlcopy

Table 12 (POSIX)
[Table body not recoverable from the extracted text; the rows are transfer sizes from 1024 to 1048576 bytes.]

Table 13 (MPIIO)
[Table body not recoverable from the extracted text; the rows are transfer sizes from 1024 to 1048576 bytes.]

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

[Plots for home and work: Performance [MiB/s] (1 to 16384) versus TransferSize [B] (1024 to 1048576), sequential and random access; curves for Read and Write, SingleFile and FilePerProc, with N=32, 256 and 1024.]

Fig 20 POSIX

[Plots for home and work: Performance [MiB/s] (0.25 to 4096) versus TransferSize [B] (1024 to 1048576), sequential and random access; curves for use view and collective, with N=32 and N=256.]

Fig 21 MPIIO


Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`
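The effect of the I/O buffer size discussed in 533 can be illustrated without a file system. The following is a minimal sketch in Python that stands in for the C setvbuf mechanism rather than reproducing it: `CountingSink` and `count_raw_writes` are hypothetical names, the 8192-byte buffer mirrors the default quoted in the text, and the 8MiB buffer is simply a large value chosen for contrast.

```python
import io


class CountingSink(io.RawIOBase):
    """In-memory sink that counts the write() calls reaching the raw layer."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.nbytes = 0

    def writable(self):
        return True

    def write(self, data):
        # Called by the buffering layer each time it drains its buffer.
        self.calls += 1
        self.nbytes += len(data)
        return len(data)


def count_raw_writes(buffer_size, record_size=1024, records=1024):
    """Write `records` records of `record_size` bytes through a buffer."""
    sink = CountingSink()
    writer = io.BufferedWriter(sink, buffer_size=buffer_size)
    record = b"x" * record_size
    for _ in range(records):
        writer.write(record)
    writer.flush()
    return sink.calls, sink.nbytes


small_calls, small_bytes = count_raw_writes(buffer_size=8192)
large_calls, large_bytes = count_raw_writes(buffer_size=8 * 1024 * 1024)
print(small_calls, large_calls)
```

The same 1MiB of data reaches the sink in both cases, but the larger buffer needs fewer raw writes, which is the reason small transfers benefit from a larger buffer.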

[Diagram: IO server 1 (io1) with ETERNUS DX80 1-18 (dx01 to dx18), each unit holding data_01 to data_05, home and work areas; IO server 2 (io2) with ETERNUS DX80 19-36 (dx19 to dx36), each unit holding data_06 to data_10, data or sysdata, and work areas.]

Fig 22

Table 15 OcircOtilde7Y`|J

SRFS file system   QFS file system   Unit size     Total capacity                DAU     Remarks
data_01            QFS_data_01       2109888 MB    2109888 × 36 = 72.4TB         1024k
data_02            QFS_data_02       2109888 MB    2109888 × 36 = 72.4TB         1024k
data_03            QFS_data_03       2109888 MB    2109888 × 36 = 72.4TB         1024k
data_04            QFS_data_04       2109888 MB    2109888 × 36 = 72.4TB         1024k
data_05            QFS_data_05       3695488 MB    3695488 × 36 = 126.9TB        1024k
home               QFS_home          3695488 MB    3695488 × 36 = 126.9TB        512k
work                                 524288 MB                                           io2

SRFS file system   QFS file system   Unit size     Total capacity                DAU     Remarks
data_06            QFS_data_06       2109888 MB    2109888 × 36 = 72.4TB         1024k
data_07            QFS_data_07       2109888 MB    2109888 × 36 = 72.4TB         1024k
data_08            QFS_data_08       2109888 MB    2109888 × 36 = 72.4TB         1024k
data_09            QFS_data_09       2109888 MB    2109888 × 36 = 72.4TB         1024k
data_10            QFS_data_10       3695488 MB    3695488 × 36 = 126.9TB        1024k
data               QFS_data          3695488 MB    3695488 × 24 = 84.6TB         1024k
sysdata            QFS_sysdata       3695488 MB    3695488 × 12 = 42.3TB         16k     (illegible)
work               QFS_work          524288 MB     524288 × 144 = 72TB           1024k   dx01 to dx36
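The totals in Table 15 follow from multiplying the unit size by the number of units. The following is a minimal Python sketch of that check. It assumes binary units (1TB = 1024 × 1024 MB), which is the reading that reproduces the figures in the table; the helper name `total_tb` is hypothetical.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units, 1 TB = 1024 * 1024 MB

# (file system, unit size in MB, number of units) as read from Table 15
ROWS = [
    ("data_01", 2109888, 36),
    ("data_05", 3695488, 36),
    ("data", 3695488, 24),
    ("sysdata", 3695488, 12),
    ("work", 524288, 144),
]


def total_tb(size_mb, units):
    """Total capacity in TB for `units` units of `size_mb` MB each."""
    return size_mb * units / MB_PER_TB


for name, size_mb, units in ROWS:
    print(f"{name:8s} {total_tb(size_mb, units):6.1f} TB")
```

Rounded to one decimal place, the results are 72.4, 126.9, 84.6, 42.3 and 72.0 TB, matching the table.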


6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 InfiniBand QDRegraveBCD Fat Treeicircgtlaquoq113aelig13S 18iumlIumlAEligicircgtlaquoordf113aelig13 LeafWdividenaccedil7Iumlordf 2Igravecenttimeseumln1accedil7IumlcdoiumlIuml QDR ordf 1 7gtYZeumlnqaccedil7Iuml 2 Igravelaquogt1 iumlIuml 2 QDR ordfYZeumln(QDR Iacutegt 18 j2Igravegt 36WX)qoiumlIuml`ZaringbrvbarsectT 8GBs (4GBsj2) AAEligpoundq 13aelig13 5PDHLeafaccedilyacute7 SpineWdividenaccedilordf

9iexcllaquosecto LeafaccedilGu Spineaccedil QDRgtYZeumlnqbrvbarpound 113aelig13cd SpineIacutegt 9j2Igrave=18 QDRordfegraveBeumlnq QDRordf 5PoumlLeaf accedil3xmacrDownlink ordf 36 Uplink ordf 18 Xgt Uplinkshy 50WXsectmn 50_ccedilaringWdividenotgtqBX900gta$fY`Ecircfrac34h Fig 23GqBX900gt 50_ccedilaring13aelig13ocircotilde gtthornordf]GqSpineLeafaccedilW13aelig13SoiumlIumlYZGEcircfrac34h Fig 24Gqmmgt13aelig13S nAumliumlIuml nAuml(niquest9)UgraveD nU9Auml(nAgrave9) SpineaccedilegraveBGqlaquo13aelig13laquoiumlIumlordfcent13aelig13iumlIumlcdOcirc$ltAacuteoumlSpineaccedilcd Leafaccedil(2Igrave) QDR2(1j2Igrave)ordfegraveBeumln(2Igrave)LeafaccedilcdiumlIumlQDR2gtYZeumlnqDotildeCmSpineaccedilcd Leaf accedilordmAcirc13aelig13SPAtildeWiumlIumlWnoteumlnqmDHyenwiumlIumlTEacuteOcirc$lt GoumlWiumlIuml 1 W 10 ordflt GoumlWgtn~n FAringordf+XqX13aelig13Sotilde gtiuml_ccedilaring

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

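[Diagram: chassis 1 to chassis 119; each chassis has two Leaf switches (Leaf(1-1) and Leaf(1-2) through Leaf(119-1) and Leaf(119-2)) serving its nodes (bx0001, bx0002, ..., bx0018, ..., bx2125, bx2126), and the Leaf switches connect to Spine(1), Spine(2), ..., Spine(9).]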

Fig 23 BX900a$fY`icirc

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq


[Diagram: the nodes of one chassis (1 chassis = 18 nodes, numbered 1 to 18) connect to a Leaf switch (SW), which connects to the Spine switches SW 1 to SW 9.]

Fig 24 iumlIumlWaccedilatildeAcirc
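The link counts quoted in 611 can be reproduced with simple arithmetic. The following is a minimal Python sketch based on the figures as read from the text (18 nodes per chassis, 2 Leaf switches per chassis, 9 Spine switches, 4GB/s per QDR link); the variable names are hypothetical and the sketch models only the counting, not the routing.

```python
NODES_PER_CHASSIS = 18
LEAF_PER_CHASSIS = 2
SPINE_SWITCHES = 9
QDR_GB_PER_S = 4  # per-link figure quoted in the text

# Each node has one QDR link to each Leaf switch of its chassis.
downlinks = NODES_PER_CHASSIS * LEAF_PER_CHASSIS
# Each Leaf switch has one QDR link to each Spine switch.
uplinks = SPINE_SWITCHES * LEAF_PER_CHASSIS
# A node can use both of its links at once.
node_bandwidth = QDR_GB_PER_S * LEAF_PER_CHASSIS
uplink_ratio = uplinks / downlinks

print(downlinks, uplinks, node_bandwidth, uplink_ratio)
```

This gives 36 Downlinks and 18 Uplinks per chassis, 8GB/s per node and an Uplink ratio of 50%, in line with the values stated in the section.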

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the send method
Case         RDMA           RDMA           NORDMA
Processes    up to 1024     1025 or more   all
Threshold    32768 bytes    16384 bytes    524288 bytes

Table 17 Switching between the Normal Send method and the Send on Request method in the RDMA method
Message size             0 to less than 16KB   16KB to less than 32KB   32KB or more
1 to 1024 processes      Normal Send           Normal Send              Send on Request
1025 to 2048 processes   Normal Send           Send on Request          Send on Request
2049 or more processes   becomes NORDMA        Send on Request          Send on Request

Table 18 Switching between the Normal Send method and the Send on Request method in the NORDMA method
Message size    0 to less than 512KB   512KB or more
All processes   Normal Send            Send on Request

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq
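The threshold rule of Table 16 can be written as a small selection function. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, messages below the threshold use Normal Send and messages at or above it use Send on Request, and the RDMA case is restricted to at most 2048 processes because the row for 2049 or more processes in Table 17 is not fully legible. It does not reproduce the MPI library's internal logic.

```python
def threshold_bytes(mode, num_procs):
    """Switching threshold in bytes, following Table 16."""
    if mode == "NORDMA":
        return 524288
    if mode == "RDMA":
        if num_procs > 2048:
            # Not modelled here: the corresponding table row is unclear.
            raise NotImplementedError("RDMA above 2048 processes")
        return 32768 if num_procs <= 1024 else 16384
    raise ValueError(f"unknown mode: {mode}")


def send_method(mode, num_procs, message_bytes):
    """Return the send method chosen for a message of the given size."""
    if message_bytes < threshold_bytes(mode, num_procs):
        return "Normal Send"
    return "Send on Request"


print(send_method("RDMA", 512, 16 * 1024))
print(send_method("RDMA", 2048, 16 * 1024))
print(send_method("NORDMA", 4096, 100 * 1024))
```

For example, a 16KB message is below the 32768-byte threshold with 512 processes but reaches the 16384-byte threshold with 2048 processes, so the method changes between the two runs.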


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Ocirc$acirc

Processes   Allreduce, Reduce, Bcast, SendRecv   Other MPI communication
256         32 bytes                              128 bytes
512         same as above                         128 bytes
1024        same as above                         256 bytes
2048        same as above                         2048 bytes
4096        same as above                         2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()              ! measurement start
        call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                     b,nelem,MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD,ierr)
        ttime2 = MPI_WTIME()              ! measurement end
        ttime0 = ttime2 - ttime1
        call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                  MPI_COMM_WORLD,ierr)
        if(myid.eq.0) then
          max_time = max(max_time,ttime)
          min_time = min(min_time,ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig 25 ^_ZLIacutecurrenP
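The post-processing of the repeated measurements can be sketched as follows. This is a minimal Python illustration, assuming (as the text of 621 appears to say) that each MPI function is timed 20 times and that the first, unusually slow, repetition is excluded. The function name `summarize` and the sample values are hypothetical, and which statistic the report finally plots is not asserted here.

```python
def summarize(times, discard=1):
    """Summarize repeated timings after dropping the first `discard` values."""
    kept = times[discard:]
    if not kept:
        raise ValueError("no samples left after discarding")
    return {
        "count": len(kept),
        "min": min(kept),
        "max": max(kept),
        "mean": sum(kept) / len(kept),
    }


# Hypothetical data: a slow first call followed by 19 steady repetitions.
samples = [9.0] + [1.0] * 19
stats = summarize(samples)
print(stats)
```

Dropping the first sample keeps a one-off start-up cost from inflating the maximum and the mean of the remaining 19 repetitions.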


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Plots: measured time versus message size (0 to 80000) for rdma and Nordma with 512 processes; panels: Allreduce, Reduce, Reduce_scatter, Allgather, Gather, Allgatherv.]

Fig 26 MPI communication (512 processes, 1)

[Plots: measured time versus message size (0 to 80000) for rdma and Nordma with 512 processes; panels: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank0 to Rank511).]

Fig 27 MPI communication (512 processes, 2)


brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq

[Chart: RDMA and NORDMA compared per MPI routine for 256, 512, 1024, 2048 and 4096 parallel; scale 0% to 100%]

Fig. 28 RDMA and NORDMA (data size 0KB~8KB)

63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq


      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()                    ! measurement start (total)
c     post the non-blocking sends and receives to both neighbours
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,
     &               1,MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,
     &               1,MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,
     &               1,MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,
     &               1,MPI_COMM_WORLD,ireq(4),ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()                    ! measurement start (computation)
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1d0
          b1(i,j) = 1d0
          c1(i,j) = 0d0
        end do
      end do
c     matrix product executed while the transfers are in flight
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                    ! measurement end (computation)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()                    ! measurement end (total)
      ttimeDD = ttime8 - ttime7               ! computation time
      ttimeal = ttime6 - ttime5               ! total execution time
      ttimesd = ttimeal - ttimeDD             ! total minus computation time

Fig 29 WO=5PDH^_Z
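The three time differences at the end of the Fig. 29 listing can be restated as a short sketch. The function name `overlap_timings` and the sample timestamps are hypothetical and only illustrate the arithmetic; they are not measurements from the report.

```python
def overlap_timings(t5, t6, t7, t8):
    """Return (compute, total, residual) using the arithmetic of Fig. 29.

    t5: timestamp before the non-blocking calls are posted (ttime5)
    t6: timestamp after mpi_waitall returns (ttime6)
    t7, t8: timestamps around the matrix loops (ttime7, ttime8)
    """
    compute = t8 - t7            # ttimeDD: time spent in the matrix loops
    total = t6 - t5              # ttimeal: whole measured region
    residual = total - compute   # ttimesd: part of the total not covered by computation
    return compute, total, residual


if __name__ == "__main__":
    # Hypothetical timestamps in seconds, chosen only for illustration.
    compute, total, residual = overlap_timings(0.0, 2.5, 0.25, 2.0)
    print(compute, total, residual)
```

A small residual relative to the total suggests that most of the transfer time was hidden behind the computation; a residual close to the stand-alone transfer time suggests little overlap.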


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
c     loop over the partner ranks of rank 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()
c       rank 0 sends NNN messages, rank IIRANK receives them
        do 120 II = 1, NNN
          if(myid.eq.0)
     &      call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                    IIRANK,II,MPI_COMM_WORLD,ierr)
          if(myid.eq.IIRANK)
     &      call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                    0,II,MPI_COMM_WORLD,istatus,ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig. 30 One-to-one communication program (excerpt)
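The loop in Fig. 30 records one elapsed time per partner rank in `timew`. A minimal sketch of how such a time could be turned into a transfer rate follows. It assumes 8-byte elements (MPI_DOUBLE_PRECISION), `nsize` elements per message, `nnn` repetitions, and 1 MB = 2**20 bytes; the function name and the sample numbers are hypothetical, and the report's own conversion is not recoverable from this section.

```python
BYTES_PER_DOUBLE = 8  # size of one MPI_DOUBLE_PRECISION element


def transfer_rate_mb_per_s(nsize, nnn, elapsed_s):
    """Transfer rate for nnn one-way messages of nsize doubles sent in elapsed_s seconds.

    Assumption: 1 MB is taken as 2**20 bytes.
    """
    if elapsed_s <= 0.0:
        raise ValueError("elapsed time must be positive")
    total_bytes = nsize * BYTES_PER_DOUBLE * nnn
    return total_bytes / 2**20 / elapsed_s


if __name__ == "__main__":
    # Hypothetical case: 128 MB messages (16 * 2**20 doubles), 100 repetitions, 4 s elapsed.
    print(transfer_rate_mb_per_s(16 * 2**20, 100, 4.0))
```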


NODE
  CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
  CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig. 31 Assignment of MPI ranks to cores within a node
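The placement in Fig. 31 can be written as a small mapping. This is a sketch with hypothetical function names, and it assumes that ranks are numbered in blocks of eight per node, which the figure itself does not state.

```python
RANKS_PER_NODE = 8  # assumption: one rank per core, 8 cores per node


def core_id_fig31(rank):
    """Core ID of a rank under the Fig. 31 placement."""
    local = rank % RANKS_PER_NODE
    # Local ranks 0-3 take the even core IDs, local ranks 4-7 the odd ones.
    return 2 * local if local < 4 else 2 * (local - 4) + 1


def cpu_of_core(core_id):
    """CPU index as drawn in Fig. 31: even core IDs on CPU0, odd ones on CPU1."""
    return core_id % 2


def same_cpu(rank_a, rank_b):
    """True if both ranks are placed on the same CPU of a node."""
    return cpu_of_core(core_id_fig31(rank_a)) == cpu_of_core(core_id_fig31(rank_b))


if __name__ == "__main__":
    print([core_id_fig31(r) for r in range(8)])
    print(same_cpu(0, 3), same_cpu(0, 7))
```

Under this mapping rank 3 shares a CPU with rank 0, while rank 7 sits on the other CPU of the same node, so the pairs (0, 3) and (0, 7) exercise the intra-CPU and inter-CPU paths respectively.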

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

[Diagram: data transfer among eight processes numbered 0 to 7]

Fig. 32 One-to-one communication: data transfer pattern


Table 20 Transfer rate (within a chassis)

Case                        Flat MPI     Hybrid (72x2)   Hybrid (36x4)   Hybrid (18x8)
Per-process transfer rate   7238 MB/s    14474 MB/s      28942 MB/s      53467 MB/s
Per-node total              57906 MB/s   57899 MB/s      57884 MB/s      53467 MB/s
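The two data rows of Table 20 appear to be related by the number of processes per node. The sketch below recomputes the per-node total from the per-process rate, assuming 8 cores per node (2 CPUs with 4 cores each, as stated for the BX900) and one process per group of OpenMP threads. The dictionary layout and function names are hypothetical; the digits are copied as they appear in the table.

```python
CORES_PER_NODE = 8  # 2 CPUs x 4 cores per node

# case: (OpenMP threads per process, per-process rate, listed per-node total)
TABLE_20 = {
    "flat MPI": (1, 7238, 57906),
    "hybrid 72x2": (2, 14474, 57899),
    "hybrid 36x4": (4, 28942, 57884),
    "hybrid 18x8": (8, 53467, 53467),
}


def node_total(threads_per_process, per_process_rate):
    """Per-node total = processes per node x per-process rate."""
    processes_per_node = CORES_PER_NODE // threads_per_process
    return processes_per_node * per_process_rate


def max_relative_gap(table):
    """Largest relative difference between recomputed and listed totals."""
    return max(
        abs(node_total(threads, rate) - listed) / listed
        for threads, rate, listed in table.values()
    )


if __name__ == "__main__":
    for case, (threads, rate, listed) in TABLE_20.items():
        print(case, node_total(threads, rate), listed)
    print(max_relative_gap(TABLE_20))
```

The recomputed totals agree with the listed ones to within rounding, which supports reading the first row as a per-process figure and the second as its per-node sum.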

Table 21 (between chassis, flat MPI)

        144 MPI                     192 MPI                     224 MPI
Node    Per-process     IB          Per-process     IB          Per-process     IB
        rate (MB/s)     sharing     rate (MB/s)     sharing     rate (MB/s)     sharing
1       7246            -           4670            yes         4671            yes
2       7244            -           4670            yes         4671            yes
3       7246            -           4670            yes         4671            yes
4       7242            -           7243            -           4670            yes
5       7243            -           7244            -           4671            yes
6       7242            -           7240            -           7237            -
7       7241            -           7239            -           7246            -
8       7244            -           7240            -           7245            -
9       7249            -           7243            -           7240            -
10      -               -           4670            yes         4670            yes
11      -               -           4670            yes         4672            yes
12      -               -           4669            yes         4671            yes
13      -               -           -               -           4671            yes
14      -               -           -               -           4670            yes

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq

Table 22 (between chassis, hybrid)

        Hybrid (112x2)              Hybrid (56x4)               Hybrid (28x8)
Node    Per-process     IB          Per-process     IB          Per-process     IB
        rate (MB/s)     sharing     rate (MB/s)     sharing     rate (MB/s)     sharing
1       9339            yes         18567           yes         36408           yes
2       9337            yes         18564           yes         36919           yes
3       9346            yes         18566           yes         36536           yes
4       9347            yes         18570           yes         36457           yes
5       9342            yes         18651           yes         36646           yes
6       14385           -           28606           -           53808           -
7       14488           -           29118           -           53779           -
8       14500           -           28676           -           53789           -
9       14495           -           29100           -           53784           -
10      9337            yes         18797           yes         36408           yes
11      9340            yes         18775           yes         36715           yes
12      9343            yes         18651           yes         36559           yes
13      9348            yes         18811           yes         36457           yes
14      9338            yes         18694           yes         36991           yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq
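The rows of Table 22 fall into a lower-rate group (for example node 1) and a higher-rate group (for example node 6). As a rough illustration, the ratio between the two groups can be computed directly from the listed digits. The variable and function names below are hypothetical, and the choice of nodes 1 and 6 as representatives is an assumption made for the example.

```python
# Per-process rates for (hybrid 112x2, hybrid 56x4, hybrid 28x8), digits as listed in Table 22.
LOWER_RATE_NODE_1 = (9339, 18567, 36408)
HIGHER_RATE_NODE_6 = (14385, 28606, 53808)


def rate_ratios(lower, higher):
    """Ratio of the lower-rate row to the higher-rate row for each hybrid case."""
    return [low / high for low, high in zip(lower, higher)]


if __name__ == "__main__":
    for ratio in rate_ratios(LOWER_RATE_NODE_1, HIGHER_RATE_NODE_6):
        print(round(ratio, 3))
```

In all three cases the lower-rate node reaches roughly two thirds of the rate of the higher-rate node, independent of how many threads each process uses.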


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq


Ta

ble

23iacuteXCcedilEgrave

MPI

_allt

oall

vUacuteSTUcircR

vRcopyUumlYacutebrvbarTHORN

szligvcopyUumlYacutebrvbaragrave

pvRaacuteacircatildeNR

+aumleAumlovqaringaeligPgR

Oslash 5 Iuml

copy Igrave - a 5 (MB)

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

R

R

szligR

pR

144m

pi

512

3R6

0 11

14

118

0

145

9 3

60 R

138

6

36

0 13

87

100

1

31

124

1

25

2plusmn

(BC

DEacute

) 51

2 3R

60

111

4

118

0

148

2 11

16 R

146

9

63

0 14

75

100

1

33

132

1

32

3plusmn

(BC

DEacute

) 51

2 3R

60

111

4

29

0 14

81

714

13 R

153

3

29

0 14

21

100

1

33

138

1

28

448m

pi

512

14R

40

142

6

414

0

182

4 21

272

1 R

170

3

78

0 16

32

100

1

28

119

1

14

504m

pi

512

14R

45

142

7

415

8

180

4 27

242

6 R

160

5

79

0 14

33

100

1

26

112

1

00

2048

mpi

51

2 37R

69

182

1

298

8 24

93

465

6 R

250

1

465

6 25

04

100

1

37

137

1

38

2plusmn

51

2 37R

69

166

7

2211

6

262

5 67

564

6 R

264

7

298

8 25

04

100

1

57

159

1

50

3plusmn

(BC

DEacute

) 51

2 37R

69

166

7

2211

6

262

1 69

524

9 R

269

3 6

298

8 24

59

100

1

57

162

1

47

3072

mpi

51

2 69R

56

258

7 28

399

9 27

34

5371

54 R

286

3 9

507

7 26

44

100

1

06

111

1

02

68m

pij

2om

p 10

24

3R5

7 12

10

11

170

11

80

91

9 R

122

3

28

5 12

21

100

0

97

101

1

01

120m

pij

2om

p 10

24

9R3

3 12

89

215

0

127

0 12

191

6 R

130

4

47

5 12

94

100

0

99

101

1

00

132m

pij

2om

p 10

24

8R4

1 13

79

216

5

116

1 15

211

6 R

119

5

48

3 11

38

100

0

84

087

0

83

432m

pij

2om

p 10

24

17R

64

162

4

119

8 16

28

1925

43 R

161

2 1

129

0 16

18

100

1

00

099

1

00

30m

pitimes4

omp

1024

4R

38

598

115

0

674

12

13 R

683

27

5 68

2

100

1

13

114

1

14

2plusmn

(BC

DEacute

) 10

24

4R3

8 59

8

115

0

678

12

13 R

690

27

5 70

0

100

1

13

115

1

17

36m

pitimes4

omp

1024

6R

30

603

118

0

684

11

16 R

697

63

0 70

3

100

1

13

116

1

17

2plusmn

(BC

DEacute

) 10

24

6R3

0 60

3

29

0 68

8

714

13 R

702

29

0 71

0

100

1

14

116

1

18

126m

pitimes4

omp

1024

16R

39

822

3

415

8

805

27

242

6 R

738

1

79

0 71

5

100

0

98

090

0

87

180m

pitimes4

omp

1024

18R

50

925

615

0

892

40

283

2 R

776

127

5 75

3

100

0

96

084

0

81

2plusmn

(BC

DEacute

) 10

24

18R

50

925

712

9

861

29

352

6 R

808

109

0 76

8

100

0

93

087

0

83

612m

pitimes4

omp

1024

45R

68

110

2

349

0 10

34

525

9 R

111

4

349

0 10

46

100

0

94

101

0

95

2plusmn

(BC

DEacute

) 10

24

45R

68

110

2

2810

9

920

57

634

9 R

960

5

358

7 83

4

100

0

83

087

0

76

17m

pitimes8

omp

1024

3R

57

338

1

117

0

430

9

19 R

432

28

5 43

3

100

1

27

128

1

28

90m

pitimes8

omp

1024

15R

60

504

109

0 44

9

243

8 R

471

109

0 47

1

100

0

89

093

0

94

357m

pitimes8

omp

1024

53R

67

731

418

7 69

2

576

3 R

711

418

7 72

9

100

0

95

097

1

00

2plusmn

(BC

DEacute

) 10

24

53R

67

731

3410

5

642

57

685

3 R

628

6

438

3 60

2

100

0

88

086

0

82


brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

References

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net


A

FX1

HPC

C B

MT

G-F

FTEfeaZUuml^13~

-DH

PCC

_FFT

_235GmWgt

FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave

EP-

DG

EM

M-a5^regbrvbarpoundaringaeligccedil13h-a5

(15

MB

Cor

e)UacuteCordfzyacuteCq

Tabl

e A

-1feaZUuml^13~]-a5^regCD

FX1

HPC

C B

MT

UgraveccedilegravekeacuteUgraveccedilknUgraveccedilplusmnsup2sup3

~~regreg

Ugraveccedilmmnecircecirckccedilnecircfrac12

regReeumlgecirckccedilnecircfrac12

ndegplusmnsup2

ecirckccedilpUgraveecircfrac12frac12plusmnsup2sup3degmiddot

plusmnsup2ordmdegsup2sup1plusmnsup2sup3degmiddot

eacuteplusmn~+igraveiacute8

nmcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

reg~

reg~v

reg

sup2sup3plusmn

qvqyuqou

ovwyy

qvquzqq

vxzwo

qvzz

vyoy

zzv|w|

qvuqqy

uvuyx

qyu

sup2sup3plusmnqvqy|zx

ovwz

qvquyxz

vxuo

qvzuw

vyowx

zzv|z

qvxx|y

yvox

qwy

sup2sup3plusmn

qvxoquq

uvoxzo

qvoqqz

yyvuox

w|vyzx

vyoxx

zzvzyq

qv|qq

vqy

yy

sup2sup3plusmnqvxoqwwq

uvqwxx

qvooqzu

yyvxuwo

w|vyyq

vyouu

zzvzzx

qvux|

oqvz|

xw

sup2sup3plusmn

|vwq|qqq

|uvwz

qvqoxxw

qv|

o|uuv|q

vyxy

zzv||wq

qvqwzuw

oqvywyu

sup2sup3plusmn|vw||wqq

|vyz

qvuxu

yyyvuwwu

o|uyvyw

vy|q

zzv|uy

qvxwzqu

ovxuo

zq|z

sup2sup3plusmn

|vwuzqq

uuvxqwq

qvqoxxw

wwyv|zq

o|uv|q

vy|ox

xvuuwu

qvowox

oqvowuyq

sup2sup3plusmn|vwoquxqq

|yvzyxu

qvuzou

yvqwq|

o|uyvzxu

vy|qw

xvuuz

qvqwuy

ovuz

|y

uxqqq

wxqqq

|xqqqq

|qqqq

w

frac14icircRfrac14Rplusmndegreg

szligsup3degicircRRszligcedilszligiumliumlRethvRntildeRfrac14degicircRccedilogravemicroplusmnregntildemicro~sup1Oslashsup3sup2moRenecircfrac12oacuteregplusmnAumlocircotildeg

egravekeacuteOslashfrac14knicircccedilpegravekszligszligOslashmmnOslash|xRccedilpOslashpOslashfrac14knRccedilpegravekszligszligOslashfrac12ecircfrac12eacuteeacuteszlignRocircotilde

mszligpecircmicircccedilpeacutefrac14UgraveOslashOslashyunRocircotilde

frac12kicircRRkplusmnplusmnplusmnraquodegReacutemiddotplusmnmiddotRkplusmn~plusmnmiddot

-

moRexoszligcedilowszligkoumlg

szligdegicircRkszligyudividentildeRvxUgraveegraveparantildeRpposlashueuqUgravecedilregedegsup2plusmngntildeRo|vxUgravecedilregesup3plusmnregsup2RugraveRnecircfrac12gg

Cuacute uacuteucircicircuumlTyacutethornYacuteN0ucircicircuumlTyacutethornYacuteN0gt

|

xo

xo

egravekszligszligRethvicircRethov|vo

eeumlgicircecirckccedilnecircfrac12RregR0Eeecirckccedilnecircfrac12Rndegplusmnsup2gIcirc-RJ13NOIPQ

~~icircRppRmicrodegdegugraveplusmnsup2ntildeRoreg~cedilsup2ntildeRmplusmnRn


B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

1024 parallel

[Plot panels: Allreduce(1024), Reduce(1024), Reduce_scatter(1024), Allgather(1024), Allgatherv(1024), Gather(1024); series: rdma, nordma; horizontal axis 0 to 80000]

Fig. B-1 MPI communication (1024 parallel, part 1)

[Plot panels: Gatherv(1024), Scatter(1024), Scatterv(1024), Alltoall(1024), Alltoallv(1024), Bcast(1024), Scan(1024), SendRecv(1024, Rank0 to Rank1023); series: rdma, nordma; horizontal axis 0 to 80000]

Fig. B-2 MPI communication (1024 parallel, part 2)

C RDMA vs. NORDMA

Table C-1 (data size 8KB~78KB)

[Chart: RDMA and NORDMA compared per MPI routine for 256, 512, 1024, 2048 and 4096 parallel; scale 0% to 100%]

Table C-2 (data size 80KB~584KB)

[Chart: RDMA and NORDMA compared per MPI routine for 256, 512 and 1024 parallel; scale 0% to 100%]

Scale: toward 100%, the RDMA method; toward 0%, the NORDMA method.


D

W Oslash5

Table D-1 Flat MPI (8mpi)

qzvquecirciumlq|yvwuecirciumlquvzuecirciumlquovu|oecirciumlqxvwoqecirciumlquyvxqoecirciumlquyvxoecirciumlqu

qwvo

qzvzzecirciumlq|vuecirciumlquwvyecirciumlquovu|oecirciumlqxvwuecirciumlquyvuzoecirciumlquxvy|ecirciumlqu

yqyv|

owvz|oecirciumlq|yvwyyecirciumlquvxzecirciumlquvqxecirciumlquov|uecirciumlquyvuecirciumlquccedilxv|xecirciumlq

ccedilyvq

ozvqoecirciumlq|vxxecirciumlquwvux|ecirciumlquvyzoecirciumlquovxoecirciumlquyvuuqecirciumlquccedilvyecirciumlq|

ccedilwuvy

ovq|ecirciumlquyvzz|ecirciumlquwvq|qecirciumlquovu|qecirciumlqxvw|wecirciumlquyvuyoecirciumlquyvywecirciumlqu

yquv

ovquyecirciumlquvuyqecirciumlquwvxqecirciumlquovu|qecirciumlqxvwecirciumlquyvuwqecirciumlquxvzxecirciumlqu

xx|vw

|ovq|zecirciumlquyvwzuecirciumlquvz||ecirciumlquvyoecirciumlquovwecirciumlquyvuzecirciumlquccedilovozecirciumlq|

ccediloyvy

|ovquecirciumlquv|qecirciumlquwvuoyecirciumlquvyy|ecirciumlquovowyecirciumlquyvuecirciumlquccedilvxwecirciumlq|

ccedilovz

uzvwoecirciumlq|vuxyecirciumlquwvu|uecirciumlquovu|ecirciumlqxvw|ecirciumlquyvuuxecirciumlquxvwwuecirciumlqu

yqovy

uzvywecirciumlq|yvzuzecirciumlquvzoyecirciumlquovuwecirciumlqxvwqyecirciumlquyvuecirciumlquyv|yecirciumlqu

yxvu

xzvqecirciumlq|vuoqecirciumlquwv|wqecirciumlquvyyecirciumlquovowecirciumlquyvuuzecirciumlquccedilvozecirciumlq|

ccedil|vx

xzvyzzecirciumlq|voxuecirciumlquwvouecirciumlquvqqecirciumlquovxuecirciumlquyvuuyecirciumlquccediluv|yecirciumlq|

ccedilu|v

yzvuecirciumlq|vxzecirciumlquwvxecirciumlquovuyecirciumlqxvwecirciumlquyvuxecirciumlquxvyzqecirciumlqu

xw|vw

yzvz|oecirciumlq|yvwzoecirciumlquvwwxecirciumlquovuzecirciumlqxvwyecirciumlquyvuyyecirciumlquyvuqwecirciumlqu

yuxv|

zvwqecirciumlq|v|uzecirciumlquwv||yecirciumlquvyxwecirciumlquovozecirciumlquyvuzecirciumlquccedilyvwuecirciumlq|

ccedilywv

ovqq|ecirciumlquyvzzyecirciumlquvzzzecirciumlquvyzecirciumlquovqoecirciumlquyvuzecirciumlquccedil|vozxecirciumlq|

ccedil|ovz

zvuzecirciumlq|vowqecirciumlquwvoxxecirciumlquovoqqecirciumlqxuvxwecirciumlquyvuqecirciumlquvwu|ecirciumlqu

zovy

zvwecirciumlq|vyxecirciumlquwvuecirciumlquovqzzecirciumlqxuvxoecirciumlquyvuyzecirciumlquvu|ecirciumlqu

wvw

qwv|xzecirciumlq|vzecirciumlquwvywecirciumlquvxyoecirciumlquovqwuecirciumlquyvuecirciumlquccedilovqyecirciumlquccedilov

qzvuuuecirciumlq|v|yecirciumlquwvqecirciumlquovuzecirciumlqxvwqqecirciumlquyvuzoecirciumlquyvqoecirciumlqu

y|vx

oovouwecirciumlquwvqqecirciumlquzvoywecirciumlquovuqwecirciumlqxvyxxecirciumlquyvuwecirciumlquuvzoxecirciumlqu

uwvq

owvwuecirciumlq|vq|qecirciumlquvzqzecirciumlquv|uuecirciumlquzvqoecirciumlq|yvu|ecirciumlquccedilxvyxqecirciumlq|

ccedilyuv|

yvz||ecirciumlq|v|oecirciumlquwvuuecirciumlquvqyecirciumlqxovuowecirciumlqxyvu||ecirciumlquovozecirciumlqxoxwvy

ovouecirciumlquvuxecirciumlquwv|zecirciumlquovu|oecirciumlqxvwxyecirciumlquyvux|ecirciumlquxvzoecirciumlqu

xoyvo

|ov|qoecirciumlquv|zqecirciumlquwvyzoecirciumlquv|oecirciumlquwvu|uecirciumlq|yvu|ecirciumlquccedilov|uecirciumlquccediloqxv

|v|qecirciumlq|vqecirciumlquvxzecirciumlquvxqecirciumlquovq|ecirciumlquyvuzecirciumlquccedilv|zoecirciumlq|

ccedil|vu

uwvxy|ecirciumlq|wvzoyecirciumlquzv|ecirciumlquovuozecirciumlqxvuoqecirciumlquyvwqecirciumlquuvuoecirciumlqu

xoxvw

uovquecirciumlquvwecirciumlquwvqecirciumlquovuouecirciumlqxvyxqecirciumlquyvuzoecirciumlquxvwoecirciumlqu

xy|vx

xzvux|ecirciumlq|vxwecirciumlquwvx|ecirciumlquvx|oecirciumlquovoqoecirciumlquyvu|qecirciumlquccedilzvzoecirciumlq|ccediloquvz

xwvqqecirciumlq|vuuwecirciumlquwv|owecirciumlquvzuzecirciumlquov|wyecirciumlquyvxy|ecirciumlquccedil|vywuecirciumlq|

ccediluv|

yzvqecirciumlq|zv|w|ecirciumlquovq|qecirciumlqxovuxecirciumlqxvyecirciumlquyvzqecirciumlquuvqecirciumlqu

uyuvq

yovqyuecirciumlquv|quecirciumlquwv|ywecirciumlquovuyuecirciumlqxwvqxwecirciumlquyvxwoecirciumlquyvecirciumlqu

xwzvx

ovoooecirciumlquvzwwecirciumlquzvqzzecirciumlquvo|ecirciumlqxovuoecirciumlqxyvyouecirciumlquovecirciumlqxooqqvy

ovqoecirciumlquvqyoecirciumlquwvo|ecirciumlquvy|ecirciumlquovoecirciumlquyvuxecirciumlquccediluvxzecirciumlq|

ccediluvz

zvy|ecirciumlq|wvoqqecirciumlquzvqyecirciumlquov|uqecirciumlqxyvw|ecirciumlquyvxyecirciumlquuv||ecirciumlqu

uzovo

zvyzoecirciumlq|vqwecirciumlquwvoecirciumlquovqzwecirciumlqxuvuwwecirciumlquyvuzyecirciumlquvwqyecirciumlqu

yxvy

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

[Plot panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2; vertical axis 0.000E+00 to 2.500E+05]

Fig. D-1 Flat MPI (8mpi)

Table D-2 Hybrid, case 1 (4mpi×8omp)

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Fig D-2 Hybrid 1, 4mpi × 8omp (1). Panels: rdma (4mpi × 8omp)-1, nordma (4mpi × 8omp)-1.

Fig D-3 Hybrid 1, 4mpi × 8omp (2). Panels: rdma (4mpi × 8omp)-2, nordma (4mpi × 8omp)-2, rdma (4mpi × 8omp)-3, nordma (4mpi × 8omp)-3.

Table D-3 Hybrid 2, 4mpi × 8omp

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

Fig D-4 Hybrid 2, 4mpi × 8omp (1). Panels: rdma (4mpi × 8omp)-1, nordma (4mpi × 8omp)-1.

Fig D-5 Hybrid 2, 4mpi × 8omp (2). Panels: rdma (4mpi × 8omp)-2, nordma (4mpi × 8omp)-2, rdma (4mpi × 8omp)-3, nordma (4mpi × 8omp)-3.

E 1oacute 1

Fig E-1 Fujitsu compiler with Fujitsu MPI

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK =  1 TIME = 3.9305 sec Ave Speed = 3256.6145 MB/s
RANK =  2 TIME = 3.9197 sec Ave Speed = 3265.5168 MB/s
RANK =  3 TIME = 3.9212 sec Ave Speed = 3264.3219 MB/s
RANK =  4 TIME = 3.4341 sec Ave Speed = 3727.2765 MB/s
RANK =  5 TIME = 3.4175 sec Ave Speed = 3745.3885 MB/s
RANK =  6 TIME = 3.4086 sec Ave Speed = 3755.2470 MB/s
RANK =  7 TIME = 3.4329 sec Ave Speed = 3728.6226 MB/s
RANK =  8 TIME = 2.4021 sec Ave Speed = 5328.7420 MB/s
RANK =  9 TIME = 2.3794 sec Ave Speed = 5379.4939 MB/s
RANK = 10 TIME = 2.3805 sec Ave Speed = 5377.0236 MB/s
RANK = 11 TIME = 2.3798 sec Ave Speed = 5378.6914 MB/s
RANK = 12 TIME = 2.3942 sec Ave Speed = 5346.2490 MB/s
RANK = 13 TIME = 2.3931 sec Ave Speed = 5348.8046 MB/s
RANK = 14 TIME = 2.3950 sec Ave Speed = 5344.3826 MB/s
RANK = 15 TIME = 2.4066 sec Ave Speed = 5318.6052 MB/s
RANK = 16 TIME = 2.3798 sec Ave Speed = 5378.6327 MB/s
RANK = 17 TIME = 2.3802 sec Ave Speed = 5377.5886 MB/s
RANK = 18 TIME = 2.3821 sec Ave Speed = 5373.3399 MB/s
RANK = 19 TIME = 2.3794 sec Ave Speed = 5379.5866 MB/s
RANK = 20 TIME = 2.3939 sec Ave Speed = 5346.8837 MB/s
RANK = 21 TIME = 2.3938 sec Ave Speed = 5347.0775 MB/s
RANK = 22 TIME = 2.3939 sec Ave Speed = 5347.0333 MB/s
RANK = 23 TIME = 2.3942 sec Ave Speed = 5346.2889 MB/s

Fig E-2 afeaZQR MPIbrvbar

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK =  1 TIME = 3.9370 sec Ave Speed = 3251.1775 MB/s
RANK =  2 TIME = 3.9475 sec Ave Speed = 3242.5231 MB/s
RANK =  3 TIME = 3.9415 sec Ave Speed = 3247.4846 MB/s
RANK =  4 TIME = 3.4392 sec Ave Speed = 3721.7925 MB/s
RANK =  5 TIME = 3.4543 sec Ave Speed = 3705.5373 MB/s
RANK =  6 TIME = 3.4250 sec Ave Speed = 3737.2742 MB/s
RANK =  7 TIME = 3.4709 sec Ave Speed = 3687.8286 MB/s
RANK =  8 TIME = 2.6967 sec Ave Speed = 4746.5876 MB/s
RANK =  9 TIME = 2.6964 sec Ave Speed = 4747.1018 MB/s
RANK = 10 TIME = 2.6955 sec Ave Speed = 4748.6289 MB/s
RANK = 11 TIME = 2.6947 sec Ave Speed = 4749.9771 MB/s
RANK = 12 TIME = 2.6957 sec Ave Speed = 4748.3647 MB/s
RANK = 13 TIME = 2.7020 sec Ave Speed = 4737.3054 MB/s
RANK = 14 TIME = 2.6948 sec Ave Speed = 4749.8023 MB/s
RANK = 15 TIME = 2.7174 sec Ave Speed = 4710.4542 MB/s
RANK = 16 TIME = 2.6956 sec Ave Speed = 4748.5054 MB/s
RANK = 17 TIME = 2.6945 sec Ave Speed = 4750.4899 MB/s
RANK = 18 TIME = 2.6946 sec Ave Speed = 4750.2360 MB/s
RANK = 19 TIME = 2.6967 sec Ave Speed = 4746.6158 MB/s
RANK = 20 TIME = 2.6948 sec Ave Speed = 4749.9767 MB/s
RANK = 21 TIME = 2.6982 sec Ave Speed = 4743.8204 MB/s
RANK = 22 TIME = 2.7087 sec Ave Speed = 4725.4830 MB/s
RANK = 23 TIME = 2.7004 sec Ave Speed = 4739.9960 MB/s

Fig E-3 Fujitsu compiler with OpenMPI

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
                                                         NODE   CPU  coreid
Rank = 0                                                 bx2050 CPU0 0
all time
RANK =  1 TIME = 3.8110 sec Ave Speed = 3358.7213 MB/s   bx2050 CPU1 1
RANK =  2 TIME = 2.8656 sec Ave Speed = 4466.7752 MB/s   bx2050 CPU0 2
RANK =  3 TIME = 3.7959 sec Ave Speed = 3372.0175 MB/s   bx2050 CPU1 3
RANK =  4 TIME = 2.8658 sec Ave Speed = 4466.4664 MB/s   bx2050 CPU0 4
RANK =  5 TIME = 3.7829 sec Ave Speed = 3383.6073 MB/s   bx2050 CPU1 5
RANK =  6 TIME = 2.8641 sec Ave Speed = 4469.1880 MB/s   bx2050 CPU0 6
RANK =  7 TIME = 3.7892 sec Ave Speed = 3377.9924 MB/s   bx2050 CPU1 7
RANK =  8 TIME = 3.5140 sec Ave Speed = 3642.5620 MB/s   bx2051 CPU0 0
RANK =  9 TIME = 3.5017 sec Ave Speed = 3655.4163 MB/s   bx2051 CPU1 1
RANK = 10 TIME = 3.8916 sec Ave Speed = 3289.1010 MB/s   bx2051 CPU0 2
RANK = 11 TIME = 4.5416 sec Ave Speed = 2818.4186 MB/s   bx2051 CPU1 3
RANK = 12 TIME = 3.5172 sec Ave Speed = 3639.2511 MB/s   bx2051 CPU0 4
RANK = 13 TIME = 3.5053 sec Ave Speed = 3651.6468 MB/s   bx2051 CPU1 5
RANK = 14 TIME = 3.5200 sec Ave Speed = 3636.3472 MB/s   bx2051 CPU0 6
RANK = 15 TIME = 3.5009 sec Ave Speed = 3656.1987 MB/s   bx2051 CPU1 7
RANK = 16 TIME = 2.2996 sec Ave Speed = 5566.1441 MB/s   bx2052 CPU0 0
RANK = 17 TIME = 3.5587 sec Ave Speed = 3596.8575 MB/s   bx2052 CPU1 1
RANK = 18 TIME = 3.5587 sec Ave Speed = 3596.8151 MB/s   bx2052 CPU0 2
RANK = 19 TIME = 3.5712 sec Ave Speed = 3584.2525 MB/s   bx2052 CPU1 3
RANK = 20 TIME = 2.2996 sec Ave Speed = 5566.1978 MB/s   bx2052 CPU0 4
RANK = 21 TIME = 3.6856 sec Ave Speed = 3472.9458 MB/s   bx2052 CPU1 5
RANK = 22 TIME = 2.2994 sec Ave Speed = 5566.6382 MB/s   bx2052 CPU0 6
RANK = 23 TIME = 2.2991 sec Ave Speed = 5567.3257 MB/s   bx2052 CPU1 7


F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq

Fig F-1 Program (Flat MPI)

(main)
      n = nmb*1024*1024/8          ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                    ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
c     average copy speed in MB/s (bytes per copy / seconds per copy)
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
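As a check on the units in this listing, the array length and the reported speed can be written out. This is a sketch under the assumption that the listing computes n = nmb*1024*1024/8 and divides the byte rate by 1024 twice, so that "MB" means 2^20 bytes; the symbols below simply reuse the variable names from the listing.

```latex
\begin{align*}
n &= \frac{\mathrm{nmb}\times 1024\times 1024}{8}
   = \frac{128\times 2^{20}}{8} = 16\,777\,216
   \quad\text{(elements of 8 bytes each)},\\[4pt]
\mathrm{ave\_spd} &= \frac{8\,n}{\mathrm{timew}/\mathrm{NNN}}\cdot\frac{1}{1024^{2}}
   = \frac{\mathrm{nmb}\times\mathrm{NNN}}{\mathrm{timew}}
   \ \text{MB/s}.
\end{align*}
```

With nmb = 128 and NNN = 100 the numerator is 12800 MB, so a total elapsed time of 4 s, for example, would correspond to 3200 MB/s.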

Fig F-2 Program (OpenMP)

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8          ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
         write(6,*) 'Array allocation failed'
         stop
      end if
!$OMP end parallel
      nsize = n                    ! size
c     NNN : Repetition number of times
      NNN = 100
!$OMP parallel private(i),shared(nsize)
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd),
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
c     average copy speed in MB/s (bytes per copy / seconds per copy)
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
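The excerpt above omits declarations and output, so the following is a minimal self-contained sketch of the same measurement, not the authors' program. It assumes free-form Fortran with OpenMP enabled at build time (for example `gfortran -fopenmp`). The program name `omp_copy_sketch`, the helper `bench`, and the output format are hypothetical. Per-thread arrays are obtained here by making them local to a subroutine called inside the parallel region, which replaces the `threadprivate` declaration of the excerpt.

```fortran
! Sketch: per-thread memory-copy bandwidth, modeled on the OpenMP excerpt.
! Assumption: each thread copies its own private pair of arrays.
program omp_copy_sketch
  use omp_lib
  implicit none
  integer, parameter :: nmb = 128   ! array size in MB, as in the excerpt
  integer, parameter :: nnn = 100   ! repetition count, as in the excerpt
  integer :: n

  n = nmb*1024*1024/8               ! number of real(8) elements per array

  !$omp parallel default(shared)
  call bench(n, nnn)                ! every thread runs its own measurement
  !$omp end parallel

contains

  subroutine bench(n, nnn)
    integer, intent(in) :: n, nnn
    real(8), allocatable :: a(:), b(:)   ! local, hence private to each thread
    real(8) :: t1, t2, timew, ave_spd
    integer :: ii, ierr1, ierr2

    allocate(a(n), stat=ierr1)
    allocate(b(n), stat=ierr2)
    if (ierr1 /= 0 .or. ierr2 /= 0) then
      write(6,*) 'Array allocation failed'
      stop 1
    end if
    a = 1.0d0
    b = 0.0d0

    t1 = omp_get_wtime()
    do ii = 1, nnn
      call atob(a, b, n)
    end do
    t2 = omp_get_wtime()

    timew = t2 - t1
    ! bytes per copy divided by seconds per copy, converted to MB/s
    ave_spd = (dble(n)*8.0d0)/(timew/dble(nnn))/1024.0d0/1024.0d0

    ! b(n) is printed so that the copies cannot be discarded as dead code
    !$omp critical
    write(6,'(a,i3,a,f10.4,a,f12.4,a,f4.1)') 'thread', omp_get_thread_num(), &
         '  time =', timew, ' s  speed =', ave_spd, ' MB/s  b(n) =', b(n)
    !$omp end critical
  end subroutine bench

  subroutine atob(a, b, n)
    integer, intent(in) :: n
    real(8), intent(in)  :: a(n)
    real(8), intent(out) :: b(n)
    integer :: i
    do i = 1, n
      b(i) = a(i)
    end do
  end subroutine atob

end program omp_copy_sketch
```

The thread count is taken from the environment (`OMP_NUM_THREADS`). Each thread allocates 2 × 128 MB, so memory use grows with the thread count, and an aggressive optimizer may still merge the repeated copies, in which case the reported speed would be misleading.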


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPI functions (flat MPI and hybrid, 8 to 1024 B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

Table G-2 MPI functions (flat MPI and hybrid, 8 to 1024 B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

sup2~ yntildeouu |ntildewzz |ntildeuou ovou ontildezx ontildeqyw ovw| ovzz |vqsup2~ xuntildeww ntilde|o ntildeuo qvzz |ntildeqo |ntildey qvz vu vsup2~ wyntildeu| oqntildexx oontildeqyy qvz |ntildezwy untildeuyw qvwz vq vuwsup2~ ontildequwntildexy ountildexxy oxntildeuy qvzu untildezqq xntildewyw qvwu vz vyu

sup2~ yntildeouu zou zo qvzu ontildeqqq ontildeq qvzw qvzo qvzxsup2~ xuntildeww ontildeo ontildezy qvwz ontildew ntildeuuz qv| qvz qvzsup2~ wyntildeu| ntildexuo ntildey| qvz ntildeyzz |ntildeuqw qvz qvzu qvwosup2~ ontildequwntildexy |ntildex untildeqox qvzu |ntildexww untilde|| qvw ovqx qvz

sup2~Oslashreg~plusmn yntildeouu yntildeuw| wontildexoz qvzu xntildeuo |ntildeoqw ovoq |vqq |vx|sup2~Oslashreg~plusmn xuntildeww ou|ntildewx ountildeqo qvw| uyntildeqzq yyntildewxo qvyz |vo vyosup2~Oslashreg~plusmn wyntildeu| oontildezoo yzntildeuz qvz yntildeqz zxntildeox qvo |voy vw|sup2~Oslashreg~plusmn ontildequwntildexy w|ntildeyox |yontildeuq| qvw wntildexo o|ntildexqz qvo |v| vz|

middotplusmnsup1 yntildeouu |ntildezw ||ntilde|x| qvzz oontildexww oxntildeyzu qvu vwx vo|middotplusmnsup1 xuntildeww y|ntildex yyntildeyy qvzx ontildez|x untilde|x qvuy vww ovuomiddotplusmnsup1 wyntildeu| wzntildewy z|ntildeyux qvzy |qntildeyx yntildeyqu qvuy vz ov|zmiddotplusmnsup1 ontildequwntildexy oqntildeux| o|untildeqox qvzq uqntildeyxx wwntildez| qvuy vzy ovxo

middotplusmnsup1raquo yntildeouu yntildew|u ||ntildeu ovww ntildewxy oxntildeyuw uvyy qvwy voumiddotplusmnsup1raquo xuntildeww wxntildeoz yuntildeo| ov|| wxntildex|| untildeuy ovwq ovqq ov|xmiddotplusmnsup1raquo wyntildeu| ooxntildeuw z|ntildezyu ov| z|ntildeyow yntildexuz ov|z ov| ov|zmiddotplusmnsup1raquo ontildequwntildexy o|ntildexx o|untildeuqz ovq zyntildexoq wntildeuuy ovoq ovu| ovxu

Ugraveplusmnsup1 yntildeouu |zntildeyu| oqntildexwz |vu uontildeuyq ountildewy vz qvzy qvoUgraveplusmnsup1 xuntildeww untildewxx ntildeqo voy uzntildeoy| yntildewo ovw| qvz qvw|Ugraveplusmnsup1 wyntildeu| xxntildewoq zntildewyu ovw xyntildewuu |yntildexqx ovxy qvzw qvwUgraveplusmnsup1 ontildequwntildexy yntildewxy |ntildexxw ovy yuntildewzx uyntildeoq ovuo qvz qvwo

Ugraveplusmnsup1raquo yntildeouu uqntildeq|u oqntildeyxq |vy uontildey| ountildez|z vz qvzy qvoUgraveplusmnsup1raquo xuntildeww uwntilde|u ntildeqz vo uzntildew yntildewx ovw| qvzw qvw|Ugraveplusmnsup1raquo wyntildeu| xyntildezy zntildez|y ovww xntildeo| |yntildeuwq ovx qvzw qvwUgraveplusmnsup1raquo ontildequwntildexy y|ntildeo| |ntildeyxq ovyw yxntilde|yw uyntildeqx ovuo qvz qvwo

~plusmn yntildeouu untildeyw wntildezu uvo uuntildeo wntildeuq| xvx qvzy ovq~plusmn xuntildeww xqntildeqyx qntilde|xq vuy xontildexwy untildezux vq qvz qvw~plusmn wyntildeu| xwntildeq wntildeux vqy yqntildeox |untildeu ov| qvzw qvw~plusmn ontildequwntildexy yyntildequq |yntilde|q| ovw ywntildeqx uuntildeuyy ovx| qvz qvw

~plusmnraquo yntildeouu zntilde|q zntildeqq ovqu zntildexq wntildeu|w ovo| qvzw ovq~plusmnraquo xuntildeww oyntildeu| qntilde|wy qvwo oyntildeoz untildezuy qvyx ovq qvw~plusmnraquo wyntildeu| |ntildewo wntildeuxu qvw ntildewz |untilde qvyy ovq qvw~plusmnraquo ontildequwntildexy zntildezy |yntildexo| qvw zntildezxq uuntildeu qvy ovqq qvw

plusmn yntildeouu yxntildewu wzntildeq| qv| ontildeyoz yntildezz qvwq |vq |v|qplusmn xuntildeww ountildeqoz oxntilde|o qvwo |zntildexu xxntilde|ux qv |vo vxplusmn wyntildeu| ow|ntildeuwy untildeyu qvw xntildeoyu untildezu| qvy |vo |vqqplusmn ontildequwntildexy u|ntildewo zxntildezqo qvw xntildezy zuntildeoy qvwq |vo |vo

plusmnraquo yntildeouu x|ontildeyx wyntildeqoz yvow o|ntildeqox owntilde|xu vu |vww uvyzplusmnraquo xuntildeww yozntildeyz| oz|ntildeuwu |vq oyntildeyw ywntildeyw v| |vwo vwplusmnraquo wyntildeu| yw|ntildewxu wwntildeooz v| owyntilde|q| oqxntildeyw ovy |vy vplusmnraquo ontildequwntildexy |yntildezo uqxntildeoyq ovw oontildexo ouontildezzx ovuz |vuw vwx

~plusmnreg yntildeouu uzu y| qvw uwo xzw qvwq ovq| ovqy~plusmnreg xuntildeww zq| ontildeoz qvy wo| ontildexuy qvx| ovoo qv~plusmnreg wyntildeu| ontilde|q ontildeyq qvw ontildeoxo ntildeoqz qvxx ovo| qvz~plusmnreg ontildequwntildexy ontildezqu ntilde|ux qvwo ontildeuxu ntildeyx qvxu ov|o qvww

~plusmn yntildeouu ountildewqx ountildez|x qvzz ontildeqq owntildeuw| ovou qvq qvwo~plusmn xuntildeww zntildeuy zntildewxz qvzw uqntildex|o uwntildexyw qvw| qv qvyo~plusmn wyntildeu| uuntildeq uuntildeo qvzz xyntildeyoq yzntildexzw qvwo qvw qvyu~plusmn ontildequwntildexy qntildeqqx qntildeu|q qvzz ntildeo zqntilde|ux qvwo qvzy qvw

sup2cedil~raquo yntildeouu x w qvz zo ux vq| qvw ov|sup2cedil~raquo xuntildeww ox o| qv| o|w o| qvyx qvzo qvwosup2cedil~raquo wyntildeu| oo |o qvu owz w| qvy qvzo qvwsup2cedil~raquo ontildequwntildexy ou wz qvu | |x qvyx qvz qvwo

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

sectWOcirceg

ow$XAcircsup3deg owsup3degIcircwsup3



Contents

1 Introduction
2 Outline of the system
  2.1 Hardware construction of BX900
  2.2 Hardware construction of FX1
  2.3 Hardware construction of Storage
3 Performance evaluation by HPCC benchmarks
  3.1 Outline of HPCC benchmarks
    3.1.1 Test programs
    3.1.2 Rules
  3.2 Measured results
    3.2.1 BX900
    3.2.2 FX1
  3.3 Summary
4 Tunings for BX900 and FX1
  4.1 Profiler
  4.2 Automatic tuning of memory access
    4.2.1 Efficacy for Original version of STREAM benchmark
    4.2.2 Efficacy for HPCC version of STREAM benchmark
  4.3 Manual tuning of memory access
  4.4 Efficacy for NPB2.3 FT benchmark
  4.5 Summary
5 Performance evaluation by IOR benchmarks
  5.1 Outline of IOR benchmarks
    5.1.1 Key parameters
    5.1.2 Option parameters for MPIIO
  5.2 Measured results
    5.2.1 POSIX
    5.2.2 MPIIO
    5.2.3 Local cache
  5.3 Summary
    5.3.1 Choice of file system
    5.3.2 POSIX vs. MPIIO
    5.3.3 Buffer size
6 Performance evaluation of the interconnect
  6.1 Specification of the interconnect
    6.1.1 Construction and Nodebind
    6.1.2 Communication method
  6.2 Fundamental performance of MPI communication
    6.2.1 Measuring method
    6.2.2 Measured results
  6.3 RDMA
    6.3.1 Measuring method
    6.3.2 Measured results
  6.4 Communications between intra-CPUs, inter-CPUs and inter-nodes
    6.4.1 Measuring method
    6.4.2 Measured results
  6.5 Communications between chassis
    6.5.1 Point-to-point communication
    6.5.2 Collective communication
  6.6 Summary
7 Concluding Remarks
Acknowledgements
References
Appendix A HPCC benchmark results on FX1
Appendix B Fundamental performance of MPI communication
Appendix C RDMA vs. NORDMA
Appendix D Overlapping communications with calculations
Appendix E Peer-to-peer communications
Appendix F Memory bandwidth
Appendix G Flat-MPI vs. Hybrid


1 EacuteH L13Pgt ABCDE

FGDHIJ 22 3K13LM=NO 153TflopsPQCRSTUVWXM=NO 200TflopsUV LinuxYZ$13LPRIMERGY BX900PWM=NO 12 Tflops []^_`$a^LFX1PbcdXefgh$13ijklmndoBpCDqBX900 rsXtuvwmWFX1 []efgh$Lxfgh$PyBzD^|13~hGmWWCq gtoBGXumlcopyordfyBClaquoPRIMERGY

BX900WFX1oacuteCO|Yl WpoundDGY`OslashCOslashGmWbrvbarsectdcWXpoundD13

GqUgrave[13EcircuJGq 3gtefgh$13MXBdn HPCC YbrvbarC 4cd 6O|YalLIOPa$fY`L^_ Pcent AringcdGq 7M$CqXyacuteampG()Ocirc$bWCUgraveWHDq

2 13Ecircu 13 3+XB-eacute13GXUVEumlIgraveOIacute[

]fIumlIacutenot|O-eacutecdX13gtlaquoqmnd13M

=NORSTUVWX 214TflopsMacirc012 56TBOgraveOacuteOcircOtildeY1212PBgtlaquoq13Ecircu Fig 1Gq 21 UVEumlIgraveOIacuteIcircIumlETHNtildeJ 1334WXUVEumlIgraveOIacutePRIMERGY BX900L13BX900P

200Tflops=NOLgYPWMacirc0 50TBnotGUV LinuxYZ$13gtlaquoq

BX900 10U-a5Aumleuml13aelig13T 186-eacute7IumlT 86accedil7Iuml8amp9XAumlAringOslashOuml7Iuml-eacutegtlaquoq1 7Iuml-eacuteaXeon ^_ccedil- X5570LYccedilIumlfP 2 ^_ccedil-Wacirc0 24GBLDDR3 SDRAMBiIacuteiumlIuml 12GB UgraveD 48GBP8ampGqiumlIumllaquoDsect=NO9376GflopsL293GHztimes4lt=gtOY_ccedilYtimes4ftimes2^_ccedil-Pgt|eacuteIuml 512GBsAumlAnotGqiumlIumlInfiniBandLIBP 4xQDRL13B QDRWCP2`ZaringCDAgravez 8GBsAbrvbarAumlFE7a13XOcirc$FGordf9gtlaquoq


Fig 1 Outline of the system

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

22 []fIumlIacuteIcircIumlETHNtildeJ []fIumlIacuteFX1L13FX1P12Tflops =NOLgYPWMacirc0 46TBnotGYZ$13gtlaquoq

1iumlIumlQR SPARC64SLYccedilIumlfPT^_ccedil-Wacirc0 16GB8ampGqUgraveD|UCPUVBccedil^ccedil`LJSCPbrvbarsectAuml|eacuteIumlOslashWCqiumlIuml FBBLFull Bisectional BandwidthPccedil`X| InfiniBand 4xDDRLDAgravez 2GBsPgtYZGqiumlIumleacute|O=f7YOtilde IcircIumlETHNtildegtO=GVBOumltimesLAumla$fY`accedilPyBGmWgt [

brvbarUcircXOcirc$FG]EumlIgrave^_ZOslash5^=_wXM

`notGq FX1 gtFX1 CcedilEgraveordfxfgh$WXpound`acC2012 b

=Gxfgh$yBzD^|13~h5Pq 23 OgraveOacuteOcircOtildeYOumltimesIcircIumlETHNtildeJ OgraveOacuteOcircOtildeYOumltimes IO iumlIumlSPARC Enterprise M90002 6Ocirc$OcircOtildeYETERNUS DX80366$OcircOtildeYETERNUS4000 model40016gtJeumlnq


MOcircOtildeY12 12PBLRAID6Pgtlaquosect13gt 576GBs IO LgYPnotGLFig 2PqIO iumlIuml InfiniBand nota13Parallelnavi SRFSbrvbarsectciumlIumlcdYoacuteCaidefgCq Ocirc$OcircOtildeYOumltimeslaquoDsect 466Za SASOcircOtildeYL1TB7200rpmP8ampG

qmndOcircOtildeY RAID6 gt^UcircCOumltimesgt`ZagGmWgt IO AumlFUcirchpoundqXIOiumlIuml_aumla13Sun QFSBCLFig 3Pq

UVEumlIgraveOIacute11913aelig13()

i QDR 9j2 (113aelig13ntildesect) InfiniBandaccedil

(9iexcl) i FGFAring 72GBs (2iumlIumlcurren) IOiumlIuml (M90002iumlIuml) i FGFAring 576GBs (OcircOtildeYcurren) OgraveOacuteOcircOtildeY (ETERNUS DX80366) ()9kUcirclmiumlIumlnC

1 2 3 119

IBUSW

IO node

Fig 2 oOumltimesYZJWFGFAring


Fig 3 [Figure: storage connection diagram. I/O nodes M9000 #1 and M9000 #2, each with FC (1) to FC (42), are connected to ETERNUS DX80 (36 units, DX80 #1 to DX80 #36, 1TB disks; QFS; 57.6GB/s over 144 FC) and to ETERNUS4000 Model 400 (1 unit, E4K #1, 1TB disks; QFS; 3.2GB/s over 8 FC).]


3 Performance evaluation by HPCC benchmarks

HPCC (High Performance Computing Challenge)1) Y 7^_ZB BX900W FX15poundDq13gtHPCCEcircupqWrstuq 31 HPCCYEcircu

HPCC YvR Tennessee JDongara wRordf3xWXpoundyoBCefgh$BY`^_ZzgtlaquoqO

|Y^_ WpoundDefgh$acircuXu

GmWbrvbarsectefgh$13WCeacuteZGUacuteC

qO HPL(High-Performance Linpack)DGEMMFFTE 3q|^_Zgt|Y STREAM W Random Access 2 q|^_Zgt _ PTRANSW b_eff 2q|^_Zgt5n~noacuteCc^_Zbrvbar5PJWXpound 2)qo

^_ZUuml|YacuteTHORNFortranCgtcnordfHPCCYWCCgtcnD 1^_ZeumlnqXHPCCYC 2003 sect 300wordfeumlnsect+X13rsyBGmW9gtlaquoq

311 ^_Z efgh$13f CPU 1 iexclXCiexcl8ampG

L13iumlIumlPa$fY`gtYZCDYZ$JgtlaquosectCoreCPUiumlIuml13WeumlUgraveUgraveXgtO=ordf5nDHogtordfugt

laquoqmgtiumlIumlMG` (G Global system performance)WiumlIuml[gt` (SN Single Environment)gt` (EP Embarrassingly Parallel) 3gt5Pq

3111 HPL^_Z

HPL5IgraveLUcurrengtCmWgt13OG^_ZgtlaquoqOslashm^_Z|5 CcedilEgrave^_ZaumlIacutecurrenWCyB

eumlnmWordfqUgraveDefgh$13O[P TOP500B^_ZWCBeumln LinpackMPIEumlIgraveUcircWCnotgtlaquoqiumlIuml`$XOordfecircMAumlWXq Tflopsgteumlnq` 1gtiumlIumlegravepoundDM(G)`5Pq


3112 DGEMM^_Z DGEMM5IgraveGmWgtiumlIumlOG^_Zgtlaquoqo^_gt5curren13^_W i5Xqm^_Z

LinpackSgtegravenauml^_ZgtlaquoqUgraveDaumlefgh$gto13zeumlnDeacuteaTHORN|ordfZaZ|WCBeumlnsect

meacuteaTHORN|ordfoacute WXqY`LBMTPbrvbariumlIumlOiexclcentGq Gflops(Giga floating-point operations per second)gteumlnq` 2gtiumlIuml[(SN)(EP)`5Pq 3113 STREAM^_Z STREAM iumlIumlordf|W CDWecirceacuteIumlG^_Zgtlaquoqm

Wecirco^_gtO=ordf5np^_W i5Xqm^_Z

cpound(Copy)curren(Scale)Myen(Add)yen(Triad) 4 XO=gtJeumlnsectiumlIuml|YiexclcentGq GBs(Giga Bytes per second)gteumlnq` 8 gt[(SN)(EP)gtoiumlIuml 4(cpoundcurrenMyenyen)`5Pq 3114 PTRANS^_Z PTARNS(Parallel matrix TRANSpose)EumlIgrave^_ocirc5IgraveFtimesbrvbarsect13ccedil`brvbarY G^_Zgtlaquoq GBs gteumlnq` 1gtiumlIumlegravepoundDM(G)`5Pq 3115 Random Access^_Z Random Access |sectumleumlnDecopyIgravecdZordf 1 ulaquonotgtshyregGO=ordf 1 macrdegplusmn5ncG^_Zgtlaquoq1 plusmnYGOcirc$2ordf8BWgteumlordfgtlaquoqiumlIuml|Yucircsup2`WiumlIuml MPI gtucircsup2`5Pq Gups(Giga updates per second)gteumlnq` 3 gtiumlIuml[(SN)(EP)iumlIumlegravepoundDM(G)`5Pq 3116 FFTE^_Z FFTEAumlF|sup3^acuteOG^_ZgtlaquoqHPLWiacute CcedilEgrave^_ZaumlIacutecurrenWCyBeumlnmWordfq[(SN)W(EP)`gt2 [microcopyIgraveordfo^_gtpara|sup3^acuteeumlnqiumlIumlegravepoundDM(G)`gt3[microcopyIgraveordflaquo 1AgravezmiddotAcurrenetheumlnDcedil|sup3^acuteeumlnlaquoW^_ordf^_W 5PmWLoacute Pbrvbaro^_

copyIgravesup1copytimesordm^_raquofrac14sect[microoacuteG|sup3^acuteordf5nq


Gflopsgteumlnq 3117 b_eff^_Z b_eff Ocirc$FGFGfrac12frac34(7a13acute latency)WFGFAring(eacuteIuml band width)shycdiquestH^_ZgtlaquoqFGfrac12frac34ordfAgraveXCeacuteIumlordfecircM13^_ ordfAacutenmWGqb_eff iumlIumlegravepoundDM(G)G Ring W PingPong 2 ^_ZgtJeumlnqb_eff FGGOcirc$-a5mnUgravegt^_ZW+XsectlOcirc$-a5 N ] MPI ^_ P atildeAcircXq7a13 8B Ocirc$eacuteIuml 2MB Ocirc$FGGmWbrvbarsect5Pq Ring MPIbrvbarpoundAtildeHdnDZYAumlAringAEligo^_oacuteCAumlAring 1oacute 1 5PoumlWZYAumlAringZordfEumlu 1oacute 1 5Pouml 2CcedilEgravegtCDq PingPong 2Core Eacute Ecircnototilde^_ordfethsectntildednp Core EumlCOslash5Gq2CoreIgraveEacuteIacute5poundecircTIcircIIumlshyGq b_effETHNtildeOgraveRing(AumlAring)brvbar7a13OacuteRing(AumlAring)brvbareacuteIumlOcircRing(Zordf)brvbar7a13OtildeRing(Zordf)brvbareacuteIumlOumlPingPongbrvbar7a13timesPingPongbrvbareacuteIuml 6gtlaquoq7a13ordfaY_LOslashPmacreacuteIumlordf GBsgteumlnq 312 Oslash5

HPCCZaZ(Baseline Runs)WUuml^Otildea5Z(Optimized Runs) 2Oslash5ordflaquoordffeaZTUacuteUcircEacutebrvbarZaZgt5pound

q

3121 ZaZ Oslash5gtlaquoZaZfeaZbrvbarTUacuteUcircW13Oumlfrac34

eumlnaumlUgraveAuml BLAS ZaZ|W MPI ZaZ|egraveBUgravegtUacuteeumlnordfUcircfIuml^regUacuteeumlnXqegraveBCDfeaZeacuteYacute~

WfeaegraveBCDUuml^13~ZaZ|eacuteYacute~UumlordfYacutegtlaquoq 3122 Uuml^Otildea5Z UcircfIumlTHORNszligagravePTUacuteUcirc2 7gtUacute9eumlnqiAgraveaacutednDIacutecurren(-)fIumlTUacuteUcircgtlaquosectpAgrave|5macr8CEcircnototilde3XTUacuteUcircgtlaquoqUgraveD acirc HPCC^_YacuteNtildeY`acircatilde^WaumlaringordfaeligugtlaquosectatildeCUumlGaeliguordflaquoq


32 BX900 FX1n~nHPCC BMT^_Z 8ccedil512EumlIgravegtOslash5CDq 321 BX900 BX900 HPCC BMT^_Z Table 1GWHPL

PTRANSRandom Access FFTEo^_ZZK|Otilde Fig 4ccedilFig 7GqZK|Otildedegn^_Zegraveeacuteecirccopyordfeuml]cgtlaquoordf512 EumlIgravegtigraveyenCiacuteXCiacuteicircXordfdegdnqUgraveDHPL=NoacuteC 822(=4902596100)1gtlaquopoundDq

Table 1iumlntildeethntildeograveoacuteWCHPCCocircAEligcdotildebIcircIumlETHNtildeoumlccedilYAElig Intel Endeavor cluster (13gt Endeavor Wdivideoslash)HPCCWeboumlYacute 1) ugraveuacute Table 2Gq BX900shyucircoB3gtlaquoDH Ecircnototilde^_ZP EndeavoregraveeacuteplusmnpoundDlaquoordfEcircuumlb13yacuteordfdegdnqUgraveDEndeavorWEacute IntelfeaZBD Table 3 GbrvbarPQRfeaZbrvbarWbgtlaquopoundDqDotildeCSTREAM iumlIumlSgt5poundmWcd thornCXatilde

d(35GBs oacute 42GBs)W 2ethM EndeavorbrvbarsectEshygtlaquopoundDqm422gtaringNGq

1 BX90013VBbrvbargt17072fgt=N 2000TflopsoacuteC 1914TflopsOslashCsectOslashszlig 9566gtlaquopoundDq
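The figures in this footnote (17,072 cores, 191.4 Tflops measured, 95.66 % efficiency) are consistent with a per-core peak of 2.93 GHz × 4 operations per cycle, as given in Section 2.1. A minimal Python sketch of that calculation follows; the function name is hypothetical, and the sketch assumes that every core contributes the same peak.

```python
def hpl_efficiency_percent(measured_tflops, cores, clock_ghz=2.93, flops_per_cycle=4):
    """HPL efficiency = measured performance / theoretical peak x 100."""
    peak_tflops = cores * clock_ghz * flops_per_cycle / 1000.0  # Gflops -> Tflops
    return 100.0 * measured_tflops / peak_tflops


# Whole-system BX900 run: 17,072 cores, 191.4 Tflops measured.
whole_system_eff = hpl_efficiency_percent(191.4, 17072)
print(f"Peak: {17072 * 2.93 * 4 / 1000.0:.1f} Tflops, efficiency: {whole_system_eff:.2f} %")
```

The exact peak from this formula is 200.08 Tflops; dividing by the rounded 200.0 Tflops instead would give 95.70 %.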


Table 1 HPCC BMT results on JAEA's BX900 with the Fujitsu compiler

Parallel  Comm.    G-HPL      G-PTRANS  G-RandomAccess  G-RandomAccess  G-FFTE    EP-STREAM  EP-STREAM  EP-DGEMM  RandomRing  RandomRing
count     method                        Original        SANDIA_OPT2               Sys (*)    Triad                Bandwidth   Latency
                   [TFlops]   [GB/s]    [Gups]          [Gups]          [GFlops]  [GB/s]     [GB/s]     [GFlops]  [GB/s]      [usec]
8         rdma     0.0813151  2.9330    0.015112        0.089847        5.5533    28.074     3.5093     11.0184   1.11715     1.268
8         nordma   0.0812261  2.9544    0.015504        0.090139        5.5757    27.813     3.4767     11.0239   1.10535     1.278
16        rdma     0.1579130  5.1054    0.023885        0.151071        9.4469    55.514     3.4697     11.1145   0.92418     1.493
16        nordma   0.1536650  4.9108    0.022227        0.133443        9.7508    55.568     3.4730     11.1123   0.78781     3.245
32        rdma     0.3224930  9.0716    0.030751        0.242906        18.6357   111.164    3.4739     11.1010   0.52206     1.944
32        nordma   0.3234940  8.4422    0.035001        0.237182        19.0981   112.052    3.5016     11.1003   0.51949     5.010
64        rdma     0.6201260  15.0637   0.032786        0.378769        36.9923   222.203    3.4719     11.0924   0.41924     3.156
64        nordma   0.6098710  14.6868   0.076346        0.388266        38.1397   222.600    3.4781     11.0887   0.44542     6.527
128       rdma     1.2062700  26.3037   0.037005        0.518656        73.0757   444.264    3.4708     11.1007   0.38892     4.635
128       nordma   1.2265800  30.2225   0.073652        0.616532        74.2721   444.063    3.4692     11.1005   0.41617     8.290
256       rdma     2.4166200  58.3348   0.029897        0.538761        139.3770  888.453    3.4705     11.0910   0.35835     5.937
256       nordma   2.3638900  57.5198   0.076556        0.800348        148.4550  895.037    3.4962     11.0879   0.40323     10.372
512       rdma     4.6846200  111.9940  0.016391        0.262220        268.5910  1777.951   3.4726     11.1147   0.32838     7.113
512       nordma   4.9017400  109.9150  0.082799        1.537610        275.5280  1816.433   3.5477     11.1119   0.35721     12.790

System: Fujitsu BX900 (512 cores / 128 CPUs)
Chip: Xeon X5570 2.93GHz, DDR3-1066, SMT-ON, Turbo Mode-OFF
Interconnect: QDR InfiniBand, 2 sockets/node, Fat Tree
OS: Red Hat EL 5
Compiler: Fujitsu C/C++, options -Kfast -SSL2
MPI: Fujitsu Parallelnavi Language Package
HPCC Ver: V1.3.1
(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × (parallel count).
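The note to Table 1 states that EP-STREAM Sys is obtained as (EP-STREAM Triad) × (parallel count). The short Python sketch below checks that relation against two rows as read from Table 1 (8 parallel with rdma and 512 parallel with nordma); the helper name and the row selection are illustrative assumptions, and small differences are expected because the listed Triad values are rounded.

```python
# (parallel count, EP-STREAM Triad [GB/s], EP-STREAM Sys [GB/s]) as read from Table 1
rows = [
    (8, 3.5093, 28.074),      # 8 parallel, rdma
    (512, 3.5477, 1816.433),  # 512 parallel, nordma
]


def stream_sys(triad_gbs, nprocs):
    """Aggregate STREAM bandwidth: per-process Triad bandwidth times process count."""
    return triad_gbs * nprocs


relative_errors = []
for nprocs, triad, sys_listed in rows:
    sys_calc = stream_sys(triad, nprocs)
    rel_err = abs(sys_calc - sys_listed) / sys_listed
    relative_errors.append(rel_err)
    print(f"{nprocs:4d} parallel: calculated {sys_calc:9.3f} GB/s, listed {sys_listed:9.3f} GB/s")
```

Because Sys is a simple product, it grows linearly with the parallel count as long as the per-process Triad bandwidth stays near 3.5 GB/s.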


Fig 4 BX900 HPL

Fig 5 BX900 PTRANS


Fig 6 BX900 Random Access

Fig 7 BX900 FFTE


Table 2 Intel Endeavor cluster (HPCC Web 1))


Table 3 HPCC BMT results on BX900 with the Intel compiler


322 FX1 FX1 HPCC BMT Table 4 HPL

PTRANS Random Access FFTE Fig 8 Fig 11512

HPL755(=377650100)2

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037


Table 4 HPCC BMT results on FX1


Fig 8 FX1 HPL

Fig 9 FX1 PTRANS


Fig 10 FX1 Random Access

Fig 11 FX1 FFTE


33 HPCCYUgraveWH Uklq BX900W FX1WrsndWK(Altix3700Bx2)Wrs

Fig 12GqmhgtTWXpoundDqshy 100WCLRandom Ring LatencyPqRandom Ring Latency13BX900ordf`ccedil^fCmWordfcurrencq

BX900 W FX1 rsGWSTREAM W FFTE necircOatildeC atildeC eacuteIumlrvCDordfmacrdnDqSTREAM

421FFTE 43aringNGq XAltix3700Bx2 Random AccessordfEordfAltix3700Bx2W

plusmngtegraveBCD HPCC BMTeacuteYacute~cdOcirc$Y|5ordf+Xpoundsect8YrsGmW13XqAltix3700Bx2 atildeC FFTEEordfmdoacute OslashszligWuumlwdn

q

[Fig 12 chart (scale 0 to 100) for BX900, FX1 and Altix3700Bx2 at 512 processes (Flat MPI). Axis labels, with values as extracted: G-HPL (490Tflops), G-RandomAccess (154Gups), G-PTRANS (1120GBs), EP-STREAM (436GBs), EP-DGEMM (111Gflops), G-FFTE (2755Gflops), RandomRing Bandwidth (0359GBs), RandomRing Latency (518usec). Units shown on the chart: [GFLOPS], [GB/s].]

Fig 12 HPCC BMT

4 BX900W FX1TUacuteUcirc 32 c^_Zgt BX900 W FX1 IcircIumlETHNtildeoumlccedilYWiCXordfWnDmWQR^_aZegraveBC(Xcurrenyen

5poundDqeumldfeaZUuml^13~brvbarTUacuteUcirc]Ucirc7^_ZTUacuteUcirc

EacuteDq 41 ^_aZ

BX900 W FX1 gtegraveBCD^_aZWQRUgraveyenXgtlaquosectoacute ^_ZOslash5^_a|Ocirc$divideC^_ZOslash5cedildivideOcirc$divideC

^_Z=)yen5PmWordf13q ^_aZyBGUgrave^_Z_IumlYacuteh)JGDHfe

a|YUuml^13~UKtl_trtGq[ _ZOslash5_IumlYacute

h fpcoll fIumlgt=GqTcedildivideCDeacuteaTHORN|Ocirc$ fprof fIumlgtdivideCaring`Gq^_aZgtPAElig 8q|gtlaquoq

^_a|Ocirc$divideAElig AElig ^_ AElig MPIZaZ|ordm)AElig f`AElig IcircIumlETHNtilde$AElig fZAElig UcircfIumlAElig

mPIcircIumlETHNtilde$AEliggtCPUlt=gtOaringaeligccedil13

hccedil`|WOcirc$FGszligbordfoacute WXq

42 feaZbrvbar|YTUacuteUcirc BX900HPCC BMT^_Z STREAMordfotildeb

gtlaquo EndeavorWrsC(35GBsacute42GBs)W 2ethMEshygtlaquopoundDmWETHNtilde5poundDqETHNtildeBDUcirc HPCCpSTREAMUuml|YacuteTHORN(CUcirc amp FortranUcirc)7ccedilIumlEumlIgraveUcircWMPIEumlIgraveUcircgtlaquoq[13BGbrvbarPmdegn BX900 feaZUuml^13~UKntst brvbarsectCDmWcdEndeavorfeaZ(IntelUgrave)W BX900feaZ(QRUgrave)igraveiacuteWuumlwdnq


421 Uuml|YacuteTHORN STREAM^_Zbrvbarszligagraveaacute BX900Uuml|YacuteTHORN STREAM^_Z Table 5Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntst brvbarordfmacrdnDqDotildeCUuml^13~gtOcirc$|sectumlGaringaeligccedil13hyBCXbrvbarPGDHoacute WX DO^SgtOcirc$sup1yBordfXmWordfdcXoumlEacuteegraveBGuecircgtlaquosectPgtXoumlordfUcirc

GordflaquoqUgraveDfeaZordfoacute WGDO ^SgtsectumlOcirc$ordf^^shypoundZYeumlnoumlEacutegtlaquoq CUcircoumlIntelfeaZXd OpenMPEumlIgraveUgraveUgravegt 42GBscoreordfdegdnDordfQRfeaZgt7ccedilIumlEumlIgraveUcirc OpenMP EumlIgravecd=EumlIgrave^regCbrvbarP]C Intel feaZWbWXpoundDqQRfeaZgtlaquo CD7ccedilIumlEumlIgraveUcircAgraveogravebrvbarpoundOcirc$copytimes]AgravebTUacuteUcircAgraveograveordflaquoWnq [ FX1Uuml|YacuteTHORN STREAM^_Z Table 6Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntstbrvbarordfmacrdnDqXTriad (132GBs)CPUQRordfWebgtUumlCshy(135GBs)CPUWotildeEacutegtlaquoq CUcircoumlordf FortranUcircUKntstUuml^13~XCWotildeiCqFX1 CfeaZ BX900W+XsectUKntstUuml^13~ordfcentsup3CXqDHFX1 CfeaZbrvbarTUacuteUcircordf$currengtlaquopoundD9ordfAumlq

Table 5 BX900 STREAM

Table 6 FX1 STREAM

422 HPCC BMT^_Z STREAMbrvbarszligagraveaacute 321gtdegdnD BX900 HPCC BMT^_Z STREAM

Table 7GqmmgtUKntstcopyIgraveOcirc$ecircEacutearingaeligccedil13heumlX storeampegraveBGmWUKrestp=all(a$XsectordfXmW)WCDTUacuteUcirc5PfeaZUuml^13~gtlaquoq UgraveDyacutegtgtlaquoordf _ZOslash53iumlIuml XC

iumlIumlSgt|Yo CPU]o CoreiDHEumlIgrave] Agravejatildedn~nfeaZUuml^13~brvbar^UcircCXq

STREAMbrvbarPX CoreIumliX2gtZX|Yordf5n DO^gtfeaZUuml^13~UKntst GmWgt|Y+13mWordfcurrencpoundDq

Table 7 HPCC BMT^_Z STREAM (MPIEumlIgraveUcircCUcirc) feaUuml^13~Ograve(1) feaUuml^13~Oacute(2)

EumlIgrave Agravej 13aelig13

EP-STREAMTriad (GBs)

13aelig13 EP-STREAMTriad (GBs)

rdma 1 34739 U U 32

nordma 1 35016 1 43316 rdma 3 34708 U U

128 nordma 3 34692 1 43572

rdma 5 34726 4 43595 512

nordma 4 35477 4 43424 L-1QRfeaZUuml^13~OgraveacuteKfastpackedntst SSL2 -2QRfeaZUuml^13~OacuteacuteKfastpackedntstrestp=all SSL2P


43 hhbrvbar|YTUacuteUcirc 32 gtdcWXpoundDbrvbarPBX900 W FX1 FFTE ^_Z

acircIcircIumlETHNtilde13yacuteordfmacrdnDqmETHNtildeDH5pound

DhhW13gtszlig0Gq 13gt FFTE 32EumlIgraveOslash5CDWecircoacuteG ethBX900

gt 12FX1 gt 16Wgteumlqpoundordm)1|YEcirc2OgtlaquopoundD9ordfAumlqmgtFFTEUuml|YacuteTHORN(FortranUcirc)l3CUuml|YacuteTHORN(13aumlWoacuterGoumlUcircWdivideoslash)WaumlIacutecurren4C)JCDaumlacircoacuteGaumlZh5poundDiacuteegraveBC

hbrvbarpound^UcircCDordm)GmWbrvbarsectBX900 W FX1 oacuteG5g5poundDqUcircpaumlbrvbar5poundDh 1QRfeaZW Endeavor egraveBeumlnDafeaZWrs5PDHW 2auml13IacutecurrenTUacuteUcircbrvbar6)BbordfaumlIacutecurrenthorn7wXmWagraveaacuteGDHgtlaquoq UcircaumlIacutecurren 38plusmnUcircWmndaumlZhCDiacute

Ucircn~n Fig 13W Fig 14GWoIacutecurreniacuteEcircu0Gq Fig 13W Fig 14OgravegtCDIacutecurrenOO=5pound^gtlaquoqUuml|YacuteTHORN

gtcopyIgrave APXYZWWWWOcirc$Yordf 2ugt5nsectZgtXcpoundDDHiacutegt)9BcopyIgrave CY_W^curreneth5poundcopyIgrave APXYZWWWWYordfZXbrvbarPCDq)9BcopyIgrave CY_YZCXcpoundD=Ocirc$sectumlG^gtYordfZXpoundDoumlUcircordfecircDHgtlaquoq

Fig 13W Fig 14OacutegtCDIacutecurren3[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 3ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 3gtXC 6Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquosectaringaeligccedil13h-

a5brvbarpoundltG_ccedilY-a5^_ZUgrave)acircregmacr= 16 gteumlnq

Fig 13W Fig 14OcircgtCDIacutecurren2[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 2ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 2gtXC 4Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquoqFX1 gtBX900rufntildeDsectaringaeligccedilaringh-a5ordfgteumlmWcdmO=)t_ccedilY-a5 16cd 4^regGmWgtzyacuteordfmacrdnDq

Table 8BX900W FX1QRfeaZegraveBCDUcircWaumlbrvbarEuml BX900 afeaZegraveBCDaumlbrvbarGq

[Fig 13 listing (FFTE source code) is not recoverable from the extraction.]

Fig 13 FFTEUuml|YacuteTHORNaumlIacutecurrenUcirc^_Z

[Fig 14 listing (FFTE source code) is not recoverable from the extraction.]

Fig 14 FFTEiacuteaumlIacutecurrenUcirc^_Z

Table 8 FFTE Ver 4.1 Full and kernel (32 parallel, 180MB/Proc)

[Table cells are not recoverable from the extraction.]

UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq

[Fig 15 listing (NPB2.3 FT source code, cfftz) is not recoverable from the extraction.]

Fig 15 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- cfftz)

[Fig 16 listing (NPB2.3 FT source code, fftz2) is not recoverable from the extraction.]

Fig 16 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- fftz2)


UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc

Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT

5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

1plusmn`gt readwriteplusmn(blockSizecurrentransferSizejsegmentCount)gtlaquosect`aecirceuml (blockSizej segmentCountjMPI EumlIgraveAring )WXqyenwtransferSize=1kBblockSize=1MiBL106 Bytes MB220 Bytes MiB WGPsegmentCount=1MPI EumlIgraveAring=32 oumlblockSizejsegmentCountjEumlIgraveAring=32MiB testFile ordf)dnoZY 1MiBmiddotAbrvbarntildeC1KB7fIuml 1024iexcl1313aeligUgraveDZordfpoundEacuteecircGq

[transferSize] 1plusmn readwritegtFGGeacutea`qsizeof(long long int)currencblockSizegtlaquoaeliguordflaquoq

[blockSize] o^_ordfbrvbarntildeGeacutea`qgt 1MiBq [segmentCount] o^_brvbarsectYeumlniOcirc$qyzshy 1q [randomOffset] 1XdZordfYqOcirc`shy 0q [fsync] ecircEacuteAring fsync13f5Gqyzshy 0q [filePerProc] ^_sectWcent`aYGqPOSIXEacutenotszligq

yzshy 0q [useFileView] MPI_File_set_view() MPI_File_write() MPI_File_read()egravePq

ZordfYWparaCXqyzshy 0q [collective] MPI_File_write_at_all() MPI_File_read_at_all()egravePqZordfY

WparaCXqyzshy 0q

Fig 18 IOR YacircueZ$
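The description above gives two relations for one IOR repetition: the number of read/write calls per process is blockSize ÷ transferSize × segmentCount, and the file size is blockSize × segmentCount × number of MPI processes. The following is a minimal Python sketch of that bookkeeping. The function name `ior_layout` is hypothetical (it is not part of IOR), and the example assumes that the 1kB transfer size in the text means 1024 bytes, which matches the 1024 transfers per 1MiB block mentioned there.

```python
KIB = 1024
MIB = 1024 * 1024


def ior_layout(block_size, transfer_size, segment_count, num_tasks):
    """Return (transfers per task, total file size in bytes) for one repetition."""
    if transfer_size <= 0 or block_size % transfer_size != 0:
        raise ValueError("blockSize must be a positive multiple of transferSize")
    transfers_per_task = (block_size // transfer_size) * segment_count
    file_size = block_size * segment_count * num_tasks
    return transfers_per_task, file_size


if __name__ == "__main__":
    # Example from the text: transferSize=1kB, blockSize=1MiB,
    # segmentCount=1, 32 MPI processes.
    transfers, size = ior_layout(MIB, KIB, 1, 32)
    print(transfers, size // MIB)  # prints: 1024 32
```

With these settings each process covers its own 1MiB region in 1024 transfers, and the shared test file is 32MiB.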


512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

Table 11 MPIIO Info key and value

key                  value
cb_buffer_size       4194304
cb_nodes             (not recoverable from the extraction)
ind_rd_buffer_size   4194304
ind_wr_buffer_size   524288

[The description column is not recoverable from the extraction.]

52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

[Fig 19 diagram. Recoverable labels: ETERNUS DX80, 970TB, SPARC Enterprise M9000 (IO), SRFS, READ, WRITE.]

Fig 19 ReadWriteumlcopy

Table 12 (POSIX)

[Table cells are not recoverable from the extraction.]

Table 13 (MPIIO)

[Table cells are not recoverable from the extraction.]

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

[Fig 20 plots: Performance [MiB/s] versus TransferSize [B] (1024 to 1048576) on logarithmic axes, for sequential and random access, Read and Write, SingleFile and FilePerProc, on home and work, with N=32, 256 and 1024.]

Fig 20 POSIX

[Fig 21 plots: Performance [MiB/s] versus TransferSize [B] (1024 to 1048576) on logarithmic axes, for sequential and random access, on home and work, with N=32 and 256; curves labelled "use view" and "collective".]

Fig 21 MPIIO

Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`
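The passage above notes that C standard I/O uses a buffer of BUFSIZ bytes (8192 bytes is the figure given for gcc and fcc) and that setvbuf can change it. The following is a minimal sketch of the same idea in Python, chosen here only for illustration: the `buffering` argument of the built-in `open` plays the role that setvbuf plays in C. The function name `write_records` and the record and buffer sizes are hypothetical and do not come from the report.

```python
import os
import tempfile


def write_records(path, record, count, buffer_size):
    """Write `count` copies of `record` through a user-space buffer.

    Small writes are collected in a buffer of `buffer_size` bytes and
    passed to the operating system in larger chunks.
    """
    with open(path, "wb", buffering=buffer_size) as f:
        for _ in range(count):
            f.write(record)
    return os.path.getsize(path)


if __name__ == "__main__":
    record = b"x" * 1024  # 1 KiB per write call, like a small transferSize
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "testFile")
        # Compare an 8192-byte buffer with an 8 MiB buffer.
        for buffer_size in (8192, 8 * 1024 * 1024):
            size = write_records(path, record, 4096, buffer_size)
            print(buffer_size, size)
```

The file contents are identical in both cases; only the number of calls that reach the file system changes, which is why a larger buffer can help when an application issues many small transfers.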

[Diagram residue: storage layout. Recoverable labels: ETERNUS DX80 1-18 (dx01 to dx18) containing DX80 01 to DX80 18; ETERNUS DX80 19-36 (dx19 to dx36) containing DX80 19 to DX80 36; SRFS; volume names data_01 to data_10, data, home, work; IO 1 (io1).]

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

IObrvbar

2(io

2)

Fi

g 2

2a13J

Table 15 OcircOtilde7Y`|J

SRFS name   QFS name       Unit size     Total capacity              DAU     Remarks
data_01     QFS_data_01    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_02     QFS_data_02    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_03     QFS_data_03    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_04     QFS_data_04    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_05     QFS_data_05    3695488 MB    3695488 × 36 = 126.9 TB     1024k
home        QFS_home       3695488 MB    3695488 × 36 = 126.9 TB     512k
work                       524288 MB                                         Io2JAgraveAacuteAcirc

SRFS name   QFS name       Unit size     Total capacity              DAU     Remarks
data_06     QFS_data_06    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_07     QFS_data_07    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_08     QFS_data_08    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_09     QFS_data_09    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_10     QFS_data_10    3695488 MB    3695488 × 36 = 126.9 TB     1024k
data        QFS_data       3695488 MB    3695488 × 24 = 84.6 TB      1024k
sysdata     QFS_sysdata    3695488 MB    3695488 × 12 = 42.3 TB      16k     IumlETHNtildeETH
work        QFS_work       524288 MB     524288 × 144 = 72 TB        1024k   dx01 to dx36 OgraveJ
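The total-capacity column of Table 15 is the unit size multiplied by the number of units. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and the conversion 1 TB = 1024 × 1024 MB is an assumption, chosen because it reproduces the listed totals (for example 2109888 MB × 36 gives 72.4 TB).

```python
def total_capacity_tb(unit_size_mb: int, units: int) -> float:
    """Total capacity in TB, assuming 1 TB = 1024 * 1024 MB (assumption)."""
    return unit_size_mb * units / (1024 * 1024)


# (file system, unit size in MB, number of units) as listed in Table 15
rows = [
    ("data_01", 2109888, 36),
    ("home", 3695488, 36),
    ("data", 3695488, 24),
    ("sysdata", 3695488, 12),
    ("work", 524288, 144),
]

for name, size_mb, units in rows:
    print(f"{name:8s} {total_capacity_tb(size_mb, units):6.1f} TB")
```

The same function can be applied to any other row of the table to check its total.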

6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 InfiniBand QDRegraveBCD Fat Treeicircgtlaquoq113aelig13S 18iumlIumlAEligicircgtlaquoordf113aelig13 LeafWdividenaccedil7Iumlordf 2Igravecenttimeseumln1accedil7IumlcdoiumlIuml QDR ordf 1 7gtYZeumlnqaccedil7Iuml 2 Igravelaquogt1 iumlIuml 2 QDR ordfYZeumln(QDR Iacutegt 18 j2Igravegt 36WX)qoiumlIuml`ZaringbrvbarsectT 8GBs (4GBsj2) AAEligpoundq 13aelig13 5PDHLeafaccedilyacute7 SpineWdividenaccedilordf

9iexcllaquosecto LeafaccedilGu Spineaccedil QDRgtYZeumlnqbrvbarpound 113aelig13cd SpineIacutegt 9j2Igrave=18 QDRordfegraveBeumlnq QDRordf 5PoumlLeaf accedil3xmacrDownlink ordf 36 Uplink ordf 18 Xgt Uplinkshy 50WXsectmn 50_ccedilaringWdividenotgtqBX900gta$fY`Ecircfrac34h Fig 23GqBX900gt 50_ccedilaring13aelig13ocircotilde gtthornordf]GqSpineLeafaccedilW13aelig13SoiumlIumlYZGEcircfrac34h Fig 24Gqmmgt13aelig13S nAumliumlIuml nAuml(niquest9)UgraveD nU9Auml(nAgrave9) SpineaccedilegraveBGqlaquo13aelig13laquoiumlIumlordfcent13aelig13iumlIumlcdOcirc$ltAacuteoumlSpineaccedilcd Leafaccedil(2Igrave) QDR2(1j2Igrave)ordfegraveBeumln(2Igrave)LeafaccedilcdiumlIumlQDR2gtYZeumlnqDotildeCmSpineaccedilcd Leaf accedilordmAcirc13aelig13SPAtildeWiumlIumlWnoteumlnqmDHyenwiumlIumlTEacuteOcirc$lt GoumlWiumlIuml 1 W 10 ordflt GoumlWgtn~n FAringordf+XqX13aelig13Sotilde gtiuml_ccedilaring

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq

(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq
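The link counts quoted in this section (18 nodes per chassis, 2 QDR links per node at 4 GB/s each, 2 leaf switches per chassis, 9 spine switches, 36 downlinks against 18 uplinks) can be checked with a short Python sketch. The constant and function names are hypothetical, and the wiring of one QDR link from each leaf switch to each spine switch is an assumption read from the "9 × 2 = 18" figure in the text.

```python
# Topology figures quoted in the text; one link per leaf/spine pair is an assumption.
NODES_PER_CHASSIS = 18
QDR_LINKS_PER_NODE = 2
LEAF_SWITCHES_PER_CHASSIS = 2
SPINE_SWITCHES = 9
QDR_GB_PER_S = 4  # 8 GB/s per node = 4 GB/s x 2


def chassis_links():
    """Return (downlinks, uplinks) of one chassis, counted in QDR links."""
    downlinks = NODES_PER_CHASSIS * QDR_LINKS_PER_NODE
    uplinks = SPINE_SWITCHES * LEAF_SWITCHES_PER_CHASSIS
    return downlinks, uplinks


down, up = chassis_links()
print(f"downlinks={down}, uplinks={up}, uplink share={100 * up // down}%")
print(f"node bandwidth={QDR_LINKS_PER_NODE * QDR_GB_PER_S} GB/s")
```

Under these assumptions the uplink capacity of a chassis is half of its downlink capacity, which is the 50% figure given for the fat tree.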

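[Figure: chassis 1 (nodes bx0001, bx0002, ..., bx0018; leaf switches Leaf(1-1) and Leaf(1-2)) through chassis 119 (nodes bx2125, bx2126, ...; leaf switches Leaf(119-1) and Leaf(119-2)), connected to the spine switches Spine(1), Spine(2), ..., Spine(9).]

Fig. 23 BX900 network configuration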

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq

[Figure: the nodes of one chassis (1 chassis = 18 nodes), numbered 1 to 18, attached to the leaf switches (Leaf), which connect to the spine switches (Spine) SW 1 to SW 9.]

Fig. 24 Nodes and switches

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq

XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the send method

Mode         RDMA           RDMA            NORDMA
Processes    1 to 1024      1025 or more    -
Threshold    32768 bytes    16384 bytes     524288 bytes

Table 17 RDMA mode: switching between the Normal Send and Send on Request methods

Message size              0 to <16 KB     16 KB to <32 KB    32 KB or more
1 to 1024 processes       Normal Send     Normal Send        Send on Request
1025 to 2048 processes    Normal Send     Send on Request    Send on Request
2049 processes or more    NORDMA          Send on Request    Send on Request

Table 18 NORDMA mode: switching between the Normal Send and Send on Request methods

Message size              0 to <512 KB    512 KB or more
All processes             Normal Send     Send on Request
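The thresholds in Tables 16 to 18 amount to a simple lookup, sketched below in Python. This is only an illustration of how the tables read: the function name and its arguments are hypothetical, it is not the MPI library's implementation, and the row for 2049 or more processes in RDMA mode is deliberately left out because its first cell refers to NORDMA.

```python
def send_method(mode: str, nprocs: int, nbytes: int) -> str:
    """Pick the send method from the Table 16 thresholds (illustrative only)."""
    if mode == "NORDMA":
        threshold = 524288          # all process counts (Table 18)
    elif mode == "RDMA":
        if nprocs > 2048:
            raise ValueError("2049 or more processes: not modelled in this sketch")
        # 1 to 1024 processes: 32768 bytes; 1025 to 2048 processes: 16384 bytes
        threshold = 32768 if nprocs <= 1024 else 16384
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Below the threshold: Normal Send; at or above it: Send on Request
    return "Normal Send" if nbytes < threshold else "Send on Request"


print(send_method("RDMA", 512, 20 * 1024))    # 16 KB to <32 KB, 1 to 1024 processes
print(send_method("RDMA", 2048, 20 * 1024))   # same size, 1025 to 2048 processes
print(send_method("NORDMA", 4096, 20 * 1024))
```

The three calls show that the same 20 KB message is handled differently depending on the mode and the process count.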

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq

62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Ocirc$acirc

Parallelism    Allreduce, Reduce, Bcast, SendRecv    Other MPI routines
256            32 bytes                              128 bytes
512            same as above                         128 bytes
1024           same as above                         256 bytes
2048           same as above                         2048 bytes
4096           same as above                         2048 bytes

      do 1000 III = 1, NNN
         call MPI_BARRIER(MPI_COMM_WORLD,ierr)
         ttime1 = MPI_WTIME()                 ! measurement start
         call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                      b,nelem,MPI_DOUBLE_PRECISION,
     &                      MPI_COMM_WORLD,ierr)
         ttime2 = MPI_WTIME()                 ! measurement end
         ttime0 = ttime2 - ttime1
c        longest elapsed time over all ranks, collected on rank 0
         call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                   MPI_COMM_WORLD,ierr)
         if(myid.eq.0) then
            max_time = max(max_time,ttime)
            min_time = min(min_time,ttime)
            all_time = all_time + ttime
         end if
 1000 continue

Fig 25 ^_ZLIacutecurrenP

13

622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Figure: six panels, Allreduce, Reduce, Reduce_scatter, Allgather, Gather and Allgatherv, each at 512-way parallelism and each with one rdma curve and one nordma curve.]

Fig. 26 MPI (512 parallel, 1)

[Figure: eight panels, Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (Rank 0 and Rank 511), each at 512-way parallelism and each with one rdma curve and one nordma curve.]

Fig. 27 MPI (512 parallel, 2)

brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq

Fig. 28 RDMA and NORDMA (small data: 0 KB to 8 KB)

63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq

      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()               ! measurement start (whole section)
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,1,
     &               MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,1,
     &               MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,1,
     &               MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,1,
     &               MPI_COMM_WORLD,ireq(4),ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()               ! measurement start (computation)
!$OMP parallel do
      do j = 1, NMM
         do i = 1, NMM
            a1(i,j) = 1.d0
            b1(i,j) = 1.d0
            c1(i,j) = 0.d0
         end do
      end do
!$OMP parallel do
      do j = 1, NMM
         do k = 1, NMM
            do i = 1, NMM
               c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
            end do
         end do
      end do
      ttime8 = MPI_WTIME()               ! measurement end (computation)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()               ! measurement end (whole section)
      ttimeDD = ttime8 - ttime7          ! computation time
      ttimeal = ttime6 - ttime5          ! total elapsed time
      ttimesd = ttimeal - ttimeDD        ! total minus computation

Fig 29 WO=5PDH^_Z

632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq

64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
         IIRANK = III
         call MPI_BARRIER(MPI_COMM_WORLD,ierr)
         ttime1 = MPI_WTIME()
c        rank 0 sends NNN messages to rank IIRANK
         do 120 II = 1, NNN
            if(myid.eq.0)
     &         call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                       IIRANK,II,MPI_COMM_WORLD,ierr)
            if(myid.eq.IIRANK)
     &         call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                       0,II,MPI_COMM_WORLD,istatus,ierr)
  120    continue
         ttime2 = MPI_WTIME()
c
         IC = IC + 1
         timew(IC) = ttime2 - ttime1
         ir(IC) = IIRANK
 1000 continue

Fig 30 ToacuteT ^_ZLIacutecurrenP

NODE

CPU0                       CPU1
Core ID = 0   (Rank 0)     Core ID = 1   (Rank 4)
Core ID = 2   (Rank 1)     Core ID = 3   (Rank 5)
Core ID = 4   (Rank 2)     Core ID = 5   (Rank 6)
Core ID = 6   (Rank 3)     Core ID = 7   (Rank 7)

Fig. 31 Assignment of MPI ranks
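The rank placement of Fig. 31 can be written as a small lookup table, sketched here in Python. The names are hypothetical. The second table is the identity placement (rank 0 to 7 on core ID 0 to 7) that the text after the figure gives for Open MPI, and the rule that even core IDs belong to CPU0 and odd core IDs to CPU1 is read off the figure.

```python
# Rank -> core ID placement; the list index is the MPI rank within the node.
FIG31_PLACEMENT = [0, 2, 4, 6, 1, 3, 5, 7]   # as drawn in Fig. 31
IDENTITY_PLACEMENT = list(range(8))          # rank i on core i (Open MPI case)


def cpu_of_core(core_id: int) -> int:
    """Fig. 31 shows even core IDs on CPU0 and odd core IDs on CPU1."""
    return core_id % 2


def path(rank_a: int, rank_b: int, placement: list) -> str:
    """Say whether two ranks of one node sit on the same CPU."""
    same = cpu_of_core(placement[rank_a]) == cpu_of_core(placement[rank_b])
    return "within CPU" if same else "between CPUs"


for peer in range(1, 8):
    print(peer, path(0, peer, FIG31_PLACEMENT), "|", path(0, peer, IDENTITY_PLACEMENT))
```

With the Fig. 31 placement, ranks 1 to 3 share a CPU with rank 0 and ranks 4 to 7 do not; with the identity placement the two cases alternate from rank to rank.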

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq

(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq

13

6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

Ocirc$G (G G)

^_0

^_1

^_2

^_3

^_4

^_5

^_6

^_7

Fig 32 ToacuteT Ocirc$G aYacute

Table 20 (within a chassis)

                           Flat MPI       Hybrid (72×2)    Hybrid (36×4)    Hybrid (18×8)
Transfer rate              723.8 MB/s     1447.4 MB/s      2894.2 MB/s      5346.7 MB/s
Total per node             5790.6 MB/s    5789.9 MB/s      5788.4 MB/s      5346.7 MB/s
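The two rows of Table 20 are related by the number of MPI processes placed on one 8-core node. The Python sketch below checks that relation. It assumes the values read as 723.8, 1447.4, 2894.2 and 5346.7 MB/s per process (the position of the decimal point is an assumption) and that each node runs 8 divided by the thread count processes; the names are hypothetical.

```python
CORES_PER_NODE = 8

# threads per process -> (rate per process, total per node), both in MB/s
TABLE20 = {
    1: (723.8, 5790.6),    # flat MPI
    2: (1447.4, 5789.9),   # hybrid 72 x 2
    4: (2894.2, 5788.4),   # hybrid 36 x 4
    8: (5346.7, 5346.7),   # hybrid 18 x 8
}


def node_total(rate_per_process: float, threads: int) -> float:
    """Per-node total, assuming CORES_PER_NODE // threads processes per node."""
    return rate_per_process * (CORES_PER_NODE // threads)


for threads, (rate, listed_total) in TABLE20.items():
    print(threads, round(node_total(rate, threads), 1), listed_total)
```

Under these assumptions each computed total agrees with the listed one to within rounding.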

Table 21 (between chassis, flat MPI)

        144 MPI                    192 MPI                    224 MPI
Node    Rate (MB/s)  IB shared     Rate (MB/s)  IB shared     Rate (MB/s)  IB shared
1       724.6        no            467.0        yes           467.1        yes
2       724.4        no            467.0        yes           467.1        yes
3       724.6        no            467.0        yes           467.1        yes
4       724.2        no            724.3        no            467.0        yes
5       724.3        no            724.4        no            467.1        yes
6       724.2        no            724.0        no            723.7        no
7       724.1        no            723.9        no            724.6        no
8       724.4        no            724.0        no            724.5        no
9       724.9        no            724.3        no            724.0        no
10      -            -             467.0        yes           467.0        yes
11      -            -             467.0        yes           467.2        yes
12      -            -             466.9        yes           467.1        yes
13      -            -             -            -             467.1        yes
14      -            -             -            -             467.0        yes

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq
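The sharing pattern in Table 21 follows from the pairing of nodes in Fig. 24: node n and node n + 9 of a chassis sit on the same IB links (1 and 10, 2 and 11, 3 and 12, and so on). The Python sketch below illustrates that reading. The function name is hypothetical, and the pairing rule is an inference from Fig. 24 and Table 21 rather than a statement taken from the text.

```python
def shares_ib(node: int, nodes_used: int) -> bool:
    """True if `node` shares its IB links with another node in use.

    Assumes nodes 1..nodes_used of a chassis are occupied and that
    node n is paired with node n + 9 (inferred from Fig. 24).
    """
    if not 1 <= node <= nodes_used:
        return False
    partner = node + 9 if node <= 9 else node - 9
    return partner <= nodes_used


# nodes used per chassis: 144 MPI -> 9, 192 MPI -> 12, 224 MPI -> 14
for used in (9, 12, 14):
    print(used, [n for n in range(1, used + 1) if shares_ib(n, used)])
```

For 12 nodes per chassis the sketch marks nodes 1 to 3 and 10 to 12, and for 14 nodes it marks nodes 1 to 5 and 10 to 14, which are the nodes with the lower transfer rate in Table 21.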

Table 22 (between chassis, hybrid)

        Hybrid (112x2)             Hybrid (56x4)              Hybrid (28x8)
Node    Rate (MB/s)  IB shared     Rate (MB/s)  IB shared     Rate (MB/s)  IB shared
1       933.9        yes           1856.7       yes           3640.8       yes
2       933.7        yes           1856.4       yes           3691.9       yes
3       934.6        yes           1856.6       yes           3653.6       yes
4       934.7        yes           1857.0       yes           3645.7       yes
5       934.2        yes           1865.1       yes           3664.6       yes
6       1438.5       no            2860.6       no            5380.8       no
7       1448.8       no            2911.8       no            5377.9       no
8       1450.0       no            2867.6       no            5378.9       no
9       1449.5       no            2910.0       no            5378.4       no
10      933.7        yes           1879.7       yes           3640.8       yes
11      934.0        yes           1877.5       yes           3671.5       yes
12      934.3        yes           1865.1       yes           3655.9       yes
13      934.8        yes           1881.1       yes           3645.7       yes
14      933.8        yes           1869.4       yes           3699.1       yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq

652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq

Table 23 iacuteXCcedilEgrave MPI_alltoall

vRcopyUumlYacutebrvbarTHORN

szligvcopyUumlYacutebrvbaragrave

pvRaacuteacircatildeNR

+aumleAumlovqaringaeligPgR

Oslash 5 Iuml

copy Igrave - a 5 (MB)

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

R

R

szligR

pR

144m

pi

512

3R6

0 11

14

118

0

145

9 3

60 R

138

6

36

0 13

87

100

1

31

124

1

25

2plusmn

(BC

DEacute

) 51

2 3R

60

111

4

118

0

148

2 11

16 R

146

9

63

0 14

75

100

1

33

132

1

32

3plusmn

(BC

DEacute

) 51

2 3R

60

111

4

29

0 14

81

714

13 R

153

3

29

0 14

21

100

1

33

138

1

28

448m

pi

512

14R

40

142

6

414

0

182

4 21

272

1 R

170

3

78

0 16

32

100

1

28

119

1

14

504m

pi

512

14R

45

142

7

415

8

180

4 27

242

6 R

160

5

79

0 14

33

100

1

26

112

1

00

2048

mpi

51

2 37R

69

182

1

298

8 24

93

465

6 R

250

1

465

6 25

04

100

1

37

137

1

38

2plusmn

51

2 37R

69

166

7

2211

6

262

5 67

564

6 R

264

7

298

8 25

04

100

1

57

159

1

50

3plusmn

(BC

DEacute

) 51

2 37R

69

166

7

2211

6

262

1 69

524

9 R

269

3 6

298

8 24

59

100

1

57

162

1

47

3072

mpi

51

2 69R

56

258

7 28

399

9 27

34

5371

54 R

286

3 9

507

7 26

44

100

1

06

111

1

02

68m

pij

2om

p 10

24

3R5

7 12

10

11

170

11

80

91

9 R

122

3

28

5 12

21

100

0

97

101

1

01

120m

pij

2om

p 10

24

9R3

3 12

89

215

0

127

0 12

191

6 R

130

4

47

5 12

94

100

0

99

101

1

00

132m

pij

2om

p 10

24

8R4

1 13

79

216

5

116

1 15

211

6 R

119

5

48

3 11

38

100

0

84

087

0

83

432m

pij

2om

p 10

24

17R

64

162

4

119

8 16

28

1925

43 R

161

2 1

129

0 16

18

100

1

00

099

1

00

30m

pitimes4

omp

1024

4R

38

598

115

0

674

12

13 R

683

27

5 68

2

100

1

13

114

1

14

2plusmn

(BC

DEacute

) 10

24

4R3

8 59

8

115

0

678

12

13 R

690

27

5 70

0

100

1

13

115

1

17

36m

pitimes4

omp

1024

6R

30

603

118

0

684

11

16 R

697

63

0 70

3

100

1

13

116

1

17

2plusmn

(BC

DEacute

) 10

24

6R3

0 60

3

29

0 68

8

714

13 R

702

29

0 71

0

100

1

14

116

1

18

126m

pitimes4

omp

1024

16R

39

822

3

415

8

805

27

242

6 R

738

1

79

0 71

5

100

0

98

090

0

87

180m

pitimes4

omp

1024

18R

50

925

615

0

892

40

283

2 R

776

127

5 75

3

100

0

96

084

0

81

2plusmn

(BC

DEacute

) 10

24

18R

50

925

712

9

861

29

352

6 R

808

109

0 76

8

100

0

93

087

0

83

612m

pitimes4

omp

1024

45R

68

110

2

349

0 10

34

525

9 R

111

4

349

0 10

46

100

0

94

101

0

95

2plusmn

(BC

DEacute

) 10

24

45R

68

110

2

2810

9

920

57

634

9 R

960

5

358

7 83

4

100

0

83

087

0

76

17m

pitimes8

omp

1024

3R

57

338

1

117

0

430

9

19 R

432

28

5 43

3

100

1

27

128

1

28

90m

pitimes8

omp

1024

15R

60

504

109

0 44

9

243

8 R

471

109

0 47

1

100

0

89

093

0

94

357m

pitimes8

omp

1024

53R

67

731

418

7 69

2

576

3 R

711

418

7 72

9

100

0

95

097

1

00

2plusmn

(BC

DEacute

) 10

24

53R

67

731

3410

5

642

57

685

3 R

628

6

438

3 60

2

100

0

88

086

0

82

brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

ucircuumlyacutethorn

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net/


A

FX1

HPC

C B

MT

G-F

FTEfeaZUuml^13~

-DH

PCC

_FFT

_235GmWgt

FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave

EP-

DG

EM

M-a5^regbrvbarpoundaringaeligccedil13h-a5

(15

MB

Cor

e)UacuteCordfzyacuteCq

Tabl

e A

-1feaZUuml^13~]-a5^regCD

FX1

HPC

C B

MT

UgraveccedilegravekeacuteUgraveccedilknUgraveccedilplusmnsup2sup3

~~regreg

Ugraveccedilmmnecircecirckccedilnecircfrac12

regReeumlgecirckccedilnecircfrac12

ndegplusmnsup2

ecirckccedilpUgraveecircfrac12frac12plusmnsup2sup3degmiddot

plusmnsup2ordmdegsup2sup1plusmnsup2sup3degmiddot

eacuteplusmn~+igraveiacute8

nmcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

reg~

reg~v

reg

sup2sup3plusmn

qvqyuqou

ovwyy

qvquzqq

vxzwo

qvzz

vyoy

zzv|w|

qvuqqy

uvuyx

qyu

sup2sup3plusmnqvqy|zx

ovwz

qvquyxz

vxuo

qvzuw

vyowx

zzv|z

qvxx|y

yvox

qwy

sup2sup3plusmn

qvxoquq

uvoxzo

qvoqqz

yyvuox

w|vyzx

vyoxx

zzvzyq

qv|qq

vqy

yy

sup2sup3plusmnqvxoqwwq

uvqwxx

qvooqzu

yyvxuwo

w|vyyq

vyouu

zzvzzx

qvux|

oqvz|

xw

sup2sup3plusmn

|vwq|qqq

|uvwz

qvqoxxw

qv|

o|uuv|q

vyxy

zzv||wq

qvqwzuw

oqvywyu

sup2sup3plusmn|vw||wqq

|vyz

qvuxu

yyyvuwwu

o|uyvyw

vy|q

zzv|uy

qvxwzqu

ovxuo

zq|z

sup2sup3plusmn

|vwuzqq

uuvxqwq

qvqoxxw

wwyv|zq

o|uv|q

vy|ox

xvuuwu

qvowox

oqvowuyq

sup2sup3plusmn|vwoquxqq

|yvzyxu

qvuzou

yvqwq|

o|uyvzxu

vy|qw

xvuuz

qvqwuy

ovuz

|y

uxqqq

wxqqq

|xqqqq

|qqqq

w

frac14icircRfrac14Rplusmndegreg

szligsup3degicircRRszligcedilszligiumliumlRethvRntildeRfrac14degicircRccedilogravemicroplusmnregntildemicro~sup1Oslashsup3sup2moRenecircfrac12oacuteregplusmnAumlocircotildeg

egravekeacuteOslashfrac14knicircccedilpegravekszligszligOslashmmnOslash|xRccedilpOslashpOslashfrac14knRccedilpegravekszligszligOslashfrac12ecircfrac12eacuteeacuteszlignRocircotilde

mszligpecircmicircccedilpeacutefrac14UgraveOslashOslashyunRocircotilde

frac12kicircRRkplusmnplusmnplusmnraquodegReacutemiddotplusmnmiddotRkplusmn~plusmnmiddot

-

moRexoszligcedilowszligkoumlg

szligdegicircRkszligyudividentildeRvxUgraveegraveparantildeRpposlashueuqUgravecedilregedegsup2plusmngntildeRo|vxUgravecedilregesup3plusmnregsup2RugraveRnecircfrac12gg

Cuacute uacuteucircicircuumlTyacutethornYacuteN0ucircicircuumlTyacutethornYacuteN0gt

|

xo

xo

egravekszligszligRethvicircRethov|vo

eeumlgicircecirckccedilnecircfrac12RregR0Eeecirckccedilnecircfrac12Rndegplusmnsup2gIcirc-RJ13NOIPQ

~~icircRppRmicrodegdegugraveplusmnsup2ntildeRoreg~cedilsup2ntildeRmplusmnRn


B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

Fig. B-1 MPI functions, 1024 ranks (1). Panels: Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv, Gather. Each panel plots the rdma and nordma series. [Plot image not reproduced; axis labels are not recoverable from the extracted text.]


Fig. B-2 MPI functions, 1024 ranks (2). Panels: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank0 -> Rank1023). Each panel plots the rdma and nordma series. [Plot image not reproduced; axis labels are not recoverable from the extracted text.]


13

C

RD

MAoacute

NO

RD

MA

Ta

ble

C-13UVOcirc$

(8K

Bccedil

78K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

quw-

uqzy-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

~plusmnreg

plusmn

sup2~

sup2~

sup2cedil~raquo

sup2~Oslashreg~plusmn

middotplusmnsup1

middotplusmnsup1raquo

~plusmnraquo

~plusmn

~plusmn

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

plusmnraquo

Tabl

e C

-2UVOcirc$

(80

KBccedil

584K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

sup2~Oslashreg~plusmn

plusmn

~plusmnreg

middotplusmnmiddotsup1

~plusmnraquo

~plusmn

sup2~

sup2cedil~raquo

sup2~

middotplusmnsup1raquo

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

~plusmn

plusmnraquo

Zordf

100M

RD

MAAgravejordfnotyWXq

0CXM

NO

RD

MA

AgravejordfnotyWXq


D

W Oslash5

Tabl

e D

-1Zccedil`

MPI

8mpi

qzvquecirciumlq|yvwuecirciumlquvzuecirciumlquovu|oecirciumlqxvwoqecirciumlquyvxqoecirciumlquyvxoecirciumlqu

qwvo

qzvzzecirciumlq|vuecirciumlquwvyecirciumlquovu|oecirciumlqxvwuecirciumlquyvuzoecirciumlquxvy|ecirciumlqu

yqyv|

owvz|oecirciumlq|yvwyyecirciumlquvxzecirciumlquvqxecirciumlquov|uecirciumlquyvuecirciumlquccedilxv|xecirciumlq

ccedilyvq

ozvqoecirciumlq|vxxecirciumlquwvux|ecirciumlquvyzoecirciumlquovxoecirciumlquyvuuqecirciumlquccedilvyecirciumlq|

ccedilwuvy

ovq|ecirciumlquyvzz|ecirciumlquwvq|qecirciumlquovu|qecirciumlqxvw|wecirciumlquyvuyoecirciumlquyvywecirciumlqu

yquv

ovquyecirciumlquvuyqecirciumlquwvxqecirciumlquovu|qecirciumlqxvwecirciumlquyvuwqecirciumlquxvzxecirciumlqu

xx|vw

|ovq|zecirciumlquyvwzuecirciumlquvz||ecirciumlquvyoecirciumlquovwecirciumlquyvuzecirciumlquccedilovozecirciumlq|

ccediloyvy

|ovquecirciumlquv|qecirciumlquwvuoyecirciumlquvyy|ecirciumlquovowyecirciumlquyvuecirciumlquccedilvxwecirciumlq|

ccedilovz

uzvwoecirciumlq|vuxyecirciumlquwvu|uecirciumlquovu|ecirciumlqxvw|ecirciumlquyvuuxecirciumlquxvwwuecirciumlqu

yqovy

uzvywecirciumlq|yvzuzecirciumlquvzoyecirciumlquovuwecirciumlqxvwqyecirciumlquyvuecirciumlquyv|yecirciumlqu

yxvu

xzvqecirciumlq|vuoqecirciumlquwv|wqecirciumlquvyyecirciumlquovowecirciumlquyvuuzecirciumlquccedilvozecirciumlq|

ccedil|vx

xzvyzzecirciumlq|voxuecirciumlquwvouecirciumlquvqqecirciumlquovxuecirciumlquyvuuyecirciumlquccediluv|yecirciumlq|

ccedilu|v

yzvuecirciumlq|vxzecirciumlquwvxecirciumlquovuyecirciumlqxvwecirciumlquyvuxecirciumlquxvyzqecirciumlqu

xw|vw

yzvz|oecirciumlq|yvwzoecirciumlquvwwxecirciumlquovuzecirciumlqxvwyecirciumlquyvuyyecirciumlquyvuqwecirciumlqu

yuxv|

zvwqecirciumlq|v|uzecirciumlquwv||yecirciumlquvyxwecirciumlquovozecirciumlquyvuzecirciumlquccedilyvwuecirciumlq|

ccedilywv

ovqq|ecirciumlquyvzzyecirciumlquvzzzecirciumlquvyzecirciumlquovqoecirciumlquyvuzecirciumlquccedil|vozxecirciumlq|

ccedil|ovz

zvuzecirciumlq|vowqecirciumlquwvoxxecirciumlquovoqqecirciumlqxuvxwecirciumlquyvuqecirciumlquvwu|ecirciumlqu

zovy

zvwecirciumlq|vyxecirciumlquwvuecirciumlquovqzzecirciumlqxuvxoecirciumlquyvuyzecirciumlquvu|ecirciumlqu

wvw

qwv|xzecirciumlq|vzecirciumlquwvywecirciumlquvxyoecirciumlquovqwuecirciumlquyvuecirciumlquccedilovqyecirciumlquccedilov

qzvuuuecirciumlq|v|yecirciumlquwvqecirciumlquovuzecirciumlqxvwqqecirciumlquyvuzoecirciumlquyvqoecirciumlqu

y|vx

oovouwecirciumlquwvqqecirciumlquzvoywecirciumlquovuqwecirciumlqxvyxxecirciumlquyvuwecirciumlquuvzoxecirciumlqu

uwvq

owvwuecirciumlq|vq|qecirciumlquvzqzecirciumlquv|uuecirciumlquzvqoecirciumlq|yvu|ecirciumlquccedilxvyxqecirciumlq|

ccedilyuv|

yvz||ecirciumlq|v|oecirciumlquwvuuecirciumlquvqyecirciumlqxovuowecirciumlqxyvu||ecirciumlquovozecirciumlqxoxwvy

ovouecirciumlquvuxecirciumlquwv|zecirciumlquovu|oecirciumlqxvwxyecirciumlquyvux|ecirciumlquxvzoecirciumlqu

xoyvo

|ov|qoecirciumlquv|zqecirciumlquwvyzoecirciumlquv|oecirciumlquwvu|uecirciumlq|yvu|ecirciumlquccedilov|uecirciumlquccediloqxv

|v|qecirciumlq|vqecirciumlquvxzecirciumlquvxqecirciumlquovq|ecirciumlquyvuzecirciumlquccedilv|zoecirciumlq|

ccedil|vu

uwvxy|ecirciumlq|wvzoyecirciumlquzv|ecirciumlquovuozecirciumlqxvuoqecirciumlquyvwqecirciumlquuvuoecirciumlqu

xoxvw

uovquecirciumlquvwecirciumlquwvqecirciumlquovuouecirciumlqxvyxqecirciumlquyvuzoecirciumlquxvwoecirciumlqu

xy|vx

xzvux|ecirciumlq|vxwecirciumlquwvx|ecirciumlquvx|oecirciumlquovoqoecirciumlquyvu|qecirciumlquccedilzvzoecirciumlq|ccediloquvz

xwvqqecirciumlq|vuuwecirciumlquwv|owecirciumlquvzuzecirciumlquov|wyecirciumlquyvxy|ecirciumlquccedil|vywuecirciumlq|

ccediluv|

yzvqecirciumlq|zv|w|ecirciumlquovq|qecirciumlqxovuxecirciumlqxvyecirciumlquyvzqecirciumlquuvqecirciumlqu

uyuvq

yovqyuecirciumlquv|quecirciumlquwv|ywecirciumlquovuyuecirciumlqxwvqxwecirciumlquyvxwoecirciumlquyvecirciumlqu

xwzvx

ovoooecirciumlquvzwwecirciumlquzvqzzecirciumlquvo|ecirciumlqxovuoecirciumlqxyvyouecirciumlquovecirciumlqxooqqvy

ovqoecirciumlquvqyoecirciumlquwvo|ecirciumlquvy|ecirciumlquovoecirciumlquyvuxecirciumlquccediluvxzecirciumlq|

ccediluvz

zvy|ecirciumlq|wvoqqecirciumlquzvqyecirciumlquov|uqecirciumlqxyvw|ecirciumlquyvxyecirciumlquuv||ecirciumlqu

uzovo

zvyzoecirciumlq|vqwecirciumlquwvoecirciumlquovqzwecirciumlqxuvuwwecirciumlquyvuzyecirciumlquvwqyecirciumlqu

yxvy

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg


Fig. D-1 MPI, 8mpi. Panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2. [Plot image not reproduced; axis labels and series names are not recoverable from the extracted text.]


Tabl

e D

-2Icirca|ccedil`eacuteYacute~

1

4mpij

8om

p

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Fig. D-2 Case 1, 4mpi 8omp (1). Panels: rdma (4mpi 8omp)-1, nordma (4mpi 8omp)-1. [Plot image not reproduced; axis labels and series names are not recoverable from the extracted text.]


Fig. D-3 Case 1, 4mpi 8omp (2). Panels: rdma (4mpi 8omp)-2, nordma (4mpi 8omp)-2, rdma (4mpi 8omp)-3, nordma (4mpi 8omp)-3. [Plot image not reproduced; axis labels and series names are not recoverable from the extracted text.]


Tabl

e D

-3Icirca|ccedil`eacuteYacute~

2

4mpij

8om

p

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

Fig. D-4 Case 2, 4mpi 8omp (1). Panels: rdma (4mpi 8omp)-1, nordma (4mpi 8omp)-1. [Plot image not reproduced; axis labels and series names are not recoverable from the extracted text.]


Fig. D-5 Case 2, 4mpi 8omp (2). Panels: rdma (4mpi 8omp)-2, nordma (4mpi 8omp)-2, rdma (4mpi 8omp)-3, nordma (4mpi 8omp)-3. [Plot image not reproduced; axis labels and series names are not recoverable from the extracted text.]


E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MB/s
RANK = 2 TIME = 39197 sec Ave Speed = 32655168 MB/s
RANK = 3 TIME = 39212 sec Ave Speed = 32643219 MB/s
RANK = 4 TIME = 34341 sec Ave Speed = 37272765 MB/s
RANK = 5 TIME = 34175 sec Ave Speed = 37453885 MB/s
RANK = 6 TIME = 34086 sec Ave Speed = 37552470 MB/s
RANK = 7 TIME = 34329 sec Ave Speed = 37286226 MB/s
RANK = 8 TIME = 24021 sec Ave Speed = 53287420 MB/s
RANK = 9 TIME = 23794 sec Ave Speed = 53794939 MB/s
RANK = 10 TIME = 23805 sec Ave Speed = 53770236 MB/s
RANK = 11 TIME = 23798 sec Ave Speed = 53786914 MB/s
RANK = 12 TIME = 23942 sec Ave Speed = 53462490 MB/s
RANK = 13 TIME = 23931 sec Ave Speed = 53488046 MB/s
RANK = 14 TIME = 23950 sec Ave Speed = 53443826 MB/s
RANK = 15 TIME = 24066 sec Ave Speed = 53186052 MB/s
RANK = 16 TIME = 23798 sec Ave Speed = 53786327 MB/s
RANK = 17 TIME = 23802 sec Ave Speed = 53775886 MB/s
RANK = 18 TIME = 23821 sec Ave Speed = 53733399 MB/s
RANK = 19 TIME = 23794 sec Ave Speed = 53795866 MB/s
RANK = 20 TIME = 23939 sec Ave Speed = 53468837 MB/s
RANK = 21 TIME = 23938 sec Ave Speed = 53470775 MB/s
RANK = 22 TIME = 23939 sec Ave Speed = 53470333 MB/s
RANK = 23 TIME = 23942 sec Ave Speed = 53462889 MB/s


Fig E-2 afeaZQR MPIbrvbar

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39370 sec Ave Speed = 32511775 MB/s
RANK = 2 TIME = 39475 sec Ave Speed = 32425231 MB/s
RANK = 3 TIME = 39415 sec Ave Speed = 32474846 MB/s
RANK = 4 TIME = 34392 sec Ave Speed = 37217925 MB/s
RANK = 5 TIME = 34543 sec Ave Speed = 37055373 MB/s
RANK = 6 TIME = 34250 sec Ave Speed = 37372742 MB/s
RANK = 7 TIME = 34709 sec Ave Speed = 36878286 MB/s
RANK = 8 TIME = 26967 sec Ave Speed = 47465876 MB/s
RANK = 9 TIME = 26964 sec Ave Speed = 47471018 MB/s
RANK = 10 TIME = 26955 sec Ave Speed = 47486289 MB/s
RANK = 11 TIME = 26947 sec Ave Speed = 47499771 MB/s
RANK = 12 TIME = 26957 sec Ave Speed = 47483647 MB/s
RANK = 13 TIME = 27020 sec Ave Speed = 47373054 MB/s
RANK = 14 TIME = 26948 sec Ave Speed = 47498023 MB/s
RANK = 15 TIME = 27174 sec Ave Speed = 47104542 MB/s
RANK = 16 TIME = 26956 sec Ave Speed = 47485054 MB/s
RANK = 17 TIME = 26945 sec Ave Speed = 47504899 MB/s
RANK = 18 TIME = 26946 sec Ave Speed = 47502360 MB/s
RANK = 19 TIME = 26967 sec Ave Speed = 47466158 MB/s
RANK = 20 TIME = 26948 sec Ave Speed = 47499767 MB/s
RANK = 21 TIME = 26982 sec Ave Speed = 47438204 MB/s
RANK = 22 TIME = 27087 sec Ave Speed = 47254830 MB/s
RANK = 23 TIME = 27004 sec Ave Speed = 47399960 MB/s


Fig E-3 QRfeaZOpenMPIbrvbar

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
NODE CPU coreid
Rank = 0 bx2050 CPU0 0
all time
RANK = 1 TIME = 38110 sec Ave Speed = 33587213 MB/s bx2050 CPU1 1
RANK = 2 TIME = 28656 sec Ave Speed = 44667752 MB/s bx2050 CPU0 2
RANK = 3 TIME = 37959 sec Ave Speed = 33720175 MB/s bx2050 CPU1 3
RANK = 4 TIME = 28658 sec Ave Speed = 44664664 MB/s bx2050 CPU0 4
RANK = 5 TIME = 37829 sec Ave Speed = 33836073 MB/s bx2050 CPU1 5
RANK = 6 TIME = 28641 sec Ave Speed = 44691880 MB/s bx2050 CPU0 6
RANK = 7 TIME = 37892 sec Ave Speed = 33779924 MB/s bx2050 CPU1 7
RANK = 8 TIME = 35140 sec Ave Speed = 36425620 MB/s bx2051 CPU0 0
RANK = 9 TIME = 35017 sec Ave Speed = 36554163 MB/s bx2051 CPU1 1
RANK = 10 TIME = 38916 sec Ave Speed = 32891010 MB/s bx2051 CPU0 2
RANK = 11 TIME = 45416 sec Ave Speed = 28184186 MB/s bx2051 CPU1 3
RANK = 12 TIME = 35172 sec Ave Speed = 36392511 MB/s bx2051 CPU0 4
RANK = 13 TIME = 35053 sec Ave Speed = 36516468 MB/s bx2051 CPU1 5
RANK = 14 TIME = 35200 sec Ave Speed = 36363472 MB/s bx2051 CPU0 6
RANK = 15 TIME = 35009 sec Ave Speed = 36561987 MB/s bx2051 CPU1 7
RANK = 16 TIME = 22996 sec Ave Speed = 55661441 MB/s bx2052 CPU0 0
RANK = 17 TIME = 35587 sec Ave Speed = 35968575 MB/s bx2052 CPU1 1
RANK = 18 TIME = 35587 sec Ave Speed = 35968151 MB/s bx2052 CPU0 2
RANK = 19 TIME = 35712 sec Ave Speed = 35842525 MB/s bx2052 CPU1 3
RANK = 20 TIME = 22996 sec Ave Speed = 55661978 MB/s bx2052 CPU0 4
RANK = 21 TIME = 36856 sec Ave Speed = 34729458 MB/s bx2052 CPU1 5
RANK = 22 TIME = 22994 sec Ave Speed = 55666382 MB/s bx2052 CPU0 6
RANK = 23 TIME = 22991 sec Ave Speed = 55673257 MB/s bx2052 CPU1 7


13

F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig F-1 ^_ZugraveuacuteLFlat MPIP

(main)
      n = nmb*1024*1024/8      ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                ! size
c     NNN  Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
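The bandwidth computed at the end of the listing can be written out as a short worked formula. This is only a sketch: it assumes that the operators lost from the listing in extraction are a multiplication inside the first parenthesis and divisions elsewhere, and that nmb, NNN and timew mean what the listing suggests (array size in MB, repetition count, and total elapsed seconds for all repetitions).

```latex
\begin{align*}
n &= \frac{\mathrm{nmb} \times 1024 \times 1024}{8}
   = \frac{128 \times 1024 \times 1024}{8}
   = 16\,777\,216 \quad \text{(real(8) elements)} \\[4pt]
\mathrm{ave\_spd} &= \frac{n \times 8}{(\mathrm{timew}/\mathrm{NNN}) \times 1024 \times 1024}
   = \frac{128 \times \mathrm{NNN}}{\mathrm{timew}}
   = \frac{12800}{\mathrm{timew}} \quad \text{(MB/s, with } \mathrm{NNN} = 100\text{)}
\end{align*}
```

Under this reading, each call to atob is credited with the 128 MB of one array, so the reported MB/s reflects the amount of data copied rather than the combined read and write traffic.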


Fig F-2 ^_ZugraveuacuteLOpenMPP

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8      ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
         write(6,*) 'Array allocation failed'
         stop
      end if
!$OMP end parallel
      nsize = n                ! size
c     NNN  Repetition number of times
      NNN = 100
!$OMP parallel private(i),shared(nsize)
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq

Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

(Table data illegible in the extracted source.)

Table F-2 |eacuteIumlLiacuteX^_P

(Table data illegible in the extracted source.)

Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

(Table data illegible in the extracted source.)

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

(Table data illegible in the extracted source.)

G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq
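The two job layouts compared in this appendix (1024-process flat MPI and 128 processes with 8 threads each) can be derived from the core count. The sketch below is only an illustration under stated assumptions: the function name job_layouts is hypothetical, and it assumes that the hybrid run places one MPI process on each 8-core node with one OpenMP thread per core.

```python
def job_layouts(total_cores, cores_per_node):
    """Return (processes, threads per process) for flat MPI and hybrid runs.

    Assumption: flat MPI uses one single-threaded process per core, and
    hybrid uses one process per node with one thread per core.
    """
    if total_cores % cores_per_node != 0:
        raise ValueError("total_cores must be a multiple of cores_per_node")
    nodes = total_cores // cores_per_node
    return {
        "flat": (total_cores, 1),
        "hybrid": (nodes, cores_per_node),
    }


layouts = job_layouts(total_cores=1024, cores_per_node=8)
for name, (procs, threads) in layouts.items():
    print(f"{name}: {procs} processes x {threads} threads = {procs * threads} cores")
```

Both layouts occupy the same 1024 cores, so differences in the measured communication times reflect the number of communicating processes rather than the amount of hardware.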

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq

Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

(Table data illegible in the extracted source.)

Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

(Table data illegible in the extracted source.)

Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

(Table data illegible in the extracted source.)


Contents

1 Introduction
2 Outline of the system
  2.1 Hardware construction of BX900
  2.2 Hardware construction of FX1
  2.3 Hardware construction of Storage
3 Performance evaluation by HPCC benchmarks
  3.1 Outline of HPCC benchmarks
    3.1.1 Test programs
    3.1.2 Rules
  3.2 Measured results
    3.2.1 BX900
    3.2.2 FX1
  3.3 Summary
4 Tunings for BX900 and FX1
  4.1 Profiler
  4.2 Automatic tuning of memory access
    4.2.1 Efficacy for Original version of STREAM benchmark
    4.2.2 Efficacy for HPCC version of STREAM benchmark
  4.3 Manual tuning of memory access
  4.4 Efficacy for NPB2.3 FT benchmark
  4.5 Summary
5 Performance evaluation by IOR benchmarks
  5.1 Outline of IOR benchmarks
    5.1.1 Key parameters
    5.1.2 Option parameters for MPIIO
  5.2 Measured results
    5.2.1 POSIX
    5.2.2 MPIIO
    5.2.3 Local cache
  5.3 Summary
    5.3.1 Choice of file system
    5.3.2 POSIX vs MPIIO
    5.3.3 Buffer size
6 Performance evaluation of the interconnect
  6.1 Specification of the interconnect
    6.1.1 Construction and Nodebind
    6.1.2 Communication method
  6.2 Fundamental performance of MPI communication
    6.2.1 Measuring method
    6.2.2 Measured results
  6.3 RDMA
    6.3.1 Measuring method
    6.3.2 Measured results
  6.4 Communications between intra-CPUs, inter-CPUs and inter-nodes
    6.4.1 Measuring method
    6.4.2 Measured results
  6.5 Communications between chassis
    6.5.1 Point-to-point communication
    6.5.2 Collective communication
  6.6 Summary
7 Concluding Remarks
Acknowledgements
References
Appendix A HPCC benchmark results on FX1
Appendix B Fundamental performance of MPI communication
Appendix C RDMA vs NORDMA
Appendix D Overlapping communications with calculations
Appendix E Peer-to-peer communications
Appendix F Memory bandwidth
Appendix G Flat-MPI vs Hybrid

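1 Introduction

In March 2010 (Heisei 22), the Japan Atomic Energy Agency (JAEA) replaced its previous supercomputer system (total theoretical peak performance 15.3 Tflops) with a new system. The new system is built around a large-scale Linux cluster, PRIMERGY BX900, with a theoretical peak performance of 200 Tflops, and FX1, a 12 Tflops unit for developing codes for the next-generation supercomputer. BX900 is intended for large-scale computation, and FX1 for preparing and tuning applications for the next-generation supercomputer.

To clarify the fundamental performance of PRIMERGY BX900 and FX1, a set of benchmark tests was run on both systems.

Section 2 gives an overview of the system. Section 3 evaluates overall performance with the HPCC benchmark. Sections 4 to 6 evaluate memory, file (I/O), and network (communication) performance individually. Section 7 summarizes the report.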

2 System Overview

The system consists of three main subsystems, among them the large-scale parallel computation unit and the next-generation code development unit. In total it provides a theoretical peak performance of 214 Tflops, 56 TB of main memory, and 1.2 PB of disk storage. Fig. 1 shows an overview of the system.

2.1 Hardware configuration of the large-scale parallel computation unit

The large-scale parallel computation unit is a PRIMERGY BX900 (hereafter BX900) Linux cluster with a theoretical peak performance of 200 Tflops and 50 TB of total memory.

BX900 is a high-density blade server: one 10U chassis holds up to 18 server blades and up to 8 connection blades. Each server blade (node) carries two Xeon X5570 processors (quad-core) and 24 GB of DDR3 SDRAM memory (12 GB or 48 GB on some nodes). The theoretical peak performance of a node is 93.76 Gflops (2.93 GHz × 4 flops/clock × 4 cores × 2 processors), and its memory bandwidth is 51.2 GB/s. The nodes are interconnected by InfiniBand (IB) 4x QDR (hereafter QDR) with a bandwidth of 8 GB/s, which allows high-bandwidth, low-latency data transfer.
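The per-node peak quoted for BX900 is the product of clock rate, floating-point operations per clock, cores per processor, and processors per node. A minimal sketch of that arithmetic, using only the figures given in the text; the function name `node_peak_gflops` is ours and purely illustrative.

```python
def node_peak_gflops(clock_ghz, flops_per_clock, cores_per_cpu, cpus_per_node):
    """Theoretical peak of one node in Gflops."""
    return clock_ghz * flops_per_clock * cores_per_cpu * cpus_per_node


# BX900 node: Xeon X5570, 2.93 GHz, 4 flops/clock, 4 cores, 2 processors
bx900_node = node_peak_gflops(2.93, 4, 4, 2)
print(f"BX900 node peak: {bx900_node:.2f} Gflops")
```

The same function applies to any node type once its clock rate and core counts are known.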

Fig. 1 System overview

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

2.2 Hardware configuration of the next-generation code development unit

The next-generation code development unit is an FX1 cluster (hereafter FX1) with a theoretical peak performance of 12 Tflops and 4.6 TB of total memory.

Each node has one SPARC64 VII processor (quad-core) and 16 GB of memory. The nodes are connected by InfiniBand 4x DDR (bandwidth 2 GB/s) in a full bisectional bandwidth (FBB) network. FX1 is used to prepare and tune applications for the next-generation supercomputer, which is scheduled to start operation in 2012.

2.3 Hardware configuration of the storage system

The storage system consists of two I/O nodes (SPARC Enterprise M9000), 36 data disk units (ETERNUS DX80), and one ETERNUS4000 model 400 disk unit.

The total disk capacity is 1.2 PB (RAID6), and the disk side provides an I/O bandwidth of 57.6 GB/s (Fig. 2). The I/O nodes are connected to the compute nodes through InfiniBand, and the file system is shared with the compute nodes by Parallelnavi SRFS. The data disk units use SAS disks (1 TB, 7200 rpm) in RAID6 configuration, and the disks are managed on the I/O nodes by Sun QFS (Fig. 3).

Fig. 2 Storage connections and transfer rates (diagram labels: large-scale parallel computation unit, 119 chassis; InfiniBand switch (IBSW), QDR 9 × 2 links per chassis; I/O nodes (M9000 × 2), transfer rate 72 GB/s for the two nodes in total; storage (ETERNUS DX80 × 36), transfer rate 57.6 GB/s for the disks in total)

Fig. 3 (diagram labels: I/O nodes M9000 1 and 2; FC links (1) to (42); ETERNUS DX80 units 1 to 36 with 1 TB disks, 57.6 GB/s (144 FC); ETERNUS4000 Model 400 (1 unit), 3.2 GB/s (8 FC); QFS)

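3 Evaluation with the HPCC Benchmark

The seven tests of HPCC (High Performance Computing Challenge) 1) were run on BX900 and FX1. This section gives an overview of HPCC and then presents the measured results.

3.1 Overview of the HPCC benchmark

HPCC was developed by J. Dongarra of the University of Tennessee and co-workers as a benchmark suite that evaluates a supercomputer from several viewpoints (computation, memory access, and communication) rather than by floating-point performance alone. It combines three computation tests (HPL (High-Performance Linpack), DGEMM, and FFTE), two memory tests (STREAM and Random Access), and two communication tests (PTRANS and b_eff) 2). The original benchmarks are written in Fortran or C, whereas HPCC integrates them into a single program written in C. HPCC results have been collected since 2003, and about 300 results have been registered, which makes comparison with other systems possible.

3.1.1 Test items

A supercomputer is composed of multi-core CPUs, nodes, and the network that connects the nodes, so performance has to be measured at several levels. HPCC therefore runs its tests in three modes: global (G, Global system performance), in which all nodes cooperate on one problem; single (SN, Single Environment), in which one process runs alone; and embarrassingly parallel (EP, Embarrassingly Parallel), in which all processes run the same test independently at the same time.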

3.1.1.1 HPL

HPL solves a dense system of linear equations by LU decomposition and measures floating-point performance. It is the MPI-parallel version of Linpack, which is used as the ranking benchmark of the TOP500 list, and is widely used for comparing supercomputers. Results are reported in Tflops. There is one measurement item: the global (G) mode using all nodes.
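To make the HPL description concrete, the sketch below solves a small dense system by Gaussian elimination with partial pivoting (LU factorization applied on the fly) and converts a run time into a flop rate. It is a serial toy, not the HPL algorithm itself, which is blocked and MPI-parallel. The operation count 2/3 n^3 + 3/2 n^2 is the one HPL is generally documented to use; treat it, and the function names, as assumptions not taken from this report.

```python
def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    a = [row[:] for row in a]   # work on copies
    b = b[:]
    for k in range(n):
        # choose the row with the largest pivot candidate in column k
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # back substitution on the upper-triangular system
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def hpl_flops(n):
    """Assumed HPL operation count for an n x n system."""
    return (2.0 / 3.0) * n**3 + 1.5 * n**2


def gflops(n, seconds):
    """Flop rate in Gflops for a solve of size n that took `seconds`."""
    return hpl_flops(n) / seconds / 1e9


solution = lu_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
```

Because the count grows as n^3 while the data grows as n^2, large problems keep the floating-point units busy, which is why HPL tends to report a high fraction of peak.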

3.1.1.2 DGEMM

DGEMM measures the floating-point performance of double-precision matrix-matrix multiplication. This operation is the computational core of Linpack. Because it is normally taken from the numerical library provided for the system, the result reflects how well that library is tuned. Results are reported in Gflops (Giga floating-point operations per second). There are two measurement items: the single (SN) and embarrassingly parallel (EP) modes.

3.1.1.3 STREAM

STREAM measures the sustainable memory bandwidth of a node for simple vector operations. It consists of four kernels, Copy, Scale, Add, and Triad. Results are reported in GB/s (Giga Bytes per second). There are eight measurement items: the four kernels in each of the single (SN) and embarrassingly parallel (EP) modes.

3.1.1.4 PTRANS

PTRANS (Parallel matrix TRANSpose) transposes a matrix distributed over the parallel processes and thereby measures the communication capacity of the network. Results are reported in GB/s. There is one measurement item: the global (G) mode using all nodes.

3.1.1.5 Random Access

Random Access measures the rate at which 8-byte words at random locations of a large table in memory are updated. In the global mode the updates go to the memory of other nodes through MPI. Results are reported in Gups (Giga updates per second). There are three measurement items: the single (SN), embarrassingly parallel (EP), and global (G) modes.

3.1.1.6 FFTE

FFTE measures the floating-point performance of the fast Fourier transform. Like HPL, it is widely used for comparing systems. The single (SN) and embarrassingly parallel (EP) modes work on two-dimensional arrays inside each process, whereas the global (G) mode works on three-dimensional arrays distributed over the processes, so that data are exchanged between processes during the transform. Results are reported in Gflops.

3.1.1.7 b_eff

b_eff measures the communication latency and the communication bandwidth of data transfer. It runs in the global (G) mode with two communication patterns, Ring and PingPong. Latency is measured by transferring 8 B of data and bandwidth by transferring 2 MB of data.

In Ring, all MPI processes are arranged in a ring and each process communicates with its neighbors. Two orderings are used: the natural order of the ranks and a random order. In PingPong, two cores exchange messages while the other cores wait, and the minimum, average, and maximum over the measured pairs are reported.

b_eff has six measurement items: (1) Ring (natural order) latency, (2) Ring (natural order) bandwidth, (3) Ring (random order) latency, (4) Ring (random order) bandwidth, (5) PingPong latency, and (6) PingPong bandwidth. Latency is reported in microseconds and bandwidth in GB/s.

3.1.2 Run types

HPCC defines two run types, baseline runs (Baseline Runs) and optimized runs (Optimized Runs). The measurements in this report were made as baseline runs, with optimization by the compiler only.

3.1.2.1 Baseline runs

In a baseline run, tuned BLAS and MPI libraries may be used and compiler optimization is permitted, but the source code may not be modified. The compiler version, the compile options, and the library versions that were used must be reported.

3.1.2.2 Optimized runs

In an optimized run, modification of the source code is additionally permitted for specified parts of the benchmark.
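The four STREAM kernels named in 3.1.1.3 are short enough to write out. The sketch below runs them on plain Python lists and converts the counted bytes into GB/s. It illustrates the kernels and the byte accounting only: interpreted Python says nothing about the memory bandwidth of the hardware, and the array length and the names `run_stream` and `COUNTED_BYTES` are our own choices.

```python
import time

WORD = 8  # bytes per double-precision element

# Bytes counted per loop iteration: one word for every array read or written.
COUNTED_BYTES = {"Copy": 2 * WORD, "Scale": 2 * WORD, "Add": 3 * WORD, "Triad": 3 * WORD}


def run_stream(n=100_000, q=3.0):
    """Run the four kernels once; return (final arrays, GB/s per kernel)."""
    a, b, c = [1.0] * n, [2.0] * n, [0.0] * n
    rates = {}

    def timed(name, kernel):
        start = time.perf_counter()
        result = kernel()
        elapsed = max(time.perf_counter() - start, 1e-12)
        rates[name] = COUNTED_BYTES[name] * n / elapsed / 1e9
        return result

    c = timed("Copy", lambda: [a[i] for i in range(n)])              # c = a
    b = timed("Scale", lambda: [q * c[i] for i in range(n)])         # b = q * c
    c = timed("Add", lambda: [a[i] + b[i] for i in range(n)])        # c = a + b
    a = timed("Triad", lambda: [b[i] + q * c[i] for i in range(n)])  # a = b + q * c
    return (a, b, c), rates


(arrays, rates) = run_stream(n=10_000)
```

Triad is the kernel usually quoted, which is why the tables in this report list EP-STREAM Triad.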

3.2 Measured results on BX900 and FX1

The HPCC BMT was run on BX900 and on FX1 with 8 to 512 parallel processes.

3.2.1 BX900

Table 1 lists the HPCC BMT results on BX900, and Fig. 4 to Fig. 7 plot the results of HPL, PTRANS, Random Access, and FFTE against the number of processes. At 512 processes the HPL efficiency relative to the theoretical peak is 82.2% (= 4.902/5.96 × 100) 1.

For comparison with Table 1, Table 2 shows the results of the Intel Endeavor cluster (hereafter Endeavor), taken from the HPCC web site 1), whose hardware specification is close to that of BX900. Results obtained on BX900 with the same Intel compiler as Endeavor are shown in Table 3 and are equivalent to those with the Fujitsu compiler. One notable difference is STREAM: the bandwidth per core on BX900 (3.5 GB/s against 4.2 GB/s) is about 20% lower than on Endeavor. This is discussed in 4.2.2.

1 On the whole BX900 system (17,072 cores), HPL achieved 191.4 Tflops against a theoretical peak of 200.0 Tflops, an efficiency of 95.66%.
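The efficiency in the footnote can be checked from the core count and the per-core peak. The sketch below assumes the footnote figures read as 17,072 cores and 191.4 Tflops, and the per-core peak of 2.93 GHz × 4 flops/clock stated in 2.1; the function names are ours.

```python
def peak_tflops(cores, clock_ghz=2.93, flops_per_clock=4):
    """Theoretical peak in Tflops for a given number of cores."""
    return cores * clock_ghz * flops_per_clock / 1000.0


def efficiency_percent(measured_tflops, cores):
    """Measured HPL performance as a percentage of theoretical peak."""
    return 100.0 * measured_tflops / peak_tflops(cores)


full_system_peak = peak_tflops(17072)
full_system_eff = efficiency_percent(191.4, 17072)
print(f"peak {full_system_peak:.2f} Tflops, efficiency {full_system_eff:.2f}%")
```

Under these assumptions the peak comes to about 200.08 Tflops, and the efficiency agrees with the quoted 95.66% to within rounding.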

Table 1 HPCC BMT results on JAEA BX900 with the Fujitsu compiler (numeric entries omitted)

Rows: number of processes (8, 16, 32, 64, 128, 256, 512), each measured with the communication modes rdma and nordma.
Columns: G-HPL (TFlops), G-PTRANS (GB/s), G-RandomAccess original version (Gups), G-RandomAccess SANDIA_OPT2 version (Gups), G-FFTE (GFlops), EP-STREAM Sys (*) (GB/s), EP-STREAM Triad (GB/s), EP-DGEMM (GFlops), RandomRing Bandwidth (GB/s), RandomRing Latency (usec).
System: Fujitsu BX900 (512 cores / 128 CPUs); chip: Xeon X5570 2.93 GHz; memory: DDR3-1066; SMT: ON; Turbo Mode: OFF; interconnect: QDR InfiniBand, 2 sockets/node, fat tree; OS: Red Hat EL 5; compiler: Fujitsu C/C++; options: -Kfast -SSL2; MPI: Fujitsu Parallelnavi Language Package; HPCC version: 1.3.1.
(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × (number of processes).

Fig. 4 BX900: HPL

Fig. 5 BX900: PTRANS

Fig. 6 BX900: Random Access

Fig. 7 BX900: FFTE

Table 2 HPCC BMT results of the Intel Endeavor cluster (HPCC web site 1))

Table 3 HPCC BMT results on BX900 with the Intel compiler

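3.2.2 FX1

Table 4 lists the HPCC BMT results on FX1, and Fig. 8 to Fig. 11 plot the results of HPL, PTRANS, Random Access, and FFTE. At 512 processes the HPL efficiency relative to the theoretical peak is 75.5% (= 3.776/5.0 × 100) 2.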

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037

Table 4 HPCC BMT results on FX1

Fig. 8 FX1: HPL

Fig. 9 FX1: PTRANS

Fig. 10 FX1: Random Access

Fig. 11 FX1: FFTE

3.3 Summary of the HPCC results

Fig. 12 compares the results of BX900 and FX1 with those of the previous system (Altix3700Bx2). For each item the best value is set to 100, including Random Ring Latency.

Comparing BX900 with FX1, the STREAM and FFTE results show the influence of memory bandwidth. STREAM is discussed in 4.2.1 and FFTE in 4.3.

The Altix3700Bx2 values were measured with a different version of the HPCC BMT and with different data sizes, so a strict comparison with them is not possible.

Fig. 12 Comparison of HPCC BMT results for BX900, FX1, and Altix3700Bx2 (512 processes, flat MPI; chart axes: G-HPL, G-RandomAccess, G-PTRANS, EP-STREAM, EP-DGEMM, G-FFTE, Random Ring Bandwidth, Random Ring Latency; each axis scaled from 0 to 100)

4 Optimization on BX900 and FX1

In the measurements of 3.2, some results on BX900 and FX1 did not match what the hardware specifications suggest, so the causes were analyzed with the Fujitsu profiler. Optimization by compiler options and optimization by source modification were then tried.

4.1 Profiler

The profilers used on BX900 and FX1 are both provided by Fujitsu and are used in the same way. Profile data are collected while the program runs with the fpcoll command, and the collected data are then analyzed and displayed with the fprof command. The profiler provides eight kinds of information, including MPI library information, hardware monitor information, and source code information.

From the hardware monitor information, quantities such as CPU clock counts, cache misses, and data transfer volumes can be obtained.

4.2 Optimization by compiler options

Among the HPCC BMT results on BX900, STREAM was about 20% lower than on Endeavor (3.5 GB/s against 4.2 GB/s). To examine this, the original STREAM sources contained in HPCC (the C version and the Fortran version) were measured in their thread-parallel and MPI-parallel forms. The measurements indicate that the difference between the Endeavor compiler (Intel) and the BX900 compiler (Fujitsu) comes from whether the compiler option -Kntst is applied.

4.2.1 Effect on the original STREAM source

Table 5 shows the STREAM results for the original source on BX900.

For the Fortran version, the compiler option -Kntst was effective in both the thread-parallel and the MPI-parallel forms. With this option, data written to arrays inside DO loops are stored without going through the cache.

For the C version, the Intel compiler reached 4.2 GB/s per core with OpenMP parallelization. With the Fujitsu compiler, the thread-parallel form became equivalent to the Intel compiler when OpenMP parallelization was replaced by automatic parallelization.

Table 6 shows the STREAM results for the original source on FX1.

For the Fortran version, the compiler option -Kntst was again effective in both the thread-parallel and the MPI-parallel forms. The Triad value of 13.2 GB/s per CPU is nearly equal to the value that Fujitsu publishes on the web (13.5 GB/s per CPU). The C version gave the same results as the Fortran version without -Kntst. Unlike on BX900, the C compiler on FX1 does not support the option -Kntst, so optimization by the C compiler on FX1 may be insufficient.

Table 5 BX900: STREAM

Table 6 FX1: STREAM

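4.2.2 Effect on STREAM in the HPCC BMT

Table 7 shows re-measured results for STREAM in the HPCC BMT on BX900, which were first presented in 3.2.1. Two compiler options for optimization are involved here: -Kntst makes the compiler use store instructions that do not go through the cache when array data are written, and -Krestp=all tells the compiler that pointers do not overlap one another.

Because STREAM accesses only the memory inside each node, the result for a given set of compiler options hardly depends on the number of processes or on the communication mode.

For code such as STREAM, in which DO loops stream through memory, specifying the compiler option -Kntst was found to raise the measured memory bandwidth.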
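A common explanation for the gain from cache-bypassing stores is write-allocate traffic: with ordinary stores, the cache line of the destination array is first read from memory before it is overwritten, and STREAM does not count that read. The report does not state this mechanism, so the model below is an assumption on our part, and the function names are illustrative.

```python
WORD = 8  # bytes per double-precision element


def triad_traffic(write_allocate):
    """Bytes actually moved per iteration of a[i] = b[i] + q * c[i]."""
    reads = 2 * WORD                           # b[i] and c[i]
    writes = 1 * WORD                          # a[i]
    allocate = WORD if write_allocate else 0   # a[i] fetched before overwrite
    return reads + writes + allocate


def reported_fraction():
    """Fraction of the raw memory traffic that STREAM counts when
    ordinary (write-allocate) stores are used."""
    return triad_traffic(write_allocate=False) / triad_traffic(write_allocate=True)


fraction = reported_fraction()
print(f"counted/moved = {fraction:.2f}")
```

In this simple model ordinary stores cap the reported Triad value at 3/4 of what cache-bypassing stores can show. The ratio of the two values quoted in the text, 3.5 GB/s against 4.2 GB/s, is about 0.83, the same order of magnitude.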

Table 7 STREAM results in the HPCC BMT (MPI-parallel version, C)

Processes | Mode   | Options (*1): chassis | Options (*1): EP-STREAM Triad (GB/s) | Options (*2): chassis | Options (*2): EP-STREAM Triad (GB/s)
32        | rdma   | 1 | 3.4739 | - | -
32        | nordma | 1 | 3.5016 | 1 | 4.3316
128       | rdma   | 3 | 3.4708 | - | -
128       | nordma | 3 | 3.4692 | 1 | 4.3572
512       | rdma   | 5 | 3.4726 | 4 | 4.3595
512       | nordma | 4 | 3.5477 | 4 | 4.3424

*1 Fujitsu compiler options (1): -Kfast,packed,ntst -SSL2
*2 Fujitsu compiler options (2): -Kfast,packed,ntst,restp=all -SSL2

4.3 Optimization by source modification

As seen in 3.2, the FFTE results on BX900 and FX1 were low compared with the hardware performance. To find the cause, the code was profiled and then modified by hand.

In the 32-process FFTE run the profiler showed a ratio of 12 on BX900 and 16 on FX1, which suggests that memory access is the bottleneck. The kernel part was therefore extracted from the original FFTE source (Fortran version), and the original kernel and a modified kernel were measured on BX900 and FX1 with varying compiler options. The aims were (1) to compare the Fujitsu compiler with the Intel compiler used on Endeavor, and (2) to confirm how far modifying the kernel improves performance.

The kernel consists of three parts. Fig. 13 shows the original version and Fig. 14 the modified version, and the modifications are outlined below.

Part (1) in Fig. 13 and Fig. 14 is a computation loop. In the original source, the data of arrays A, X, Y, Z, and W are accessed twice. In the modified version a work array C is introduced so that the data are accessed only once.

Part (2) in Fig. 13 and Fig. 14 is a loop that transposes a three-dimensional array. In the original source, array A is loaded three times; in the modified version it is loaded once. The loop nest is deepened from three to six levels so that the data are processed in blocks that fit in the cache. The block size is the value used in the benchmark, 16.

Part (3) in Fig. 13 and Fig. 14 is a loop that transposes a two-dimensional array. In the original source, array A is loaded twice; in the modified version it is loaded once. The loop nest is deepened from two to four levels so that the data are processed in blocks. On FX1, whose cache is smaller than that of BX900, reducing the block size from 16 to 4 gave an improvement.

Table 8 shows the results of the original and modified versions with the Fujitsu compiler on BX900 and FX1, together with the results of the modified version with the Intel compiler on BX900.
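The change described for the two-dimensional transpose, from a two-level to a four-level loop nest with a tunable block size, is ordinary loop blocking. The sketch below shows the idea on nested Python lists; it is not the FFTE code, and the name `transpose_blocked` and the parameter `nblk` (standing in for the block size of 16 or 4 mentioned in the text) are our own.

```python
def transpose_blocked(a, nblk=16):
    """Transpose a rectangular matrix block by block.

    The two outer loops walk over nblk x nblk tiles; the two inner loops
    copy one tile, so the rows being read and the columns being written
    stay within a small working set.
    """
    n_rows, n_cols = len(a), len(a[0])
    b = [[0.0] * n_rows for _ in range(n_cols)]
    for ii in range(0, n_rows, nblk):
        for jj in range(0, n_cols, nblk):
            for i in range(ii, min(ii + nblk, n_rows)):
                row = a[i]
                for j in range(jj, min(jj + nblk, n_cols)):
                    b[j][i] = row[j]
    return b


matrix = [[10 * i + j for j in range(7)] for i in range(5)]
transposed = transpose_blocked(matrix, nblk=4)
```

The best block size depends on the cache that a tile has to fit into, which is consistent with a smaller value being preferable on one machine than on another.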

Fig. 13 FFTE kernel, original version (Fortran source listing)

Fig. 14 FFTE kernel, modified version (Fortran source listing)

Table 8 FFTE Ver. 4.1: Full and kernel results (32 processes, 180 MB/process)

On BX900, the original version with the Fujitsu compiler, the modified version with the Fujitsu compiler, and the modified version with the Intel compiler gave nearly the same times for every loop, so neither the modification nor the choice of compiler made a clear difference. Changing the block size of part (3) in Fig. 14 from 16 to 4 also made little difference on BX900, which suggests that the data already fit in the cache.

On FX1, in contrast, the modified version with the Fujitsu compiler was clearly faster than the original version in every loop. For the loop of part (1) in Fig. 14, the improvement was a factor of 1.6 on BX900 against a factor of 4.0 on FX1. For the array-transposition loops of parts (2) and (3), BX900 changed by factors of 1.1 and 0.8, whereas FX1 improved by factors of 3.6 and 1.8 (or 1.6).

These results show that compiler optimization on FX1 behaves very differently from that on BX900, and that on FX1 manual modification that takes memory access into account is necessary.

4.4 Effect of optimization on the NPB2.3 FT

The compiler option that was effective for STREAM in 4.2 was also examined with a second benchmark.

The benchmark used is FT from NPB (NAS Parallel Benchmarks) 2.3, which, like FFTE in 4.3, performs the FFT with the Stockham algorithm. As for FFTE, the kernel was extracted from the original NPB2.3 FT source (the Fortran, MPI-parallel version) and measured with varying compiler options. The problem size is that of class C ((X, Y, Z) = (512, 512, 512)).

The kernel consists of two parts, shown in Fig. 15 and Fig. 16. Both store array data that are not reused immediately, so the compiler option -Kntst can be expected to be effective. The data block size was set to 16 and to 64.

Fig. 15 NPB2.3 FT kernel (original source, cfftz; Fortran source listing)

Fig. 16 NPB2.3 FT kernel (original source, fftz2; Fortran source listing)

The results are summarized in Fig. 17 (the data are given in Table 9 and Table 10). With the Fujitsu compiler on BX900, the compiler option -Kntst improved fftz2 but had no visible effect on cfftz. On FX1 no effect was seen for either routine. The improvement of fftz2 on BX900 is attributed to the compiler generating store instructions that do not go through the cache.

For the data block size, 16 gave the best results.

With the Fujitsu compiler, FX1 was slower than BX900 by about a factor of two, as was the case for FFTE, which is attributed to memory access.

Comparing the Fujitsu compiler and the Intel compiler, both with the option that corresponds to -Kntst, the Fujitsu compiler was about 10% behind, a tendency similar to that seen for STREAM.

4.5 Summary of optimization

Compiler optimization was examined on BX900 and FX1 with STREAM, with the kernel of the original FFTE source, and with NPB2.3 FT. On FX1 the compiler options used here gave little improvement in several cases.

On FX1 it cannot be assumed that compiler options alone deliver the expected performance, and the source code may have to be modified by hand.

Fig. 17 Effect of compiler optimization

Table 9 NAS Parallel Benchmarks Ver. 2.3 FT

Table 10 NAS Parallel Benchmarks Ver. 2.3 FT

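5 Evaluation with the IOR Benchmark

The file transfer rate of the shared file system was measured with the benchmark code IOR.

5.1 Overview of the IOR benchmark

IOR Benchmark 3) is a benchmark code written in C that was developed for the ASCI Purple project in the United States (version 2.10.2 was used). It can perform parallel I/O through POSIX (write(), read()), MPIIO (MPI_File_write_at(), MPI_File_read_at()), and other interfaces.

5.1.1 Measurement parameters

Each MPI process performs the test the number of times given by repetitions, and the transfer rate is taken from these repetitions. The rate is computed from the amount of data read or written by all processes and the time from the start of the first process to the end of the last. Fig. 18 lists the main parameters for running IOR.

In one repetition each process reads or writes blockSize × segmentCount bytes in units of transferSize, and the total file size is blockSize × segmentCount × (number of MPI processes). For example, with transferSize = 1 KiB, blockSize = 1 MiB (here 10^6 bytes is written MB and 2^20 bytes MiB), segmentCount = 1, and 32 MPI processes, a testFile of blockSize × segmentCount × processes = 32 MiB is created, and each process accesses its own 1 MiB region with 1024 transfers of 1 KiB each, either sequentially or at random.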

[transferSize] Number of bytes transferred by a single read or write. It must be a multiple of sizeof(long long int) and a divisor of blockSize.
[blockSize] Number of bytes that each process handles contiguously. The default is 1 MiB.
[segmentCount] Number of data sets (segments) handled by each process. The default is 1.
[randomOffset] If 1, accesses are made at random offsets rather than sequentially. The default is 0.
[fsync] Perform fsync after writing. The default is 0.
[filePerProc] Use one file per process. Valid for POSIX only. The default is 0.
[useFileView] Use MPI_File_set_view() with MPI_File_write() and MPI_File_read(). Cannot be combined with random access. The default is 0.
[collective] Use MPI_File_write_at_all() and MPI_File_read_at_all(). Cannot be combined with random access. The default is 0.

Fig. 18 IOR measurement parameters
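The relation between transferSize, blockSize, segmentCount, and the number of processes can be checked with a few lines. The sketch below reproduces the 32 MiB example from 5.1.1. The offset function assumes the segment-major layout that IOR normally uses for a single shared file; that layout, like the function names, is our assumption and is not taken from this report.

```python
KIB = 2**10
MIB = 2**20


def ior_layout(transfer_size, block_size, segment_count, num_procs):
    """Sizes implied by a set of IOR parameters (all in bytes)."""
    if block_size % transfer_size != 0:
        raise ValueError("blockSize must be a multiple of transferSize")
    return {
        "bytes_per_process": block_size * segment_count,
        "aggregate_bytes": block_size * segment_count * num_procs,
        "transfers_per_block": block_size // transfer_size,
    }


def shared_file_offset(rank, segment, transfer, transfer_size, block_size, num_procs):
    """Byte offset of one transfer in a shared file (assumed layout:
    segments outermost, then one block per rank, then transfers)."""
    return (segment * block_size * num_procs
            + rank * block_size
            + transfer * transfer_size)


example = ior_layout(transfer_size=KIB, block_size=MIB, segment_count=1, num_procs=32)
print(example["aggregate_bytes"] // MIB, "MiB in", example["transfers_per_block"], "transfers per block")
```

With the example parameters each process issues 1024 transfers, and rank 1 starts writing 1 MiB into the file.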

5.1.2 Optional parameters for MPIIO

With IOR, hints can be passed to MPIIO through an Info object. Table 11 lists the Info keys that are valid on BX900 and their values. The values were not changed from the defaults.

Table 11 MPIIO Info object keys and values

key                | value   | meaning
cb_buffer_size     | 4194304 | buffer size in bytes for collective buffering
cb_nodes           |         | number of nodes used for collective buffering
ind_rd_buffer_size | 4194304 | buffer size in bytes for independent reads
ind_wr_buffer_size | 524288  | buffer size in bytes for independent writes

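5.2 Measured results

For each interface (POSIX and MPIIO), the transfer rate was measured over the read and write paths shown in Fig. 19 while varying the number of processes, transferSize (the amount transferred by a single call), and the access pattern (sequential or random). The results are given in Table 12, Table 13, Fig. 20, and Fig. 21. Caches exist at several levels (the client cache on the compute nodes, the cache on the I/O nodes, and the cache of the disk units). I/O that is served from a cache and I/O that reaches the disks give different results, so the measurements were made with the cache effects in mind.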

Fig. 19 Read/write paths (diagram labels: disk units ETERNUS DX80, 970 TB; I/O nodes SPARC Enterprise M9000 running the SRFS server; compute nodes running the SRFS client; READ and WRITE paths)

Table 12 IOR measurement results (POSIX) (numeric entries omitted)
oqu |xvyy v|u ozuvy w|vyy |xvw |vxy uoyyzov|o |zqyyyvoquw |xvww uvxw owq|vx xvzx oyovu |vqy xw|zvzw xyzuzvzxuqzy |yzv|z zvqw qvuy v|x oyovww wqvyw wzyv|w yqu|vowwoz |wuvz owvu o||vu oqv xzyv ouuv|| yozqovu zzqvzoy|wu xqxvwy |yvx qovou uwvyx oz|zvxy ywv| xqqxvzq yq|qxvzw|yw xuv|u uv|z qywvyu zxvy qxovww |quvz yxwoqv yuquu|vy|yxx|y yzxvqq ouyv| xzwvwx xyvzq yvoq wqzvy uzyqzvu x|yzqzvw|o|oq yyqvx| zzvq |ov|x zzovw| w|ov|| ozvo x|o||xvuz |yqqvwyouu zzyvzq yq|vwz |zzzvz |qvy |wzuv zyvxq |xowu|v |uuoqovuuxuww ooxvu o|wvu |wyvww |zvu |ux|vo zw|vzo |xuwvy |uwovxoquwxy oz|yv| owzuv| |wxyvwu ||yvqu |yzuvzx |yyovq o|uwvo o|uywvo

oqu |yvq vq |qyvqq ozwv|u |zvwq qzvww o|wuzvyo ooq|xxvzyquw |yvzu uv ooqvyu ozzv|w uxxv ozqv| owzzwvyy oxxuq|vouuqzy |yyvx| wvxo yvqq ozwvqy |zwvzq ozwv zzvx qyqqovoowoz |xv| oyvyz ozxovou |wwvw uxqvq| uquvq| |ozqxv|| qwuzzzvw|oy|wu uyqvzo ||v|| |wvyy wquvw wxuvox zvuo |oxy|vw q|qyvoq|yw ux|vuw yvw |xov|u |uvz uuvyu |wqvy o|xxovzu q|wvwzyxx|y yzvwo o|xvu |uqovuy qvw |yvy wwzvx o|wuyov|y owxuqzyvowo|oq xzzvwz yzv|y |q|vxq ovzu uqwvuq oqqvz oyuyyxuv|u oyouoz|vuoyouu wxyvq xuuv|u uoyxvo |yywvo uzvw| uouvu ooq|ozvo oqyq|vqxuww oqqvqu oowvoz uoy|vw |yvx uwuvuw uux|vxx woyyqzvow zqzvqwoquwxy owquvu owov|u uouvx |zxyv uvw uxoqvqu uy|yozvuw uuqqvuo

xy

oqu

|

cedilsup1sup3plusmnsup2

microdegpara cedilsup1sup3 cedilordm cedilsup3degmiddotmdegemdegkk~qg mdegkk~o

Table 13 (MPIIO) (one row per transferSize from 1024 to 1048576 bytes in powers of two; the numeric entries are not recoverable from the extracted text)

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

Fig 20 POSIX (plots of Performance [MiB/s] against TransferSize [B] from 1024 to 1048576, sequential and random access; panels for home and work, Write and Read, SingleFile and FilePerProc; curves for N=32, 256 and 1024)

Fig 21 MPIIO (plots of Performance [MiB/s] against TransferSize [B] from 1024 to 1048576, sequential and random access; panels for home and work; curves labelled "use view" and "collective" for N=32 and 256)

Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`

Fig 22 (diagram; recoverable labels: ETERNUS DX80 1-18 (dx01~dx18) with the SRFS file systems data_01 to data_05, home and work; IO server 1 (io1); ETERNUS DX80 19-36 (dx19~dx36) with data_06 to data_10, data, sysdata and work; IO server 2 (io2))

Table 15 (file systems and capacities)

SRFS file system  QFS file system  Unit capacity  Total capacity            DAU    Remarks
data_01           QFS_data_01      2109888 MB     2109888 × 36 = 72.4TB     1024k
data_02           QFS_data_02      2109888 MB     2109888 × 36 = 72.4TB     1024k
data_03           QFS_data_03      2109888 MB     2109888 × 36 = 72.4TB     1024k
data_04           QFS_data_04      2109888 MB     2109888 × 36 = 72.4TB     1024k
data_05           QFS_data_05      3695488 MB     3695488 × 36 = 126.9TB    1024k
home              QFS_home         3695488 MB     3695488 × 36 = 126.9TB    512k
work                               524288 MB                                       io2

SRFS file system  QFS file system  Unit capacity  Total capacity            DAU    Remarks
data_06           QFS_data_06      2109888 MB     2109888 × 36 = 72.4TB     1024k
data_07           QFS_data_07      2109888 MB     2109888 × 36 = 72.4TB     1024k
data_08           QFS_data_08      2109888 MB     2109888 × 36 = 72.4TB     1024k
data_09           QFS_data_09      2109888 MB     2109888 × 36 = 72.4TB     1024k
data_10           QFS_data_10      3695488 MB     3695488 × 36 = 126.9TB    1024k
data              QFS_data         3695488 MB     3695488 × 24 = 84.6TB     1024k
sysdata           QFS_sysdata      3695488 MB     3695488 × 12 = 42.3TB     16k
work              QFS_work         524288 MB      524288 × 144 = 72TB       1024k  dx01~36
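The totals in Table 15 follow from multiplying the unit capacity by the number of units. The following is a minimal sketch of that arithmetic, assuming binary units (1 TB = 1024 × 1024 MB), which is the convention that reproduces the printed totals; the function and constant names are illustrative and do not come from the report.

```python
# Assumption: the table uses binary units, 1 TB = 1024 * 1024 MB.
MB_PER_TB = 1024 * 1024


def total_tb(unit_mb: int, units: int) -> float:
    """Total capacity in TB of `units` volumes of `unit_mb` MB each."""
    return unit_mb * units / MB_PER_TB


# (file system, unit capacity in MB, number of units) as listed in Table 15
rows = [
    ("data_01", 2109888, 36),
    ("home", 3695488, 36),
    ("data", 3695488, 24),
    ("sysdata", 3695488, 12),
    ("work", 524288, 144),
]

for name, unit_mb, units in rows:
    print(f"{name:8s} {total_tb(unit_mb, units):6.1f} TB")
```

Rounded to one decimal place, the results agree with the totals listed in the table.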


6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 InfiniBand QDRegraveBCD Fat Treeicircgtlaquoq113aelig13S 18iumlIumlAEligicircgtlaquoordf113aelig13 LeafWdividenaccedil7Iumlordf 2Igravecenttimeseumln1accedil7IumlcdoiumlIuml QDR ordf 1 7gtYZeumlnqaccedil7Iuml 2 Igravelaquogt1 iumlIuml 2 QDR ordfYZeumln(QDR Iacutegt 18 j2Igravegt 36WX)qoiumlIuml`ZaringbrvbarsectT 8GBs (4GBsj2) AAEligpoundq 13aelig13 5PDHLeafaccedilyacute7 SpineWdividenaccedilordf

9iexcllaquosecto LeafaccedilGu Spineaccedil QDRgtYZeumlnqbrvbarpound 113aelig13cd SpineIacutegt 9j2Igrave=18 QDRordfegraveBeumlnq QDRordf 5PoumlLeaf accedil3xmacrDownlink ordf 36 Uplink ordf 18 Xgt Uplinkshy 50WXsectmn 50_ccedilaringWdividenotgtqBX900gta$fY`Ecircfrac34h Fig 23GqBX900gt 50_ccedilaring13aelig13ocircotilde gtthornordf]GqSpineLeafaccedilW13aelig13SoiumlIumlYZGEcircfrac34h Fig 24Gqmmgt13aelig13S nAumliumlIuml nAuml(niquest9)UgraveD nU9Auml(nAgrave9) SpineaccedilegraveBGqlaquo13aelig13laquoiumlIumlordfcent13aelig13iumlIumlcdOcirc$ltAacuteoumlSpineaccedilcd Leafaccedil(2Igrave) QDR2(1j2Igrave)ordfegraveBeumln(2Igrave)LeafaccedilcdiumlIumlQDR2gtYZeumlnqDotildeCmSpineaccedilcd Leaf accedilordmAcirc13aelig13SPAtildeWiumlIumlWnoteumlnqmDHyenwiumlIumlTEacuteOcirc$lt GoumlWiumlIuml 1 W 10 ordflt GoumlWgtn~n FAringordf+XqX13aelig13Sotilde gtiuml_ccedilaring

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

Fig 23 BX900 network (diagram; recoverable labels: chassis 1 to chassis 119; Leaf(1-1) and Leaf(1-2) with nodes bx0001, bx0002, ..., bx0018; Leaf(119-1) and Leaf(119-2) with nodes bx2125, bx2126; Spine(1), Spine(2), ..., Spine(9))

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq

Fig 24 Nodes and switches (diagram; recoverable labels: Spine switches SW 1 to SW 9; Leaf switch SW; nodes 1 to 18 of one chassis, 18 nodes per chassis)
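The link counts quoted in Section 6.1.1 can be checked with a few lines of arithmetic. This sketch is illustrative only: the constant names are invented for the example, and the input figures (18 nodes per chassis, two QDR links per node, nine Spine switches reached by two QDR links each, 4GB/s per QDR link) are the ones stated in the text.

```python
from fractions import Fraction

# Figures quoted in Section 6.1.1; the names are hypothetical.
NODES_PER_CHASSIS = 18
QDR_LINKS_PER_NODE = 2
SPINE_SWITCHES = 9
QDR_LINKS_PER_SPINE = 2      # links from one chassis to each Spine switch
GB_PER_S_PER_QDR_LINK = 4

downlinks = NODES_PER_CHASSIS * QDR_LINKS_PER_NODE      # node-facing links
uplinks = SPINE_SWITCHES * QDR_LINKS_PER_SPINE          # Spine-facing links
uplink_ratio = Fraction(uplinks, downlinks)
node_bandwidth = QDR_LINKS_PER_NODE * GB_PER_S_PER_QDR_LINK

print(f"downlinks per chassis: {downlinks}")
print(f"uplinks per chassis:   {uplinks}")
print(f"uplink/downlink ratio: {float(uplink_ratio):.0%}")
print(f"bandwidth per node:    {node_bandwidth} GB/s")
```

The ratio of 18 uplinks to 36 downlinks is the 50% figure given for the Fat Tree.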

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the communication method

Mode                 RDMA          RDMA           NORDMA
Number of processes  up to 1024    1025 or more   all
Threshold            32768 bytes   16384 bytes    524288 bytes

Table 17 RDMA method: switching between the Normal Send method and the Send on Request method

Message size             0 to less than 16KB   16KB to less than 32KB   32KB or more
1 to 1024 processes      Normal Send           Normal Send              Send on Request
1025 to 2048 processes   Normal Send           Send on Request          Send on Request
2049 or more processes   NORDMA                Send on Request          Send on Request

Table 18 NORDMA method: switching between the Normal Send method and the Send on Request method

Message size     0 to less than 512KB   512KB or more
All processes    Normal Send            Send on Request

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq
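The switching rule of Tables 16 to 18 can be summarized as a small lookup. The following is a sketch under stated assumptions, not the MPI library's implementation: the function names are hypothetical, a message below the threshold is taken to use Normal Send and one at or above it Send on Request, and the separate entry for 2049 or more processes in Table 17 is not modelled.

```python
def switch_threshold(mode: str, nprocs: int) -> int:
    """Switching threshold in bytes, following Table 16."""
    if mode == "NORDMA":
        return 524288
    if mode == "RDMA":
        return 32768 if nprocs <= 1024 else 16384
    raise ValueError(f"unknown mode: {mode!r}")


def send_method(mode: str, nprocs: int, nbytes: int) -> str:
    """Method used for one message of `nbytes` bytes (assumed rule)."""
    if nbytes < switch_threshold(mode, nprocs):
        return "Normal Send"
    return "Send on Request"


for mode, nprocs, nbytes in [
    ("RDMA", 512, 20 * 1024),
    ("RDMA", 2048, 20 * 1024),
    ("NORDMA", 2048, 20 * 1024),
    ("NORDMA", 2048, 512 * 1024),
]:
    print(mode, nprocs, nbytes, send_method(mode, nprocs, nbytes))
```

For example, a 20KB message stays on Normal Send with 512 processes in RDMA mode but moves to Send on Request with 2048 processes, which matches the second column of Table 17.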


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Data size intervals

Parallelism   Allreduce, Reduce, Bcast, SendRecv   Other MPI functions
256           32 bytes                             128 bytes
512           same as above                        128 bytes
1024          same as above                        256 bytes
2048          same as above                        2048 bytes
4096          same as above                        2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()                 ! measurement start
        call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                     b,nelem,MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD,ierr)
        ttime2 = MPI_WTIME()                 ! measurement end
        ttime0 = ttime2 - ttime1
        call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                  MPI_COMM_WORLD,ierr)
        if(myid.eq.0) then
          max_time = max(max_time,ttime)
          min_time = min(min_time,ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig 25 ^_ZLIacutecurrenP
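The bookkeeping in Fig 25 can be illustrated without MPI. In the sketch below, each inner list stands for the elapsed times measured on all ranks in one iteration; taking its maximum plays the role of the MPI_REDUCE call with MPI_MAX, and rank 0 then accumulates the maximum, minimum and sum over iterations. The function name, the initial values and the sample timings are assumptions made for this example.

```python
def summarize(per_iteration_rank_times):
    """Return (max, min, mean) of the per-iteration slowest-rank times."""
    max_time = 0.0
    min_time = float("inf")
    all_time = 0.0
    for rank_times in per_iteration_rank_times:
        ttime = max(rank_times)          # slowest rank, as MPI_MAX would give
        max_time = max(max_time, ttime)
        min_time = min(min_time, ttime)
        all_time += ttime
    return max_time, min_time, all_time / len(per_iteration_rank_times)


# Made-up timings (seconds): two iterations on two ranks.
sample = [[1.0, 2.0], [3.0, 0.5]]
print(summarize(sample))
```

Reducing with the maximum means each iteration is charged with the time of its slowest rank, which is the usual choice for a collective operation that is only complete when every rank has finished.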


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

Fig 26 MPI functions (512 parallel, 1) (plots comparing rdma and nordma, horizontal axis from 0 to 80000; panels: Allreduce, Reduce, Reduce_scatter, Allgather, Gather, Allgatherv)

Fig 27 MPI functions (512 parallel, 2) (plots comparing rdma and nordma, horizontal axis from 0 to 80000; panels: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank0 to Rank511))


brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq

Fig 28 RDMA and NORDMA (0KB~8KB) (chart with a 0% to 100% scale for each of 256, 512, 1024, 2048 and 4096 parallel; the remaining labels are not recoverable from the extracted text)

63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq


      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()                   ! measurement start
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,1,
     &               MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,1,
     &               MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,1,
     &               MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,1,
     &               MPI_COMM_WORLD,ireq(4),ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()                   ! measurement start
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1.d0
          b1(i,j) = 1.d0
          c1(i,j) = 0.d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                   ! measurement end
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()                   ! measurement end
      ttimeDD = ttime8 - ttime7
      ttimeal = ttime6 - ttime5              ! total execution time
      ttimesd = ttimeal - ttimeDD            ! execution time minus computation time

Fig 29 WO=5PDH^_Z
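The three quantities derived at the end of Fig 29 can be restated as follows. This is a minimal sketch: `exposed_communication` mirrors the subtraction in the listing (ttimeal minus ttimeDD), while `hidden_fraction` is a hypothetical helper that is not taken from the report; it assumes a separately measured communication-only time is available for comparison.

```python
def exposed_communication(t5: float, t6: float, t7: float, t8: float) -> float:
    """Time not covered by computation, as ttimesd in Fig 29."""
    compute = t8 - t7        # ttimeDD: matrix initialization and product
    total = t6 - t5          # ttimeal: from posting the requests to waitall
    return total - compute   # ttimesd


def hidden_fraction(comm_alone: float, exposed: float) -> float:
    """Hypothetical metric: share of the communication hidden by computation."""
    return 1.0 - exposed / comm_alone


# Made-up timestamps (seconds) for illustration.
exposed = exposed_communication(t5=0.0, t6=10.0, t7=1.0, t8=9.0)
print(exposed)                          # time left over outside the computation
print(hidden_fraction(8.0, exposed))    # assumes communication alone takes 8.0 s
```

If the transfers progress fully in the background, the exposed time approaches zero; if they only progress inside mpi_waitall, it approaches the communication-only time.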


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()
        do 120 II = 1, NNN
          if(myid.eq.0)
     &      call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                    IIRANK,II,MPI_COMM_WORLD,ierr)
          if(myid.eq.IIRANK)
     &      call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                    0,II,MPI_COMM_WORLD,istatus,ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig 30 ToacuteT ^_ZLIacutecurrenP

NODE
  CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
  CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig 31 Assignment of MPI ranks to cores
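The placement in Fig 31 can be written as a short mapping from MPI rank to node, CPU and core ID. The sketch below assumes, as an illustration, that ranks fill one node completely (8 ranks on 2 CPUs with 4 cores each) before moving to the next node; the function names are hypothetical.

```python
CORES_PER_CPU = 4
CPUS_PER_NODE = 2
RANKS_PER_NODE = CORES_PER_CPU * CPUS_PER_NODE


def placement(rank: int):
    """Return (node, cpu, core_id) for an MPI rank, following Fig 31."""
    node, local = divmod(rank, RANKS_PER_NODE)
    cpu, slot = divmod(local, CORES_PER_CPU)
    core_id = CPUS_PER_NODE * slot + cpu   # CPU0 holds even IDs, CPU1 odd IDs
    return node, cpu, core_id


def path_from_rank0(rank: int) -> str:
    """Classify the route of a message sent from rank 0 to `rank`."""
    node, cpu, _ = placement(rank)
    if node != 0:
        return "other node"
    return "same CPU" if cpu == 0 else "same node, other CPU"


for r in (1, 3, 4, 7, 8):
    print(r, placement(r), path_from_rank0(r))
```

With this layout, partner ranks 1 to 3 exercise communication inside one CPU, ranks 4 to 7 communication between the two CPUs of a node, and rank 8 onward communication between nodes.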

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


13

6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

Ocirc$G (G G)

^_0

^_1

^_2

^_3

^_4

^_5

^_6

^_7

Fig 32 ToacuteT Ocirc$G aYacute


Table 20 (within one chassis)

                             Flat MPI     Hybrid (72×2)   Hybrid (36×4)   Hybrid (18×8)
Transfer rate per process    7238 MB/s    14474 MB/s      28942 MB/s      53467 MB/s
Total per node               57906 MB/s   57899 MB/s      57884 MB/s      53467 MB/s
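The two rows of Table 20 can be cross-checked against each other. The sketch below is a hedged illustration, not part of the original measurement: it assumes that the first row is a per-process rate, that the second row is the sum over the processes on one node, and that one chassis holds 18 nodes (144 cores at 8 cores per node). The numbers are copied as printed in the table, and the name `node_total` is hypothetical.

```python
NODES_PER_CHASSIS = 18  # assumption: 144 cores / 8 cores per node

# (MPI processes, OpenMP threads): (rate per process, total per node),
# copied as printed in Table 20.
table20 = {
    (144, 1): (7238, 57906),
    (72, 2): (14474, 57899),
    (36, 4): (28942, 57884),
    (18, 8): (53467, 53467),
}


def node_total(rate_per_process, n_procs, nodes=NODES_PER_CHASSIS):
    """Per-node total, assuming processes are spread evenly over the nodes."""
    procs_per_node = n_procs // nodes
    return rate_per_process * procs_per_node


# Relative deviation between the estimate and the tabulated node total.
deviations = {}
for (procs, threads), (rate, total) in table20.items():
    estimate = node_total(rate, procs)
    deviations[(procs, threads)] = abs(estimate - total) / total
    print(procs, threads, estimate, total)
```

Under these assumptions the product reproduces the tabulated node totals to within rounding, which suggests that the first three configurations reach about the same per-node total while the 18×8 hybrid run is somewhat lower.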

Table 21 (inter-chassis, flat MPI)

        144 MPI                      192 MPI                      224 MPI
Node    Rate (MB/s)  IB contention   Rate (MB/s)  IB contention   Rate (MB/s)  IB contention
 1      7246                         4670         yes             4671         yes
 2      7244                         4670         yes             4671         yes
 3      7246                         4670         yes             4671         yes
 4      7242                         7243                         4670         yes
 5      7243                         7244                         4671         yes
 6      7242                         7240                         7237
 7      7241                         7239                         7246
 8      7244                         7240                         7245
 9      7249                         7243                         7240
10                                   4670         yes             4670         yes
11                                   4670         yes             4672         yes
12                                   4669         yes             4671         yes
13                                                                4671         yes
14                                                                4670         yes

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq


Table 22 (inter-chassis, hybrid)

        Hybrid (112×2)               Hybrid (56×4)                Hybrid (28×8)
Node    Rate (MB/s)  IB contention   Rate (MB/s)  IB contention   Rate (MB/s)  IB contention
 1      9339         yes             18567        yes             36408        yes
 2      9337         yes             18564        yes             36919        yes
 3      9346         yes             18566        yes             36536        yes
 4      9347         yes             18570        yes             36457        yes
 5      9342         yes             18651        yes             36646        yes
 6      14385                        28606                        53808
 7      14488                        29118                        53779
 8      14500                        28676                        53789
 9      14495                        29100                        53784
10      9337         yes             18797        yes             36408        yes
11      9340         yes             18775        yes             36715        yes
12      9343         yes             18651        yes             36559        yes
13      9348         yes             18811        yes             36457        yes
14      9338         yes             18694        yes             36991        yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq


Ta

ble

23iacuteXCcedilEgrave

MPI

_allt

oall

vUacuteSTUcircR

vRcopyUumlYacutebrvbarTHORN

szligvcopyUumlYacutebrvbaragrave

pvRaacuteacircatildeNR

+aumleAumlovqaringaeligPgR

Oslash 5 Iuml

copy Igrave - a 5 (MB)

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

R

R

szligR

pR

144m

pi

512

3R6

0 11

14

118

0

145

9 3

60 R

138

6

36

0 13

87

100

1

31

124

1

25

2plusmn

(BC

DEacute

) 51

2 3R

60

111

4

118

0

148

2 11

16 R

146

9

63

0 14

75

100

1

33

132

1

32

3plusmn

(BC

DEacute

) 51

2 3R

60

111

4

29

0 14

81

714

13 R

153

3

29

0 14

21

100

1

33

138

1

28

448m

pi

512

14R

40

142

6

414

0

182

4 21

272

1 R

170

3

78

0 16

32

100

1

28

119

1

14

504m

pi

512

14R

45

142

7

415

8

180

4 27

242

6 R

160

5

79

0 14

33

100

1

26

112

1

00

2048

mpi

51

2 37R

69

182

1

298

8 24

93

465

6 R

250

1

465

6 25

04

100

1

37

137

1

38

2plusmn

51

2 37R

69

166

7

2211

6

262

5 67

564

6 R

264

7

298

8 25

04

100

1

57

159

1

50

3plusmn

(BC

DEacute

) 51

2 37R

69

166

7

2211

6

262

1 69

524

9 R

269

3 6

298

8 24

59

100

1

57

162

1

47

3072

mpi

51

2 69R

56

258

7 28

399

9 27

34

5371

54 R

286

3 9

507

7 26

44

100

1

06

111

1

02

68m

pij

2om

p 10

24

3R5

7 12

10

11

170

11

80

91

9 R

122

3

28

5 12

21

100

0

97

101

1

01

120m

pij

2om

p 10

24

9R3

3 12

89

215

0

127

0 12

191

6 R

130

4

47

5 12

94

100

0

99

101

1

00

132m

pij

2om

p 10

24

8R4

1 13

79

216

5

116

1 15

211

6 R

119

5

48

3 11

38

100

0

84

087

0

83

432m

pij

2om

p 10

24

17R

64

162

4

119

8 16

28

1925

43 R

161

2 1

129

0 16

18

100

1

00

099

1

00

30m

pitimes4

omp

1024

4R

38

598

115

0

674

12

13 R

683

27

5 68

2

100

1

13

114

1

14

2plusmn

(BC

DEacute

) 10

24

4R3

8 59

8

115

0

678

12

13 R

690

27

5 70

0

100

1

13

115

1

17

36m

pitimes4

omp

1024

6R

30

603

118

0

684

11

16 R

697

63

0 70

3

100

1

13

116

1

17

2plusmn

(BC

DEacute

) 10

24

6R3

0 60

3

29

0 68

8

714

13 R

702

29

0 71

0

100

1

14

116

1

18

126m

pitimes4

omp

1024

16R

39

822

3

415

8

805

27

242

6 R

738

1

79

0 71

5

100

0

98

090

0

87

180m

pitimes4

omp

1024

18R

50

925

615

0

892

40

283

2 R

776

127

5 75

3

100

0

96

084

0

81

2plusmn

(BC

DEacute

) 10

24

18R

50

925

712

9

861

29

352

6 R

808

109

0 76

8

100

0

93

087

0

83

612m

pitimes4

omp

1024

45R

68

110

2

349

0 10

34

525

9 R

111

4

349

0 10

46

100

0

94

101

0

95

2plusmn

(BC

DEacute

) 10

24

45R

68

110

2

2810

9

920

57

634

9 R

960

5

358

7 83

4

100

0

83

087

0

76

17m

pitimes8

omp

1024

3R

57

338

1

117

0

430

9

19 R

432

28

5 43

3

100

1

27

128

1

28

90m

pitimes8

omp

1024

15R

60

504

109

0 44

9

243

8 R

471

109

0 47

1

100

0

89

093

0

94

357m

pitimes8

omp

1024

53R

67

731

418

7 69

2

576

3 R

711

418

7 72

9

100

0

95

097

1

00

2plusmn

(BC

DEacute

) 10

24

53R

67

731

3410

5

642

57

685

3 R

628

6

438

3 60

2

100

0

88

086

0

82


brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 Summary

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

Acknowledgments

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

References

1) Innovative Computing Laboratory, the University of Tennessee, "HPC Challenge", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net/


A

FX1

HPC

C B

MT

G-F

FTEfeaZUuml^13~

-DH

PCC

_FFT

_235GmWgt

FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave

EP-

DG

EM

M-a5^regbrvbarpoundaringaeligccedil13h-a5

(15

MB

Cor

e)UacuteCordfzyacuteCq

Tabl

e A

-1feaZUuml^13~]-a5^regCD

FX1

HPC

C B

MT

UgraveccedilegravekeacuteUgraveccedilknUgraveccedilplusmnsup2sup3

~~regreg

Ugraveccedilmmnecircecirckccedilnecircfrac12

regReeumlgecirckccedilnecircfrac12

ndegplusmnsup2

ecirckccedilpUgraveecircfrac12frac12plusmnsup2sup3degmiddot

plusmnsup2ordmdegsup2sup1plusmnsup2sup3degmiddot

eacuteplusmn~+igraveiacute8

nmcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

reg~

reg~v

reg

sup2sup3plusmn

qvqyuqou

ovwyy

qvquzqq

vxzwo

qvzz

vyoy

zzv|w|

qvuqqy

uvuyx

qyu

sup2sup3plusmnqvqy|zx

ovwz

qvquyxz

vxuo

qvzuw

vyowx

zzv|z

qvxx|y

yvox

qwy

sup2sup3plusmn

qvxoquq

uvoxzo

qvoqqz

yyvuox

w|vyzx

vyoxx

zzvzyq

qv|qq

vqy

yy

sup2sup3plusmnqvxoqwwq

uvqwxx

qvooqzu

yyvxuwo

w|vyyq

vyouu

zzvzzx

qvux|

oqvz|

xw

sup2sup3plusmn

|vwq|qqq

|uvwz

qvqoxxw

qv|

o|uuv|q

vyxy

zzv||wq

qvqwzuw

oqvywyu

sup2sup3plusmn|vw||wqq

|vyz

qvuxu

yyyvuwwu

o|uyvyw

vy|q

zzv|uy

qvxwzqu

ovxuo

zq|z

sup2sup3plusmn

|vwuzqq

uuvxqwq

qvqoxxw

wwyv|zq

o|uv|q

vy|ox

xvuuwu

qvowox

oqvowuyq

sup2sup3plusmn|vwoquxqq

|yvzyxu

qvuzou

yvqwq|

o|uyvzxu

vy|qw

xvuuz

qvqwuy

ovuz

|y

uxqqq

wxqqq

|xqqqq

|qqqq

w

frac14icircRfrac14Rplusmndegreg

szligsup3degicircRRszligcedilszligiumliumlRethvRntildeRfrac14degicircRccedilogravemicroplusmnregntildemicro~sup1Oslashsup3sup2moRenecircfrac12oacuteregplusmnAumlocircotildeg

egravekeacuteOslashfrac14knicircccedilpegravekszligszligOslashmmnOslash|xRccedilpOslashpOslashfrac14knRccedilpegravekszligszligOslashfrac12ecircfrac12eacuteeacuteszlignRocircotilde

mszligpecircmicircccedilpeacutefrac14UgraveOslashOslashyunRocircotilde

frac12kicircRRkplusmnplusmnplusmnraquodegReacutemiddotplusmnmiddotRkplusmn~plusmnmiddot

-

moRexoszligcedilowszligkoumlg

szligdegicircRkszligyudividentildeRvxUgraveegraveparantildeRpposlashueuqUgravecedilregedegsup2plusmngntildeRo|vxUgravecedilregesup3plusmnregsup2RugraveRnecircfrac12gg

Cuacute uacuteucircicircuumlTyacutethornYacuteN0ucircicircuumlTyacutethornYacuteN0gt

|

xo

xo

egravekszligszligRethvicircRethov|vo

eeumlgicircecirckccedilnecircfrac12RregR0Eeecirckccedilnecircfrac12Rndegplusmnsup2gIcirc-RJ13NOIPQ

~~icircRppRmicrodegdegugraveplusmnsup2ntildeRoreg~cedilsup2ntildeRmplusmnRn


B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Figure: six panels for the 1024-way parallel runs, titled Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv and Gather. Each panel compares the rdma and nordma settings.]

Fig. B-1 MPI communication (1024-way parallel) - 1


[Figure: eight panels for the 1024-way parallel runs, titled Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (Rank0 > Rank1023). Each panel compares the rdma and nordma settings.]

Fig. B-2 MPI communication (1024-way parallel) - 2


13

C

RD

MAoacute

NO

RD

MA

Ta

ble

C-13UVOcirc$

(8K

Bccedil

78K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

quw-

uqzy-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

~plusmnreg

plusmn

sup2~

sup2~

sup2cedil~raquo

sup2~Oslashreg~plusmn

middotplusmnsup1

middotplusmnsup1raquo

~plusmnraquo

~plusmn

~plusmn

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

plusmnraquo

Tabl

e C

-2UVOcirc$

(80

KBccedil

584K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

sup2~Oslashreg~plusmn

plusmn

~plusmnreg

middotplusmnmiddotsup1

~plusmnraquo

~plusmn

sup2~

sup2cedil~raquo

sup2~

middotplusmnsup1raquo

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

~plusmn

plusmnraquo

Zordf

100M

RD

MAAgravejordfnotyWXq

0CXM

NO

RD

MA

AgravejordfnotyWXq


D

W Oslash5

Tabl

e D

-1Zccedil`

MPI

8mpi

qzvquecirciumlq|yvwuecirciumlquvzuecirciumlquovu|oecirciumlqxvwoqecirciumlquyvxqoecirciumlquyvxoecirciumlqu

qwvo

qzvzzecirciumlq|vuecirciumlquwvyecirciumlquovu|oecirciumlqxvwuecirciumlquyvuzoecirciumlquxvy|ecirciumlqu

yqyv|

owvz|oecirciumlq|yvwyyecirciumlquvxzecirciumlquvqxecirciumlquov|uecirciumlquyvuecirciumlquccedilxv|xecirciumlq

ccedilyvq

ozvqoecirciumlq|vxxecirciumlquwvux|ecirciumlquvyzoecirciumlquovxoecirciumlquyvuuqecirciumlquccedilvyecirciumlq|

ccedilwuvy

ovq|ecirciumlquyvzz|ecirciumlquwvq|qecirciumlquovu|qecirciumlqxvw|wecirciumlquyvuyoecirciumlquyvywecirciumlqu

yquv

ovquyecirciumlquvuyqecirciumlquwvxqecirciumlquovu|qecirciumlqxvwecirciumlquyvuwqecirciumlquxvzxecirciumlqu

xx|vw

|ovq|zecirciumlquyvwzuecirciumlquvz||ecirciumlquvyoecirciumlquovwecirciumlquyvuzecirciumlquccedilovozecirciumlq|

ccediloyvy

|ovquecirciumlquv|qecirciumlquwvuoyecirciumlquvyy|ecirciumlquovowyecirciumlquyvuecirciumlquccedilvxwecirciumlq|

ccedilovz

uzvwoecirciumlq|vuxyecirciumlquwvu|uecirciumlquovu|ecirciumlqxvw|ecirciumlquyvuuxecirciumlquxvwwuecirciumlqu

yqovy

uzvywecirciumlq|yvzuzecirciumlquvzoyecirciumlquovuwecirciumlqxvwqyecirciumlquyvuecirciumlquyv|yecirciumlqu

yxvu

xzvqecirciumlq|vuoqecirciumlquwv|wqecirciumlquvyyecirciumlquovowecirciumlquyvuuzecirciumlquccedilvozecirciumlq|

ccedil|vx

xzvyzzecirciumlq|voxuecirciumlquwvouecirciumlquvqqecirciumlquovxuecirciumlquyvuuyecirciumlquccediluv|yecirciumlq|

ccedilu|v

yzvuecirciumlq|vxzecirciumlquwvxecirciumlquovuyecirciumlqxvwecirciumlquyvuxecirciumlquxvyzqecirciumlqu

xw|vw

yzvz|oecirciumlq|yvwzoecirciumlquvwwxecirciumlquovuzecirciumlqxvwyecirciumlquyvuyyecirciumlquyvuqwecirciumlqu

yuxv|

zvwqecirciumlq|v|uzecirciumlquwv||yecirciumlquvyxwecirciumlquovozecirciumlquyvuzecirciumlquccedilyvwuecirciumlq|

ccedilywv

ovqq|ecirciumlquyvzzyecirciumlquvzzzecirciumlquvyzecirciumlquovqoecirciumlquyvuzecirciumlquccedil|vozxecirciumlq|

ccedil|ovz

zvuzecirciumlq|vowqecirciumlquwvoxxecirciumlquovoqqecirciumlqxuvxwecirciumlquyvuqecirciumlquvwu|ecirciumlqu

zovy

zvwecirciumlq|vyxecirciumlquwvuecirciumlquovqzzecirciumlqxuvxoecirciumlquyvuyzecirciumlquvu|ecirciumlqu

wvw

qwv|xzecirciumlq|vzecirciumlquwvywecirciumlquvxyoecirciumlquovqwuecirciumlquyvuecirciumlquccedilovqyecirciumlquccedilov

qzvuuuecirciumlq|v|yecirciumlquwvqecirciumlquovuzecirciumlqxvwqqecirciumlquyvuzoecirciumlquyvqoecirciumlqu

y|vx

oovouwecirciumlquwvqqecirciumlquzvoywecirciumlquovuqwecirciumlqxvyxxecirciumlquyvuwecirciumlquuvzoxecirciumlqu

uwvq

owvwuecirciumlq|vq|qecirciumlquvzqzecirciumlquv|uuecirciumlquzvqoecirciumlq|yvu|ecirciumlquccedilxvyxqecirciumlq|

ccedilyuv|

yvz||ecirciumlq|v|oecirciumlquwvuuecirciumlquvqyecirciumlqxovuowecirciumlqxyvu||ecirciumlquovozecirciumlqxoxwvy

ovouecirciumlquvuxecirciumlquwv|zecirciumlquovu|oecirciumlqxvwxyecirciumlquyvux|ecirciumlquxvzoecirciumlqu

xoyvo

|ov|qoecirciumlquv|zqecirciumlquwvyzoecirciumlquv|oecirciumlquwvu|uecirciumlq|yvu|ecirciumlquccedilov|uecirciumlquccediloqxv

|v|qecirciumlq|vqecirciumlquvxzecirciumlquvxqecirciumlquovq|ecirciumlquyvuzecirciumlquccedilv|zoecirciumlq|

ccedil|vu

uwvxy|ecirciumlq|wvzoyecirciumlquzv|ecirciumlquovuozecirciumlqxvuoqecirciumlquyvwqecirciumlquuvuoecirciumlqu

xoxvw

uovquecirciumlquvwecirciumlquwvqecirciumlquovuouecirciumlqxvyxqecirciumlquyvuzoecirciumlquxvwoecirciumlqu

xy|vx

xzvux|ecirciumlq|vxwecirciumlquwvx|ecirciumlquvx|oecirciumlquovoqoecirciumlquyvu|qecirciumlquccedilzvzoecirciumlq|ccediloquvz

xwvqqecirciumlq|vuuwecirciumlquwv|owecirciumlquvzuzecirciumlquov|wyecirciumlquyvxy|ecirciumlquccedil|vywuecirciumlq|

ccediluv|

yzvqecirciumlq|zv|w|ecirciumlquovq|qecirciumlqxovuxecirciumlqxvyecirciumlquyvzqecirciumlquuvqecirciumlqu

uyuvq

yovqyuecirciumlquv|quecirciumlquwv|ywecirciumlquovuyuecirciumlqxwvqxwecirciumlquyvxwoecirciumlquyvecirciumlqu

xwzvx

ovoooecirciumlquvzwwecirciumlquzvqzzecirciumlquvo|ecirciumlqxovuoecirciumlqxyvyouecirciumlquovecirciumlqxooqqvy

ovqoecirciumlquvqyoecirciumlquwvo|ecirciumlquvy|ecirciumlquovoecirciumlquyvuxecirciumlquccediluvxzecirciumlq|

ccediluvz

zvy|ecirciumlq|wvoqqecirciumlquzvqyecirciumlquov|uqecirciumlqxyvw|ecirciumlquyvxyecirciumlquuv||ecirciumlqu

uzovo

zvyzoecirciumlq|vqwecirciumlquwvoecirciumlquovqzwecirciumlqxuvuwwecirciumlquyvuzyecirciumlquvwqyecirciumlqu

yxvy

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg


[Figure: four panels, titled rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2 and nordma (8mpi)-2.]

Fig. D-1 Flat MPI, 8mpi


Tabl

e D

-2Icirca|ccedil`eacuteYacute~

1

4mpij

8om

p

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

[Figure: two panels, titled rdma (4mpi×8omp)-1 and nordma (4mpi×8omp)-1.]

Fig. D-2 Hybrid, No. 1, 4mpi×8omp (1)


[Figure: four panels, titled rdma (4mpi×8omp)-2, nordma (4mpi×8omp)-2, rdma (4mpi×8omp)-3 and nordma (4mpi×8omp)-3.]

Fig. D-3 Hybrid, No. 1, 4mpi×8omp (2)


Tabl

e D

-3Icirca|ccedil`eacuteYacute~

2

4mpij

8om

p

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

Fig. D-4 [...] 2, 4mpi × 8omp (1) (panels: rdma+ (4mpi × 8omp)-1, nordma+ (4mpi × 8omp)-1)


Fig. D-5 [...] 2, 4mpi × 8omp (2) (panels: rdma+ (4mpi × 8omp)-2, nordma+ (4mpi × 8omp)-2, rdma+ (4mpi × 8omp)-3, nordma+ (4mpi × 8omp)-3)


E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

    nmb = 128  intcpu = 1
    Number of CPUs = 24
    ------------------------------------------------
    CPU(rank0) -> every 1 CPUs 128MB data communication
    ------------------------------------------------
    The values are correct at all processes all time
    RANK =  1 TIME = 3.9305 sec Ave Speed = 3256.6145 MB/s
    RANK =  2 TIME = 3.9197 sec Ave Speed = 3265.5168 MB/s
    RANK =  3 TIME = 3.9212 sec Ave Speed = 3264.3219 MB/s
    RANK =  4 TIME = 3.4341 sec Ave Speed = 3727.2765 MB/s
    RANK =  5 TIME = 3.4175 sec Ave Speed = 3745.3885 MB/s
    RANK =  6 TIME = 3.4086 sec Ave Speed = 3755.2470 MB/s
    RANK =  7 TIME = 3.4329 sec Ave Speed = 3728.6226 MB/s
    RANK =  8 TIME = 2.4021 sec Ave Speed = 5328.7420 MB/s
    RANK =  9 TIME = 2.3794 sec Ave Speed = 5379.4939 MB/s
    RANK = 10 TIME = 2.3805 sec Ave Speed = 5377.0236 MB/s
    RANK = 11 TIME = 2.3798 sec Ave Speed = 5378.6914 MB/s
    RANK = 12 TIME = 2.3942 sec Ave Speed = 5346.2490 MB/s
    RANK = 13 TIME = 2.3931 sec Ave Speed = 5348.8046 MB/s
    RANK = 14 TIME = 2.3950 sec Ave Speed = 5344.3826 MB/s
    RANK = 15 TIME = 2.4066 sec Ave Speed = 5318.6052 MB/s
    RANK = 16 TIME = 2.3798 sec Ave Speed = 5378.6327 MB/s
    RANK = 17 TIME = 2.3802 sec Ave Speed = 5377.5886 MB/s
    RANK = 18 TIME = 2.3821 sec Ave Speed = 5373.3399 MB/s
    RANK = 19 TIME = 2.3794 sec Ave Speed = 5379.5866 MB/s
    RANK = 20 TIME = 2.3939 sec Ave Speed = 5346.8837 MB/s
    RANK = 21 TIME = 2.3938 sec Ave Speed = 5347.0775 MB/s
    RANK = 22 TIME = 2.3939 sec Ave Speed = 5347.0333 MB/s
    RANK = 23 TIME = 2.3942 sec Ave Speed = 5346.2889 MB/s
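As a consistency check on the listing above, the following sketch recomputes the average speed of a few ranks from their elapsed times. It rests on two assumptions that the listing itself does not state: that the decimal points lost in extraction belong after the first digit of TIME and after the fourth digit of Ave Speed (so rank 1 reads 3.9305 s and 3256.6145 MB/s), and that each rank moved 128 MB one hundred times. The repetition count of 100 is inferred from the arithmetic only. The function name `implied_speed` is hypothetical.

```python
# Sketch: check that TIME and Ave Speed in the listing are mutually consistent.
# Assumptions (not stated in the listing): 100 repetitions of a 128 MB transfer,
# and decimal points restored as d.dddd (TIME) and dddd.dddd (Ave Speed).

MESSAGE_MB = 128      # nmb = 128 in the listing header
REPETITIONS = 100     # assumed; inferred from speed * time being close to 12800 MB

# (rank, TIME in s, Ave Speed in MB/s)
SAMPLES = [
    (1, 3.9305, 3256.6145),
    (4, 3.4341, 3727.2765),
    (8, 2.4021, 5328.7420),
]


def implied_speed(time_s, message_mb=MESSAGE_MB, repetitions=REPETITIONS):
    """Return the average speed in MB/s implied by the elapsed time."""
    return message_mb * repetitions / time_s


for rank, time_s, reported in SAMPLES:
    estimate = implied_speed(time_s)
    rel_diff = abs(estimate - reported) / reported
    print(f"rank {rank}: implied {estimate:.1f} MB/s, "
          f"reported {reported:.1f} MB/s, relative difference {rel_diff:.1e}")
```

If the two assumptions hold, the implied and reported speeds should differ only by the rounding of the printed TIME value.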


Fig E-2 afeaZQR MPIbrvbar

    nmb = 128  intcpu = 1
    Number of CPUs = 24
    ------------------------------------------------
    CPU(rank0) -> every 1 CPUs 128MB data communication
    ------------------------------------------------
    The values are correct at all processes all time
    RANK =  1 TIME = 3.9370 sec Ave Speed = 3251.1775 MB/s
    RANK =  2 TIME = 3.9475 sec Ave Speed = 3242.5231 MB/s
    RANK =  3 TIME = 3.9415 sec Ave Speed = 3247.4846 MB/s
    RANK =  4 TIME = 3.4392 sec Ave Speed = 3721.7925 MB/s
    RANK =  5 TIME = 3.4543 sec Ave Speed = 3705.5373 MB/s
    RANK =  6 TIME = 3.4250 sec Ave Speed = 3737.2742 MB/s
    RANK =  7 TIME = 3.4709 sec Ave Speed = 3687.8286 MB/s
    RANK =  8 TIME = 2.6967 sec Ave Speed = 4746.5876 MB/s
    RANK =  9 TIME = 2.6964 sec Ave Speed = 4747.1018 MB/s
    RANK = 10 TIME = 2.6955 sec Ave Speed = 4748.6289 MB/s
    RANK = 11 TIME = 2.6947 sec Ave Speed = 4749.9771 MB/s
    RANK = 12 TIME = 2.6957 sec Ave Speed = 4748.3647 MB/s
    RANK = 13 TIME = 2.7020 sec Ave Speed = 4737.3054 MB/s
    RANK = 14 TIME = 2.6948 sec Ave Speed = 4749.8023 MB/s
    RANK = 15 TIME = 2.7174 sec Ave Speed = 4710.4542 MB/s
    RANK = 16 TIME = 2.6956 sec Ave Speed = 4748.5054 MB/s
    RANK = 17 TIME = 2.6945 sec Ave Speed = 4750.4899 MB/s
    RANK = 18 TIME = 2.6946 sec Ave Speed = 4750.2360 MB/s
    RANK = 19 TIME = 2.6967 sec Ave Speed = 4746.6158 MB/s
    RANK = 20 TIME = 2.6948 sec Ave Speed = 4749.9767 MB/s
    RANK = 21 TIME = 2.6982 sec Ave Speed = 4743.8204 MB/s
    RANK = 22 TIME = 2.7087 sec Ave Speed = 4725.4830 MB/s
    RANK = 23 TIME = 2.7004 sec Ave Speed = 4739.9960 MB/s


Fig E-3 QRfeaZOpenMPIbrvbar

    nmb = 128  intcpu = 1
    Number of CPUs = 24
    ------------------------------------------------
    CPU(rank0) -> every 1 CPUs 128MB data communication
    ------------------------------------------------
    The values are correct at all processes all time          NODE    CPU   coreid
    Rank = 0                                                  bx2050  CPU0  0
    RANK =  1 TIME = 3.8110 sec Ave Speed = 3358.7213 MB/s    bx2050  CPU1  1
    RANK =  2 TIME = 2.8656 sec Ave Speed = 4466.7752 MB/s    bx2050  CPU0  2
    RANK =  3 TIME = 3.7959 sec Ave Speed = 3372.0175 MB/s    bx2050  CPU1  3
    RANK =  4 TIME = 2.8658 sec Ave Speed = 4466.4664 MB/s    bx2050  CPU0  4
    RANK =  5 TIME = 3.7829 sec Ave Speed = 3383.6073 MB/s    bx2050  CPU1  5
    RANK =  6 TIME = 2.8641 sec Ave Speed = 4469.1880 MB/s    bx2050  CPU0  6
    RANK =  7 TIME = 3.7892 sec Ave Speed = 3377.9924 MB/s    bx2050  CPU1  7
    RANK =  8 TIME = 3.5140 sec Ave Speed = 3642.5620 MB/s    bx2051  CPU0  0
    RANK =  9 TIME = 3.5017 sec Ave Speed = 3655.4163 MB/s    bx2051  CPU1  1
    RANK = 10 TIME = 3.8916 sec Ave Speed = 3289.1010 MB/s    bx2051  CPU0  2
    RANK = 11 TIME = 4.5416 sec Ave Speed = 2818.4186 MB/s    bx2051  CPU1  3
    RANK = 12 TIME = 3.5172 sec Ave Speed = 3639.2511 MB/s    bx2051  CPU0  4
    RANK = 13 TIME = 3.5053 sec Ave Speed = 3651.6468 MB/s    bx2051  CPU1  5
    RANK = 14 TIME = 3.5200 sec Ave Speed = 3636.3472 MB/s    bx2051  CPU0  6
    RANK = 15 TIME = 3.5009 sec Ave Speed = 3656.1987 MB/s    bx2051  CPU1  7
    RANK = 16 TIME = 2.2996 sec Ave Speed = 5566.1441 MB/s    bx2052  CPU0  0
    RANK = 17 TIME = 3.5587 sec Ave Speed = 3596.8575 MB/s    bx2052  CPU1  1
    RANK = 18 TIME = 3.5587 sec Ave Speed = 3596.8151 MB/s    bx2052  CPU0  2
    RANK = 19 TIME = 3.5712 sec Ave Speed = 3584.2525 MB/s    bx2052  CPU1  3
    RANK = 20 TIME = 2.2996 sec Ave Speed = 5566.1978 MB/s    bx2052  CPU0  4
    RANK = 21 TIME = 3.6856 sec Ave Speed = 3472.9458 MB/s    bx2052  CPU1  5
    RANK = 22 TIME = 2.2994 sec Ave Speed = 5566.6382 MB/s    bx2052  CPU0  6
    RANK = 23 TIME = 2.2991 sec Ave Speed = 5567.3257 MB/s    bx2052  CPU1  7
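One way to read a listing like the one above is to group the ranks by node and look at the spread of speeds within each node. The sketch below does that for the 23 reported ranks. It assumes that the Ave Speed values carry four decimal places (the decimal points were lost in extraction) and that ranks are placed eight per node in rank order, as the NODE and coreid columns suggest. The helper names `node_of` and `summarize` are hypothetical.

```python
# Sketch: per-node minimum and maximum of the reported average speeds.
# Assumptions: speeds in MB/s with decimal points restored as dddd.dddd,
# and rank r running on node bx(2050 + r // 8), as the NODE column suggests.
from collections import defaultdict

RANKS_PER_NODE = 8
FIRST_NODE = 2050     # node names bx2050, bx2051, bx2052

SPEEDS = {
    1: 3358.7213, 2: 4466.7752, 3: 3372.0175, 4: 4466.4664,
    5: 3383.6073, 6: 4469.1880, 7: 3377.9924, 8: 3642.5620,
    9: 3655.4163, 10: 3289.1010, 11: 2818.4186, 12: 3639.2511,
    13: 3651.6468, 14: 3636.3472, 15: 3656.1987, 16: 5566.1441,
    17: 3596.8575, 18: 3596.8151, 19: 3584.2525, 20: 5566.1978,
    21: 3472.9458, 22: 5566.6382, 23: 5567.3257,
}


def node_of(rank):
    """Return the node name a rank is assumed to run on."""
    return f"bx{FIRST_NODE + rank // RANKS_PER_NODE}"


def summarize(speeds):
    """Map each node to (min speed, max speed, number of reported ranks)."""
    by_node = defaultdict(list)
    for rank, speed in sorted(speeds.items()):
        by_node[node_of(rank)].append(speed)
    return {node: (min(v), max(v), len(v)) for node, v in by_node.items()}


for node, (lo, hi, count) in summarize(SPEEDS).items():
    print(f"{node}: {count} ranks, min {lo:.1f} MB/s, max {hi:.1f} MB/s")
```

Rank 0 is the sender and has no speed of its own, so the first node contributes seven values rather than eight.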


13

F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig F-1 ^_ZugraveuacuteLFlat MPIP

    (main)
          n = nmb*1024*1024/8        ! nmb = 128 (from input data)
          allocate(a(n),stat=ierr1)
          allocate(b(n),stat=ierr2)
          nsize = n                  ! size
    c NNN : Repetition number of times
          NNN = 100
          do 100 i = 1, nsize
            a(i) = 1.0d0
            b(i) = 0.0d0
      100 continue
          call MPI_BARRIER(MPI_COMM_WORLD,ierr)
          ttime1 = MPI_WTIME()
          do 120 II = 1, NNN
            call atob(a,b,n)
      120 continue
          ttime2 = MPI_WTIME()
          timew = ttime2 - ttime1
          ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
         &          /1024.d0/1024.d0

          subroutine atob(a,b,n)
          implicit none
          integer(4) i,n
          real(8) a(n),b(n)
          do i = 1, n
            b(i) = a(i)
          end do
          return
          end
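The speed formula at the end of the timing loop in Fig F-1 can be written out as a small calculation. The sketch below is a Python transcription, not the authors' program; it assumes that the operators stripped from the extracted listing were `*` and `/`, so that `n = nmb*1024*1024/8` and `ave_spd` is the array size in bytes divided by the time per repetition, scaled to MB/s. The function names and the elapsed time of 2.0 s used in the example are hypothetical.

```python
# Sketch of the Fig F-1 speed formula, under the assumption that the
# operators lost in extraction were '*' and '/'.

BYTES_PER_REAL8 = 8


def array_length(nmb):
    """Number of real(8) elements in an array of nmb megabytes."""
    return nmb * 1024 * 1024 // BYTES_PER_REAL8


def average_speed_mb_per_s(nsize, timew, nnn):
    """Array size in bytes per repetition time, expressed in MB/s."""
    bytes_per_copy = float(nsize) * BYTES_PER_REAL8
    seconds_per_copy = timew / float(nnn)
    return bytes_per_copy / seconds_per_copy / 1024.0 / 1024.0


n = array_length(128)                     # nmb = 128, as in the listing
example = average_speed_mb_per_s(n, timew=2.0, nnn=100)   # 2.0 s is made up
print(f"n = {n}, example speed = {example:.1f} MB/s")
```

Under this reading the formula counts the size of one array per copy, so a 128 MB array copied 100 times in 2.0 s would be reported as about 6400 MB/s.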


Fig F-2 ^_ZugraveuacuteLOpenMPP

    (main)
    !$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
          n = nmb*1024*1024/8        ! nmb = 128 (from input data)
    !$OMP parallel shared(n)
          allocate(a(n),stat=ierr1)
          allocate(b(n),stat=ierr2)
          if(ierr1.ne.0.or.ierr2.ne.0) then
            write(6,*) 'Array allocation failed'
            stop
          end if
    !$OMP end parallel
          nsize = n                  ! size
    c NNN : Repetition number of times
          NNN = 100
    !$OMP parallel private(i),shared(nsize)
          do 100 i = 1, nsize
            a(i) = 1.0d0
            b(i) = 0.0d0
      100 continue
    !$OMP end parallel
    !$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
    !$OMP& shared(NNN,n)
          ttime1 = omp_get_wtime()
          do 120 II = 1, NNN
            call atob(a,b,n)
      120 continue
          ttime2 = omp_get_wtime()
          timew = ttime2 - ttime1
          ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
         &          /1024.d0/1024.d0

          subroutine atob(a,b,n)
          implicit none
          integer(4) i,n
          real(8) a(n),b(n)
          do i = 1, n
            b(i) = a(i)
          end do
          return
          end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

sup2~ yntildeouu |ntildewzz |ntildeuou ovou ontildezx ontildeqyw ovw| ovzz |vqsup2~ xuntildeww ntilde|o ntildeuo qvzz |ntildeqo |ntildey qvz vu vsup2~ wyntildeu| oqntildexx oontildeqyy qvz |ntildezwy untildeuyw qvwz vq vuwsup2~ ontildequwntildexy ountildexxy oxntildeuy qvzu untildezqq xntildewyw qvwu vz vyu

sup2~ yntildeouu zou zo qvzu ontildeqqq ontildeq qvzw qvzo qvzxsup2~ xuntildeww ontildeo ontildezy qvwz ontildew ntildeuuz qv| qvz qvzsup2~ wyntildeu| ntildexuo ntildey| qvz ntildeyzz |ntildeuqw qvz qvzu qvwosup2~ ontildequwntildexy |ntildex untildeqox qvzu |ntildexww untilde|| qvw ovqx qvz

sup2~Oslashreg~plusmn yntildeouu yntildeuw| wontildexoz qvzu xntildeuo |ntildeoqw ovoq |vqq |vx|sup2~Oslashreg~plusmn xuntildeww ou|ntildewx ountildeqo qvw| uyntildeqzq yyntildewxo qvyz |vo vyosup2~Oslashreg~plusmn wyntildeu| oontildezoo yzntildeuz qvz yntildeqz zxntildeox qvo |voy vw|sup2~Oslashreg~plusmn ontildequwntildexy w|ntildeyox |yontildeuq| qvw wntildexo o|ntildexqz qvo |v| vz|

middotplusmnsup1 yntildeouu |ntildezw ||ntilde|x| qvzz oontildexww oxntildeyzu qvu vwx vo|middotplusmnsup1 xuntildeww y|ntildex yyntildeyy qvzx ontildez|x untilde|x qvuy vww ovuomiddotplusmnsup1 wyntildeu| wzntildewy z|ntildeyux qvzy |qntildeyx yntildeyqu qvuy vz ov|zmiddotplusmnsup1 ontildequwntildexy oqntildeux| o|untildeqox qvzq uqntildeyxx wwntildez| qvuy vzy ovxo

middotplusmnsup1raquo yntildeouu yntildew|u ||ntildeu ovww ntildewxy oxntildeyuw uvyy qvwy voumiddotplusmnsup1raquo xuntildeww wxntildeoz yuntildeo| ov|| wxntildex|| untildeuy ovwq ovqq ov|xmiddotplusmnsup1raquo wyntildeu| ooxntildeuw z|ntildezyu ov| z|ntildeyow yntildexuz ov|z ov| ov|zmiddotplusmnsup1raquo ontildequwntildexy o|ntildexx o|untildeuqz ovq zyntildexoq wntildeuuy ovoq ovu| ovxu

Ugraveplusmnsup1 yntildeouu |zntildeyu| oqntildexwz |vu uontildeuyq ountildewy vz qvzy qvoUgraveplusmnsup1 xuntildeww untildewxx ntildeqo voy uzntildeoy| yntildewo ovw| qvz qvw|Ugraveplusmnsup1 wyntildeu| xxntildewoq zntildewyu ovw xyntildewuu |yntildexqx ovxy qvzw qvwUgraveplusmnsup1 ontildequwntildexy yntildewxy |ntildexxw ovy yuntildewzx uyntildeoq ovuo qvz qvwo

Ugraveplusmnsup1raquo yntildeouu uqntildeq|u oqntildeyxq |vy uontildey| ountildez|z vz qvzy qvoUgraveplusmnsup1raquo xuntildeww uwntilde|u ntildeqz vo uzntildew yntildewx ovw| qvzw qvw|Ugraveplusmnsup1raquo wyntildeu| xyntildezy zntildez|y ovww xntildeo| |yntildeuwq ovx qvzw qvwUgraveplusmnsup1raquo ontildequwntildexy y|ntildeo| |ntildeyxq ovyw yxntilde|yw uyntildeqx ovuo qvz qvwo

~plusmn yntildeouu untildeyw wntildezu uvo uuntildeo wntildeuq| xvx qvzy ovq~plusmn xuntildeww xqntildeqyx qntilde|xq vuy xontildexwy untildezux vq qvz qvw~plusmn wyntildeu| xwntildeq wntildeux vqy yqntildeox |untildeu ov| qvzw qvw~plusmn ontildequwntildexy yyntildequq |yntilde|q| ovw ywntildeqx uuntildeuyy ovx| qvz qvw

~plusmnraquo yntildeouu zntilde|q zntildeqq ovqu zntildexq wntildeu|w ovo| qvzw ovq~plusmnraquo xuntildeww oyntildeu| qntilde|wy qvwo oyntildeoz untildezuy qvyx ovq qvw~plusmnraquo wyntildeu| |ntildewo wntildeuxu qvw ntildewz |untilde qvyy ovq qvw~plusmnraquo ontildequwntildexy zntildezy |yntildexo| qvw zntildezxq uuntildeu qvy ovqq qvw

plusmn yntildeouu yxntildewu wzntildeq| qv| ontildeyoz yntildezz qvwq |vq |v|qplusmn xuntildeww ountildeqoz oxntilde|o qvwo |zntildexu xxntilde|ux qv |vo vxplusmn wyntildeu| ow|ntildeuwy untildeyu qvw xntildeoyu untildezu| qvy |vo |vqqplusmn ontildequwntildexy u|ntildewo zxntildezqo qvw xntildezy zuntildeoy qvwq |vo |vo

plusmnraquo yntildeouu x|ontildeyx wyntildeqoz yvow o|ntildeqox owntilde|xu vu |vww uvyzplusmnraquo xuntildeww yozntildeyz| oz|ntildeuwu |vq oyntildeyw ywntildeyw v| |vwo vwplusmnraquo wyntildeu| yw|ntildewxu wwntildeooz v| owyntilde|q| oqxntildeyw ovy |vy vplusmnraquo ontildequwntildexy |yntildezo uqxntildeoyq ovw oontildexo ouontildezzx ovuz |vuw vwx

~plusmnreg yntildeouu uzu y| qvw uwo xzw qvwq ovq| ovqy~plusmnreg xuntildeww zq| ontildeoz qvy wo| ontildexuy qvx| ovoo qv~plusmnreg wyntildeu| ontilde|q ontildeyq qvw ontildeoxo ntildeoqz qvxx ovo| qvz~plusmnreg ontildequwntildexy ontildezqu ntilde|ux qvwo ontildeuxu ntildeyx qvxu ov|o qvww

~plusmn yntildeouu ountildewqx ountildez|x qvzz ontildeqq owntildeuw| ovou qvq qvwo~plusmn xuntildeww zntildeuy zntildewxz qvzw uqntildex|o uwntildexyw qvw| qv qvyo~plusmn wyntildeu| uuntildeq uuntildeo qvzz xyntildeyoq yzntildexzw qvwo qvw qvyu~plusmn ontildequwntildexy qntildeqqx qntildeu|q qvzz ntildeo zqntilde|ux qvwo qvzy qvw

sup2cedil~raquo yntildeouu x w qvz zo ux vq| qvw ov|sup2cedil~raquo xuntildeww ox o| qv| o|w o| qvyx qvzo qvwosup2cedil~raquo wyntildeu| oo |o qvu owz w| qvy qvzo qvwsup2cedil~raquo ontildequwntildexy ou wz qvu | |x qvyx qvz qvwo


sectWOcirceg

ow$XAcircsup3deg owsup3degIcircwsup3


JAEA-Testing 2011-005

    6.1.1 Construction and Nodebind ..... 44
    6.1.2 Communication method ..... 46
  6.2 Fundamental performance of MPI communication ..... 48
    6.2.1 Measuring method ..... 48
    6.2.2 Measured results ..... 49
  6.3 RDMA ..... 53
    6.3.1 Measuring method ..... 53
    6.3.2 Measured results ..... 55
  6.4 Intra-CPU, inter-CPU, and inter-node communications ..... 56
    6.4.1 Measuring method ..... 56
    6.4.2 Measured results ..... 57
  6.5 Communications between chassis ..... 58
    6.5.1 Point-to-point communication ..... 58
    6.5.2 Collective communication ..... 62
  6.6 Summary ..... 64
7 Concluding Remarks ..... 65
Acknowledgements ..... 65
References ..... 65
Appendix A HPCC benchmark results on FX1 ..... 66
Appendix B Fundamental performance of MPI communication ..... 67
Appendix C RDMA vs NORDMA ..... 69
Appendix D Overlapping communications with calculations ..... 70
Appendix E Peer-to-peer communications ..... 76
Appendix F Memory bandwidth ..... 79
Appendix G Flat-MPI vs Hybrid ..... 85


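1 Introduction

In March 2010 (Heisei 22), the Japan Atomic Energy Agency (JAEA) replaced its supercomputer system (peak performance 15.3 Tflops) with a new system built around a large-scale parallel Linux cluster (PRIMERGY BX900, peak performance 200 Tflops) and a next-generation computer prototype (FX1, peak performance 12 Tflops). The FX1 is intended for tuning programs for the next-generation supercomputer (the K computer). This report describes the fundamental performance of the PRIMERGY BX900 and the FX1 as measured with a set of benchmark tests.

Chapter 2 gives an overview of the system. Chapter 3 evaluates the system with the HPCC benchmark, and Chapters 4 to 6 report on memory, disk (I/O), and network (communication) performance. Chapter 7 summarizes the results.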

2 System overview

The system is built around a large-scale parallel computation unit and a next-generation prototype unit. In total it provides a peak performance of 214 Tflops, 56 TB of memory, and 1.2 PB of disk storage. Fig. 1 shows an overview of the system.

2.1 Large-scale parallel computation unit (hardware configuration)

The large-scale parallel computation unit is a PRIMERGY BX900 (hereafter BX900), a parallel Linux cluster with a peak performance of 200 Tflops and 50 TB of total memory.

The BX900 is a blade server that holds up to 18 server blades and up to 8 connection blades in a 10U chassis. Each server blade (node) has two Intel Xeon X5570 processors (quad-core) and 24 GB of memory (DDR3 SDRAM); some nodes have 12 GB or 48 GB. The peak performance per node is 93.76 Gflops (2.93 GHz × 4 floating-point operations per cycle × 4 cores × 2 processors), and the memory bandwidth per node is 51.2 GB/s. The nodes are connected by InfiniBand (IB) 4x QDR (hereafter QDR), which provides a nominal bandwidth of 8 GB/s and low-latency data transfer.
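The per-node peak quoted above is a plain product of clock rate, operations per cycle, cores, and processors. The following is a minimal sketch of that arithmetic, not part of the original report. The function names are hypothetical, and the total core count of 17,072 is the figure the report quotes for the full BX900 system in its HPL footnote; identical 8-core nodes are assumed.

```python
def node_peak_gflops(clock_ghz, flops_per_cycle, cores_per_cpu, cpus_per_node):
    """Theoretical peak of one node in Gflops."""
    return clock_ghz * flops_per_cycle * cores_per_cpu * cpus_per_node


def system_peak_tflops(node_gflops, total_cores, cores_per_node):
    """Theoretical peak of the whole system in Tflops, assuming identical nodes."""
    nodes = total_cores // cores_per_node
    return node_gflops * nodes / 1000.0


# BX900 node: 2.93 GHz x 4 flops/cycle x 4 cores x 2 processors
bx900_node = node_peak_gflops(2.93, 4, 4, 2)
# Full system: 17,072 cores in 8-core nodes (assumption: all nodes identical)
bx900_system = system_peak_tflops(bx900_node, 17072, 8)

if __name__ == "__main__":
    print(f"{bx900_node:.2f} Gflops per node, {bx900_system:.2f} Tflops in total")
```

Under these assumptions the system total comes out slightly above the rounded 200 Tflops used in the text.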

Fig. 1 System overview

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

2.2 Next-generation prototype unit (hardware configuration)

The next-generation prototype unit is an FX1 (hereafter FX1), a cluster system with a peak performance of 12 Tflops and 4.6 TB of total memory.

Each node has one Fujitsu SPARC64 VII processor (quad-core) and 16 GB of memory. A system controller (JSC) connecting the CPU and memory provides high memory bandwidth. The nodes are connected by InfiniBand 4x DDR (2 GB/s) in a network with full bisectional bandwidth (FBB).

The successor to the FX1 architecture is the K computer, which was scheduled to enter service in 2012. The FX1 is therefore used to tune programs for the K computer.

2.3 Shared disk storage (hardware configuration)

The shared disk storage consists of two I/O nodes (SPARC Enterprise M9000), 36 data disk units (ETERNUS DX80), and one ETERNUS4000 model 400 disk unit. The total disk capacity is 1.2 PB (RAID6), and the system as a whole delivers an I/O bandwidth of 57.6 GB/s (Fig. 2). The I/O nodes are attached to the InfiniBand network, and the compute nodes access the storage through Parallelnavi SRFS.

Each data disk unit contains SAS disks (1 TB, 7200 rpm) in a RAID6 configuration. The I/O nodes manage the disk units with Sun QFS (Fig. 3).

Fig. 2 Configuration and transfer rates of the shared disk storage: the compute nodes of the large-scale parallel computation unit (119 chassis) reach the two I/O nodes (M9000) over QDR InfiniBand (72 GB/s in total for the two nodes), and the I/O nodes reach the disk units (ETERNUS DX80 × 36) at 57.6 GB/s in total.

Fig. 3 Storage configuration: the two I/O nodes (M9000 #1 and #2) connect over Fibre Channel (FC) links to 36 ETERNUS DX80 units (1 TB disks; 57.6 GB/s over 144 FC links in total) and to one ETERNUS4000 Model 400 (3.2 GB/s over 8 FC links); the file systems are built with QFS.

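3 HPCC benchmark

We measured the BX900 and the FX1 with the seven tests of the HPCC (High Performance Computing Challenge) benchmark 1). This chapter gives an overview of HPCC and reports the measured results.

3.1 Overview of the HPCC benchmark

HPCC is a benchmark suite for supercomputers developed by a group led by J. Dongarra of the University of Tennessee. It exercises several aspects of a system (computation, memory access, and communication), so that the balance of a supercomputer system can be assessed. It comprises three tests of computational performance (HPL (High-Performance Linpack), DGEMM, and FFTE), two tests of memory performance (STREAM and Random Access), and two tests of communication performance (PTRANS and b_eff) 2).

The original test codes are written in Fortran or C, but HPCC packages them as a single program written in C. Since 2003, some 300 results have been registered for HPCC, which makes it possible to compare a system with others.

3.1.1 Tests

A supercomputer is a cluster of nodes connected by a network, and its performance can be measured at the level of a core, a CPU, a node, or the whole system. HPCC therefore measures in three modes: global (G: Global system performance), in which all processes cooperate across nodes; single (SN: Single Environment), in which a single process runs alone; and embarrassingly parallel (EP: Embarrassingly Parallel), in which all processes run the same test independently.

3.1.1.1 HPL

HPL measures floating-point performance by solving a dense system of linear equations with LU decomposition. Like FFTE, it represents a computational kernel used in many applications. It is the MPI-parallel version of Linpack and is the test used for the TOP500 ranking of supercomputers. The result is reported in Tflops. It is measured in one mode: global (G), across nodes.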

3.1.1.2 DGEMM

DGEMM measures the floating-point performance of double-precision matrix-matrix multiplication within a node. Like Linpack, it is dominated by arithmetic rather than by memory access, and the same kernel is widely used inside numerical libraries. The result is reported in Gflops (Giga floating-point operations per second). It is measured in two modes: single (SN) and embarrassingly parallel (EP).

3.1.1.3 STREAM

STREAM measures the memory bandwidth that can be sustained between the CPU and main memory within a node. It consists of four simple vector operations, Copy, Scale, Add, and Triad. The result is reported in GB/s (Giga Bytes per second). Eight values are measured: the four operations in each of the single (SN) and embarrassingly parallel (EP) modes.

3.1.1.4 PTRANS

PTRANS (Parallel matrix TRANSpose) measures the communication capacity of the network by transposing a matrix that is distributed over the parallel processes. The result is reported in GB/s. It is measured in one mode: global (G), across nodes.

3.1.1.5 Random Access

Random Access measures the rate at which randomly chosen entries of a large table in memory can be updated. Each update operates on 8 bytes of data. The test covers both random memory access within a node and, through MPI, random access between nodes. The result is reported in Gups (Giga updates per second). It is measured in three modes: single (SN), embarrassingly parallel (EP), and global (G).

3.1.1.6 FFTE

FFTE measures the floating-point performance of the fast Fourier transform. Like HPL, it represents a computational kernel used in many applications. It is measured in three modes: single (SN), embarrassingly parallel (EP), and global (G). The result is reported in Gflops.

3.1.1.7 b_eff

b_eff measures the latency and the bandwidth of communication. It is measured in the global (G) mode and consists of two tests, Ring and PingPong. Latency is measured by transferring 8-byte messages and bandwidth by transferring 2 MB messages.

Ring arranges the MPI processes in a ring and lets each process communicate with its neighbors. Two orderings are used: a natural ring, in which the processes are ordered by MPI rank, and a random ring, in which they are ordered at random. PingPong exchanges messages between two cores.

b_eff reports six values: (1) Ring (natural) latency, (2) Ring (natural) bandwidth, (3) Ring (random) latency, (4) Ring (random) bandwidth, (5) PingPong latency, and (6) PingPong bandwidth. Latency is reported in microseconds (usec) and bandwidth in GB/s.

3.1.2 Run rules

HPCC defines two kinds of runs: baseline runs (Baseline Runs) and optimized runs (Optimized Runs). The measurements in this report were made as baseline runs.

3.1.2.1 Baseline runs

In a baseline run, the benchmark source code is used without modification. Tuning is limited to compiler optimization and to the use of the BLAS and MPI libraries available on the system.

3.1.2.2 Optimized runs

In an optimized run, designated parts of the source code may be modified or replaced.
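The two ring orderings used by b_eff in Section 3.1.1.7 can be sketched as follows. This is an illustration only, not the HPCC source: the function names are hypothetical, and the random ordering here uses Python's `random` module with a fixed seed, which is an assumption made for reproducibility rather than the generator HPCC uses.

```python
import random


def ring_neighbors(order):
    """Map each rank to its (left, right) neighbor for ranks placed on a ring in `order`."""
    n = len(order)
    return {order[i]: (order[(i - 1) % n], order[(i + 1) % n]) for i in range(n)}


def natural_ring(nprocs):
    """Ring in MPI rank order: 0, 1, ..., nprocs - 1, then back to 0."""
    return ring_neighbors(list(range(nprocs)))


def random_ring(nprocs, seed=0):
    """Ring over a random permutation of the ranks (seed fixed for reproducibility)."""
    order = list(range(nprocs))
    random.Random(seed).shuffle(order)
    return ring_neighbors(order)


if __name__ == "__main__":
    print(natural_ring(4))
    print(random_ring(4))
```

In a natural ring, neighboring ranks often sit on the same node, so much of the traffic stays in shared memory. A random ring sends most messages across the interconnect, which is why the random-ring values are the ones usually quoted for network performance.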

3.2 Measured results

The HPCC BMT tests were run on the BX900 and on the FX1 with 8 to 512 parallel processes.

3.2.1 BX900

Table 1 shows the HPCC BMT results on the BX900, and Figs. 4 to 7 plot the results for HPL, PTRANS, Random Access, and FFTE. At 512 parallel processes, the HPL efficiency relative to the theoretical peak was 82.2% (= 4.902/5.961 × 100) (note 1).

Table 2 shows, for comparison, the results registered on the HPCC web site 1) for the Intel Endeavor cluster (hereafter Endeavor), a system with a similar hardware configuration. Table 3 shows the results obtained on the BX900 with the same Intel compiler as on Endeavor; they are comparable to those obtained with the Fujitsu compiler. STREAM, however, reaches only 3.5 GB/s on the BX900 against 4.2 GB/s on Endeavor, roughly 20% lower. This difference is examined in Section 4.2.2.

Note 1: On the full BX900 system (17,072 cores), HPL reached 191.4 Tflops against a theoretical peak of 200.0 Tflops, an efficiency of 95.66%.
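The efficiency quoted for HPL is the measured performance divided by the theoretical peak. A minimal sketch of that calculation follows. The function name is hypothetical, and the inputs assume that the figures in the text read 4.902 Tflops (measured) and 5.961 Tflops (peak) for the 512-process run.

```python
def efficiency_percent(measured, peak):
    """Measured performance as a percentage of the theoretical peak (same units)."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return 100.0 * measured / peak


# 512-process HPL run on the BX900 (values as read from the text, in Tflops)
hpl_512 = efficiency_percent(4.902, 5.961)

if __name__ == "__main__":
    print(f"HPL efficiency at 512 processes: {hpl_512:.1f}%")
```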

Table 1 HPCC BMT results on JAEA's BX900 with the Fujitsu compiler

Rows: 8, 16, 32, 64, 128, 256, and 512 parallel processes, each with the rdma and nordma communication methods.

Columns: G-HPL (TFlops), G-PTRANS (GB/s), G-RandomAccess original version (Gups), G-RandomAccess SANDIA_OPT2 version (Gups), G-FFTE (GFlops), EP-STREAM Sys (*) (GB/s), EP-STREAM Triad (GB/s), EP-DGEMM (GFlops), RandomRing Bandwidth (GB/s), RandomRing Latency (usec).

System: Fujitsu BX900 (512 cores, 128 CPUs); chip: Xeon X5570 2.93 GHz; memory: DDR3-1066; SMT on, Turbo Mode off; interconnect: QDR InfiniBand, 2 sockets per node, fat tree; OS: Red Hat EL 5; compiler: Fujitsu C/C++, options -Kfast -SSL2; MPI: Fujitsu Parallelnavi Language Package; HPCC Ver. 1.3.1.

(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × (number of parallel processes).

Fig. 4 HPL results on BX900

Fig. 5 PTRANS results on BX900

Fig. 6 Random Access results on BX900

Fig. 7 FFTE results on BX900

Table 2 HPCC results of the Intel Endeavor cluster (HPCC web site 1))

Table 3 HPCC BMT results on BX900 with the Intel compiler

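3.2.2 FX1

Table 4 shows the HPCC BMT results on the FX1, and Figs. 8 to 11 plot the results for HPL, PTRANS, Random Access, and FFTE. At 512 parallel processes, the HPL efficiency was 75.5% (= 3.776/5.0 × 100) (note 2).

For the FFTE and DGEMM entries of Table 4, see Appendix A.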

2 FX1 1200 129Tflops116Tflops 9037

Table 4 HPCC BMT results on FX1

Fig. 8 HPL results on FX1

Fig. 9 PTRANS results on FX1

Fig. 10 Random Access results on FX1

Fig. 11 FFTE results on FX1

33 HPCCYUgraveWH Uklq BX900W FX1WrsndWK(Altix3700Bx2)Wrs

Fig 12GqmhgtTWXpoundDqshy 100WCLRandom Ring LatencyPqRandom Ring Latency13BX900ordf`ccedil^fCmWordfcurrencq

BX900 W FX1 rsGWSTREAM W FFTE necircOatildeC atildeC eacuteIumlrvCDordfmacrdnDqSTREAM

421FFTE 43aringNGq XAltix3700Bx2 Random AccessordfEordfAltix3700Bx2W

plusmngtegraveBCD HPCC BMTeacuteYacute~cdOcirc$Y|5ordf+Xpoundsect8YrsGmW13XqAltix3700Bx2 atildeC FFTEEordfmdoacute OslashszligWuumlwdn

q

Fig. 12 Comparison of HPCC BMT results between systems (BX900, FX1, and Altix3700Bx2; 512 parallel processes, flat MPI), scaled to a maximum of 100. Items and the values labelled in the figure: G-HPL (4.90 Tflops), G-RandomAccess (1.54 Gups), G-PTRANS (112.0 GB/s), EP-STREAM (4.36 GB/s), EP-DGEMM (11.1 Gflops), G-FFTE (275.5 Gflops), RandomRing Bandwidth (0.359 GB/s), RandomRing Latency (5.18 usec).

4 BX900W FX1TUacuteUcirc 32 c^_Zgt BX900 W FX1 IcircIumlETHNtildeoumlccedilYWiCXordfWnDmWQR^_aZegraveBC(Xcurrenyen

5poundDqeumldfeaZUuml^13~brvbarTUacuteUcirc]Ucirc7^_ZTUacuteUcirc

EacuteDq 41 ^_aZ

BX900 W FX1 gtegraveBCD^_aZWQRUgraveyenXgtlaquosectoacute ^_ZOslash5^_a|Ocirc$divideC^_ZOslash5cedildivideOcirc$divideC

^_Z=)yen5PmWordf13q ^_aZyBGUgrave^_Z_IumlYacuteh)JGDHfe

a|YUuml^13~UKtl_trtGq[ _ZOslash5_IumlYacute

h fpcoll fIumlgt=GqTcedildivideCDeacuteaTHORN|Ocirc$ fprof fIumlgtdivideCaring`Gq^_aZgtPAElig 8q|gtlaquoq

^_a|Ocirc$divideAElig AElig ^_ AElig MPIZaZ|ordm)AElig f`AElig IcircIumlETHNtilde$AElig fZAElig UcircfIumlAElig

mPIcircIumlETHNtilde$AEliggtCPUlt=gtOaringaeligccedil13

hccedil`|WOcirc$FGszligbordfoacute WXq

42 feaZbrvbar|YTUacuteUcirc BX900HPCC BMT^_Z STREAMordfotildeb

gtlaquo EndeavorWrsC(35GBsacute42GBs)W 2ethMEshygtlaquopoundDmWETHNtilde5poundDqETHNtildeBDUcirc HPCCpSTREAMUuml|YacuteTHORN(CUcirc amp FortranUcirc)7ccedilIumlEumlIgraveUcircWMPIEumlIgraveUcircgtlaquoq[13BGbrvbarPmdegn BX900 feaZUuml^13~UKntst brvbarsectCDmWcdEndeavorfeaZ(IntelUgrave)W BX900feaZ(QRUgrave)igraveiacuteWuumlwdnq


421 Uuml|YacuteTHORN STREAM^_Zbrvbarszligagraveaacute BX900Uuml|YacuteTHORN STREAM^_Z Table 5Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntst brvbarordfmacrdnDqDotildeCUuml^13~gtOcirc$|sectumlGaringaeligccedil13hyBCXbrvbarPGDHoacute WX DO^SgtOcirc$sup1yBordfXmWordfdcXoumlEacuteegraveBGuecircgtlaquosectPgtXoumlordfUcirc

GordflaquoqUgraveDfeaZordfoacute WGDO ^SgtsectumlOcirc$ordf^^shypoundZYeumlnoumlEacutegtlaquoq CUcircoumlIntelfeaZXd OpenMPEumlIgraveUgraveUgravegt 42GBscoreordfdegdnDordfQRfeaZgt7ccedilIumlEumlIgraveUcirc OpenMP EumlIgravecd=EumlIgrave^regCbrvbarP]C Intel feaZWbWXpoundDqQRfeaZgtlaquo CD7ccedilIumlEumlIgraveUcircAgraveogravebrvbarpoundOcirc$copytimes]AgravebTUacuteUcircAgraveograveordflaquoWnq [ FX1Uuml|YacuteTHORN STREAM^_Z Table 6Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntstbrvbarordfmacrdnDqXTriad (132GBs)CPUQRordfWebgtUumlCshy(135GBs)CPUWotildeEacutegtlaquoq CUcircoumlordf FortranUcircUKntstUuml^13~XCWotildeiCqFX1 CfeaZ BX900W+XsectUKntstUuml^13~ordfcentsup3CXqDHFX1 CfeaZbrvbarTUacuteUcircordf$currengtlaquopoundD9ordfAumlq
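For reference, the Triad operation whose bandwidth is discussed above is the vector update a(i) = b(i) + q × c(i). The sketch below is not the STREAM source code; it is a plain Python illustration with hypothetical function names. The byte count follows the usual STREAM convention for Triad of three 8-byte words moved per element (two loads and one store).

```python
def triad(b, c, q):
    """STREAM Triad: a[i] = b[i] + q * c[i] for every element."""
    return [bi + q * ci for bi, ci in zip(b, c)]


def triad_bandwidth_gbs(n_elements, seconds, bytes_per_word=8):
    """Bandwidth in GB/s for one Triad pass, counting three words moved per element."""
    return 3 * bytes_per_word * n_elements / seconds / 1e9


if __name__ == "__main__":
    print(triad([1.0, 2.0], [3.0, 4.0], 2.0))
    # Hypothetical timing: 10 million elements processed in 0.06 s
    print(f"{triad_bandwidth_gbs(10_000_000, 0.06):.2f} GB/s")
```

Because each element needs only one multiply and one add against 24 bytes of memory traffic, the loop is limited by memory bandwidth rather than by arithmetic, which is why compiler choices about how stores are issued can change the result noticeably.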

Table 5 STREAM results on BX900

Table 6 STREAM results on FX1

422 HPCC BMT^_Z STREAMbrvbarszligagraveaacute 321gtdegdnD BX900 HPCC BMT^_Z STREAM

Table 7GqmmgtUKntstcopyIgraveOcirc$ecircEacutearingaeligccedil13heumlX storeampegraveBGmWUKrestp=all(a$XsectordfXmW)WCDTUacuteUcirc5PfeaZUuml^13~gtlaquoq UgraveDyacutegtgtlaquoordf _ZOslash53iumlIuml XC

iumlIumlSgt|Yo CPU]o CoreiDHEumlIgrave] Agravejatildedn~nfeaZUuml^13~brvbar^UcircCXq

STREAMbrvbarPX CoreIumliX2gtZX|Yordf5n DO^gtfeaZUuml^13~UKntst GmWgt|Y+13mWordfcurrencpoundDq

Table 7 STREAM results in the HPCC BMT (MPI-parallel version, C version)

                            Compiler options (1)             Compiler options (2)
Parallel  Communication  Chassis  EP-STREAM Triad (GB/s)  Chassis  EP-STREAM Triad (GB/s)
 32       rdma           1        3.4739                  -        -
 32       nordma         1        3.5016                  1        4.3316
128       rdma           3        3.4708                  -        -
128       nordma         3        3.4692                  1        4.3572
512       rdma           5        3.4726                  4        4.3595
512       nordma         4        3.5477                  4        4.3424

(1) Fujitsu compiler options: Kfast, packed, ntst (with SSL2)
(2) Fujitsu compiler options: Kfast, packed, ntst, restp=all (with SSL2)

43 hhbrvbar|YTUacuteUcirc 32 gtdcWXpoundDbrvbarPBX900 W FX1 FFTE ^_Z

acircIcircIumlETHNtilde13yacuteordfmacrdnDqmETHNtildeDH5pound

DhhW13gtszlig0Gq 13gt FFTE 32EumlIgraveOslash5CDWecircoacuteG ethBX900

gt 12FX1 gt 16Wgteumlqpoundordm)1|YEcirc2OgtlaquopoundD9ordfAumlqmgtFFTEUuml|YacuteTHORN(FortranUcirc)l3CUuml|YacuteTHORN(13aumlWoacuterGoumlUcircWdivideoslash)WaumlIacutecurren4C)JCDaumlacircoacuteGaumlZh5poundDiacuteegraveBC

hbrvbarpound^UcircCDordm)GmWbrvbarsectBX900 W FX1 oacuteG5g5poundDqUcircpaumlbrvbar5poundDh 1QRfeaZW Endeavor egraveBeumlnDafeaZWrs5PDHW 2auml13IacutecurrenTUacuteUcircbrvbar6)BbordfaumlIacutecurrenthorn7wXmWagraveaacuteGDHgtlaquoq UcircaumlIacutecurren 38plusmnUcircWmndaumlZhCDiacute

Ucircn~n Fig 13W Fig 14GWoIacutecurreniacuteEcircu0Gq Fig 13W Fig 14OgravegtCDIacutecurrenOO=5pound^gtlaquoqUuml|YacuteTHORN

gtcopyIgrave APXYZWWWWOcirc$Yordf 2ugt5nsectZgtXcpoundDDHiacutegt)9BcopyIgrave CY_W^curreneth5poundcopyIgrave APXYZWWWWYordfZXbrvbarPCDq)9BcopyIgrave CY_YZCXcpoundD=Ocirc$sectumlG^gtYordfZXpoundDoumlUcircordfecircDHgtlaquoq

Fig 13W Fig 14OacutegtCDIacutecurren3[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 3ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 3gtXC 6Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquosectaringaeligccedil13h-

a5brvbarpoundltG_ccedilY-a5^_ZUgrave)acircregmacr= 16 gteumlnq

Fig 13W Fig 14OcircgtCDIacutecurren2[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 2ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 2gtXC 4Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquoqFX1 gtBX900rufntildeDsectaringaeligccedilaringh-a5ordfgteumlmWcdmO=)t_ccedilY-a5 16cd 4^regGmWgtzyacuteordfmacrdnDq

Table 8BX900W FX1QRfeaZegraveBCDUcircWaumlbrvbarEuml BX900 afeaZegraveBCDaumlbrvbarGq

Fig. 13 Core computation parts (1) to (3) of the FFTE test (original version)

Fig. 14 Core computation parts (1) to (3) of the FFTE test (modified version)

Table 8 FFTE Ver. 4.1: full code and kernels (32 parallel processes, 180 MB per process)

UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq
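The Stockham algorithm mentioned in this section avoids the bit-reversal step of the usual in-place FFT by alternating between two buffers, so that the output emerges in natural order. The sketch below is a minimal radix-2 illustration in Python under that description; it is not the FFTE or NPB FT source, and the function names are hypothetical.

```python
import cmath


def stockham_fft(x):
    """Radix-2 Stockham autosort FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]
    b = [0j] * n
    half, stride = n // 2, 1
    while half >= 1:
        for j in range(half):
            # Twiddle factor for the current sub-transform length (2 * half)
            w = cmath.exp(-2j * cmath.pi * j / (2 * half))
            for k in range(stride):
                c0 = a[k + j * stride]
                c1 = a[k + j * stride + half * stride]
                b[k + 2 * j * stride] = c0 + c1
                b[k + 2 * j * stride + stride] = w * (c0 - c1)
        a, b = b, a          # swap buffers instead of reordering in place
        half //= 2
        stride *= 2
    return a


def naive_dft(x):
    """Direct O(n^2) DFT, used only to check the FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 0, -1, -2, -3]
    print(stockham_fft(data))
```

The innermost loop over `k` walks through memory with unit stride, which is the property that makes this formulation attractive on vector and cache-based machines; how well a compiler schedules that loop is the kind of effect the measurements in this section compare.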


|ucp|t^ro__tAacute^OOOsOwAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoSp]|trFRFY]p^top]qrsYtpYop]qrsJJDp_Q|^bbqpc^ta|rotpbctcb|urc`Q^uptatar^]|tbtarp|t]|tbccbwOa^qrR^boocbtoabccbw`Et^b|rtabtFhPU`Ar_pcrobqq^SJJDItpo]rc_pcJJDOtarbccbw|tur^^t^bq^ruwobqq^SJJDI^taEorttpMbUrttpUQOarcrUQ^tarbs^|bq|rp_U_pcbwo|ur|rtobqq`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY^]q^o^tpr^oq|r[qpubq`a[^oq|r[]^_`a[^trrc^OOO^O~OqOsO^rccOr^trrc__tuqpoO__tuqpo]b]bcbrtrcAacute__tuqpohdnO__tuqpo]bhddAtilde auml^_ccedilY-a5Kd]bcbrtrcAacute__tuqpohKdO__tuqpo]bhKXAtilde]bcbrtrcAacutehiOhmKPAtildep|uqrop]qrssOwO|p|uqr]cro^^prqOrqrOrq]^r^psAacute__tuqpo]bOAtildeOwAacute__tuqpo]bOAtildeO|AacuteAtildeobqqUHExE^tAacute^rccAtildesAacutevOvAtildehAacuteK`MMOM`MMAtildewAacutevOvAtildehAacuteK`MMOM`MMAtilde|AacutevAtildehAacuteK`MMOM`MMAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoHrc_pcprbc^btp_tartpoabJJD`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYobqq]^xubcc^rcAacute]^xopxpcqO^rccAtilderqh]^xt^rAacuteAtildep^hKOKMMM^hY^pqhKOOPobqq__tPAacute^OqOOO__tuqpoO__tuqpo]bO|OsOwAtilde^_Aacuteq`r`AtildeptpKdMobqq__tPAacute^OqfKOOO__tuqpoO__tuqpo]bO|OwOsAtilderpptpKXMoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoSp]wRtpQ`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYKdMopt^|rp~hKOp^hKO__tuqposAacute^O~AtildehwAacute^O~Atilde aumlIacutecurrenLfgO=PrprpKXMopt^|rrpobqq]^xubcc^rcAacute]^xopxpcqO^rccAtilderqrh]^xt^rAacuteAtilderq]hrqrYrqobqqUHExSpxcbAacuteUHExSCUUxTCBVLOrO^rccAtilde^_Aacuter`r`MAtildetarc^trAacutedOWAtilde[Gqb]rD^ruw|u`__tPAacutero`Atildev[Orq]r^_obqqUHExJ^bq^rAacute^rccAtildetp]

Fig 15 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- cfftz)

|ucp|t^r__tPAacute^OqOOOwOwKO|OsOwAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoHrc_pctarVYta^trcbt^pp_tarropbc^btp_tartpoabJJD`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY^]q^o^tpr^trrc^OOqOOOwOwKOKOq^Oq~OqO|O^O~O^KKO^KPO^PKO^PPp|uqrop]qrs|OsOwO|KOsKKOsPK^r^p|AacuteAtildeOsAacutewKOAtildeOwAacutewKOAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYort^^t^bq]bcbrtrc`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYKhePqhPWWAacuteqYKAtildeq^hPWWAacuteYqAtildeq~hPWq|hq^fKp^hMOq^YK^KKh^WqfK^KPh^KKfK^PKh^Wq~fK^PPh^PKfq^_Aacute^`r`KAtildetar|Kh|Aacute|f^Atilderqr|Khop~Aacute|Aacute|f^AtildeAtilder^_oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoDa^qpp]^rotpc^buqr`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYphMOqYKp~hKOwsKKhsAacute~O^KKfAtildesPKhsAacute~O^KPfAtildewAacute~O^PKfAtildehsKKfsPK aumlIacutecurrenLOO=PwAacute~O^PPfAtildeh|KWAacutesKKYsPKAtilderprprpcrt|cr

Fig 16 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- fftz2)

UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc

Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT

5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

1plusmn`gt readwriteplusmn(blockSizecurrentransferSizejsegmentCount)gtlaquosect`aecirceuml (blockSizej segmentCountjMPI EumlIgraveAring )WXqyenwtransferSize=1kBblockSize=1MiBL106 Bytes MB220 Bytes MiB WGPsegmentCount=1MPI EumlIgraveAring=32 oumlblockSizejsegmentCountjEumlIgraveAring=32MiB testFile ordf)dnoZY 1MiBmiddotAbrvbarntildeC1KB7fIuml 1024iexcl1313aeligUgraveDZordfpoundEacuteecircGq

[transferSize] 1plusmn readwritegtFGGeacutea`qsizeof(long long int)currencblockSizegtlaquoaeliguordflaquoq

[blockSize] o^_ordfbrvbarntildeGeacutea`qgt 1MiBq [segmentCount] o^_brvbarsectYeumlniOcirc$qyzshy 1q [randomOffset] 1XdZordfYqOcirc`shy 0q [fsync] ecircEacuteAring fsync13f5Gqyzshy 0q [filePerProc] ^_sectWcent`aYGqPOSIXEacutenotszligq

yzshy 0q [useFileView] MPI_File_set_view() MPI_File_write() MPI_File_read()egravePq

ZordfYWparaCXqyzshy 0q [collective] MPI_File_write_at_all() MPI_File_read_at_all()egravePqZordfY

WparaCXqyzshy 0q

Fig 18 IOR YacircueZ$
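The size arithmetic in the Fig 18 example (transferSize of 1 KiB, blockSize of 1 MiB, segmentCount of 1, 32 MPI processes) can be sketched as follows. This is an illustrative calculation only and is not part of IOR; the helper name `ior_sizes` and its result keys are hypothetical, and it assumes each process moves blockSize x segmentCount bytes in transferSize-sized calls.

```python
KIB = 1024
MIB = 1024 * KIB


def ior_sizes(transfer_size, block_size, segment_count, num_tasks):
    """Return sizes (in bytes) and call counts for an IOR-style run.

    Hypothetical helper for illustration; it does not call IOR itself.
    """
    if block_size % transfer_size != 0:
        raise ValueError("blockSize must be a multiple of transferSize")
    per_process_bytes = block_size * segment_count
    return {
        "aggregate_bytes": per_process_bytes * num_tasks,
        "per_process_bytes": per_process_bytes,
        "transfers_per_process": per_process_bytes // transfer_size,
    }


# Values taken from the example in the text.
example = ior_sizes(transfer_size=1 * KIB, block_size=1 * MIB,
                    segment_count=1, num_tasks=32)
print(example["aggregate_bytes"] // MIB, "MiB test file in total")
print(example["transfers_per_process"], "read/write calls per process")
```

With these inputs the sketch gives a 32 MiB test file and 1024 calls per process, matching the figures quoted in the example.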

512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

Table 11 MPIIO Info UumlYacuteNtildeY` key value

cb_buffer_size 4194304 $amp()

+

cb_nodes -012345

6789lt

=gt-012A

B

ind_rd_buffer_size 4194304 ACDEFGF$amp()+

ind_wr_buffer_size 524288 ACDH+GF$amp()+

52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

Fig 19 Read/Write (diagram labels: ETERNUS DX80, 970TB; IO servers SPARC Enterprise M9000; SRFS; READ, WRITE)

Table 12 (POSIX)

regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3

oqu yvq ovy| |uxvzw zvwu |zwvyy ouv|u oywvo ozuqv

quw zqzvuy uvox uu|v|u u|uvqo uxovz| uwvuo owovuq wqvxuqzy oqqvo| vy uxyvyw uuuvxx uyvyw xqzvu| |vw| |q|wvzqwoz o|uv|w ouvoy ozwv|o uyxvux uqvyo u|v| uovzu zuvxoy|wu ouoov|w uwvqu uuvxq uyvwz uuvyw uxovww uqvxw zovzw|yw ouxzv wzvwq uy|vx uyvz| uyuvwz uwxvoo |xvo wxvzyyxx|y ouwyvq| wv uxvo| uovyx uzvuu u|qv xvxy yvyxo|oq ouqwvu| uqvqq uxv|w uxvxy uuwvyo uyvoo qvyw zv|qyouu oxoyvu ywovoq uuvw ux|vy |uxvu| uxovoy oqvow uwvzwxuww oxovww oo|zv uyuv uvoq |w|vo u|wv|y ouqvoq uqwvooquwxy ouwvyq ovq uuyvo| uzvyu uquvq u|uvo q|zvx| q|vq

oqu ow|qvy ovu uqvyo vyu xx|vyq uwvwy z|zyv|y oxxqwvuwquw quzvoy |vq |w|vwz uyvyw xxwvzu xuvw o|xxxvy owv|xuqzy ouvu |vu uqov|q uxzvzo u|vzq uuwvu ooouv| |qxvxuwoz ozqxv|x o|vq |z|vuo uq|vqz xxyv|u xzzvy oo|vwo xqvzzoy|wu uvuw uqvz uovz u|vwo xwwvo y|vu| oxvz ozvoq|yw uuvw zuvy uvzz uozv yqzvyx xzyv|u ovo oxvwyxx|y owvy ouuv|u uwv|| uuwvx| x|ovx| yqyvux o|ozvuo oyoovyoo|oq |wzvy xyvyw uuwvz| uvqu x|xv xuvu ooyyvxo oxxvoyyouu u|wvq |zxvwx uvo uyzvu xo|vzq xuqvqw oyzu|v|o q|quvqxxuww uuv|o ozuxvuu uuvxx uzvzu xqvow xuwvzz oyxxxvyx owyovyooquwxy xwvy |qzv| uuzvxu uzvo| xyuvwo uwzvwx oxwwvxz oxyuyvy|

oqu oxyovoo vo yo|vyu v|w xwyvzz zv |xwovy xzzxxvyquw oxvz |vwz yquvxx y|ovx xywvyz xzyvyq x|xy|vzo w|ywzvwouqzy oyvu| zv xzwvx| yyvu xvoq xv yxy|vz zqy|vyowoz oy|xvxx uvy yozv|q yvyu xwv| xxvu| ywuv| zqqqvozoy|wu oxyzvuw uyvzz y|vyy yowvxq xwuvwx xwxvy ywzuv|q wqzvoo|yw oxoovuy wwv yqyvyq yuvyw xwov|x xzov|u ywozyvqy wyyyvywyxx|y oxwxvz ouyvxy yoyvuz yvoq xwxvwx xzqvww ywv| wxxqzvo|oq oyxwvw qvwx yovw yuvw| xxwvwo xwyvuq yquvzw woyvyouu oyw|v|z yzvx| yqvwu yxyvq| x|vx| xywvuq yywqvxw wqvq|xuww oxqvxu ouxqv|u yqvuq y|wvqx xyqvw xovz| yx|||vqq yuy|vuoquwxy oyywvqx uwvuy yo|vxz yuzvz yyxvq xwv|u yox|wvww yoovqx

oqu

|

xy

acutedegmicrodegpara

degmiddotmdegemdegkk~qgcedilsup1sup3

mdegkk~ocedilsup1sup3 cedilordm cedilsup3

regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3oqu |yvoy uvqw zqxvxx xvwq uzvx ovu y|zuvx xywy|vy|quw uxzvy yvuz zoyvxo vox xxv| |vqw zyuyovyo zqzuvuyuqzy uyvuz oovx zuovw wvzy zvqx |vzu o|qxxuvy oowzyyvyqwoz uoyvxx vy xyzvzq xxv|o zuovwx uyvyo ouyyqovxx o|woo|v||oy|wu xqvzy uuvy oquvou oqovq oqqovuw z|vy oxuwv| ouqxzyvwz|yw xoyvyw xwvqy oqqvo oz|vo oqzvyz o|yvu ouwwqvq o|wzuv|yxx|y yu|vq ooyvq o|yvzx qvu o|wvqq uwvy oozwwvyo oozqyvxo|oq youv|y ovoy oxu|v|w yqv ouv|y owzvoo wuo|vou w|qzvzwyouu yxvw uouvwo oywzvwo o|yxvqo oyyvqq oxzzvq xo||voy xoyvwuxuww wozvq yuv| oyuvwo ouxvo ox|wvq o|o|v| |qwqvxw |ouvyzoquwxy zyxv|o zqyvu| oyq|vx oxqwvy oyzvz ooxvu ozzvx owovyu

oqu |xvyy v|u ozuvy w|vyy |xvw |vxy uoyyzov|o |zqyyyvoquw |xvww uvxw owq|vx xvzx oyovu |vqy xw|zvzw xyzuzvzxuqzy |yzv|z zvqw qvuy v|x oyovww wqvyw wzyv|w yqu|vowwoz |wuvz owvu o||vu oqv xzyv ouuv|| yozqovu zzqvzoy|wu xqxvwy |yvx qovou uwvyx oz|zvxy ywv| xqqxvzq yq|qxvzw|yw xuv|u uv|z qywvyu zxvy qxovww |quvz yxwoqv yuquu|vy|yxx|y yzxvqq ouyv| xzwvwx xyvzq yvoq wqzvy uzyqzvu x|yzqzvw|o|oq yyqvx| zzvq |ov|x zzovw| w|ov|| ozvo x|o||xvuz |yqqvwyouu zzyvzq yq|vwz |zzzvz |qvy |wzuv zyvxq |xowu|v |uuoqovuuxuww ooxvu o|wvu |wyvww |zvu |ux|vo zw|vzo |xuwvy |uwovxoquwxy oz|yv| owzuv| |wxyvwu ||yvqu |yzuvzx |yyovq o|uwvo o|uywvo

oqu |yvq vq |qyvqq ozwv|u |zvwq qzvww o|wuzvyo ooq|xxvzyquw |yvzu uv ooqvyu ozzv|w uxxv ozqv| owzzwvyy oxxuq|vouuqzy |yyvx| wvxo yvqq ozwvqy |zwvzq ozwv zzvx qyqqovoowoz |xv| oyvyz ozxovou |wwvw uxqvq| uquvq| |ozqxv|| qwuzzzvw|oy|wu uyqvzo ||v|| |wvyy wquvw wxuvox zvuo |oxy|vw q|qyvoq|yw ux|vuw yvw |xov|u |uvz uuvyu |wqvy o|xxovzu q|wvwzyxx|y yzvwo o|xvu |uqovuy qvw |yvy wwzvx o|wuyov|y owxuqzyvowo|oq xzzvwz yzv|y |q|vxq ovzu uqwvuq oqqvz oyuyyxuv|u oyouoz|vuoyouu wxyvq xuuv|u uoyxvo |yywvo uzvw| uouvu ooq|ozvo oqyq|vqxuww oqqvqu oowvoz uoy|vw |yvx uwuvuw uux|vxx woyyqzvow zqzvqwoquwxy owquvu owov|u uouvx |zxyv uvw uxoqvqu uy|yozvuw uuqqvuo

xy

oqu

|

cedilsup1sup3plusmnsup2

microdegpara cedilsup1sup3 cedilordm cedilsup3degmiddotmdegemdegkk~qg mdegkk~o

Table 13 (MPIIO)

plusmnsup2sup3regmacrdegplusmnplusmnsup2sup3

regRraquodegordm~~degraquo

oqu

qv

qvz

ovqq

ov

|vq

vwy

quw

vyu

ovq

ovzz

vx

yvqq

yvox

uqzy

xvq

|vx

|vzq

xvo

oovw

oovxu

woz

oqvz

wvqu

vy

oqvy

|vuz

vy

oy|wu

ozvwy

oyvwq

oyvy

ozvz|

uxvw

uxvz

|yw

|yvzx

|vyu

wvy

|xvxo

vq

xv|

yxx|y

yvy

xyvoz

xqvqq

xuv||

oyvqo

oovz

o|oq

zyv|o

wqvu

zovu

zvx

owwvw

oyvzx

youuoxvoq

oqyvx

ouovzu

oqov

x|vqw

uvux

xuwwuyuvoo

|wvuo

uwvxu

oyovqq

wuvqq

||zv|y

oquwxyyqxvu

uwyvux

uwxvzx

ozuvqy

|uwv|

|uuvx

oqu

ovou

qv|o

qvy|

qvyx

|vy

|voq

quw

v|

qvx|

ovq

ovw

yvu

yvz

uqzy

uvxy

ovu

vqw

vyw

ovu

ovou

woz

zv|w

vyu

xv

xvoq

xv

uv|

oy|wu

ov|z

uvyz

oqvou

zvzw

uzvyo

uwvwz

|yw

||vz

zv|w

uv|z

qvy

wvwq

zovy

yxx|y

xvxw

v

uyv

|wvy|

oxqvz

ouzv|q

o|oq

w|vuu

u|vz

wyvx

ywvzz

y|vw

yvuz

youuooyvu|

yovw

oyqvoqvxo

||xvyo

|vx

xuwwwwvww

o|ovu

|uvqzyvuy

|yxv|

uqvw

oquwxy|xvo|

oovux

qovwu|qyvqz

uqwvo

x|v||

|

xy

acutedeg

microdegpara

degmiddotmdegRemdegkk~qg

cedilsup1sup3

cedilordm

regmacrdegplusmn

plusmnsup2sup3regmacrdegplusmnplusmnsup2sup3

regRraquodegordm~~degraquo

oqu

ovou

ovoo

ovq

ovo

vo

vwx

quw

vx

vow

vo

vxu

uvxo

yvo

uqzy

xvoq

uvou

uv|

xvqu

zvo

oovy

woz

oqv

wvx

wvy

oqvo

ozvwo

vo

oy|wu

qvoq

ovo

ovux

qvq|

|yvyq

uyvou

|yw

uqvyo

|yvz

|uvy|

|wvu|

wovo|

wyvzu

yxx|y

wqvox

|v|y

yyvww

wvy

ouovz

ouwvqq

o|oqoxyv|w

o|vuq

oxvyo

oxovz

wvw

xwv||

youuwovuu

xyvox

u|vzq

yzvxw

xoovux

x|xvwu

xuww|xvo

uzvx

u|vyz

xqvozoquqvw

zuyv

oquwxyzv|y

yovz

wqvou

qyvuqooo|vuoqovu|

oqu

ovo|

qvxy

qvzy

ovo

voz

|v

quw

vz

qvzu

ovzq

vy

uv|u

yvx

uqzy

uvy|

vqx

|v

uvuz

wvwq

ovxo

woz

zv|

|vx

vwq

wvwu

ovyy

yvqu

oy|wu

owvzu

vu

oyvox

owvoq

|xvxo

xuvwx

|yw

|vy|

ouv

|ov|u

|xvx

qvq

oquvwq

yxx|y

xvq|

|vu

uxvx|

yvwo

oxuv||

owov|w

o|oqoxqvyw

v|

wyvq

ovux

|qvo

|vu

youuz|vwo

ouzvwq

o|vy

wuvqw

yovy

yywvz

xuwwxxvyu

|qv

uy|voy

xoqvwoxqvwwo|yvzw

oquwxywxwvxq

xov

wqvx

zqwvu|uovxzoqvu

xy

degmiddotmdegRemdegkk~qg

cedilsup1sup3

cedilordm

regmacrdegplusmn

plusmnsup2

microdegpara

|

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

Fig 20 POSIX (x axis: TransferSize [B], 1024 to 1048576; y axis: Performance [MiB/s]; panels: sequential and random access, Write and Read, SingleFile and FilePerProc, for home and work; curves for N=32, 256, 1024)

Fig 21 MPIIO (x axis: TransferSize [B], 1024 to 1048576; y axis: Performance [MiB/s]; panels: sequential and random access for home and work; curves: use view, collective, N=32, 256)

Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`
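The effect of the runtime buffer size discussed in this section can be illustrated with a small sketch. It is not the C `setvbuf` call or the Fortran runtime option named above: Python's `io.BufferedWriter` stands in for the language runtime buffer, and the class `CountingSink` and function `raw_writes` are hypothetical names introduced for the illustration. The sketch writes 1024 records of 1 KiB each and counts how many writes reach the underlying stream for an 8192-byte buffer and for an 8 MiB buffer.

```python
import io


class CountingSink(io.RawIOBase):
    """Raw stream that discards data and counts write() calls (illustrative)."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        return len(data)  # pretend everything was written


def raw_writes(buffer_size, record_size=1024, records=1024):
    """Count writes reaching the raw stream for a given buffer size."""
    sink = CountingSink()
    writer = io.BufferedWriter(sink, buffer_size=buffer_size)
    record = b"\0" * record_size
    for _ in range(records):
        writer.write(record)
    writer.flush()
    return sink.calls


small = raw_writes(8192)             # buffer of 8192 bytes
large = raw_writes(8 * 1024 * 1024)  # buffer of 8 MiB
print("raw writes with 8192-byte buffer:", small)
print("raw writes with 8 MiB buffer:", large)
```

The larger buffer passes the same data to the underlying stream in far fewer calls, which is the mechanism by which enlarging the buffer helps small-record I/O.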

Fig 22 File system layout (diagram labels: ETERNUS DX80 1-18 (dx01 to dx18) with IO server 1 (io1), carrying the SRFS file systems data_01 to data_05, home and work; ETERNUS DX80 19-36 (dx19 to dx36) with IO server 2 (io2), carrying data_06 to data_10, data, sysdata and work)

Table 15 File systems

SRFS name   QFS name      Size         Total capacity               DAU     Remarks
data_01     QFS_data_01   2109888 MB   2109888 x 36 = 72.4 TB       1024k
data_02     QFS_data_02   2109888 MB   2109888 x 36 = 72.4 TB       1024k
data_03     QFS_data_03   2109888 MB   2109888 x 36 = 72.4 TB       1024k
data_04     QFS_data_04   2109888 MB   2109888 x 36 = 72.4 TB       1024k
data_05     QFS_data_05   3695488 MB   3695488 x 36 = 126.9 TB      1024k
home        QFS_home      3695488 MB   3695488 x 36 = 126.9 TB      512k
work                      524288 MB                                         io2

SRFS name   QFS name      Size         Total capacity               DAU     Remarks
data_06     QFS_data_06   2109888 MB   2109888 x 36 = 72.4 TB       1024k
data_07     QFS_data_07   2109888 MB   2109888 x 36 = 72.4 TB       1024k
data_08     QFS_data_08   2109888 MB   2109888 x 36 = 72.4 TB       1024k
data_09     QFS_data_09   2109888 MB   2109888 x 36 = 72.4 TB       1024k
data_10     QFS_data_10   3695488 MB   3695488 x 36 = 126.9 TB      1024k
data        QFS_data      3695488 MB   3695488 x 24 = 84.6 TB       1024k
sysdata     QFS_sysdata   3695488 MB   3695488 x 12 = 42.3 TB       16k
work        QFS_work      524288 MB    524288 x 144 = 72 TB         1024k   dx01 to dx36
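The products in the total-capacity column of Table 15 can be checked with a few lines of arithmetic. The sketch below is illustrative only; it assumes 1 TB = 1024 x 1024 MB, a unit convention the report does not state, and the helper name `total_tb` is hypothetical.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units (1 TB = 1024 * 1024 MB)


def total_tb(size_mb, count):
    """Total capacity in TB of `count` volumes of `size_mb` megabytes each."""
    return size_mb * count / MB_PER_TB


# (file system, size in MB, multiplier) as listed in Table 15
rows = [
    ("data_01", 2109888, 36),
    ("data_05", 3695488, 36),
    ("data", 3695488, 24),
    ("sysdata", 3695488, 12),
    ("work", 524288, 144),
]

for name, size_mb, count in rows:
    print(f"{name}: {size_mb} MB x {count} = {total_tb(size_mb, count):.1f} TB")
```

Under that unit assumption the products come to about 72.4, 126.9, 84.6, 42.3 and 72.0 TB.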

6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 InfiniBand QDRegraveBCD Fat Treeicircgtlaquoq113aelig13S 18iumlIumlAEligicircgtlaquoordf113aelig13 LeafWdividenaccedil7Iumlordf 2Igravecenttimeseumln1accedil7IumlcdoiumlIuml QDR ordf 1 7gtYZeumlnqaccedil7Iuml 2 Igravelaquogt1 iumlIuml 2 QDR ordfYZeumln(QDR Iacutegt 18 j2Igravegt 36WX)qoiumlIuml`ZaringbrvbarsectT 8GBs (4GBsj2) AAEligpoundq 13aelig13 5PDHLeafaccedilyacute7 SpineWdividenaccedilordf

9iexcllaquosecto LeafaccedilGu Spineaccedil QDRgtYZeumlnqbrvbarpound 113aelig13cd SpineIacutegt 9j2Igrave=18 QDRordfegraveBeumlnq QDRordf 5PoumlLeaf accedil3xmacrDownlink ordf 36 Uplink ordf 18 Xgt Uplinkshy 50WXsectmn 50_ccedilaringWdividenotgtqBX900gta$fY`Ecircfrac34h Fig 23GqBX900gt 50_ccedilaring13aelig13ocircotilde gtthornordf]GqSpineLeafaccedilW13aelig13SoiumlIumlYZGEcircfrac34h Fig 24Gqmmgt13aelig13S nAumliumlIuml nAuml(niquest9)UgraveD nU9Auml(nAgrave9) SpineaccedilegraveBGqlaquo13aelig13laquoiumlIumlordfcent13aelig13iumlIumlcdOcirc$ltAacuteoumlSpineaccedilcd Leafaccedil(2Igrave) QDR2(1j2Igrave)ordfegraveBeumln(2Igrave)LeafaccedilcdiumlIumlQDR2gtYZeumlnqDotildeCmSpineaccedilcd Leaf accedilordmAcirc13aelig13SPAtildeWiumlIumlWnoteumlnqmDHyenwiumlIumlTEacuteOcirc$lt GoumlWiumlIuml 1 W 10 ordflt GoumlWgtn~n FAringordf+XqX13aelig13Sotilde gtiuml_ccedilaring

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq

(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

Fig 23 BX900 network (diagram labels: chassis 1 to chassis 119; Leaf(1-1), Leaf(1-2) to Leaf(119-1), Leaf(119-2); Spine(1), Spine(2) to Spine(9); nodes bx0001, bx0002 to bx0018, and bx2125, bx2126)

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq

Fig 24 Nodes and switches (diagram labels: switches (Spine) SW 1 to SW 9; switch (Leaf) SW; chassis nodes 1 to 18, 18 nodes per chassis)
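The link counts quoted in section 6.1.1 (18 nodes and 2 Leaf switches per chassis, 9 Spine switches, 36 downlinks against 18 uplinks, 8 GB/s per node) follow from simple arithmetic. The sketch below reproduces them; the constant and function names are hypothetical, and it assumes one QDR link from each node to each Leaf switch and one QDR link from each Leaf switch to each Spine switch.

```python
NODES_PER_CHASSIS = 18   # nodes in one chassis
LEAF_PER_CHASSIS = 2     # Leaf switches in one chassis
SPINE_SWITCHES = 9       # Spine switches in the fat tree
QDR_GBYTES_PER_S = 4     # bandwidth of one QDR link as quoted in the text


def chassis_links():
    """Per-chassis link counts under the assumptions stated above."""
    downlinks = NODES_PER_CHASSIS * LEAF_PER_CHASSIS  # node-to-Leaf links
    uplinks = SPINE_SWITCHES * LEAF_PER_CHASSIS       # Leaf-to-Spine links
    return {
        "downlinks": downlinks,
        "uplinks": uplinks,
        "uplink_ratio": uplinks / downlinks,
        "node_bandwidth_gbytes_per_s": QDR_GBYTES_PER_S * LEAF_PER_CHASSIS,
    }


links = chassis_links()
print(links)
```

The uplink ratio of 0.5 corresponds to the 50% figure given for the BX900 fat tree.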

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq

XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the communication method

Mode        RDMA          RDMA           NORDMA
Processes   1 to 1024     1025 or more   -
Threshold   32768 bytes   16384 bytes    524288 bytes

Table 17 RDMA mode: switching between Normal Send and Send on Request

Processes      Message size 0 to under 16KB   16KB to under 32KB   32KB or more
1 to 1024      Normal Send                    Normal Send          Send on Request
1025 to 2048   Normal Send                    Send on Request      Send on Request
2049 or more   NORDMA is used                 Send on Request      Send on Request

Table 18 NORDMA mode: switching between Normal Send and Send on Request

Processes       Message size 0 to under 512KB   512KB or more
All processes   Normal Send                     Send on Request

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq
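The default switching rules in Tables 16 to 18 can be written as a small decision function. This is a sketch of one reading of those tables, not the MPI library's implementation: the function name `select_protocol` is hypothetical, 1 KB is taken as 1024 bytes, the row for 2049 or more processes is read as falling back to NORDMA for messages under 16 KB, and any change to the threshold through MP_MAX_NORMAL_SEND is ignored.

```python
KB = 1024  # assumption: the table sizes use 1 KB = 1024 bytes


def select_protocol(mode, num_procs, message_bytes):
    """Return (transport, protocol) under the default thresholds.

    mode is "rdma" or "nordma". Hypothetical helper for illustration only.
    """
    if mode not in ("rdma", "nordma"):
        raise ValueError("mode must be 'rdma' or 'nordma'")
    # Table 17, last row (as read here): small messages use NORDMA.
    if mode == "rdma" and num_procs >= 2049 and message_bytes < 16 * KB:
        mode = "nordma"
    if mode == "nordma":
        threshold = 512 * KB      # Table 16: 524288 bytes
    elif num_procs <= 1024:
        threshold = 32 * KB       # Table 16: 32768 bytes
    else:
        threshold = 16 * KB       # Table 16: 16384 bytes
    protocol = "Normal Send" if message_bytes < threshold else "Send on Request"
    return mode.upper(), protocol


print(select_protocol("rdma", 512, 20 * KB))
print(select_protocol("rdma", 2048, 20 * KB))
print(select_protocol("rdma", 4096, 8 * KB))
print(select_protocol("nordma", 512, 512 * KB))
```

Keeping the thresholds in one place makes it easy to see that the same 20 KB message is sent with Normal Send on 512 processes but with Send on Request on 2048 processes.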

62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Data size increment

Processes   Allreduce, Reduce, Bcast, SendRecv   Other MPI functions
256         32 bytes                             128 bytes
512         same as above                        128 bytes
1024        same as above                        256 bytes
2048        same as above                        2048 bytes
4096        same as above                        2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()        ! measurement start
        call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                     b,nelem,MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD,ierr)
        ttime2 = MPI_WTIME()        ! measurement end
        ttime0 = ttime2 - ttime1
        call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                  MPI_COMM_WORLD,ierr)
        if(myid.eq.0) then
          max_time = max(max_time,ttime)
          min_time = min(min_time,ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig 25 ^_ZLIacutecurrenP
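The bookkeeping in the Fig 25 loop (take the slowest rank in each iteration, then accumulate the maximum, minimum and sum over iterations) can be mimicked without MPI. The sketch below is illustrative only: the function name `summarize` and the sample numbers are made up, and the `skip_first` option reflects one reading of section 6.2.1, namely that the first of the 20 repetitions is left out of the average; that reading is an assumption.

```python
def summarize(per_iteration_rank_times, skip_first=True):
    """Summarize timings given as a list (iterations) of lists (ranks).

    Hypothetical helper; times are elapsed seconds per rank.
    """
    # Equivalent of MPI_REDUCE with MPI_MAX: slowest rank per iteration.
    slowest = [max(rank_times) for rank_times in per_iteration_rank_times]
    if skip_first:
        slowest = slowest[1:]  # assumption: the first repetition is discarded
    if not slowest:
        raise ValueError("no iterations left to summarize")
    return {
        "max_time": max(slowest),
        "min_time": min(slowest),
        "mean_time": sum(slowest) / len(slowest),
    }


# Made-up sample: 3 iterations on 2 ranks.
sample = [[5.0, 4.0], [1.0, 2.0], [3.0, 1.0]]
stats = summarize(sample)
print(stats)
```

Taking the per-iteration maximum over ranks, as the Fortran loop does, means the reported time is that of the last process to finish the collective.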


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

Fig 26 MPI functions (512 processes, part 1) (panels: Allreduce, Reduce, Reduce_scatter, Allgather, Gather, Allgatherv; x axis: data size, 0 to 80000; series: rdma, nordma)

Fig. 27 MPI functions (512 processes, part 2). Panels: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank 0 to Rank 511). Each panel plots the rdma and nordma settings.

brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq

Fig. 28 RDMA and NORDMA, data sizes 0 KB to 8 KB. Columns: 256, 512, 1024, 2048 and 4096 processes.

63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq

      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()                        ! measurement start
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,1,
     &               MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,1,
     &               MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,1,
     &               MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,1,
     &               MPI_COMM_WORLD,ireq(4),ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()                        ! measurement start
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1.d0
          b1(i,j) = 1.d0
          c1(i,j) = 0.d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                        ! measurement end
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()                        ! measurement end
      ttimeDD = ttime8 - ttime7   ! time spent in the matrix computation
      ttimeal = ttime6 - ttime5   ! total elapsed time
      ttimesd = ttimeal - ttimeDD ! elapsed time not covered by computation

Fig 29 WO=5PDH^_Z
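The three derived times at the end of the Fig. 29 listing follow directly from the four timestamps. A minimal sketch of that arithmetic, with made-up timestamp values and a hypothetical helper name `overlap_times`:

```python
def overlap_times(t5, t6, t7, t8):
    """Reproduce the time bookkeeping of the Fig. 29 listing.

    t5: before the non-blocking sends/receives are posted
    t7, t8: start and end of the matrix computation
    t6: after MPI_WAITALL returns
    """
    compute = t8 - t7          # ttimeDD in the listing
    total = t6 - t5            # ttimeal in the listing
    exposed = total - compute  # ttimesd: elapsed time outside the computation
    return compute, total, exposed


# Illustrative timestamps in seconds (not measured values).
compute, total, exposed = overlap_times(t5=0.0, t6=2.75, t7=0.25, t8=2.25)
print(compute, total, exposed)
```

If the transfer proceeds fully in the background while the matrix product runs, `exposed` stays close to the cost of posting and completing the requests; if it does not, the transfer time shows up in `exposed`.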

632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq

64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()
        do 120 II = 1, NNN
          if(myid.eq.0)
     &      call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                    IIRANK,II,MPI_COMM_WORLD,ierr)
          if(myid.eq.IIRANK)
     &      call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                    0,II,MPI_COMM_WORLD,istatus,ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig 30 ToacuteT ^_ZLIacutecurrenP
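A transfer rate can be derived from each `timew(IC)` entry recorded by the Fig. 30 loop. The sketch below assumes double-precision elements of 8 bytes and 1 MB = 10^6 bytes; neither the unit convention nor the helper name `bandwidth_mb_per_s` comes from the report, and the sample numbers are illustrative.

```python
def bandwidth_mb_per_s(nsize, repetitions, elapsed_s, bytes_per_element=8):
    """Transfer rate for `repetitions` one-way sends of `nsize` elements.

    Assumes 8-byte elements (MPI_DOUBLE_PRECISION) and MB = 1e6 bytes.
    """
    total_bytes = nsize * bytes_per_element * repetitions
    return total_bytes / elapsed_s / 1.0e6


# Example: 16 Mi doubles (128 MiB) sent 100 times in 4 seconds (made-up time).
rate = bandwidth_mb_per_s(nsize=16 * 1024 * 1024, repetitions=100, elapsed_s=4.0)
print(f"{rate:.4f} MB/s")
```

Because the loop times only the sender-to-receiver direction between a barrier and the last receive, the figure is a one-way rate for the pair (rank 0, rank `IIRANK`).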

NODE

  CPU0                      CPU1
  Core ID 0 (Rank 0)        Core ID 1 (Rank 4)
  Core ID 2 (Rank 1)        Core ID 3 (Rank 5)
  Core ID 4 (Rank 2)        Core ID 5 (Rank 6)
  Core ID 6 (Rank 3)        Core ID 7 (Rank 7)

Fig. 31 MPI rank assignment to cores
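The placement in Fig. 31 determines whether two ranks talk within one CPU or across the two CPUs of a node. A small sketch of that lookup follows. The dictionary for Open MPI assumes the identity placement (rank n on core ID n) that the text appears to describe, and the rule that even core IDs sit on CPU0 and odd ones on CPU1 is read off Fig. 31; the helper names are hypothetical.

```python
# Rank -> core ID as drawn in Fig. 31.
FIG31_RANK_TO_CORE = {0: 0, 1: 2, 2: 4, 3: 6, 4: 1, 5: 3, 6: 5, 7: 7}

# Assumed identity placement for Open MPI (rank n on core ID n).
OPENMPI_RANK_TO_CORE = {rank: rank for rank in range(8)}


def cpu_of_core(core_id):
    """Even core IDs belong to CPU0, odd core IDs to CPU1 (per Fig. 31)."""
    return core_id % 2


def same_cpu(rank_a, rank_b, rank_to_core):
    """True when both ranks run on cores of the same CPU."""
    return cpu_of_core(rank_to_core[rank_a]) == cpu_of_core(rank_to_core[rank_b])


print(same_cpu(0, 3, FIG31_RANK_TO_CORE))    # ranks 0 and 3 under Fig. 31
print(same_cpu(0, 4, FIG31_RANK_TO_CORE))    # ranks 0 and 4 under Fig. 31
print(same_cpu(0, 1, OPENMPI_RANK_TO_CORE))  # ranks 0 and 1 under identity placement
```

Under the Fig. 31 placement ranks 0 to 3 share CPU0, so a rank 0 to rank 3 exchange stays inside one CPU; under an identity placement neighbouring ranks alternate between the two CPUs.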

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq

(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq

6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

Fig. 32 One-to-one communication: data transfer pattern among processes 0 to 7

Table 20 Within one chassis

Execution type             Flat MPI   Hybrid (72x2)   Hybrid (36x4)   Hybrid (18x8)
Per-process rate (MB/s)    7238       14474           28942           53467
Per-node total (MB/s)      57906      57899           57884           53467
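The two rows of Table 20 are linked by the number of MPI processes on each node. The check below assumes 18 nodes per chassis (144 cores at 8 cores per node), so that 144, 72, 36 and 18 processes place 8, 4, 2 and 1 process on a node; that mapping is an inference from the process counts, not a statement taken from the table.

```python
# Values copied from Table 20 (MB/s).
PER_PROCESS = {"flat": 7238, "72x2": 14474, "36x4": 28942, "18x8": 53467}
NODE_TOTAL = {"flat": 57906, "72x2": 57899, "36x4": 57884, "18x8": 53467}

# Assumed processes per node: total processes / 18 nodes in the chassis.
PROCS_PER_NODE = {"flat": 144 // 18, "72x2": 72 // 18, "36x4": 36 // 18, "18x8": 18 // 18}

relative_error = {}
for case, rate in PER_PROCESS.items():
    estimate = rate * PROCS_PER_NODE[case]
    relative_error[case] = abs(estimate - NODE_TOTAL[case]) / NODE_TOTAL[case]
    print(f"{case}: {estimate} MB/s estimated, {NODE_TOTAL[case]} MB/s in table")
```

The per-node total stays near 57.9 GB/s for 8, 4 and 2 processes per node and drops to about 53.5 GB/s with a single 8-thread process, which matches the pattern in the table.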

Table 21 Across chassis, flat MPI (per-process transfer rate in MB/s; IB link shared with another node)

        144 MPI             192 MPI             224 MPI
Node    MB/s    Shared      MB/s    Shared      MB/s    Shared
 1      7246    -           4670    yes         4671    yes
 2      7244    -           4670    yes         4671    yes
 3      7246    -           4670    yes         4671    yes
 4      7242    -           7243    -           4670    yes
 5      7243    -           7244    -           4671    yes
 6      7242    -           7240    -           7237    -
 7      7241    -           7239    -           7246    -
 8      7244    -           7240    -           7245    -
 9      7249    -           7243    -           7240    -
10      -       -           4670    yes         4670    yes
11      -       -           4670    yes         4672    yes
12      -       -           4669    yes         4671    yes
13      -       -           -       -           4671    yes
14      -       -           -       -           4670    yes

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq

Table 22 Across chassis, hybrid (per-process transfer rate in MB/s; IB link shared with another node)

        Hybrid (112x2)      Hybrid (56x4)       Hybrid (28x8)
Node    MB/s    Shared      MB/s    Shared      MB/s    Shared
 1      9339    yes         18567   yes         36408   yes
 2      9337    yes         18564   yes         36919   yes
 3      9346    yes         18566   yes         36536   yes
 4      9347    yes         18570   yes         36457   yes
 5      9342    yes         18651   yes         36646   yes
 6      14385   -           28606   -           53808   -
 7      14488   -           29118   -           53779   -
 8      14500   -           28676   -           53789   -
 9      14495   -           29100   -           53784   -
10      9337    yes         18797   yes         36408   yes
11      9340    yes         18775   yes         36715   yes
12      9343    yes         18651   yes         36559   yes
13      9348    yes         18811   yes         36457   yes
14      9338    yes         18694   yes         36991   yes
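The rates in Table 22 fall into two groups: nodes 6 to 9 and the remaining ten nodes, which carry a different marker in the last column. The sketch below only compares the group means; reading the marker as "this node shares an InfiniBand link" is an interpretation, so the code uses the neutral names `marked` and `unmarked`.

```python
# Per-process transfer rates from Table 22 (MB/s), nodes 1..14 in order.
TABLE22 = {
    "112x2": [9339, 9337, 9346, 9347, 9342, 14385, 14488, 14500, 14495,
              9337, 9340, 9343, 9348, 9338],
    "56x4": [18567, 18564, 18566, 18570, 18651, 28606, 29118, 28676, 29100,
             18797, 18775, 18651, 18811, 18694],
    "28x8": [36408, 36919, 36536, 36457, 36646, 53808, 53779, 53789, 53784,
             36408, 36715, 36559, 36457, 36991],
}
UNMARKED_NODES = {6, 7, 8, 9}  # nodes without the marker in Table 22


def group_ratio(rates):
    """Mean rate of marked nodes divided by mean rate of unmarked nodes."""
    marked = [r for node, r in enumerate(rates, start=1) if node not in UNMARKED_NODES]
    unmarked = [r for node, r in enumerate(rates, start=1) if node in UNMARKED_NODES]
    return (sum(marked) / len(marked)) / (sum(unmarked) / len(unmarked))


ratios = {case: group_ratio(rates) for case, rates in TABLE22.items()}
for case, ratio in ratios.items():
    print(f"{case}: marked nodes reach {ratio:.3f} of the unmarked rate")
```

For all three hybrid configurations the marked nodes reach roughly two thirds of the rate of nodes 6 to 9.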

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq

652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq

Table 23 MPI_Alltoall, cases A to D

Rows (configuration and size):
  Flat MPI, 512 MB: 144mpi, 448mpi, 504mpi, 2048mpi, 3072mpi
  Hybrid, 1024 MB: 68mpi x 2omp, 120mpi x 2omp, 132mpi x 2omp, 432mpi x 2omp
  Hybrid, 1024 MB: 30mpi x 4omp, 36mpi x 4omp, 126mpi x 4omp, 180mpi x 4omp, 612mpi x 4omp
  Hybrid, 1024 MB: 17mpi x 8omp, 90mpi x 8omp, 357mpi x 8omp
Several configurations include a second and third run for cases B, C and D.

brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq

7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

ucircuumlyacutethorn

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net/

A FX1 HPCC BMT

Table A-1 Compile options and sizes, FX1 HPCC BMT. Notes cover G-FFTE (-DHPCC_FFT_235) and EP-DGEMM.

B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

1024EumlIgrave

Fig. B-1 MPI functions (1024 processes, part 1). Panels: Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv, Gather. Each panel plots the rdma and nordma settings.

Fig. B-2 MPI functions (1024 processes, part 2). Panels: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank 0 to Rank 1023). Each panel plots the rdma and nordma settings.

C RDMA vs. NORDMA

Table C-1 Data sizes 8 KB to 78 KB. Columns: 256, 512, 1024, 2048 and 4096 processes.

Table C-2 Data sizes 80 KB to 584 KB. Columns: 256, 512 and 1024 processes.

Zordf

100M

RD

MAAgravejordfnotyWXq

0CXM

NO

RD

MA

AgravejordfnotyWXq

D

W Oslash5

Table D-1 Flat MPI, 8mpi

Fig D-1 Flat MPI, 8mpi (panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2)

Table D-2 Hybrid 1, 4mpi × 8omp

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Fig D-2 Hybrid 1, 4mpi × 8omp (1) (panels: rdma (4mpi × 8omp)-1, nordma (4mpi × 8omp)-1)

Fig D-3 Hybrid 1, 4mpi × 8omp (2) (panels: rdma (4mpi × 8omp)-2, nordma (4mpi × 8omp)-2, rdma (4mpi × 8omp)-3, nordma (4mpi × 8omp)-3)

Table D-3 Hybrid 2, 4mpi × 8omp

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

Fig D-4 Hybrid 2, 4mpi × 8omp (1) (panels: rdma (4mpi × 8omp)-1, nordma (4mpi × 8omp)-1)

Fig D-5 Hybrid 2, 4mpi × 8omp (2) (panels: rdma (4mpi × 8omp)-2, nordma (4mpi × 8omp)-2, rdma (4mpi × 8omp)-3, nordma (4mpi × 8omp)-3)

E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

nmb = 128 intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MB/s
RANK = 2 TIME = 39197 sec Ave Speed = 32655168 MB/s
RANK = 3 TIME = 39212 sec Ave Speed = 32643219 MB/s
RANK = 4 TIME = 34341 sec Ave Speed = 37272765 MB/s
RANK = 5 TIME = 34175 sec Ave Speed = 37453885 MB/s
RANK = 6 TIME = 34086 sec Ave Speed = 37552470 MB/s
RANK = 7 TIME = 34329 sec Ave Speed = 37286226 MB/s
RANK = 8 TIME = 24021 sec Ave Speed = 53287420 MB/s
RANK = 9 TIME = 23794 sec Ave Speed = 53794939 MB/s
RANK = 10 TIME = 23805 sec Ave Speed = 53770236 MB/s
RANK = 11 TIME = 23798 sec Ave Speed = 53786914 MB/s
RANK = 12 TIME = 23942 sec Ave Speed = 53462490 MB/s
RANK = 13 TIME = 23931 sec Ave Speed = 53488046 MB/s
RANK = 14 TIME = 23950 sec Ave Speed = 53443826 MB/s
RANK = 15 TIME = 24066 sec Ave Speed = 53186052 MB/s
RANK = 16 TIME = 23798 sec Ave Speed = 53786327 MB/s
RANK = 17 TIME = 23802 sec Ave Speed = 53775886 MB/s
RANK = 18 TIME = 23821 sec Ave Speed = 53733399 MB/s
RANK = 19 TIME = 23794 sec Ave Speed = 53795866 MB/s
RANK = 20 TIME = 23939 sec Ave Speed = 53468837 MB/s
RANK = 21 TIME = 23938 sec Ave Speed = 53470775 MB/s
RANK = 22 TIME = 23939 sec Ave Speed = 53470333 MB/s
RANK = 23 TIME = 23942 sec Ave Speed = 53462889 MB/s


Fig E-2 afeaZQR MPIbrvbar

nmb = 128 intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39370 sec Ave Speed = 32511775 MB/s
RANK = 2 TIME = 39475 sec Ave Speed = 32425231 MB/s
RANK = 3 TIME = 39415 sec Ave Speed = 32474846 MB/s
RANK = 4 TIME = 34392 sec Ave Speed = 37217925 MB/s
RANK = 5 TIME = 34543 sec Ave Speed = 37055373 MB/s
RANK = 6 TIME = 34250 sec Ave Speed = 37372742 MB/s
RANK = 7 TIME = 34709 sec Ave Speed = 36878286 MB/s
RANK = 8 TIME = 26967 sec Ave Speed = 47465876 MB/s
RANK = 9 TIME = 26964 sec Ave Speed = 47471018 MB/s
RANK = 10 TIME = 26955 sec Ave Speed = 47486289 MB/s
RANK = 11 TIME = 26947 sec Ave Speed = 47499771 MB/s
RANK = 12 TIME = 26957 sec Ave Speed = 47483647 MB/s
RANK = 13 TIME = 27020 sec Ave Speed = 47373054 MB/s
RANK = 14 TIME = 26948 sec Ave Speed = 47498023 MB/s
RANK = 15 TIME = 27174 sec Ave Speed = 47104542 MB/s
RANK = 16 TIME = 26956 sec Ave Speed = 47485054 MB/s
RANK = 17 TIME = 26945 sec Ave Speed = 47504899 MB/s
RANK = 18 TIME = 26946 sec Ave Speed = 47502360 MB/s
RANK = 19 TIME = 26967 sec Ave Speed = 47466158 MB/s
RANK = 20 TIME = 26948 sec Ave Speed = 47499767 MB/s
RANK = 21 TIME = 26982 sec Ave Speed = 47438204 MB/s
RANK = 22 TIME = 27087 sec Ave Speed = 47254830 MB/s
RANK = 23 TIME = 27004 sec Ave Speed = 47399960 MB/s


Fig E-3 QRfeaZOpenMPIbrvbar

nmb = 128 intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes                  NODE   CPU  coreid
Rank = 0                                                 bx2050 CPU0 0
all time
RANK = 1 TIME = 38110 sec Ave Speed = 33587213 MB/s      bx2050 CPU1 1
RANK = 2 TIME = 28656 sec Ave Speed = 44667752 MB/s      bx2050 CPU0 2
RANK = 3 TIME = 37959 sec Ave Speed = 33720175 MB/s      bx2050 CPU1 3
RANK = 4 TIME = 28658 sec Ave Speed = 44664664 MB/s      bx2050 CPU0 4
RANK = 5 TIME = 37829 sec Ave Speed = 33836073 MB/s      bx2050 CPU1 5
RANK = 6 TIME = 28641 sec Ave Speed = 44691880 MB/s      bx2050 CPU0 6
RANK = 7 TIME = 37892 sec Ave Speed = 33779924 MB/s      bx2050 CPU1 7
RANK = 8 TIME = 35140 sec Ave Speed = 36425620 MB/s      bx2051 CPU0 0
RANK = 9 TIME = 35017 sec Ave Speed = 36554163 MB/s      bx2051 CPU1 1
RANK = 10 TIME = 38916 sec Ave Speed = 32891010 MB/s     bx2051 CPU0 2
RANK = 11 TIME = 45416 sec Ave Speed = 28184186 MB/s     bx2051 CPU1 3
RANK = 12 TIME = 35172 sec Ave Speed = 36392511 MB/s     bx2051 CPU0 4
RANK = 13 TIME = 35053 sec Ave Speed = 36516468 MB/s     bx2051 CPU1 5
RANK = 14 TIME = 35200 sec Ave Speed = 36363472 MB/s     bx2051 CPU0 6
RANK = 15 TIME = 35009 sec Ave Speed = 36561987 MB/s     bx2051 CPU1 7
RANK = 16 TIME = 22996 sec Ave Speed = 55661441 MB/s     bx2052 CPU0 0
RANK = 17 TIME = 35587 sec Ave Speed = 35968575 MB/s     bx2052 CPU1 1
RANK = 18 TIME = 35587 sec Ave Speed = 35968151 MB/s     bx2052 CPU0 2
RANK = 19 TIME = 35712 sec Ave Speed = 35842525 MB/s     bx2052 CPU1 3
RANK = 20 TIME = 22996 sec Ave Speed = 55661978 MB/s     bx2052 CPU0 4
RANK = 21 TIME = 36856 sec Ave Speed = 34729458 MB/s     bx2052 CPU1 5
RANK = 22 TIME = 22994 sec Ave Speed = 55666382 MB/s     bx2052 CPU0 6
RANK = 23 TIME = 22991 sec Ave Speed = 55673257 MB/s     bx2052 CPU1 7


13

F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig F-1 ^_ZugraveuacuteLFlat MPIP

(main)
      n = nmb*1024*1024/8        ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                  ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
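The average copy rate computed at the end of the Fig F-1 listing can be written out as a formula. This is a reading of the listing, assuming that the operators lost in extraction are the multiplications and divisions implied by the units (8-byte reals, 1024 x 1024 bytes per MB). The 4-second elapsed time in the worked line is a hypothetical value chosen only to illustrate the arithmetic; it is not a measured result from the report.

```latex
\[
  n = \frac{\mathrm{nmb}\times 1024 \times 1024}{8},
  \qquad
  \mathrm{ave\_spd}
  = \frac{8\,\mathrm{nsize}}{\left(\mathrm{timew}/\mathrm{NNN}\right)\times 1024 \times 1024}
  \quad [\mathrm{MB/s}]
\]
% With nmb = 128 and NNN = 100:
\[
  n = \frac{128 \times 1024 \times 1024}{8} = 16\,777\,216,
  \qquad
  \mathrm{ave\_spd} = \frac{128 \times 100}{\mathrm{timew}} = \frac{12800}{\mathrm{timew}}
\]
% Hypothetical example: timew = 4 s gives ave_spd = 3200 MB/s.
```

Under this reading, each call to atob is counted as moving 8n bytes (the size of one array), so the reported rate counts the copied volume once rather than counting the read of a and the write of b separately.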


Fig F-2 ^_ZugraveuacuteLOpenMPP

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8        ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
         write(6,*) 'Array allocation failed'
         stop
      end if
!$OMP end parallel
      nsize = n                  ! size
c     NNN : Repetition number of times
      NNN = 100
!$OMP parallel private(i) shared(nsize)
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq

JAEA-Testing 2011-005

Table G-3 MPI measurements (MPI and hybrid, 256 to 1024 KB)


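1 Introduction

In March 2010 (Heisei 22), the Japan Atomic Energy Agency (JAEA) replaced its previous supercomputer system (theoretical peak performance 15.3 Tflops) with a new system whose main components are a large-scale Linux parallel cluster, the PRIMERGY BX900 (theoretical peak performance 200 Tflops), and the FX1 (theoretical peak performance 12 Tflops), a platform for developing codes for the next-generation supercomputer. This report evaluates the fundamental performance of the BX900 and the FX1.

Section 2 outlines the system. Section 3 reports the results of the HPCC benchmark, and Sections 4 to 6 examine memory, file (I/O) and network (communication) performance. Section 7 gives a summary. The measured data are compiled in the appendices.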

2 System Overview

The new system consists of three parts: a large-scale parallel computing unit, a next-generation code development unit and a shared storage unit. As a whole it provides a theoretical peak performance of 214 Tflops, 56 TB of memory and 1.2 PB of disk storage. An overview of the system is shown in Fig. 1.

2.1 Large-Scale Parallel Computing Unit

The large-scale parallel computing unit is a PRIMERGY BX900 (hereafter BX900), a large-scale Linux cluster with a theoretical peak performance of 200 Tflops and 50 TB of memory.

The BX900 is built from 10U chassis, each of which houses up to 18 server blades. Each server blade (node) carries two Xeon X5570 processors and 24 GB of DDR3 SDRAM (some nodes have 12 GB or 48 GB). The theoretical peak performance of a node is 93.76 Gflops (2.93 GHz × 4 floating-point operations per clock × 4 cores × 2 processors), and its memory bandwidth is 51.2 GB/s. The nodes are connected by InfiniBand (IB) 4x QDR (8 GB/s).
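The node peak quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, the factor of 4 floating-point operations per clock is the one that appears in the formula in the text, and reading 51.2 GB/s as the node memory bandwidth is an assumption.

```python
# Illustrative check of the BX900 node figures quoted in the text.
# peak_gflops is a hypothetical helper, not part of any vendor tool.

def peak_gflops(clock_ghz, flops_per_clock, cores_per_cpu, cpus_per_node):
    """Theoretical peak of one node in Gflops."""
    return clock_ghz * flops_per_clock * cores_per_cpu * cpus_per_node


node_peak = peak_gflops(2.93, 4, 4, 2)   # Xeon X5570, two CPUs per node
memory_bw_gbs = 51.2                     # assumed: node memory bandwidth in GB/s
bytes_per_flop = memory_bw_gbs / node_peak

print(f"node peak      : {node_peak:.2f} Gflops")
print(f"bytes per flop : {bytes_per_flop:.2f}")
```

A ratio of roughly half a byte per floating-point operation suggests why memory-bound tests such as STREAM behave very differently from HPL on this kind of node.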

Fig. 1 System overview

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

2.2 Next-Generation Code Development Unit

The next-generation code development unit is an FX1 (hereafter FX1), a system with a theoretical peak performance of 12 Tflops and 4.6 TB of memory.

Each node has one SPARC64 VII processor and 16 GB of memory. The nodes are connected with full bisectional bandwidth (FBB) by InfiniBand 4x DDR (2 GB/s). The FX1 serves as a platform for developing codes for the next-generation supercomputer expected in 2012.

2.3 Shared Storage Unit

The shared storage unit consists of two SPARC Enterprise M9000 I/O nodes, 36 ETERNUS DX80 storage units and one ETERNUS4000 model 400.

The shared storage provides a total disk capacity of about 1.2 PB (RAID6) and an aggregate I/O bandwidth of 57.6 GB/s (Fig. 2). The I/O nodes are attached to the InfiniBand network, and the compute nodes access the storage through Parallelnavi SRFS. The storage units are populated with SAS disks (1 TB, 7200 rpm) configured as RAID6, and the file system on the I/O nodes is Sun QFS (Fig. 3).

Fig. 2 Storage configuration and transfer bandwidth (large-scale parallel computing unit: 119 chassis on QDR InfiniBand; I/O nodes: M9000 × 2, 72 GB/s in total; shared storage: ETERNUS DX80 × 36, 57.6 GB/s in total)

Fig. 3 Shared storage configuration (I/O nodes M9000 1 and 2 with QFS; ETERNUS DX80 × 36 with 1 TB disks, 144 FC links, 57.6 GB/s; ETERNUS4000 Model 400 × 1, 8 FC links, 3.2 GB/s)


3 HPCC Benchmark

The seven tests of the HPCC (High Performance Computing Challenge) benchmark 1) were run on the BX900 and the FX1. This section outlines HPCC and reports the results.

3.1 Overview of HPCC

HPCC is a benchmark suite developed by J. Dongarra of the University of Tennessee and others. Rather than expressing the performance of a supercomputer by a single figure, it evaluates the system from several angles. Three tests, HPL (High-Performance Linpack), DGEMM and FFTE, measure computational performance; two tests, STREAM and Random Access, measure memory performance; and two tests, PTRANS and b_eff, measure communication performance 2).

3.1.1 Tests

A supercomputer is a set of nodes, each with one or more CPUs, connected by a network, so its performance can be considered at the level of the core, the CPU, the node and the whole system. HPCC therefore measures in three modes: a mode that uses all nodes together (G: Global system performance), a mode that uses a single node (SN: Single Environment) and a mode in which every process runs the same problem independently (EP: Embarrassingly Parallel).

3.1.1.1 HPL Test

HPL measures floating-point performance by solving a system of linear equations through LU decomposition of a matrix. It is widely known as the benchmark used for the TOP500 ranking of supercomputers and is the MPI-parallel version of Linpack. The result is given in Tflops. One result is obtained, in the all-node (G) mode.

3.1.1.2 DGEMM Test

DGEMM measures the floating-point performance within a node by multiplying double-precision matrices. The result is given in Gflops (Giga floating-point operations per second). Two results are obtained, in the single-node (SN) and (EP) modes.

3.1.1.3 STREAM Test

STREAM measures the sustainable memory bandwidth within a node. It consists of four operations, copy (Copy), scaling (Scale), addition (Add) and the combined multiply-add (Triad), and evaluates the memory performance of a node. The result is given in GB/s (Giga Bytes per second). Eight results are obtained: the four operations (copy, scaling, addition, multiply-add) in each of the (SN) and (EP) modes.

3.1.1.4 PTRANS Test

PTRANS (Parallel matrix TRANSpose) transposes a matrix in parallel and thereby measures the communication capacity of the network. The result is given in GB/s. One result is obtained, in the all-node (G) mode.

3.1.1.5 Random Access Test

Random Access measures the rate at which randomly selected memory locations can be updated. Each update operates on 8 B of data. Within a node the updates are memory accesses, and between nodes they are carried out with MPI. The result is given in Gups (Giga updates per second). Three results are obtained, in the single-node (SN), (EP) and all-node (G) modes.

3.1.1.6 FFTE Test

FFTE measures the floating-point performance of the fast Fourier transform. Like HPL, it represents a kernel that is often used in real applications. It is run in the (SN), (EP) and all-node (G) modes, and the result is given in Gflops.

3.1.1.7 b_eff Test

b_eff measures the communication latency and the communication bandwidth. It is run in the all-node (G) mode and consists of two tests, Ring and PingPong. Latency is measured by transferring 8 B of data and bandwidth by transferring 2 MB of data. Ring is measured in two ways: with the processes ordered by MPI rank (natural order) and with the processes in random order. PingPong exchanges messages between two cores. Six results are obtained: (1) Ring (natural order) latency, (2) Ring (natural order) bandwidth, (3) Ring (random order) latency, (4) Ring (random order) bandwidth, (5) PingPong latency and (6) PingPong bandwidth. Latency is given in microseconds and bandwidth in GB/s.

3.1.2 Runs

HPCC defines two kinds of runs: baseline runs (Baseline Runs) and optimized runs (Optimized Runs). The measurements in this report were made as baseline runs, with tuning limited to compiler options.

3.1.2.1 Baseline Runs

In a baseline run, the source code distributed with HPCC is used as it is. Tuned BLAS and MPI libraries may be linked, and compiler options may be chosen.

3.1.2.2 Optimized Runs

In an optimized run, parts of the source code may be modified for tuning.
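The update loop behind the Random Access test described in 3.1.1.5 can be sketched as follows. This is a minimal single-process illustration modelled on the publicly documented HPCC RandomAccess kernel (a 64-bit shift-register sequence with polynomial 0x7, XOR updates, four updates per table word). The function names and the small table size are assumptions made for the example, and a pure-Python loop says nothing about the hardware rates reported in this document.

```python
import time

POLY = 0x7                    # feedback polynomial of the HPCC random sequence
MASK64 = (1 << 64) - 1        # keep values within 64 bits, as the C code does


def next_random(ran):
    """Advance the 64-bit pseudo-random sequence by one step."""
    top_bit_set = ran >> 63
    ran = (ran << 1) & MASK64
    return ran ^ POLY if top_bit_set else ran


def apply_updates(table, n_updates, ran=1):
    """XOR n_updates pseudo-random values into pseudo-random table slots."""
    index_mask = len(table) - 1          # table length must be a power of two
    for _ in range(n_updates):
        ran = next_random(ran)
        table[ran & index_mask] ^= ran   # one 8-byte read-modify-write
    return ran


def gups(n_updates, seconds):
    """Giga updates per second."""
    return n_updates / seconds / 1e9


if __name__ == "__main__":
    size = 1 << 12                       # illustrative table size in words
    table = list(range(size))
    n_updates = 4 * size                 # four updates per table word
    start = time.perf_counter()
    apply_updates(table, n_updates)
    elapsed = time.perf_counter() - start
    print(f"{gups(n_updates, elapsed):.6f} Gups (pure Python, not comparable)")
```

Because XOR is its own inverse, applying the same update sequence a second time restores the initial table, which gives a convenient correctness check for an implementation of this kind.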

3.2 HPCC Results on the BX900 and the FX1

The HPCC BMT tests were run on the BX900 and on the FX1 with 8 to 512 parallel processes.

3.2.1 BX900

Table 1 shows the HPCC BMT results for the BX900, and Fig. 4 to Fig. 7 show the scalability of the HPL, PTRANS, Random Access and FFTE tests. At 512 processes, the HPL result corresponds to 82.2% (= 4.902/5.96 × 100) of the theoretical peak performance 1.

For comparison with Table 1, Table 2 shows the results published on the HPCC Web site 1) for the Intel Endeavor cluster (hereafter Endeavor), a system with a similar architecture. Because Endeavor used the Intel compiler, the tests were also run on the BX900 with the Intel compiler; as Table 3 shows, the results are equivalent to those obtained with the Fujitsu compiler. For STREAM, however, the BX900 result is about 20% lower than that of Endeavor (3.5 GB/s versus 4.2 GB/s). This is discussed in 4.2.2.

1 For the BX900 as a whole (17,072 cores), HPL achieved 191.4 Tflops against a theoretical peak performance of 200.0 Tflops, an efficiency of 95.66%.
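The efficiency figure in the footnote follows from the core count and the per-core peak. The sketch below is only an illustration: the per-core peak of 2.93 GHz × 4 operations per clock is taken from the node description in 2.1, and the helper name is hypothetical.

```python
# Illustrative check of the whole-system HPL efficiency quoted in footnote 1.

def efficiency_percent(measured_tflops, peak_tflops):
    """Measured performance as a percentage of theoretical peak."""
    return 100.0 * measured_tflops / peak_tflops


core_peak_gflops = 2.93 * 4                        # clock (GHz) x operations per clock
system_peak_tflops = 17072 * core_peak_gflops / 1000.0
hpl_tflops = 191.4                                 # measured value quoted in the footnote

print(f"system peak : {system_peak_tflops:.2f} Tflops")
print(f"efficiency  : {efficiency_percent(hpl_tflops, system_peak_tflops):.2f} %")
```

Using the unrounded peak (about 200.08 Tflops) rather than the rounded 200.0 Tflops is what reproduces the quoted 95.66%.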

Table 1 HPCC BMT results on JAEA's BX900 with the Fujitsu compiler

Columns: number of parallel processes (8, 16, 32, 64, 128, 256, 512); communication method (rdma, nordma); G-HPL [TFlops]; G-PTRANS [GB/s]; G-RandomAccess, original version [Gups]; G-RandomAccess, SANDIA_OPT2 version [Gups]; G-FFTE [GFlops]; EP-STREAM Sys (*) [GB/s]; EP-STREAM Triad [GB/s]; EP-DGEMM [GFlops]; RandomRing Bandwidth [GB/s]; RandomRing Latency [usec]

System: Fujitsu BX900 (512 cores/128 CPUs); chip: Xeon X5570 2.93 GHz, DDR3-1066, SMT-ON, Turbo Mode-OFF; interconnect: QDR InfiniBand, 2 sockets/node, Fat Tree; OS: Red Hat EL 5; compiler: Fujitsu C/C++, options -Kfast -SSL2; MPI: Fujitsu Parallelnavi Language Package; HPCC Ver. 1.3.1

(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × (number of parallel processes).

Fig. 4 BX900 HPL

Fig. 5 BX900 PTRANS

Fig. 6 BX900 Random Access

Fig. 7 BX900 FFTE


Table 2 Results for the Intel Endeavor cluster (HPCC Web 1))

Table 3 HPCC BMT results on the BX900 with the Intel compiler


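3.2.2 FX1

Table 4 shows the HPCC BMT results for the FX1, and Fig. 8 to Fig. 11 show the scalability of the HPL, PTRANS, Random Access and FFTE tests. At 512 processes, the HPL result corresponds to 75.5% (= 3.776/5.0 × 100) of the theoretical peak performance 2.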

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037

Table 4 HPCC BMT results on the FX1

Fig. 8 FX1 HPL

Fig. 9 FX1 PTRANS

Fig. 10 FX1 Random Access

Fig. 11 FX1 FFTE


3.3 Summary of the HPCC Results

Fig. 12 compares the results of the BX900 and the FX1 with those of the previous system (Altix3700Bx2). Each item is normalized so that the best value is 100 (for Random Ring Latency, the smallest value).

In the comparison between the BX900 and the FX1, the STREAM and FFTE results warrant closer examination. STREAM is discussed in 4.2.1 and FFTE in 4.3.

The Random Access value of the Altix3700Bx2 is low, but the version of the HPCC BMT used on the Altix3700Bx2 differs from the one used here, so the values cannot be compared directly. The low FFTE value of the Altix3700Bx2 is thought to have the same cause.

Fig. 12 Comparison of HPCC BMT results for the BX900, the FX1 and the Altix3700Bx2 (512 processes, flat MPI; scale 0 to 100). Items: G-HPL (4.90 Tflops), G-RandomAccess (1.54 Gups), G-PTRANS (112.0 GB/s), EP-STREAM (4.36 GB/s), EP-DGEMM (11.1 Gflops), G-FFTE (275.5 Gflops), RandomRing Bandwidth (0.359 GB/s), RandomRing Latency (5.18 usec)


4 Tuning on the BX900 and the FX1

In the tests of 3.2, some results of the BX900 and the FX1 differed from those of systems with a similar architecture, so the causes were analyzed with the Fujitsu profiler. On that basis, tuning by compiler options and tuning of the test programs were carried out.

4.1 Profiler

The profiler used on the BX900 and the FX1 is a Fujitsu product. Profile data are collected while the program runs with the fpcoll command, and the collected data are analyzed and displayed with the fprof command. The profiler provides eight kinds of information.

^_a|Ocirc$divideAElig AElig ^_ AElig MPIZaZ|ordm)AElig f`AElig IcircIumlETHNtilde$AElig fZAElig UcircfIumlAElig

mPIcircIumlETHNtilde$AEliggtCPUlt=gtOaringaeligccedil13

hccedil`|WOcirc$FGszligbordfoacute WXq

4.2 Tuning by the Compiler

In the HPCC BMT tests on the BX900, the STREAM result was about 20% lower than that of Endeavor (3.5 GB/s versus 4.2 GB/s), and the cause was investigated. The investigation covered the STREAM code in HPCC and the original STREAM programs (C version and Fortran version), in both thread-parallel and MPI-parallel versions. As shown below, the result on the BX900 changes with the compiler option -Kntst, so the gap is attributed to the difference between the Endeavor compiler (made by Intel) and the BX900 compiler (made by Fujitsu).

4.2.1 Investigation with the Original STREAM Test

Table 5 shows the results of the original STREAM test on the BX900. For the Fortran version, the compiler option -Kntst was effective in both the thread-parallel and the MPI-parallel versions. The option makes stores bypass the cache, so it pays off in DO loops whose stored data are not reused; whether it is appropriate has to be judged for each program. For the C version, the Intel compiler achieved 4.2 GB/s per core with OpenMP parallelization.

Table 6 shows the results of the original STREAM test on the FX1. For the Fortran version, -Kntst was again effective in both the thread-parallel and the MPI-parallel versions. The Triad value of 13.2 GB/s per CPU is close to the 13.5 GB/s per CPU published by Fujitsu on the Web. The C version gave about the same values as the Fortran version without -Kntst: with the FX1 C compiler, unlike on the BX900, the -Kntst option is not effective.
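For reference, the four STREAM kernels and the way their bandwidth is counted can be sketched as below. This is a plain-Python illustration, not the benchmark source: the array length and the scalar are arbitrary assumptions, and interpreter overhead makes the resulting rates meaningless as hardware measurements. The byte counts follow the usual STREAM convention of 8-byte words, two words per iteration for Copy and Scale and three for Add and Triad.

```python
import time

WORD = 8  # bytes per double-precision element


def run_stream(n=100_000, scalar=3.0):
    """Run Copy, Scale, Add and Triad once; return the arrays and rates in GB/s."""
    a, b, c = [1.0] * n, [2.0] * n, [0.0] * n
    rates = {}

    def timed(name, words_per_iter, kernel):
        start = time.perf_counter()
        kernel()
        elapsed = time.perf_counter() - start
        rates[name] = words_per_iter * WORD * n / elapsed / 1e9

    def copy():
        for i in range(n):
            c[i] = a[i]

    def scale():
        for i in range(n):
            b[i] = scalar * c[i]

    def add():
        for i in range(n):
            c[i] = a[i] + b[i]

    def triad():
        for i in range(n):
            a[i] = b[i] + scalar * c[i]

    timed("Copy", 2, copy)
    timed("Scale", 2, scale)
    timed("Add", 3, add)
    timed("Triad", 3, triad)
    return a, b, c, rates
```

On hardware that allocates a cache line on a store miss, each stored word is typically also read once, so the real memory traffic exceeds what STREAM counts. Store instructions that bypass the cache avoid that extra read, which is one common explanation for the kind of gain seen with an option such as -Kntst.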

Table 5 STREAM results on the BX900

Table 6 STREAM results on the FX1


4.2.2 Investigation of STREAM in the HPCC BMT Tests

Table 7 shows the STREAM results of the HPCC BMT tests on the BX900 described in 3.2.1. Here, -Kntst makes the compiler use store instructions that do not go through the cache when writing array data, and -Krestp=all is a compiler option that tunes the code on the assumption that pointers do not overlap.

As the table shows, the values vary little with the number of processes or the communication method.

For loops such as those in STREAM, specifying the compiler option -Kntst was thus found to improve memory performance.

Table 7 STREAM results in the HPCC BMT tests (MPI-parallel version, C version)

                         Compiler options (1) (*1)      Compiler options (2) (*2)
Processes  Comm.     Chassis  EP-STREAM Triad     Chassis  EP-STREAM Triad
           method             (GB/s)                       (GB/s)
32         rdma      1        3.4739              -        -
32         nordma    1        3.5016              1        4.3316
128        rdma      3        3.4708              -        -
128        nordma    3        3.4692              1        4.3572
512        rdma      5        3.4726              4        4.3595
512        nordma    4        3.5477              4        4.3424

(*1) Fujitsu compiler options (1): -Kfast,packed,ntst -SSL2
(*2) Fujitsu compiler options (2): -Kfast,packed,ntst,restp=all -SSL2
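The gain from the second option set can be read off Table 7 with a short calculation. The sketch below assumes that the table values are GB/s with the decimal point after the first digit (for example 3.5016 and 4.3316); the dictionary layout and the names are illustrative only.

```python
# Ratio of EP-STREAM Triad with option set (2) to option set (1), from Table 7.
# Keys are (processes, communication method); values are (set 1, set 2) in GB/s.
triad_gbs = {
    (32, "nordma"): (3.5016, 4.3316),
    (128, "nordma"): (3.4692, 4.3572),
    (512, "rdma"): (3.4726, 4.3595),
    (512, "nordma"): (3.5477, 4.3424),
}

gains = {key: opt2 / opt1 for key, (opt1, opt2) in triad_gbs.items()}

for (procs, method), gain in sorted(gains.items()):
    print(f"{procs:4d} {method:7s} x{gain:.2f}")
```

Under that reading, the improvement of roughly 22 to 26% is of the same order as the gap to Endeavor noted in 4.2 (3.5 GB/s versus 4.2 GB/s).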


43 hhbrvbar|YTUacuteUcirc 32 gtdcWXpoundDbrvbarPBX900 W FX1 FFTE ^_Z

acircIcircIumlETHNtilde13yacuteordfmacrdnDqmETHNtildeDH5pound

DhhW13gtszlig0Gq 13gt FFTE 32EumlIgraveOslash5CDWecircoacuteG ethBX900

gt 12FX1 gt 16Wgteumlqpoundordm)1|YEcirc2OgtlaquopoundD9ordfAumlqmgtFFTEUuml|YacuteTHORN(FortranUcirc)l3CUuml|YacuteTHORN(13aumlWoacuterGoumlUcircWdivideoslash)WaumlIacutecurren4C)JCDaumlacircoacuteGaumlZh5poundDiacuteegraveBC

hbrvbarpound^UcircCDordm)GmWbrvbarsectBX900 W FX1 oacuteG5g5poundDqUcircpaumlbrvbar5poundDh 1QRfeaZW Endeavor egraveBeumlnDafeaZWrs5PDHW 2auml13IacutecurrenTUacuteUcircbrvbar6)BbordfaumlIacutecurrenthorn7wXmWagraveaacuteGDHgtlaquoq UcircaumlIacutecurren 38plusmnUcircWmndaumlZhCDiacute

Ucircn~n Fig 13W Fig 14GWoIacutecurreniacuteEcircu0Gq Fig 13W Fig 14OgravegtCDIacutecurrenOO=5pound^gtlaquoqUuml|YacuteTHORN

gtcopyIgrave APXYZWWWWOcirc$Yordf 2ugt5nsectZgtXcpoundDDHiacutegt)9BcopyIgrave CY_W^curreneth5poundcopyIgrave APXYZWWWWYordfZXbrvbarPCDq)9BcopyIgrave CY_YZCXcpoundD=Ocirc$sectumlG^gtYordfZXpoundDoumlUcircordfecircDHgtlaquoq

Fig 13W Fig 14OacutegtCDIacutecurren3[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 3ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 3gtXC 6Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquosectaringaeligccedil13h-

a5brvbarpoundltG_ccedilY-a5^_ZUgrave)acircregmacr= 16 gteumlnq

Fig 13W Fig 14OcircgtCDIacutecurren2[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 2ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 2gtXC 4Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquoqFX1 gtBX900rufntildeDsectaringaeligccedilaringh-a5ordfgteumlmWcdmO=)t_ccedilY-a5 16cd 4^regGmWgtzyacuteordfmacrdnDq

Table 8BX900W FX1QRfeaZegraveBCDUcircWaumlbrvbarEuml BX900 afeaZegraveBCDaumlbrvbarGq

Fig. 13 FFTE original version (program listing)


ABCDEFGHIJJDKLMAacuteNONPONHQRIONQRIHOAOAQRIHOAIRQOSROSIOLOKTQOTROTIOTTOTTTOFQOFROFIOESCUUOFHAtildeEUHVESEDBGNVWXAacuteNYZOCYIAtildeEFSVLG[]^_`a[EFSVLG[]bcb`a[SCUHVGQWKdNAacuteFQOFROWAtildeONPAacuteFQeFHOFROWAtildeONHQRIAacuteFHOFQeFHOFROWAtildeOKNQRIHAacuteFQeFHOFROFIeFHOWAtildeSCUHVGQWKdAAacuteFQOFROWAtildeOAQRIHAacuteFQeFHOFROFIeFHOWAtildeOAIRQAacuteFIeFHOFROWAtildeSCUHVGQWKdSRAacuteFRfFHOWAtildeOSIAacuteFIfFHOWAtildeOLAacuteWAtildeSCUHVGQWKdTQAacuteWAtildeOTRAacuteWAtildeOTIAacuteWAtildeOTTAacuteFROWAtildeOTTTAacuteFHOFQeFHOFROWAtildeLEUGFECFVFQAacutegAtildeOVFRAacutegAtildeOVFIAacutegAtildeop]qrsAacuteXAtildeObqqpobtbuqrvvowxAacutevOvOvAtildeL3yzPbqqpobtrAacuteowxAacutesOwO]|AtildeAtildeLCKiMjhKOFFILCKkMVhKOFHLCKdMEEhKOFFQOFAVjLCKPMllhKOFROFAVjLCKKMEhEEOUEFMAacuteEEfFAVjYKOFFQAtildeLCKMMlhllOUEFMAacutellfFAVjYKOFRAtildeSRAacutelOEYEEfKAtildehAQRIHAacuteEOlOjOVAtildeWTTAacutelOjAtildeKMMSCFDEFGKKMSCFDEFGKPMSCFDEFGLCKgMEhEEOUEFMAacuteEEfFAVjYKOFFQAtildeSNVVJJDPgmAacuteSRAacuteKOEYEEfKAtildeOLOTROFROVFRAtildep~hKOwowxAacute^O~OqAtildehowAacute~O^Atilde ^currenethDHrp )9BcopyIgrave)JIacutecurrenKgMSCFDEFGKdMopt^|rKkMopt^|rpKdK^^hKOsOuqpKmM~~hKOwOuqpKmK~h~~O^MAacute~~fuqYKOwAtildeLCKnMEhEEOUEFMAacuteEEfFAVjYKOFFQAtildepKkKVhKO]|NHQRIAacuteVOEOlOjAtildehowxAacute^Y^^fKO~OqAtildeWTTTAacuteVOEOlOjAtilde aumlIacutecurrenOgraveKkKopt^|r LEW V^lnacutewPKnMSCFDEFGKmKSCFDEFGKmMSCFDEFGKdKSCFDEFGLCKXMlhKOFRSNVVJJDPgmAacuteNAacuteKOlOjAtildeOLOTQOFQOVFQAtildeKXMSCFDEFGKiMSCFDEFGLCPmMEEhKOFQOFAVjLCPnMllhKOFROFAVjLCPgMjjhKOFFIOFAVjLCPMMjhjjOUEFMAacutejjfFAVjYKOFFIAtildeLCPKMlhllOUEFMAacutellfFAVjYKOFRAtildeLCPPMEhEEOUEFMAacuteEEfFAVjYKOFQAtildeAIRQAacutejOlOEAtildehNAacuteEOlOjAtilde aumlIacutecurrenOacutePPMSCFDEFG LEW 
j^lnacutewPPKMSCFDEFGPMMSCFDEFGPgMSCFDEFGPnMSCFDEFGPmMSCFDEFGBGDBFGFLABCDEFGIDBNFAacuteNOAOFKOFPAtildeEUHVESEDBGNVWXAacuteNYZOCYIAtildeEFSVLG[]bcb`a[SCUHVGQWKdNAacuteFKOWAtildeOAAacuteFPOWAtilde]bcbrtrcAacuteuqKhnAtildeLCnMEEhKOFKOuqKLCgMllhKOFPOuqKLCKMlhllOUEFMAacutellfuqKYKOFPAtildeLCPMEhEEOUEFMAacuteEEfuqKYKOFKAtildeAAacutelOEAtildehNAacuteEOlAtilde aumlIacutecurrenOcircPMSCFDEFGLEW l^lnacutewPKMSCFDEFGgMSCFDEFGnMSCFDEFGBGDBFGFL

Fig 14 FFTEiacuteaumlIacutecurrenUcirc^_Z


Table 8 FFTE Ver 4.1, Full and kernel (32 processes, 180MB/Proc)

ST

UVWXYZ[]

^_

`6a8bcdRefghHdRefgij

`6a8bcdRefghHdRefgij

klmmnopqr

cltst

ouvq

wvx

ovy

zqvz

|vy

|vz

klmmnopq

~cltst

yvy

yvo

ovo

wqvq

v|

|vy

s88oy

oqvu

oqvu

ovq

z|vu

oquv

qvz

s88Ru

ovz

qvw

x|vq

ovw

klmmnopqr

cltst

ouvy

zvy

ovx

oq|vu

uvz

uv

klmmnopq

~cltst

yvx

xvw

ovo

w|vu

vz

|vy

s88oy

oqvx

oqvo

ovq

wxv

z|vo

qvz

s88Ru

o|v|

qvw

xuvw

ovy

klmmnopqr

cltst

ouvo

zvx

ovx

klmmnopq

~cltst

yv

xv

ovo

s88oy

oqvq

oqvu

ovq

s88Ru

ovz

qvw

ln

ltd

ln

zqq

D

mo

ltd

ln

mltd


UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq


13

|ucp|t^ro__tAacute^OOOsOwAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoSp]|trFRFY]p^top]qrsYtpYop]qrsJJDp_Q|^bbqpc^ta|rotpbctcb|urc`Q^uptatar^]|tbtarp|t]|tbccbwOa^qrR^boocbtoabccbw`Et^b|rtabtFhPU`Ar_pcrobqq^SJJDItpo]rc_pcJJDOtarbccbw|tur^^t^bq^ruwobqq^SJJDI^taEorttpMbUrttpUQOarcrUQ^tarbs^|bq|rp_U_pcbwo|ur|rtobqq`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY^]q^o^tpr^oq|r[qpubq`a[^oq|r[]^_`a[^trrc^OOO^O~OqOsO^rccOr^trrc__tuqpoO__tuqpo]b]bcbrtrcAacute__tuqpohdnO__tuqpo]bhddAtilde auml^_ccedilY-a5Kd]bcbrtrcAacute__tuqpohKdO__tuqpo]bhKXAtilde]bcbrtrcAacutehiOhmKPAtildep|uqrop]qrssOwO|p|uqr]cro^^prqOrqrOrq]^r^psAacute__tuqpo]bOAtildeOwAacute__tuqpo]bOAtildeO|AacuteAtildeobqqUHExE^tAacute^rccAtildesAacutevOvAtildehAacuteK`MMOM`MMAtildewAacutevOvAtildehAacuteK`MMOM`MMAtilde|AacutevAtildehAacuteK`MMOM`MMAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoHrc_pcprbc^btp_tartpoabJJD`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYobqq]^xubcc^rcAacute]^xopxpcqO^rccAtilderqh]^xt^rAacuteAtildep^hKOKMMM^hY^pqhKOOPobqq__tPAacute^OqOOO__tuqpoO__tuqpo]bO|OsOwAtilde^_Aacuteq`r`AtildeptpKdMobqq__tPAacute^OqfKOOO__tuqpoO__tuqpo]bO|OwOsAtilderpptpKXMoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoSp]wRtpQ`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYKdMopt^|rp~hKOp^hKO__tuqposAacute^O~AtildehwAacute^O~Atilde aumlIacutecurrenLfgO=PrprpKXMopt^|rrpobqq]^xubcc^rcAacute]^xopxpcqO^rccAtilderqrh]^xt^rAacuteAtilderq]hrqrYrqobqqUHExSpxcbAacuteUHExSCUUxTCBVLOrO^rccAtilde^_Aacuter`r`MAtildetarc^trAacutedOWAtilde[Gqb]rD^ruw|u`__tPAacutero`Atildev[Orq]r^_obqqUHExJ^bq^rAacute^rccAtildetp]

Fig 15 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- cfftz)


|ucp|t^r__tPAacute^OqOOOwOwKO|OsOwAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoHrc_pctarVYta^trcbt^pp_tarropbc^btp_tartpoabJJD`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY^]q^o^tpr^trrc^OOqOOOwOwKOKOq^Oq~OqO|O^O~O^KKO^KPO^PKO^PPp|uqrop]qrs|OsOwO|KOsKKOsPK^r^p|AacuteAtildeOsAacutewKOAtildeOwAacutewKOAtildeoYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYort^^t^bq]bcbrtrc`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYKhePqhPWWAacuteqYKAtildeq^hPWWAacuteYqAtildeq~hPWq|hq^fKp^hMOq^YK^KKh^WqfK^KPh^KKfK^PKh^Wq~fK^PPh^PKfq^_Aacute^`r`KAtildetar|Kh|Aacute|f^Atilderqr|Khop~Aacute|Aacute|f^AtildeAtilder^_oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYoDa^qpp]^rotpc^buqr`oYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYphMOqYKp~hKOwsKKhsAacute~O^KKfAtildesPKhsAacute~O^KPfAtildewAacute~O^PKfAtildehsKKfsPK aumlIacutecurrenLOO=PwAacute~O^PPfAtildeh|KWAacutesKKYsPKAtilderprprpcrt|cr

Fig 16 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- fftz2)


UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc


Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT


5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

In one measurement, the number of read/write calls is (blockSize / transferSize × segmentCount), and the file size is (blockSize × segmentCount × number of MPI processes). For example, with transferSize = 1 kB, blockSize = 1 MiB (10^6 bytes is written MB and 2^20 bytes is written MiB), segmentCount = 1, and 32 MPI processes, a testFile of blockSize × segmentCount × number of processes = 32 MiB is created, and each task accesses its 1 MiB region 1024 times in 1 KB units, either sequentially or randomly.
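The arithmetic in the example above can be sketched in a few lines. This is only an illustration of the two formulas as read from the text (number of calls = blockSize / transferSize × segmentCount; file size = blockSize × segmentCount × number of MPI processes). The function name `ior_access_plan` is hypothetical and not part of IOR, and transferSize is taken as 1024 bytes so that the count matches the 1024 accesses stated above.

```python
MIB = 2**20  # 1 MiB = 2^20 bytes (the text distinguishes this from 1 MB = 10^6 bytes)


def ior_access_plan(transfer_size, block_size, segment_count, num_tasks):
    """Return (read/write calls per task, total file size in bytes).

    Hypothetical helper for illustration; it is not an IOR function.
    """
    if block_size % transfer_size != 0:
        raise ValueError("blockSize must be a multiple of transferSize")
    calls_per_task = (block_size // transfer_size) * segment_count
    file_size = block_size * segment_count * num_tasks
    return calls_per_task, file_size


# Values from the example: transferSize = 1024 B, blockSize = 1 MiB,
# segmentCount = 1, 32 MPI processes.
calls, size = ior_access_plan(1024, 1 * MIB, 1, 32)
print(calls, size // MIB)  # 1024 calls per task, 32 MiB file
```

Halving transferSize doubles the number of calls while leaving the file size unchanged, which is why transferSize is the natural variable to sweep when looking at per-call overhead.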

[transferSize] Number of bytes transferred by one read/write call. It must be a multiple of sizeof(long long int) and no larger than blockSize.

[blockSize] Number of contiguous bytes accessed by each process. 1 MiB here.

[segmentCount] Number of segments in the file. Default 1.

[randomOffset] If 1, access is random. Default 0.

[fsync] Call fsync after writing. Default 0.

[filePerProc] Each process accesses its own file. POSIX only. Default 0.

[useFileView] Use MPI_File_set_view(), MPI_File_write(), and MPI_File_read(). Cannot be combined with random access. Default 0.

[collective] Use MPI_File_write_at_all() and MPI_File_read_at_all(). Cannot be combined with random access. Default 0.

Fig 18 IOR YacircueZ$


512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

Table 11 MPIIO Info UumlYacuteNtildeY` key value

cb_buffer_size 4194304 $amp()

+

cb_nodes -012345

6789lt

=gt-012A

B

ind_rd_buffer_size 4194304 ACDEFGF$amp()+

ind_wr_buffer_size 524288 ACDH+GF$amp()+


52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

[Diagram: ETERNUS DX80 storage (970TB), two IO servers (SPARC Enterprise M9000), and the SRFS components, with the READ and WRITE paths between them]

Fig 19 ReadWriteumlcopy


Table 12 (POSIX)

regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3

oqu yvq ovy| |uxvzw zvwu |zwvyy ouv|u oywvo ozuqv

quw zqzvuy uvox uu|v|u u|uvqo uxovz| uwvuo owovuq wqvxuqzy oqqvo| vy uxyvyw uuuvxx uyvyw xqzvu| |vw| |q|wvzqwoz o|uv|w ouvoy ozwv|o uyxvux uqvyo u|v| uovzu zuvxoy|wu ouoov|w uwvqu uuvxq uyvwz uuvyw uxovww uqvxw zovzw|yw ouxzv wzvwq uy|vx uyvz| uyuvwz uwxvoo |xvo wxvzyyxx|y ouwyvq| wv uxvo| uovyx uzvuu u|qv xvxy yvyxo|oq ouqwvu| uqvqq uxv|w uxvxy uuwvyo uyvoo qvyw zv|qyouu oxoyvu ywovoq uuvw ux|vy |uxvu| uxovoy oqvow uwvzwxuww oxovww oo|zv uyuv uvoq |w|vo u|wv|y ouqvoq uqwvooquwxy ouwvyq ovq uuyvo| uzvyu uquvq u|uvo q|zvx| q|vq

oqu ow|qvy ovu uqvyo vyu xx|vyq uwvwy z|zyv|y oxxqwvuwquw quzvoy |vq |w|vwz uyvyw xxwvzu xuvw o|xxxvy owv|xuqzy ouvu |vu uqov|q uxzvzo u|vzq uuwvu ooouv| |qxvxuwoz ozqxv|x o|vq |z|vuo uq|vqz xxyv|u xzzvy oo|vwo xqvzzoy|wu uvuw uqvz uovz u|vwo xwwvo y|vu| oxvz ozvoq|yw uuvw zuvy uvzz uozv yqzvyx xzyv|u ovo oxvwyxx|y owvy ouuv|u uwv|| uuwvx| x|ovx| yqyvux o|ozvuo oyoovyoo|oq |wzvy xyvyw uuwvz| uvqu x|xv xuvu ooyyvxo oxxvoyyouu u|wvq |zxvwx uvo uyzvu xo|vzq xuqvqw oyzu|v|o q|quvqxxuww uuv|o ozuxvuu uuvxx uzvzu xqvow xuwvzz oyxxxvyx owyovyooquwxy xwvy |qzv| uuzvxu uzvo| xyuvwo uwzvwx oxwwvxz oxyuyvy|

oqu oxyovoo vo yo|vyu v|w xwyvzz zv |xwovy xzzxxvyquw oxvz |vwz yquvxx y|ovx xywvyz xzyvyq x|xy|vzo w|ywzvwouqzy oyvu| zv xzwvx| yyvu xvoq xv yxy|vz zqy|vyowoz oy|xvxx uvy yozv|q yvyu xwv| xxvu| ywuv| zqqqvozoy|wu oxyzvuw uyvzz y|vyy yowvxq xwuvwx xwxvy ywzuv|q wqzvoo|yw oxoovuy wwv yqyvyq yuvyw xwov|x xzov|u ywozyvqy wyyyvywyxx|y oxwxvz ouyvxy yoyvuz yvoq xwxvwx xzqvww ywv| wxxqzvo|oq oyxwvw qvwx yovw yuvw| xxwvwo xwyvuq yquvzw woyvyouu oyw|v|z yzvx| yqvwu yxyvq| x|vx| xywvuq yywqvxw wqvq|xuww oxqvxu ouxqv|u yqvuq y|wvqx xyqvw xovz| yx|||vqq yuy|vuoquwxy oyywvqx uwvuy yo|vxz yuzvz yyxvq xwv|u yox|wvww yoovqx

oqu

|

xy

acutedegmicrodegpara

degmiddotmdegemdegkk~qgcedilsup1sup3

mdegkk~ocedilsup1sup3 cedilordm cedilsup3

regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3 regmacrdegplusmn plusmnsup2sup3oqu |yvoy uvqw zqxvxx xvwq uzvx ovu y|zuvx xywy|vy|quw uxzvy yvuz zoyvxo vox xxv| |vqw zyuyovyo zqzuvuyuqzy uyvuz oovx zuovw wvzy zvqx |vzu o|qxxuvy oowzyyvyqwoz uoyvxx vy xyzvzq xxv|o zuovwx uyvyo ouyyqovxx o|woo|v||oy|wu xqvzy uuvy oquvou oqovq oqqovuw z|vy oxuwv| ouqxzyvwz|yw xoyvyw xwvqy oqqvo oz|vo oqzvyz o|yvu ouwwqvq o|wzuv|yxx|y yu|vq ooyvq o|yvzx qvu o|wvqq uwvy oozwwvyo oozqyvxo|oq youv|y ovoy oxu|v|w yqv ouv|y owzvoo wuo|vou w|qzvzwyouu yxvw uouvwo oywzvwo o|yxvqo oyyvqq oxzzvq xo||voy xoyvwuxuww wozvq yuv| oyuvwo ouxvo ox|wvq o|o|v| |qwqvxw |ouvyzoquwxy zyxv|o zqyvu| oyq|vx oxqwvy oyzvz ooxvu ozzvx owovyu

oqu |xvyy v|u ozuvy w|vyy |xvw |vxy uoyyzov|o |zqyyyvoquw |xvww uvxw owq|vx xvzx oyovu |vqy xw|zvzw xyzuzvzxuqzy |yzv|z zvqw qvuy v|x oyovww wqvyw wzyv|w yqu|vowwoz |wuvz owvu o||vu oqv xzyv ouuv|| yozqovu zzqvzoy|wu xqxvwy |yvx qovou uwvyx oz|zvxy ywv| xqqxvzq yq|qxvzw|yw xuv|u uv|z qywvyu zxvy qxovww |quvz yxwoqv yuquu|vy|yxx|y yzxvqq ouyv| xzwvwx xyvzq yvoq wqzvy uzyqzvu x|yzqzvw|o|oq yyqvx| zzvq |ov|x zzovw| w|ov|| ozvo x|o||xvuz |yqqvwyouu zzyvzq yq|vwz |zzzvz |qvy |wzuv zyvxq |xowu|v |uuoqovuuxuww ooxvu o|wvu |wyvww |zvu |ux|vo zw|vzo |xuwvy |uwovxoquwxy oz|yv| owzuv| |wxyvwu ||yvqu |yzuvzx |yyovq o|uwvo o|uywvo

oqu |yvq vq |qyvqq ozwv|u |zvwq qzvww o|wuzvyo ooq|xxvzyquw |yvzu uv ooqvyu ozzv|w uxxv ozqv| owzzwvyy oxxuq|vouuqzy |yyvx| wvxo yvqq ozwvqy |zwvzq ozwv zzvx qyqqovoowoz |xv| oyvyz ozxovou |wwvw uxqvq| uquvq| |ozqxv|| qwuzzzvw|oy|wu uyqvzo ||v|| |wvyy wquvw wxuvox zvuo |oxy|vw q|qyvoq|yw ux|vuw yvw |xov|u |uvz uuvyu |wqvy o|xxovzu q|wvwzyxx|y yzvwo o|xvu |uqovuy qvw |yvy wwzvx o|wuyov|y owxuqzyvowo|oq xzzvwz yzv|y |q|vxq ovzu uqwvuq oqqvz oyuyyxuv|u oyouoz|vuoyouu wxyvq xuuv|u uoyxvo |yywvo uzvw| uouvu ooq|ozvo oqyq|vqxuww oqqvqu oowvoz uoy|vw |yvx uwuvuw uux|vxx woyyqzvow zqzvqwoquwxy owquvu owov|u uouvx |zxyv uvw uxoqvqu uy|yozvuw uuqqvuo

xy

oqu

|

cedilsup1sup3plusmnsup2

microdegpara cedilsup1sup3 cedilordm cedilsup3degmiddotmdegemdegkk~qg mdegkk~o


Table 13 (MPIIO)

plusmnsup2sup3regmacrdegplusmnplusmnsup2sup3

regRraquodegordm~~degraquo

oqu

qv

qvz

ovqq

ov

|vq

vwy

quw

vyu

ovq

ovzz

vx

yvqq

yvox

uqzy

xvq

|vx

|vzq

xvo

oovw

oovxu

woz

oqvz

wvqu

vy

oqvy

|vuz

vy

oy|wu

ozvwy

oyvwq

oyvy

ozvz|

uxvw

uxvz

|yw

|yvzx

|vyu

wvy

|xvxo

vq

xv|

yxx|y

yvy

xyvoz

xqvqq

xuv||

oyvqo

oovz

o|oq

zyv|o

wqvu

zovu

zvx

owwvw

oyvzx

youuoxvoq

oqyvx

ouovzu

oqov

x|vqw

uvux

xuwwuyuvoo

|wvuo

uwvxu

oyovqq

wuvqq

||zv|y

oquwxyyqxvu

uwyvux

uwxvzx

ozuvqy

|uwv|

|uuvx

oqu

ovou

qv|o

qvy|

qvyx

|vy

|voq

quw

v|

qvx|

ovq

ovw

yvu

yvz

uqzy

uvxy

ovu

vqw

vyw

ovu

ovou

woz

zv|w

vyu

xv

xvoq

xv

uv|

oy|wu

ov|z

uvyz

oqvou

zvzw

uzvyo

uwvwz

|yw

||vz

zv|w

uv|z

qvy

wvwq

zovy

yxx|y

xvxw

v

uyv

|wvy|

oxqvz

ouzv|q

o|oq

w|vuu

u|vz

wyvx

ywvzz

y|vw

yvuz

youuooyvu|

yovw

oyqvoqvxo

||xvyo

|vx

xuwwwwvww

o|ovu

|uvqzyvuy

|yxv|

uqvw

oquwxy|xvo|

oovux

qovwu|qyvqz

uqwvo

x|v||

|

xy

acutedeg

microdegpara

degmiddotmdegRemdegkk~qg

cedilsup1sup3

cedilordm

regmacrdegplusmn

plusmnsup2sup3regmacrdegplusmnplusmnsup2sup3

regRraquodegordm~~degraquo

oqu

ovou

ovoo

ovq

ovo

vo

vwx

quw

vx

vow

vo

vxu

uvxo

yvo

uqzy

xvoq

uvou

uv|

xvqu

zvo

oovy

woz

oqv

wvx

wvy

oqvo

ozvwo

vo

oy|wu

qvoq

ovo

ovux

qvq|

|yvyq

uyvou

|yw

uqvyo

|yvz

|uvy|

|wvu|

wovo|

wyvzu

yxx|y

wqvox

|v|y

yyvww

wvy

ouovz

ouwvqq

o|oqoxyv|w

o|vuq

oxvyo

oxovz

wvw

xwv||

youuwovuu

xyvox

u|vzq

yzvxw

xoovux

x|xvwu

xuww|xvo

uzvx

u|vyz

xqvozoquqvw

zuyv

oquwxyzv|y

yovz

wqvou

qyvuqooo|vuoqovu|

oqu

ovo|

qvxy

qvzy

ovo

voz

|v

quw

vz

qvzu

ovzq

vy

uv|u

yvx

uqzy

uvy|

vqx

|v

uvuz

wvwq

ovxo

woz

zv|

|vx

vwq

wvwu

ovyy

yvqu

oy|wu

owvzu

vu

oyvox

owvoq

|xvxo

xuvwx

|yw

|vy|

ouv

|ov|u

|xvx

qvq

oquvwq

yxx|y

xvq|

|vu

uxvx|

yvwo

oxuv||

owov|w

o|oqoxqvyw

v|

wyvq

ovux

|qvo

|vu

youuz|vwo

ouzvwq

o|vy

wuvqw

yovy

yywvz

xuwwxxvyu

|qv

uy|voy

xoqvwoxqvwwo|yvzw

oquwxywxwvxq

xov

wqvx

zqwvu|uovxzoqvu

xy

degmiddotmdegRemdegkk~qg

cedilsup1sup3

cedilordm

regmacrdegplusmn

plusmnsup2

microdegpara

|


521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`


[Plot: Performance [MiB/s] versus TransferSize [B] (1024 to 1048576) for sequential and random access; Read and Write, SingleFile and FilePerProc, home and work, N = 32, 256, 1024]

Fig 20 POSIX


[Plot: Performance [MiB/s] versus TransferSize [B] (1024 to 1048576) for sequential and random access; use view and collective, home and work, N = 32, 256]

Fig 21 MPIIO


Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`


[Diagram labels: ETERNUS DX80 1-18 (dx01 to dx18), each of DX80 01 to DX80 18 carrying data_01 to data_05, home and work, under IO server 1 (io1); ETERNUS DX80 19-36 (dx19 to dx36), each of DX80 19 to DX80 36 carrying data_06 to data_10, data or sysdata, and work, under IO server 2 (io2); SRFS file systems]

Fig 22 File system configuration


Table 15 OcircOtilde7Y`|J

SRFS file system   QFS file system   Size per unit   Total size                    DAU     Remarks
data_01            QFS_data_01       2109888 MB      2109888 x 36 = 72.4 TB        1024k
data_02            QFS_data_02       2109888 MB      2109888 x 36 = 72.4 TB        1024k
data_03            QFS_data_03       2109888 MB      2109888 x 36 = 72.4 TB        1024k
data_04            QFS_data_04       2109888 MB      2109888 x 36 = 72.4 TB        1024k
data_05            QFS_data_05       3695488 MB      3695488 x 36 = 126.9 TB       1024k
home               QFS_home          3695488 MB      3695488 x 36 = 126.9 TB       512k
work                                 524288 MB                                             io2

SRFS file system   QFS file system   Size per unit   Total size                    DAU     Remarks
data_06            QFS_data_06       2109888 MB      2109888 x 36 = 72.4 TB        1024k
data_07            QFS_data_07       2109888 MB      2109888 x 36 = 72.4 TB        1024k
data_08            QFS_data_08       2109888 MB      2109888 x 36 = 72.4 TB        1024k
data_09            QFS_data_09       2109888 MB      2109888 x 36 = 72.4 TB        1024k
data_10            QFS_data_10       3695488 MB      3695488 x 36 = 126.9 TB       1024k
data               QFS_data          3695488 MB      3695488 x 24 = 84.6 TB        1024k
sysdata            QFS_sysdata       3695488 MB      3695488 x 12 = 42.3 TB        16k
work               QFS_work          524288 MB       524288 x 144 = 72 TB          1024k   dx01 to dx36
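The total sizes in Table 15 follow from multiplying the per-unit size by the number of units. A minimal sketch of that check is given below, assuming binary units (1 TB = 1024 × 1024 MB); that conversion is an assumption, chosen because it reproduces the listed totals. The helper name `total_tb` is illustrative only.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units, 1 TB = 2^20 MB


def total_tb(size_mb, count):
    """Total capacity in TB for `count` units of `size_mb` megabytes each."""
    return size_mb * count / MB_PER_TB


# (per-unit size in MB, number of units) as listed in Table 15
rows = {
    "data_01": (2109888, 36),
    "home": (3695488, 36),
    "data": (3695488, 24),
    "sysdata": (3695488, 12),
    "work": (524288, 144),
}

for name, (size_mb, count) in rows.items():
    print(f"{name}: {size_mb} MB x {count} = {total_tb(size_mb, count):.1f} TB")
```

With decimal units (1 TB = 10^6 MB) the first row would come out near 76 TB rather than 72.4 TB, so the table appears to use the binary convention.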


6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 InfiniBand QDRegraveBCD Fat Treeicircgtlaquoq113aelig13S 18iumlIumlAEligicircgtlaquoordf113aelig13 LeafWdividenaccedil7Iumlordf 2Igravecenttimeseumln1accedil7IumlcdoiumlIuml QDR ordf 1 7gtYZeumlnqaccedil7Iuml 2 Igravelaquogt1 iumlIuml 2 QDR ordfYZeumln(QDR Iacutegt 18 j2Igravegt 36WX)qoiumlIuml`ZaringbrvbarsectT 8GBs (4GBsj2) AAEligpoundq 13aelig13 5PDHLeafaccedilyacute7 SpineWdividenaccedilordf

9iexcllaquosecto LeafaccedilGu Spineaccedil QDRgtYZeumlnqbrvbarpound 113aelig13cd SpineIacutegt 9j2Igrave=18 QDRordfegraveBeumlnq QDRordf 5PoumlLeaf accedil3xmacrDownlink ordf 36 Uplink ordf 18 Xgt Uplinkshy 50WXsectmn 50_ccedilaringWdividenotgtqBX900gta$fY`Ecircfrac34h Fig 23GqBX900gt 50_ccedilaring13aelig13ocircotilde gtthornordf]GqSpineLeafaccedilW13aelig13SoiumlIumlYZGEcircfrac34h Fig 24Gqmmgt13aelig13S nAumliumlIuml nAuml(niquest9)UgraveD nU9Auml(nAgrave9) SpineaccedilegraveBGqlaquo13aelig13laquoiumlIumlordfcent13aelig13iumlIumlcdOcirc$ltAacuteoumlSpineaccedilcd Leafaccedil(2Igrave) QDR2(1j2Igrave)ordfegraveBeumln(2Igrave)LeafaccedilcdiumlIumlQDR2gtYZeumlnqDotildeCmSpineaccedilcd Leaf accedilordmAcirc13aelig13SPAtildeWiumlIumlWnoteumlnqmDHyenwiumlIumlTEacuteOcirc$lt GoumlWiumlIuml 1 W 10 ordflt GoumlWgtn~n FAringordf+XqX13aelig13Sotilde gtiuml_ccedilaring

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq
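The link counts quoted for the Fat Tree in section 6.1.1 (18 nodes and 2 Leaf switches per chassis, 9 Spine switches, 36 downlinks against 18 uplinks) can be checked with a short calculation. This is only a sketch: the constant and function names are illustrative, and the 4 GB/s per QDR link is taken from the 8 GB/s (4 GB/s × 2) per-node figure given in the text.

```python
NODES_PER_CHASSIS = 18   # nodes in one chassis
LEAF_PER_CHASSIS = 2     # Leaf switches in one chassis
SPINE_SWITCHES = 9       # Spine switches above the Leaf level
QDR_GB_PER_S = 4         # per-link figure used in the text (4 GB/s x 2 = 8 GB/s)


def chassis_links():
    """Return (downlinks, uplinks, uplink ratio, per-node GB/s) for one chassis."""
    downlinks = NODES_PER_CHASSIS * LEAF_PER_CHASSIS  # one QDR link per node per Leaf
    uplinks = SPINE_SWITCHES * LEAF_PER_CHASSIS       # one QDR link per Leaf per Spine
    ratio = uplinks / downlinks
    node_bandwidth = QDR_GB_PER_S * LEAF_PER_CHASSIS
    return downlinks, uplinks, ratio, node_bandwidth


print(chassis_links())  # (36, 18, 0.5, 8)
```

The ratio of 0.5 is the 50% uplink figure stated in the text: traffic leaving a chassis has half as many links available as traffic between the nodes and their Leaf switches.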

[Diagram labels: chassis 1 to chassis 119; nodes bx0001, bx0002, ..., bx0018 under Leaf(1-1) and Leaf(1-2); Spine(1), Spine(2), ..., Spine(9); nodes bx2125, bx2126 under Leaf(119-1) and Leaf(119-2)]

Fig 23 BX900a$fY`icirc

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq


[Diagram labels: Spine switches SW 1 to SW 9, Leaf switches, and the nodes 1 to 18 of one chassis (18 nodes per chassis)]

Fig 24 iumlIumlWaccedilatildeAcirc

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the communication method
  Mode        RDMA           RDMA            NORDMA
  Processes   1 to 1024      1025 or more    -
  Threshold   32768 bytes    16384 bytes     524288 bytes

Table 17 RDMA mode: switching between the Normal Send method and the Send on Request method
  Size                      0 KB to under 16 KB   16 KB to under 32 KB   32 KB or more
  1 to 1024 processes       Normal Send           Normal Send            Send on Request
  1025 to 2048 processes    Normal Send           Send on Request        Send on Request
  2049 processes or more    NORDMA                Send on Request        Send on Request

Table 18 NORDMA mode: switching between the Normal Send method and the Send on Request method
  Size             0 KB to under 512 KB   512 KB or more
  All processes    Normal Send            Send on Request
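The thresholds in Table 16 can be read as a simple decision rule: below the threshold the Normal Send method is used, at or above it the Send on Request method. The sketch below encodes only that rule. The names `threshold_bytes` and `select_method` are hypothetical, 1 KB is taken as 1024 bytes (consistent with 32768, 16384 and 524288 bytes), and the 2049-process row of Table 17 is deliberately not modeled.

```python
KB = 1024  # assumption: 1 KB = 1024 bytes, matching the byte counts in Table 16


def threshold_bytes(mode: str, processes: int) -> int:
    """Switching threshold from Table 16 for the given mode and process count."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    mode = mode.lower()
    if mode == "nordma":
        return 512 * KB  # 524288 bytes
    if mode == "rdma":
        # 1 to 1024 processes: 32768 bytes; 1025 or more: 16384 bytes
        return 32 * KB if processes <= 1024 else 16 * KB
    raise ValueError(f"unknown mode: {mode!r}")


def select_method(mode: str, processes: int, size_bytes: int) -> str:
    """Normal Send below the threshold, Send on Request at or above it."""
    if size_bytes < threshold_bytes(mode, processes):
        return "Normal Send"
    return "Send on Request"


print(select_method("rdma", 512, 20 * KB))    # Normal Send
print(select_method("rdma", 2048, 20 * KB))   # Send on Request
```

The two calls differ only in process count, which shows why the same 20 KB message can take a different path on a larger run.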

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Message size intervals
  Parallelism   Allreduce, Reduce, Bcast, SendRecv   Other MPI functions
  256           32 bytes                             128 bytes
  512           same as above                        128 bytes
  1024          same as above                        256 bytes
  2048          same as above                        2048 bytes
  4096          same as above                        2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()               ! measurement start
        call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                     b,nelem,MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD,ierr)
        ttime2 = MPI_WTIME()               ! measurement end
        ttime0 = ttime2 - ttime1
c       slowest rank's elapsed time is collected on rank 0
        call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                  MPI_COMM_WORLD,ierr)
        if(myid.eq.0) then
          max_time = max(max_time,ttime)
          min_time = min(min_time,ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig 25 Program (excerpt)
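The bookkeeping in the Fig 25 loop can be checked without MPI: each iteration keeps the slowest rank's elapsed time (the `MPI_REDUCE` with `MPI_MAX`), and rank 0 then tracks the maximum, minimum and sum of those values. The sketch below mirrors that logic; `summarize_timings` and the sample timings are illustrative assumptions, not data from the report.

```python
def summarize_timings(per_iteration_rank_times):
    """Return (max_time, min_time, all_time) as accumulated in Fig 25.

    per_iteration_rank_times: one list per iteration holding the elapsed
    time measured on each rank for that iteration.
    """
    max_time = float("-inf")
    min_time = float("inf")
    all_time = 0.0
    for rank_times in per_iteration_rank_times:
        ttime = max(rank_times)          # MPI_REDUCE(..., MPI_MAX, root 0)
        max_time = max(max_time, ttime)
        min_time = min(min_time, ttime)
        all_time += ttime
    return max_time, min_time, all_time


# Hypothetical timings (seconds) for two iterations on two ranks.
print(summarize_timings([[1.0, 2.0], [0.5, 1.5]]))
```

Taking the per-iteration maximum first means the statistics describe the time until the last rank finishes, which is the relevant cost of a collective call.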


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Fig 26 plots: Allreduce, Reduce, Reduce_scatter, Allgather, Gather and Allgatherv at 512-way parallelism. Each panel shows the rdma and nordma series over a horizontal axis running from 0 to 80000.]

Fig 26 MPI functions (512-way parallel, part 1)


[Fig 27 plots: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (Rank 0 to Rank 511) at 512-way parallelism. Each panel shows the rdma and nordma series over a horizontal axis running from 0 to 80000.]

Fig 27 MPI functions (512-way parallel, part 2)


brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq


[Fig 28 chart: one row per MPI function, one column for each of 256, 512, 1024, 2048 and 4096-way parallelism, with a scale from 0 to 100.]

Fig 28 RDMA and NORDMA (0 KB to 8 KB)


63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq


      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()                 ! measurement start (total)
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,1,
     &               MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,1,
     &               MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,1,
     &               MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,1,
     &               MPI_COMM_WORLD,ireq(4),ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()                 ! measurement start (compute)
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1d0
          b1(i,j) = 1d0
          c1(i,j) = 0d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                 ! measurement end (compute)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()                 ! measurement end (total)
      ttimeDD = ttime8 - ttime7            ! computation time
      ttimeal = ttime6 - ttime5            ! total elapsed time
      ttimesd = ttimeal - ttimeDD          ! total minus computation

Fig 29 WO=5PDH^_Z
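The three derived quantities at the end of the Fig 29 listing reduce to simple differences of the four timestamps. The sketch below restates that arithmetic; `exposed_time` is an illustrative name, and the timestamps in the example are made up. The result is the part of the total elapsed time that was not covered by the matrix computation, so it is small when the nonblocking transfers proceed during the computation.

```python
def exposed_time(ttime5: float, ttime6: float,
                 ttime7: float, ttime8: float) -> float:
    """Total elapsed time minus computation time, as in Fig 29.

    ttime5/ttime6 bracket the whole section (isend/irecv ... waitall);
    ttime7/ttime8 bracket only the computation.
    """
    ttimeal = ttime6 - ttime5   # total elapsed time
    ttimeDD = ttime8 - ttime7   # computation time
    return ttimeal - ttimeDD


# Hypothetical timestamps in seconds.
print(exposed_time(0.0, 10.0, 1.0, 9.0))
```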


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()
c       rank 0 sends NNN messages to rank IIRANK
        do 120 II = 1, NNN
          if(myid.eq.0)
     &      call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                    IIRANK,II,MPI_COMM_WORLD,ierr)
          if(myid.eq.IIRANK)
     &      call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                    0,II,MPI_COMM_WORLD,istatus,ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig 30 One-to-one communication program (excerpt)
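The outer loop of Fig 30 visits every `intcpu`-th rank as the receiver, and each recorded `timew` entry covers `NNN` transfers of `nsize` double-precision values. The sketch below lists the receiver ranks and converts one such entry to a transfer rate. The function names are hypothetical, 1 MB is assumed to be 1024 * 1024 bytes, and the 4-second elapsed time in the example is invented for illustration.

```python
MB = 1024 * 1024  # assumption: binary megabyte


def target_ranks(nodes: int, intcpu: int) -> list:
    """Receiver ranks visited by `do III = intcpu, nodes-1, intcpu`."""
    if intcpu <= 0:
        raise ValueError("intcpu must be positive")
    return list(range(intcpu, nodes, intcpu))


def transfer_rate_mb_s(nsize: int, repeats: int, elapsed_s: float,
                       bytes_per_element: int = 8) -> float:
    """Rate in MB/s for `repeats` sends of `nsize` 8-byte elements."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return nsize * bytes_per_element * repeats / elapsed_s / MB


print(target_ranks(8, 1))    # every other rank within an 8-rank node
print(target_ranks(24, 8))   # one rank per node when stepping by 8
# 128 MB per message (16 Mi doubles), 100 repeats, hypothetical 4 s
print(transfer_rate_mb_s(16 * MB, 100, 4.0))
```

Choosing `intcpu` as the number of ranks per CPU or per node is what lets the same loop probe intra-CPU, intra-node and inter-node paths.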


NODE
  CPU0: Core ID=0 (Rank 0), Core ID=2 (Rank 1), Core ID=4 (Rank 2), Core ID=6 (Rank 3)
  CPU1: Core ID=1 (Rank 4), Core ID=3 (Rank 5), Core ID=5 (Rank 6), Core ID=7 (Rank 7)

Fig 31 Assignment of MPI ranks to cores within a node
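The rank-to-core assignment in Fig 31 follows a regular pattern: ranks 0 to 3 occupy the even core IDs and ranks 4 to 7 the odd ones. A compact way to express it is sketched below. The helper names are hypothetical, and the reading that even core IDs belong to CPU0 and odd core IDs to CPU1 is taken from the figure layout, so it should be treated as an assumption.

```python
def core_id(rank: int) -> int:
    """Core ID for a node-local rank (0..7) under the Fig 31 assignment."""
    if not 0 <= rank < 8:
        raise ValueError("rank must be in 0..7")
    return 2 * (rank % 4) + rank // 4


def cpu_of_core(core: int) -> int:
    """Assumed socket of a core: even IDs on CPU0, odd IDs on CPU1."""
    return core % 2


for rank in range(8):
    core = core_id(rank)
    print(f"rank {rank} -> core {core} (CPU{cpu_of_core(core)})")
```

Under this assignment ranks 0 to 3 share one CPU, so a transfer from rank 0 to rank 3 stays inside a CPU while a transfer from rank 0 to rank 4 crosses to the other CPU.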

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

[Fig 32 diagram: data transfers among processes 0 to 7.]

Fig 32 One-to-one data transfer pattern


Table 20 Transfer rates (within a chassis)
  Execution mode             Flat MPI   Hybrid (72x2)   Hybrid (36x4)   Hybrid (18x8)
  Per-process rate (MB/s)    723.8      1447.4          2894.2          5346.7
  Per-node total (MB/s)      5790.6     5789.9          5788.4          5346.7
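The two rows of Table 20 are related by the number of MPI processes placed on each node: 8 for flat MPI, then 4, 2 and 1 for the three hybrid layouts. The sketch below checks that relation, reading the Table 20 figures as MB/s with one decimal place (7238 as 723.8, and so on). The dictionary layout and the name `node_total` are illustrative assumptions.

```python
# mode: (per-process rate in MB/s, processes per node, reported per-node total)
TABLE_20 = {
    "flat MPI":      (723.8, 8, 5790.6),
    "hybrid (72x2)": (1447.4, 4, 5789.9),
    "hybrid (36x4)": (2894.2, 2, 5788.4),
    "hybrid (18x8)": (5346.7, 1, 5346.7),
}


def node_total(rate_per_process: float, processes_per_node: int) -> float:
    """Aggregate rate of one node when all its processes transfer at once."""
    return rate_per_process * processes_per_node


for mode, (rate, procs, reported) in TABLE_20.items():
    computed = node_total(rate, procs)
    rel_err = abs(computed - reported) / reported
    print(f"{mode:14s} computed {computed:8.1f}  reported {reported:8.1f}  "
          f"rel. diff {rel_err:.1e}")
```

The first three layouts give nearly the same per-node total, while the single-process layout (18x8) falls below it.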

Table 21 Transfer rates (between chassis, flat MPI)
          144 MPI                  192 MPI                  224 MPI
  Node    Rate (MB/s)  IB shared   Rate (MB/s)  IB shared   Rate (MB/s)  IB shared
  1       724.6        -           467.0        yes         467.1        yes
  2       724.4        -           467.0        yes         467.1        yes
  3       724.6        -           467.0        yes         467.1        yes
  4       724.2        -           724.3        -           467.0        yes
  5       724.3        -           724.4        -           467.1        yes
  6       724.2        -           724.0        -           723.7        -
  7       724.1        -           723.9        -           724.6        -
  8       724.4        -           724.0        -           724.5        -
  9       724.9        -           724.3        -           724.0        -
  10      -            -           467.0        yes         467.0        yes
  11      -            -           467.0        yes         467.2        yes
  12      -            -           466.9        yes         467.1        yes
  13      -            -           -            -           467.1        yes
  14      -            -           -            -           467.0        yes

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq
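The pattern of slow and fast nodes in Table 21 is consistent with nodes being paired as 1 and 10, 2 and 11, 3 and 12, and so on, with a node slowing down only when its partner is also in use. The sketch below reproduces which nodes are affected for the 144, 192 and 224 process runs. It assumes that nodes are filled in order from node 1, that each node runs 8 flat-MPI processes, and that the pairing continues as n and n+9 for all nine pairs; the function names are hypothetical.

```python
def nodes_per_chassis(total_processes: int, chassis: int = 2,
                      processes_per_node: int = 8) -> int:
    """Nodes in use in each chassis for a flat-MPI run split over `chassis`."""
    return total_processes // chassis // processes_per_node


def partner(node: int) -> int:
    """Assumed pairing of nodes 1..18: n with n+9."""
    if not 1 <= node <= 18:
        raise ValueError("node must be in 1..18")
    return node + 9 if node <= 9 else node - 9


def shared_nodes(nodes_in_use: int) -> set:
    """Nodes whose partner is also in use, assuming nodes 1..N are filled."""
    used = set(range(1, nodes_in_use + 1))
    return {n for n in used if partner(n) in used}


for total in (144, 192, 224):
    n = nodes_per_chassis(total)
    print(total, "processes:", n, "nodes per chassis, shared:",
          sorted(shared_nodes(n)))
```

With 9 nodes in use no pair is complete, with 12 nodes the pairs 1/10, 2/11 and 3/12 are complete, and with 14 nodes five pairs are complete, which matches the rows marked in Table 21.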


Table 22 Transfer rates (between chassis, hybrid)
          Hybrid (112x2)           Hybrid (56x4)            Hybrid (28x8)
  Node    Rate (MB/s)  IB shared   Rate (MB/s)  IB shared   Rate (MB/s)  IB shared
  1       933.9        yes         1856.7       yes         3640.8       yes
  2       933.7        yes         1856.4       yes         3691.9       yes
  3       934.6        yes         1856.6       yes         3653.6       yes
  4       934.7        yes         1857.0       yes         3645.7       yes
  5       934.2        yes         1865.1       yes         3664.6       yes
  6       1438.5       -           2860.6       -           5380.8       -
  7       1448.8       -           2911.8       -           5377.9       -
  8       1450.0       -           2867.6       -           5378.9       -
  9       1449.5       -           2910.0       -           5378.4       -
  10      933.7        yes         1879.7       yes         3640.8       yes
  11      934.0        yes         1877.5       yes         3671.5       yes
  12      934.3        yes         1865.1       yes         3655.9       yes
  13      934.8        yes         1881.1       yes         3645.7       yes
  14      933.8        yes         1869.4       yes         3699.1       yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq


Ta

ble

23iacuteXCcedilEgrave

MPI

_allt

oall

vUacuteSTUcircR

vRcopyUumlYacutebrvbarTHORN

szligvcopyUumlYacutebrvbaragrave

pvRaacuteacircatildeNR

+aumleAumlovqaringaeligPgR

Oslash 5 Iuml

copy Igrave - a 5 (MB)

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

R

R

szligR

pR

144m

pi

512

3R6

0 11

14

118

0

145

9 3

60 R

138

6

36

0 13

87

100

1

31

124

1

25

2plusmn

(BC

DEacute

) 51

2 3R

60

111

4

118

0

148

2 11

16 R

146

9

63

0 14

75

100

1

33

132

1

32

3plusmn

(BC

DEacute

) 51

2 3R

60

111

4

29

0 14

81

714

13 R

153

3

29

0 14

21

100

1

33

138

1

28

448m

pi

512

14R

40

142

6

414

0

182

4 21

272

1 R

170

3

78

0 16

32

100

1

28

119

1

14

504m

pi

512

14R

45

142

7

415

8

180

4 27

242

6 R

160

5

79

0 14

33

100

1

26

112

1

00

2048

mpi

51

2 37R

69

182

1

298

8 24

93

465

6 R

250

1

465

6 25

04

100

1

37

137

1

38

2plusmn

51

2 37R

69

166

7

2211

6

262

5 67

564

6 R

264

7

298

8 25

04

100

1

57

159

1

50

3plusmn

(BC

DEacute

) 51

2 37R

69

166

7

2211

6

262

1 69

524

9 R

269

3 6

298

8 24

59

100

1

57

162

1

47

3072

mpi

51

2 69R

56

258

7 28

399

9 27

34

5371

54 R

286

3 9

507

7 26

44

100

1

06

111

1

02

68m

pij

2om

p 10

24

3R5

7 12

10

11

170

11

80

91

9 R

122

3

28

5 12

21

100

0

97

101

1

01

120m

pij

2om

p 10

24

9R3

3 12

89

215

0

127

0 12

191

6 R

130

4

47

5 12

94

100

0

99

101

1

00

132m

pij

2om

p 10

24

8R4

1 13

79

216

5

116

1 15

211

6 R

119

5

48

3 11

38

100

0

84

087

0

83

432m

pij

2om

p 10

24

17R

64

162

4

119

8 16

28

1925

43 R

161

2 1

129

0 16

18

100

1

00

099

1

00

30m

pitimes4

omp

1024

4R

38

598

115

0

674

12

13 R

683

27

5 68

2

100

1

13

114

1

14

2plusmn

(BC

DEacute

) 10

24

4R3

8 59

8

115

0

678

12

13 R

690

27

5 70

0

100

1

13

115

1

17

36m

pitimes4

omp

1024

6R

30

603

118

0

684

11

16 R

697

63

0 70

3

100

1

13

116

1

17

2plusmn

(BC

DEacute

) 10

24

6R3

0 60

3

29

0 68

8

714

13 R

702

29

0 71

0

100

1

14

116

1

18

126m

pitimes4

omp

1024

16R

39

822

3

415

8

805

27

242

6 R

738

1

79

0 71

5

100

0

98

090

0

87

180m

pitimes4

omp

1024

18R

50

925

615

0

892

40

283

2 R

776

127

5 75

3

100

0

96

084

0

81

2plusmn

(BC

DEacute

) 10

24

18R

50

925

712

9

861

29

352

6 R

808

109

0 76

8

100

0

93

087

0

83

612m

pitimes4

omp

1024

45R

68

110

2

349

0 10

34

525

9 R

111

4

349

0 10

46

100

0

94

101

0

95

2plusmn

(BC

DEacute

) 10

24

45R

68

110

2

2810

9

920

57

634

9 R

960

5

358

7 83

4

100

0

83

087

0

76

17m

pitimes8

omp

1024

3R

57

338

1

117

0

430

9

19 R

432

28

5 43

3

100

1

27

128

1

28

90m

pitimes8

omp

1024

15R

60

504

109

0 44

9

243

8 R

471

109

0 47

1

100

0

89

093

0

94

357m

pitimes8

omp

1024

53R

67

731

418

7 69

2

576

3 R

711

418

7 72

9

100

0

95

097

1

00

2plusmn

(BC

DEacute

) 10

24

53R

67

731

3410

5

642

57

685

3 R

628

6

438

3 60

2

100

0

88

086

0

82


brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

ucircuumlyacutethorn

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net


A

FX1

HPC

C B

MT

G-F

FTEfeaZUuml^13~

-DH

PCC

_FFT

_235GmWgt

FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave

EP-

DG

EM

M-a5^regbrvbarpoundaringaeligccedil13h-a5

(15

MB

Cor

e)UacuteCordfzyacuteCq

Tabl

e A

-1feaZUuml^13~]-a5^regCD

FX1

HPC

C B

MT

UgraveccedilegravekeacuteUgraveccedilknUgraveccedilplusmnsup2sup3

~~regreg

Ugraveccedilmmnecircecirckccedilnecircfrac12

regReeumlgecirckccedilnecircfrac12

ndegplusmnsup2

ecirckccedilpUgraveecircfrac12frac12plusmnsup2sup3degmiddot

plusmnsup2ordmdegsup2sup1plusmnsup2sup3degmiddot

eacuteplusmn~+igraveiacute8

nmcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

reg~

reg~v

reg

sup2sup3plusmn

qvqyuqou

ovwyy

qvquzqq

vxzwo

qvzz

vyoy

zzv|w|

qvuqqy

uvuyx

qyu

sup2sup3plusmnqvqy|zx

ovwz

qvquyxz

vxuo

qvzuw

vyowx

zzv|z

qvxx|y

yvox

qwy

sup2sup3plusmn

qvxoquq

uvoxzo

qvoqqz

yyvuox

w|vyzx

vyoxx

zzvzyq

qv|qq

vqy

yy

sup2sup3plusmnqvxoqwwq

uvqwxx

qvooqzu

yyvxuwo

w|vyyq

vyouu

zzvzzx

qvux|

oqvz|

xw

sup2sup3plusmn

|vwq|qqq

|uvwz

qvqoxxw

qv|

o|uuv|q

vyxy

zzv||wq

qvqwzuw

oqvywyu

sup2sup3plusmn|vw||wqq

|vyz

qvuxu

yyyvuwwu

o|uyvyw

vy|q

zzv|uy

qvxwzqu

ovxuo

zq|z

sup2sup3plusmn

|vwuzqq

uuvxqwq

qvqoxxw

wwyv|zq

o|uv|q

vy|ox

xvuuwu

qvowox

oqvowuyq

sup2sup3plusmn|vwoquxqq

|yvzyxu

qvuzou

yvqwq|

o|uyvzxu

vy|qw

xvuuz

qvqwuy

ovuz

|y

uxqqq

wxqqq

|xqqqq

|qqqq

w

frac14icircRfrac14Rplusmndegreg

szligsup3degicircRRszligcedilszligiumliumlRethvRntildeRfrac14degicircRccedilogravemicroplusmnregntildemicro~sup1Oslashsup3sup2moRenecircfrac12oacuteregplusmnAumlocircotildeg

egravekeacuteOslashfrac14knicircccedilpegravekszligszligOslashmmnOslash|xRccedilpOslashpOslashfrac14knRccedilpegravekszligszligOslashfrac12ecircfrac12eacuteeacuteszlignRocircotilde

mszligpecircmicircccedilpeacutefrac14UgraveOslashOslashyunRocircotilde

frac12kicircRRkplusmnplusmnplusmnraquodegReacutemiddotplusmnmiddotRkplusmn~plusmnmiddot

-

moRexoszligcedilowszligkoumlg

szligdegicircRkszligyudividentildeRvxUgraveegraveparantildeRpposlashueuqUgravecedilregedegsup2plusmngntildeRo|vxUgravecedilregesup3plusmnregsup2RugraveRnecircfrac12gg

Cuacute uacuteucircicircuumlTyacutethornYacuteN0ucircicircuumlTyacutethornYacuteN0gt

|

xo

xo

egravekszligszligRethvicircRethov|vo

eeumlgicircecirckccedilnecircfrac12RregR0Eeecirckccedilnecircfrac12Rndegplusmnsup2gIcirc-RJ13NOIPQ

~~icircRppRmicrodegdegugraveplusmnsup2ntildeRoreg~cedilsup2ntildeRmplusmnRn


B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Fig B-1 plots: Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv and Gather at 1024-way parallelism. Each panel shows the rdma and nordma series over a horizontal axis running from 0 to 80000.]

Fig B-1 MPI functions (1024-way parallel, part 1)


[Fig B-2 plots: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (Rank 0 to Rank 1023) at 1024-way parallelism. Each panel shows the rdma and nordma series over a horizontal axis running from 0 to 80000.]

Fig B-2 MPI functions (1024-way parallel, part 2)


C RDMA versus NORDMA

Table C-1 Messages of 8 KB to 78 KB
[Table body: one row per MPI function, with columns for 256, 512, 1024, 2048 and 4096-way parallelism and a scale from 0 to 100.]

Table C-2 Messages of 80 KB to 584 KB
[Table body: one row per MPI function, with columns for 256, 512 and 1024-way parallelism and a scale from 0 to 100.]

Values near 100 favor the RDMA method. Values near 0 favor the NORDMA method.


Appendix D [...]

Table D-1 Flat MPI (8mpi)


Fig. D-1 Flat MPI (8mpi). Panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2. Vertical axis from 0.00E+00 to 2.50E+05.


Table D-2 Hybrid [...] 1 (4mpi x 8omp)

Fig. D-2 Hybrid [...] 1 (4mpi x 8omp) (1). Panels: rdma (4mpi x 8omp)-1, nordma (4mpi x 8omp)-1. Vertical axis from 0.00E+00 to 8.00E+04.


Fig. D-3 Hybrid [...] 1 (4mpi x 8omp) (2). Panels: rdma (4mpi x 8omp)-2, nordma (4mpi x 8omp)-2, rdma (4mpi x 8omp)-3, nordma (4mpi x 8omp)-3. Vertical axis from 0.00E+00 to 8.00E+04.


Table D-3 Hybrid [...] 2 (4mpi x 8omp)

Fig. D-4 Hybrid [...] 2 (4mpi x 8omp) (1). Panels: rdma (4mpi x 8omp)-1, nordma (4mpi x 8omp)-1. Vertical axis from 0.00E+00 to 1.80E+04.


Fig. D-5 Hybrid [...] 2 (4mpi x 8omp) (2). Panels: rdma (4mpi x 8omp)-2, nordma (4mpi x 8omp)-2, rdma (4mpi x 8omp)-3, nordma (4mpi x 8omp)-3. Vertical axis from 0.00E+00 to 1.80E+04.


E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

nmb = 128  intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK =  1  TIME = 39305 sec  Ave Speed = 32566145 MB/s
RANK =  2  TIME = 39197 sec  Ave Speed = 32655168 MB/s
RANK =  3  TIME = 39212 sec  Ave Speed = 32643219 MB/s
RANK =  4  TIME = 34341 sec  Ave Speed = 37272765 MB/s
RANK =  5  TIME = 34175 sec  Ave Speed = 37453885 MB/s
RANK =  6  TIME = 34086 sec  Ave Speed = 37552470 MB/s
RANK =  7  TIME = 34329 sec  Ave Speed = 37286226 MB/s
RANK =  8  TIME = 24021 sec  Ave Speed = 53287420 MB/s
RANK =  9  TIME = 23794 sec  Ave Speed = 53794939 MB/s
RANK = 10  TIME = 23805 sec  Ave Speed = 53770236 MB/s
RANK = 11  TIME = 23798 sec  Ave Speed = 53786914 MB/s
RANK = 12  TIME = 23942 sec  Ave Speed = 53462490 MB/s
RANK = 13  TIME = 23931 sec  Ave Speed = 53488046 MB/s
RANK = 14  TIME = 23950 sec  Ave Speed = 53443826 MB/s
RANK = 15  TIME = 24066 sec  Ave Speed = 53186052 MB/s
RANK = 16  TIME = 23798 sec  Ave Speed = 53786327 MB/s
RANK = 17  TIME = 23802 sec  Ave Speed = 53775886 MB/s
RANK = 18  TIME = 23821 sec  Ave Speed = 53733399 MB/s
RANK = 19  TIME = 23794 sec  Ave Speed = 53795866 MB/s
RANK = 20  TIME = 23939 sec  Ave Speed = 53468837 MB/s
RANK = 21  TIME = 23938 sec  Ave Speed = 53470775 MB/s
RANK = 22  TIME = 23939 sec  Ave Speed = 53470333 MB/s
RANK = 23  TIME = 23942 sec  Ave Speed = 53462889 MB/s


Fig E-2 afeaZQR MPIbrvbar

nmb = 128  intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK =  1  TIME = 39370 sec  Ave Speed = 32511775 MB/s
RANK =  2  TIME = 39475 sec  Ave Speed = 32425231 MB/s
RANK =  3  TIME = 39415 sec  Ave Speed = 32474846 MB/s
RANK =  4  TIME = 34392 sec  Ave Speed = 37217925 MB/s
RANK =  5  TIME = 34543 sec  Ave Speed = 37055373 MB/s
RANK =  6  TIME = 34250 sec  Ave Speed = 37372742 MB/s
RANK =  7  TIME = 34709 sec  Ave Speed = 36878286 MB/s
RANK =  8  TIME = 26967 sec  Ave Speed = 47465876 MB/s
RANK =  9  TIME = 26964 sec  Ave Speed = 47471018 MB/s
RANK = 10  TIME = 26955 sec  Ave Speed = 47486289 MB/s
RANK = 11  TIME = 26947 sec  Ave Speed = 47499771 MB/s
RANK = 12  TIME = 26957 sec  Ave Speed = 47483647 MB/s
RANK = 13  TIME = 27020 sec  Ave Speed = 47373054 MB/s
RANK = 14  TIME = 26948 sec  Ave Speed = 47498023 MB/s
RANK = 15  TIME = 27174 sec  Ave Speed = 47104542 MB/s
RANK = 16  TIME = 26956 sec  Ave Speed = 47485054 MB/s
RANK = 17  TIME = 26945 sec  Ave Speed = 47504899 MB/s
RANK = 18  TIME = 26946 sec  Ave Speed = 47502360 MB/s
RANK = 19  TIME = 26967 sec  Ave Speed = 47466158 MB/s
RANK = 20  TIME = 26948 sec  Ave Speed = 47499767 MB/s
RANK = 21  TIME = 26982 sec  Ave Speed = 47438204 MB/s
RANK = 22  TIME = 27087 sec  Ave Speed = 47254830 MB/s
RANK = 23  TIME = 27004 sec  Ave Speed = 47399960 MB/s


Fig E-3 QRfeaZOpenMPIbrvbar

nmb = 128  intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes                   NODE    CPU   coreid
Rank = 0                                                  bx2050  CPU0  0
all time
RANK =  1  TIME = 38110 sec  Ave Speed = 33587213 MB/s    bx2050  CPU1  1
RANK =  2  TIME = 28656 sec  Ave Speed = 44667752 MB/s    bx2050  CPU0  2
RANK =  3  TIME = 37959 sec  Ave Speed = 33720175 MB/s    bx2050  CPU1  3
RANK =  4  TIME = 28658 sec  Ave Speed = 44664664 MB/s    bx2050  CPU0  4
RANK =  5  TIME = 37829 sec  Ave Speed = 33836073 MB/s    bx2050  CPU1  5
RANK =  6  TIME = 28641 sec  Ave Speed = 44691880 MB/s    bx2050  CPU0  6
RANK =  7  TIME = 37892 sec  Ave Speed = 33779924 MB/s    bx2050  CPU1  7
RANK =  8  TIME = 35140 sec  Ave Speed = 36425620 MB/s    bx2051  CPU0  0
RANK =  9  TIME = 35017 sec  Ave Speed = 36554163 MB/s    bx2051  CPU1  1
RANK = 10  TIME = 38916 sec  Ave Speed = 32891010 MB/s    bx2051  CPU0  2
RANK = 11  TIME = 45416 sec  Ave Speed = 28184186 MB/s    bx2051  CPU1  3
RANK = 12  TIME = 35172 sec  Ave Speed = 36392511 MB/s    bx2051  CPU0  4
RANK = 13  TIME = 35053 sec  Ave Speed = 36516468 MB/s    bx2051  CPU1  5
RANK = 14  TIME = 35200 sec  Ave Speed = 36363472 MB/s    bx2051  CPU0  6
RANK = 15  TIME = 35009 sec  Ave Speed = 36561987 MB/s    bx2051  CPU1  7
RANK = 16  TIME = 22996 sec  Ave Speed = 55661441 MB/s    bx2052  CPU0  0
RANK = 17  TIME = 35587 sec  Ave Speed = 35968575 MB/s    bx2052  CPU1  1
RANK = 18  TIME = 35587 sec  Ave Speed = 35968151 MB/s    bx2052  CPU0  2
RANK = 19  TIME = 35712 sec  Ave Speed = 35842525 MB/s    bx2052  CPU1  3
RANK = 20  TIME = 22996 sec  Ave Speed = 55661978 MB/s    bx2052  CPU0  4
RANK = 21  TIME = 36856 sec  Ave Speed = 34729458 MB/s    bx2052  CPU1  5
RANK = 22  TIME = 22994 sec  Ave Speed = 55666382 MB/s    bx2052  CPU0  6
RANK = 23  TIME = 22991 sec  Ave Speed = 55673257 MB/s    bx2052  CPU1  7
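The per-rank result lines in Figs. E-1 to E-3 share one layout (RANK, TIME, Ave Speed), so they can be tabulated mechanically. The following Python sketch is an illustration only and is not part of the report: the names `parse_results` and `summarize` are hypothetical, and the values are read exactly as printed (decimal points appear to have been lost in the listings, so no attempt is made to restore them).

```python
import re

# One result entry, e.g. "RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MBs".
# The unit is accepted both as "MBs" and as "MB/s".
RESULT_RE = re.compile(
    r"RANK\s*=\s*(\d+)\s+TIME\s*=\s*(\d+(?:\.\d+)?)\s*sec\s+"
    r"Ave Speed\s*=\s*(\d+(?:\.\d+)?)\s*MB/?s"
)


def parse_results(text):
    """Return a list of (rank, time, speed) tuples found in text."""
    return [(int(r), float(t), float(s)) for r, t, s in RESULT_RE.findall(text)]


def summarize(rows):
    """Return (min, max, mean) of the speed column."""
    speeds = [speed for _, _, speed in rows]
    if not speeds:
        raise ValueError("no result lines found")
    return min(speeds), max(speeds), sum(speeds) / len(speeds)


if __name__ == "__main__":
    sample = (
        "RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MBs "
        "RANK = 2 TIME = 39197 sec Ave Speed = 32655168 MBs"
    )
    rows = parse_results(sample)
    print(rows)
    print(summarize(rows))
```

Grouping the parsed rows by rank range (for example ranks 1 to 7 versus ranks 8 to 23) is one way to compare the speed classes visible in the listings.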


13

F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig. F-1 Program listing (Flat MPI)

      (main)
      n = nmb*1024*1024/8      ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                ! size
c     NNN Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
        a(i) = 1.0d0
        b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
        call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8d0)/(timew/dfloat(NNN))
     &          /1024d0/1024d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
        b(i) = a(i)
      end do
      return
      end
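The average-speed expression at the end of the excerpt can be checked by hand. The following Python sketch is not part of the report; the function name and the sample timing are illustrative assumptions. It assumes the operators missing from `ave_spd` in the printed listing are multiplication and division, as the units suggest: bytes copied per repetition, divided by the time per repetition, expressed in MB/s with 1 MB = 1024 x 1024 bytes.

```python
def average_copy_speed_mb_s(n_elems, elapsed_sec, reps, bytes_per_elem=8):
    """Average copy speed in MB/s for `reps` copies of `n_elems` elements.

    Mirrors (dfloat(nsize)*8)/(timew/dfloat(NNN))/1024/1024 from the excerpt,
    under the assumption stated in the lead-in.
    """
    if elapsed_sec <= 0 or reps <= 0:
        raise ValueError("elapsed_sec and reps must be positive")
    bytes_per_rep = n_elems * bytes_per_elem   # size of one array copy
    sec_per_rep = elapsed_sec / reps           # mean time of one copy
    return bytes_per_rep / sec_per_rep / 1024.0 / 1024.0


if __name__ == "__main__":
    nmb = 128                      # array size in MB, as in the excerpt
    n = nmb * 1024 * 1024 // 8     # number of real(8) elements
    # Hypothetical timing: 100 repetitions taking 4.0 seconds in total.
    print(average_copy_speed_mb_s(n, elapsed_sec=4.0, reps=100))
```

With these hypothetical inputs each repetition moves 128 MB in 0.04 s, which corresponds to roughly 3200 MB/s.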


Fig. F-2 Program listing (OpenMP)

      (main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8      ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
        write(6,*) 'Array allocation failed'
        stop
      end if
!$OMP end parallel
      nsize = n                ! size
c     NNN Repetition number of times
      NNN = 100
!$OMP parallel private(i),shared(nsize)
      do 100 i = 1, nsize
        a(i) = 1.0d0
        b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
        call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8d0)/(timew/dfloat(NNN))
     &          /1024d0/1024d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
        b(i) = a(i)
      end do
      return
      end
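The structure of the OpenMP excerpt (each thread owns private arrays a and b, times its own repeated copy, and computes its own average speed) can be sketched in Python as follows. This is an illustration under stated assumptions, not the report's program: `copy_benchmark` and `run_threads` are hypothetical names, the array size and repetition count are reduced so the sketch runs quickly, and Python threads do not execute the copies truly in parallel, so the numbers it prints are not comparable with the measurements in this appendix.

```python
import time
from array import array
from concurrent.futures import ThreadPoolExecutor


def copy_benchmark(nmb, reps):
    """Time `reps` copies of an nmb-MB array of doubles; return MB/s."""
    n = nmb * 1024 * 1024 // 8          # number of 8-byte elements
    a = array("d", [1.0]) * n           # per-worker arrays, like threadprivate
    b = array("d", [0.0]) * n
    t1 = time.perf_counter()
    for _ in range(reps):
        b[:] = a                        # element-wise copy, as in atob
    timew = max(time.perf_counter() - t1, 1e-12)   # guard against zero
    assert b[0] == 1.0 and b[n - 1] == 1.0         # copy really happened
    return (n * 8) / (timew / reps) / 1024.0 / 1024.0


def run_threads(num_threads=2, nmb=1, reps=10):
    """Run one copy benchmark per worker thread; return the list of speeds."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(copy_benchmark, nmb, reps)
                   for _ in range(num_threads)]
        return [f.result() for f in futures]


if __name__ == "__main__":
    for tid, speed in enumerate(run_threads(num_threads=2, nmb=1, reps=10)):
        print(f"thread {tid}: {speed:.1f} MB/s")
```

Each worker reports its own speed, matching the per-thread `ave_spd` of the excerpt; aggregating the speeds is left to the caller.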


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

sup2~ yntildeouu |ntildewzz |ntildeuou ovou ontildezx ontildeqyw ovw| ovzz |vqsup2~ xuntildeww ntilde|o ntildeuo qvzz |ntildeqo |ntildey qvz vu vsup2~ wyntildeu| oqntildexx oontildeqyy qvz |ntildezwy untildeuyw qvwz vq vuwsup2~ ontildequwntildexy ountildexxy oxntildeuy qvzu untildezqq xntildewyw qvwu vz vyu

sup2~ yntildeouu zou zo qvzu ontildeqqq ontildeq qvzw qvzo qvzxsup2~ xuntildeww ontildeo ontildezy qvwz ontildew ntildeuuz qv| qvz qvzsup2~ wyntildeu| ntildexuo ntildey| qvz ntildeyzz |ntildeuqw qvz qvzu qvwosup2~ ontildequwntildexy |ntildex untildeqox qvzu |ntildexww untilde|| qvw ovqx qvz

sup2~Oslashreg~plusmn yntildeouu yntildeuw| wontildexoz qvzu xntildeuo |ntildeoqw ovoq |vqq |vx|sup2~Oslashreg~plusmn xuntildeww ou|ntildewx ountildeqo qvw| uyntildeqzq yyntildewxo qvyz |vo vyosup2~Oslashreg~plusmn wyntildeu| oontildezoo yzntildeuz qvz yntildeqz zxntildeox qvo |voy vw|sup2~Oslashreg~plusmn ontildequwntildexy w|ntildeyox |yontildeuq| qvw wntildexo o|ntildexqz qvo |v| vz|

middotplusmnsup1 yntildeouu |ntildezw ||ntilde|x| qvzz oontildexww oxntildeyzu qvu vwx vo|middotplusmnsup1 xuntildeww y|ntildex yyntildeyy qvzx ontildez|x untilde|x qvuy vww ovuomiddotplusmnsup1 wyntildeu| wzntildewy z|ntildeyux qvzy |qntildeyx yntildeyqu qvuy vz ov|zmiddotplusmnsup1 ontildequwntildexy oqntildeux| o|untildeqox qvzq uqntildeyxx wwntildez| qvuy vzy ovxo

middotplusmnsup1raquo yntildeouu yntildew|u ||ntildeu ovww ntildewxy oxntildeyuw uvyy qvwy voumiddotplusmnsup1raquo xuntildeww wxntildeoz yuntildeo| ov|| wxntildex|| untildeuy ovwq ovqq ov|xmiddotplusmnsup1raquo wyntildeu| ooxntildeuw z|ntildezyu ov| z|ntildeyow yntildexuz ov|z ov| ov|zmiddotplusmnsup1raquo ontildequwntildexy o|ntildexx o|untildeuqz ovq zyntildexoq wntildeuuy ovoq ovu| ovxu

Ugraveplusmnsup1 yntildeouu |zntildeyu| oqntildexwz |vu uontildeuyq ountildewy vz qvzy qvoUgraveplusmnsup1 xuntildeww untildewxx ntildeqo voy uzntildeoy| yntildewo ovw| qvz qvw|Ugraveplusmnsup1 wyntildeu| xxntildewoq zntildewyu ovw xyntildewuu |yntildexqx ovxy qvzw qvwUgraveplusmnsup1 ontildequwntildexy yntildewxy |ntildexxw ovy yuntildewzx uyntildeoq ovuo qvz qvwo

Ugraveplusmnsup1raquo yntildeouu uqntildeq|u oqntildeyxq |vy uontildey| ountildez|z vz qvzy qvoUgraveplusmnsup1raquo xuntildeww uwntilde|u ntildeqz vo uzntildew yntildewx ovw| qvzw qvw|Ugraveplusmnsup1raquo wyntildeu| xyntildezy zntildez|y ovww xntildeo| |yntildeuwq ovx qvzw qvwUgraveplusmnsup1raquo ontildequwntildexy y|ntildeo| |ntildeyxq ovyw yxntilde|yw uyntildeqx ovuo qvz qvwo

~plusmn yntildeouu untildeyw wntildezu uvo uuntildeo wntildeuq| xvx qvzy ovq~plusmn xuntildeww xqntildeqyx qntilde|xq vuy xontildexwy untildezux vq qvz qvw~plusmn wyntildeu| xwntildeq wntildeux vqy yqntildeox |untildeu ov| qvzw qvw~plusmn ontildequwntildexy yyntildequq |yntilde|q| ovw ywntildeqx uuntildeuyy ovx| qvz qvw

~plusmnraquo yntildeouu zntilde|q zntildeqq ovqu zntildexq wntildeu|w ovo| qvzw ovq~plusmnraquo xuntildeww oyntildeu| qntilde|wy qvwo oyntildeoz untildezuy qvyx ovq qvw~plusmnraquo wyntildeu| |ntildewo wntildeuxu qvw ntildewz |untilde qvyy ovq qvw~plusmnraquo ontildequwntildexy zntildezy |yntildexo| qvw zntildezxq uuntildeu qvy ovqq qvw

plusmn yntildeouu yxntildewu wzntildeq| qv| ontildeyoz yntildezz qvwq |vq |v|qplusmn xuntildeww ountildeqoz oxntilde|o qvwo |zntildexu xxntilde|ux qv |vo vxplusmn wyntildeu| ow|ntildeuwy untildeyu qvw xntildeoyu untildezu| qvy |vo |vqqplusmn ontildequwntildexy u|ntildewo zxntildezqo qvw xntildezy zuntildeoy qvwq |vo |vo

plusmnraquo yntildeouu x|ontildeyx wyntildeqoz yvow o|ntildeqox owntilde|xu vu |vww uvyzplusmnraquo xuntildeww yozntildeyz| oz|ntildeuwu |vq oyntildeyw ywntildeyw v| |vwo vwplusmnraquo wyntildeu| yw|ntildewxu wwntildeooz v| owyntilde|q| oqxntildeyw ovy |vy vplusmnraquo ontildequwntildexy |yntildezo uqxntildeoyq ovw oontildexo ouontildezzx ovuz |vuw vwx

~plusmnreg yntildeouu uzu y| qvw uwo xzw qvwq ovq| ovqy~plusmnreg xuntildeww zq| ontildeoz qvy wo| ontildexuy qvx| ovoo qv~plusmnreg wyntildeu| ontilde|q ontildeyq qvw ontildeoxo ntildeoqz qvxx ovo| qvz~plusmnreg ontildequwntildexy ontildezqu ntilde|ux qvwo ontildeuxu ntildeyx qvxu ov|o qvww

~plusmn yntildeouu ountildewqx ountildez|x qvzz ontildeqq owntildeuw| ovou qvq qvwo~plusmn xuntildeww zntildeuy zntildewxz qvzw uqntildex|o uwntildexyw qvw| qv qvyo~plusmn wyntildeu| uuntildeq uuntildeo qvzz xyntildeyoq yzntildexzw qvwo qvw qvyu~plusmn ontildequwntildexy qntildeqqx qntildeu|q qvzz ntildeo zqntilde|ux qvwo qvzy qvw

sup2cedil~raquo yntildeouu x w qvz zo ux vq| qvw ov|sup2cedil~raquo xuntildeww ox o| qv| o|w o| qvyx qvzo qvwosup2cedil~raquo wyntildeu| oo |o qvu owz w| qvy qvzo qvwsup2cedil~raquo ontildequwntildexy ou wz qvu | |x qvyx qvz qvwo

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

sectWOcirceg

ow$XAcircsup3deg owsup3degIcircwsup3


Fig 1 13Ecircu

BX900gtacircHI4J2K currenLXMUV13Nh713~O=5Pq

22 []fIumlIacuteIcircIumlETHNtildeJ []fIumlIacuteFX1L13FX1P12Tflops =NOLgYPWMacirc0 46TBnotGYZ$13gtlaquoq

1iumlIumlQR SPARC64SLYccedilIumlfPT^_ccedil-Wacirc0 16GB8ampGqUgraveD|UCPUVBccedil^ccedil`LJSCPbrvbarsectAuml|eacuteIumlOslashWCqiumlIuml FBBLFull Bisectional BandwidthPccedil`X| InfiniBand 4xDDRLDAgravez 2GBsPgtYZGqiumlIumleacute|O=f7YOtilde IcircIumlETHNtildegtO=GVBOumltimesLAumla$fY`accedilPyBGmWgt [

brvbarUcircXOcirc$FG]EumlIgrave^_ZOslash5^=_wXM

`notGq FX1 gtFX1 CcedilEgraveordfxfgh$WXpound`acC2012 b

=Gxfgh$yBzD^|13~h5Pq 23 OgraveOacuteOcircOtildeYOumltimesIcircIumlETHNtildeJ OgraveOacuteOcircOtildeYOumltimes IO iumlIumlSPARC Enterprise M90002 6Ocirc$OcircOtildeYETERNUS DX80366$OcircOtildeYETERNUS4000 model40016gtJeumlnq


MOcircOtildeY12 12PBLRAID6Pgtlaquosect13gt 576GBs IO LgYPnotGLFig 2PqIO iumlIuml InfiniBand nota13Parallelnavi SRFSbrvbarsectciumlIumlcdYoacuteCaidefgCq Ocirc$OcircOtildeYOumltimeslaquoDsect 466Za SASOcircOtildeYL1TB7200rpmP8ampG

qmndOcircOtildeY RAID6 gt^UcircCOumltimesgt`ZagGmWgt IO AumlFUcirchpoundqXIOiumlIuml_aumla13Sun QFSBCLFig 3Pq

UVEumlIgraveOIacute11913aelig13()

i QDR 9j2 (113aelig13ntildesect) InfiniBandaccedil

(9iexcl) i FGFAring 72GBs (2iumlIumlcurren) IOiumlIuml (M90002iumlIuml) i FGFAring 576GBs (OcircOtildeYcurren) OgraveOacuteOcircOtildeY (ETERNUS DX80366) ()9kUcirclmiumlIumlnC

[Figure labels: 1, 2, 3, 119; IBUSW; IO node]

Fig 2 oOumltimesYZJWFGFAring

Fig 3 [Diagram; labels recovered: IO nodes M9000 1 and M9000 2, each with FC (1) to FC (42); ETERNUS DX80 (36 units: DX80 1, DX80 2, DX80 35, DX80 36), each with 1TB disks; ETERNUS4000 Model 400 (1 unit: E4K 1) with 1TB disks; QFS; 3.2GB/s (8 FC); 57.6GB/s (144 FC)]

3 HPCCYbrvbar

HPCC (High Performance Computing Challenge)1) Y 7^_ZB BX900W FX15poundDq13gtHPCCEcircupqWrstuq 31 HPCCYEcircu

HPCC YvR Tennessee JDongara wRordf3xWXpoundyoBCefgh$BY`^_ZzgtlaquoqO

|Y^_ WpoundDefgh$acircuXu

GmWbrvbarsectefgh$13WCeacuteZGUacuteC

qO HPL(High-Performance Linpack)DGEMMFFTE 3q|^_Zgt|Y STREAM W Random Access 2 q|^_Zgt _ PTRANSW b_eff 2q|^_Zgt5n~noacuteCc^_Zbrvbar5PJWXpound 2)qo

^_ZUuml|YacuteTHORNFortranCgtcnordfHPCCYWCCgtcnD 1^_ZeumlnqXHPCCYC 2003 sect 300wordfeumlnsect+X13rsyBGmW9gtlaquoq

311 ^_Z efgh$13f CPU 1 iexclXCiexcl8ampG

L13iumlIumlPa$fY`gtYZCDYZ$JgtlaquosectCoreCPUiumlIuml13WeumlUgraveUgraveXgtO=ordf5nDHogtordfugt

laquoqmgtiumlIumlMG` (G Global system performance)WiumlIuml[gt` (SN Single Environment)gt` (EP Embarrassingly Parallel) 3gt5Pq

3111 HPL^_Z

HPL5IgraveLUcurrengtCmWgt13OG^_ZgtlaquoqOslashm^_Z|5 CcedilEgrave^_ZaumlIacutecurrenWCyB

eumlnmWordfqUgraveDefgh$13O[P TOP500B^_ZWCBeumln LinpackMPIEumlIgraveUcircWCnotgtlaquoqiumlIuml`$XOordfecircMAumlWXq Tflopsgteumlnq` 1gtiumlIumlegravepoundDM(G)`5Pq


3112 DGEMM^_Z DGEMM5IgraveGmWgtiumlIumlOG^_Zgtlaquoqo^_gt5curren13^_W i5Xqm^_Z

LinpackSgtegravenauml^_ZgtlaquoqUgraveDaumlefgh$gto13zeumlnDeacuteaTHORN|ordfZaZ|WCBeumlnsect

meacuteaTHORN|ordfoacute WXqY`LBMTPbrvbariumlIumlOiexclcentGq Gflops(Giga floating-point operations per second)gteumlnq` 2gtiumlIuml[(SN)(EP)`5Pq 3113 STREAM^_Z STREAM iumlIumlordf|W CDWecirceacuteIumlG^_Zgtlaquoqm

Wecirco^_gtO=ordf5np^_W i5Xqm^_Z

cpound(Copy)curren(Scale)Myen(Add)yen(Triad) 4 XO=gtJeumlnsectiumlIuml|YiexclcentGq GBs(Giga Bytes per second)gteumlnq` 8 gt[(SN)(EP)gtoiumlIuml 4(cpoundcurrenMyenyen)`5Pq 3114 PTRANS^_Z PTARNS(Parallel matrix TRANSpose)EumlIgrave^_ocirc5IgraveFtimesbrvbarsect13ccedil`brvbarY G^_Zgtlaquoq GBs gteumlnq` 1gtiumlIumlegravepoundDM(G)`5Pq 3115 Random Access^_Z Random Access |sectumleumlnDecopyIgravecdZordf 1 ulaquonotgtshyregGO=ordf 1 macrdegplusmn5ncG^_Zgtlaquoq1 plusmnYGOcirc$2ordf8BWgteumlordfgtlaquoqiumlIuml|Yucircsup2`WiumlIuml MPI gtucircsup2`5Pq Gups(Giga updates per second)gteumlnq` 3 gtiumlIuml[(SN)(EP)iumlIumlegravepoundDM(G)`5Pq 3116 FFTE^_Z FFTEAumlF|sup3^acuteOG^_ZgtlaquoqHPLWiacute CcedilEgrave^_ZaumlIacutecurrenWCyBeumlnmWordfq[(SN)W(EP)`gt2 [microcopyIgraveordfo^_gtpara|sup3^acuteeumlnqiumlIumlegravepoundDM(G)`gt3[microcopyIgraveordflaquo 1AgravezmiddotAcurrenetheumlnDcedil|sup3^acuteeumlnlaquoW^_ordf^_W 5PmWLoacute Pbrvbaro^_

copyIgravesup1copytimesordm^_raquofrac14sect[microoacuteG|sup3^acuteordf5nq


Gflopsgteumlnq 3117 b_eff^_Z b_eff Ocirc$FGFGfrac12frac34(7a13acute latency)WFGFAring(eacuteIuml band width)shycdiquestH^_ZgtlaquoqFGfrac12frac34ordfAgraveXCeacuteIumlordfecircM13^_ ordfAacutenmWGqb_eff iumlIumlegravepoundDM(G)G Ring W PingPong 2 ^_ZgtJeumlnqb_eff FGGOcirc$-a5mnUgravegt^_ZW+XsectlOcirc$-a5 N ] MPI ^_ P atildeAcircXq7a13 8B Ocirc$eacuteIuml 2MB Ocirc$FGGmWbrvbarsect5Pq Ring MPIbrvbarpoundAtildeHdnDZYAumlAringAEligo^_oacuteCAumlAring 1oacute 1 5PoumlWZYAumlAringZordfEumlu 1oacute 1 5Pouml 2CcedilEgravegtCDq PingPong 2Core Eacute Ecircnototilde^_ordfethsectntildednp Core EumlCOslash5Gq2CoreIgraveEacuteIacute5poundecircTIcircIIumlshyGq b_effETHNtildeOgraveRing(AumlAring)brvbar7a13OacuteRing(AumlAring)brvbareacuteIumlOcircRing(Zordf)brvbar7a13OtildeRing(Zordf)brvbareacuteIumlOumlPingPongbrvbar7a13timesPingPongbrvbareacuteIuml 6gtlaquoq7a13ordfaY_LOslashPmacreacuteIumlordf GBsgteumlnq 312 Oslash5

HPCCZaZ(Baseline Runs)WUuml^Otildea5Z(Optimized Runs) 2Oslash5ordflaquoordffeaZTUacuteUcircEacutebrvbarZaZgt5pound

q

3121 ZaZ Oslash5gtlaquoZaZfeaZbrvbarTUacuteUcircW13Oumlfrac34

eumlnaumlUgraveAuml BLAS ZaZ|W MPI ZaZ|egraveBUgravegtUacuteeumlnordfUcircfIuml^regUacuteeumlnXqegraveBCDfeaZeacuteYacute~

WfeaegraveBCDUuml^13~ZaZ|eacuteYacute~UumlordfYacutegtlaquoq 3122 Uuml^Otildea5Z UcircfIumlTHORNszligagravePTUacuteUcirc2 7gtUacute9eumlnqiAgraveaacutednDIacutecurren(-)fIumlTUacuteUcircgtlaquosectpAgrave|5macr8CEcircnototilde3XTUacuteUcircgtlaquoqUgraveD acirc HPCC^_YacuteNtildeY`acircatilde^WaumlaringordfaeligugtlaquosectatildeCUumlGaeliguordflaquoq


32 BX900 FX1n~nHPCC BMT^_Z 8ccedil512EumlIgravegtOslash5CDq 321 BX900 BX900 HPCC BMT^_Z Table 1GWHPL

PTRANSRandom Access FFTEo^_ZZK|Otilde Fig 4ccedilFig 7GqZK|Otildedegn^_Zegraveeacuteecirccopyordfeuml]cgtlaquoordf512 EumlIgravegtigraveyenCiacuteXCiacuteicircXordfdegdnqUgraveDHPL=NoacuteC 822(=4902596100)1gtlaquopoundDq

Table 1iumlntildeethntildeograveoacuteWCHPCCocircAEligcdotildebIcircIumlETHNtildeoumlccedilYAElig Intel Endeavor cluster (13gt Endeavor Wdivideoslash)HPCCWeboumlYacute 1) ugraveuacute Table 2Gq BX900shyucircoB3gtlaquoDH Ecircnototilde^_ZP EndeavoregraveeacuteplusmnpoundDlaquoordfEcircuumlb13yacuteordfdegdnqUgraveDEndeavorWEacute IntelfeaZBD Table 3 GbrvbarPQRfeaZbrvbarWbgtlaquopoundDqDotildeCSTREAM iumlIumlSgt5poundmWcd thornCXatilde

d(35GBs oacute 42GBs)W 2ethM EndeavorbrvbarsectEshygtlaquopoundDqm422gtaringNGq

1 BX90013VBbrvbargt17072fgt=N 2000TflopsoacuteC 1914TflopsOslashCsectOslashszlig 9566gtlaquopoundDq
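The HPL efficiency quoted in this section is a ratio of measured to peak performance. The sketch below reproduces that arithmetic under one assumption: that the garbled expression "822(=4902596100)" reads 82.2% (= 4.902/5.96 × 100), with both figures in Tflops. The function name `hpl_efficiency` is hypothetical.

```python
def hpl_efficiency(measured_tflops, peak_tflops):
    """Return HPL efficiency in percent: measured / peak * 100."""
    if peak_tflops <= 0:
        raise ValueError("peak performance must be positive")
    return 100.0 * measured_tflops / peak_tflops


if __name__ == "__main__":
    # Assumed reading of the extracted digits for the 512-parallel BX900 run.
    print(f"{hpl_efficiency(4.902, 5.96):.1f} %")
```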

Table 1 HPCC BMT measurement results on JAEA's BX900 with the Fujitsu compiler

Columns (units), in order: G-HPL (TFlops), G-PTRANS (GB/s), G-RandomAccess original version (Gups), G-RandomAccess SANDIA_OPT2 version (Gups), G-FFTE (GFlops), EP-STREAM Sys (*) (GB/s), EP-STREAM Triad (GB/s), EP-DGEMM (GFlops), RandomRing Bandwidth (GB/s), RandomRing Latency (usec)

Parallel  Mode    Values (digit groups as extracted; decimal points lost)
8         rdma    008 1315 12 9330 001 5112 008 9847 555 3328 074 350 9311 018 41 1171 51 268
8         nordma  008 1226 12 9544 001 5504 009 0139 557 5727 813 347 6711 023 91 1053 51 278
16        rdma    0 1579 130 510 540 0238 850 1510 719 4469 555 143 4697 111 145 092 418 149 3
16        nordma  0 1536 650 491 080 0222 270 1334 439 7508 555 683 4730 111 123 078 781 324 5
32        rdma    032 2493 09 0716 003 0751 024 2906 186 357 111 164 347 3911 101 00 5220 61 944
32        nordma  032 3494 08 4422 003 5001 023 7182 190 981 112 052 350 1611 100 30 5194 95 010
64        rdma    0 6201 260 150 637 003 2786 037 8769 369 923 222 203 347 1911 092 40 4192 43 156
64        nordma  060 9871 014 686 80 0763 460 3882 6638 139 722 260 03 4781 110 887 044 542 652 7
128       rdma    120 6270 026 303 70 0370 050 5186 5673 075 744 426 43 4708 111 007 038 892 463 5
128       nordma  1 2265 800 302 225 007 3652 061 6532 742 721 444 063 346 9211 100 50 4161 78 290
256       rdma    2 4166 200 583 348 002 9897 053 8761 139 3770 888 453 347 0511 091 00 3583 55 937
256       nordma  236 3890 057 519 80 0765 560 8003 4814 845 5089 503 73 4962 110 879 040 323 103 72
512       rdma    468 4620 011 199 400 0163 910 2622 2026 859 1017 779 513 4726 111 147 032 838 711 3
512       nordma  4 9017 400 109 9150 008 2799 153 7610 275 5280 1816 433 354 7711 111 90 3572 112 790

System: Fujitsu BX900 (512 Core / 128 CPU)
Chip: Xeon X5570 2.93GHz, DDR3-1066, SMT-ON, Turbo Mode-OFF
Interconnect: QDR Infiniband, 2 socket/node, Fat Tree
OS: Red Hat EL 5
Compiler: Fujitsu C/C++ Ver, Option -Kfast -SSL2
MPI: Fujitsu Parallelnavi Language Package
HPCC Ver: V1.3.1
(*) The EP-STREAM Sys value is computed as (EP-STREAM Triad) × parallel count.
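The note to Table 1 states that EP-STREAM Sys is computed as (EP-STREAM Triad) × parallel count. The sketch below shows that calculation. The function name `ep_stream_sys` is hypothetical, and the sample input assumes the extracted digits "35477" for the 512-parallel nordma run read as 3.5477 GB/s.

```python
def ep_stream_sys(triad_gbs_per_process, parallel_count):
    """Aggregate STREAM bandwidth in GB/s: per-process Triad times process count."""
    if parallel_count < 1:
        raise ValueError("parallel count must be at least 1")
    return triad_gbs_per_process * parallel_count


if __name__ == "__main__":
    # Assumed per-process Triad value for the 512-parallel nordma run.
    print(f"{ep_stream_sys(3.5477, 512):.3f} GB/s")
```

Because the Sys column is derived rather than measured, it scales linearly with the parallel count whenever the per-process Triad value stays flat.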

Fig 4 BX900 HPL

Fig 5 BX900 PTRANS

Fig 6 BX900 Random Access

Fig 7 BX900 FFTE

Table 2 Intel Endeavor cluster (HPCC Web 1))

Table 3 Intel BX900 HPCC BMT

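3.2.2 FX1
The HPCC BMT results on FX1 are listed in Table 4, and the HPL, PTRANS, Random Access and FFTE results are shown in Fig 8 to Fig 11 for up to 512 parallel processes. The HPL efficiency was 75.5% (= 3.776/5.0 × 100) 2).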

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037

Table 4 FX1 HPCC BMT

Fig 8 FX1 HPL

Fig 9 FX1 PTRANS

Fig 10 FX1 Random Access

Fig 11 FX1 FFTE

33 HPCCYUgraveWH Uklq BX900W FX1WrsndWK(Altix3700Bx2)Wrs

Fig 12GqmhgtTWXpoundDqshy 100WCLRandom Ring LatencyPqRandom Ring Latency13BX900ordf`ccedil^fCmWordfcurrencq

BX900 W FX1 rsGWSTREAM W FFTE necircOatildeC atildeC eacuteIumlrvCDordfmacrdnDqSTREAM

421FFTE 43aringNGq XAltix3700Bx2 Random AccessordfEordfAltix3700Bx2W

plusmngtegraveBCD HPCC BMTeacuteYacute~cdOcirc$Y|5ordf+Xpoundsect8YrsGmW13XqAltix3700Bx2 atildeC FFTEEordfmdoacute OslashszligWuumlwdn

q

Fig 12 HPCC BMT [Chart; series: BX900, FX1, Altix3700Bx2; scale 0 to 100; 512 (Flat MPI); axes: G-HPL (4.90 Tflops), G-RandomAccess (1.54 Gups), G-PTRANS (112.0 GB/s), EP-STREAM (4.36 GB/s), EP-DGEMM (11.1 Gflops), G-FFTE (275.5 Gflops), RandomRing Bandwidth (0.359 GB/s), RandomRing Latency (5.18 usec). Inset values as extracted: 1172, 10, 64, 64, 10, 32, 8, 2, 32, 130; units [GFLOPS], [GB/s], [GB/s]]

4 BX900W FX1TUacuteUcirc 32 c^_Zgt BX900 W FX1 IcircIumlETHNtildeoumlccedilYWiCXordfWnDmWQR^_aZegraveBC(Xcurrenyen

5poundDqeumldfeaZUuml^13~brvbarTUacuteUcirc]Ucirc7^_ZTUacuteUcirc

EacuteDq 41 ^_aZ

BX900 W FX1 gtegraveBCD^_aZWQRUgraveyenXgtlaquosectoacute ^_ZOslash5^_a|Ocirc$divideC^_ZOslash5cedildivideOcirc$divideC

^_Z=)yen5PmWordf13q ^_aZyBGUgrave^_Z_IumlYacuteh)JGDHfe

a|YUuml^13~UKtl_trtGq[ _ZOslash5_IumlYacute

h fpcoll fIumlgt=GqTcedildivideCDeacuteaTHORN|Ocirc$ fprof fIumlgtdivideCaring`Gq^_aZgtPAElig 8q|gtlaquoq

^_a|Ocirc$divideAElig AElig ^_ AElig MPIZaZ|ordm)AElig f`AElig IcircIumlETHNtilde$AElig fZAElig UcircfIumlAElig

mPIcircIumlETHNtilde$AEliggtCPUlt=gtOaringaeligccedil13

hccedil`|WOcirc$FGszligbordfoacute WXq

42 feaZbrvbar|YTUacuteUcirc BX900HPCC BMT^_Z STREAMordfotildeb

gtlaquo EndeavorWrsC(35GBsacute42GBs)W 2ethMEshygtlaquopoundDmWETHNtilde5poundDqETHNtildeBDUcirc HPCCpSTREAMUuml|YacuteTHORN(CUcirc amp FortranUcirc)7ccedilIumlEumlIgraveUcircWMPIEumlIgraveUcircgtlaquoq[13BGbrvbarPmdegn BX900 feaZUuml^13~UKntst brvbarsectCDmWcdEndeavorfeaZ(IntelUgrave)W BX900feaZ(QRUgrave)igraveiacuteWuumlwdnq


421 Uuml|YacuteTHORN STREAM^_Zbrvbarszligagraveaacute BX900Uuml|YacuteTHORN STREAM^_Z Table 5Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntst brvbarordfmacrdnDqDotildeCUuml^13~gtOcirc$|sectumlGaringaeligccedil13hyBCXbrvbarPGDHoacute WX DO^SgtOcirc$sup1yBordfXmWordfdcXoumlEacuteegraveBGuecircgtlaquosectPgtXoumlordfUcirc

GordflaquoqUgraveDfeaZordfoacute WGDO ^SgtsectumlOcirc$ordf^^shypoundZYeumlnoumlEacutegtlaquoq CUcircoumlIntelfeaZXd OpenMPEumlIgraveUgraveUgravegt 42GBscoreordfdegdnDordfQRfeaZgt7ccedilIumlEumlIgraveUcirc OpenMP EumlIgravecd=EumlIgrave^regCbrvbarP]C Intel feaZWbWXpoundDqQRfeaZgtlaquo CD7ccedilIumlEumlIgraveUcircAgraveogravebrvbarpoundOcirc$copytimes]AgravebTUacuteUcircAgraveograveordflaquoWnq [ FX1Uuml|YacuteTHORN STREAM^_Z Table 6Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntstbrvbarordfmacrdnDqXTriad (132GBs)CPUQRordfWebgtUumlCshy(135GBs)CPUWotildeEacutegtlaquoq CUcircoumlordf FortranUcircUKntstUuml^13~XCWotildeiCqFX1 CfeaZ BX900W+XsectUKntstUuml^13~ordfcentsup3CXqDHFX1 CfeaZbrvbarTUacuteUcircordf$currengtlaquopoundD9ordfAumlq

Table 5 BX900 STREAM

Table 6 FX1 STREAM

422 HPCC BMT^_Z STREAMbrvbarszligagraveaacute 321gtdegdnD BX900 HPCC BMT^_Z STREAM

Table 7GqmmgtUKntstcopyIgraveOcirc$ecircEacutearingaeligccedil13heumlX storeampegraveBGmWUKrestp=all(a$XsectordfXmW)WCDTUacuteUcirc5PfeaZUuml^13~gtlaquoq UgraveDyacutegtgtlaquoordf _ZOslash53iumlIuml XC

iumlIumlSgt|Yo CPU]o CoreiDHEumlIgrave] Agravejatildedn~nfeaZUuml^13~brvbar^UcircCXq

STREAMbrvbarPX CoreIumliX2gtZX|Yordf5n DO^gtfeaZUuml^13~UKntst GmWgt|Y+13mWordfcurrencpoundDq

Table 7 HPCC BMT^_Z STREAM (MPIEumlIgraveUcircCUcirc) feaUuml^13~Ograve(1) feaUuml^13~Oacute(2)

EumlIgrave Agravej 13aelig13

EP-STREAMTriad (GBs)

13aelig13 EP-STREAMTriad (GBs)

rdma 1 34739 U U 32

nordma 1 35016 1 43316 rdma 3 34708 U U

128 nordma 3 34692 1 43572

rdma 5 34726 4 43595 512

nordma 4 35477 4 43424 L-1QRfeaZUuml^13~OgraveacuteKfastpackedntst SSL2 -2QRfeaZUuml^13~OacuteacuteKfastpackedntstrestp=all SSL2P
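Table 7 reports EP-STREAM Triad rates in GB/s. As a reminder of what that figure measures, the following is a minimal sketch of the Triad kernel and of the usual STREAM bandwidth accounting (two loads and one store of 8-byte words per element). The function names `triad` and `triad_bandwidth_gbs` are hypothetical, and the pure-Python loop is illustrative only; it says nothing about the rates measured on BX900 or FX1.

```python
import time
from array import array


def triad(a, b, c, q):
    """STREAM Triad kernel: a[i] = b[i] + q * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


def triad_bandwidth_gbs(n, seconds, bytes_per_word=8):
    """Bandwidth credited to Triad: 3 words moved per element, in GB/s (10^9 B/s)."""
    return 3 * bytes_per_word * n / seconds / 1e9


n = 100_000
a = array("d", [0.0]) * n
b = array("d", [1.0]) * n
c = array("d", [2.0]) * n

t0 = time.perf_counter()
triad(a, b, c, 3.0)
elapsed = time.perf_counter() - t0

print(f"Triad: {triad_bandwidth_gbs(n, elapsed):.4f} GB/s (pure Python, illustrative)")
```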


43 hhbrvbar|YTUacuteUcirc 32 gtdcWXpoundDbrvbarPBX900 W FX1 FFTE ^_Z

acircIcircIumlETHNtilde13yacuteordfmacrdnDqmETHNtildeDH5pound

DhhW13gtszlig0Gq 13gt FFTE 32EumlIgraveOslash5CDWecircoacuteG ethBX900

gt 12FX1 gt 16Wgteumlqpoundordm)1|YEcirc2OgtlaquopoundD9ordfAumlqmgtFFTEUuml|YacuteTHORN(FortranUcirc)l3CUuml|YacuteTHORN(13aumlWoacuterGoumlUcircWdivideoslash)WaumlIacutecurren4C)JCDaumlacircoacuteGaumlZh5poundDiacuteegraveBC

hbrvbarpound^UcircCDordm)GmWbrvbarsectBX900 W FX1 oacuteG5g5poundDqUcircpaumlbrvbar5poundDh 1QRfeaZW Endeavor egraveBeumlnDafeaZWrs5PDHW 2auml13IacutecurrenTUacuteUcircbrvbar6)BbordfaumlIacutecurrenthorn7wXmWagraveaacuteGDHgtlaquoq UcircaumlIacutecurren 38plusmnUcircWmndaumlZhCDiacute

Ucircn~n Fig 13W Fig 14GWoIacutecurreniacuteEcircu0Gq Fig 13W Fig 14OgravegtCDIacutecurrenOO=5pound^gtlaquoqUuml|YacuteTHORN

gtcopyIgrave APXYZWWWWOcirc$Yordf 2ugt5nsectZgtXcpoundDDHiacutegt)9BcopyIgrave CY_W^curreneth5poundcopyIgrave APXYZWWWWYordfZXbrvbarPCDq)9BcopyIgrave CY_YZCXcpoundD=Ocirc$sectumlG^gtYordfZXpoundDoumlUcircordfecircDHgtlaquoq

Fig 13W Fig 14OacutegtCDIacutecurren3[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 3ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 3gtXC 6Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquosectaringaeligccedil13h-

a5brvbarpoundltG_ccedilY-a5^_ZUgrave)acircregmacr= 16 gteumlnq

Fig 13W Fig 14OcircgtCDIacutecurren2[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 2ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 2gtXC 4Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquoqFX1 gtBX900rufntildeDsectaringaeligccedilaringh-a5ordfgteumlmWcdmO=)t_ccedilY-a5 16cd 4^regGmWgtzyacuteordfmacrdnDq

Table 8BX900W FX1QRfeaZegraveBCDUcircWaumlbrvbarEuml BX900 afeaZegraveBCDaumlbrvbarGq


Fig 13 FFTEUuml|YacuteTHORNaumlIacutecurrenUcirc^_Z


Fig 14 FFTEiacuteaumlIacutecurrenUcirc^_Z

Table 8 FFTE Ver 4.1 Full and kernel (32 parallel, 180MB/Proc)

UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq
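The text above names the Stockham algorithm as the FFT scheme common to the FFTE and NPB FT kernels. As a minimal sketch of that family, assuming a radix-2 autosort formulation, the following computes a forward DFT by ping-ponging between two buffers so that no bit-reversal pass is needed. It is not the FFTE or NPB source; `stockham_fft` and `naive_dft` are hypothetical names used only for illustration.

```python
import cmath


def stockham_fft(x):
    """Forward DFT of a power-of-two-length sequence (radix-2 Stockham autosort)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    src = [complex(v) for v in x]
    dst = [0j] * n
    l, m = n // 2, 1  # l butterfly groups, each covering m consecutive elements
    while l >= 1:
        for j in range(l):
            w = cmath.exp(-2j * cmath.pi * j / (2 * l))  # twiddle factor
            for k in range(m):
                c0 = src[k + j * m]
                c1 = src[k + j * m + l * m]
                dst[k + 2 * j * m] = c0 + c1
                dst[k + 2 * j * m + m] = w * (c0 - c1)
        src, dst = dst, src  # output of this stage is input of the next
        l //= 2
        m *= 2
    return src


def naive_dft(x):
    """O(n^2) reference DFT used only to check the sketch."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


data = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]
fast = stockham_fft(data)
slow = naive_dft(data)
print(max(abs(f - s) for f, s in zip(fast, slow)))
```

The two-buffer structure is the point of the scheme: each stage reads one array with stride and writes the other contiguously, which is why memory and cache behaviour matter for such kernels.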


Fig 15 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- cfftz)


Fig 16 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- fftz2)


UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc

Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT

5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

1plusmn`gt readwriteplusmn(blockSizecurrentransferSizejsegmentCount)gtlaquosect`aecirceuml (blockSizej segmentCountjMPI EumlIgraveAring )WXqyenwtransferSize=1kBblockSize=1MiBL106 Bytes MB220 Bytes MiB WGPsegmentCount=1MPI EumlIgraveAring=32 oumlblockSizejsegmentCountjEumlIgraveAring=32MiB testFile ordf)dnoZY 1MiBmiddotAbrvbarntildeC1KB7fIuml 1024iexcl1313aeligUgraveDZordfpoundEacuteecircGq

[transferSize] 1plusmn readwritegtFGGeacutea`qsizeof(long long int)currencblockSizegtlaquoaeliguordflaquoq

[blockSize] o^_ordfbrvbarntildeGeacutea`qgt 1MiBq [segmentCount] o^_brvbarsectYeumlniOcirc$qyzshy 1q [randomOffset] 1XdZordfYqOcirc`shy 0q [fsync] ecircEacuteAring fsync13f5Gqyzshy 0q [filePerProc] ^_sectWcent`aYGqPOSIXEacutenotszligq

yzshy 0q [useFileView] MPI_File_set_view() MPI_File_write() MPI_File_read()egravePq

ZordfYWparaCXqyzshy 0q [collective] MPI_File_write_at_all() MPI_File_read_at_all()egravePqZordfY

WparaCXqyzshy 0q

Fig 18 IOR YacircueZ$
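The worked example in the parameter description (transferSize = 1 kB, blockSize = 1 MiB, segmentCount = 1, 32 MPI processes, giving a 32 MiB test file and 1024 accesses per process) can be checked with a few lines of arithmetic. This is a minimal sketch; `ior_sizes` is a hypothetical helper, not part of IOR, and it assumes a single shared file and 1 kB = 1024 bytes as in the example.

```python
MIB = 2**20  # the report distinguishes MiB (2^20 bytes) from MB (10^6 bytes)


def ior_sizes(transfer_size, block_size, segment_count, num_procs):
    """Return (I/O calls per process, total shared-file size in bytes)."""
    if block_size % transfer_size:
        raise ValueError("blockSize must be a multiple of transferSize")
    calls_per_proc = block_size // transfer_size * segment_count
    file_size = block_size * segment_count * num_procs
    return calls_per_proc, file_size


calls, size = ior_sizes(transfer_size=1024, block_size=MIB,
                        segment_count=1, num_procs=32)
print(calls, size // MIB)  # calls per process, file size in MiB
```

With one file per process, each file would hold only blockSize × segmentCount bytes instead of the shared total.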


512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

Table 11 MPIIO Info UumlYacuteNtildeY` key value

cb_buffer_size 4194304 $amp()

+

cb_nodes -012345

6789lt

=gt-012A

B

ind_rd_buffer_size 4194304 ACDEFGF$amp()+

ind_wr_buffer_size 524288 ACDH+GF$amp()+
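The three numeric hints that survive in Table 11 can be applied to any MPI Info object through its key/value interface. The sketch below assumes an object exposing a `Set(key, value)` method, as `mpi4py.MPI.Info` does; `apply_hints` and `RecordingInfo` are hypothetical names, and the stand-in class exists only so the example runs without an MPI installation. The value of `cb_nodes` is not legible in the table and is therefore left out.

```python
# Hint values as listed in Table 11 (sizes in bytes).
HINTS = {
    "cb_buffer_size": 4194304,      # 4 MiB
    "ind_rd_buffer_size": 4194304,  # 4 MiB
    "ind_wr_buffer_size": 524288,   # 512 KiB
}


def apply_hints(info, hints):
    """Copy hints into an Info-like object; MPI hint values are strings."""
    for key, value in hints.items():
        info.Set(key, str(value))
    return info


class RecordingInfo:
    """Stand-in for an MPI Info object, used only for this illustration."""

    def __init__(self):
        self.items = {}

    def Set(self, key, value):
        self.items[key] = value


info = apply_hints(RecordingInfo(), HINTS)
for key, value in info.items.items():
    print(f"{key} = {value} ({int(value) // 1024} KiB)")
```

With mpi4py available, the same helper could be given `MPI.Info.Create()` and the result passed as the `info` argument of `MPI.File.Open`.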


52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

(Diagram labels: ETERNUS DX80, 970TB; I/O server, SPARC Enterprise M9000; SRFS; READ; WRITE)

Fig 19 ReadWriteumlcopy

Table 12 (POSIX)

Table 13 (MPIIO)

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

Fig 20 POSIX
(Axes: TransferSize [B], 1024 to 1048576, against Performance [MiB/s]. Panels: Write and Read, sequential and random, on home and work. Series: SingleFile and FilePerProc for N=32, 256, 1024.)

Fig 21 MPIIO
(Axes: TransferSize [B], 1024 to 1048576, against Performance [MiB/s]. Panels: sequential and random, on home and work. Series: use view and collective for N=32, 256.)


Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`
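The passage above mentions BUFSIZ, an 8192-byte buffer, and setvbuf in connection with small transfer sizes. As a rough analogue in Python, and not the C or Fortran mechanism itself, the sketch below counts how many writes reach the underlying layer when 1 KiB records pass through buffers of different sizes. `CountingRaw` and `count_raw_writes` are hypothetical names introduced for illustration.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory sink that counts the write calls reaching the lowest layer."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.nbytes = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.nbytes += len(b)
        return len(b)


def count_raw_writes(transfer_size, total_bytes, buffer_size):
    """Write total_bytes in transfer_size pieces through a buffer of buffer_size."""
    raw = CountingRaw()
    buffered = io.BufferedWriter(raw, buffer_size=buffer_size)
    chunk = bytes(transfer_size)
    for _ in range(total_bytes // transfer_size):
        buffered.write(chunk)
    buffered.flush()
    return raw.calls, raw.nbytes


small = count_raw_writes(1024, 2**20, 8192)          # 8192-byte buffer
large = count_raw_writes(1024, 2**20, 8 * 2**20)     # 8 MiB buffer
print("raw writes with 8192 B buffer:", small[0])
print("raw writes with 8 MiB buffer:", large[0])
```

The application issues the same 1024 writes in both cases; only the number of calls passed down to the lower layer changes with the buffer size.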

JAEA-Testing 2011-005

ETER

NU

S D

X80

1-

18 (d

x01frac34

dx18

)

DX

80

01

iquestE

SRFSAgraveAacuteAcircAtildeAumlAringAElig

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

DX

80

02D

X80

03

DX

80

04D

X80

05

DX

80

06D

X80

07

DX

80

08D

X80

09

DX

80

10D

X80

11

DX

80

12D

X80

13

DX

80

14D

X80

15

DX

80

16D

X80

17

DX

80

18

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

data

_01

data

_02

hom

ew

ork

data

_01

data

_02

hom

ew

ork

data

_03

data

_04

data

_05

wor

kda

ta_0

3da

ta_0

4da

ta_0

5w

ork

IObrvbar

1(io

1)

ETER

NU

S D

X80

19

-36

(dx1

9frac34dx

36)

DX

80

19

iquestE

SRFSAgraveAacuteAcircAtildeAumlAringAElig

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

DX

80

20D

X80

21

DX

80

22D

X80

23

DX

80

24D

X80

25

DX

80

26D

X80

27

DX

80

28D

X80

29

DX

80

30D

X80

31

DX

80

32D

X80

33

DX

80

34D

X80

35

DX

80

36

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

Wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

Fig. 22 [Figure omitted: only fragments of rotated labels survive extraction. Recoverable labels: file systems work, data_06, data_07, data_08, data_09, data_10, data, and sysdata; I/O group io2. The caption text is illegible.]


Table 15 [caption illegible in source]

SRFS name   QFS name       Unit size     Total capacity              DAU     Remarks
data_01     QFS_data_01    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_02     QFS_data_02    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_03     QFS_data_03    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_04     QFS_data_04    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_05     QFS_data_05    3695488 MB    3695488 × 36 = 126.9 TB     1024k
home        QFS_home       3695488 MB    3695488 × 36 = 126.9 TB     512k
work                       524288 MB                                         [illegible; refers to io2]

SRFS name   QFS name       Unit size     Total capacity              DAU     Remarks
data_06     QFS_data_06    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_07     QFS_data_07    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_08     QFS_data_08    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_09     QFS_data_09    2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_10     QFS_data_10    3695488 MB    3695488 × 36 = 126.9 TB     1024k
data        QFS_data       3695488 MB    3695488 × 24 = 84.6 TB      1024k
sysdata     QFS_sysdata    3695488 MB    3695488 × 12 = 42.3 TB      16k     [illegible]
work        QFS_work       524288 MB     524288 × 144 = 72 TB        1024k   [illegible]
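The totals in Table 15 can be cross-checked with a few lines of arithmetic. The sketch below assumes that the totals are binary terabytes (1 TB = 1024 × 1024 MB), which is the convention that reproduces the printed values; the function name `total_tb` is illustrative and does not come from the report.

```python
# Assumption: totals in Table 15 use binary units (1 TB = 1024 * 1024 MB).
MB_PER_TB = 1024 * 1024


def total_tb(unit_mb: int, count: int) -> float:
    """Total capacity in TB for `count` units of `unit_mb` megabytes each,
    rounded to one decimal place as in the table."""
    return round(unit_mb * count / MB_PER_TB, 1)


if __name__ == "__main__":
    # (unit size in MB, number of units) pairs taken from Table 15.
    for unit_mb, count in [(2109888, 36), (3695488, 36), (3695488, 24),
                           (3695488, 12), (524288, 144)]:
        print(f"{unit_mb} MB x {count} = {total_tb(unit_mb, count)} TB")
```

Under that assumption the five distinct rows of the table come out as 72.4, 126.9, 84.6, 42.3 and 72.0 TB.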


6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 InfiniBand QDRegraveBCD Fat Treeicircgtlaquoq113aelig13S 18iumlIumlAEligicircgtlaquoordf113aelig13 LeafWdividenaccedil7Iumlordf 2Igravecenttimeseumln1accedil7IumlcdoiumlIuml QDR ordf 1 7gtYZeumlnqaccedil7Iuml 2 Igravelaquogt1 iumlIuml 2 QDR ordfYZeumln(QDR Iacutegt 18 j2Igravegt 36WX)qoiumlIuml`ZaringbrvbarsectT 8GBs (4GBsj2) AAEligpoundq 13aelig13 5PDHLeafaccedilyacute7 SpineWdividenaccedilordf

9iexcllaquosecto LeafaccedilGu Spineaccedil QDRgtYZeumlnqbrvbarpound 113aelig13cd SpineIacutegt 9j2Igrave=18 QDRordfegraveBeumlnq QDRordf 5PoumlLeaf accedil3xmacrDownlink ordf 36 Uplink ordf 18 Xgt Uplinkshy 50WXsectmn 50_ccedilaringWdividenotgtqBX900gta$fY`Ecircfrac34h Fig 23GqBX900gt 50_ccedilaring13aelig13ocircotilde gtthornordf]GqSpineLeafaccedilW13aelig13SoiumlIumlYZGEcircfrac34h Fig 24Gqmmgt13aelig13S nAumliumlIuml nAuml(niquest9)UgraveD nU9Auml(nAgrave9) SpineaccedilegraveBGqlaquo13aelig13laquoiumlIumlordfcent13aelig13iumlIumlcdOcirc$ltAacuteoumlSpineaccedilcd Leafaccedil(2Igrave) QDR2(1j2Igrave)ordfegraveBeumln(2Igrave)LeafaccedilcdiumlIumlQDR2gtYZeumlnqDotildeCmSpineaccedilcd Leaf accedilordmAcirc13aelig13SPAtildeWiumlIumlWnoteumlnqmDHyenwiumlIumlTEacuteOcirc$lt GoumlWiumlIuml 1 W 10 ordflt GoumlWgtn~n FAringordf+XqX13aelig13Sotilde gtiuml_ccedilaring

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

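Fig. 23 BX900 [remainder of caption illegible in source]
[Diagram omitted. Recoverable labels: spine switches Spine(1), Spine(2), ..., Spine(9); two leaf switches per chassis, Leaf(1-1) and Leaf(1-2) for chassis 1 through Leaf(119-1) and Leaf(119-2) for chassis 119; nodes bx0001, bx0002, ..., bx0018 in chassis 1 and bx2125, bx2126 in chassis 119.]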

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq

Fig. 24 [caption illegible in source]
[Diagram omitted. Recoverable labels: spine switches SW 1 to SW 9, one leaf switch, and the 18 nodes of one chassis numbered 1 to 18, drawn in pairs (1 and 10, 2 and 11, 3 and 12, ..., 9 and 18).]
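The link counts quoted in Section 6.1.1 (18 nodes and 2 leaf switches per chassis, 9 spine switches, 36 downlinks against 18 uplinks) can be written out as a short calculation. This is a minimal sketch of that arithmetic and of the node-to-spine rule legible in the text (node n uses spine n for n ≤ 9 and spine n − 9 for n > 9); the function names are illustrative assumptions, not part of the report.

```python
NODES_PER_CHASSIS = 18   # from Section 6.1.1
LEAF_PER_CHASSIS = 2     # leaf switches per chassis
SPINE_SWITCHES = 9       # spine switches in the fat tree


def uplink_ratio() -> float:
    """Ratio of leaf uplinks to leaf downlinks for one chassis."""
    downlinks = NODES_PER_CHASSIS * LEAF_PER_CHASSIS  # 18 x 2 = 36 QDR links
    uplinks = SPINE_SWITCHES * LEAF_PER_CHASSIS       # 9 x 2 = 18 QDR links
    return uplinks / downlinks


def spine_for_node(n: int) -> int:
    """Spine switch used by node n (1-based) of a chassis."""
    if not 1 <= n <= NODES_PER_CHASSIS:
        raise ValueError("node number must be in 1..18")
    return n if n <= SPINE_SWITCHES else n - SPINE_SWITCHES


if __name__ == "__main__":
    print(f"uplink/downlink ratio: {uplink_ratio():.0%}")
    print({n: spine_for_node(n) for n in (1, 9, 10, 18)})
```

The ratio of 0.5 is the "50% blocking" figure used in the rest of Section 6; nodes n and n + 9 map to the same spine switch.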

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Threshold for switching the transfer method

Mode          RDMA           RDMA             NORDMA
Processes     1 to 1024      1025 or more     any
Threshold     32768 bytes    16384 bytes      524288 bytes

Table 17 RDMA mode: Normal Send versus Send on Request

                  Message size
Processes         0 to <16 KB       16 KB to <32 KB     32 KB or more
1 to 1024         Normal Send       Normal Send         Send on Request
1025 to 2048      Normal Send       Send on Request     Send on Request
2049 or more      NORDMA is used    Send on Request     Send on Request

Table 18 NORDMA mode: Normal Send versus Send on Request

                  Message size
Processes         0 to <512 KB      512 KB or more
All               Normal Send       Send on Request

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq
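The switching rule in Tables 16 to 18 amounts to comparing the message size with a threshold that depends on the mode and the process count. The sketch below encodes only the thresholds of Table 16; it does not model the separate entry for 2049 or more processes in Table 17, and the function name `send_protocol` is a hypothetical helper, not an interface of the MPI library.

```python
def send_protocol(mode: str, nprocs: int, nbytes: int) -> str:
    """Return the transfer method implied by the Table 16 thresholds.

    mode   : "RDMA" or "NORDMA"
    nprocs : number of MPI processes in the job
    nbytes : message size in bytes
    """
    if mode == "NORDMA":
        threshold = 524288                      # 512 KB, any process count
    elif mode == "RDMA":
        threshold = 32768 if nprocs <= 1024 else 16384
    else:
        raise ValueError("mode must be 'RDMA' or 'NORDMA'")
    # Messages below the threshold use Normal Send, the rest Send on Request.
    return "Normal Send" if nbytes < threshold else "Send on Request"


if __name__ == "__main__":
    for args in [("RDMA", 512, 20000), ("RDMA", 2048, 20000),
                 ("NORDMA", 2048, 20000)]:
        print(args, "->", send_protocol(*args))
```

A 20000-byte message therefore moves from Normal Send to Send on Request under RDMA once the job exceeds 1024 processes, while it stays on Normal Send under NORDMA.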


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 [title illegible in source; data size step per parallelism]

Parallelism    Allreduce, Reduce, Bcast, SendRecv    Other MPI functions
256            32 bytes                              128 bytes
512            (same as above)                       128 bytes
1024           (same as above)                       256 bytes
2048           (same as above)                       2048 bytes
4096           (same as above)                       2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()                     ! measurement start
        call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                     b,nelem,MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD,ierr)
        ttime2 = MPI_WTIME()                     ! measurement end
        ttime0 = ttime2 - ttime1
        call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                  MPI_COMM_WORLD,ierr)
        if (myid.eq.0) then
          max_time = max(max_time,ttime)
          min_time = min(min_time,ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig 25 ^_ZLIacutecurrenP
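The loop in Fig. 25 accumulates, on rank 0, the minimum, maximum and sum of the per-iteration times, where each per-iteration time is the slowest rank's value obtained with MPI_MAX. The following sketch reproduces that bookkeeping outside MPI. It assumes, as the partly illegible text of Section 6.2.1 appears to say, that 20 repetitions are run and the first is discarded; the `warmup` parameter and the function name are illustrative assumptions.

```python
from statistics import mean


def summarize(times, warmup=1):
    """Min, max and mean of per-iteration times after dropping warm-up runs.

    times  : per-iteration elapsed times (each already the maximum over ranks)
    warmup : number of leading iterations to discard (assumed to be 1)
    """
    kept = list(times)[warmup:]
    if not kept:
        raise ValueError("no iterations left after discarding warm-up runs")
    return {"min": min(kept), "max": max(kept), "mean": mean(kept)}


if __name__ == "__main__":
    # Hypothetical timings in microseconds; the first run carries start-up cost.
    print(summarize([950.0, 410.0, 405.0, 415.0]))
```

Reporting the mean next to the minimum and maximum makes an outlier such as a slow first call visible instead of letting it be averaged away.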


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

Fig. 26 MPI functions (512 parallel, part 1)
[Plots omitted. Six panels, each comparing the rdma and nordma series: Allreduce, Reduce, Reduce_scatter, Allgather, Gather, Allgatherv. Axis labels are illegible in the source.]

Fig. 27 MPI functions (512 parallel, part 2)
[Plots omitted. Eight panels, each comparing the rdma and nordma series: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank 0 to Rank 511). Axis labels are illegible in the source.]

brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq

Fig. 28 RDMA and NORDMA (small data: 0 KB to 8 KB)
[Chart omitted: rotated labels and values are not recoverable from the extraction.]

63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq

      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()                       ! measurement start
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,1,
     &               MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,1,
     &               MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,1,
     &               MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,1,
     &               MPI_COMM_WORLD,ireq(4),ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()                       ! measurement start
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1.d0
          b1(i,j) = 1.d0
          c1(i,j) = 0.d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                       ! measurement end
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()                       ! measurement end
      ttimeDD = ttime8 - ttime7                  ! computation time
      ttimeal = ttime6 - ttime5                  ! total elapsed time
      ttimesd = ttimeal - ttimeDD                ! total minus computation

Fig 29 WO=5PDH^_Z
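The three derived timers at the end of Fig. 29 can be restated compactly. The sketch below mirrors only what the listing computes: the total elapsed time, the computation time, and their difference, which is the part of the transfer that was not hidden behind the matrix product. The function name and argument names are illustrative, and the overlap-rate formula of Section 6.3.2 is not reproduced because it is illegible in the source.

```python
def exposed_comm_time(t5: float, t7: float, t8: float, t6: float) -> float:
    """Time not covered by computation, following the timers of Fig. 29.

    t5 : before the non-blocking sends/receives are posted (ttime5)
    t7 : start of the computation (ttime7)
    t8 : end of the computation (ttime8)
    t6 : after MPI_WAITALL returns (ttime6)
    """
    total = t6 - t5      # ttimeal
    compute = t8 - t7    # ttimeDD
    return total - compute  # ttimesd


if __name__ == "__main__":
    # Hypothetical timestamps in seconds.
    print(exposed_comm_time(0.0, 0.1, 2.1, 2.5))
```

If the transfer overlaps completely with the computation, this value shrinks to the cost of posting the requests and of the final wait.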


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()
        do 120 II = 1, NNN
          if (myid.eq.0)
     &      call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                    IIRANK,II,MPI_COMM_WORLD,ierr)
          if (myid.eq.IIRANK)
     &      call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                    0,II,MPI_COMM_WORLD,istatus,ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig 30 ToacuteT ^_ZLIacutecurrenP

NODE
  CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
  CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig. 31 Assignment of MPI ranks to cores
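The placement in Fig. 31 determines which path a message from rank 0 takes: ranks 1 to 3 share its CPU, ranks 4 to 7 sit on the other CPU of the same node, and higher ranks are on other nodes. The sketch below encodes that placement for a node with 2 CPUs of 4 cores each. It assumes ranks fill nodes in blocks of eight; the function names are illustrative, and the mapping does not apply to the Open MPI runs, where rank i is bound to core i.

```python
CORES_PER_CPU = 4
CORES_PER_NODE = 8


def core_id(rank: int) -> int:
    """Core ID of a rank under the placement of Fig. 31.

    Ranks 0-3 use the even cores of CPU0, ranks 4-7 the odd cores of CPU1.
    """
    local = rank % CORES_PER_NODE
    return 2 * (local % CORES_PER_CPU) + local // CORES_PER_CPU


def path_kind(rank_a: int, rank_b: int) -> str:
    """Classify the route between two ranks."""
    if rank_a // CORES_PER_NODE != rank_b // CORES_PER_NODE:
        return "inter-node"
    cpu_a = (rank_a % CORES_PER_NODE) // CORES_PER_CPU
    cpu_b = (rank_b % CORES_PER_NODE) // CORES_PER_CPU
    return "intra-CPU" if cpu_a == cpu_b else "inter-CPU"


if __name__ == "__main__":
    print([core_id(r) for r in range(8)])
    print(path_kind(0, 3), path_kind(0, 4), path_kind(0, 8))
```

With this classification, the destination ranks visited by the loop of Fig. 30 can be grouped into the three cases compared in Section 6.4.2.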

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

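Fig. 32 One-to-one data transfer [remainder of caption illegible in source]
[Diagram omitted: eight boxes numbered 0 to 7 with arrows indicating the direction of data transfer.]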


Table 20 (within a chassis)

Pattern                  Flat MPI       Hybrid (72×2)    Hybrid (36×4)    Hybrid (18×8)
Per-process bandwidth    723.8 MB/s     1447.4 MB/s      2894.2 MB/s      5346.7 MB/s
Per-node bandwidth       5790.6 MB/s    5789.9 MB/s      5788.4 MB/s      5346.7 MB/s
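The two rows of Table 20 are related by the number of MPI processes that communicate from each node. The sketch below assumes 8 cores per node, so a run with t threads per process places 8 / t processes on a node, and it multiplies the per-process bandwidth by that count. The function name is illustrative; the small differences from the printed per-node values come from rounding in the table.

```python
CORES_PER_NODE = 8


def node_bandwidth(per_process_mb_s: float, threads_per_process: int) -> float:
    """Aggregate bandwidth of one node in MB/s."""
    processes_per_node = CORES_PER_NODE // threads_per_process
    return per_process_mb_s * processes_per_node


if __name__ == "__main__":
    # (per-process MB/s, threads per process) pairs taken from Table 20.
    for bw, threads in [(723.8, 1), (1447.4, 2), (2894.2, 4), (5346.7, 8)]:
        print(f"{threads} thread(s): {node_bandwidth(bw, threads):.1f} MB/s per node")
```

The first three configurations reach about 5790 MB/s per node, while the configuration with a single process per node stays near 5347 MB/s.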

Table 21 (between chassis, flat MPI)

        144 MPI                    192 MPI                    224 MPI
Node    Bandwidth    IB shared     Bandwidth    IB shared     Bandwidth    IB shared
        (MB/s)                     (MB/s)                     (MB/s)
1       724.6        -             467.0        yes           467.1        yes
2       724.4        -             467.0        yes           467.1        yes
3       724.6        -             467.0        yes           467.1        yes
4       724.2        -             724.3        -             467.0        yes
5       724.3        -             724.4        -             467.1        yes
6       724.2        -             724.0        -             723.7        -
7       724.1        -             723.9        -             724.6        -
8       724.4        -             724.0        -             724.5        -
9       724.9        -             724.3        -             724.0        -
10                                 467.0        yes           467.0        yes
11                                 467.0        yes           467.2        yes
12                                 466.9        yes           467.1        yes
13                                                            467.1        yes
14                                                            467.0        yes
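The pattern of reduced bandwidth in Table 21 follows from the node-to-spine rule of Section 6.1.1, under which nodes n and n + 9 of a chassis use the same spine switch. The sketch below predicts which nodes share their inter-chassis path when only the first k nodes of a chassis are in use. It is a reading of the table under that assumption, and the function name is illustrative.

```python
SPINE_SWITCHES = 9


def shares_ib_path(node: int, nodes_in_use: int) -> bool:
    """True if the partner node (n + 9 or n - 9) is also in use."""
    if node <= SPINE_SWITCHES:
        partner = node + SPINE_SWITCHES
    else:
        partner = node - SPINE_SWITCHES
    return 1 <= partner <= nodes_in_use


def shared_nodes(nodes_in_use: int) -> list:
    """Nodes whose inter-chassis path is shared, for a given node count."""
    return [n for n in range(1, nodes_in_use + 1)
            if shares_ib_path(n, nodes_in_use)]


if __name__ == "__main__":
    # 144, 192 and 224 processes over two chassis use 9, 12 and 14 nodes each.
    for k in (9, 12, 14):
        print(k, "nodes per chassis ->", shared_nodes(k))
```

For 9, 12 and 14 nodes per chassis this yields no shared nodes, nodes 1 to 3 with 10 to 12, and nodes 1 to 5 with 10 to 14, which matches the rows of Table 21 that drop to about 467 MB/s.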

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq

Table 22 (between chassis, hybrid)

        Hybrid (112×2)             Hybrid (56×4)              Hybrid (28×8)
Node    Bandwidth    IB shared     Bandwidth    IB shared     Bandwidth    IB shared
        (MB/s)                     (MB/s)                     (MB/s)
1       933.9        yes           1856.7       yes           3640.8       yes
2       933.7        yes           1856.4       yes           3691.9       yes
3       934.6        yes           1856.6       yes           3653.6       yes
4       934.7        yes           1857.0       yes           3645.7       yes
5       934.2        yes           1865.1       yes           3664.6       yes
6       1438.5       -             2860.6       -             5380.8       -
7       1448.8       -             2911.8       -             5377.9       -
8       1450.0       -             2867.6       -             5378.9       -
9       1449.5       -             2910.0       -             5378.4       -
10      933.7        yes           1879.7       yes           3640.8       yes
11      934.0        yes           1877.5       yes           3671.5       yes
12      934.3        yes           1865.1       yes           3655.9       yes
13      934.8        yes           1881.1       yes           3645.7       yes
14      933.8        yes           1869.4       yes           3699.1       yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq

Table 23 MPI_Alltoall [remainder of caption illegible in source]
[Rotated table: the column layout and decimal points are not recoverable from the extraction, so the measured values are not reproduced here. Row labels that survive: flat MPI with a 512 MB buffer at 144, 448, 504, 2048 and 3072 processes; hybrid with a 1024 MB buffer at 68, 120, 132 and 432 processes × 2 threads, at 30, 36, 126, 180 and 612 processes × 4 threads, and at 17, 90 and 357 processes × 8 threads. Several configurations have additional rows for a second and third run. The last four columns give ratios for cases A to D with A = 1.00.]

brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

ucircuumlyacutethorn

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net

Appendix A FX1 HPCC BMT

[Two introductory sentences are illegible in the source. Surviving terms: G-FFTE, the compile option -DHPCC_FFT_235, FFT, EP-DGEMM.]

Table A-1 FX1 HPCC BMT [remainder of caption illegible in source]
[Rotated table: the body is not recoverable from the extraction.]


B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

1024EumlIgrave

Fig. B-1 MPI [...] (1024 [...], part 1)
[Figure: six panels, Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv and Gather, each comparing the rdma and nordma settings. Axis labels and plotted values could not be recovered.]


Fig. B-2 MPI [...] (1024 [...], part 2)
[Figure: eight panels, Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (Rank0 -> Rank1023), each comparing the rdma and nordma settings. Axis labels and plotted values could not be recovered.]


13

C

RD

MAoacute

NO

RD

MA

Ta

ble

C-13UVOcirc$

(8K

Bccedil

78K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

quw-

uqzy-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

~plusmnreg

plusmn

sup2~

sup2~

sup2cedil~raquo

sup2~Oslashreg~plusmn

middotplusmnsup1

middotplusmnsup1raquo

~plusmnraquo

~plusmn

~plusmn

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

plusmnraquo

Tabl

e C

-2UVOcirc$

(80

KBccedil

584K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

sup2~Oslashreg~plusmn

plusmn

~plusmnreg

middotplusmnmiddotsup1

~plusmnraquo

~plusmn

sup2~

sup2cedil~raquo

sup2~

middotplusmnsup1raquo

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

~plusmn

plusmnraquo

Zordf

100M

RD

MAAgravejordfnotyWXq

0CXM

NO

RD

MA

AgravejordfnotyWXq


D

W Oslash5

Tabl

e D

-1Zccedil`

MPI

8mpi

qzvquecirciumlq|yvwuecirciumlquvzuecirciumlquovu|oecirciumlqxvwoqecirciumlquyvxqoecirciumlquyvxoecirciumlqu

qwvo

qzvzzecirciumlq|vuecirciumlquwvyecirciumlquovu|oecirciumlqxvwuecirciumlquyvuzoecirciumlquxvy|ecirciumlqu

yqyv|

owvz|oecirciumlq|yvwyyecirciumlquvxzecirciumlquvqxecirciumlquov|uecirciumlquyvuecirciumlquccedilxv|xecirciumlq

ccedilyvq

ozvqoecirciumlq|vxxecirciumlquwvux|ecirciumlquvyzoecirciumlquovxoecirciumlquyvuuqecirciumlquccedilvyecirciumlq|

ccedilwuvy

ovq|ecirciumlquyvzz|ecirciumlquwvq|qecirciumlquovu|qecirciumlqxvw|wecirciumlquyvuyoecirciumlquyvywecirciumlqu

yquv

ovquyecirciumlquvuyqecirciumlquwvxqecirciumlquovu|qecirciumlqxvwecirciumlquyvuwqecirciumlquxvzxecirciumlqu

xx|vw

|ovq|zecirciumlquyvwzuecirciumlquvz||ecirciumlquvyoecirciumlquovwecirciumlquyvuzecirciumlquccedilovozecirciumlq|

ccediloyvy

|ovquecirciumlquv|qecirciumlquwvuoyecirciumlquvyy|ecirciumlquovowyecirciumlquyvuecirciumlquccedilvxwecirciumlq|

ccedilovz

uzvwoecirciumlq|vuxyecirciumlquwvu|uecirciumlquovu|ecirciumlqxvw|ecirciumlquyvuuxecirciumlquxvwwuecirciumlqu

yqovy

uzvywecirciumlq|yvzuzecirciumlquvzoyecirciumlquovuwecirciumlqxvwqyecirciumlquyvuecirciumlquyv|yecirciumlqu

yxvu

xzvqecirciumlq|vuoqecirciumlquwv|wqecirciumlquvyyecirciumlquovowecirciumlquyvuuzecirciumlquccedilvozecirciumlq|

ccedil|vx

xzvyzzecirciumlq|voxuecirciumlquwvouecirciumlquvqqecirciumlquovxuecirciumlquyvuuyecirciumlquccediluv|yecirciumlq|

ccedilu|v

yzvuecirciumlq|vxzecirciumlquwvxecirciumlquovuyecirciumlqxvwecirciumlquyvuxecirciumlquxvyzqecirciumlqu

xw|vw

yzvz|oecirciumlq|yvwzoecirciumlquvwwxecirciumlquovuzecirciumlqxvwyecirciumlquyvuyyecirciumlquyvuqwecirciumlqu

yuxv|

zvwqecirciumlq|v|uzecirciumlquwv||yecirciumlquvyxwecirciumlquovozecirciumlquyvuzecirciumlquccedilyvwuecirciumlq|

ccedilywv

ovqq|ecirciumlquyvzzyecirciumlquvzzzecirciumlquvyzecirciumlquovqoecirciumlquyvuzecirciumlquccedil|vozxecirciumlq|

ccedil|ovz

zvuzecirciumlq|vowqecirciumlquwvoxxecirciumlquovoqqecirciumlqxuvxwecirciumlquyvuqecirciumlquvwu|ecirciumlqu

zovy

zvwecirciumlq|vyxecirciumlquwvuecirciumlquovqzzecirciumlqxuvxoecirciumlquyvuyzecirciumlquvu|ecirciumlqu

wvw

qwv|xzecirciumlq|vzecirciumlquwvywecirciumlquvxyoecirciumlquovqwuecirciumlquyvuecirciumlquccedilovqyecirciumlquccedilov

qzvuuuecirciumlq|v|yecirciumlquwvqecirciumlquovuzecirciumlqxvwqqecirciumlquyvuzoecirciumlquyvqoecirciumlqu

y|vx

oovouwecirciumlquwvqqecirciumlquzvoywecirciumlquovuqwecirciumlqxvyxxecirciumlquyvuwecirciumlquuvzoxecirciumlqu

uwvq

owvwuecirciumlq|vq|qecirciumlquvzqzecirciumlquv|uuecirciumlquzvqoecirciumlq|yvu|ecirciumlquccedilxvyxqecirciumlq|

ccedilyuv|

yvz||ecirciumlq|v|oecirciumlquwvuuecirciumlquvqyecirciumlqxovuowecirciumlqxyvu||ecirciumlquovozecirciumlqxoxwvy

ovouecirciumlquvuxecirciumlquwv|zecirciumlquovu|oecirciumlqxvwxyecirciumlquyvux|ecirciumlquxvzoecirciumlqu

xoyvo

|ov|qoecirciumlquv|zqecirciumlquwvyzoecirciumlquv|oecirciumlquwvu|uecirciumlq|yvu|ecirciumlquccedilov|uecirciumlquccediloqxv

|v|qecirciumlq|vqecirciumlquvxzecirciumlquvxqecirciumlquovq|ecirciumlquyvuzecirciumlquccedilv|zoecirciumlq|

ccedil|vu

uwvxy|ecirciumlq|wvzoyecirciumlquzv|ecirciumlquovuozecirciumlqxvuoqecirciumlquyvwqecirciumlquuvuoecirciumlqu

xoxvw

uovquecirciumlquvwecirciumlquwvqecirciumlquovuouecirciumlqxvyxqecirciumlquyvuzoecirciumlquxvwoecirciumlqu

xy|vx

xzvux|ecirciumlq|vxwecirciumlquwvx|ecirciumlquvx|oecirciumlquovoqoecirciumlquyvu|qecirciumlquccedilzvzoecirciumlq|ccediloquvz

xwvqqecirciumlq|vuuwecirciumlquwv|owecirciumlquvzuzecirciumlquov|wyecirciumlquyvxy|ecirciumlquccedil|vywuecirciumlq|

ccediluv|

yzvqecirciumlq|zv|w|ecirciumlquovq|qecirciumlqxovuxecirciumlqxvyecirciumlquyvzqecirciumlquuvqecirciumlqu

uyuvq

yovqyuecirciumlquv|quecirciumlquwv|ywecirciumlquovuyuecirciumlqxwvqxwecirciumlquyvxwoecirciumlquyvecirciumlqu

xwzvx

ovoooecirciumlquvzwwecirciumlquzvqzzecirciumlquvo|ecirciumlqxovuoecirciumlqxyvyouecirciumlquovecirciumlqxooqqvy

ovqoecirciumlquvqyoecirciumlquwvo|ecirciumlquvy|ecirciumlquovoecirciumlquyvuxecirciumlquccediluvxzecirciumlq|

ccediluvz

zvy|ecirciumlq|wvoqqecirciumlquzvqyecirciumlquov|uqecirciumlqxyvw|ecirciumlquyvxyecirciumlquuv||ecirciumlqu

uzovo

zvyzoecirciumlq|vqwecirciumlquwvoecirciumlquovqzwecirciumlqxuvuwwecirciumlquyvuzyecirciumlquvwqyecirciumlqu

yxvy

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg


Fig. D-1 [...] MPI, 8mpi
[Figure: four panels, rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2 and nordma (8mpi)-2. Axis labels, legend entries and plotted values could not be recovered.]


Tabl

e D

-2Icirca|ccedil`eacuteYacute~

1

4mpij

8om

p

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Fig. D-2 [...] 1, 4mpi 8omp (1)
[Figure: two panels, rdma (4mpi 8omp)-1 and nordma (4mpi 8omp)-1. Axis labels, legend entries and plotted values could not be recovered.]


Fig. D-3 [...] 1, 4mpi 8omp (2)
[Figure: four panels, rdma (4mpi 8omp)-2, nordma (4mpi 8omp)-2, rdma (4mpi 8omp)-3 and nordma (4mpi 8omp)-3. Axis labels, legend entries and plotted values could not be recovered.]


Tabl

e D

-3Icirca|ccedil`eacuteYacute~

2

4mpij

8om

p

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

Fig. D-4 [...] 2, 4mpi 8omp (1)
[Figure: two panels, rdma (4mpi 8omp)-1 and nordma (4mpi 8omp)-1. Axis labels, legend entries and plotted values could not be recovered.]


Fig. D-5 [...] 2, 4mpi 8omp (2)
[Figure: four panels, rdma (4mpi 8omp)-2, nordma (4mpi 8omp)-2, rdma (4mpi 8omp)-3 and nordma (4mpi 8omp)-3. Axis labels, legend entries and plotted values could not be recovered.]


E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

nmb = 128 intcpu = 1 Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes all time
RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MBs
RANK = 2 TIME = 39197 sec Ave Speed = 32655168 MBs
RANK = 3 TIME = 39212 sec Ave Speed = 32643219 MBs
RANK = 4 TIME = 34341 sec Ave Speed = 37272765 MBs
RANK = 5 TIME = 34175 sec Ave Speed = 37453885 MBs
RANK = 6 TIME = 34086 sec Ave Speed = 37552470 MBs
RANK = 7 TIME = 34329 sec Ave Speed = 37286226 MBs
RANK = 8 TIME = 24021 sec Ave Speed = 53287420 MBs
RANK = 9 TIME = 23794 sec Ave Speed = 53794939 MBs
RANK = 10 TIME = 23805 sec Ave Speed = 53770236 MBs
RANK = 11 TIME = 23798 sec Ave Speed = 53786914 MBs
RANK = 12 TIME = 23942 sec Ave Speed = 53462490 MBs
RANK = 13 TIME = 23931 sec Ave Speed = 53488046 MBs
RANK = 14 TIME = 23950 sec Ave Speed = 53443826 MBs
RANK = 15 TIME = 24066 sec Ave Speed = 53186052 MBs
RANK = 16 TIME = 23798 sec Ave Speed = 53786327 MBs
RANK = 17 TIME = 23802 sec Ave Speed = 53775886 MBs
RANK = 18 TIME = 23821 sec Ave Speed = 53733399 MBs
RANK = 19 TIME = 23794 sec Ave Speed = 53795866 MBs
RANK = 20 TIME = 23939 sec Ave Speed = 53468837 MBs
RANK = 21 TIME = 23938 sec Ave Speed = 53470775 MBs
RANK = 22 TIME = 23939 sec Ave Speed = 53470333 MBs
RANK = 23 TIME = 23942 sec Ave Speed = 53462889 MBs


Fig E-2 afeaZQR MPIbrvbar

nmb = 128 intcpu = 1 Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes all time
RANK = 1 TIME = 39370 sec Ave Speed = 32511775 MBs
RANK = 2 TIME = 39475 sec Ave Speed = 32425231 MBs
RANK = 3 TIME = 39415 sec Ave Speed = 32474846 MBs
RANK = 4 TIME = 34392 sec Ave Speed = 37217925 MBs
RANK = 5 TIME = 34543 sec Ave Speed = 37055373 MBs
RANK = 6 TIME = 34250 sec Ave Speed = 37372742 MBs
RANK = 7 TIME = 34709 sec Ave Speed = 36878286 MBs
RANK = 8 TIME = 26967 sec Ave Speed = 47465876 MBs
RANK = 9 TIME = 26964 sec Ave Speed = 47471018 MBs
RANK = 10 TIME = 26955 sec Ave Speed = 47486289 MBs
RANK = 11 TIME = 26947 sec Ave Speed = 47499771 MBs
RANK = 12 TIME = 26957 sec Ave Speed = 47483647 MBs
RANK = 13 TIME = 27020 sec Ave Speed = 47373054 MBs
RANK = 14 TIME = 26948 sec Ave Speed = 47498023 MBs
RANK = 15 TIME = 27174 sec Ave Speed = 47104542 MBs
RANK = 16 TIME = 26956 sec Ave Speed = 47485054 MBs
RANK = 17 TIME = 26945 sec Ave Speed = 47504899 MBs
RANK = 18 TIME = 26946 sec Ave Speed = 47502360 MBs
RANK = 19 TIME = 26967 sec Ave Speed = 47466158 MBs
RANK = 20 TIME = 26948 sec Ave Speed = 47499767 MBs
RANK = 21 TIME = 26982 sec Ave Speed = 47438204 MBs
RANK = 22 TIME = 27087 sec Ave Speed = 47254830 MBs
RANK = 23 TIME = 27004 sec Ave Speed = 47399960 MBs


Fig E-3 QRfeaZOpenMPIbrvbar

nmb = 128 intcpu = 1 Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes all time
                                                       NODE    CPU   coreid
Rank = 0                                               bx2050  CPU0  0
RANK = 1 TIME = 38110 sec Ave Speed = 33587213 MBs     bx2050  CPU1  1
RANK = 2 TIME = 28656 sec Ave Speed = 44667752 MBs     bx2050  CPU0  2
RANK = 3 TIME = 37959 sec Ave Speed = 33720175 MBs     bx2050  CPU1  3
RANK = 4 TIME = 28658 sec Ave Speed = 44664664 MBs     bx2050  CPU0  4
RANK = 5 TIME = 37829 sec Ave Speed = 33836073 MBs     bx2050  CPU1  5
RANK = 6 TIME = 28641 sec Ave Speed = 44691880 MBs     bx2050  CPU0  6
RANK = 7 TIME = 37892 sec Ave Speed = 33779924 MBs     bx2050  CPU1  7
RANK = 8 TIME = 35140 sec Ave Speed = 36425620 MBs     bx2051  CPU0  0
RANK = 9 TIME = 35017 sec Ave Speed = 36554163 MBs     bx2051  CPU1  1
RANK = 10 TIME = 38916 sec Ave Speed = 32891010 MBs    bx2051  CPU0  2
RANK = 11 TIME = 45416 sec Ave Speed = 28184186 MBs    bx2051  CPU1  3
RANK = 12 TIME = 35172 sec Ave Speed = 36392511 MBs    bx2051  CPU0  4
RANK = 13 TIME = 35053 sec Ave Speed = 36516468 MBs    bx2051  CPU1  5
RANK = 14 TIME = 35200 sec Ave Speed = 36363472 MBs    bx2051  CPU0  6
RANK = 15 TIME = 35009 sec Ave Speed = 36561987 MBs    bx2051  CPU1  7
RANK = 16 TIME = 22996 sec Ave Speed = 55661441 MBs    bx2052  CPU0  0
RANK = 17 TIME = 35587 sec Ave Speed = 35968575 MBs    bx2052  CPU1  1
RANK = 18 TIME = 35587 sec Ave Speed = 35968151 MBs    bx2052  CPU0  2
RANK = 19 TIME = 35712 sec Ave Speed = 35842525 MBs    bx2052  CPU1  3
RANK = 20 TIME = 22996 sec Ave Speed = 55661978 MBs    bx2052  CPU0  4
RANK = 21 TIME = 36856 sec Ave Speed = 34729458 MBs    bx2052  CPU1  5
RANK = 22 TIME = 22994 sec Ave Speed = 55666382 MBs    bx2052  CPU0  6
RANK = 23 TIME = 22991 sec Ave Speed = 55673257 MBs    bx2052  CPU1  7


13

F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig F-1 ^_ZugraveuacuteLFlat MPIP

(main)
      n = nmb*1024*1024/8        ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                  ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
        a(i) = 1.0d0
        b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
        call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
        b(i) = a(i)
      end do
      return
      end
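The arithmetic behind the listing above can be written out as follows. This is a sketch that reads the `ave_spd` line as (bytes in one array) divided by (time per repetition), converted to MB with a factor of 1024²; that reading is an interpretation of the listing, not a statement taken from the report. The symbols `nmb`, `n`, `NNN`, `timew` and `ave_spd` are those of the listing, and the numerical timing at the end is purely hypothetical.

```latex
% Quantities named as in the listing: nmb, n, NNN, timew, ave_spd.
\begin{align*}
n &= \frac{\mathtt{nmb}\cdot 1024\cdot 1024}{8}
   = \frac{128\cdot 1024\cdot 1024}{8}
   = 16\,777\,216 \quad \text{(8-byte elements per array)},\\[4pt]
8n &= 134\,217\,728\ \text{bytes} = 128\cdot 1024^{2}\ \text{bytes per call of } \mathtt{atob},\\[4pt]
\mathtt{ave\_spd} &= \frac{8n}{\mathtt{timew}/\mathtt{NNN}}\cdot\frac{1}{1024^{2}}
   = \frac{128\cdot\mathtt{NNN}}{\mathtt{timew}}\ \text{MB/s}.
\end{align*}
% Hypothetical illustration (not a measured value):
% timew = 4 s with NNN = 100 gives 0.04 s per copy,
% so ave_spd = 128 / 0.04 = 3200 MB/s.
```

Under this reading the figure counts only the bytes of one array per copy, so the combined read and write traffic seen by the memory system would be twice the reported value.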


Fig F-2 ^_ZugraveuacuteLOpenMPP

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8        ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
        write(6,*) 'Array allocation failed'
        stop
      end if
!$OMP end parallel
      nsize = n                  ! size
c     NNN : Repetition number of times
      NNN = 100
!$OMP parallel private(i) shared(nsize)
      do 100 i = 1, nsize
        a(i) = 1.0d0
        b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
        call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
        b(i) = a(i)
      end do
      return
      end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

sup2~ yntildeouu |ntildewzz |ntildeuou ovou ontildezx ontildeqyw ovw| ovzz |vqsup2~ xuntildeww ntilde|o ntildeuo qvzz |ntildeqo |ntildey qvz vu vsup2~ wyntildeu| oqntildexx oontildeqyy qvz |ntildezwy untildeuyw qvwz vq vuwsup2~ ontildequwntildexy ountildexxy oxntildeuy qvzu untildezqq xntildewyw qvwu vz vyu

sup2~ yntildeouu zou zo qvzu ontildeqqq ontildeq qvzw qvzo qvzxsup2~ xuntildeww ontildeo ontildezy qvwz ontildew ntildeuuz qv| qvz qvzsup2~ wyntildeu| ntildexuo ntildey| qvz ntildeyzz |ntildeuqw qvz qvzu qvwosup2~ ontildequwntildexy |ntildex untildeqox qvzu |ntildexww untilde|| qvw ovqx qvz

sup2~Oslashreg~plusmn yntildeouu yntildeuw| wontildexoz qvzu xntildeuo |ntildeoqw ovoq |vqq |vx|sup2~Oslashreg~plusmn xuntildeww ou|ntildewx ountildeqo qvw| uyntildeqzq yyntildewxo qvyz |vo vyosup2~Oslashreg~plusmn wyntildeu| oontildezoo yzntildeuz qvz yntildeqz zxntildeox qvo |voy vw|sup2~Oslashreg~plusmn ontildequwntildexy w|ntildeyox |yontildeuq| qvw wntildexo o|ntildexqz qvo |v| vz|

middotplusmnsup1 yntildeouu |ntildezw ||ntilde|x| qvzz oontildexww oxntildeyzu qvu vwx vo|middotplusmnsup1 xuntildeww y|ntildex yyntildeyy qvzx ontildez|x untilde|x qvuy vww ovuomiddotplusmnsup1 wyntildeu| wzntildewy z|ntildeyux qvzy |qntildeyx yntildeyqu qvuy vz ov|zmiddotplusmnsup1 ontildequwntildexy oqntildeux| o|untildeqox qvzq uqntildeyxx wwntildez| qvuy vzy ovxo

middotplusmnsup1raquo yntildeouu yntildew|u ||ntildeu ovww ntildewxy oxntildeyuw uvyy qvwy voumiddotplusmnsup1raquo xuntildeww wxntildeoz yuntildeo| ov|| wxntildex|| untildeuy ovwq ovqq ov|xmiddotplusmnsup1raquo wyntildeu| ooxntildeuw z|ntildezyu ov| z|ntildeyow yntildexuz ov|z ov| ov|zmiddotplusmnsup1raquo ontildequwntildexy o|ntildexx o|untildeuqz ovq zyntildexoq wntildeuuy ovoq ovu| ovxu

Ugraveplusmnsup1 yntildeouu |zntildeyu| oqntildexwz |vu uontildeuyq ountildewy vz qvzy qvoUgraveplusmnsup1 xuntildeww untildewxx ntildeqo voy uzntildeoy| yntildewo ovw| qvz qvw|Ugraveplusmnsup1 wyntildeu| xxntildewoq zntildewyu ovw xyntildewuu |yntildexqx ovxy qvzw qvwUgraveplusmnsup1 ontildequwntildexy yntildewxy |ntildexxw ovy yuntildewzx uyntildeoq ovuo qvz qvwo

Ugraveplusmnsup1raquo yntildeouu uqntildeq|u oqntildeyxq |vy uontildey| ountildez|z vz qvzy qvoUgraveplusmnsup1raquo xuntildeww uwntilde|u ntildeqz vo uzntildew yntildewx ovw| qvzw qvw|Ugraveplusmnsup1raquo wyntildeu| xyntildezy zntildez|y ovww xntildeo| |yntildeuwq ovx qvzw qvwUgraveplusmnsup1raquo ontildequwntildexy y|ntildeo| |ntildeyxq ovyw yxntilde|yw uyntildeqx ovuo qvz qvwo

~plusmn yntildeouu untildeyw wntildezu uvo uuntildeo wntildeuq| xvx qvzy ovq~plusmn xuntildeww xqntildeqyx qntilde|xq vuy xontildexwy untildezux vq qvz qvw~plusmn wyntildeu| xwntildeq wntildeux vqy yqntildeox |untildeu ov| qvzw qvw~plusmn ontildequwntildexy yyntildequq |yntilde|q| ovw ywntildeqx uuntildeuyy ovx| qvz qvw

~plusmnraquo yntildeouu zntilde|q zntildeqq ovqu zntildexq wntildeu|w ovo| qvzw ovq~plusmnraquo xuntildeww oyntildeu| qntilde|wy qvwo oyntildeoz untildezuy qvyx ovq qvw~plusmnraquo wyntildeu| |ntildewo wntildeuxu qvw ntildewz |untilde qvyy ovq qvw~plusmnraquo ontildequwntildexy zntildezy |yntildexo| qvw zntildezxq uuntildeu qvy ovqq qvw

plusmn yntildeouu yxntildewu wzntildeq| qv| ontildeyoz yntildezz qvwq |vq |v|qplusmn xuntildeww ountildeqoz oxntilde|o qvwo |zntildexu xxntilde|ux qv |vo vxplusmn wyntildeu| ow|ntildeuwy untildeyu qvw xntildeoyu untildezu| qvy |vo |vqqplusmn ontildequwntildexy u|ntildewo zxntildezqo qvw xntildezy zuntildeoy qvwq |vo |vo

plusmnraquo yntildeouu x|ontildeyx wyntildeqoz yvow o|ntildeqox owntilde|xu vu |vww uvyzplusmnraquo xuntildeww yozntildeyz| oz|ntildeuwu |vq oyntildeyw ywntildeyw v| |vwo vwplusmnraquo wyntildeu| yw|ntildewxu wwntildeooz v| owyntilde|q| oqxntildeyw ovy |vy vplusmnraquo ontildequwntildexy |yntildezo uqxntildeoyq ovw oontildexo ouontildezzx ovuz |vuw vwx

~plusmnreg yntildeouu uzu y| qvw uwo xzw qvwq ovq| ovqy~plusmnreg xuntildeww zq| ontildeoz qvy wo| ontildexuy qvx| ovoo qv~plusmnreg wyntildeu| ontilde|q ontildeyq qvw ontildeoxo ntildeoqz qvxx ovo| qvz~plusmnreg ontildequwntildexy ontildezqu ntilde|ux qvwo ontildeuxu ntildeyx qvxu ov|o qvww

~plusmn yntildeouu ountildewqx ountildez|x qvzz ontildeqq owntildeuw| ovou qvq qvwo~plusmn xuntildeww zntildeuy zntildewxz qvzw uqntildex|o uwntildexyw qvw| qv qvyo~plusmn wyntildeu| uuntildeq uuntildeo qvzz xyntildeyoq yzntildexzw qvwo qvw qvyu~plusmn ontildequwntildexy qntildeqqx qntildeu|q qvzz ntildeo zqntilde|ux qvwo qvzy qvw

sup2cedil~raquo yntildeouu x w qvz zo ux vq| qvw ov|sup2cedil~raquo xuntildeww ox o| qv| o|w o| qvyx qvzo qvwosup2cedil~raquo wyntildeu| oo |o qvu owz w| qvy qvzo qvwsup2cedil~raquo ontildequwntildexy ou wz qvu | |x qvyx qvz qvwo

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

sectWOcirceg

ow$XAcircsup3deg owsup3degIcircwsup3



MOcircOtildeY12 12PBLRAID6Pgtlaquosect13gt 576GBs IO LgYPnotGLFig 2PqIO iumlIuml InfiniBand nota13Parallelnavi SRFSbrvbarsectciumlIumlcdYoacuteCaidefgCq Ocirc$OcircOtildeYOumltimeslaquoDsect 466Za SASOcircOtildeYL1TB7200rpmP8ampG

qmndOcircOtildeY RAID6 gt^UcircCOumltimesgt`ZagGmWgt IO AumlFUcirchpoundqXIOiumlIuml_aumla13Sun QFSBCLFig 3Pq

UVEumlIgraveOIacute11913aelig13()

i QDR 9j2 (113aelig13ntildesect) InfiniBandaccedil

(9iexcl) i FGFAring 72GBs (2iumlIumlcurren) IOiumlIuml (M90002iumlIuml) i FGFAring 576GBs (OcircOtildeYcurren) OgraveOacuteOcircOtildeY (ETERNUS DX80366) ()9kUcirclmiumlIumlnC

1 2 3 119

IBUSW

IO node

Fig 2 oOumltimesYZJWFGFAring

Fig 3 [diagram; recoverable labels: M9000 IO nodes 1 and 2; FC (1) to FC (42); ETERNUS DX80 (36): DX80 1, DX80 2, ... DX80 35, DX80 36, each with 1TB disks; E4K 1: ETERNUS4000 Model 400 (1), with 1TB disks; QFS; 32GBs (8 FC); 576GBs (144 FC)]

3 HPCCYbrvbar

HPCC (High Performance Computing Challenge)1) Y 7^_ZB BX900W FX15poundDq13gtHPCCEcircupqWrstuq 31 HPCCYEcircu

HPCC YvR Tennessee JDongara wRordf3xWXpoundyoBCefgh$BY`^_ZzgtlaquoqO

|Y^_ WpoundDefgh$acircuXu

GmWbrvbarsectefgh$13WCeacuteZGUacuteC

qO HPL(High-Performance Linpack)DGEMMFFTE 3q|^_Zgt|Y STREAM W Random Access 2 q|^_Zgt _ PTRANSW b_eff 2q|^_Zgt5n~noacuteCc^_Zbrvbar5PJWXpound 2)qo

^_ZUuml|YacuteTHORNFortranCgtcnordfHPCCYWCCgtcnD 1^_ZeumlnqXHPCCYC 2003 sect 300wordfeumlnsect+X13rsyBGmW9gtlaquoq

311 ^_Z efgh$13f CPU 1 iexclXCiexcl8ampG

L13iumlIumlPa$fY`gtYZCDYZ$JgtlaquosectCoreCPUiumlIuml13WeumlUgraveUgraveXgtO=ordf5nDHogtordfugt

laquoqmgtiumlIumlMG` (G Global system performance)WiumlIuml[gt` (SN Single Environment)gt` (EP Embarrassingly Parallel) 3gt5Pq

3.1.1.1 HPL^_Z

HPL5IgraveLUcurrengtCmWgt13OG^_ZgtlaquoqOslashm^_Z|5 CcedilEgrave^_ZaumlIacutecurrenWCyB

eumlnmWordfqUgraveDefgh$13O[P TOP500B^_ZWCBeumln LinpackMPIEumlIgraveUcircWCnotgtlaquoqiumlIuml`$XOordfecircMAumlWXq Tflopsgteumlnq` 1gtiumlIumlegravepoundDM(G)`5Pq


3.1.1.2 DGEMM^_Z

DGEMM5IgraveGmWgtiumlIumlOG^_Zgtlaquoqo^_gt5curren13^_W i5Xqm^_Z LinpackSgtegravenauml^_ZgtlaquoqUgraveDaumlefgh$gto13zeumlnDeacuteaTHORN|ordfZaZ|WCBeumlnsect meacuteaTHORN|ordfoacute WXqY`LBMTPbrvbariumlIumlOiexclcentGq Gflops (Giga floating-point operations per second)gteumlnq` 2gtiumlIuml[(SN)(EP)`5Pq

3.1.1.3 STREAM^_Z

STREAM iumlIumlordf|W CDWecirceacuteIumlG^_Zgtlaquoqm Wecirco^_gtO=ordf5np^_W i5Xqm^_Z cpound(Copy)curren(Scale)Myen(Add)yen(Triad) 4 XO=gtJeumlnsectiumlIuml|YiexclcentGq GB/s (Giga Bytes per second)gteumlnq` 8 gt[(SN)(EP)gtoiumlIuml 4(cpoundcurrenMyenyen)`5Pq

3.1.1.4 PTRANS^_Z

PTRANS (Parallel matrix TRANSpose)EumlIgrave^_ocirc5IgraveFtimesbrvbarsect13ccedil`brvbarY G^_Zgtlaquoq GB/s gteumlnq` 1gtiumlIumlegravepoundDM(G)`5Pq

3.1.1.5 Random Access^_Z

Random Access |sectumleumlnDecopyIgravecdZordf 1 ulaquonotgtshyregGO=ordf 1 macrdegplusmn5ncG^_Zgtlaquoq1 plusmnYGOcirc$2ordf8BWgteumlordfgtlaquoqiumlIuml|Yucircsup2`WiumlIuml MPI gtucircsup2`5Pq Gups (Giga updates per second)gteumlnq` 3 gtiumlIuml[(SN)(EP)iumlIumlegravepoundDM(G)`5Pq

3.1.1.6 FFTE^_Z

FFTEAumlF|sup3^acuteOG^_ZgtlaquoqHPLWiacute CcedilEgrave^_ZaumlIacutecurrenWCyBeumlnmWordfq[(SN)W(EP)`gt2 [microcopyIgraveordfo^_gtpara|sup3^acuteeumlnqiumlIumlegravepoundDM(G)`gt3[microcopyIgraveordflaquo 1AgravezmiddotAcurrenetheumlnDcedil|sup3^acuteeumlnlaquoW^_ordf^_W 5PmWLoacute Pbrvbaro^_ copyIgravesup1copytimesordm^_raquofrac14sect[microoacuteG|sup3^acuteordf5nq Gflopsgteumlnq

3.1.1.7 b_eff^_Z

b_eff Ocirc$FGFGfrac12frac34(7a13acute latency)WFGFAring(eacuteIuml bandwidth)shycdiquestH^_ZgtlaquoqFGfrac12frac34ordfAgraveXCeacuteIumlordfecircM13^_ ordfAacutenmWGqb_eff iumlIumlegravepoundDM(G)G Ring W PingPong 2 ^_ZgtJeumlnqb_eff FGGOcirc$-a5mnUgravegt^_ZW+XsectlOcirc$-a5 N ] MPI ^_ P atildeAcircXq7a13 8B Ocirc$eacuteIuml 2MB Ocirc$FGGmWbrvbarsect5Pq Ring MPIbrvbarpoundAtildeHdnDZYAumlAringAEligo^_oacuteCAumlAring 1oacute 1 5PoumlWZYAumlAringZordfEumlu 1oacute 1 5Pouml 2CcedilEgravegtCDq PingPong 2Core Eacute Ecircnototilde^_ordfethsectntildednp Core EumlCOslash5Gq2CoreIgraveEacuteIacute5poundecircTIcircIIumlshyGq b_effETHNtildeOgraveRing(AumlAring)brvbar7a13OacuteRing(AumlAring)brvbareacuteIumlOcircRing(Zordf)brvbar7a13OtildeRing(Zordf)brvbareacuteIumlOumlPingPongbrvbar7a13timesPingPongbrvbareacuteIuml 6gtlaquoq7a13ordfaY_LOslashPmacreacuteIumlordf GB/sgteumlnq

3.1.2 Oslash5

HPCCZaZ(Baseline Runs)WUuml^Otildea5Z(Optimized Runs) 2Oslash5ordflaquoordffeaZTUacuteUcircEacutebrvbarZaZgt5pound

q

3121 ZaZ Oslash5gtlaquoZaZfeaZbrvbarTUacuteUcircW13Oumlfrac34

eumlnaumlUgraveAuml BLAS ZaZ|W MPI ZaZ|egraveBUgravegtUacuteeumlnordfUcircfIuml^regUacuteeumlnXqegraveBCDfeaZeacuteYacute~

WfeaegraveBCDUuml^13~ZaZ|eacuteYacute~UumlordfYacutegtlaquoq 3122 Uuml^Otildea5Z UcircfIumlTHORNszligagravePTUacuteUcirc2 7gtUacute9eumlnqiAgraveaacutednDIacutecurren(-)fIumlTUacuteUcircgtlaquosectpAgrave|5macr8CEcircnototilde3XTUacuteUcircgtlaquoqUgraveD acirc HPCC^_YacuteNtildeY`acircatilde^WaumlaringordfaeligugtlaquosectatildeCUumlGaeliguordflaquoq


3.2 BX900 FX1n~nHPCC BMT^_Z 8~512EumlIgravegtOslash5CDq

3.2.1 BX900

BX900 HPCC BMT^_Z Table 1GWHPL PTRANS Random Access FFTEo^_ZZK|Otilde Fig 4~Fig 7GqZK|Otildedegn^_Zegraveeacuteecirccopyordfeuml]cgtlaquoordf512 EumlIgravegtigraveyenCiacuteXCiacuteicircXordfdegdnqUgraveDHPL=NoacuteC 82.2% (=4.90/5.96×100)1gtlaquopoundDq
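
The HPL figure quoted above for 512 parallel processes on BX900 is a ratio of measured to theoretical performance. A minimal sketch of that arithmetic, assuming the figures are 4.90 Tflops measured against 5.96 Tflops theoretical peak (the function name `hpl_efficiency` is hypothetical and used only for illustration):

```python
def hpl_efficiency(measured_tflops: float, peak_tflops: float) -> float:
    """Return measured HPL performance as a percentage of theoretical peak."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return measured_tflops / peak_tflops * 100.0


# Assumed inputs: 4.90 Tflops measured, 5.96 Tflops theoretical peak.
print(round(hpl_efficiency(4.90, 5.96), 1))  # 82.2
```

The same function applies to any other measured/peak pair, provided both values are in the same unit.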

Table 1iumlntildeethntildeograveoacuteWCHPCCocircAEligcdotildebIcircIumlETHNtildeoumlccedilYAElig Intel Endeavor cluster (13gt Endeavor Wdivideoslash)HPCCWeboumlYacute 1) ugraveuacute Table 2Gq BX900shyucircoB3gtlaquoDH Ecircnototilde^_ZP EndeavoregraveeacuteplusmnpoundDlaquoordfEcircuumlb13yacuteordfdegdnqUgraveDEndeavorWEacute IntelfeaZBD Table 3 GbrvbarPQRfeaZbrvbarWbgtlaquopoundDqDotildeCSTREAM iumlIumlSgt5poundmWcd thornCXatilde

d(35GBs oacute 42GBs)W 2ethM EndeavorbrvbarsectEshygtlaquopoundDqm422gtaringNGq

1 BX90013VBbrvbargt17072fgt=N 2000TflopsoacuteC 1914TflopsOslashCsectOslashszlig 9566gtlaquopoundDq


Table 1 HPCC BMT results on JAEA's BX900 with the Fujitsu compiler

Columns, in order: parallelism; communication mode; G-HPL (TFlops); G-PTRANS (GB/s); G-Random Access, original version (Gups); G-Random Access, SANDIA_OPT2 version (Gups); G-FFTE (GFlops); EP-STREAM Sys (*) (GB/s); EP-STREAM Triad (GB/s); EP-DGEMM (GFlops); RandomRing Bandwidth (GB/s); RandomRing Latency (usec)

8    rdma    008 1315 12 9330 001 5112 008 9847 555 3328 074 350 9311 018 41 1171 51 268
8    nordma  008 1226 12 9544 001 5504 009 0139 557 5727 813 347 6711 023 91 1053 51 278
16   rdma    0 1579 130 510 540 0238 850 1510 719 4469 555 143 4697 111 145 092 418 149 3
16   nordma  0 1536 650 491 080 0222 270 1334 439 7508 555 683 4730 111 123 078 781 324 5
32   rdma    032 2493 09 0716 003 0751 024 2906 186 357 111 164 347 3911 101 00 5220 61 944
32   nordma  032 3494 08 4422 003 5001 023 7182 190 981 112 052 350 1611 100 30 5194 95 010
64   rdma    0 6201 260 150 637 003 2786 037 8769 369 923 222 203 347 1911 092 40 4192 43 156
64   nordma  060 9871 014 686 80 0763 460 3882 6638 139 722 260 03 4781 110 887 044 542 652 7
128  rdma    120 6270 026 303 70 0370 050 5186 5673 075 744 426 43 4708 111 007 038 892 463 5
128  nordma  1 2265 800 302 225 007 3652 061 6532 742 721 444 063 346 9211 100 50 4161 78 290
256  rdma    2 4166 200 583 348 002 9897 053 8761 139 3770 888 453 347 0511 091 00 3583 55 937
256  nordma  236 3890 057 519 80 0765 560 8003 4814 845 5089 503 73 4962 110 879 040 323 103 72
512  rdma    468 4620 011 199 400 0163 910 2622 2026 859 1017 779 513 4726 111 147 032 838 711 3
512  nordma  4 9017 400 109 9150 008 2799 153 7610 275 5280 1816 433 354 7711 111 90 3572 112 790

System: Fujitsu BX900 (512 Core / 128 CPU)
Chip: Xeon X5570 2.93GHz, DDR3-1066, SMT-ON, Turbo Mode -OFF
Interconnect: QDR Infiniband, 2 socket/node, Fat Tree
OS: RedHat EL 5
Compiler: Fujitsu C/C++, Option -Kfast -SSL2
MPI: Fujitsu Parallelnavi Language Package
HPCC Ver: V1.3.1

(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × parallelism.
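
The note to Table 1 states that EP-STREAM Sys is calculated as (EP-STREAM Triad) × parallelism. A minimal sketch of that calculation follows; the function name `ep_stream_sys` and the per-process Triad value of 3.5 GB/s are illustrative assumptions, not values taken from the table.

```python
def ep_stream_sys(triad_gb_per_s: float, parallelism: int) -> float:
    """Aggregate EP-STREAM bandwidth: per-process Triad bandwidth times process count."""
    if parallelism <= 0:
        raise ValueError("parallelism must be positive")
    return triad_gb_per_s * parallelism


# Hypothetical per-process Triad bandwidth, used only to illustrate the formula.
for n in (8, 16, 32, 64, 128, 256, 512):
    print(n, ep_stream_sys(3.5, n))
```

Because the Sys column is derived rather than measured, it scales linearly with parallelism whenever the per-process Triad value stays flat.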

Fig 4 BX900 HPL

Fig 5 BX900 PTRANS

Fig 6 BX900 Random Access

Fig 7 BX900 FFTE

Table 2 Intel Endeavor cluster (HPCC Web 1))

Table 3 HPCC BMT results on JAEA's BX900 with the Intel compiler

3.2.2 FX1

FX1 HPCC BMT Table 4 HPL

PTRANS Random Access FFTE Fig 8 Fig 11512

HPL755(=377650100)2

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037

Table 4 HPCC BMT results on FX1

Fig 8 FX1 HPL

Fig 9 FX1 PTRANS

Fig 10 FX1 Random Access

Fig 11 FX1 FFTE

33 HPCCYUgraveWH Uklq BX900W FX1WrsndWK(Altix3700Bx2)Wrs

Fig 12GqmhgtTWXpoundDqshy 100WCLRandom Ring LatencyPqRandom Ring Latency13BX900ordf`ccedil^fCmWordfcurrencq

BX900 W FX1 rsGWSTREAM W FFTE necircOatildeC atildeC eacuteIumlrvCDordfmacrdnDqSTREAM

421FFTE 43aringNGq XAltix3700Bx2 Random AccessordfEordfAltix3700Bx2W

plusmngtegraveBCD HPCC BMTeacuteYacute~cdOcirc$Y|5ordf+Xpoundsect8YrsGmW13XqAltix3700Bx2 atildeC FFTEEordfmdoacute OslashszligWuumlwdn

q

[Fig 12 chart: BX900, FX1 and Altix3700Bx2 compared on a 0 to 100 scale at 512 parallel (Flat MPI). Axes: G-HPL (4.90 Tflops), G-RandomAccess (Gups), G-PTRANS (GB/s), EP-STREAM (4.36 GB/s), EP-DGEMM (11.1 Gflops), G-FFTE (Gflops), RandomRing Bandwidth (0.359 GB/s), RandomRing Latency (5.18 usec). An inset listing per-system figures in [GFLOPS] and [GB/s] is not recoverable.]

Fig 12 HPCC BMT (Random Ring Latency: 5.18 usec)

4 BX900W FX1TUacuteUcirc 32 c^_Zgt BX900 W FX1 IcircIumlETHNtildeoumlccedilYWiCXordfWnDmWQR^_aZegraveBC(Xcurrenyen

5poundDqeumldfeaZUuml^13~brvbarTUacuteUcirc]Ucirc7^_ZTUacuteUcirc

EacuteDq 41 ^_aZ

BX900 W FX1 gtegraveBCD^_aZWQRUgraveyenXgtlaquosectoacute ^_ZOslash5^_a|Ocirc$divideC^_ZOslash5cedildivideOcirc$divideC

^_Z=)yen5PmWordf13q ^_aZyBGUgrave^_Z_IumlYacuteh)JGDHfe

a|YUuml^13~UKtl_trtGq[ _ZOslash5_IumlYacute

h fpcoll fIumlgt=GqTcedildivideCDeacuteaTHORN|Ocirc$ fprof fIumlgtdivideCaring`Gq^_aZgtPAElig 8q|gtlaquoq

^_a|Ocirc$divideAElig AElig ^_ AElig MPIZaZ|ordm)AElig f`AElig IcircIumlETHNtilde$AElig fZAElig UcircfIumlAElig

mPIcircIumlETHNtilde$AEliggtCPUlt=gtOaringaeligccedil13

hccedil`|WOcirc$FGszligbordfoacute WXq

42 feaZbrvbar|YTUacuteUcirc BX900HPCC BMT^_Z STREAMordfotildeb

gtlaquo EndeavorWrsC(35GBsacute42GBs)W 2ethMEshygtlaquopoundDmWETHNtilde5poundDqETHNtildeBDUcirc HPCCpSTREAMUuml|YacuteTHORN(CUcirc amp FortranUcirc)7ccedilIumlEumlIgraveUcircWMPIEumlIgraveUcircgtlaquoq[13BGbrvbarPmdegn BX900 feaZUuml^13~UKntst brvbarsectCDmWcdEndeavorfeaZ(IntelUgrave)W BX900feaZ(QRUgrave)igraveiacuteWuumlwdnq

421 Uuml|YacuteTHORN STREAM^_Zbrvbarszligagraveaacute BX900Uuml|YacuteTHORN STREAM^_Z Table 5Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntst brvbarordfmacrdnDqDotildeCUuml^13~gtOcirc$|sectumlGaringaeligccedil13hyBCXbrvbarPGDHoacute WX DO^SgtOcirc$sup1yBordfXmWordfdcXoumlEacuteegraveBGuecircgtlaquosectPgtXoumlordfUcirc

GordflaquoqUgraveDfeaZordfoacute WGDO ^SgtsectumlOcirc$ordf^^shypoundZYeumlnoumlEacutegtlaquoq CUcircoumlIntelfeaZXd OpenMPEumlIgraveUgraveUgravegt 42GBscoreordfdegdnDordfQRfeaZgt7ccedilIumlEumlIgraveUcirc OpenMP EumlIgravecd=EumlIgrave^regCbrvbarP]C Intel feaZWbWXpoundDqQRfeaZgtlaquo CD7ccedilIumlEumlIgraveUcircAgraveogravebrvbarpoundOcirc$copytimes]AgravebTUacuteUcircAgraveograveordflaquoWnq [ FX1Uuml|YacuteTHORN STREAM^_Z Table 6Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntstbrvbarordfmacrdnDqXTriad (132GBs)CPUQRordfWebgtUumlCshy(135GBs)CPUWotildeEacutegtlaquoq CUcircoumlordf FortranUcircUKntstUuml^13~XCWotildeiCqFX1 CfeaZ BX900W+XsectUKntstUuml^13~ordfcentsup3CXqDHFX1 CfeaZbrvbarTUacuteUcircordf$currengtlaquopoundD9ordfAumlq

Table 5 BX900 STREAM

Table 6 FX1 STREAM

422 HPCC BMT^_Z STREAMbrvbarszligagraveaacute 321gtdegdnD BX900 HPCC BMT^_Z STREAM

Table 7GqmmgtUKntstcopyIgraveOcirc$ecircEacutearingaeligccedil13heumlX storeampegraveBGmWUKrestp=all(a$XsectordfXmW)WCDTUacuteUcirc5PfeaZUuml^13~gtlaquoq UgraveDyacutegtgtlaquoordf _ZOslash53iumlIuml XC

iumlIumlSgt|Yo CPU]o CoreiDHEumlIgrave] Agravejatildedn~nfeaZUuml^13~brvbar^UcircCXq

STREAMbrvbarPX CoreIumliX2gtZX|Yordf5n DO^gtfeaZUuml^13~UKntst GmWgt|Y+13mWordfcurrencpoundDq

Table 7 HPCC BMT STREAM (MPI parallel, C version)

                      Compile option (1) *1            Compile option (2) *2
Parallel  Comm.       [illegible]  EP-STREAM Triad     [illegible]  EP-STREAM Triad
                                   (GB/s)                           (GB/s)
32        rdma        1            3.4739              -            -
32        nordma      1            3.5016              1            4.3316
128       rdma        3            3.4708              -            -
128       nordma      3            3.4692              1            4.3572
512       rdma        5            3.4726              4            4.3595
512       nordma      4            3.5477              4            4.3424

*1: Fujitsu compiler, option (1): -Kfast,packed,ntst -SSL2
*2: Fujitsu compiler, option (2): -Kfast,packed,ntst,restp=all -SSL2
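The gain from the second option set can be worked out directly from the Table 7 entries. The sketch below reads those entries as GB/s with the decimal point restored after the first digit, which is an assumption about the extraction, and uses only the nordma rows because they have values for both option sets. The helper name `gain_percent` is illustrative.

```python
# (parallel count, option set 1 Triad, option set 2 Triad), nordma rows only.
# Decimal placement is an assumption: the extraction dropped the decimal points.
NORDMA_ROWS = [
    (32, 3.5016, 4.3316),
    (128, 3.4692, 4.3572),
    (512, 3.5477, 4.3424),
]


def gain_percent(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after / before - 1.0) * 100.0


for n_procs, opt1, opt2 in NORDMA_ROWS:
    print(f"{n_procs:4d} procs: {gain_percent(opt1, opt2):.1f} % higher Triad bandwidth")
```

Under that reading, the improvement is of similar size at every parallel count, which is what one would expect from a per-process (EP) measurement.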

43 hhbrvbar|YTUacuteUcirc 32 gtdcWXpoundDbrvbarPBX900 W FX1 FFTE ^_Z

acircIcircIumlETHNtilde13yacuteordfmacrdnDqmETHNtildeDH5pound

DhhW13gtszlig0Gq 13gt FFTE 32EumlIgraveOslash5CDWecircoacuteG ethBX900

gt 12FX1 gt 16Wgteumlqpoundordm)1|YEcirc2OgtlaquopoundD9ordfAumlqmgtFFTEUuml|YacuteTHORN(FortranUcirc)l3CUuml|YacuteTHORN(13aumlWoacuterGoumlUcircWdivideoslash)WaumlIacutecurren4C)JCDaumlacircoacuteGaumlZh5poundDiacuteegraveBC

hbrvbarpound^UcircCDordm)GmWbrvbarsectBX900 W FX1 oacuteG5g5poundDqUcircpaumlbrvbar5poundDh 1QRfeaZW Endeavor egraveBeumlnDafeaZWrs5PDHW 2auml13IacutecurrenTUacuteUcircbrvbar6)BbordfaumlIacutecurrenthorn7wXmWagraveaacuteGDHgtlaquoq UcircaumlIacutecurren 38plusmnUcircWmndaumlZhCDiacute

Ucircn~n Fig 13W Fig 14GWoIacutecurreniacuteEcircu0Gq Fig 13W Fig 14OgravegtCDIacutecurrenOO=5pound^gtlaquoqUuml|YacuteTHORN

gtcopyIgrave APXYZWWWWOcirc$Yordf 2ugt5nsectZgtXcpoundDDHiacutegt)9BcopyIgrave CY_W^curreneth5poundcopyIgrave APXYZWWWWYordfZXbrvbarPCDq)9BcopyIgrave CY_YZCXcpoundD=Ocirc$sectumlG^gtYordfZXpoundDoumlUcircordfecircDHgtlaquoq

Fig 13W Fig 14OacutegtCDIacutecurren3[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 3ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 3gtXC 6Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquosectaringaeligccedil13h-

a5brvbarpoundltG_ccedilY-a5^_ZUgrave)acircregmacr= 16 gteumlnq

Fig 13W Fig 14OcircgtCDIacutecurren2[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 2ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 2gtXC 4Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquoqFX1 gtBX900rufntildeDsectaringaeligccedilaringh-a5ordfgteumlmWcdmO=)t_ccedilY-a5 16cd 4^regGmWgtzyacuteordfmacrdnDq

Table 8BX900W FX1QRfeaZegraveBCDUcircWaumlbrvbarEuml BX900 afeaZegraveBCDaumlbrvbarGq

[Fig 13 listing: FFTE source code excerpt with annotations; the text was scrambled by the extraction and is not recoverable.]

Fig 13 FFTEUuml|YacuteTHORNaumlIacutecurrenUcirc^_Z

[Fig 14 listing: FFTE source code excerpt with annotations; the text was scrambled by the extraction and is not recoverable.]

Fig 14 FFTEiacuteaumlIacutecurrenUcirc^_Z

Table 8 FFTE Ver 4.1 Full and kernel (32 parallel, 180MB/Proc)

[Table body: the numeric cells lost digits in the extraction and are not recoverable.]

UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq
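The surviving text names the Stockham algorithm as the FFT scheme shared by FFTE and NPB2.3 FT. As a reminder of what that scheme looks like, here is a minimal radix-2 Stockham autosort sketch. It is not the FFTE or NPB source: the function name, the pure-Python lists and the restriction to power-of-two lengths are all assumptions made for illustration.

```python
import cmath


def stockham_fft(x):
    """Forward DFT of a power-of-two-length sequence, Stockham autosort form.

    Each pass reads from one buffer and writes to the other, so the output
    ends up in natural order without a separate bit-reversal step.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    src = [complex(v) for v in x]
    dst = [0j] * n
    half = n // 2      # half the length of the sub-transforms in this pass
    stride = 1         # number of interleaved sub-transforms
    while half >= 1:
        for p in range(half):
            w = cmath.exp(-2j * cmath.pi * p / (2 * half))  # twiddle factor
            for q in range(stride):
                a = src[q + stride * p]
                b = src[q + stride * (p + half)]
                dst[q + stride * (2 * p)] = a + b
                dst[q + stride * (2 * p + 1)] = (a - b) * w
        src, dst = dst, src  # ping-pong the two buffers
        half //= 2
        stride *= 2
    return src


print(stockham_fft([1, 0, 0, 0]))
```

The inner loop over `q` has unit stride and no dependence between iterations, which is the property that makes this formulation attractive for vectorizing compilers.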

[Fig 15 listing: NPB2.3 FT source code excerpt (subroutine cfftz) with annotations; the text was scrambled by the extraction and is not recoverable.]

Fig 15 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- cfftz)

[Fig 16 listing: NPB2.3 FT source code excerpt (subroutine fftz2) with annotations; the text was scrambled by the extraction and is not recoverable.]

Fig 16 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- fftz2)

UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc

Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT

5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

The number of read/write calls per repetition is (blockSize ÷ transferSize × segmentCount), and the file size is (blockSize × segmentCount × number of MPI processes). For example, with transferSize=1kB, blockSize=1MiB (10^6 Bytes is written MB and 2^20 Bytes is written MiB), segmentCount=1 and 32 MPI processes, a testFile of blockSize × segmentCount × number of processes = 32MiB is created, and each process accesses its 1MiB region in 1KB units, 1024 times.

[transferSize] Bytes transferred by one read/write call. Must be a multiple of sizeof(long long int) and a divisor of blockSize.
[blockSize] Bytes accessed by each process (1MiB).
[segmentCount] Number of segments. Default 1.
[randomOffset] 1 selects random access. Default 0.
[fsync] Call fsync after writing. Default 0.
[filePerProc] One file per process. Valid with POSIX. Default 0.
[useFileView] Use MPI_File_set_view(), MPI_File_write() and MPI_File_read(). Cannot be combined with random access. Default 0.
[collective] Use MPI_File_write_at_all() and MPI_File_read_at_all(). Cannot be combined with random access. Default 0.

Fig 18 IOR YacircueZ$
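The size arithmetic behind these IOR parameters can be checked with a short sketch. The class and method names are illustrative, and the byte-offset formula (segments outermost, one block per rank inside each segment) is an assumption about how a shared test file is laid out, not something stated in the text. The sample run takes the 1 KB unit of the worked example to be 1024 bytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IorLayout:
    transfer_size: int   # bytes per read/write call
    block_size: int      # contiguous bytes per process within a segment
    segment_count: int
    n_procs: int

    def __post_init__(self):
        if self.block_size % self.transfer_size:
            raise ValueError("blockSize must be a multiple of transferSize")

    @property
    def transfers_per_proc(self) -> int:
        """blockSize / transferSize x segmentCount."""
        return self.block_size // self.transfer_size * self.segment_count

    @property
    def file_size(self) -> int:
        """blockSize x segmentCount x number of processes."""
        return self.block_size * self.segment_count * self.n_procs

    def offset(self, rank: int, segment: int, transfer: int) -> int:
        """Assumed shared-file byte offset of one transfer."""
        return ((segment * self.n_procs + rank) * self.block_size
                + transfer * self.transfer_size)


layout = IorLayout(transfer_size=1024, block_size=2**20, segment_count=1, n_procs=32)
print(layout.transfers_per_proc, layout.file_size // 2**20, "MiB")
```

With these inputs the sketch gives 1024 transfers per process and a 32 MiB file, matching the worked example.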

512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

Table 11 MPIIO Info key and value

key                  value
cb_buffer_size       4194304
cb_nodes             [illegible]
ind_rd_buffer_size   4194304
ind_wr_buffer_size   524288

[The description column is illegible in the extraction.]
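For orientation, the three legible hint values from Table 11 can be expressed as the string pairs that an MPI Info object expects. This is a sketch only: `HINTS`, `as_info_strings` and `to_mpi_info` are illustrative names, the mpi4py package is an assumed dependency that is imported only when `to_mpi_info` is called, and cb_nodes is left out because its value cannot be read.

```python
# Legible key/value pairs from Table 11 (values in bytes).
HINTS = {
    "cb_buffer_size": 4194304,      # 4 MiB
    "ind_rd_buffer_size": 4194304,  # 4 MiB
    "ind_wr_buffer_size": 524288,   # 512 KiB
}


def as_info_strings(hints):
    """MPI Info keys and values are strings, so convert the integers."""
    return {key: str(value) for key, value in hints.items()}


def to_mpi_info(hints):
    """Build an MPI Info object; needs mpi4py and an MPI runtime (assumed)."""
    from mpi4py import MPI  # imported lazily so the rest runs without MPI

    info = MPI.Info.Create()
    for key, value in as_info_strings(hints).items():
        info.Set(key, value)
    return info


for key, value in as_info_strings(HINTS).items():
    print(f"{key} = {value}")
```

An Info object built this way would be passed when the file is opened; whether a given MPI library honours each hint is implementation dependent.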

52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

[Fig 19 diagram: READ and WRITE paths between SRFS components, the IO server (SPARC Enterprise M9000) and the ETERNUS DX80 disk array (970TB).]

Fig 19 Read/Write

Table 12 (POSIX)

[Table body: rows by parallel count (32, 256, 1024) and transferSize (1024 to 1048576 bytes); the numeric cells lost digits in the extraction and are not recoverable.]

Table 13 (MPIIO)

[Table body: rows by parallel count (32, 256) and transferSize (1024 to 1048576 bytes); the numeric cells lost digits in the extraction and are not recoverable.]

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

[Fig 20 plots: Performance [MiB/s] (logarithmic axis) against TransferSize [B] from 1024 to 1048576, in sequential and random panels for home and work. Series distinguish Write and Read, SingleFile and FilePerProc, and N=32, 256, 1024.]

Fig 20 POSIX

[Fig 21 plots: Performance [MiB/s] (logarithmic axis) against TransferSize [B] from 1024 to 1048576, in sequential and random panels for home and work. Series are labelled use view and collective, with N=32 and N=256.]

Fig 21 MPIIO

Table 14 _aumlaringaeligccedil13hszlig

[Table body: columns for 32 and 256 parallel processes; the numeric cells lost digits in the extraction and are not recoverable.]

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`
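The passage above concerns I/O buffer size: it mentions BUFSIZ, a figure of 8192 bytes, the C library function setvbuf, and an 8M-byte default for Fujitsu Fortran. The effect of a larger buffer on small records can be illustrated with a Python analogue. This sketch does not call setvbuf; it uses `io.BufferedWriter` with an explicit `buffer_size`, and `CountingFile` and `raw_write_calls` are illustrative names.

```python
import io
import os
import tempfile


class CountingFile(io.FileIO):
    """Raw file object that counts the write calls reaching the OS layer."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.raw_writes = 0

    def write(self, data):
        self.raw_writes += 1
        return super().write(data)


def raw_write_calls(buffer_size, record_size=1024, n_records=1024):
    """Write n_records small records through a buffer; return (raw calls, file size)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        raw = CountingFile(path, "w")
        record = b"x" * record_size
        with io.BufferedWriter(raw, buffer_size=buffer_size) as out:
            for _ in range(n_records):
                out.write(record)
        return raw.raw_writes, os.path.getsize(path)
    finally:
        os.remove(path)


for size in (8192, 8 * 1024 * 1024):
    calls, written = raw_write_calls(size)
    print(f"buffer {size:>8d} B: {calls} raw writes for {written} bytes")
```

The same amount of data reaches the file in both runs; only the number of calls into the operating system changes, which is the effect the buffer-size settings listed above are meant to control.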

[Figure residue: storage layout diagram. ETERNUS DX80 units 1-18 (dx01 to dx18, shown as DX80 01 to DX80 18) carry areas labelled data_01 to data_05, home and work, and are attached through SRFS to IO server 1 (io1). ETERNUS DX80 units 19-36 (dx19 to dx36, shown as DX80 19 to DX80 36) carry areas labelled data_06 to data_10, data and work. The per-unit labels repeat for every unit and are not reproduced here.]

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

data

wor

kda

ta_0

6da

ta_0

7da

taw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

data

_06

data

_07

sysd

ata

wor

kda

ta_0

6da

ta_0

7sy

sdat

aw

ork

data

_08

data

_09

data

_10

wor

kda

ta_0

8da

ta_0

9da

ta_1

0w

ork

IObrvbar

2(io

2)

Fi

g 2

2a13J

JAEA-Testing 2011-005

Table 15 File systems

SRFS file system   QFS file system   Unit size     Total size                  DAU     Remarks
data_01            QFS_data_01       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_02            QFS_data_02       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_03            QFS_data_03       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_04            QFS_data_04       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_05            QFS_data_05       3695488 MB    3695488 × 36 = 126.9 TB     1024k
home               QFS_home          3695488 MB    3695488 × 36 = 126.9 TB     512k
work                                 524288 MB                                         file system of io2

SRFS file system   QFS file system   Unit size     Total size                  DAU     Remarks
data_06            QFS_data_06       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_07            QFS_data_07       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_08            QFS_data_08       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_09            QFS_data_09       2109888 MB    2109888 × 36 = 72.4 TB      1024k
data_10            QFS_data_10       3695488 MB    3695488 × 36 = 126.9 TB     1024k
data               QFS_data          3695488 MB    3695488 × 24 = 84.6 TB      1024k
sysdata            QFS_sysdata       3695488 MB    3695488 × 12 = 42.3 TB      16k
work               QFS_work          524288 MB     524288 × 144 = 72 TB        1024k   dx01 to dx36
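The totals in Table 15 can be checked with a few lines of arithmetic. The sketch below assumes binary units (1 TB = 1024 × 1024 MB), which is what reproduces the printed totals; the function name `total_tb` is illustrative and not taken from the report.

```python
# Check of the "unit size x count = total" arithmetic in Table 15.
# Assumption: 1 TB = 1024 * 1024 MB (binary units).
MB_PER_TB = 1024 * 1024


def total_tb(unit_mb: int, units: int) -> float:
    """Return the total capacity in TB for `units` pieces of `unit_mb` MB each."""
    return unit_mb * units / MB_PER_TB


if __name__ == "__main__":
    for name, unit_mb, units in [
        ("data_01", 2109888, 36),
        ("data_05", 3695488, 36),
        ("data", 3695488, 24),
        ("sysdata", 3695488, 12),
        ("work", 524288, 144),
    ]:
        print(f"{name:8s} {total_tb(unit_mb, units):6.1f} TB")
```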


6 Network

This chapter describes the evaluation of the BX900 network. The evaluation made it possible to characterize the communication behavior of the BX900.

6.1 Network configuration

This section describes the hardware configuration of the BX900 network, the communication modes, and node assignment.

6.1.1 Topology and node allocation

The BX900 uses InfiniBand QDR in a Fat Tree topology. One chassis holds 18 nodes and two switches called Leaf switches, and each node is connected to each Leaf switch by one QDR link. Because there are two Leaf switches, each node has two QDR links (18 QDR links × 2 = 36 links per chassis), and the bandwidth per node is 8 GB/s (4 GB/s × 2). For communication between chassis, nine switches called Spine switches sit above the Leaf switches, and every Leaf switch is connected to every Spine switch by QDR. One chassis therefore has 9 Spine links × 2 Leaf switches = 18 QDR links to the Spine level. With 36 downlinks and 18 uplinks, the uplink capacity is 50% of the downlink capacity, which is referred to as 50% blocking. Fig. 23 shows the BX900 network configuration. Communication within a chassis is non-blocking; communication between chassis is 50% blocking. Fig. 24 shows how the nodes of a chassis are connected to the Spine and Leaf switches. Within a chassis, node n (n ≤ 9) and node n+9 use the same Spine switch. As a result, performance differs, for example, between the case where only node 1 sends data and the case where nodes 1 and 10 send data.

Two node allocation methods are available.

(1) Node allocation (default): nodes are assigned in sequence. For example, chassis 1 contains nodes bx0001 to bx0018, and a 144-process job is assigned nodes in this order. Nodes already in use by other jobs are skipped and free nodes are used instead, so an allocation may extend across chassis.

(2) Alternative node allocation: for example, in chassis 1 the nodes bx0001, bx0003, bx0005, bx0007 and so on are assigned. This method is intended to avoid sharing IB links between jobs.

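Fig. 23 BX900 network configuration [diagram: chassis 1 to chassis 119; each chassis n has two Leaf switches, Leaf(n-1) and Leaf(n-2), connected to Spine(1) to Spine(9); node labels bx0001, bx0002, ..., bx0018 in chassis 1 and bx2125, bx2126, ... in chassis 119]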

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq


Fig. 24 Relation between nodes and switches [diagram: the 18 nodes of one chassis, numbered 1 to 9 and 10 to 18, connected through the Leaf switches to the Spine switches SW 1 to SW 9]
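The link counts behind the 50% blocking figure, and the pairing of node n with node n+9, can be written down as a short sketch. The constants come from the description above; the mapping of a node to a specific Spine switch number (node n to SW n) is an assumption made for illustration, since the text only states that nodes n and n+9 share a Spine switch.

```python
# Sketch of the per-chassis link arithmetic for the BX900 fat tree.
NODES_PER_CHASSIS = 18
LEAF_PER_CHASSIS = 2
SPINE_SWITCHES = 9


def chassis_links() -> tuple[int, int]:
    """Return (downlinks, uplinks) in QDR links for one chassis."""
    downlinks = NODES_PER_CHASSIS * LEAF_PER_CHASSIS  # 18 nodes x 2 Leaf switches
    uplinks = SPINE_SWITCHES * LEAF_PER_CHASSIS       # 9 Spine links x 2 Leaf switches
    return downlinks, uplinks


def spine_index(node: int) -> int:
    """Spine switch used by a node (1..18) of a chassis.

    Assumption: node n and node n+9 map to switch number n (1..9).
    """
    if not 1 <= node <= NODES_PER_CHASSIS:
        raise ValueError("node must be in 1..18")
    return (node - 1) % SPINE_SWITCHES + 1


if __name__ == "__main__":
    down, up = chassis_links()
    print(f"downlinks={down}, uplinks={up}, uplink/downlink={up / down:.0%}")
    print("node 1 and node 10 share a Spine switch:", spine_index(1) == spine_index(10))
```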

6.1.2 Communication modes

For communication over IB, the BX900 network provides an RDMA mode, which uses RDMA (Remote Direct Memory Access) (hereafter RDMA), and a NORDMA mode, which does not use RDMA (hereafter NORDMA). The default on the BX900 is RDMA. With RDMA, the IB interface card (Host Channel Adapter (HCA)) performs the copy (1) of data to and from memory. With NORDMA, on the other hand, the copy is performed by the CPU.

Separately from RDMA and NORDMA, there are two message transfer methods: the Normal Send mode (2) and the Send on Request mode (3). Either method can be used under both RDMA and NORDMA, and the choice between them is determined by the message size. Table 16 shows the default thresholds at which the method switches under RDMA and NORDMA. Below the threshold the Normal Send mode is used; at or above the threshold the Send on Request mode is used. Table 17 and Table 18 show how the method depends on the message size and the number of processes. The threshold can be changed through MP_MAX_NORMAL_SEND.

Note that when the -oneside option is specified, communication is performed in the NORDMA mode and, regardless of the message size, in the Normal Send mode.

Table 16 Thresholds for switching the transfer method

Mode                  RDMA           RDMA           NORDMA
Number of processes   1 to 1024      1025 or more   all
Threshold             32768 bytes    16384 bytes    524288 bytes

Table 17 RDMA mode: switching between the Normal Send mode and the Send on Request mode

Message size             0 to less than 16 KB   16 KB to less than 32 KB   32 KB or more
1 to 1024 processes      Normal Send            Normal Send                Send on Request
1025 to 2048 processes   Normal Send            Send on Request            Send on Request
2049 processes or more   becomes NORDMA         Send on Request            Send on Request

Table 18 NORDMA mode: switching between the Normal Send mode and the Send on Request mode

Message size             0 to less than 512 KB   512 KB or more
All process counts       Normal Send             Send on Request
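The switching rule in Tables 16 to 18 amounts to a comparison against a threshold that depends on the mode and the process count. The following is a minimal sketch of that rule only, not of the MPI library itself. The function names are illustrative, and the NORDMA fallback shown in Table 17 for 2049 or more processes is deliberately not modeled.

```python
# Sketch of the transfer-method selection described by Tables 16-18.
# Assumption: the choice depends only on message size, process count and mode.
KB = 1024


def threshold_bytes(n_procs: int, rdma: bool = True) -> int:
    """Default switching threshold in bytes (Table 16)."""
    if not rdma:
        return 512 * KB          # 524288 bytes for NORDMA
    if n_procs <= 1024:
        return 32 * KB           # 32768 bytes
    return 16 * KB               # 16384 bytes for 1025 processes or more


def transfer_method(msg_bytes: int, n_procs: int, rdma: bool = True) -> str:
    """Normal Send below the threshold, Send on Request at or above it."""
    if msg_bytes < threshold_bytes(n_procs, rdma):
        return "Normal Send"
    return "Send on Request"


if __name__ == "__main__":
    for size in (8 * KB, 16 * KB, 32 * KB):
        print(size, transfer_method(size, 512), transfer_method(size, 2048))
```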

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq

6.2 MPI functions

A program that uses the MPI functions was created and measured on the BX900 with 256 to 4096 processes. The results presented here cover data sizes from 32 to 80000 bytes. The measurements showed, among other things, that some functions are better suited to RDMA or to NORDMA, and that the behavior of some functions changes from a certain data size.

6.2.1 Method

The data size ranges from 0 to 80000 bytes, and the interval between data sizes is varied with the number of processes (Table 19). Each MPI function is measured 20 times. Of the 20 repetitions, the first tends to take longer than the others, so the values of the other 19 repetitions are used. Two cases are measured, RDMA and NORDMA. Fig. 25 shows the measured part of the program for one MPI function.

Table 19 Data size interval

Processes   Allreduce, Reduce, Bcast, SendRecv   Other MPI functions
256         32 bytes                             128 bytes
512         32 bytes                             128 bytes
1024        32 bytes                             256 bytes
2048        32 bytes                             2048 bytes
4096        32 bytes                             2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()                ! measurement start
        call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                     b,nelem,MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD,ierr)
        ttime2 = MPI_WTIME()                ! measurement end
        ttime0 = ttime2 - ttime1
! the slowest rank determines the time of this repetition
        call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                  MPI_COMM_WORLD,ierr)
        if(myid.eq.0) then
          max_time = max(max_time,ttime)
          min_time = min(min_time,ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig. 25 Program (excerpt)
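The way the repetitions are summarized (the first of the 20 repetitions is slower, so 19 values are used) can be sketched as follows. This is a plain-Python illustration of the bookkeeping only, with no MPI involved; the function name `summarize` and the choice to drop exactly the first element are assumptions made for the example.

```python
# Sketch: summarize repeated timings while discarding the warm-up repetition.
# Assumption: `times` holds one per-repetition time (the maximum over ranks),
# in the order the repetitions were run.
def summarize(times: list[float], skip_first: bool = True) -> dict[str, float]:
    used = times[1:] if skip_first else list(times)
    if not used:
        raise ValueError("no repetitions left to summarize")
    return {
        "count": len(used),
        "mean": sum(used) / len(used),
        "min": min(used),
        "max": max(used),
    }


if __name__ == "__main__":
    # 20 repetitions; the first one is slower than the rest.
    sample = [5.0] + [1.0] * 19
    print(summarize(sample))
```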

6.2.2 Results

Of the measurements described so far, the results for 512 processes are shown in Fig. 26 and Fig. 27, and the results for 1024 processes are given in Appendix B. Each graph is for one MPI function and one number of processes, plotted against the data size.

Fig. 26 MPI functions (512 processes, part 1) [graphs: Allreduce, Reduce, Reduce_scatter, Allgather, Gather and Allgatherv, each with an rdma series and a nordma series over data sizes from 0 to 80000]

Fig. 27 MPI functions (512 processes, part 2) [graphs: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (512 processes, Rank 0 to Rank 511), each with an rdma series and a nordma series over data sizes from 0 to 80000]

From these results, the following characteristics were found.

(1) In some cases NORDMA is faster than RDMA.
(2) With RDMA, the behavior changes abruptly from a certain data size. This is thought to be caused by the switch from the Normal Send mode to the Send on Request mode.
(3) With NORDMA, such changes sometimes appear at data sizes unrelated to the mode-switching threshold.
(4) Differences sometimes appear at data sizes where none is seen with a smaller number of processes.

Depending on the MPI functions that a program uses, RDMA or NORDMA may be the better choice. However, the mode is an option of mpiexec, the command that runs an MPI program, so all MPI functions in a run use either RDMA or NORDMA. It is therefore necessary to measure the program in both modes and compare the results. For reference, RDMA and NORDMA were compared for each MPI function with small data (up to 8 KB), medium data (8 KB to 78 KB) and large data (80 KB to 584 KB). Fig. 28 shows the comparison for small data, and Appendix C shows it for medium and large data.

Fig. 28 RDMA and NORDMA (small data: 0 KB to 8 KB) [chart comparing RDMA and NORDMA for each MPI function]

6.3 Effect of RDMA

As described in 6.1.2, RDMA performs communication without using the CPU, so communication can in principle be overlapped with computation. Whether the two can actually be overlapped was measured. For comparison, the same measurement was made with NORDMA.

6.3.1 Method

RDMA and NORDMA are compared using a program that performs both communication and computation. Communication uses non-blocking calls (MPI_Isend/Irecv). The size of the data sent and received is 5 MB. The computation is a matrix product with matrices of size 450 × 450. For comparison, the communication and the computation are also measured separately.

Three programs were prepared: flat MPI and two hybrid (MPI + OpenMP) versions. With flat MPI, communication inside a node does not use RDMA, so the effect of the mode cannot be seen there. The two hybrid programs were therefore created so that no MPI communication takes place inside a node. In hybrid pattern 1 the measured part is not thread-parallelized, and in hybrid pattern 2 it is. Fig. 29 shows the measured part of pattern 2. Flat MPI and hybrid pattern 1 differ from it only in that the measured part has no !$OMP parallel do directive. Flat MPI is measured with 8 processes, and the hybrid programs with 4mpi × 8omp (32 cores).

      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()                  ! total measurement start
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,1,
     &               MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,1,
     &               MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,1,
     &               MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,1,
     &               MPI_COMM_WORLD,ireq(4),ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()                  ! computation start
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1d0
          b1(i,j) = 1d0
          c1(i,j) = 0d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                  ! computation end
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()                  ! total measurement end
      ttimeDD = ttime8 - ttime7             ! computation time
      ttimeal = ttime6 - ttime5             ! total measured time
      ttimesd = ttimeal - ttimeDD           ! total time minus computation time

Fig. 29 Program that overlaps communication and computation
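The quantity ttimesd in the program is the part of the total time not covered by the computation. The sketch below restates that subtraction and adds one derived figure, the fraction of a communication-only time that was hidden behind computation. The second function is a hypothetical metric introduced here for illustration; it is not the formula used in the report.

```python
# Sketch of the timing bookkeeping around the overlapped section.
def exposed_communication(t_total: float, t_compute: float) -> float:
    """Time left over after subtracting the computation (ttimeal - ttimeDD)."""
    return t_total - t_compute


def hidden_fraction(t_comm_only: float, t_total: float, t_compute: float) -> float:
    """Hypothetical metric: share of a communication-only time hidden by overlap.

    t_comm_only is the time of the same communication measured without computation.
    """
    if t_comm_only <= 0:
        raise ValueError("t_comm_only must be positive")
    return 1.0 - exposed_communication(t_total, t_compute) / t_comm_only


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the report.
    print(exposed_communication(1.5, 1.0))
    print(hidden_fraction(1.0, 1.5, 1.0))
```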

6.3.2 Results

Flat MPI was measured twice, and hybrid pattern 1 and hybrid pattern 2 were measured three times each. The measurement results are given in Appendix D.

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq

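6.4 Communication within a CPU, between CPUs, and between nodes

One-to-one communication between cores was measured with a dedicated program. The data transfer path differs for communication within a CPU, between CPUs within a node, and between nodes, and the speed of each was measured to see how they compare.

6.4.1 Method

The one-to-one communication program transfers data between cores. Rank 0 sends data of a fixed size to another rank 100 times, and the measurement is repeated for each destination rank. Fig. 30 shows the measured part of the program. The measurement uses 24 processes and 128 MB of data, with a process interval of 1. On the BX900 one node has two CPUs and one CPU has four cores, so destination ranks up to 3 are within the same CPU, ranks up to 7 are on the other CPU of the same node, and higher ranks are on other nodes. Fig. 31 shows the core IDs within a node and the order in which MPI ranks are assigned to them.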

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()
! rank 0 sends the data NNN times to rank IIRANK
        do 120 II = 1, NNN
          if(myid.eq.0)
     &      call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                    IIRANK,II,MPI_COMM_WORLD,ierr)
          if(myid.eq.IIRANK)
     &      call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                    0,II,MPI_COMM_WORLD,istatus,ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig. 30 One-to-one communication program (excerpt)

NODE
CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig. 31 Core IDs and MPI rank assignment within a node
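The rank layout of Fig. 31 determines which kind of path a pair of ranks uses. A minimal sketch of that classification follows. It assumes that ranks fill nodes in blocks of eight in the order shown in the figure (ranks 0 to 3 on CPU0, ranks 4 to 7 on CPU1); the function names are illustrative.

```python
# Sketch: map MPI ranks to core IDs as in Fig. 31 and classify a rank pair.
CORES_PER_CPU = 4
CPUS_PER_NODE = 2
RANKS_PER_NODE = CORES_PER_CPU * CPUS_PER_NODE  # 8


def core_id(rank: int) -> int:
    """Core ID of a rank within its node: CPU0 has even IDs, CPU1 odd IDs."""
    local = rank % RANKS_PER_NODE
    cpu = local // CORES_PER_CPU
    return 2 * (local % CORES_PER_CPU) + cpu


def locality(rank_a: int, rank_b: int) -> str:
    """Classify the path between two ranks."""
    if rank_a // RANKS_PER_NODE != rank_b // RANKS_PER_NODE:
        return "inter-node"
    cpu_a = (rank_a % RANKS_PER_NODE) // CORES_PER_CPU
    cpu_b = (rank_b % RANKS_PER_NODE) // CORES_PER_CPU
    return "intra-CPU" if cpu_a == cpu_b else "inter-CPU"


if __name__ == "__main__":
    print([core_id(r) for r in range(8)])
    for dest in (3, 7, 8):
        print(0, dest, locality(0, dest))
```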

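The default on the BX900 is the Fujitsu compiler with the Fujitsu MPI library. In addition to this combination, the following three were measured:

Intel compiler + Fujitsu MPI, Intel compiler + Intel MPI, Fujitsu compiler + Open MPI

With Fujitsu MPI and Intel MPI, MPI processes are assigned to cores as shown in Fig. 31. With Open MPI, ranks 0, 1, 2, 3, 4, 5, 6, 7 are assigned to core IDs 0, 1, 2, 3, 4, 5, 6, 7.

6.4.2 Results

The measurement results are given in Appendix E. They show the following characteristics.

(1) With the Fujitsu compiler + Fujitsu MPI, the speed is 3260 MB/s within a CPU, 3750 MB/s between CPUs within a node, and 5350 MB/s between nodes. Communication within a CPU is the slowest and communication between nodes is the fastest.

(2) With the Intel compiler + Fujitsu MPI, the speeds within a CPU and between CPUs within a node are similar to those with the Fujitsu compiler, but the speed between nodes is 4750 MB/s. This is thought to be because Fujitsu MPI is tuned for the Fujitsu compiler and not for the Intel compiler.

(3) With the Intel compiler + Intel MPI, the results within a CPU and between CPUs within a node were compared with the Fujitsu MPI results in (1) and (2). Between nodes, Intel MPI cannot use two QDR links and uses only one.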

(4) With the Fujitsu compiler + Open MPI, the speed within a CPU is 4460 MB/s, higher than in the three cases above.

In summary, with the Fujitsu compiler + Fujitsu MPI, communication within a CPU and between CPUs within a node is slower than communication between nodes.

6.5 Effect of communication between chassis

Because the network between chassis is 50% blocking, communication can be affected depending on how the links are used. This effect was examined for one-to-one communication and for all-to-all communication.

6.5.1 One-to-one communication

One-to-one communication is widely used among the MPI functions, for example for exchanging data between processes, and it can be affected by the 50% blocking. A program that performs one-to-one communication was therefore created to examine the effect. The measurements showed that when two nodes share an IB link in one-to-one communication, the speed drops, so the 50% blocking does have an effect.

6.5.1.1 Method

The program used for the evaluation sends 128 MB of data 100 times from one process to another by one-to-one communication (MPI_SendRecv) and obtains the transfer speed from these transfers. Fig. 32 illustrates the data transfer. In this example of an 8-process job, the processes of ranks 0 to 3 transfer data to the processes of ranks 4 to 7.

Flat MPI and hybrid (MPI + OpenMP) versions of the program were created. With the hybrid version, changing the number of threads changes the number of processes that transfer data from one node.

6.5.1.2 Results

(1) Within one chassis

For comparison, measurements were first made within one chassis (144 cores): flat MPI with 144 processes, and hybrid in three cases, 72mpi × 2omp, 36mpi × 4omp and 18mpi × 8omp. Communication within a chassis is non-blocking. With flat MPI, data for 8 processes is transferred from one node. With hybrid, one node transfers data for 4 processes with 2 threads, for 2 processes with 4 threads, and for 1 process with 8 threads. Table 20 shows the transfer speed and the corresponding value per node (number of MPI processes per node × speed). Apart from hybrid (18mpi × 8omp), whose value is somewhat lower, the per-node values are about the same in all patterns. This is presumably the rate achievable with two QDR links.

(2) Between chassis, flat MPI

The measurement was first made with flat MPI and 144 processes, using the t256 queue. The processes were placed with 72 processes per chassis (= 9 nodes per chassis) on two chassis. Within a chassis each node has two IB links, and since a chassis has 18 nodes there are 36 IB links. Between chassis, on the other hand, the network is 50% blocking, so there are 18 links (9 × 2). With 144 processes, 18 IB links are used both within and between the chassis, so the communication is non-blocking and there is no difference between communication within a chassis and between chassis.

Fig. 32 Image of data transfer in one-to-one communication [diagram: processes 0 to 7 and the data transfers between them]

Table 20 Transfer speed (within a chassis)

Pattern          Flat MPI      Hybrid (72×2)   Hybrid (36×4)   Hybrid (18×8)
Transfer speed   723.8 MB/s    1447.4 MB/s     2894.2 MB/s     5346.7 MB/s
Value per node   5790.6 MB/s   5789.9 MB/s     5788.4 MB/s     5346.7 MB/s
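The per-node row of Table 20 is the per-process speed multiplied by the number of MPI processes on a node, which is 8 divided by the number of threads per process. The sketch below reproduces that relation. It assumes the decimal placement 723.8 MB/s, 1447.4 MB/s and so on (the extracted text shows the digits without a decimal point), which is the reading consistent with the 8 GB/s per-node bandwidth stated in 6.1.1.

```python
# Sketch: per-node throughput from per-process speed (Table 20).
CORES_PER_NODE = 8


def processes_per_node(threads_per_process: int) -> int:
    """Number of MPI processes on one 8-core node for a given thread count."""
    if threads_per_process <= 0 or CORES_PER_NODE % threads_per_process != 0:
        raise ValueError("thread count must divide the 8 cores of a node")
    return CORES_PER_NODE // threads_per_process


def node_throughput(per_process_mb_s: float, threads_per_process: int) -> float:
    """Aggregate MB/s leaving one node."""
    return per_process_mb_s * processes_per_node(threads_per_process)


if __name__ == "__main__":
    # (per-process speed in MB/s, threads per process); decimal placement assumed.
    for speed, threads in [(723.8, 1), (1447.4, 2), (2894.2, 4), (5346.7, 8)]:
        print(f"{threads} thread(s): {node_throughput(speed, threads):.1f} MB/s per node")
```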

Table 21 Transfer speed (between chassis, flat MPI)

        144 MPI                     192 MPI                     224 MPI
Node    Speed (MB/s)   IB sharing   Speed (MB/s)   IB sharing   Speed (MB/s)   IB sharing
1       724.6          no           467.0          yes          467.1          yes
2       724.4          no           467.0          yes          467.1          yes
3       724.6          no           467.0          yes          467.1          yes
4       724.2          no           724.3          no           467.0          yes
5       724.3          no           724.4          no           467.1          yes
6       724.2          no           724.0          no           723.7          no
7       724.1          no           723.9          no           724.6          no
8       724.4          no           724.0          no           724.5          no
9       724.9          no           724.3          no           724.0          no
10      -              -            467.0          yes          467.0          yes
11      -              -            467.0          yes          467.2          yes
12      -              -            466.9          yes          467.1          yes
13      -              -            -              -            467.1          yes
14      -              -            -              -            467.0          yes

Next, flat MPI was measured with 192 and 224 processes, using 96 processes per chassis on two chassis and 112 processes per chassis on two chassis, respectively. Table 21 shows the transfer speed of each node of a chassis for these cases together with the 144-process case.

With 224 processes, for example, some nodes share IB links (50% blocking), and the speed of those nodes drops. As shown in the network diagram of Fig. 24, node pairs such as 1 and 10, 2 and 11, and 3 and 12 share a link, so these nodes are slower. Nodes 15, 16, 17 and 18 are not used, however, so nodes 6, 7, 8 and 9, which would share with them, are fast and reach the same speed as with 144 MPI processes. Nodes 6 to 9 are therefore non-blocking. With 192 processes, in the same way, nodes 4 to 9 are fast because the nodes that would share IB links with them are not used.
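Which nodes end up sharing an uplink follows directly from the pairing of node n with node n+9. The sketch below lists the sharing nodes when the first k nodes of a chassis are in use; filling nodes in the order 1, 2, ..., k is an assumption of the example that matches the cases in Table 21.

```python
# Sketch: nodes of one chassis that share an uplink path with another used node.
NODES_PER_CHASSIS = 18
PAIR_OFFSET = 9  # node n and node n+9 use the same Spine switch


def partner(node: int) -> int:
    """The node that shares a Spine switch with `node`."""
    if not 1 <= node <= NODES_PER_CHASSIS:
        raise ValueError("node must be in 1..18")
    return node + PAIR_OFFSET if node <= PAIR_OFFSET else node - PAIR_OFFSET


def sharing_nodes(nodes_used: int) -> list[int]:
    """Nodes whose partner is also in use, when nodes 1..nodes_used are used."""
    used = set(range(1, nodes_used + 1))
    return sorted(n for n in used if partner(n) in used)


if __name__ == "__main__":
    for k in (9, 12, 14):  # 144, 192 and 224 MPI processes on two chassis
        print(k, sharing_nodes(k))
```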

Table 22 Transfer speed (between chassis, hybrid)

        Hybrid (112×2)              Hybrid (56×4)               Hybrid (28×8)
Node    Speed (MB/s)   IB sharing   Speed (MB/s)   IB sharing   Speed (MB/s)   IB sharing
1       933.9          yes          1856.7         yes          3640.8         yes
2       933.7          yes          1856.4         yes          3691.9         yes
3       934.6          yes          1856.6         yes          3653.6         yes
4       934.7          yes          1857.0         yes          3645.7         yes
5       934.2          yes          1865.1         yes          3664.6         yes
6       1438.5         no           2860.6         no           5380.8         no
7       1448.8         no           2911.8         no           5377.9         no
8       1450.0         no           2867.6         no           5378.9         no
9       1449.5         no           2910.0         no           5378.4         no
10      933.7          yes          1879.7         yes          3640.8         yes
11      934.0          yes          1877.5         yes          3671.5         yes
12      934.3          yes          1865.1         yes          3655.9         yes
13      934.8          yes          1881.1         yes          3645.7         yes
14      933.8          yes          1869.4         yes          3699.1         yes

(3) Between chassis, hybrid

The hybrid program was measured on two chassis (224 cores) in three cases: 112mpi × 2omp (56 processes per chassis), 56mpi × 4omp (28 processes per chassis) and 28mpi × 8omp (14 processes per chassis). Table 22 shows the transfer speed of each node and whether the node shares IB links with another node. In every hybrid case the speed is lower on the nodes that share IB links.

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq

6.5.2 All-to-all communication

This section examines all-to-all communication using MPI_Alltoall.

6.5.2.1 Method

MPI_Alltoall is measured under the conditions A to D below and the results are compared. MPI_Alltoall is measured 100 times. The size is 512 MB for flat MPI and 1024 MB for hybrid.

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq

JAEA-Testing 2011-005

Ta

ble

23iacuteXCcedilEgrave

MPI

_allt

oall

vUacuteSTUcircR

vRcopyUumlYacutebrvbarTHORN

szligvcopyUumlYacutebrvbaragrave

pvRaacuteacircatildeNR

+aumleAumlovqaringaeligPgR

Oslash 5 Iuml

copy Igrave - a 5 (MB)

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

R

R

szligR

pR

144m

pi

512

3R6

0 11

14

118

0

145

9 3

60 R

138

6

36

0 13

87

100

1

31

124

1

25

2plusmn

(BC

DEacute

) 51

2 3R

60

111

4

118

0

148

2 11

16 R

146

9

63

0 14

75

100

1

33

132

1

32

3plusmn

(BC

DEacute

) 51

2 3R

60

111

4

29

0 14

81

714

13 R

153

3

29

0 14

21

100

1

33

138

1

28

448m

pi

512

14R

40

142

6

414

0

182

4 21

272

1 R

170

3

78

0 16

32

100

1

28

119

1

14

504m

pi

512

14R

45

142

7

415

8

180

4 27

242

6 R

160

5

79

0 14

33

100

1

26

112

1

00

2048

mpi

51

2 37R

69

182

1

298

8 24

93

465

6 R

250

1

465

6 25

04

100

1

37

137

1

38

2plusmn

51

2 37R

69

166

7

2211

6

262

5 67

564

6 R

264

7

298

8 25

04

100

1

57

159

1

50

3plusmn

(BC

DEacute

) 51

2 37R

69

166

7

2211

6

262

1 69

524

9 R

269

3 6

298

8 24

59

100

1

57

162

1

47

3072

mpi

51

2 69R

56

258

7 28

399

9 27

34

5371

54 R

286

3 9

507

7 26

44

100

1

06

111

1

02

68m

pij

2om

p 10

24

3R5

7 12

10

11

170

11

80

91

9 R

122

3

28

5 12

21

100

0

97

101

1

01

120m

pij

2om

p 10

24

9R3

3 12

89

215

0

127

0 12

191

6 R

130

4

47

5 12

94

100

0

99

101

1

00

132m

pij

2om

p 10

24

8R4

1 13

79

216

5

116

1 15

211

6 R

119

5

48

3 11

38

100

0

84

087

0

83

432m

pij

2om

p 10

24

17R

64

162

4

119

8 16

28

1925

43 R

161

2 1

129

0 16

18

100

1

00

099

1

00

30m

pitimes4

omp

1024

4R

38

598

115

0

674

12

13 R

683

27

5 68

2

100

1

13

114

1

14

2plusmn

(BC

DEacute

) 10

24

4R3

8 59

8

115

0

678

12

13 R

690

27

5 70

0

100

1

13

115

1

17

36m

pitimes4

omp

1024

6R

30

603

118

0

684

11

16 R

697

63

0 70

3

100

1

13

116

1

17

2plusmn

(BC

DEacute

) 10

24

6R3

0 60

3

29

0 68

8

714

13 R

702

29

0 71

0

100

1

14

116

1

18

126m

pitimes4

omp

1024

16R

39

822

3

415

8

805

27

242

6 R

738

1

79

0 71

5

100

0

98

090

0

87

180m

pitimes4

omp

1024

18R

50

925

615

0

892

40

283

2 R

776

127

5 75

3

100

0

96

084

0

81

2plusmn

(BC

DEacute

) 10

24

18R

50

925

712

9

861

29

352

6 R

808

109

0 76

8

100

0

93

087

0

83

612m

pitimes4

omp

1024

45R

68

110

2

349

0 10

34

525

9 R

111

4

349

0 10

46

100

0

94

101

0

95

2plusmn

(BC

DEacute

) 10

24

45R

68

110

2

2810

9

920

57

634

9 R

960

5

358

7 83

4

100

0

83

087

0

76

17m

pitimes8

omp

1024

3R

57

338

1

117

0

430

9

19 R

432

28

5 43

3

100

1

27

128

1

28

90m

pitimes8

omp

1024

15R

60

504

109

0 44

9

243

8 R

471

109

0 47

1

100

0

89

093

0

94

357m

pitimes8

omp

1024

53R

67

731

418

7 69

2

576

3 R

711

418

7 72

9

100

0

95

097

1

00

2plusmn

(BC

DEacute

) 10

24

53R

67

731

3410

5

642

57

685

3 R

628

6

438

3 60

2

100

0

88

086

0

82


brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

ucircuumlyacutethorn

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net/


A

FX1

HPC

C B

MT

G-F

FTEfeaZUuml^13~

-DH

PCC

_FFT

_235GmWgt

FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave

EP-

DG

EM

M-a5^regbrvbarpoundaringaeligccedil13h-a5

(15

MB

Cor

e)UacuteCordfzyacuteCq

Tabl

e A

-1feaZUuml^13~]-a5^regCD

FX1

HPC

C B

MT

UgraveccedilegravekeacuteUgraveccedilknUgraveccedilplusmnsup2sup3

~~regreg

Ugraveccedilmmnecircecirckccedilnecircfrac12

regReeumlgecirckccedilnecircfrac12

ndegplusmnsup2

ecirckccedilpUgraveecircfrac12frac12plusmnsup2sup3degmiddot

plusmnsup2ordmdegsup2sup1plusmnsup2sup3degmiddot

eacuteplusmn~+igraveiacute8

nmcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

reg~

reg~v

reg

sup2sup3plusmn

qvqyuqou

ovwyy

qvquzqq

vxzwo

qvzz

vyoy

zzv|w|

qvuqqy

uvuyx

qyu

sup2sup3plusmnqvqy|zx

ovwz

qvquyxz

vxuo

qvzuw

vyowx

zzv|z

qvxx|y

yvox

qwy

sup2sup3plusmn

qvxoquq

uvoxzo

qvoqqz

yyvuox

w|vyzx

vyoxx

zzvzyq

qv|qq

vqy

yy

sup2sup3plusmnqvxoqwwq

uvqwxx

qvooqzu

yyvxuwo

w|vyyq

vyouu

zzvzzx

qvux|

oqvz|

xw

sup2sup3plusmn

|vwq|qqq

|uvwz

qvqoxxw

qv|

o|uuv|q

vyxy

zzv||wq

qvqwzuw

oqvywyu

sup2sup3plusmn|vw||wqq

|vyz

qvuxu

yyyvuwwu

o|uyvyw

vy|q

zzv|uy

qvxwzqu

ovxuo

zq|z

sup2sup3plusmn

|vwuzqq

uuvxqwq

qvqoxxw

wwyv|zq

o|uv|q

vy|ox

xvuuwu

qvowox

oqvowuyq

sup2sup3plusmn|vwoquxqq

|yvzyxu

qvuzou

yvqwq|

o|uyvzxu

vy|qw

xvuuz

qvqwuy

ovuz

|y

uxqqq

wxqqq

|xqqqq

|qqqq

w

frac14icircRfrac14Rplusmndegreg

szligsup3degicircRRszligcedilszligiumliumlRethvRntildeRfrac14degicircRccedilogravemicroplusmnregntildemicro~sup1Oslashsup3sup2moRenecircfrac12oacuteregplusmnAumlocircotildeg

egravekeacuteOslashfrac14knicircccedilpegravekszligszligOslashmmnOslash|xRccedilpOslashpOslashfrac14knRccedilpegravekszligszligOslashfrac12ecircfrac12eacuteeacuteszlignRocircotilde

mszligpecircmicircccedilpeacutefrac14UgraveOslashOslashyunRocircotilde

frac12kicircRRkplusmnplusmnplusmnraquodegReacutemiddotplusmnmiddotRkplusmn~plusmnmiddot

-

moRexoszligcedilowszligkoumlg

szligdegicircRkszligyudividentildeRvxUgraveegraveparantildeRpposlashueuqUgravecedilregedegsup2plusmngntildeRo|vxUgravecedilregesup3plusmnregsup2RugraveRnecircfrac12gg

Cuacute uacuteucircicircuumlTyacutethornYacuteN0ucircicircuumlTyacutethornYacuteN0gt

|

xo

xo

egravekszligszligRethvicircRethov|vo

eeumlgicircecirckccedilnecircfrac12RregR0Eeecirckccedilnecircfrac12Rndegplusmnsup2gIcirc-RJ13NOIPQ

~~icircRppRmicrodegdegugraveplusmnsup2ntildeRoreg~cedilsup2ntildeRmplusmnRn


B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Figure panels, each comparing rdma and nordma with 1024 processes: Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv, Gather]

Fig B-1 MPI communication, 1024 processes (1)


[Figure panels, each comparing rdma and nordma with 1024 processes: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank0 > Rank1023)]

Fig B-2 MPI communication, 1024 processes (2)


13

C

RD

MAoacute

NO

RD

MA

Ta

ble

C-13UVOcirc$

(8K

Bccedil

78K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

quw-

uqzy-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

~plusmnreg

plusmn

sup2~

sup2~

sup2cedil~raquo

sup2~Oslashreg~plusmn

middotplusmnsup1

middotplusmnsup1raquo

~plusmnraquo

~plusmn

~plusmn

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

plusmnraquo

Tabl

e C

-2UVOcirc$

(80

KBccedil

584K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

sup2~Oslashreg~plusmn

plusmn

~plusmnreg

middotplusmnmiddotsup1

~plusmnraquo

~plusmn

sup2~

sup2cedil~raquo

sup2~

middotplusmnsup1raquo

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

~plusmn

plusmnraquo

Zordf

100M

RD

MAAgravejordfnotyWXq

0CXM

NO

RD

MA

AgravejordfnotyWXq


D

W Oslash5

Tabl

e D

-1Zccedil`

MPI

8mpi

qzvquecirciumlq|yvwuecirciumlquvzuecirciumlquovu|oecirciumlqxvwoqecirciumlquyvxqoecirciumlquyvxoecirciumlqu

qwvo

qzvzzecirciumlq|vuecirciumlquwvyecirciumlquovu|oecirciumlqxvwuecirciumlquyvuzoecirciumlquxvy|ecirciumlqu

yqyv|

owvz|oecirciumlq|yvwyyecirciumlquvxzecirciumlquvqxecirciumlquov|uecirciumlquyvuecirciumlquccedilxv|xecirciumlq

ccedilyvq

ozvqoecirciumlq|vxxecirciumlquwvux|ecirciumlquvyzoecirciumlquovxoecirciumlquyvuuqecirciumlquccedilvyecirciumlq|

ccedilwuvy

ovq|ecirciumlquyvzz|ecirciumlquwvq|qecirciumlquovu|qecirciumlqxvw|wecirciumlquyvuyoecirciumlquyvywecirciumlqu

yquv

ovquyecirciumlquvuyqecirciumlquwvxqecirciumlquovu|qecirciumlqxvwecirciumlquyvuwqecirciumlquxvzxecirciumlqu

xx|vw

|ovq|zecirciumlquyvwzuecirciumlquvz||ecirciumlquvyoecirciumlquovwecirciumlquyvuzecirciumlquccedilovozecirciumlq|

ccediloyvy

|ovquecirciumlquv|qecirciumlquwvuoyecirciumlquvyy|ecirciumlquovowyecirciumlquyvuecirciumlquccedilvxwecirciumlq|

ccedilovz

uzvwoecirciumlq|vuxyecirciumlquwvu|uecirciumlquovu|ecirciumlqxvw|ecirciumlquyvuuxecirciumlquxvwwuecirciumlqu

yqovy

uzvywecirciumlq|yvzuzecirciumlquvzoyecirciumlquovuwecirciumlqxvwqyecirciumlquyvuecirciumlquyv|yecirciumlqu

yxvu

xzvqecirciumlq|vuoqecirciumlquwv|wqecirciumlquvyyecirciumlquovowecirciumlquyvuuzecirciumlquccedilvozecirciumlq|

ccedil|vx

xzvyzzecirciumlq|voxuecirciumlquwvouecirciumlquvqqecirciumlquovxuecirciumlquyvuuyecirciumlquccediluv|yecirciumlq|

ccedilu|v

yzvuecirciumlq|vxzecirciumlquwvxecirciumlquovuyecirciumlqxvwecirciumlquyvuxecirciumlquxvyzqecirciumlqu

xw|vw

yzvz|oecirciumlq|yvwzoecirciumlquvwwxecirciumlquovuzecirciumlqxvwyecirciumlquyvuyyecirciumlquyvuqwecirciumlqu

yuxv|

zvwqecirciumlq|v|uzecirciumlquwv||yecirciumlquvyxwecirciumlquovozecirciumlquyvuzecirciumlquccedilyvwuecirciumlq|

ccedilywv

ovqq|ecirciumlquyvzzyecirciumlquvzzzecirciumlquvyzecirciumlquovqoecirciumlquyvuzecirciumlquccedil|vozxecirciumlq|

ccedil|ovz

zvuzecirciumlq|vowqecirciumlquwvoxxecirciumlquovoqqecirciumlqxuvxwecirciumlquyvuqecirciumlquvwu|ecirciumlqu

zovy

zvwecirciumlq|vyxecirciumlquwvuecirciumlquovqzzecirciumlqxuvxoecirciumlquyvuyzecirciumlquvu|ecirciumlqu

wvw

qwv|xzecirciumlq|vzecirciumlquwvywecirciumlquvxyoecirciumlquovqwuecirciumlquyvuecirciumlquccedilovqyecirciumlquccedilov

qzvuuuecirciumlq|v|yecirciumlquwvqecirciumlquovuzecirciumlqxvwqqecirciumlquyvuzoecirciumlquyvqoecirciumlqu

y|vx

oovouwecirciumlquwvqqecirciumlquzvoywecirciumlquovuqwecirciumlqxvyxxecirciumlquyvuwecirciumlquuvzoxecirciumlqu

uwvq

owvwuecirciumlq|vq|qecirciumlquvzqzecirciumlquv|uuecirciumlquzvqoecirciumlq|yvu|ecirciumlquccedilxvyxqecirciumlq|

ccedilyuv|

yvz||ecirciumlq|v|oecirciumlquwvuuecirciumlquvqyecirciumlqxovuowecirciumlqxyvu||ecirciumlquovozecirciumlqxoxwvy

ovouecirciumlquvuxecirciumlquwv|zecirciumlquovu|oecirciumlqxvwxyecirciumlquyvux|ecirciumlquxvzoecirciumlqu

xoyvo

|ov|qoecirciumlquv|zqecirciumlquwvyzoecirciumlquv|oecirciumlquwvu|uecirciumlq|yvu|ecirciumlquccedilov|uecirciumlquccediloqxv

|v|qecirciumlq|vqecirciumlquvxzecirciumlquvxqecirciumlquovq|ecirciumlquyvuzecirciumlquccedilv|zoecirciumlq|

ccedil|vu

uwvxy|ecirciumlq|wvzoyecirciumlquzv|ecirciumlquovuozecirciumlqxvuoqecirciumlquyvwqecirciumlquuvuoecirciumlqu

xoxvw

uovquecirciumlquvwecirciumlquwvqecirciumlquovuouecirciumlqxvyxqecirciumlquyvuzoecirciumlquxvwoecirciumlqu

xy|vx

xzvux|ecirciumlq|vxwecirciumlquwvx|ecirciumlquvx|oecirciumlquovoqoecirciumlquyvu|qecirciumlquccedilzvzoecirciumlq|ccediloquvz

xwvqqecirciumlq|vuuwecirciumlquwv|owecirciumlquvzuzecirciumlquov|wyecirciumlquyvxy|ecirciumlquccedil|vywuecirciumlq|

ccediluv|

yzvqecirciumlq|zv|w|ecirciumlquovq|qecirciumlqxovuxecirciumlqxvyecirciumlquyvzqecirciumlquuvqecirciumlqu

uyuvq

yovqyuecirciumlquv|quecirciumlquwv|ywecirciumlquovuyuecirciumlqxwvqxwecirciumlquyvxwoecirciumlquyvecirciumlqu

xwzvx

ovoooecirciumlquvzwwecirciumlquzvqzzecirciumlquvo|ecirciumlqxovuoecirciumlqxyvyouecirciumlquovecirciumlqxooqqvy

ovqoecirciumlquvqyoecirciumlquwvo|ecirciumlquvy|ecirciumlquovoecirciumlquyvuxecirciumlquccediluvxzecirciumlq|

ccediluvz

zvy|ecirciumlq|wvoqqecirciumlquzvqyecirciumlquov|uqecirciumlqxyvw|ecirciumlquyvxyecirciumlquuv||ecirciumlqu

uzovo

zvyzoecirciumlq|vqwecirciumlquwvoecirciumlquovqzwecirciumlqxuvuwwecirciumlquyvuzyecirciumlquvwqyecirciumlqu

yxvy

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg


[Figure panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2]

Fig D-1 Flat MPI, 8mpi


Tabl

e D

-2Icirca|ccedil`eacuteYacute~

1

4mpij

8om

p

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

[Figure panels: rdma (4mpi x 8omp)-1, nordma (4mpi x 8omp)-1]

Fig D-2 Hybrid 1, 4mpi x 8omp (1)


[Figure panels: rdma (4mpi x 8omp)-2, nordma (4mpi x 8omp)-2, rdma (4mpi x 8omp)-3, nordma (4mpi x 8omp)-3]

Fig D-3 Hybrid 1, 4mpi x 8omp (2)


Tabl

e D

-3Icirca|ccedil`eacuteYacute~

2

4mpij

8om

p

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

[Figure panels: rdma (4mpi x 8omp)-1, nordma (4mpi x 8omp)-1]

Fig D-4 Hybrid 2, 4mpi x 8omp (1)


[Figure panels: rdma (4mpi x 8omp)-2, nordma (4mpi x 8omp)-2, rdma (4mpi x 8omp)-3, nordma (4mpi x 8omp)-3]

Fig D-5 Hybrid 2, 4mpi x 8omp (2)


E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MBs
RANK = 2 TIME = 39197 sec Ave Speed = 32655168 MBs
RANK = 3 TIME = 39212 sec Ave Speed = 32643219 MBs
RANK = 4 TIME = 34341 sec Ave Speed = 37272765 MBs
RANK = 5 TIME = 34175 sec Ave Speed = 37453885 MBs
RANK = 6 TIME = 34086 sec Ave Speed = 37552470 MBs
RANK = 7 TIME = 34329 sec Ave Speed = 37286226 MBs
RANK = 8 TIME = 24021 sec Ave Speed = 53287420 MBs
RANK = 9 TIME = 23794 sec Ave Speed = 53794939 MBs
RANK = 10 TIME = 23805 sec Ave Speed = 53770236 MBs
RANK = 11 TIME = 23798 sec Ave Speed = 53786914 MBs
RANK = 12 TIME = 23942 sec Ave Speed = 53462490 MBs
RANK = 13 TIME = 23931 sec Ave Speed = 53488046 MBs
RANK = 14 TIME = 23950 sec Ave Speed = 53443826 MBs
RANK = 15 TIME = 24066 sec Ave Speed = 53186052 MBs
RANK = 16 TIME = 23798 sec Ave Speed = 53786327 MBs
RANK = 17 TIME = 23802 sec Ave Speed = 53775886 MBs
RANK = 18 TIME = 23821 sec Ave Speed = 53733399 MBs
RANK = 19 TIME = 23794 sec Ave Speed = 53795866 MBs
RANK = 20 TIME = 23939 sec Ave Speed = 53468837 MBs
RANK = 21 TIME = 23938 sec Ave Speed = 53470775 MBs
RANK = 22 TIME = 23939 sec Ave Speed = 53470333 MBs
RANK = 23 TIME = 23942 sec Ave Speed = 53462889 MBs


Fig E-2 afeaZQR MPIbrvbar

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39370 sec Ave Speed = 32511775 MBs
RANK = 2 TIME = 39475 sec Ave Speed = 32425231 MBs
RANK = 3 TIME = 39415 sec Ave Speed = 32474846 MBs
RANK = 4 TIME = 34392 sec Ave Speed = 37217925 MBs
RANK = 5 TIME = 34543 sec Ave Speed = 37055373 MBs
RANK = 6 TIME = 34250 sec Ave Speed = 37372742 MBs
RANK = 7 TIME = 34709 sec Ave Speed = 36878286 MBs
RANK = 8 TIME = 26967 sec Ave Speed = 47465876 MBs
RANK = 9 TIME = 26964 sec Ave Speed = 47471018 MBs
RANK = 10 TIME = 26955 sec Ave Speed = 47486289 MBs
RANK = 11 TIME = 26947 sec Ave Speed = 47499771 MBs
RANK = 12 TIME = 26957 sec Ave Speed = 47483647 MBs
RANK = 13 TIME = 27020 sec Ave Speed = 47373054 MBs
RANK = 14 TIME = 26948 sec Ave Speed = 47498023 MBs
RANK = 15 TIME = 27174 sec Ave Speed = 47104542 MBs
RANK = 16 TIME = 26956 sec Ave Speed = 47485054 MBs
RANK = 17 TIME = 26945 sec Ave Speed = 47504899 MBs
RANK = 18 TIME = 26946 sec Ave Speed = 47502360 MBs
RANK = 19 TIME = 26967 sec Ave Speed = 47466158 MBs
RANK = 20 TIME = 26948 sec Ave Speed = 47499767 MBs
RANK = 21 TIME = 26982 sec Ave Speed = 47438204 MBs
RANK = 22 TIME = 27087 sec Ave Speed = 47254830 MBs
RANK = 23 TIME = 27004 sec Ave Speed = 47399960 MBs


Fig E-3 QRfeaZOpenMPIbrvbar

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
                                                      NODE   CPU  coreid
Rank = 0                                              bx2050 CPU0 0
all time
RANK = 1 TIME = 38110 sec Ave Speed = 33587213 MBs    bx2050 CPU1 1
RANK = 2 TIME = 28656 sec Ave Speed = 44667752 MBs    bx2050 CPU0 2
RANK = 3 TIME = 37959 sec Ave Speed = 33720175 MBs    bx2050 CPU1 3
RANK = 4 TIME = 28658 sec Ave Speed = 44664664 MBs    bx2050 CPU0 4
RANK = 5 TIME = 37829 sec Ave Speed = 33836073 MBs    bx2050 CPU1 5
RANK = 6 TIME = 28641 sec Ave Speed = 44691880 MBs    bx2050 CPU0 6
RANK = 7 TIME = 37892 sec Ave Speed = 33779924 MBs    bx2050 CPU1 7
RANK = 8 TIME = 35140 sec Ave Speed = 36425620 MBs    bx2051 CPU0 0
RANK = 9 TIME = 35017 sec Ave Speed = 36554163 MBs    bx2051 CPU1 1
RANK = 10 TIME = 38916 sec Ave Speed = 32891010 MBs   bx2051 CPU0 2
RANK = 11 TIME = 45416 sec Ave Speed = 28184186 MBs   bx2051 CPU1 3
RANK = 12 TIME = 35172 sec Ave Speed = 36392511 MBs   bx2051 CPU0 4
RANK = 13 TIME = 35053 sec Ave Speed = 36516468 MBs   bx2051 CPU1 5
RANK = 14 TIME = 35200 sec Ave Speed = 36363472 MBs   bx2051 CPU0 6
RANK = 15 TIME = 35009 sec Ave Speed = 36561987 MBs   bx2051 CPU1 7
RANK = 16 TIME = 22996 sec Ave Speed = 55661441 MBs   bx2052 CPU0 0
RANK = 17 TIME = 35587 sec Ave Speed = 35968575 MBs   bx2052 CPU1 1
RANK = 18 TIME = 35587 sec Ave Speed = 35968151 MBs   bx2052 CPU0 2
RANK = 19 TIME = 35712 sec Ave Speed = 35842525 MBs   bx2052 CPU1 3
RANK = 20 TIME = 22996 sec Ave Speed = 55661978 MBs   bx2052 CPU0 4
RANK = 21 TIME = 36856 sec Ave Speed = 34729458 MBs   bx2052 CPU1 5
RANK = 22 TIME = 22994 sec Ave Speed = 55666382 MBs   bx2052 CPU0 6
RANK = 23 TIME = 22991 sec Ave Speed = 55673257 MBs   bx2052 CPU1 7


13

F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig F-1 ^_ZugraveuacuteLFlat MPIP

(main)
      n = nmb*1024*1024/8          ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                    ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
        a(i) = 1.0d0
        b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
        call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
        b(i) = a(i)
      end do
      return
      end
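The ave_spd expression in the listing above can be checked by hand. The following is a minimal sketch in Python (not the report's Fortran), assuming the expression reads (nsize x 8 bytes) / (timew / NNN) / 1024 / 1024, i.e. bytes per copy divided by seconds per copy, expressed in MB/s. The function name, the argument names and the 12.5-second elapsed time are hypothetical and chosen only for illustration; they are not measured values from this report.

```python
def average_copy_speed_mb_per_s(n_elements, elapsed_seconds, repetitions,
                                bytes_per_element=8):
    """Average array-copy speed in MB/s (1 MB = 1024*1024 bytes).

    Mirrors the assumed form of ave_spd in the listing:
    (nsize * 8) / (timew / NNN) / 1024 / 1024.
    """
    bytes_per_copy = n_elements * bytes_per_element      # size of one a -> b copy
    seconds_per_copy = elapsed_seconds / repetitions     # mean time of one copy
    return bytes_per_copy / seconds_per_copy / 1024.0 / 1024.0


# Array length as in the listing: nmb = 128 MB of real(8) elements.
nmb = 128
n = nmb * 1024 * 1024 // 8

# Hypothetical timing: 100 repetitions taking 12.5 s in total.
speed = average_copy_speed_mb_per_s(n, elapsed_seconds=12.5, repetitions=100)
print(speed)  # 1024.0
```

With these illustrative inputs each copy moves 128 MB in 0.125 s, so the function returns 1024 MB/s. Note that the formula counts only the bytes written to b; the bytes read from a are not included in the reported speed.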


Fig F-2 ^_ZugraveuacuteLOpenMPP

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8          ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
        write(6,*) 'Array allocation failed'
        stop
      end if
!$OMP end parallel
      nsize = n                    ! size
c     NNN : Repetition number of times
      NNN = 100
!$OMP parallel private(i),shared(nsize)
      do 100 i = 1, nsize
        a(i) = 1.0d0
        b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
        call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
        b(i) = a(i)
      end do
      return
      end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G.1 Method

Using a test program that calls MPI functions, flat MPI with 1024 processes was compared with hybrid parallelization with 128 processes × 8 threads, and flat MPI with 128 processes was compared with hybrid parallelization with 128 processes × 8 threads [...]. Measurements were made in both RDMA and NORDMA modes. [...] Message sizes ranged from 8 bytes to 1 Kbyte and from 256 Kbytes to 1 Mbyte.

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

sup2~ yntildeouu |ntildewzz |ntildeuou ovou ontildezx ontildeqyw ovw| ovzz |vqsup2~ xuntildeww ntilde|o ntildeuo qvzz |ntildeqo |ntildey qvz vu vsup2~ wyntildeu| oqntildexx oontildeqyy qvz |ntildezwy untildeuyw qvwz vq vuwsup2~ ontildequwntildexy ountildexxy oxntildeuy qvzu untildezqq xntildewyw qvwu vz vyu

sup2~ yntildeouu zou zo qvzu ontildeqqq ontildeq qvzw qvzo qvzxsup2~ xuntildeww ontildeo ontildezy qvwz ontildew ntildeuuz qv| qvz qvzsup2~ wyntildeu| ntildexuo ntildey| qvz ntildeyzz |ntildeuqw qvz qvzu qvwosup2~ ontildequwntildexy |ntildex untildeqox qvzu |ntildexww untilde|| qvw ovqx qvz

sup2~Oslashreg~plusmn yntildeouu yntildeuw| wontildexoz qvzu xntildeuo |ntildeoqw ovoq |vqq |vx|sup2~Oslashreg~plusmn xuntildeww ou|ntildewx ountildeqo qvw| uyntildeqzq yyntildewxo qvyz |vo vyosup2~Oslashreg~plusmn wyntildeu| oontildezoo yzntildeuz qvz yntildeqz zxntildeox qvo |voy vw|sup2~Oslashreg~plusmn ontildequwntildexy w|ntildeyox |yontildeuq| qvw wntildexo o|ntildexqz qvo |v| vz|

middotplusmnsup1 yntildeouu |ntildezw ||ntilde|x| qvzz oontildexww oxntildeyzu qvu vwx vo|middotplusmnsup1 xuntildeww y|ntildex yyntildeyy qvzx ontildez|x untilde|x qvuy vww ovuomiddotplusmnsup1 wyntildeu| wzntildewy z|ntildeyux qvzy |qntildeyx yntildeyqu qvuy vz ov|zmiddotplusmnsup1 ontildequwntildexy oqntildeux| o|untildeqox qvzq uqntildeyxx wwntildez| qvuy vzy ovxo

middotplusmnsup1raquo yntildeouu yntildew|u ||ntildeu ovww ntildewxy oxntildeyuw uvyy qvwy voumiddotplusmnsup1raquo xuntildeww wxntildeoz yuntildeo| ov|| wxntildex|| untildeuy ovwq ovqq ov|xmiddotplusmnsup1raquo wyntildeu| ooxntildeuw z|ntildezyu ov| z|ntildeyow yntildexuz ov|z ov| ov|zmiddotplusmnsup1raquo ontildequwntildexy o|ntildexx o|untildeuqz ovq zyntildexoq wntildeuuy ovoq ovu| ovxu

Ugraveplusmnsup1 yntildeouu |zntildeyu| oqntildexwz |vu uontildeuyq ountildewy vz qvzy qvoUgraveplusmnsup1 xuntildeww untildewxx ntildeqo voy uzntildeoy| yntildewo ovw| qvz qvw|Ugraveplusmnsup1 wyntildeu| xxntildewoq zntildewyu ovw xyntildewuu |yntildexqx ovxy qvzw qvwUgraveplusmnsup1 ontildequwntildexy yntildewxy |ntildexxw ovy yuntildewzx uyntildeoq ovuo qvz qvwo

Ugraveplusmnsup1raquo yntildeouu uqntildeq|u oqntildeyxq |vy uontildey| ountildez|z vz qvzy qvoUgraveplusmnsup1raquo xuntildeww uwntilde|u ntildeqz vo uzntildew yntildewx ovw| qvzw qvw|Ugraveplusmnsup1raquo wyntildeu| xyntildezy zntildez|y ovww xntildeo| |yntildeuwq ovx qvzw qvwUgraveplusmnsup1raquo ontildequwntildexy y|ntildeo| |ntildeyxq ovyw yxntilde|yw uyntildeqx ovuo qvz qvwo

~plusmn yntildeouu untildeyw wntildezu uvo uuntildeo wntildeuq| xvx qvzy ovq~plusmn xuntildeww xqntildeqyx qntilde|xq vuy xontildexwy untildezux vq qvz qvw~plusmn wyntildeu| xwntildeq wntildeux vqy yqntildeox |untildeu ov| qvzw qvw~plusmn ontildequwntildexy yyntildequq |yntilde|q| ovw ywntildeqx uuntildeuyy ovx| qvz qvw

~plusmnraquo yntildeouu zntilde|q zntildeqq ovqu zntildexq wntildeu|w ovo| qvzw ovq~plusmnraquo xuntildeww oyntildeu| qntilde|wy qvwo oyntildeoz untildezuy qvyx ovq qvw~plusmnraquo wyntildeu| |ntildewo wntildeuxu qvw ntildewz |untilde qvyy ovq qvw~plusmnraquo ontildequwntildexy zntildezy |yntildexo| qvw zntildezxq uuntildeu qvy ovqq qvw

plusmn yntildeouu yxntildewu wzntildeq| qv| ontildeyoz yntildezz qvwq |vq |v|qplusmn xuntildeww ountildeqoz oxntilde|o qvwo |zntildexu xxntilde|ux qv |vo vxplusmn wyntildeu| ow|ntildeuwy untildeyu qvw xntildeoyu untildezu| qvy |vo |vqqplusmn ontildequwntildexy u|ntildewo zxntildezqo qvw xntildezy zuntildeoy qvwq |vo |vo

plusmnraquo yntildeouu x|ontildeyx wyntildeqoz yvow o|ntildeqox owntilde|xu vu |vww uvyzplusmnraquo xuntildeww yozntildeyz| oz|ntildeuwu |vq oyntildeyw ywntildeyw v| |vwo vwplusmnraquo wyntildeu| yw|ntildewxu wwntildeooz v| owyntilde|q| oqxntildeyw ovy |vy vplusmnraquo ontildequwntildexy |yntildezo uqxntildeoyq ovw oontildexo ouontildezzx ovuz |vuw vwx

~plusmnreg yntildeouu uzu y| qvw uwo xzw qvwq ovq| ovqy~plusmnreg xuntildeww zq| ontildeoz qvy wo| ontildexuy qvx| ovoo qv~plusmnreg wyntildeu| ontilde|q ontildeyq qvw ontildeoxo ntildeoqz qvxx ovo| qvz~plusmnreg ontildequwntildexy ontildezqu ntilde|ux qvwo ontildeuxu ntildeyx qvxu ov|o qvww

~plusmn yntildeouu ountildewqx ountildez|x qvzz ontildeqq owntildeuw| ovou qvq qvwo~plusmn xuntildeww zntildeuy zntildewxz qvzw uqntildex|o uwntildexyw qvw| qv qvyo~plusmn wyntildeu| uuntildeq uuntildeo qvzz xyntildeyoq yzntildexzw qvwo qvw qvyu~plusmn ontildequwntildexy qntildeqqx qntildeu|q qvzz ntildeo zqntilde|ux qvwo qvzy qvw

sup2cedil~raquo yntildeouu x w qvz zo ux vq| qvw ov|sup2cedil~raquo xuntildeww ox o| qv| o|w o| qvyx qvzo qvwosup2cedil~raquo wyntildeu| oo |o qvu owz w| qvy qvzo qvwsup2cedil~raquo ontildequwntildexy ou wz qvu | |x qvyx qvz qvwo

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

sectWOcirceg

ow$XAcircsup3deg owsup3degIcircwsup3


Fig 3 [Storage configuration diagram. Recoverable labels: M9000 1 IO, M9000 2 IO; FC (1) ... FC (42); DX80 1, DX80 2, ... DX80 35, DX80 36, each with four 1TB units; ETERNUS DX80 (36); E4K 1, ETERNUS4000 Model 400 (1); QFS; 32GB/s (8FC); 576GB/s (144FC)]


3 HPCC Benchmark

The seven tests of the HPCC (High Performance Computing Challenge) 1) benchmark were run on BX900 and FX1. This chapter outlines HPCC and reports the results.

3.1 Overview of HPCC

The HPCC benchmark was developed by J. Dongarra and colleagues at the University of Tennessee [...]. It combines three tests of computational performance (HPL (High-Performance Linpack), DGEMM and FFTE), two tests of memory access performance (STREAM and Random Access) and two tests of communication performance (PTRANS and b_eff) 2). [...]

3.1.1 Tests

[...] Each test is run in one or more of three modes: Global (G: Global system performance), Single (SN: Single Environment) and Embarrassingly Parallel (EP).

3.1.1.1 HPL

HPL solves a system of linear equations by LU factorization and measures floating-point performance. [...] It is the MPI implementation of Linpack that is used for the TOP500 supercomputer ranking. Results are reported in Tflops. It is run in one mode, Global (G).

3.1.1.2 DGEMM

DGEMM measures the floating-point performance of matrix multiplication. [...] Results are reported in Gflops (Giga floating-point operations per second). It is run in two modes, Single (SN) and Embarrassingly Parallel (EP).

3.1.1.3 STREAM

STREAM measures memory bandwidth [...]. It uses four operations, copy (Copy), scaling (Scale), addition (Add) and multiply-add (Triad), and reports results in GB/s (Giga Bytes per second). It yields eight results: the four operations in each of the Single (SN) and Embarrassingly Parallel (EP) modes.

3.1.1.4 PTRANS

PTRANS (Parallel matrix TRANSpose) transposes a matrix in parallel [...] and measures communication performance. Results are reported in GB/s. It is run in one mode, Global (G).

3.1.1.5 Random Access

Random Access measures the rate of updates to random locations in memory [...]. Results are reported in Gups (Giga updates per second). It is run in three modes: Single (SN), Embarrassingly Parallel (EP) and Global (G).

3.1.1.6 FFTE

FFTE measures the floating-point performance of the Fourier transform. [...] Results are reported in Gflops.

3.1.1.7 b_eff

b_eff measures communication latency and communication bandwidth. [...] It is run in Global (G) mode and uses two tests, Ring and PingPong. Latency is measured with 8 B messages and bandwidth with 2 MB messages. Ring [...] is run in two patterns, natural order and random order. PingPong exchanges messages between 2 cores [...]. b_eff reports six items: (1) Ring (natural order) latency, (2) Ring (natural order) bandwidth, (3) Ring (random order) latency, (4) Ring (random order) bandwidth, (5) PingPong latency and (6) PingPong bandwidth. Latency is given in usec and bandwidth in GB/s.

3.1.2 Measurement

HPCC defines two kinds of measurement, baseline runs (Baseline Runs) and optimized runs (Optimized Runs). [...] The measurements in this report were made as baseline runs.

3.1.2.1 Baseline runs

A baseline run allows the choice of compiler options and of the BLAS and MPI libraries, but no changes to the source program. [...]

3.1.2.2 Optimized runs

An optimized run additionally permits changes to the source program [...].
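The four STREAM operations named in Section 3.1.1.3 (Copy, Scale, Add, Triad) can be illustrated with a short sketch. This is not the HPCC implementation; it is a plain Python rendering of the standard STREAM kernel definitions, and the function name, the array contents and the scalar q are hypothetical. The byte counts assume 8-byte elements, which is how a STREAM-style bandwidth figure is usually derived from the elapsed time.

```python
# Bytes read plus written per element by each kernel, assuming 8-byte reals.
BYTES_PER_ELEMENT = {"Copy": 16, "Scale": 16, "Add": 24, "Triad": 24}


def stream_kernels(a, b, c, q):
    """Apply the four STREAM operations once, in the usual order, in place."""
    n = len(a)
    for i in range(n):
        c[i] = a[i]                # Copy:  c = a
    for i in range(n):
        b[i] = q * c[i]            # Scale: b = q * c
    for i in range(n):
        c[i] = a[i] + b[i]         # Add:   c = a + b
    for i in range(n):
        a[i] = b[i] + q * c[i]     # Triad: a = b + q * c
    return a, b, c


a, b, c = stream_kernels([1.0] * 4, [2.0] * 4, [0.0] * 4, 3.0)
print(a[0], b[0], c[0])
```

A bandwidth in GB/s would follow by dividing the bytes moved by a kernel (array length times the per-element count above) by that kernel's elapsed time.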


3.2 [...]

On BX900 and FX1, the HPCC BMT tests were each measured at 8 to 512 parallel.

3.2.1 BX900

Table 1 shows the HPCC BMT results on BX900, and Fig 4 to Fig 7 plot the results of the HPL, PTRANS, Random Access and FFTE tests. [...] The HPL result is 82.2% (= 4.902/5.96 × 100) of the theoretical peak performance *1.

For comparison with Table 1, Table 2 shows the HPCC results of the Intel Endeavor cluster (hereafter Endeavor) taken from the HPCC web site 1). [...] Because Endeavor uses the Intel compiler, Table 3 shows [...]. For STREAM, [...] (35 GB/s versus 42 GB/s) [...]; this is discussed in Section 4.2.2.

*1 On the whole BX900 system, 17,072 cores with a theoretical peak performance of 200.0 Tflops achieved 191.4 Tflops, an efficiency of 95.66%.
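The efficiency figures quoted above can be reproduced with a short calculation. The sketch below is an illustration, not the report's own procedure: it assumes that each Xeon X5570 core at 2.93 GHz performs 4 double-precision floating-point operations per cycle, and the function names are hypothetical. Under that assumption 17,072 cores give a peak of about 200.08 Tflops, consistent with the 200.0 Tflops quoted in the footnote.

```python
def peak_tflops(cores, clock_ghz, flops_per_cycle=4):
    """Theoretical peak in Tflops (flops_per_cycle=4 is an assumption)."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


def efficiency_percent(measured_tflops, peak):
    """Measured performance as a percentage of the peak."""
    return 100.0 * measured_tflops / peak


full_peak = peak_tflops(17072, 2.93)                 # whole BX900 system
full_eff = efficiency_percent(191.4, full_peak)      # footnote figures
part_eff = efficiency_percent(4.902, 5.96)           # 512-parallel figures quoted in the text
print(round(full_peak, 2), round(full_eff, 2), round(part_eff, 1))
```

The 512-parallel case uses the two values quoted in the text directly instead of deriving the peak, since the text does not state how the 5.96 figure was obtained.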


Table 1 HPCC BMT measurement results on the JAEA BX900 with the Fujitsu compiler

Rows: number of processes (8, 16, 32, 64, 128, 256, 512) and communication mode (rdma, nordma).

Columns: G-HPL [TFlops]; G-PTRANS [GB/s]; G-RandomAccess, original version [Gup/s]; G-RandomAccess, SANDIA_OPT2 version [Gup/s]; G-FFTE [GFlops]; EP-STREAM Sys (*) [GB/s]; EP-STREAM Triad [GB/s]; EP-DGEMM [GFlops]; RandomRing Bandwidth [GB/s]; RandomRing Latency [usec].

System: Fujitsu BX900 (512 cores / 128 CPUs); chip: Xeon X5570 2.93 GHz, DDR3-1066, SMT ON, Turbo Mode OFF; interconnect: QDR InfiniBand, 2 sockets/node, fat tree; OS: Red Hat EL 5; compiler: Fujitsu C/C++, options -Kfast -SSL2; MPI: Fujitsu Parallelnavi Language Package; HPCC Ver. 1.3.1.

(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × number of processes.
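The footnote to Table 1 states that EP-STREAM Sys is not measured directly but derived as EP-STREAM Triad multiplied by the number of processes. A minimal sketch of that derivation follows. The function name and the example Triad value are illustrative assumptions, not entries taken from the table.

```python
def ep_stream_sys(triad_gb_per_s: float, processes: int) -> float:
    """Aggregate STREAM bandwidth: per-process Triad rate times process count."""
    if processes < 1:
        raise ValueError("process count must be at least 1")
    return triad_gb_per_s * processes


# Hypothetical example: 3.5 GB/s per process on 512 processes.
print(ep_stream_sys(3.5, 512), "GB/s")
```

Because the Sys figure is a simple product, any change in the per-process Triad rate shows up in it proportionally.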

Fig 4 BX900 HPL

Fig 5 BX900 PTRANS

Fig 6 BX900 Random Access

Fig 7 BX900 FFTE

Table 2 Intel Endeavor cluster (HPCC Web 1))

Table 3 Intel BX900 HPCC BMT

322 FX1 FX1 HPCC BMT Table 4 HPL

PTRANS Random Access FFTE Fig 8 Fig 11512

HPL755(=377650100)2

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037

Table 4 FX1 HPCC BMT

Fig 8 FX1 HPL

Fig 9 FX1 PTRANS

Fig 10 FX1 Random Access

Fig 11 FX1 FFTE

33 HPCCYUgraveWH Uklq BX900W FX1WrsndWK(Altix3700Bx2)Wrs

Fig 12GqmhgtTWXpoundDqshy 100WCLRandom Ring LatencyPqRandom Ring Latency13BX900ordf`ccedil^fCmWordfcurrencq

BX900 W FX1 rsGWSTREAM W FFTE necircOatildeC atildeC eacuteIumlrvCDordfmacrdnDqSTREAM

421FFTE 43aringNGq XAltix3700Bx2 Random AccessordfEordfAltix3700Bx2W

plusmngtegraveBCD HPCC BMTeacuteYacute~cdOcirc$Y|5ordf+Xpoundsect8YrsGmW13XqAltix3700Bx2 atildeC FFTEEordfmdoacute OslashszligWuumlwdn

q

Fig 12 HPCC BMT: BX900, FX1 and Altix3700Bx2 (512 processes, flat MPI). Axes: G-HPL, G-RandomAccess, G-PTRANS, EP-STREAM, EP-DGEMM, G-FFTE, RandomRing Bandwidth, RandomRing Latency; radial scale 0 to 100.

4 BX900W FX1TUacuteUcirc 32 c^_Zgt BX900 W FX1 IcircIumlETHNtildeoumlccedilYWiCXordfWnDmWQR^_aZegraveBC(Xcurrenyen

5poundDqeumldfeaZUuml^13~brvbarTUacuteUcirc]Ucirc7^_ZTUacuteUcirc

EacuteDq 41 ^_aZ

BX900 W FX1 gtegraveBCD^_aZWQRUgraveyenXgtlaquosectoacute ^_ZOslash5^_a|Ocirc$divideC^_ZOslash5cedildivideOcirc$divideC

^_Z=)yen5PmWordf13q ^_aZyBGUgrave^_Z_IumlYacuteh)JGDHfe

a|YUuml^13~UKtl_trtGq[ _ZOslash5_IumlYacute

h fpcoll fIumlgt=GqTcedildivideCDeacuteaTHORN|Ocirc$ fprof fIumlgtdivideCaring`Gq^_aZgtPAElig 8q|gtlaquoq

^_a|Ocirc$divideAElig AElig ^_ AElig MPIZaZ|ordm)AElig f`AElig IcircIumlETHNtilde$AElig fZAElig UcircfIumlAElig

mPIcircIumlETHNtilde$AEliggtCPUlt=gtOaringaeligccedil13

hccedil`|WOcirc$FGszligbordfoacute WXq

42 feaZbrvbar|YTUacuteUcirc BX900HPCC BMT^_Z STREAMordfotildeb

gtlaquo EndeavorWrsC(35GBsacute42GBs)W 2ethMEshygtlaquopoundDmWETHNtilde5poundDqETHNtildeBDUcirc HPCCpSTREAMUuml|YacuteTHORN(CUcirc amp FortranUcirc)7ccedilIumlEumlIgraveUcircWMPIEumlIgraveUcircgtlaquoq[13BGbrvbarPmdegn BX900 feaZUuml^13~UKntst brvbarsectCDmWcdEndeavorfeaZ(IntelUgrave)W BX900feaZ(QRUgrave)igraveiacuteWuumlwdnq


421 Uuml|YacuteTHORN STREAM^_Zbrvbarszligagraveaacute BX900Uuml|YacuteTHORN STREAM^_Z Table 5Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntst brvbarordfmacrdnDqDotildeCUuml^13~gtOcirc$|sectumlGaringaeligccedil13hyBCXbrvbarPGDHoacute WX DO^SgtOcirc$sup1yBordfXmWordfdcXoumlEacuteegraveBGuecircgtlaquosectPgtXoumlordfUcirc

GordflaquoqUgraveDfeaZordfoacute WGDO ^SgtsectumlOcirc$ordf^^shypoundZYeumlnoumlEacutegtlaquoq CUcircoumlIntelfeaZXd OpenMPEumlIgraveUgraveUgravegt 42GBscoreordfdegdnDordfQRfeaZgt7ccedilIumlEumlIgraveUcirc OpenMP EumlIgravecd=EumlIgrave^regCbrvbarP]C Intel feaZWbWXpoundDqQRfeaZgtlaquo CD7ccedilIumlEumlIgraveUcircAgraveogravebrvbarpoundOcirc$copytimes]AgravebTUacuteUcircAgraveograveordflaquoWnq [ FX1Uuml|YacuteTHORN STREAM^_Z Table 6Gq

FortranUcircouml7ccedilIumlEumlIgraveUcircW MPIEumlIgraveUcircdegnfeaZUuml^13~UKntstbrvbarordfmacrdnDqXTriad (132GBs)CPUQRordfWebgtUumlCshy(135GBs)CPUWotildeEacutegtlaquoq CUcircoumlordf FortranUcircUKntstUuml^13~XCWotildeiCqFX1 CfeaZ BX900W+XsectUKntstUuml^13~ordfcentsup3CXqDHFX1 CfeaZbrvbarTUacuteUcircordf$currengtlaquopoundD9ordfAumlq

Table 5 BX900 STREAM

Table 6 FX1 STREAM

422 HPCC BMT^_Z STREAMbrvbarszligagraveaacute 321gtdegdnD BX900 HPCC BMT^_Z STREAM

Table 7GqmmgtUKntstcopyIgraveOcirc$ecircEacutearingaeligccedil13heumlX storeampegraveBGmWUKrestp=all(a$XsectordfXmW)WCDTUacuteUcirc5PfeaZUuml^13~gtlaquoq UgraveDyacutegtgtlaquoordf _ZOslash53iumlIuml XC

iumlIumlSgt|Yo CPU]o CoreiDHEumlIgrave] Agravejatildedn~nfeaZUuml^13~brvbar^UcircCXq

STREAMbrvbarPX CoreIumliX2gtZX|Yordf5n DO^gtfeaZUuml^13~UKntst GmWgt|Y+13mWordfcurrencpoundDq

Table 7 HPCC BMT STREAM (MPI parallel version, C version)

Processes  Mode     Compile option 1 (*1)                 Compile option 2 (*2)
                    13aelig13  EP-STREAM Triad (GB/s)     13aelig13  EP-STREAM Triad (GB/s)
32         rdma     1          3.4739                     -          -
32         nordma   1          3.5016                     1          4.3316
128        rdma     3          3.4708                     -          -
128        nordma   3          3.4692                     1          4.3572
512        rdma     5          3.4726                     4          4.3595
512        nordma   4          3.5477                     4          4.3424

(*1 Fujitsu compiler option 1: -Kfast,packed,ntst -SSL2; *2 Fujitsu compiler option 2: -Kfast,packed,ntst,restp=all -SSL2)
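For reference, the Triad kernel behind the EP-STREAM Triad figures is a[i] = b[i] + q*c[i]. STREAM credits it with three array accesses per element (two loads and one store), i.e. 24 bytes per iteration for 8-byte words. The sketch below shows the kernel and that bandwidth accounting. The function names are illustrative, and pure Python will not approach hardware memory bandwidth, so this shows the arithmetic only and is not a benchmark.

```python
def triad(b, c, q):
    """STREAM Triad: a[i] = b[i] + q * c[i]."""
    return [bi + q * ci for bi, ci in zip(b, c)]


def triad_bandwidth_gb_per_s(n, seconds, bytes_per_word=8):
    """Bandwidth credited to one Triad pass over arrays of n words.

    Three arrays are touched per element (read b, read c, write a),
    and 1 GB is taken as 10**9 bytes.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return 3 * n * bytes_per_word / seconds / 1e9


a = triad([1.0, 2.0], [3.0, 4.0], 3.0)
print(a)
print(triad_bandwidth_gb_per_s(10**9, 24.0), "GB/s")
```

Whether the store to `a` also incurs a read of the destination cache line depends on the hardware and on compiler options, which is one reason the same kernel can report different rates under different option sets.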


43 hhbrvbar|YTUacuteUcirc 32 gtdcWXpoundDbrvbarPBX900 W FX1 FFTE ^_Z

acircIcircIumlETHNtilde13yacuteordfmacrdnDqmETHNtildeDH5pound

DhhW13gtszlig0Gq 13gt FFTE 32EumlIgraveOslash5CDWecircoacuteG ethBX900

gt 12FX1 gt 16Wgteumlqpoundordm)1|YEcirc2OgtlaquopoundD9ordfAumlqmgtFFTEUuml|YacuteTHORN(FortranUcirc)l3CUuml|YacuteTHORN(13aumlWoacuterGoumlUcircWdivideoslash)WaumlIacutecurren4C)JCDaumlacircoacuteGaumlZh5poundDiacuteegraveBC

hbrvbarpound^UcircCDordm)GmWbrvbarsectBX900 W FX1 oacuteG5g5poundDqUcircpaumlbrvbar5poundDh 1QRfeaZW Endeavor egraveBeumlnDafeaZWrs5PDHW 2auml13IacutecurrenTUacuteUcircbrvbar6)BbordfaumlIacutecurrenthorn7wXmWagraveaacuteGDHgtlaquoq UcircaumlIacutecurren 38plusmnUcircWmndaumlZhCDiacute

Ucircn~n Fig 13W Fig 14GWoIacutecurreniacuteEcircu0Gq Fig 13W Fig 14OgravegtCDIacutecurrenOO=5pound^gtlaquoqUuml|YacuteTHORN

gtcopyIgrave APXYZWWWWOcirc$Yordf 2ugt5nsectZgtXcpoundDDHiacutegt)9BcopyIgrave CY_W^curreneth5poundcopyIgrave APXYZWWWWYordfZXbrvbarPCDq)9BcopyIgrave CY_YZCXcpoundD=Ocirc$sectumlG^gtYordfZXpoundDoumlUcircordfecircDHgtlaquoq

Fig 13W Fig 14OacutegtCDIacutecurren3[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 3ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 3gtXC 6Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquosectaringaeligccedil13h-

a5brvbarpoundltG_ccedilY-a5^_ZUgrave)acircregmacr= 16 gteumlnq

Fig 13W Fig 14OcircgtCDIacutecurren2[microcopyIgravefg(Ftimes)O=5pound^gtlaquoqUuml|YacuteTHORNgtOcirc$ucircsup2ordf5ncopyIgrave AoacuteGYordf 2ugt5nDordfiacutegt 1ugtYGbrvbarPCDq^ordf 2gtXC 4Xpound=o[microoacuteCOcirc$_ccedilYUcircordf5nDHgtlaquoqFX1 gtBX900rufntildeDsectaringaeligccedilaringh-a5ordfgteumlmWcdmO=)t_ccedilY-a5 16cd 4^regGmWgtzyacuteordfmacrdnDq

Table 8BX900W FX1QRfeaZegraveBCDUcircWaumlbrvbarEuml BX900 afeaZegraveBCDaumlbrvbarGq


Fig 13 FFTEUuml|YacuteTHORNaumlIacutecurrenUcirc^_Z


Fig 14 FFTEiacuteaumlIacutecurrenUcirc^_Z

Table 8 FFTE Ver 4.1 Full and kernel (32 processes, 180 MB/Proc)

UgraveBX900gtUcircWQRfeaZaumlWQRfeaZaumlWafeaZdegnIgraveEacuteIacuteotildeigt

laquosectauml4ordfXmWWfeaZAacuteordfXcpoundDmWordfcurren

cqUgraveDFig 14OcircgtCDIacutecurren_ccedilY-a5^regGW(_ccedilY-a5acute164)BX900gtordfpoundUcircCsectmoumlaringaeligccedil13hegravepoundXcpoundD9ordfAumlq [FX1gtUcircWQRfeaZaumlWQR

feaZIgraveEacuteIacuteshyegraveeacuteordfaEacuteDordfUuml|YacuteTHORNcdz

yacuteCiacutecurrengtrsGWotildeigtlaquomWcd BX900 WrsGmWXqBX900W FX1gtUuml|YacuteTHORNWiacuteOslash5GWFig 14OgraveIacutecurrenO=OordfEcircUgraven^currenethWPrsUVXiacute5pound

DDHBX900 gt FX1 gtzyacuteCqCcurren BX900 16 oacuteC FX1gt 40WXzyacutegtlaquopoundDqUgraveDFig 14OacuteWOcircIacutecurrendegncopyIgravefgO=EacutegtlaquosectBX900gt 11currenW 08currengtzyacuteXcpoundDordfFX1gt 36currenW 18curren(or 16curren)gtMddcXzyacuteordfaacuteHdnDq 13yacutecdFX1feaZTUacuteUcirc BX900Wdc+XpoundsectFX1

|Y]OszligyBGDHhhordfaeliggt

laquomWordfcurrencpoundDq 44 NPB23 FT^_ZBDTUacuteUcircszligagraveaacute

42 STREAM^_ZgtethCDW ordfcent^_Zgtecircc5g5poundDq egraveBCD^_Z43 FFTEWEacute Stockham|5gt FFT5P

NPB (NAS Parallel Benchmarks) 23FT^_ZgtlaquoqFFTEoumlWiacuteNPB23 FT Uuml|YacuteTHORN(Fortran Ucircgt MPI EumlIgraveUcirceumln)l3CUuml|YacuteTHORNcdaumlIacutecurren4C)JCDaumloacuteCfeaZUuml^13~brvbarpound^

UcircCDordm)5poundDqUuml|YacuteTHORNgtXCaumlBD=YZ C7UV((XYZ)=(512512512))gtaumlIacutecurrenordf On aringaeligccedil13hWXsectfeaZUuml^13~szligordf13XDHgtlaquosectaumlIacutecurrenUVEacuteCDcpoundDcd

gtlaquoq ^_ZaumlIacutecurren 28plusmngtlaquosectn~n Fig 15W Fig 16GqMd

8plusmncopyIgraveoacuteGYordfZsectumlO=gt|YordfZgtlaquo

mWcdhtgtUKntst oacute WXXgtmUuml^13~notbrvbarGq UgraveDOcirc$_ccedilY-a5 16W 64brvbar5poundDq


Fig 15 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- cfftz)


Fig 16 NPB23 FT^_ZaumlIacutecurren (Uuml|YacuteTHORN- fftz2)


UgraveWH Fig 17(Ocirc$ Table 9 Table 10GqQRfeaZbrvbarshyatildeCBX900 fftz2gtfeaZUuml^13~UKntstbrvbarzyacuteordfagraveaacute13cfftz gtnotXordfmacrdnXqUgraveDFX1degnnotXordfmacrdnXcpoundDqmW ETHNtildeCD

zyacuteordflaquopoundD BX900 fftz213gtfeaaringaeligccedil13heumlXsectumlampordf5eumlnXmWordfcurrencpoundDq13]h

eumlnX(cXCcedilEgraveWuumlwdnordf(currencdXq Ocirc$_ccedilY-a5degnOcirc` 16ordfTUacuteshygtlaquopoundDq UgraveDQRfeaZbrvbar BX900W FX1shyordfFFTEoumlbrvbarsectgteuml

(2curren13)ucircsup2WsectumlMd|YZgtlaquopoundDDHWuumlwdnq TcedilQRfeaZWafeaZUKntst ordf)BCD

IacutecurrenotildeQRfeaZAgraveordf 1 ethAacuteWXpoundDordfSTREAM gtfrac12Eacutesect

iumlntildegtlaquoq 45 TUacuteUcircUgraveWH

BX900WFX1STREAMWFFTEUuml|YacuteTHORNfIumlWNPB23 FTB13feaZbrvbarTUacuteUcirc5gCDFX1gtegraveBGfeaZUuml^13~brvbarpoundordfecircC^ordflaquo 5ordfaeligugtlaquopoundDq

FX1gtfeaZUuml^13~otildegt13ecircGmWordf13XoumlordflaquosectWecircUcirc7gth5Paeliguordflaquoq

Fig 17 feaZbrvbarTUacuteUcirc

Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT

5 IORYbrvbar Y`fIuml IORBoa13alFGFAring

CaYoBETHNtildeCDq 51 IORYEcircu

IOR Benchmark3)vR ASCI Purple=_7|eacuteplusmngteumlnD C iexclbrvbarYfIumlgtlaquoLplusmneacuteYacute~ 2102egraveBPqPOSIXLwrite() read()PMPIIOLMPI_File_write_at() MPI_File_read_at()PbatildeB13EumlIgravel5PmWordfgtecircq 511 YacircueZ$

MPI oZYcplusmnLrepetitions plusmnP`5XFGFAringTshyiquestHqOcirc`centgtEumlIgraveOslash5eumln^_cdiaoacuteCpound

Eacuteecircordf5nndFGFAringshyordfeumlnqIOROslash59XacircueZ$ Fig 18Gq

In one repetition, each process issues (blockSize ÷ transferSize × segmentCount) read/write calls, and the file size is (blockSize × segmentCount × number of MPI processes). For example, with transferSize=1kB, blockSize=1MiB (10^6 bytes is written MB and 2^20 bytes is written MiB), segmentCount=1 and 32 MPI processes, blockSize × segmentCount × number of processes = 32MiB, so a 32MiB testFile is created. Each process then accesses its own 1MiB region in 1KB units, 1024 times, either sequentially or at random offsets.
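The call count and file size described above can be checked with a short calculation. The sketch below assumes sizes are given in bytes and that the 1kB transfer size means 1024 bytes, which is what the stated 1024 calls per 1MiB block imply. The function and variable names are illustrative and not part of IOR.

```python
KIB = 1024
MIB = 1024 ** 2


def ior_geometry(transfer_size, block_size, segment_count, num_tasks):
    """Return (calls per process, total file size in bytes) for one repetition."""
    if block_size % transfer_size != 0:
        raise ValueError("blockSize must be a multiple of transferSize")
    calls_per_process = block_size // transfer_size * segment_count
    file_size = block_size * segment_count * num_tasks
    return calls_per_process, file_size


calls, size = ior_geometry(transfer_size=1 * KIB, block_size=1 * MIB,
                           segment_count=1, num_tasks=32)
print(calls, "calls per process;", size // MIB, "MiB file")
```

With the example parameters this gives 1024 calls per process and a 32 MiB file, matching the figures in the text.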

[transferSize] 1plusmn readwritegtFGGeacutea`qsizeof(long long int)currencblockSizegtlaquoaeliguordflaquoq

[blockSize] o^_ordfbrvbarntildeGeacutea`qgt 1MiBq [segmentCount] o^_brvbarsectYeumlniOcirc$qyzshy 1q [randomOffset] 1XdZordfYqOcirc`shy 0q [fsync] ecircEacuteAring fsync13f5Gqyzshy 0q [filePerProc] ^_sectWcent`aYGqPOSIXEacutenotszligq

yzshy 0q [useFileView] MPI_File_set_view() MPI_File_write() MPI_File_read()egravePq

ZordfYWparaCXqyzshy 0q [collective] MPI_File_write_at_all() MPI_File_read_at_all()egravePqZordfY

WparaCXqyzshy 0q

Fig 18 IOR YacircueZ$


512 MPIIOatildeGUuml^13~eZ$

IOR gt Info aringbrvbarpound MPIIO =)`7wmWordfgtecircqBX900 notszligX InfoUumlYacuteNtildeY` keyW value Table 11GqplusmnCXLyzshyPq

Table 11 MPIIO Info UumlYacuteNtildeY` key value

cb_buffer_size 4194304 $amp()

+

cb_nodes -012345

6789lt

=gt-012A

B

ind_rd_buffer_size 4194304 ACDEFGF$amp()+

ind_wr_buffer_size 524288 ACDH+GF$amp()+


52 latildeLPOSIX MPIIOPn~nEumlIgraveAringtransferSize (1plusmn13

fgtFGeacutea`)YAgraveograveLsequential randomPb^wFig 19GumlcopyFGFAringCDL^_P Table 12Table 13Fig 20Fig 21GqUgraveDntildea13iumlIuml_aumlaringaeligccedil13hIO -eacute-eacutearingaeligccedil13hordfcentsup3CDHn~naringaeligccedil13hoacuteGIO XcOgraveOacuteOcircOtildeYoacuteG IO XcordfcentGDH_aumlaringaeligccedil13hszligEacuteDq

Fig 19 Read/Write (diagram labels: ETERNUS DX80, 970TB; IO, SPARC Enterprise M9000; SRFS; READ, WRITE)

Table 12 (POSIX)

Table 13 (MPIIO)

521 POSIX Fig 20GbrvbarPZYgtFGeacutea`brvbardotildeiWXoacute

CZordfYgtFGeacutea`laquoGnordfzyacuteCqXFGeacutea

`ordf 1MB oumlZYWZordfYWgtiacuteOcirc$FGe$WXDHordfiGqUgraveDZYecircCo^_ordfn

~nparaCDaYGouml`$iaYbrvbarsect

EGoacuteCpoundEacuteEacutegtiaYGoumlAgraveordfAumlgtlaquopoundDq

mnd$Ocirc$YUumleacutenotccedilIuml]OcircOtildeYYL13YEuml

plusmnFEumlPthornordfuumlwdnqecircCCgtordfFGeacutea`

iexclcentIacuteiWXmW]poundEacuteEacuteecircC+Xnota

13gtOslashordfmacrdnX=WC-eacutearingaeligccedil13hLIOiumlIumlshyaringaeligccedil13hPthornWuumlwdnq 522 MPIIO POSIXZordfYWiacute=zCFGeacutea`

otilderyenCordfzyacuteCLFig 21PqUgraveDEumlIgraveiexclcentCXqmndreg=cd MPIIO ]^_ordfZordfY5POslashOumlWXpoundgtXcWmacrdegeumlnq 523 _aumlaringaeligccedil13h 13 IOoiumlIuml_aumlaringaeligccedil13hWa-eacutearingaeligccedil

13hordflaquoqa-eacutearingaeligccedil13humlcopyordfplusmnsup2GmWsup3acutegtlaquoordf13

BeumlnmemcleanfIumlegravePmWbrvbarsect_aumlaringaeligccedil13hY|1micro5PmWordfgtecircq 13G3brvbarsectpoundEacuteEacuteCDPOSIX_aumlaringaeligccedil13h

szligordfaacuteHdnDLTable 14Pq

(1) ecircEacute` (2) memclean (3) TplusmnpoundEacuteEacute` (4) 2plusmnpoundEacuteEacute`

Fig 20 POSIX: Performance [MiB/s] (1 to 16384) against TransferSize [B] (1024 to 1048576) for sequential and random access, Write and Read, SingleFile and FilePerProc, with N=32, 256 and 1024; panels for home and work.

Fig 21 MPIIO: Performance [MiB/s] (0.25 to 4096) against TransferSize [B] (1024 to 1048576) for sequential and random access, with use view and collective variants, N=32 and 256; panels for home and work.

Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`

[Fig 22 File system configuration: IO server 1 (io1) is attached to ETERNUS DX80 units 01 to 18 (dx01 to dx18), which hold the SRFS file systems data_01 to data_05, home and work. IO server 2 (io2) is attached to ETERNUS DX80 units 19 to 36 (dx19 to dx36), which hold data_06 to data_10, data, sysdata and work.]

Table 15 File systems

IO server 1 (io1)
SRFS file system   QFS file system   Unit size     Total capacity             DAU
data_01            QFS_data_01       2109888 MB    2109888 x 36 = 72.4 TB     1024k
data_02            QFS_data_02       2109888 MB    2109888 x 36 = 72.4 TB     1024k
data_03            QFS_data_03       2109888 MB    2109888 x 36 = 72.4 TB     1024k
data_04            QFS_data_04       2109888 MB    2109888 x 36 = 72.4 TB     1024k
data_05            QFS_data_05       3695488 MB    3695488 x 36 = 126.9 TB    1024k
home               QFS_home          3695488 MB    3695488 x 36 = 126.9 TB    512k
work                                 524288 MB     (same as on io2)

IO server 2 (io2)
SRFS file system   QFS file system   Unit size     Total capacity             DAU
data_06            QFS_data_06       2109888 MB    2109888 x 36 = 72.4 TB     1024k
data_07            QFS_data_07       2109888 MB    2109888 x 36 = 72.4 TB     1024k
data_08            QFS_data_08       2109888 MB    2109888 x 36 = 72.4 TB     1024k
data_09            QFS_data_09       2109888 MB    2109888 x 36 = 72.4 TB     1024k
data_10            QFS_data_10       3695488 MB    3695488 x 36 = 126.9 TB    1024k
data               QFS_data          3695488 MB    3695488 x 24 = 84.6 TB     1024k
sysdata            QFS_sysdata       3695488 MB    3695488 x 12 = 42.3 TB     16k
work               QFS_work          524288 MB     524288 x 144 = 72 TB       1024k   (spans dx01 to dx36)
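The total capacities in Table 15 can be cross-checked from the unit size and the unit count. The sketch below assumes that "TB" in the table means MB divided by 1024 squared; the helper name `total_tb` is introduced here for illustration only.

```python
# Recompute the "Total capacity" column of Table 15 from unit size x count.
# Assumption: 1 TB = 1024 * 1024 MB (binary units), which reproduces the
# values printed in the table.

def total_tb(unit_mb: int, units: int) -> float:
    """Total capacity in TB for `units` volumes of `unit_mb` MB each."""
    return unit_mb * units / 1024**2

# (unit size in MB, number of units) taken from Table 15
ROWS = {
    "data_01": (2109888, 36),
    "data_05": (3695488, 36),
    "data": (3695488, 24),
    "sysdata": (3695488, 12),
    "work": (524288, 144),
}

if __name__ == "__main__":
    for name, (unit_mb, units) in ROWS.items():
        print(f"{name:8s} {unit_mb} MB x {units:3d} = {total_tb(unit_mb, units):6.1f} TB")
```

Under that unit assumption the computed values agree with the table to one decimal place, which is also how the decimal points lost in extraction can be placed.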


6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

The BX900 network is a Fat Tree built on InfiniBand QDR. A chassis holds 18 nodes and two switches called Leaf switches, and each Leaf switch is connected to every node in the chassis by one QDR link. Because there are two Leaf switches, each node has 2 QDR links (18 x 2 = 36 QDR links per chassis), and the peak bandwidth of a node is 8 GB/s (4 GB/s x 2). For communication between chassis there are 9 switches called Spine switches above the Leaf switches, and the Leaf switches are connected to the Spine switches by QDR links, so each chassis uses 9 x 2 = 18 QDR links toward the Spine switches. Counted in QDR links, the Leaf switches of a chassis have 36 downlinks and 18 uplinks; the uplink ratio is 50%, which is called 50% blocking. Fig 23 shows the BX900 network configuration. On BX900 the 50% blocking affects communication between chassis. Fig 24 shows how the Spine switches, the Leaf switches and the nodes of a chassis are connected. Node n of a chassis uses Spine switch n (n <= 9) or Spine switch n - 9 (n > 9). When a node of one chassis exchanges data with a node of another chassis, 2 QDR links (1 x 2 switches) are used between the Spine switch and the two Leaf switches, and 2 QDR links connect the two Leaf switches to the node. However, the path between the Spine switch and the Leaf switches is shared with one other node of the same chassis. For example, the bandwidth differs between the case where only node 1 communicates and the case where nodes 1 and 10 communicate at the same time. Communication within a chassis is non-blocking.
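The link counts quoted above (18 nodes, 2 Leaf switches, 9 Spine switches, 4 GB/s per QDR link) are enough to reproduce the 8 GB/s node bandwidth and the 50% uplink ratio. The following is a minimal sketch of that arithmetic; the constant and function names are illustrative, and the node-to-Spine mapping follows the pairing shown in Fig 24 (nodes 1 and 10 on SW 1, nodes 2 and 11 on SW 2, and so on).

```python
# Fat-tree arithmetic for one BX900 chassis, using the figures in the text.
NODES_PER_CHASSIS = 18
LEAF_SWITCHES = 2      # Leaf switches per chassis
SPINE_SWITCHES = 9
QDR_GB_PER_S = 4.0     # bandwidth of one QDR link as quoted in the text

def node_bandwidth_gb_s() -> float:
    """Each node has one QDR link to each Leaf switch."""
    return LEAF_SWITCHES * QDR_GB_PER_S

def downlinks() -> int:
    """QDR links from the Leaf switches down to the nodes of a chassis."""
    return NODES_PER_CHASSIS * LEAF_SWITCHES

def uplinks() -> int:
    """QDR links from the Leaf switches of a chassis up to the Spine switches."""
    return SPINE_SWITCHES * LEAF_SWITCHES

def uplink_ratio() -> float:
    return uplinks() / downlinks()

def spine_for_node(n: int) -> int:
    """Spine switch used by node n (1..18): n for n <= 9, otherwise n - 9."""
    if not 1 <= n <= NODES_PER_CHASSIS:
        raise ValueError("node number must be between 1 and 18")
    return n if n <= SPINE_SWITCHES else n - SPINE_SWITCHES

if __name__ == "__main__":
    print(node_bandwidth_gb_s(), downlinks(), uplinks(), uplink_ratio())
    print([spine_for_node(n) for n in range(1, 19)])
```

Two nodes that map to the same Spine switch, such as nodes 1 and 10, share the uplink path, which is where the 50% blocking becomes visible.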

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

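[Fig 23 BX900 network configuration: chassis 1 to chassis 119, each with two Leaf switches (Leaf(1-1) and Leaf(1-2) through Leaf(119-1) and Leaf(119-2)) that connect its nodes (bx0001, bx0002, ..., bx0018 in chassis 1; bx2125, bx2126 in chassis 119) to the Spine switches Spine(1) to Spine(9).]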

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq

[Fig 24 Connections between nodes and switches: the 18 nodes of a chassis, the Leaf switches and the Spine switches SW 1 to SW 9. Nodes 1 to 9 and nodes 10 to 18 are drawn in pairs (1 and 10, 2 and 11, 3 and 12, ...) against SW 1 to SW 9.]

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the communication protocol
Mode                 RDMA           RDMA            NORDMA
Number of processes  1 to 1024      1025 or more    -
Threshold            32768 bytes    16384 bytes     524288 bytes

Table 17 Switching between Normal Send and Send on Request in RDMA mode
Message size             0 to less than 16KB   16KB to less than 32KB   32KB or more
1 to 1024 processes      Normal Send           Normal Send              Send on Request
1025 to 2048 processes   Normal Send           Send on Request          Send on Request
2049 processes or more   NORDMA is used        Send on Request          Send on Request

Table 18 Switching between Normal Send and Send on Request in NORDMA mode
Message size             0 to less than 512KB   512KB or more
All process counts       Normal Send            Send on Request
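Tables 16 to 18 amount to a lookup on the mode, the number of processes and the message size. The sketch below expresses that lookup as a function; `select_protocol` is a name invented for this example and is not part of any MPI library, and 1 KB is taken as 1024 bytes so that the thresholds match the byte values in Table 16.

```python
# Table lookup for the protocol switching rules in Tables 16 to 18.
KB = 1024

def select_protocol(mode: str, processes: int, size_bytes: int) -> str:
    """Return the table entry for the given mode, process count and message size.

    mode is "RDMA" or "NORDMA". The result is "Normal Send",
    "Send on Request", or "NORDMA" (the Table 17 cell for small messages
    at 2049 processes or more).
    """
    if processes < 1 or size_bytes < 0:
        raise ValueError("processes must be >= 1 and size_bytes >= 0")
    if mode == "NORDMA":
        # Table 18: one threshold of 512 KB for every process count.
        return "Normal Send" if size_bytes < 512 * KB else "Send on Request"
    if mode != "RDMA":
        raise ValueError("mode must be 'RDMA' or 'NORDMA'")
    # Table 17, RDMA mode.
    if size_bytes >= 32 * KB:
        return "Send on Request"
    if size_bytes >= 16 * KB:
        return "Normal Send" if processes <= 1024 else "Send on Request"
    # Messages below 16 KB.
    return "Normal Send" if processes <= 2048 else "NORDMA"

if __name__ == "__main__":
    for procs in (512, 2048, 4096):
        for size in (8 * KB, 24 * KB, 64 * KB):
            print(procs, size, select_protocol("RDMA", procs, size))
```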

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Transfer size increments
Parallelism   Allreduce, Reduce, Bcast, SendRecv   Other MPI functions
256           32 bytes                             128 bytes
512           32 bytes                             128 bytes
1024          32 bytes                             256 bytes
2048          32 bytes                             2048 bytes
4096          32 bytes                             2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()              ! measurement start
        call MPI_ALLGATHER(a,nelem,MPI_DOUBLE_PRECISION,
     &                     b,nelem,MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD,ierr)
        ttime2 = MPI_WTIME()              ! measurement end
        ttime0 = ttime2 - ttime1
c       take the slowest rank as the time of this iteration
        call MPI_REDUCE(ttime0,ttime,1,MPI_DOUBLE_PRECISION,MPI_MAX,0,
     &                  MPI_COMM_WORLD,ierr)
        if(myid.eq.0) then
          max_time = max(max_time,ttime)
          min_time = min(min_time,ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig 25 ^_ZLIacutecurrenP
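The bookkeeping in Fig 25 reduces each iteration to the time of its slowest rank (MPI_REDUCE with MPI_MAX), then tracks the maximum, minimum and sum over iterations on rank 0. The sketch below reproduces that bookkeeping on plain lists, without MPI. The function name `summarize` and the `skip` parameter, which drops leading warm-up iterations, are assumptions added for illustration.

```python
# Summarize per-iteration timings the way the Fig 25 loop does on rank 0.
from statistics import mean

def summarize(per_iteration_rank_times, skip=0):
    """per_iteration_rank_times[i][r] is the elapsed time of iteration i on rank r.

    Returns the maximum, minimum and average of the per-iteration times,
    where the time of one iteration is that of its slowest rank.
    `skip` (hypothetical) ignores that many leading iterations.
    """
    # Equivalent of MPI_REDUCE(..., MPI_MAX, ...) for each iteration.
    slowest = [max(times) for times in per_iteration_rank_times[skip:]]
    if not slowest:
        raise ValueError("no iterations left to summarize")
    return {"max": max(slowest), "min": min(slowest), "avg": mean(slowest)}

if __name__ == "__main__":
    timings = [[1.0, 2.0], [0.5, 0.7], [0.9, 0.3]]  # made-up sample values
    print(summarize(timings))
    print(summarize(timings, skip=1))
```

The average here corresponds to dividing the accumulated `all_time` by the number of iterations after the loop ends.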


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Fig 26 MPI functions (512 parallel, part 1): panels for Allreduce, Reduce, Reduce_scatter, Allgather, Gather and Allgatherv; series rdma and Nordma; horizontal axis 0 to 80000.]

[Fig 27 MPI functions (512 parallel, part 2): panels for Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (Rank0 to Rank511); series rdma and Nordma; horizontal axis 0 to 80000.]


brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq

[Fig 28 RDMA and NORDMA (small transfers, 0KB to 8KB): results for 256, 512, 1024, 2048 and 4096 parallel on an axis from 0% to 100%.]

63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq


      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime5 = MPI_WTIME()                ! measurement start (total)
      call mpi_isend(a,NMB,mpi_DOUBLE_PRECISION,ir1,1,
     &               MPI_COMM_WORLD,ireq(1),ierr)
      call mpi_irecv(b,NMB,mpi_DOUBLE_PRECISION,il1,1,
     &               MPI_COMM_WORLD,ireq(2),ierr)
      call mpi_isend(d,NMB,mpi_DOUBLE_PRECISION,il2,1,
     &               MPI_COMM_WORLD,ireq(3),ierr)
      call mpi_irecv(c,NMB,mpi_DOUBLE_PRECISION,ir2,1,
     &               MPI_COMM_WORLD,ireq(4),ierr)
c     ---------------------------------------------------------
      ttime7 = MPI_WTIME()                ! measurement start (computation)
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1.d0
          b1(i,j) = 1.d0
          c1(i,j) = 0.d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                ! measurement end (computation)
c     ---------------------------------------------------------
      call mpi_waitall(4,ireq,iastatus,ierr)
      ttime6 = MPI_WTIME()                ! measurement end (total)
      ttimeDD = ttime8 - ttime7           ! computation time
      ttimeal = ttime6 - ttime5           ! total elapsed time
      ttimesd = ttimeal - ttimeDD         ! elapsed time minus computation

Fig 29 WO=5PDH^_Z


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD,ierr)
        ttime1 = MPI_WTIME()
c       rank 0 sends to rank IIRANK, NNN times
        do 120 II = 1, NNN
          if(myid.eq.0)
     &      call MPI_SEND(a,nsize,MPI_DOUBLE_PRECISION,
     &                    IIRANK,II,MPI_COMM_WORLD,ierr)
          if(myid.eq.IIRANK)
     &      call MPI_RECV(a,nsize,MPI_DOUBLE_PRECISION,
     &                    0,II,MPI_COMM_WORLD,istatus,ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig 30 ToacuteT ^_ZLIacutecurrenP

NODE
  CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
  CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig 31 Assignment of MPI ranks to cores
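The rank placement drawn in Fig 31 follows a simple pattern: ranks 0 to 3 fill the even core IDs of CPU0 and ranks 4 to 7 fill the odd core IDs of CPU1. A minimal sketch of that mapping follows; the function names are illustrative, and the formula is inferred from the figure rather than taken from any MPI documentation.

```python
# Rank-to-core mapping as drawn in Fig 31 (2 CPUs x 4 cores per node).
CORES_PER_CPU = 4
CPUS_PER_NODE = 2

def core_id_for_rank(rank: int) -> int:
    """Core ID of a node-local rank 0..7: CPU0 holds even IDs, CPU1 odd IDs."""
    if not 0 <= rank < CORES_PER_CPU * CPUS_PER_NODE:
        raise ValueError("local rank must be between 0 and 7")
    cpu = rank // CORES_PER_CPU       # 0 for ranks 0-3, 1 for ranks 4-7
    slot = rank % CORES_PER_CPU       # position within the CPU
    return CPUS_PER_NODE * slot + cpu

def cpu_for_rank(rank: int) -> int:
    """CPU (0 or 1) that hosts the given node-local rank."""
    return core_id_for_rank(rank) % CPUS_PER_NODE

if __name__ == "__main__":
    for r in range(8):
        print(f"Rank {r}: CPU{cpu_for_rank(r)}, Core ID {core_id_for_rank(r)}")
```

With this placement, rank 3 is the last partner of rank 0 on the same CPU and rank 7 is the last partner on the other CPU of the node.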

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

[Fig 32 Image of the 1-to-1 transfer (send and receive) among processes 0 to 7.]

Table 20 (within a chassis)
Job type                 flat MPI       hybrid (72x2)   hybrid (36x4)   hybrid (18x8)
Per-process bandwidth    723.8 MB/s     1447.4 MB/s     2894.2 MB/s     5346.7 MB/s
Per-node value           5790.6 MB/s    5789.9 MB/s     5788.4 MB/s     5346.7 MB/s
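The per-node row of Table 20 is the per-process bandwidth multiplied by the number of communicating processes per node (8, 4, 2 and 1 for the four job types). The sketch below checks that relation. It reads the Table 20 figures as 723.8, 1447.4, 2894.2 and 5346.7 MB/s, which is an assumption about where the decimal point falls, chosen because a node with two QDR links cannot exceed 8 GB/s.

```python
# Relate per-process bandwidth to the per-node value in Table 20.
# Values in MB/s; decimal placement is an assumption (see the text above).
TABLE_20 = {
    # job type: (per-process MB/s, processes per node, per-node MB/s in table)
    "flat MPI": (723.8, 8, 5790.6),
    "hybrid (72x2)": (1447.4, 4, 5789.9),
    "hybrid (36x4)": (2894.2, 2, 5788.4),
    "hybrid (18x8)": (5346.7, 1, 5346.7),
}

def node_total(per_process_mb_s: float, processes_per_node: int) -> float:
    """Aggregate bandwidth of one node in MB/s."""
    return per_process_mb_s * processes_per_node

if __name__ == "__main__":
    for job, (bw, ppn, listed) in TABLE_20.items():
        computed = node_total(bw, ppn)
        print(f"{job:14s} {bw:7.1f} x {ppn} = {computed:7.1f} (table: {listed:7.1f})")
```

The computed products differ from the listed per-node values by less than 1 MB/s, consistent with rounding of the per-process figures.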

Table 21 (between chassis, flat MPI)
Per-process bandwidth in MB/s. "IB" indicates whether the node contends for IB with another node.

        144 MPI          192 MPI          224 MPI
Node    MB/s     IB      MB/s     IB      MB/s     IB
1       724.6    no      467.0    yes     467.1    yes
2       724.4    no      467.0    yes     467.1    yes
3       724.6    no      467.0    yes     467.1    yes
4       724.2    no      724.3    no      467.0    yes
5       724.3    no      724.4    no      467.1    yes
6       724.2    no      724.0    no      723.7    no
7       724.1    no      723.9    no      724.6    no
8       724.4    no      724.0    no      724.5    no
9       724.9    no      724.3    no      724.0    no
10      -        -       467.0    yes     467.0    yes
11      -        -       467.0    yes     467.2    yes
12      -        -       466.9    yes     467.1    yes
13      -        -       -        -       467.1    yes
14      -        -       -        -       467.0    yes

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq

Table 22 (between chassis, hybrid)
Per-process bandwidth in MB/s. "IB" indicates whether the node contends for IB with another node.

        hybrid (112x2)   hybrid (56x4)    hybrid (28x8)
Node    MB/s     IB      MB/s     IB      MB/s     IB
1       933.9    yes     1856.7   yes     3640.8   yes
2       933.7    yes     1856.4   yes     3691.9   yes
3       934.6    yes     1856.6   yes     3653.6   yes
4       934.7    yes     1857.0   yes     3645.7   yes
5       934.2    yes     1865.1   yes     3664.6   yes
6       1438.5   no      2860.6   no      5380.8   no
7       1448.8   no      2911.8   no      5377.9   no
8       1450.0   no      2867.6   no      5378.9   no
9       1449.5   no      2910.0   no      5378.4   no
10      933.7    yes     1879.7   yes     3640.8   yes
11      934.0    yes     1877.5   yes     3671.5   yes
12      934.3    yes     1865.1   yes     3655.9   yes
13      934.8    yes     1881.1   yes     3645.7   yes
14      933.8    yes     1869.4   yes     3699.1   yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq


Table 23 MPI_Alltoall, flat MPI (512 MB) and hybrid (1024 MB) runs
[Table body not recoverable. Rows: 144mpi to 3072mpi, 68mpi×2omp to 432mpi×2omp, 30mpi×4omp to 612mpi×4omp, 17mpi×8omp to 357mpi×8omp.]


brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq


7 Summary

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

Acknowledgments

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

References

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net/


A FX1 HPCC BMT

G-FFTE feaZUuml^13~ -DHPCC_FFT_235 GmWgt FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave EP-DGEMM -a5^regbrvbarpoundaringaeligccedil13h-a5 (15MB Core) UacuteCordfzyacuteCq

Table A-1 feaZUuml^13~]-a5^regCD (FX1 HPCC BMT)
[Table body not recoverable.]


B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

Fig B-1 MPI functions, 1024 ranks (1)
[Panels: Allreduce, Reduce, Reduce_scatter, Allgather, Allgatherv, Gather. Each panel plots the rdma and nordma settings. Plot data not recoverable.]


Fig B-2 MPI functions, 1024 ranks (2)
[Panels: Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan, SendRecv (Rank0 -> Rank1023). Each panel plots the rdma and nordma settings. Plot data not recoverable.]


C RDMA vs. NORDMA

Table C-1 (8KB to 78KB)
[Table body not recoverable.]

Table C-2 (80KB to 584KB)
[Table body not recoverable.]

Zordf 100M RDMA AgravejordfnotyWXq
0CXM NORDMA AgravejordfnotyWXq


D W Oslash5

Table D-1 Flat MPI, 8mpi
[Table body not recoverable.]


Fig D-1 Flat MPI, 8mpi
[Panels: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2. Plot data not recoverable.]


Table D-2 Hybrid (1), 4mpi×8omp
[Table body not recoverable.]

Fig D-2 Hybrid (1), 4mpi×8omp (1)
[Panels: rdma (4mpi×8omp)-1, nordma (4mpi×8omp)-1. Plot data not recoverable.]


Fig D-3 Hybrid (1), 4mpi×8omp (2)
[Panels: rdma (4mpi×8omp)-2, nordma (4mpi×8omp)-2, rdma (4mpi×8omp)-3, nordma (4mpi×8omp)-3. Plot data not recoverable.]


Table D-3 Hybrid (2), 4mpi×8omp
[Table body not recoverable.]

Fig D-4 Hybrid (2), 4mpi×8omp (1)
[Panels: rdma (4mpi×8omp)-1, nordma (4mpi×8omp)-1. Plot data not recoverable.]


Fig D-5 Hybrid (2), 4mpi×8omp (2)
[Panels: rdma (4mpi×8omp)-2, nordma (4mpi×8omp)-2, rdma (4mpi×8omp)-3, nordma (4mpi×8omp)-3. Plot data not recoverable.]


E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

    nmb = 128  intcpu = 1
    Number of CPUs = 24
    ------------------------------------------------
    CPU(rank0) -> every 1 CPUs 128MB data communication
    ------------------------------------------------
    The values are correct at all processes
    all time
    RANK = 1   TIME = 39305 sec   Ave Speed = 32566145 MB/s
    RANK = 2   TIME = 39197 sec   Ave Speed = 32655168 MB/s
    RANK = 3   TIME = 39212 sec   Ave Speed = 32643219 MB/s
    RANK = 4   TIME = 34341 sec   Ave Speed = 37272765 MB/s
    RANK = 5   TIME = 34175 sec   Ave Speed = 37453885 MB/s
    RANK = 6   TIME = 34086 sec   Ave Speed = 37552470 MB/s
    RANK = 7   TIME = 34329 sec   Ave Speed = 37286226 MB/s
    RANK = 8   TIME = 24021 sec   Ave Speed = 53287420 MB/s
    RANK = 9   TIME = 23794 sec   Ave Speed = 53794939 MB/s
    RANK = 10  TIME = 23805 sec   Ave Speed = 53770236 MB/s
    RANK = 11  TIME = 23798 sec   Ave Speed = 53786914 MB/s
    RANK = 12  TIME = 23942 sec   Ave Speed = 53462490 MB/s
    RANK = 13  TIME = 23931 sec   Ave Speed = 53488046 MB/s
    RANK = 14  TIME = 23950 sec   Ave Speed = 53443826 MB/s
    RANK = 15  TIME = 24066 sec   Ave Speed = 53186052 MB/s
    RANK = 16  TIME = 23798 sec   Ave Speed = 53786327 MB/s
    RANK = 17  TIME = 23802 sec   Ave Speed = 53775886 MB/s
    RANK = 18  TIME = 23821 sec   Ave Speed = 53733399 MB/s
    RANK = 19  TIME = 23794 sec   Ave Speed = 53795866 MB/s
    RANK = 20  TIME = 23939 sec   Ave Speed = 53468837 MB/s
    RANK = 21  TIME = 23938 sec   Ave Speed = 53470775 MB/s
    RANK = 22  TIME = 23939 sec   Ave Speed = 53470333 MB/s
    RANK = 23  TIME = 23942 sec   Ave Speed = 53462889 MB/s


Fig E-2 afeaZQR MPIbrvbar

    nmb = 128  intcpu = 1
    Number of CPUs = 24
    ------------------------------------------------
    CPU(rank0) -> every 1 CPUs 128MB data communication
    ------------------------------------------------
    The values are correct at all processes
    all time
    RANK = 1   TIME = 39370 sec   Ave Speed = 32511775 MB/s
    RANK = 2   TIME = 39475 sec   Ave Speed = 32425231 MB/s
    RANK = 3   TIME = 39415 sec   Ave Speed = 32474846 MB/s
    RANK = 4   TIME = 34392 sec   Ave Speed = 37217925 MB/s
    RANK = 5   TIME = 34543 sec   Ave Speed = 37055373 MB/s
    RANK = 6   TIME = 34250 sec   Ave Speed = 37372742 MB/s
    RANK = 7   TIME = 34709 sec   Ave Speed = 36878286 MB/s
    RANK = 8   TIME = 26967 sec   Ave Speed = 47465876 MB/s
    RANK = 9   TIME = 26964 sec   Ave Speed = 47471018 MB/s
    RANK = 10  TIME = 26955 sec   Ave Speed = 47486289 MB/s
    RANK = 11  TIME = 26947 sec   Ave Speed = 47499771 MB/s
    RANK = 12  TIME = 26957 sec   Ave Speed = 47483647 MB/s
    RANK = 13  TIME = 27020 sec   Ave Speed = 47373054 MB/s
    RANK = 14  TIME = 26948 sec   Ave Speed = 47498023 MB/s
    RANK = 15  TIME = 27174 sec   Ave Speed = 47104542 MB/s
    RANK = 16  TIME = 26956 sec   Ave Speed = 47485054 MB/s
    RANK = 17  TIME = 26945 sec   Ave Speed = 47504899 MB/s
    RANK = 18  TIME = 26946 sec   Ave Speed = 47502360 MB/s
    RANK = 19  TIME = 26967 sec   Ave Speed = 47466158 MB/s
    RANK = 20  TIME = 26948 sec   Ave Speed = 47499767 MB/s
    RANK = 21  TIME = 26982 sec   Ave Speed = 47438204 MB/s
    RANK = 22  TIME = 27087 sec   Ave Speed = 47254830 MB/s
    RANK = 23  TIME = 27004 sec   Ave Speed = 47399960 MB/s


Fig E-3 QRfeaZOpenMPIbrvbar

    nmb = 128  intcpu = 1
    Number of CPUs = 24
    ------------------------------------------------
    CPU(rank0) -> every 1 CPUs 128MB data communication
    ------------------------------------------------
    The values are correct at all processes                     NODE    CPU   coreid
    Rank = 0                                                    bx2050  CPU0  0
    all time
    RANK = 1   TIME = 38110 sec   Ave Speed = 33587213 MB/s     bx2050  CPU1  1
    RANK = 2   TIME = 28656 sec   Ave Speed = 44667752 MB/s     bx2050  CPU0  2
    RANK = 3   TIME = 37959 sec   Ave Speed = 33720175 MB/s     bx2050  CPU1  3
    RANK = 4   TIME = 28658 sec   Ave Speed = 44664664 MB/s     bx2050  CPU0  4
    RANK = 5   TIME = 37829 sec   Ave Speed = 33836073 MB/s     bx2050  CPU1  5
    RANK = 6   TIME = 28641 sec   Ave Speed = 44691880 MB/s     bx2050  CPU0  6
    RANK = 7   TIME = 37892 sec   Ave Speed = 33779924 MB/s     bx2050  CPU1  7
    RANK = 8   TIME = 35140 sec   Ave Speed = 36425620 MB/s     bx2051  CPU0  0
    RANK = 9   TIME = 35017 sec   Ave Speed = 36554163 MB/s     bx2051  CPU1  1
    RANK = 10  TIME = 38916 sec   Ave Speed = 32891010 MB/s     bx2051  CPU0  2
    RANK = 11  TIME = 45416 sec   Ave Speed = 28184186 MB/s     bx2051  CPU1  3
    RANK = 12  TIME = 35172 sec   Ave Speed = 36392511 MB/s     bx2051  CPU0  4
    RANK = 13  TIME = 35053 sec   Ave Speed = 36516468 MB/s     bx2051  CPU1  5
    RANK = 14  TIME = 35200 sec   Ave Speed = 36363472 MB/s     bx2051  CPU0  6
    RANK = 15  TIME = 35009 sec   Ave Speed = 36561987 MB/s     bx2051  CPU1  7
    RANK = 16  TIME = 22996 sec   Ave Speed = 55661441 MB/s     bx2052  CPU0  0
    RANK = 17  TIME = 35587 sec   Ave Speed = 35968575 MB/s     bx2052  CPU1  1
    RANK = 18  TIME = 35587 sec   Ave Speed = 35968151 MB/s     bx2052  CPU0  2
    RANK = 19  TIME = 35712 sec   Ave Speed = 35842525 MB/s     bx2052  CPU1  3
    RANK = 20  TIME = 22996 sec   Ave Speed = 55661978 MB/s     bx2052  CPU0  4
    RANK = 21  TIME = 36856 sec   Ave Speed = 34729458 MB/s     bx2052  CPU1  5
    RANK = 22  TIME = 22994 sec   Ave Speed = 55666382 MB/s     bx2052  CPU0  6
    RANK = 23  TIME = 22991 sec   Ave Speed = 55673257 MB/s     bx2052  CPU1  7



F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig F-1 ^_ZugraveuacuteLFlat MPIP

(main)
      n = nmb*1024*1024/8        ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                  ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8d0)/(timew/dfloat(NNN))
     &          /1024d0/1024d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end
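
The average-speed expression in the listing above can be read as the following worked equation. This is only a sketch: it assumes that the operators missing from the printed listing are the usual `*` and `/`, that each real(8) element occupies 8 bytes, and that 1 MB means 1024 x 1024 bytes, as the two divisions by 1024d0 suggest.

\[
n = \frac{\mathrm{nmb}\times 1024\times 1024}{8},
\qquad
\mathrm{ave\_spd}
  = \frac{8\,n}{\mathrm{timew}/\mathrm{NNN}}\cdot\frac{1}{1024^{2}}
  = \frac{\mathrm{nmb}\times\mathrm{NNN}}{\mathrm{timew}}\ \ \text{[MB/s]}
\]

Under these assumptions, nmb = 128 and NNN = 100 give n = 16,777,216 elements and ave_spd = 12800 / timew. A hypothetical total time of timew = 10 s (an illustrative value, not a measurement from this report) would therefore correspond to 1280 MB/s. Only the bytes of one array are counted per call to atob, so the figure is the copied data volume per second rather than the combined read and write traffic.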


Fig F-2 ^_ZugraveuacuteLOpenMPP

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8        ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
         write(6,*) 'Array allocation failed'
         stop
      end if
!$OMP end parallel
      nsize = n                  ! size
c     NNN : Repetition number of times
      NNN = 100
!$OMP parallel private(i),shared(nsize)
      do 100 i = 1, nsize
         a(i) = 1.0d0
         b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
         call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8d0)/(timew/dfloat(NNN))
     &          /1024d0/1024d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
         b(i) = a(i)
      end do
      return
      end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

sup2~ yntildeouu |ntildewzz |ntildeuou ovou ontildezx ontildeqyw ovw| ovzz |vqsup2~ xuntildeww ntilde|o ntildeuo qvzz |ntildeqo |ntildey qvz vu vsup2~ wyntildeu| oqntildexx oontildeqyy qvz |ntildezwy untildeuyw qvwz vq vuwsup2~ ontildequwntildexy ountildexxy oxntildeuy qvzu untildezqq xntildewyw qvwu vz vyu

sup2~ yntildeouu zou zo qvzu ontildeqqq ontildeq qvzw qvzo qvzxsup2~ xuntildeww ontildeo ontildezy qvwz ontildew ntildeuuz qv| qvz qvzsup2~ wyntildeu| ntildexuo ntildey| qvz ntildeyzz |ntildeuqw qvz qvzu qvwosup2~ ontildequwntildexy |ntildex untildeqox qvzu |ntildexww untilde|| qvw ovqx qvz

sup2~Oslashreg~plusmn yntildeouu yntildeuw| wontildexoz qvzu xntildeuo |ntildeoqw ovoq |vqq |vx|sup2~Oslashreg~plusmn xuntildeww ou|ntildewx ountildeqo qvw| uyntildeqzq yyntildewxo qvyz |vo vyosup2~Oslashreg~plusmn wyntildeu| oontildezoo yzntildeuz qvz yntildeqz zxntildeox qvo |voy vw|sup2~Oslashreg~plusmn ontildequwntildexy w|ntildeyox |yontildeuq| qvw wntildexo o|ntildexqz qvo |v| vz|

middotplusmnsup1 yntildeouu |ntildezw ||ntilde|x| qvzz oontildexww oxntildeyzu qvu vwx vo|middotplusmnsup1 xuntildeww y|ntildex yyntildeyy qvzx ontildez|x untilde|x qvuy vww ovuomiddotplusmnsup1 wyntildeu| wzntildewy z|ntildeyux qvzy |qntildeyx yntildeyqu qvuy vz ov|zmiddotplusmnsup1 ontildequwntildexy oqntildeux| o|untildeqox qvzq uqntildeyxx wwntildez| qvuy vzy ovxo

middotplusmnsup1raquo yntildeouu yntildew|u ||ntildeu ovww ntildewxy oxntildeyuw uvyy qvwy voumiddotplusmnsup1raquo xuntildeww wxntildeoz yuntildeo| ov|| wxntildex|| untildeuy ovwq ovqq ov|xmiddotplusmnsup1raquo wyntildeu| ooxntildeuw z|ntildezyu ov| z|ntildeyow yntildexuz ov|z ov| ov|zmiddotplusmnsup1raquo ontildequwntildexy o|ntildexx o|untildeuqz ovq zyntildexoq wntildeuuy ovoq ovu| ovxu

Ugraveplusmnsup1 yntildeouu |zntildeyu| oqntildexwz |vu uontildeuyq ountildewy vz qvzy qvoUgraveplusmnsup1 xuntildeww untildewxx ntildeqo voy uzntildeoy| yntildewo ovw| qvz qvw|Ugraveplusmnsup1 wyntildeu| xxntildewoq zntildewyu ovw xyntildewuu |yntildexqx ovxy qvzw qvwUgraveplusmnsup1 ontildequwntildexy yntildewxy |ntildexxw ovy yuntildewzx uyntildeoq ovuo qvz qvwo

Ugraveplusmnsup1raquo yntildeouu uqntildeq|u oqntildeyxq |vy uontildey| ountildez|z vz qvzy qvoUgraveplusmnsup1raquo xuntildeww uwntilde|u ntildeqz vo uzntildew yntildewx ovw| qvzw qvw|Ugraveplusmnsup1raquo wyntildeu| xyntildezy zntildez|y ovww xntildeo| |yntildeuwq ovx qvzw qvwUgraveplusmnsup1raquo ontildequwntildexy y|ntildeo| |ntildeyxq ovyw yxntilde|yw uyntildeqx ovuo qvz qvwo

~plusmn yntildeouu untildeyw wntildezu uvo uuntildeo wntildeuq| xvx qvzy ovq~plusmn xuntildeww xqntildeqyx qntilde|xq vuy xontildexwy untildezux vq qvz qvw~plusmn wyntildeu| xwntildeq wntildeux vqy yqntildeox |untildeu ov| qvzw qvw~plusmn ontildequwntildexy yyntildequq |yntilde|q| ovw ywntildeqx uuntildeuyy ovx| qvz qvw

~plusmnraquo yntildeouu zntilde|q zntildeqq ovqu zntildexq wntildeu|w ovo| qvzw ovq~plusmnraquo xuntildeww oyntildeu| qntilde|wy qvwo oyntildeoz untildezuy qvyx ovq qvw~plusmnraquo wyntildeu| |ntildewo wntildeuxu qvw ntildewz |untilde qvyy ovq qvw~plusmnraquo ontildequwntildexy zntildezy |yntildexo| qvw zntildezxq uuntildeu qvy ovqq qvw

plusmn yntildeouu yxntildewu wzntildeq| qv| ontildeyoz yntildezz qvwq |vq |v|qplusmn xuntildeww ountildeqoz oxntilde|o qvwo |zntildexu xxntilde|ux qv |vo vxplusmn wyntildeu| ow|ntildeuwy untildeyu qvw xntildeoyu untildezu| qvy |vo |vqqplusmn ontildequwntildexy u|ntildewo zxntildezqo qvw xntildezy zuntildeoy qvwq |vo |vo

plusmnraquo yntildeouu x|ontildeyx wyntildeqoz yvow o|ntildeqox owntilde|xu vu |vww uvyzplusmnraquo xuntildeww yozntildeyz| oz|ntildeuwu |vq oyntildeyw ywntildeyw v| |vwo vwplusmnraquo wyntildeu| yw|ntildewxu wwntildeooz v| owyntilde|q| oqxntildeyw ovy |vy vplusmnraquo ontildequwntildexy |yntildezo uqxntildeoyq ovw oontildexo ouontildezzx ovuz |vuw vwx

~plusmnreg yntildeouu uzu y| qvw uwo xzw qvwq ovq| ovqy~plusmnreg xuntildeww zq| ontildeoz qvy wo| ontildexuy qvx| ovoo qv~plusmnreg wyntildeu| ontilde|q ontildeyq qvw ontildeoxo ntildeoqz qvxx ovo| qvz~plusmnreg ontildequwntildexy ontildezqu ntilde|ux qvwo ontildeuxu ntildeyx qvxu ov|o qvww

~plusmn yntildeouu ountildewqx ountildez|x qvzz ontildeqq owntildeuw| ovou qvq qvwo~plusmn xuntildeww zntildeuy zntildewxz qvzw uqntildex|o uwntildexyw qvw| qv qvyo~plusmn wyntildeu| uuntildeq uuntildeo qvzz xyntildeyoq yzntildexzw qvwo qvw qvyu~plusmn ontildequwntildexy qntildeqqx qntildeu|q qvzz ntildeo zqntilde|ux qvwo qvzy qvw

sup2cedil~raquo yntildeouu x w qvz zo ux vq| qvw ov|sup2cedil~raquo xuntildeww ox o| qv| o|w o| qvyx qvzo qvwosup2cedil~raquo wyntildeu| oo |o qvu owz w| qvy qvzo qvwsup2cedil~raquo ontildequwntildexy ou wz qvu | |x qvyx qvz qvwo

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

sectWOcirceg

ow$XAcircsup3deg owsup3degIcircwsup3



3 HPCCYbrvbar

HPCC (High Performance Computing Challenge)1) Y 7^_ZB BX900W FX15poundDq13gtHPCCEcircupqWrstuq 31 HPCCYEcircu

HPCC YvR Tennessee JDongara wRordf3xWXpoundyoBCefgh$BY`^_ZzgtlaquoqO

|Y^_ WpoundDefgh$acircuXu

GmWbrvbarsectefgh$13WCeacuteZGUacuteC

qO HPL(High-Performance Linpack)DGEMMFFTE 3q|^_Zgt|Y STREAM W Random Access 2 q|^_Zgt _ PTRANSW b_eff 2q|^_Zgt5n~noacuteCc^_Zbrvbar5PJWXpound 2)qo

^_ZUuml|YacuteTHORNFortranCgtcnordfHPCCYWCCgtcnD 1^_ZeumlnqXHPCCYC 2003 sect 300wordfeumlnsect+X13rsyBGmW9gtlaquoq

311 ^_Z efgh$13f CPU 1 iexclXCiexcl8ampG

L13iumlIumlPa$fY`gtYZCDYZ$JgtlaquosectCoreCPUiumlIuml13WeumlUgraveUgraveXgtO=ordf5nDHogtordfugt

laquoqmgtiumlIumlMG` (G Global system performance)WiumlIuml[gt` (SN Single Environment)gt` (EP Embarrassingly Parallel) 3gt5Pq

3111 HPL^_Z

HPL5IgraveLUcurrengtCmWgt13OG^_ZgtlaquoqOslashm^_Z|5 CcedilEgrave^_ZaumlIacutecurrenWCyB

eumlnmWordfqUgraveDefgh$13O[P TOP500B^_ZWCBeumln LinpackMPIEumlIgraveUcircWCnotgtlaquoqiumlIuml`$XOordfecircMAumlWXq Tflopsgteumlnq` 1gtiumlIumlegravepoundDM(G)`5Pq


3112 DGEMM^_Z DGEMM5IgraveGmWgtiumlIumlOG^_Zgtlaquoqo^_gt5curren13^_W i5Xqm^_Z

LinpackSgtegravenauml^_ZgtlaquoqUgraveDaumlefgh$gto13zeumlnDeacuteaTHORN|ordfZaZ|WCBeumlnsect

meacuteaTHORN|ordfoacute WXqY`LBMTPbrvbariumlIumlOiexclcentGq Gflops(Giga floating-point operations per second)gteumlnq` 2gtiumlIuml[(SN)(EP)`5Pq 3113 STREAM^_Z STREAM iumlIumlordf|W CDWecirceacuteIumlG^_Zgtlaquoqm

Wecirco^_gtO=ordf5np^_W i5Xqm^_Z

cpound(Copy)curren(Scale)Myen(Add)yen(Triad) 4 XO=gtJeumlnsectiumlIuml|YiexclcentGq GBs(Giga Bytes per second)gteumlnq` 8 gt[(SN)(EP)gtoiumlIuml 4(cpoundcurrenMyenyen)`5Pq 3114 PTRANS^_Z PTARNS(Parallel matrix TRANSpose)EumlIgrave^_ocirc5IgraveFtimesbrvbarsect13ccedil`brvbarY G^_Zgtlaquoq GBs gteumlnq` 1gtiumlIumlegravepoundDM(G)`5Pq 3115 Random Access^_Z Random Access |sectumleumlnDecopyIgravecdZordf 1 ulaquonotgtshyregGO=ordf 1 macrdegplusmn5ncG^_Zgtlaquoq1 plusmnYGOcirc$2ordf8BWgteumlordfgtlaquoqiumlIuml|Yucircsup2`WiumlIuml MPI gtucircsup2`5Pq Gups(Giga updates per second)gteumlnq` 3 gtiumlIuml[(SN)(EP)iumlIumlegravepoundDM(G)`5Pq 3116 FFTE^_Z FFTEAumlF|sup3^acuteOG^_ZgtlaquoqHPLWiacute CcedilEgrave^_ZaumlIacutecurrenWCyBeumlnmWordfq[(SN)W(EP)`gt2 [microcopyIgraveordfo^_gtpara|sup3^acuteeumlnqiumlIumlegravepoundDM(G)`gt3[microcopyIgraveordflaquo 1AgravezmiddotAcurrenetheumlnDcedil|sup3^acuteeumlnlaquoW^_ordf^_W 5PmWLoacute Pbrvbaro^_

copyIgravesup1copytimesordm^_raquofrac14sect[microoacuteG|sup3^acuteordf5nq


Gflopsgteumlnq 3117 b_eff^_Z b_eff Ocirc$FGFGfrac12frac34(7a13acute latency)WFGFAring(eacuteIuml band width)shycdiquestH^_ZgtlaquoqFGfrac12frac34ordfAgraveXCeacuteIumlordfecircM13^_ ordfAacutenmWGqb_eff iumlIumlegravepoundDM(G)G Ring W PingPong 2 ^_ZgtJeumlnqb_eff FGGOcirc$-a5mnUgravegt^_ZW+XsectlOcirc$-a5 N ] MPI ^_ P atildeAcircXq7a13 8B Ocirc$eacuteIuml 2MB Ocirc$FGGmWbrvbarsect5Pq Ring MPIbrvbarpoundAtildeHdnDZYAumlAringAEligo^_oacuteCAumlAring 1oacute 1 5PoumlWZYAumlAringZordfEumlu 1oacute 1 5Pouml 2CcedilEgravegtCDq PingPong 2Core Eacute Ecircnototilde^_ordfethsectntildednp Core EumlCOslash5Gq2CoreIgraveEacuteIacute5poundecircTIcircIIumlshyGq b_effETHNtildeOgraveRing(AumlAring)brvbar7a13OacuteRing(AumlAring)brvbareacuteIumlOcircRing(Zordf)brvbar7a13OtildeRing(Zordf)brvbareacuteIumlOumlPingPongbrvbar7a13timesPingPongbrvbareacuteIuml 6gtlaquoq7a13ordfaY_LOslashPmacreacuteIumlordf GBsgteumlnq 312 Oslash5

HPCCZaZ(Baseline Runs)WUuml^Otildea5Z(Optimized Runs) 2Oslash5ordflaquoordffeaZTUacuteUcircEacutebrvbarZaZgt5pound

q

3121 ZaZ Oslash5gtlaquoZaZfeaZbrvbarTUacuteUcircW13Oumlfrac34

eumlnaumlUgraveAuml BLAS ZaZ|W MPI ZaZ|egraveBUgravegtUacuteeumlnordfUcircfIuml^regUacuteeumlnXqegraveBCDfeaZeacuteYacute~

WfeaegraveBCDUuml^13~ZaZ|eacuteYacute~UumlordfYacutegtlaquoq 3122 Uuml^Otildea5Z UcircfIumlTHORNszligagravePTUacuteUcirc2 7gtUacute9eumlnqiAgraveaacutednDIacutecurren(-)fIumlTUacuteUcircgtlaquosectpAgrave|5macr8CEcircnototilde3XTUacuteUcircgtlaquoqUgraveD acirc HPCC^_YacuteNtildeY`acircatilde^WaumlaringordfaeligugtlaquosectatildeCUumlGaeliguordflaquoq


3.2 Measurement results
The HPCC BMT tests were run on BX900 and on FX1 at 8 to 512 parallel.

3.2.1 BX900
Table 1 shows the HPCC BMT results on BX900, and Fig. 4 to Fig. 7 show the scalability of the HPL, PTRANS, Random Access, and FFTE tests. Scalability is good for most of the tests, although at 512 parallel some of them fall away from the ideal. HPL reaches 82.2% (= 4.902 / 5.96 × 100) of peak performance (*1).

For comparison with Table 1, Table 2 shows the results for the Intel Endeavor cluster (hereafter Endeavor), a system of comparable architecture, excerpted from the HPCC Web page 1). The BX900 values are broadly in line with Endeavor, although Endeavor is ahead in some tests. Table 3 shows results obtained on BX900 with the same Intel compiler as used on Endeavor; they are equivalent to the results with the Fujitsu compiler. STREAM, however, which runs entirely within a node, is about 20% lower than on Endeavor (3.5 GB/s against 4.2 GB/s). This is examined in Section 4.2.2.

(*1) On the whole BX900 system (17072 cores), 191.4 Tflops was obtained against a peak of 200.0 Tflops, an execution efficiency of 95.66%.
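The efficiency figures above are simple ratios of measured to peak performance, and can be checked with a few lines. The sketch below is illustrative: the helper names are hypothetical, and the peak calculation assumes 4 double-precision floating-point operations per cycle per core for the Xeon X5570 (2.93 GHz), which is an assumption brought in for the example rather than a figure stated in the text.

```python
def peak_tflops(cores, clock_ghz, flops_per_cycle=4):
    """Theoretical peak in Tflops (flops_per_cycle=4 is an assumption for Xeon X5570)."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


def efficiency_percent(measured_tflops, peak):
    """Execution efficiency as a percentage of peak."""
    return measured_tflops / peak * 100.0


# 512-parallel HPL, using the measured and peak values quoted in the text.
hpl_512 = efficiency_percent(4.902, 5.96)

# Whole system: 17072 cores at 2.93 GHz, 191.4 Tflops measured.
full_peak = peak_tflops(17072, 2.93)
full_eff = efficiency_percent(191.4, full_peak)

if __name__ == "__main__":
    print(f"HPL at 512 parallel: {hpl_512:.1f}% of peak")
    print(f"Whole system: peak {full_peak:.2f} Tflops, efficiency {full_eff:.2f}%")
```

Computing the whole-system peak from the core count rather than from the rounded 200.0 Tflops is what reproduces the quoted 95.66%.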


Table 1 HPCC BMT measurement results on the JAEA BX900 with the Fujitsu compiler
[Table body not recoverable from the extracted text.]
Rows: parallel count (8, 16, 32, 64, 128, 256, 512) and communication mode (rdma, nordma).
Columns (units): G-HPL (TFlops), G-PTRANS (GB/s), G-RandomAccess original version (Gups), G-RandomAccess SANDIA_OPT2 version (Gups), G-FFTE (GFlops), EP-STREAM Sys (*) (GB/s), EP-STREAM Triad (GB/s), EP-DGEMM (GFlops), RandomRing Bandwidth (GB/s), RandomRing Latency (usec).
System: Fujitsu BX900 (512 Core / 128 CPU); Chip: Xeon X5570 2.93GHz; DDR3-1066; SMT-ON; Turbo Mode-OFF; Interconnect: QDR Infiniband, 2 socket/node, Fat Tree; OS: Red Hat EL 5; Compiler: Fujitsu C/C++, Option -Kfast -SSL2; MPI: Fujitsu Parallelnavi Language Package; HPCC Ver: V1.3.1.
(*) The EP-STREAM Sys value is calculated as (EP-STREAM Triad) × parallel count.

Fig. 4 BX900: HPL

Fig. 5 BX900: PTRANS

Fig. 6 BX900: Random Access

Fig. 7 BX900: FFTE

Table 2 Intel Endeavor cluster (HPCC Web 1))

Table 3 HPCC BMT on BX900 with the Intel compiler

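3.2.2 FX1
Table 4 shows the HPCC BMT results on FX1, and Fig. 8 to Fig. 11 show the scalability of the HPL, PTRANS, Random Access, and FFTE tests up to 512 parallel.

HPL reaches 75.5% (= 3.776 / 5.0 × 100) of peak performance (*2).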

Table 4 FFTE DGEMM

A

2 FX1 1200 129Tflops116Tflops 9037

Table 4 FX1: HPCC BMT

Fig. 8 FX1: HPL

Fig. 9 FX1: PTRANS

Fig. 10 FX1: Random Access

Fig. 11 FX1: FFTE

3.3 Summary of the HPCC results
Fig. 12 compares BX900, FX1, and the previous system (Altix3700Bx2). The values are normalized to 100. The figure shows that BX900 is ahead on the items other than Random Ring Latency.

Comparing BX900 with FX1, differences that do not follow the hardware performance were seen for STREAM and FFTE. STREAM is discussed in Section 4.2.1 and FFTE in Section 4.3. The Random Access value for Altix3700Bx2 is low, but the HPCC BMT version used on Altix3700Bx2 handles memory access differently, so the values cannot be compared directly. The low FFTE value for Altix3700Bx2 is thought to reflect its execution efficiency.

Fig. 12 Comparison of HPCC BMT test results
[Chart not reproduced. Systems: BX900, FX1, Altix3700Bx2; 512 parallel (Flat MPI); vertical axis 0 to 100. Items: G-HPL (100 = 4.90 Tflops), G-RandomAccess, G-PTRANS, EP-STREAM (100 = 4.36 GB/s), EP-DGEMM (100 = 11.1 Gflops), G-FFTE, RandomRing Bandwidth (100 = 0.359 GB/s), RandomRing Latency (100 = 5.18 usec).]

4 Optimization on BX900 and FX1
For the tests in Section 3.2 whose results did not follow the hardware performance ratio of BX900 and FX1, the cause was investigated with the Fujitsu profiler. Specifically, optimization through compiler options and optimization through changes to the source were tried.

4.1 Profiler
The profiler used on BX900 and FX1 is the one supplied by Fujitsu. To use it, the program is compiled with the compiler option -Ktl_trt. When the program is run, profiling data is collected with the fpcoll command. The collected data is then output with the fprof command. The profiler reports eight kinds of information, among them MPI library information, hardware monitor information, and source code information.

The hardware monitor information includes CPU performance figures, cache behaviour, and memory transfer figures.

4.2 Optimization by the compiler
On BX900 the STREAM result of the HPCC BMT is about 20% lower than on the comparable Endeavor system (3.5 GB/s against 4.2 GB/s), and the cause was investigated. The versions examined were the STREAM contained in HPCC and the original STREAM (C version and Fortran version), each in thread-parallel and MPI-parallel form. As described below, all of them improved on BX900 when the compiler option -Kntst was specified, so the gap is attributed to the difference between the Endeavor compiler (Intel) and the BX900 compiler (Fujitsu).


4.2.1 Verification with the original STREAM test
Table 5 shows the results of the original STREAM test on BX900.

In the Fortran version, both the thread-parallel and the MPI-parallel variants benefited from the compiler option -Kntst. This option writes to memory without going through the cache, however, so it should be used only for DO loops whose stored data is not reused; elsewhere it can make performance worse.

It also takes effect only in DO loops that the compiler judges suitable. In the C version, the Intel compiler reached 4.2 GB/s per core with the OpenMP parallelization left as it was, whereas the Fujitsu compiler matched the Intel compiler only after the thread-parallel variant was changed from OpenMP parallelization to automatic parallelization. With the Fujitsu compiler, therefore, the appropriate optimization method depends on how the thread parallelization is expressed.

Table 6 shows the results of the original STREAM test on FX1.

In the Fortran version, both the thread-parallel and the MPI-parallel variants again benefited from the compiler option -Kntst. The Triad value (13.2 GB/s per CPU) is close to the value Fujitsu publishes on the Web (13.5 GB/s per CPU). The C version behaves like the Fortran version without the -Kntst option: unlike on BX900, the FX1 C compiler has no -Kntst option. This points to room for improvement in the optimization done by the FX1 C compiler.

Table 5 BX900: STREAM

Table 6 FX1: STREAM

4.2.2 Verification with the STREAM test of the HPCC BMT
Table 7 shows the results for the STREAM test of the HPCC BMT on BX900, the case discussed in Section 3.2.1. Here -Kntst makes writes of array data to memory use store instructions that do not go through the cache, and -Krestp=all is a compiler option that optimizes on the assumption that pointers do not overlap.

Because EP-STREAM runs independently on each CPU and core within a node, the parallel count and the communication mode have little influence, and not every combination was measured for each set of compiler options.

For a program such as STREAM, in which DO loops stream through memory without reusing the data, specifying the compiler option -Kntst improves memory performance.

Table 7 HPCC BMT test, STREAM (MPI-parallel version, C version)

Parallel  Mode    Compiler options ① (*1)      Compiler options ② (*2)
                  Count  EP-STREAM Triad       Count  EP-STREAM Triad
                         (GB/s)                       (GB/s)
32        rdma    1      3.4739                -      -
32        nordma  1      3.5016                1      4.3316
128       rdma    3      3.4708                -      -
128       nordma  3      3.4692                1      4.3572
512       rdma    5      3.4726                4      4.3595
512       nordma  4      3.5477                4      4.3424

(*1) Fujitsu compiler options ①: -Kfast,packed,ntst -SSL2
(*2) Fujitsu compiler options ②: -Kfast,packed,ntst,restp=all -SSL2
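The size of the improvement in Table 7 can be put in context with a short calculation. The sketch below takes the Triad values from the table and compares the gain between the two option sets with the gain one would expect if the store in Triad stopped incurring write-allocate traffic. The write-allocate accounting (a cached store first reads the target line, so Triad moves 4 words instead of the 3 that STREAM counts) is a standard model brought in here as an assumption; the report itself only states that -Kntst stores bypass the cache. The names `TABLE7`, `gain`, and `write_allocate_bound` are hypothetical.

```python
# EP-STREAM Triad (GB/s) from Table 7: (parallel, mode) -> (options 1, options 2)
TABLE7 = {
    (32, "nordma"): (3.5016, 4.3316),
    (128, "nordma"): (3.4692, 4.3572),
    (512, "rdma"): (3.4726, 4.3595),
    (512, "nordma"): (3.5477, 4.3424),
}


def gain(before, after):
    """Ratio of the bandwidth reported with options 2 to that with options 1."""
    return after / before


def write_allocate_bound(loads=2, stores=1):
    """Upper bound on the gain from removing write-allocate traffic.

    With write-allocate each stored word is read once and written once,
    so the hardware moves loads + 2*stores words while STREAM counts
    loads + stores. Triad has 2 loads and 1 store per iteration.
    """
    return (loads + 2 * stores) / (loads + stores)


gains = {key: round(gain(*values), 2) for key, values in TABLE7.items()}
bound = write_allocate_bound()

if __name__ == "__main__":
    for key, value in gains.items():
        print(key, value)
    print("bound", round(bound, 3))
```

Under this model the measured gains of roughly 1.22 to 1.26 stay below the 4/3 bound, which is consistent with, though not proof of, the explanation that the stores are no longer passing through the cache.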


4.3 Optimization by source-level tuning
As seen in Section 3.2, the FFTE results of BX900 and FX1 do not follow the hardware performance ratio. The tuning carried out to investigate this is described below.

When FFTE was run at 32 parallel, the achieved fraction of peak performance was low on both BX900 and FX1. The original FFTE (Fortran version; hereafter the Full version) and a kernel version, in which the main loops were extracted from the original, were therefore prepared, and variants with modified loops were compared on BX900 and FX1. The kernel version was made for two reasons: (1) to compare the Fujitsu compiler with the Intel compiler used on Endeavor, and (2) to confirm how much optimizing the main loops changes their cost. The kernel version contains three main loops. The original and modified loops are shown in Fig. 13 and Fig. 14 respectively, and the modification to each loop is outlined below.

Loop ① in Fig. 13 and Fig. 14 is the loop that performs the transform computation itself. In the original, the arrays are accessed with two separate memory streams; in the modified version the accesses are reordered through a work array so that the data is reused while it is in the cache.

Loop ② in Fig. 13 and Fig. 14 transposes a 3-dimensional array. In the original, array A is accessed with strided memory access in three nested loops; in the modified version the innermost accesses are made contiguous and the loops are blocked so that a block fits in the cache. The block size is 16.

Loop ③ in Fig. 13 and Fig. 14 transposes a 2-dimensional array. In the original, array A is accessed with strided memory access in two nested loops; the modified version is blocked in the same way. Because the cache on FX1 is smaller than on BX900, changing the block size of this loop from 16 to 4 improved performance on FX1.

Table 8 shows the results for the Full version and the kernel version with the Fujitsu compiler on BX900 and FX1, together with the kernel version with the Intel compiler on BX900.
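The blocking applied to the transpose loops can be illustrated with a small sketch. This is not the FFTE code (whose listings in Fig. 13 and Fig. 14 are Fortran); it is a generic blocked 2-dimensional transpose with a hypothetical function name, written to show what changing the block size from 16 to 4 means: the same result is produced, only the order in which elements are visited changes so that each block of the source and destination stays within a small working set.

```python
def transpose_blocked(matrix, block=16):
    """Transpose a rectangular list-of-lists, visiting it in block x block tiles."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    out = [[None] * rows for _ in range(cols)]
    for i0 in range(0, rows, block):          # tile origin along rows
        for j0 in range(0, cols, block):      # tile origin along columns
            for i in range(i0, min(i0 + block, rows)):
                row = matrix[i]
                for j in range(j0, min(j0 + block, cols)):
                    out[j][i] = row[j]
    return out


if __name__ == "__main__":
    m = [[r * 10 + c for c in range(7)] for r in range(5)]
    assert transpose_blocked(m, block=16) == transpose_blocked(m, block=4)
    print(transpose_blocked(m, block=4)[0])
```

In a compiled language the block size is chosen so that one source tile and one destination tile fit in the cache together, which is why a machine with a smaller cache favours a smaller block.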

Fig. 13 FFTE original version: main loops
[Fortran source listing not recoverable from the extracted text; the loops are marked ①, ②, ③.]

Fig. 14 FFTE modified version: main loops
[Fortran source listing not recoverable from the extracted text; the loops are marked ①, ②, ③.]

Table 8 FFTE Ver 4.1, Full and kernel (32 parallel, 180MB/Proc)
[Table body not recoverable from the extracted text.]

On BX900, the Full version with the Fujitsu compiler, the kernel version with the Fujitsu compiler, and the kernel version with the Intel compiler all gave nearly the same loop costs, which shows that neither the extraction of the kernel nor the choice of compiler changes the picture. When the block size of loop ③ in Fig. 14 was changed (block size: 16 to 4), performance on BX900 became worse, probably because the cache was no longer used effectively.

On FX1, the Full version and the kernel version with the Fujitsu compiler also agree closely, so the effect of the modifications can be compared with BX900 using the kernel version. Comparing the original and modified versions on BX900 and FX1, the computation in loop ① of Fig. 14 changed little, while the loops that were reordered for memory access improved more on FX1 than on BX900. Overall the improvement was 16% on BX900 against 40% on FX1. For loops ② and ③ of Fig. 14, which are both array transposes, the speedups on BX900 were only 1.1 times and 0.8 times, whereas on FX1 they were 3.6 times and 1.8 times (or 1.6 times). This shows that compiler optimization on FX1 is not as effective as on BX900, and that on FX1 tuning of memory access in the source is needed to obtain good execution efficiency.

4.4 Verification of the optimization effect with the NPB2.3 FT test
The findings from the STREAM test in Section 4.2 were checked against another test. The test used is the FT test of NPB (NAS Parallel Benchmarks) 2.3, which, like FFTE in Section 4.3, performs an FFT with the Stockham algorithm. As with FFTE, the original NPB2.3 FT (the Fortran, MPI-parallel version) and a kernel version with the main loops extracted from the original were compared under different compiler options.

The kernel version uses the Class C problem size ((X, Y, Z) = (512, 512, 512)), for which the main loops do not fit in the cache, so that the effect of the compiler options is visible.

There are two main loops in the test, shown in Fig. 15 and Fig. 16. Both write to arrays in memory inside the loop, which is the situation in which -Kntst is expected to help, and this option was applied. Memory block sizes of 16 and 64 were tried.

Fig. 15 NPB2.3 FT test: main loop (original, cfftz)
[Fortran source listing not recoverable from the extracted text.]

Fig. 16 NPB2.3 FT test: main loop (original, fftz2)
[Fortran source listing not recoverable from the extracted text.]

The results are summarized in Fig. 17 (details in Table 9 and Table 10). With the Fujitsu compiler on BX900, fftz2 improved when the compiler option -Kntst was specified, but cfftz showed no effect. On FX1 neither loop showed an effect. The improvement seen for fftz2 on BX900 is attributed to the compiler using store instructions that bypass the cache for the array writes; where the option had no effect, the compiler is thought not to have applied them.

For the memory block size, the default of 16 gave the best results in all cases. With the Fujitsu compiler, the values on both BX900 and FX1 are lower than for FFTE (by a factor of about 2), which is thought to be due to the strided access to memory in these loops. For the loops in which the Fujitsu compiler and the Intel compiler were both run with the equivalent of -Kntst, the Fujitsu compiler was about 10% faster, the same tendency as for STREAM.

4.5 Summary of optimization
For STREAM, the original FFTE, and NPB2.3 FT on BX900 and FX1, optimization by the compiler was examined, and the compiler options available on FX1 were found to have little effect.

Because compiler options alone do not give sufficient performance on FX1, tuning at the source level is required there.

Fig. 17 Optimization by the compiler

Table 9 NAS PARALLEL Benchmarks Ver 2.3 FT

Table 10 NAS PARALLEL Benchmarks Ver 2.3 FT

5 I/O measurement with IOR
The benchmark program IOR was used to investigate the I/O transfer rate of each file system.

5.1 Overview of IOR
IOR Benchmark 3) is a benchmark program written in C that was used in connection with ASCI Purple (version 2.10.2 was used here). It can perform parallel I/O through APIs such as POSIX (write(), read()) and MPIIO (MPI_File_write_at(), MPI_File_read_at()).

5.1.1 Main parameters
Each MPI process repeats the measurement the specified number of times (repetitions), and the maximum transfer rate is taken. By default the processes running in parallel read and write a single shared file, and the transfer rate for that file is reported. The main parameters used when running IOR are shown in Fig. 18.

In one repetition, the number of read/write calls made by each process is (blockSize ÷ transferSize × segmentCount), and the total file size is (blockSize × segmentCount × MPI parallel count). For example, with transferSize=1kB, blockSize=1MiB (10^6 Bytes is written MB and 2^20 Bytes is written MiB), segmentCount=1, and an MPI parallel count of 32, a testFile of blockSize × segmentCount × parallel count = 32MiB is created, and each process reads or writes its own 1MiB region as 1024 transfers of 1kB, either in order or at random.

[transferSize] Number of bytes transferred by one read/write call. It must be a multiple of sizeof(long long int) and must divide blockSize.

[blockSize] Number of bytes handled contiguously by each process. 1MiB here.
[segmentCount] Number of segments accessed by each process. The default is 1.
[randomOffset] If 1, access is random. The default is 0.
[fsync] Calls fsync after writing. The default is 0.
[filePerProc] Each process accesses its own separate file. Effective only with POSIX. The default is 0.
[useFileView] Uses MPI_File_set_view(), MPI_File_write(), and MPI_File_read(). Random access is not supported. The default is 0.
[collective] Uses MPI_File_write_at_all() and MPI_File_read_at_all(). Random access is not supported. The default is 0.

Fig. 18 Main parameters of IOR
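The sizing rules in Section 5.1.1 can be checked with a small calculation. The sketch below is an illustration with a hypothetical helper name, `ior_sizes`; it is not part of IOR. It treats the 1kB transfer size of the worked example as 1024 bytes, since that is what yields the 1024 transfers per 1MiB block quoted in the text.

```python
KIB = 1024
MIB = 1024 * 1024


def ior_sizes(transfer_size, block_size, segment_count, num_tasks):
    """Return (read/write calls per task per repetition, total file size in bytes)."""
    if transfer_size <= 0 or block_size % transfer_size != 0:
        raise ValueError("blockSize must be a multiple of transferSize")
    calls_per_task = block_size // transfer_size * segment_count
    total_bytes = block_size * segment_count * num_tasks
    return calls_per_task, total_bytes


# Worked example from the text: 1 KiB transfers, 1 MiB block, 1 segment, 32 tasks.
calls, total = ior_sizes(1 * KIB, 1 * MIB, 1, 32)

if __name__ == "__main__":
    print(f"{calls} calls per task, {total // MIB} MiB in total")
```

Varying `transfer_size` while holding the other three fixed reproduces the sweep used in the measurements: the file size stays the same and only the number of calls changes.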


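5.1.2 Optional parameters for MPIIO

In IOR the behaviour of MPIIO can be adjusted through Info hints. Table 11 lists the keys and values of the Info object that are effective on BX900 (the values shown are the defaults).

Table 11 MPIIO Info object

key                  value     description
cb_buffer_size       4194304   buffer size used for collective buffering
cb_nodes             -         number of processes that act as aggregators in collective buffering
ind_rd_buffer_size   4194304   buffer size used for independent reads
ind_wr_buffer_size   524288    buffer size used for independent writes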

5.2 Measurement results
For each API (POSIX, MPIIO), the transfer rate over the paths shown in Fig. 19 was measured while varying the parallel count, the transferSize (the number of bytes transferred by one call), and the access method (sequential, random). The results are given in Table 12, Table 13, Fig. 20, and Fig. 21.

Because there is a local cache on each compute node and a server cache on the IO nodes, so that I/O reaches the disk through caches at both levels, the effect of the local cache was also examined.

Fig. 19 Read/Write paths
[Diagram not reproduced. Elements: disk array ETERNUS DX80 (970TB); IO nodes SPARC Enterprise M9000 (SRFS server); compute nodes (SRFS client); READ and WRITE paths between them.]

Table 12 IOR measurement results (POSIX)
[Table body not recoverable from the extracted text. Rows: parallel count (32, 256, 1024) and transferSize (1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, 1048576 bytes). Columns: Write and Read rates for each combination of access method and file layout.]

Table 13 IOR measurement results (MPIIO)
[Table body not recoverable from the extracted text. Rows: parallel count (32, 256) and transferSize (1024 to 1048576 bytes, in powers of two). Columns: Write and Read rates for each access method and MPIIO option.]

5.2.1 POSIX
As Fig. 20 shows, with sequential access the transfer rate is almost independent of the transfer size, whereas with random access the transfer rate improves as the transfer size increases. When the transfer size is 1MB, sequential and random access lead to the same memory and file access pattern, so the two results coincide. For sequential writes, the case in which each process accesses its own separate file is faster than the single shared file, whereas for reads the single shared file tends to be the faster of the two.

These results are thought to be influenced by the memory, the disk controller, and the network (the connection between the compute nodes and the IO nodes).

For writes, the fact that the transfer rate is almost constant regardless of transfer size and parallel count, and that, unlike reads, no drop in performance is seen, is thought to be an effect of the server cache (the cache on the IO node side).

5.2.2 MPIIO
As with POSIX random access, the transfer rate improves roughly in proportion to the transfer size (Fig. 21). It does not depend on the parallel count. From this behaviour it is presumed that with MPIIO each process effectively performs random access, even when sequential access is requested.

5.2.3 Effect of the local cache
File I/O passes through the local cache of each node and the server cache.

It is difficult to bypass the server cache, but the local cache can be cleared by using the memclean command that is provided. By reading in the following sequence, the effect of the local cache with POSIX was confirmed (Table 14).

(1) Write
(2) memclean
(3) First read
(4) Second read

JAEA-Testing 2011-005

1

4

16

64

256

1024

4096

16384

1024 4096 16384 65536 262144 1048576 1024 4096 16384 65536 262144 1048576

TransferSize [B]

Perfo

rman

ce [M

iBs]

sequential randomSingleFileFilePerProc

ReadWrite

homehome

1

4

16

64

256

1024

4096

16384

1024 4096 16384 65536 262144 1048576 1024 4096 16384 65536 262144 1048576

TransferSize [B]

Perfo

rman

ce [M

iBs]

Write Read

sequential randomhomework

N=322561024

N=256

N=1024

N=32

N=32

N=2561024

FilePerProc FilePerProc

N=256

N=1024

N=32

N=2561024

N=32

FFig 20 POSIX

[Fig. 21 MPIIO: Performance [MiB/s] versus TransferSize [B] (1024 to 1048576) for sequential and random access; panels for home and work; curves labelled "use view" and "collective" with N = 32, 256.]


Table 14 _aumlaringaeligccedil13hszlig

| xy | xy

oregRplusmnsup2 zyvx qzv|| xwvxw zyuvqu

sup2Rplusmnsup2 woxqvoz ||qqvz| uzvu z|voy

kfrac14 frac12kfrac14

53 IORYUgraveWH

IORYbrvbardegdnDregmacrb13tuq 531 a13egravecurren occedil`brvbarYCDcparaX IO O=5Pnota13LSRFSP

homea13brvbarsectiumlIumlS_aumla13Lext3Ptmpa13PordfOslashszligordfAumlDHcAumlFXgttmpa13YZccedila)W^|13~AumlFUcircgtecirc9ordflaquoLFig 22middot Table 15Pq UgraveDhomemiddotAQFS_ccedilY-a5 512kcentCsect186OcircOtildeYOumltimes

raquoZETHIuml_Kgtaordf)JeumlnqUcircfIuml13NtildeY|^`bgta

gt_ccedilY-a5gtyBUacuteCqiAgravework middotAIO eacuteIumlkCQFS_ccedilY-a5 1MBcentCsect366OcircOtildeYOumltimes`Zaggtaordf)Jeumlnq^|13~cda_ccedilY-a5gtlgtyB

UacuteCqa-a5ordfecircCXbrvbarPX MPIIO work middotAegravePuecircgtlaquoq 532 POSIXW MPIIO c^_brvbariaYhomemiddotA Readgt 256EumlIgravegt 2curren

Writegt 10currenW POSIXbrvbarsect MPIIOordfcedilqPOSIXbrvbaraYmacrsup1Gq 533 eacuteccedilecirceuml ntransferSizeGX 1 plusmn13fgtleumlneacute

a`ordfecircMFGFAringordfecircCXWPdegdnDmWcd^|13~

^_Zeacuteccedil-a5laquo]GWordfyacuteordf9ordflaquoq C`|lOcirc`eacuteccedil-a5Y_ BUFSIZgtAtildeUgravesect13

gt gccegravePoumlgt fccegravePoumlgt 8192eacutea`gtlaquoqCfIumlgtsetvbufZaZ|atildebrvbarpoundeacuteccedilecirceuml^wmWordfgtecircq QR Fortran1313aeligleacuteccedil-a5GAgraveograve13sectLAacute

ordmPq (1) OPENyacute BLOCKSIZE (2) ^ fuxxbf (3) Oslash5fIumlZaUuml^13~ -g (4) Ocirc`gt 8Meacutea`

[Fig. 22 File system configuration. I/O server 1 (io1): ETERNUS DX80 units 1 to 18 (dx01 to dx18), SRFS file systems data_01, data_02, data_03, data_04, data_05, home and work. I/O server 2 (io2): ETERNUS DX80 units 19 to 36 (dx19 to dx36), SRFS file systems data_06, data_07, data_08, data_09, data_10, data, sysdata and work.]

Table 15 File system configuration

(io1)
SRFS file system  QFS file system  Size per unit  Total capacity              DAU    Notes
data_01           QFS_data_01      2109888 MB     2109888 × 36 = 72.4 TB      1024k
data_02           QFS_data_02      2109888 MB     2109888 × 36 = 72.4 TB      1024k
data_03           QFS_data_03      2109888 MB     2109888 × 36 = 72.4 TB      1024k
data_04           QFS_data_04      2109888 MB     2109888 × 36 = 72.4 TB      1024k
data_05           QFS_data_05      3695488 MB     3695488 × 36 = 126.9 TB     1024k
home              QFS_home         3695488 MB     3695488 × 36 = 126.9 TB     512k
work                               524288 MB                                         file system on io2

(io2)
SRFS file system  QFS file system  Size per unit  Total capacity              DAU    Notes
data_06           QFS_data_06      2109888 MB     2109888 × 36 = 72.4 TB      1024k
data_07           QFS_data_07      2109888 MB     2109888 × 36 = 72.4 TB      1024k
data_08           QFS_data_08      2109888 MB     2109888 × 36 = 72.4 TB      1024k
data_09           QFS_data_09      2109888 MB     2109888 × 36 = 72.4 TB      1024k
data_10           QFS_data_10      3695488 MB     3695488 × 36 = 126.9 TB     1024k
data              QFS_data         3695488 MB     3695488 × 24 = 84.6 TB      1024k
sysdata           QFS_sysdata      3695488 MB     3695488 × 12 = 42.3 TB      16k    [illegible]
work              QFS_work         524288 MB      524288 × 144 = 72 TB        1024k  dx01 to dx36
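The totals in Table 15 can be checked by multiplying the per-unit size by the unit count and converting MB to TB. The sketch below assumes binary prefixes (1 TB = 1024 × 1024 MB), which is the reading that reproduces the printed totals; the function name `total_capacity_tb` is hypothetical and only serves the illustration.

```python
def total_capacity_tb(unit_size_mb: int, units: int) -> float:
    """Total capacity in TB, assuming 1 TB = 1024 * 1024 MB."""
    return unit_size_mb * units / (1024 * 1024)


# (size per unit in MB, number of units) as read from Table 15
rows = {
    "data_01": (2109888, 36),
    "data_05": (3695488, 36),
    "data": (3695488, 24),
    "sysdata": (3695488, 12),
    "work": (524288, 144),
}

# Round to one decimal place, as in the table
totals = {name: round(total_capacity_tb(size, n), 1) for name, (size, n) in rows.items()}

for name, tb in totals.items():
    print(f"{name:8s} {tb:6.1f} TB")
```

With these inputs the computed values agree with the table entries (72.4, 126.9, 84.6, 42.3 and 72 TB).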


6 a$fY` mmgt BX900 a$fY`ETHNtildeCDtuq

ETHNtildeBX900Xraquozfrac14frac12GmWordfgtecircDq 61 a$fY`igraveiacute mmgt BX900 a$fY`IcircIumligraveiacute AgravejiumlIumleacuteaIuml

0Gq 611 icircWiumlIumlethsectntilde

BX900 InfiniBand QDRegraveBCD Fat Treeicircgtlaquoq113aelig13S 18iumlIumlAEligicircgtlaquoordf113aelig13 LeafWdividenaccedil7Iumlordf 2Igravecenttimeseumln1accedil7IumlcdoiumlIuml QDR ordf 1 7gtYZeumlnqaccedil7Iuml 2 Igravelaquogt1 iumlIuml 2 QDR ordfYZeumln(QDR Iacutegt 18 j2Igravegt 36WX)qoiumlIuml`ZaringbrvbarsectT 8GBs (4GBsj2) AAEligpoundq 13aelig13 5PDHLeafaccedilyacute7 SpineWdividenaccedilordf

9iexcllaquosecto LeafaccedilGu Spineaccedil QDRgtYZeumlnqbrvbarpound 113aelig13cd SpineIacutegt 9j2Igrave=18 QDRordfegraveBeumlnq QDRordf 5PoumlLeaf accedil3xmacrDownlink ordf 36 Uplink ordf 18 Xgt Uplinkshy 50WXsectmn 50_ccedilaringWdividenotgtqBX900gta$fY`Ecircfrac34h Fig 23GqBX900gt 50_ccedilaring13aelig13ocircotilde gtthornordf]GqSpineLeafaccedilW13aelig13SoiumlIumlYZGEcircfrac34h Fig 24Gqmmgt13aelig13S nAumliumlIuml nAuml(niquest9)UgraveD nU9Auml(nAgrave9) SpineaccedilegraveBGqlaquo13aelig13laquoiumlIumlordfcent13aelig13iumlIumlcdOcirc$ltAacuteoumlSpineaccedilcd Leafaccedil(2Igrave) QDR2(1j2Igrave)ordfegraveBeumln(2Igrave)LeafaccedilcdiumlIumlQDR2gtYZeumlnqDotildeCmSpineaccedilcd Leaf accedilordmAcirc13aelig13SPAtildeWiumlIumlWnoteumlnqmDHyenwiumlIumlTEacuteOcirc$lt GoumlWiumlIuml 1 W 10 ordflt GoumlWgtn~n FAringordf+XqX13aelig13Sotilde gtiuml_ccedilaring

WXqmDHoYacute~iumlIumlethsectntildeAgraveograveLiumlIumleacuteaOcircOtildePordf

thornotildeGq (1) XiumlIumlethsectntildeL13klP iumlIumlZeacuteaIumleumlnqyenw13aelig13 1 iumlIumlordf bx0001ccedil

bx0018Ugravegtlaquoordf144EumlIgraveYacute~oumlmAumlgtiumlIumlordfeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDdmiumlIumlAringAEligw[iumlIumlegraveBGq13

aelig13ocircotildeoumliacuteZeacuteaIumleumlnq


(2) CcedilXiumlIumlethsectntilde Wsup33iIacuteOslash5YZniumlIuml 1ecirceacuteaIumleumlnqyenw13aelig13

1gt bx0001bx0003bx0005bx0007WeacuteaIumleumlnqC3iumlIumlordfpYacute~egravenDd8)eacuteaIumleumlnDiumlIumlordfEgraveAumlAringXdAringAEligwTEacuteEgrave

AumlAringiumlIumlegraveBGqEcircAumlAringoumliacuteAringAEligwTEacuteEcircAumlAringiumlIumlegrave

BGqmnAtildeWYacute~SiumlIuml gt IB [EumlDHcenteumlnDq plusmnoqgt65EacuteWsup3AgraveogravegtCDq

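[Fig. 23 BX900 network configuration: chassis 1 (nodes bx0001, bx0002, ..., bx0018; leaf switches Leaf(1-1) and Leaf(1-2)) through chassis 119 (nodes bx2125, bx2126, ...; leaf switches Leaf(119-1) and Leaf(119-2)), all connected to spine switches Spine(1), Spine(2), ..., Spine(9).]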

3OslashCDqIgraveIacuteoFL13aelig13raquoIcircIumlntildePfrac34wWsup313kl

WiacuteiumlIumlethsectntildeAgravejBCq

[Fig. 24 Nodes and switches: the 18 nodes of one chassis (numbered 1 to 18) connect to the leaf switches of that chassis, and the leaf switches connect to spine switches SW 1 to SW 9.]
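The link counts quoted in this subsection (36 downlinks and 18 uplinks per chassis, 8 GB/s per node) follow from simple arithmetic on the topology. The sketch below restates that arithmetic; the constant names are illustrative, and the per-link rate of 4 GB/s is the QDR figure used in the text.

```python
NODES_PER_CHASSIS = 18
QDR_LINKS_PER_NODE = 2          # one QDR link to each of the two leaf switches
LEAF_SWITCHES_PER_CHASSIS = 2
SPINE_SWITCHES = 9
QDR_GBYTES_PER_S = 4.0          # per-link rate quoted in the text

# Links from the nodes of one chassis into its leaf switches
downlinks = NODES_PER_CHASSIS * QDR_LINKS_PER_NODE

# Links from the leaf switches of one chassis up to the spine switches
uplinks = SPINE_SWITCHES * LEAF_SWITCHES_PER_CHASSIS

# Fraction of the downlink capacity that is available towards the spine
uplink_ratio = uplinks / downlinks

# Injection bandwidth of one node
node_bandwidth = QDR_LINKS_PER_NODE * QDR_GBYTES_PER_S

print(downlinks, uplinks, uplink_ratio, node_bandwidth)
```

The ratio of 0.5 is the 50% figure used later when communication that crosses chassis boundaries is discussed.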

612 Agravej

BX900 Ba$fY`plusmnsup2yBCD IB AgraveograveWC

RDMA(Remote Direct Memory Access)yBCD RDMA Agravej(13 RDMA WC)WRDMA egraveBCX NORDMAAgravej(13 NORDMA WC)ordflaquoqBX900 gtOcirc` RDMA gtlaquoqRDMA W IB begravecccedil`brvbarYaumlIuml(Host Channel Adapter(HCA))Ocirc$^|13~|W8Y]sectWsectgtecircAEligDIacutemWgtETH_fg (1)5PqAuml^ccedil`E7a13 ordfNtildeEumlgtecircqiAgraveNORDMA CPUbrvbar|fg5Pq

RDMANORDMAWcentccedilYacute AgraveograveWC Normal SendAgravej(2)W Send on RequestAgravej(3)ordflaquoqRDMANORDMAacircWmnd Normal SendAgravejSend on RequestAgravej5PmWordfgtecircordfmndsectOgravewG ccedilYacute-a5brvbarpoundAtildeUgraveqRDMANORDMAgtmndAgravejsectOgraveOcirc`Oacuteshy Table 16GqOacuteshyOcircOtilde Normal SendAgravejOacuteshy13yacute Send on RequestAgravejWXqmndAgravejWccedilYacute-a5^_atildeAcirc Table 17W Table 18Gq Oacuteshy^ MP_MAX_NORMAL_SENDgt^wdnordfyacuteaacuteshyCDOcirc`shyWEacutegtlaquoqbrvbarpound^egravePoumlOacuteshyOumloumlEacutegtlaquoq


XDshy 5PDHOslash5 -onesideUuml^13~CDoumlNORDMAAgravejc(ccedilYacute-a5WatildeAcirc)Normal SendAgravejgt5nq

Table 16 Thresholds for switching the communication mode

Mode                  RDMA          RDMA           NORDMA
Number of processes   1 to 1024     1025 or more   -
Threshold             32768 bytes   16384 bytes    524288 bytes

Table 17 RDMA mode: switching between the Normal Send mode and the Send on Request mode

Message size               0 to less than 16 KB   16 KB to less than 32 KB   32 KB or more
1 to 1024 processes        Normal Send            Normal Send                Send on Request
1025 to 2048 processes     Normal Send            Send on Request            Send on Request
2049 processes or more     becomes NORDMA         Send on Request            Send on Request

Table 18 NORDMA mode: switching between the Normal Send mode and the Send on Request mode

Message size               0 to less than 512 KB   512 KB or more
Any number of processes    Normal Send             Send on Request

timesOslash (1) ETH_fg ^_ccedil-brvbarOcirc$fgIacuteccedil`brvbarYordmgt ordm|8

YOcirc$FGGCcedilEgravemWq^_ccedil-( 13)Obbrvbarsectdivide3GmWordfgtecircq (2) Normal SendAgravej G shylt shyccdccedilYacuteG GqccedilYacuteUgraveUacutesect

umlordmordfagraveCnUgraveUgravefg(direct copy)qagraveCXoumliUcircIacuteEumlAagravefCmmfg(indirect copy)agraveCDdsectumlordmfgGqUumlccedilYacutezecircq (3) Send on RequestAgravej G shylt shyWYacuteAacutesectlt shyccedilYacutesectumlordmordfagraveCcdG 5Pq

`ccedilYacutezecircq
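The selection rules of Tables 16 to 18 can be restated as a small lookup. This is only a sketch of the tables as reconstructed from the source, not the MPI library's implementation; the function name `select_send_protocol` and the tuple it returns are hypothetical, and it assumes that the "becomes NORDMA" cell of Table 17 means the NORDMA rule of Table 18 then applies.

```python
KB = 1024


def select_send_protocol(mode: str, nprocs: int, nbytes: int):
    """Return (effective_mode, protocol) according to Tables 16 to 18."""
    if mode not in ("RDMA", "NORDMA"):
        raise ValueError("mode must be 'RDMA' or 'NORDMA'")

    # Table 17: with 2049 or more processes, messages below 16 KB
    # are handled as NORDMA (assumption: Table 18 then applies).
    if mode == "RDMA" and nprocs >= 2049 and nbytes < 16 * KB:
        mode = "NORDMA"

    if mode == "NORDMA":
        threshold = 512 * KB      # Table 16: 524288 bytes
    elif nprocs <= 1024:
        threshold = 32 * KB       # Table 16: 32768 bytes
    else:
        threshold = 16 * KB       # Table 16: 16384 bytes

    protocol = "Normal Send" if nbytes < threshold else "Send on Request"
    return mode, protocol


print(select_send_protocol("RDMA", 512, 20 * KB))
print(select_send_protocol("RDMA", 1500, 20 * KB))
print(select_send_protocol("RDMA", 4096, 8 * KB))
```

For example, a 20 KB message uses Normal Send with 512 processes but Send on Request with 1500 processes, because the threshold drops from 32 KB to 16 KB above 1024 processes.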


62 MPI MPI acircX atildeegraveBCD^_Z)JCBX900 yacutegt

CDq256EumlIgraveccedil4096EumlIgravegt Ocirc$-a5THORNumlcopyszligpound5poundDordfmmgto MPIatildegtagraveaacuteX=ecircordfmacrdn 32ccedil80000eacutea`gtGq WCRDMANORDMA ordflteumln atildeordflaquomWlaquoOcirc

$-a5cdecircCordf^UcircG atildeordflaquomWXMordfethCDq 621 Agraveograve mmgtG Ocirc$umlcopy 0ccedil80000 eacutea`gtlaquoordfOslash5vEacuteAacuteG

Ocirc$acircETHeCLTable 19Pq AgraveograveWCMPIatildeTplusmnOslash5GraquoCmn 20plusmnatildesectaumlGq20plusmnatilde

sectaumlC3 1plusmnWecircXshyordfeumlnmWordfDDlaquopoundDqWCpumlcopyYacute~ W[ordfuumlwdnqmDHT`C19plusmnshyIIumlAacutemWCDqXIuml RDMAW NORDMA 2IumlgtGq^_ZOslash MPIatildeIacutecurren Fig 25Gq

Table 19 Message sizes

Parallelism   Allreduce, Reduce, Bcast, SendRecv   Other MPI functions
256           32 bytes                             128 bytes
512           same as above                        128 bytes
1024          same as above                        256 bytes
2048          same as above                        2048 bytes
4096          same as above                        2048 bytes

      do 1000 III = 1, NNN
        call MPI_BARRIER(MPI_COMM_WORLD, ierr)
        ttime1 = MPI_WTIME()                  ! measurement start
        call MPI_ALLGATHER(a, nelem, MPI_DOUBLE_PRECISION,
     &                     b, nelem, MPI_DOUBLE_PRECISION,
     &                     MPI_COMM_WORLD, ierr)
        ttime2 = MPI_WTIME()                  ! measurement end
        ttime0 = ttime2 - ttime1
        call MPI_REDUCE(ttime0, ttime, 1, MPI_DOUBLE_PRECISION,
     &                  MPI_MAX, 0, MPI_COMM_WORLD, ierr)
        if (myid .eq. 0) then
          max_time = max(max_time, ttime)
          min_time = min(min_time, ttime)
          all_time = all_time + ttime
        end if
 1000 continue

Fig. 25 Program (excerpt)
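The bookkeeping in the loop of Fig. 25 can be mirrored without MPI: in each iteration the slowest rank defines the time of the collective (the `MPI_REDUCE` with `MPI_MAX`), and rank 0 keeps the maximum, minimum and sum of those per-iteration values. The sketch below assumes the per-rank elapsed times are already available as plain lists; the function name `summarize` and the sample numbers are hypothetical, and the mean is added only for illustration since the excerpt does not show how `all_time` is used afterwards.

```python
import math


def summarize(per_iteration_rank_times):
    """Return (max_time, min_time, all_time, mean) over the iterations.

    Each element of per_iteration_rank_times holds the elapsed time
    measured on every rank for one iteration.
    """
    max_time = 0.0
    min_time = math.inf
    all_time = 0.0
    for rank_times in per_iteration_rank_times:
        ttime = max(rank_times)          # corresponds to MPI_REDUCE with MPI_MAX
        max_time = max(max_time, ttime)
        min_time = min(min_time, ttime)
        all_time += ttime
    mean = all_time / len(per_iteration_rank_times)
    return max_time, min_time, all_time, mean


# Hypothetical timings: three iterations measured on two ranks
sample = [[1.0, 2.0], [3.0, 1.5], [0.5, 0.25]]
stats = summarize(sample)
print(stats)
```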


622 mnUgravegttuDOslash5CcedilEgraveOslash5IumlgtP512EumlIgrave Fig 26 Fig

27 1024EumlIgrave B Gqoh$a` MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

[Fig. 26 MPI communication (512 parallel, part 1): panels for Allreduce, Reduce, Reduce_scatter, Allgather, Gather and Allgatherv; horizontal axis from 0 to 80000; two series, rdma and nordma.]

[Fig. 27 MPI communication (512 parallel, part 2): panels for Gatherv, Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Scan and SendRecv (Rank0 to Rank511); horizontal axis from 0 to 80000; two series, rdma and nordma.]

brvbarsect[raquozfrac14frac12CDq

(1) C gt RDMAbrvbarsect NORDMAordfFoumlordflaquoq (2) RDMAgtlaquo Ocirc$-a5cdaeligccedil ordflaquoEGqmn Normal Send

Agravejcd Send on RequestAgravejsectOgravethornWnq (3) NORDMA gt AgravejsectOgravewOacuteshyWatildeAcircOcirc$-a5gtaeligccedil ordflaquo

EGoumlordflaquoq (4) EumlIgraveordfCXWEacutebrvbarPX Ocirc$-a5gt decircordfmacrdn

oumlordflaquoq ^_ZgtacircegraveBG MPIatildebrvbarpoundRDMAW NORDMAegravecurrenaeliguordflaquo

qDotildeCmnd MPI ^_Z=GfIumlgtlaquo mpiexec Uuml^13~gtlaquogtGu MPIatildeoacuteC RDMAgt5Pc NORDMAgt5Pclaquo XgtC MPI atildeegravepoundoumlAgraveIumlgtOslash5rsGmWordfaeliguWXq ucircuumlWCo MPI atildegtUVOcirc$(ccedil8KB)3UVOcirc$(8Kccedil78KB)U

VOcirc$(80KBccedil584KB)gt RDMAZUcircCD Fig 28LgtUVOcirc$P CLUVOcirc$PGqZyacutegt 100n RDMA ordfnoty0n NORDMAordfnotyWXq

[Fig. 28 RDMA and NORDMA (small messages: 0 KB to 8 KB)]

63 RDMAszlig RDMA 612 gt0CDbrvbarPCPU plusmnsup2WnotM5XDH 13

O=Vfrac34gtecircqmnbrvbarsect WbordfO=gtecirc9ordflaquosectC

cCcedilEgrave`5poundEacuteDqUgraveDrsDHNORDMAgtiacuteX`5poundDq WC WordfO=eumlnDegraveyenagraveaacutegtecircXcpoundDq 631 Agraveograve W5P^_ZegravepoundRDMA W NORDMA rsGq

eacute^_WOcirc$ecircacuteGeumlsectCigraveecircacuteiacuteVicircG (13cedil Wdivideoslash)5PqGlt iuml_ccedilaring (MPI_IsendIrecv)egraveBGqG UgraveDlt GAtildeWcopyIgraveOcirc$-a5 5MB gtlaquoq W5P5Igravegtlaquoq5Igrave-a5(450450)gtlaquoqrsDHgt n~nOslash55Pq ^_ZZccedil`MPIW 2q|Icirca|ccedil`(MPIiumlOpenMP) 3q|)

JCDqZccedil` MPIgtOslash5CDoumliumlIumlS RDMAgtXDHmAgravejszligordfgtecircXqmDHiumlIumlSgt MPI 5XCeth2Icirca|ccedil`)JCDqIcirca|ccedil`Oslash5WOslash5Iacutecurren7ccedilIumlEumlIgraveUcircCXeacute

Yacute~ 1W7ccedilIumlEumlIgraveUcircCDeacuteYacute~ 2)JCDqeacuteYacute~ 2 WOslash5Iacutecurren Fig 29GqXMPIWIcirca|ccedil`eacuteYacute~ 15IgraveIacutecurren7ccedilIumlEumlIgraveampeuml$OMP parallel doiacuteordfotildegtlaquoq DHOslash5Zccedil` MPIgt 8EumlIgraveIcirca|ccedil`Oslash5 4mpij8omp(32core

egraveB)gtOslash5Gq

      call MPI_BARRIER(MPI_COMM_WORLD, ierr)
      ttime5 = MPI_WTIME()                    ! measurement start
      call mpi_isend(a, NMB, mpi_DOUBLE_PRECISION, ir1, 1,
     &               MPI_COMM_WORLD, ireq(1), ierr)
      call mpi_irecv(b, NMB, mpi_DOUBLE_PRECISION, il1, 1,
     &               MPI_COMM_WORLD, ireq(2), ierr)
      call mpi_isend(d, NMB, mpi_DOUBLE_PRECISION, il2, 1,
     &               MPI_COMM_WORLD, ireq(3), ierr)
      call mpi_irecv(c, NMB, mpi_DOUBLE_PRECISION, ir2, 1,
     &               MPI_COMM_WORLD, ireq(4), ierr)
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      ttime7 = MPI_WTIME()                    ! measurement start
!$OMP parallel do
      do j = 1, NMM
        do i = 1, NMM
          a1(i,j) = 1.d0
          b1(i,j) = 1.d0
          c1(i,j) = 0.d0
        end do
      end do
!$OMP parallel do
      do j = 1, NMM
        do k = 1, NMM
          do i = 1, NMM
            c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
          end do
        end do
      end do
      ttime8 = MPI_WTIME()                    ! measurement end
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      call mpi_waitall(4, ireq, iastatus, ierr)
      ttime6 = MPI_WTIME()                    ! measurement end
      ttimeDD = ttime8 - ttime7     ! time spent in the computation
      ttimeal = ttime6 - ttime5     ! total elapsed time
      ttimesd = ttimeal - ttimeDD   ! total time minus computation time

Fig. 29 Program used to overlap communication and computation


632 Zccedil` MPI 2plusmnIcirca|ccedil`eacuteYacute~ 1 WeacuteYacute~ 2n~n 3plusmn

Oslash55poundDqmnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt D Gqontilde13Gq

acutegtOslash5CD Oslash5 acutegtOslash5CDOslash5 R acute Wn~nOslash5shy Oslash5 acute W5poundDOslash5 (U) acuteOslash5cd()Dshy acuteOslash5CD(OslashC) U acuteOslash5cdRshyDshy

laquogtOslash5CD oacuteG WOslash5GmWbrvbarpound

UumlCXpoundDrGqmj13sectgtlaquoq

laquo() = (Oslash5U( iuml)) j100 aTHORNshyordfecircMszligordflaquomWXqbrvbarsect[raquozfrac14frac12CDq

(1) Zccedil` MPIgtiumlIumlS IBegravepoundXDH RDMAograveoacuteordfltdnOslash5szligq

(2) Icirca|ccedil`eacuteYacute~ 1 RDMAOslash5gt WOslash5rsGWucircOslash5ordfFWPXq NORDMAatildeCiacuteXWXpoundq

(3) Icirca|ccedil`eacuteYacute~ 2 gtIacutecurrenordf7ccedilIumlEumlIgraveUcirceumlngteacuteYacute~ 1WRrsgt 72currenOslash5Rrsgt 55currenAumlFUcircordfXeumln(RDMA)qCcCgtucircOslash5ordfFWPdegdnXq NORDMAIacutecurren7ccedilIumlUcircgt Oslash5eacuteYacute~ 1ruUumlIgraveUcirceumlnordfRDMAbrvbarsectcedilCXpoundq

BX900gt RDMA Uuml_IumlOslashWGgtCPU oacuteG

dCb^|13~ CPUbrvbarsectVfrac34GmWordfgtecircqCcCBX900 ordfiacuteCUgraveDplusmngt ordfUumlmWlaquosect Wordf

O=eumlnDegraveyenagraveaacuteGmWordfgtecircXcpoundDq


64 CPUSCPUiumlIuml 1oacute 1 core FAringG^_Zegravepound CDqCPU S

iumlIumlS CPU iumlIuml gt Ocirc$ordmAcircAgravejordf+Xqn~n gtordfaEacuteiumlIuml ordfTFWXpoundDq 641 Agraveograve

1oacute 1 ^_Z coreOcirc$ G^_ZgtlaquoqZY0cdcZYCD^_acircgtCD-a5Ocirc$ 100plusmnG Cmnd uCDn~nGq^_ZiIacute Fig 30Gq plusmn 24EumlIgravegt128MBOcirc$G C^_acirc 1WGq BX900gt 1iuml

Iuml 2CPU1CPU 4coregtlaquogtZY 3Ugravegt CPUS ZY 7UgravegtiumlIumlSPAtildeW CPUW mnd13yacuteiumlIuml WXqiumlIumlS coreoacuteG core IDWethsectntildedn MPIZYAumlAring Fig 31Gq

      IC = 0
      do 1000 III = intcpu, nodes-1, intcpu
        IIRANK = III
        call MPI_BARRIER(MPI_COMM_WORLD, ierr)
        ttime1 = MPI_WTIME()
        do 120 II = 1, NNN
          if (myid .eq. 0)
     &      call MPI_SEND(a, nsize, MPI_DOUBLE_PRECISION,
     &                    IIRANK, II, MPI_COMM_WORLD, ierr)
          if (myid .eq. IIRANK)
     &      call MPI_RECV(a, nsize, MPI_DOUBLE_PRECISION,
     &                    0, II, MPI_COMM_WORLD, istatus, ierr)
  120   continue
        ttime2 = MPI_WTIME()
c
        IC = IC + 1
        timew(IC) = ttime2 - ttime1
        ir(IC) = IIRANK
 1000 continue

Fig. 30 One-to-one communication program (excerpt)

NODE
  CPU0: Core ID 0 (Rank 0), Core ID 2 (Rank 1), Core ID 4 (Rank 2), Core ID 6 (Rank 3)
  CPU1: Core ID 1 (Rank 4), Core ID 3 (Rank 5), Core ID 5 (Rank 6), Core ID 7 (Rank 7)

Fig. 31 Assignment of MPI ranks to cores
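The placement in Fig. 31 puts ranks 0 to 3 on the even core IDs (CPU0) and ranks 4 to 7 on the odd core IDs (CPU1), whereas the text notes that Open MPI assigns ranks 0 to 7 to core IDs 0 to 7 in order. The sketch below encodes both placements to show which rank pairs share a CPU; the function names are hypothetical, and the rule that even core IDs belong to CPU0 and odd ones to CPU1 is read off Fig. 31.

```python
def core_id_fig31(rank: int) -> int:
    """Core ID for a rank (0 to 7) under the placement drawn in Fig. 31."""
    return 2 * rank if rank < 4 else 2 * (rank - 4) + 1


def core_id_sequential(rank: int) -> int:
    """Core ID when ranks 0 to 7 map to core IDs 0 to 7 in order."""
    return rank


def cpu_of_core(core_id: int) -> int:
    """CPU socket of a core: even core IDs on CPU0, odd core IDs on CPU1."""
    return core_id % 2


def same_cpu(rank_a: int, rank_b: int, placement) -> bool:
    """True when both ranks run on cores of the same CPU."""
    return cpu_of_core(placement(rank_a)) == cpu_of_core(placement(rank_b))


fig31_cores = [core_id_fig31(r) for r in range(8)]
print(fig31_cores)
print(same_cpu(0, 1, core_id_fig31), same_cpu(0, 1, core_id_sequential))
```

Under the Fig. 31 placement, ranks 0 and 1 share a CPU, while under the sequential placement they sit on different CPUs, so a rank-0-to-rank-k measurement probes different paths depending on the MPI library.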

BX900Ocirc`QRfeaZ+QR MPIZaZ|gtlaquoordfmQRRIgraveEacuteIacute13

IntelfeaZ+QR MPI IntelfeaZ+Intel MPI QRfeaZ+Open MPI

3 gt5poundDqDotildeCQR MPI W Intel MPI MPI ^_ coreoacuteGethsectntilde Fig 31CDsectgtlaquoordfOpen MPIatildeC rankAumlAring 01234567 Core ID 01234567gtethsectntildednq 642 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt E Gqmbrvbarsect[

raquozfrac14frac12CDq

(1) QRfeaZ+QR MPI gtCPU S FAringordf 3260MBsiumlIumlS CPU FAringordf 3750MBsiumlIumlordf 5350MBsgt CPUS ordfiAumlcedilCiumlIumlordfiAumlFWPWXpoundDq

(2) Intel feaZiumlQR MPI CPU SiumlIumlS CPU QRfeaZWotildeiacuteXotildeordfiumlIuml 4750MBsWegraveeacuteEGqmnQR MPIQRfeaZUcircCD5poundordfIntelfeaZoacuteC5nXDHWnq

(3) IntelfeaZ+ Intel MPI CPUSiumlIumlS CPUgtAgraveXordfmndocircoacute(1)(2)QR MPI egraveBqUgraveDiumlIumlgt Intel MPI 2(` QDRegraveBordfgtecircXDH1(` QDRCcXq


(4) QRfeaZiumlOpen MPICPUS ordf 4460MBSgtmnUgravegt3gtTbrvbarordfiumlIumlS CPU QR MPI WotildeAringiumlIumloumlCXWXpoundDqDotildeCiIacute QDRegraveB5poundbrvbarPotildeq

dividefrac12XegraveAgravegtlaquoQRfeaZiumlQR MPIgt CPUSiumlIumlS CPU ordf

iumlIuml brvbarsectcedilWPfrac34W+XWXpoundDqmmacraacutesectXuC

iumlIumlSgt MPI XuCEumluecircgtlaquoq 65 13aelig13ocircotildeouml 13aelig13ccedil`brvbarY 50_ccedilaringicircXpoundDHegraveBoslashbrvbarpound

thornltCUgravePqmthorn 1oacute 1 Wdivideoslash ETHuDq 651 1oacute 1

1oacute 1 MPI atildegtlaquosectbrvbarCegraveBeumlnqyenwOslashfIumlgteacute^_WFGgt^_ordfgt5noumlordflaquoqmbrvbarPXouml50_ccedilaringthornlt9ordflaquoqmmgt 5P^_Z)JC

thornETHuDq WC1oacute 1 gt 2iumlIumlgt IBnotGn[ordfecircDH50_

ccedilaringthornCltmWordfcpoundDqDotildeCWsup3iumlIumleacuteaOcircOtilde(Yacute~iumlIumlethsectntilde)AgraveogravegtAtildeWYacute~ordf IBnotGbrvbarPXiumlIumlcopytimesUVYacute~139ordfAgraveXDHthornordfCq 6511 Agraveograve ETHNtildeegraveBCD^_Z^_1curreno^_cdfrac14secto^_

1oacute 1 (MPI_SendRecv)gt 128MBOcirc$ 100plusmnG CmnuCDiexcl FAringGqOcirc$G aYacute Fig 32 Gqmyengt8EumlIgraveYacute~oumlgtZY 0ccedil3^_cdiexclZY 4ccedil7^_Ocirc$G CqOcirc$Glt ^_iugravepGgtoslashbrvbarpoundccedil`brvbarYgt[

ordfecircmWXq ^_ZZccedil` MPI WIcirca|ccedil`(MPI+OpenMP))JCDqIcirc

a|ccedil`7ccedilIumlETHeGmWgt1 iumlIumlcdOcirc$G 2^wmWordfgtecircq


6512 (1) 113aelig13S rsDH1 13aelig13S(144core)5poundDqZccedil` MPI 144 EumlIgraveIcirca

|ccedil` 72mpij2omp36mpij4omp18mpij8omp 3 IumlgtOslash55poundDq13aelig13Siuml_ccedilaringgtlaquoDH iumlIumluacuteordf IB iexclcentGqZccedil` MPI 1 iumlIumlcd 8 ^_currenOcirc$ordfeumlnIcirca|ccedil` 2 7ccedilIumliacute 1iumlIumlcd 4^_curren47ccedilIumlEacuteC 2^_curren87ccedilIuml1^_currenWXgt87ccedilIuml13 IBgt[ordfecircqmDHmnd FAring+XordfIBagraveaacuteGDHogtIIuml FAringW 1iumlIumlntildesect FAringshyiquestHDqm Table 20GqXeacuteYacute~Icirca|ccedil`cedil$ucirc3shy(MPI^_j7ccedilIuml)Gq Icirca|ccedil`(18mpij8omp)EacuteegraveeacuteshyordfEordfFAringshyGueacuteYacute~gt

otildeEacutebrvbarPXshyWXpoundqmnordf QDR2currenOslashgtlaquoigravePq (2) 13aelig13Zccedil` MPI TEacute144EumlIgraveZccedil`MPIgt5poundDqYacute~Oslash5gt t256YZegraveBCDq

mnmYZiumlIumleacuteaOcircOtildeordfZcopytimesWXDHIBnotordfecirc]GDHgtlaquoq _copytimes 72^_13aelig13(=9iumlIuml13aelig13)gtIumlb^_ 213aelig13currenethCDq13aelig13S 1iumlIumlecirc 2 IBordfuacuteordfpoundq13aelig13S 18iumlIumllaquogt 36 IBgteumlnqiAgrave13aelig13 50_ccedilaringgtlaquogt 18(9j2ccedil`)gteumlnq144EumlIgraveouml13aelig13S13aelig13W 18 IB egraveBGgtiuml_ccedilaringWXsect13aelig13S13aelig13EacuteXqbrvbarpoundm13aelig13ocircotilde 13aelig13SgtCD Wotilde^

dXq

[Fig. 32 One-to-one message communication pattern: messages are exchanged (send and receive) among processes 0 to 7.]

Table 20 Results (within a chassis)

Job type                  Flat MPI      Hybrid (72×2)   Hybrid (36×4)   Hybrid (18×8)
Per-process performance   723.8 MB/s    1447.4 MB/s     2894.2 MB/s     5346.7 MB/s
Per-node value            5790.6 MB/s   5789.9 MB/s     5788.4 MB/s     5346.7 MB/s
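The two rows of Table 20 are related through the number of MPI processes placed on each 8-core node: 8 for flat MPI, and 4, 2 and 1 for the hybrid runs with 2, 4 and 8 threads. The sketch below checks that relation. It reads the entries with one decimal place (for example 723.8 MB/s), which is an assumption based on the 8 GB/s per-node link limit, since the decimal points are not legible in the extracted source; the variable names are illustrative.

```python
CORES_PER_NODE = 8

# job type: (threads per process, per-process MB/s, per-node MB/s from Table 20)
table20 = {
    "flat MPI": (1, 723.8, 5790.6),
    "hybrid 72x2": (2, 1447.4, 5789.9),
    "hybrid 36x4": (4, 2894.2, 5788.4),
    "hybrid 18x8": (8, 5346.7, 5346.7),
}

derived = {}
for job, (threads, per_process, per_node) in table20.items():
    procs_per_node = CORES_PER_NODE // threads
    # Per-node value implied by the per-process figure
    derived[job] = per_process * procs_per_node
    print(f"{job:12s} {procs_per_node} proc/node  "
          f"derived {derived[job]:7.1f}  table {per_node:7.1f}")
```

The derived per-node values agree with the table to within rounding.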

Table 21 Results (between chassis, flat MPI)

Per-process performance in MB/s; "IB shared" indicates whether the node shares IB links with another node.

Node   144 MPI             192 MPI             224 MPI
       MB/s    IB shared   MB/s    IB shared   MB/s    IB shared
1      724.6   no          467.0   yes         467.1   yes
2      724.4   no          467.0   yes         467.1   yes
3      724.6   no          467.0   yes         467.1   yes
4      724.2   no          724.3   no          467.0   yes
5      724.3   no          724.4   no          467.1   yes
6      724.2   no          724.0   no          723.7   no
7      724.1   no          723.9   no          724.6   no
8      724.4   no          724.0   no          724.5   no
9      724.9   no          724.3   no          724.0   no
10     -       -           467.0   yes         467.0   yes
11     -       -           467.0   yes         467.2   yes
12     -       -           466.9   yes         467.1   yes
13     -       -           -       -           467.1   yes
14     -       -           -       -           467.0   yes

iacute 192EumlIgraveW 224EumlIgraveZccedil` MPIn~n 96^_13aelig13gt 213

aelig13Oslash5112^_13aelig13gt 213aelig13Oslash55poundDqordm 144EumlIgraveOslash5WWlt shy13aelig13iumlIumlraquoIIuml FAring Table 21Gq yenw 224 EumlIgraveoumlltAacutesectshygt IBnotGiumlIumlordflaquogt(50_ccedilaring)

miumlIuml FAringordfcedilCXqFig 24a$fY`icircgtCDbrvbarPiumlIuml1W 102W 113W 12ordfnotGmWXsectmndiumlIuml FAringcedilCXpoundqCcCiumlIuml 15161718 egraveBeumlnXDHmndiumlIumlWnotGiumlIuml 6789 FAringordfFC 144MPI oumlW FAringordfEacutegtlaquoqbrvbarpoundiumlIuml6ccedil9 iuml_ccedilaringWXq192 EumlIgraveiacutegtiumlIuml 4ccedil9 p IB notGiumlIumlegraveBeumlnXDH FAringordfFq

Table 22 Results (between chassis, hybrid)

Per-process performance in MB/s; "IB shared" indicates whether the node shares IB links with another node.

Node   Hybrid (112×2)       Hybrid (56×4)        Hybrid (28×8)
       MB/s     IB shared   MB/s     IB shared   MB/s     IB shared
1      933.9    yes         1856.7   yes         3640.8   yes
2      933.7    yes         1856.4   yes         3691.9   yes
3      934.6    yes         1856.6   yes         3653.6   yes
4      934.7    yes         1857.0   yes         3645.7   yes
5      934.2    yes         1865.1   yes         3664.6   yes
6      1438.5   no          2860.6   no          5380.8   no
7      1448.8   no          2911.8   no          5377.9   no
8      1450.0   no          2867.6   no          5378.9   no
9      1449.5   no          2910.0   no          5378.4   no
10     933.7    yes         1879.7   yes         3640.8   yes
11     934.0    yes         1877.5   yes         3671.5   yes
12     934.3    yes         1865.1   yes         3655.9   yes
13     934.8    yes         1881.1   yes         3645.7   yes
14     933.8    yes         1869.4   yes         3699.1   yes

(3) 13aelig13Icirca|ccedil` Icirca|ccedil`gt 213aelig13(224core)egravepound112mpij2omp(56^_13aelig13)

56mpij4omp(28^_13aelig13)28mpij8omp(14^_13aelig13)IumlgtOslash5CDqmndoiumlIumlgtIIuml FAringWpiumlIumlWnotnot Table 22Gq MIcirca|ccedil`gt IB notnotgt FAringordfecircC^poundqUgraveDIcirca|ccedil

`7ccedilIumlbrvbarpoundiumlIumlcdGlt GOcirc$2ordf^DHiumlIumlntildesect^_

ordfMiumlIumlXordf IB gtOcirc$[thornordfagraveaacuteXqmnduumlGWBX900 gtZccedil` MPI brvbarsectIcirca|ccedil`UcirceumlnDfIumlordfIcirca|ccedil`UcircfIumlXd 87ccedilIumlordfnotygtlaquomWordfcq WmigravegtWsup3iumlIumleacuteaOcircOtildeAgraveograveiumlIumlAtildeWecirccopytimeseumlnD

HUVYacute~13currenYacute~gt IB not9AgraveXqbrvbarpound50_ccedilaringmWuumlGbrvbarsectordmgttuDIcirca|ccedil`Ucirc5GuecircgtlaquoqCcC

Yacute~Oslash5 IB notGiumlIuml ordfpYacute~ordfOslash5eumlnWthornltCUgravePordfucircoB3mnEumlmWgtecircXq


652 divideoslash mmgtdivideoslash atildePiAuml 2ordfCX MPI_AlltoallegravepoundEacute

DGq 6521 Agraveograve

MPI_AlltoallGCcedilEgravecentCOslash5GqCmrsGqXMPI_Alltoall 100 plusmnOslash5C oacute copyIgrave-a5Zccedil` MPI ordf512MBIcirca|ccedil` 1024MBWCDq

A ucircYacute~Oslash5 B 13aelig13ntildesectiumlIumlgtecircotildeCGq C 13aelig13ntildesectiumlIumlAgraveXCG(13aelig13ordfCX)q D IBnotCG(iuml_ccedilaring)q

BccedilD atildeGgtlaquoordfiumlcopyordfYacute~gtegraveBGiumlIumllaquo GmWgtecircXq mDHWsup3ucircoBgteacuteccedilYacute~TYZ h4096egraveTEacute 512iumlIuml(4096core)agravefCm3cdXuCCcedilEgravePiumlIumllaquonotgtecircmCcedilEgravegtagravefgtecircEumlIgravegtCfNh$)JCmfNh$egravepound

MPI_Alltoall Oslash55poundDqmDHAacutedegCDOcirc$ MPI ^_UYacuteordfXq 6522 Zccedil` MPIWIcirca|ccedil`egravepoundD Table 23GqmmgtCXC

XmWOslash5CcedilEgraveordf+XDH AucircYacute~Oslash5WBccedilDUgravegtoshy8YrsCXmWgtlaquo(A 10WCDrClaquoordfmnoumlgt BccedilDEacutersWG)qBccedilDTEacute 512iumlIumlagravefCcdEumlIgravefNh$)JCOslash5GordfQR MPI gtmbrvbarPTEacuteCiumlIumlagravefCcdAgraveXiumlIumlgtCfNh$)poundOslash5CDoumlWTEacutecd

EumlIgravegtOslash5CDoumlWgt)acircOslash5ordfEGqZccedil` MPI WIcirca|ccedil`gtgt coreordfAgraveXoumlmraquozordfagraveaacuteXqX BccedilDrsXq
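For readers unfamiliar with the collective measured in this section, the data movement of MPI_Alltoall can be illustrated without an MPI library: every rank holds one block for every other rank, and after the call rank `dst` holds the blocks that were addressed to it, ordered by source rank. The sketch below is a pure-Python model of that exchange, not the benchmark code; `alltoall_model` and `off_rank_messages` are hypothetical names introduced here.

```python
def alltoall_model(send_buffers):
    """Model the data movement of an all-to-all exchange.

    send_buffers[src][dst] is the block that rank src sends to rank dst.
    The result recv[dst][src] is the block rank dst received from rank src,
    i.e. the transpose of the send layout.
    """
    n = len(send_buffers)
    for blocks in send_buffers:
        if len(blocks) != n:
            raise ValueError("each rank needs exactly one block per rank")
    return [[send_buffers[src][dst] for src in range(n)] for dst in range(n)]


def off_rank_messages(n_ranks):
    """Number of blocks that leave their own rank in one all-to-all."""
    return n_ranks * (n_ranks - 1)


send = [[f"{src}->{dst}" for dst in range(3)] for src in range(3)]
recv = alltoall_model(send)
print(recv[2])                  # blocks delivered to rank 2
print(off_rank_messages(4096))  # pairwise blocks at 4096 ranks
```

Because the number of off-rank blocks grows with the square of the rank count, the collective stresses the interconnect far more at 4096 ranks than at a few hundred.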

Table 23 iacuteXCcedilEgrave MPI_alltoall

vUacuteSTUcircR

vRcopyUumlYacutebrvbarTHORN

szligvcopyUumlYacutebrvbaragrave

pvRaacuteacircatildeNR

+aumleAumlovqaringaeligPgR

Oslash 5 Iuml

copy Igrave - a 5 (MB)

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

EA not iexcl plusmn

13 13 13

13 13 13 ntilde sect

I Iuml iuml Iuml

macr

R

R

szligR

pR

144m

pi

512

3R6

0 11

14

118

0

145

9 3

60 R

138

6

36

0 13

87

100

1

31

124

1

25

2plusmn

(BC

DEacute

) 51

2 3R

60

111

4

118

0

148

2 11

16 R

146

9

63

0 14

75

100

1

33

132

1

32

3plusmn

(BC

DEacute

) 51

2 3R

60

111

4

29

0 14

81

714

13 R

153

3

29

0 14

21

100

1

33

138

1

28

448m

pi

512

14R

40

142

6

414

0

182

4 21

272

1 R

170

3

78

0 16

32

100

1

28

119

1

14

504m

pi

512

14R

45

142

7

415

8

180

4 27

242

6 R

160

5

79

0 14

33

100

1

26

112

1

00

2048

mpi

51

2 37R

69

182

1

298

8 24

93

465

6 R

250

1

465

6 25

04

100

1

37

137

1

38

2plusmn

51

2 37R

69

166

7

2211

6

262

5 67

564

6 R

264

7

298

8 25

04

100

1

57

159

1

50

3plusmn

(BC

DEacute

) 51

2 37R

69

166

7

2211

6

262

1 69

524

9 R

269

3 6

298

8 24

59

100

1

57

162

1

47

3072

mpi

51

2 69R

56

258

7 28

399

9 27

34

5371

54 R

286

3 9

507

7 26

44

100

1

06

111

1

02

68m

pij

2om

p 10

24

3R5

7 12

10

11

170

11

80

91

9 R

122

3

28

5 12

21

100

0

97

101

1

01

120m

pij

2om

p 10

24

9R3

3 12

89

215

0

127

0 12

191

6 R

130

4

47

5 12

94

100

0

99

101

1

00

132m

pij

2om

p 10

24

8R4

1 13

79

216

5

116

1 15

211

6 R

119

5

48

3 11

38

100

0

84

087

0

83

432m

pij

2om

p 10

24

17R

64

162

4

119

8 16

28

1925

43 R

161

2 1

129

0 16

18

100

1

00

099

1

00

30m

pitimes4

omp

1024

4R

38

598

115

0

674

12

13 R

683

27

5 68

2

100

1

13

114

1

14

2plusmn

(BC

DEacute

) 10

24

4R3

8 59

8

115

0

678

12

13 R

690

27

5 70

0

100

1

13

115

1

17

36m

pitimes4

omp

1024

6R

30

603

118

0

684

11

16 R

697

63

0 70

3

100

1

13

116

1

17

2plusmn

(BC

DEacute

) 10

24

6R3

0 60

3

29

0 68

8

714

13 R

702

29

0 71

0

100

1

14

116

1

18

126m

pitimes4

omp

1024

16R

39

822

3

415

8

805

27

242

6 R

738

1

79

0 71

5

100

0

98

090

0

87

180m

pitimes4

omp

1024

18R

50

925

615

0

892

40

283

2 R

776

127

5 75

3

100

0

96

084

0

81

2plusmn

(BC

DEacute

) 10

24

18R

50

925

712

9

861

29

352

6 R

808

109

0 76

8

100

0

93

087

0

83

612m

pitimes4

omp

1024

45R

68

110

2

349

0 10

34

525

9 R

111

4

349

0 10

46

100

0

94

101

0

95

2plusmn

(BC

DEacute

) 10

24

45R

68

110

2

2810

9

920

57

634

9 R

960

5

358

7 83

4

100

0

83

087

0

76

17m

pitimes8

omp

1024

3R

57

338

1

117

0

430

9

19 R

432

28

5 43

3

100

1

27

128

1

28

90m

pitimes8

omp

1024

15R

60

504

109

0 44

9

243

8 R

471

109

0 47

1

100

0

89

093

0

94

357m

pitimes8

omp

1024

53R

67

731

418

7 69

2

576

3 R

711

418

7 72

9

100

0

95

097

1

00

2plusmn

(BC

DEacute

) 10

24

53R

67

731

3410

5

642

57

685

3 R

628

6

438

3 60

2

100

0

88

086

0

82

brvbarWBccedilD gtecircCOslash5ordf+XAgraveXq13aelig13ordfAgraveXCc IB notGiumlIumlordfAgraveXnOslash5ordfUumlWnDordfmacrWaeligCmbrvbarPXpoundXqMPI_Alltoall MPI atilde3gtT 2ordfCX 5Pordfmnegravepound BX900 gtecircXordfXmWordfethCDq1 oacute 1 gt13aelig13thornordfDordfmgtpYacute~thornltC t256YZegraveBCDoacuteCdivideoslash h4096YZegraveBCDgtpYacute~thornlt9]QR MPIbrvbar TUacuteUcircthornuumlwdnWgtszligagraveXNXq 66 a$fY`UgraveWH mnUgravegt BX900 gt brvbarsect ecircXthorn7wWCacirc

RDMANORDMA Agravej Ocirc$-a5 1oacute 1 (13aelig13ocirc)IBnot

ordflaquopoundDqumlcopyfIuml oslashfrac14frac12CTbrvbarCcedilEgravelaquo GuecircgtlaquoigravePqeumld

fIumlszligOslash5GDH Zccedil` MPIUgraveDIcirca|ccedil`laquo

ordfugtlaquoqucircuumlWCGZccedil`MPIWIcirca|ccedil` rsGordfWmcdBX900gtIcirca|ccedil`ordfnotyXmWordfcurrencqmncdfIuml)JGoumllaquocentfIumlAumlFUcirc5GoumlcentX=ordfX

nIcirca|ccedil`laquo GuecircgtlaquoqBX9001iumlIuml 8coreoacuteC IB2gt brvbarntildeGDHm IBgt 2laquo]eumlXDHIcirca|ccedil`ordfugtlaquoqCPU13fUcircordfbrvbarsect2WnIcirca|ccedil`fIuml13vC]G

Weumlnq

7 UgraveWH

HPCC BMT7q|^_ZWUuml|YacuteTHORN^_ZBDbrvbarsectBX900W FX1gtfeaZbrvbarTUacuteUcircordflaquosectnordf^_ZOslashszligthornGmWordfdcXpoundDqIORBDgtegraveBG API]a13brvbarpoundlordfagraveaacutegtlaquopoundDqiacuteX B^_ZbrvbarOcirc$

a$fY`yBAgraveogravegt ordfecircC+XmWethCDqmnd

EcircHdegdnDregmacrHPCCYoJusectWUgraveWHWCtCqXyacuteampG(GOcirc$b AccedilGUgraveWHq mndregmacrBX900 W FX1 o13gtegraveBG^_ZIacuteDh

Agraveogravelaquo GDHnotszligXWXqordfbrvbarsectCumlcopy13

ABGlCWCJ7gtecircngtlaquoq

ugraveuacute

)JlaquoDsectXsup2 sup2kDotildeDAElig13=` v oBeacuteQRjlucircsup2acircAgraveXugraveG

q

ucircuumlyacutethorn

1) Innovative Computing Laboratory, the University of Tennessee, "HPC CHALLENGE", http://icl.cs.utk.edu/hpcc/

2) gt$middotamp()p HPCaelig7Yacutegt SX13L32 3plusmnAElig13THORNYacutel 3 A=P+ frac14AElig13THORNYacute$ 2005-05middot4 p98-116 (2005)

3) "IOR HPC Benchmark", http://ior-sio.sourceforge.net/

A

FX1

HPC

C B

MT

G-F

FTEfeaZUuml^13~

-DH

PCC

_FFT

_235GmWgt

FFT|5ordfTUacuteUcirceumlnordfzyacuteCq

iAgrave

EP-

DG

EM

M-a5^regbrvbarpoundaringaeligccedil13h-a5

(15

MB

Cor

e)UacuteCordfzyacuteCq

Tabl

e A

-1feaZUuml^13~]-a5^regCD

FX1

HPC

C B

MT

UgraveccedilegravekeacuteUgraveccedilknUgraveccedilplusmnsup2sup3

~~regreg

Ugraveccedilmmnecircecirckccedilnecircfrac12

regReeumlgecirckccedilnecircfrac12

ndegplusmnsup2

ecirckccedilpUgraveecircfrac12frac12plusmnsup2sup3degmiddot

plusmnsup2ordmdegsup2sup1plusmnsup2sup3degmiddot

eacuteplusmn~+igraveiacute8

nmcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

Ugravecedilreg

Ugravemcedilreg

Ugravecedilreg

reg~

reg~v

reg

sup2sup3plusmn

qvqyuqou

ovwyy

qvquzqq

vxzwo

qvzz

vyoy

zzv|w|

qvuqqy

uvuyx

qyu

sup2sup3plusmnqvqy|zx

ovwz

qvquyxz

vxuo

qvzuw

vyowx

zzv|z

qvxx|y

yvox

qwy

sup2sup3plusmn

qvxoquq

uvoxzo

qvoqqz

yyvuox

w|vyzx

vyoxx

zzvzyq

qv|qq

vqy

yy

sup2sup3plusmnqvxoqwwq

uvqwxx

qvooqzu

yyvxuwo

w|vyyq

vyouu

zzvzzx

qvux|

oqvz|

xw

sup2sup3plusmn

|vwq|qqq

|uvwz

qvqoxxw

qv|

o|uuv|q

vyxy

zzv||wq

qvqwzuw

oqvywyu

sup2sup3plusmn|vw||wqq

|vyz

qvuxu

yyyvuwwu

o|uyvyw

vy|q

zzv|uy

qvxwzqu

ovxuo

zq|z

sup2sup3plusmn

|vwuzqq

uuvxqwq

qvqoxxw

wwyv|zq

o|uv|q

vy|ox

xvuuwu

qvowox

oqvowuyq

sup2sup3plusmn|vwoquxqq

|yvzyxu

qvuzou

yvqwq|

o|uyvzxu

vy|qw

xvuuz

qvqwuy

ovuz

|y

uxqqq

wxqqq

|xqqqq

|qqqq

w

frac14icircRfrac14Rplusmndegreg

szligsup3degicircRRszligcedilszligiumliumlRethvRntildeRfrac14degicircRccedilogravemicroplusmnregntildemicro~sup1Oslashsup3sup2moRenecircfrac12oacuteregplusmnAumlocircotildeg

egravekeacuteOslashfrac14knicircccedilpegravekszligszligOslashmmnOslash|xRccedilpOslashpOslashfrac14knRccedilpegravekszligszligOslashfrac12ecircfrac12eacuteeacuteszlignRocircotilde

mszligpecircmicircccedilpeacutefrac14UgraveOslashOslashyunRocircotilde

frac12kicircRRkplusmnplusmnplusmnraquodegReacutemiddotplusmnmiddotRkplusmn~plusmnmiddot

-

moRexoszligcedilowszligkoumlg

szligdegicircRkszligyudividentildeRvxUgraveegraveparantildeRpposlashueuqUgravecedilregedegsup2plusmngntildeRo|vxUgravecedilregesup3plusmnregsup2RugraveRnecircfrac12gg

Cuacute uacuteucircicircuumlTyacutethornYacuteN0ucircicircuumlTyacutethornYacuteN0gt

|

xo

xo

egravekszligszligRethvicircRethov|vo

eeumlgicircecirckccedilnecircfrac12RregR0Eeecirckccedilnecircfrac12Rndegplusmnsup2gIcirc-RJ13NOIPQ

~~icircRppRmicrodegdegugraveplusmnsup2ntildeRoreg~cedilsup2ntildeRmplusmnRn

B EumlIgravecentMPI atilde oh$a`MPIatildeEumlIgraveC Ocirc$-a5raquoaringpoundq

1024EumlIgrave

[Plots omitted. Panel titles: Allreduce(1024-), Reduce(1024-), Reduce_scatter(1024-), Allgather(1024-), Allgatherv(1024-), Gather(1024-). Series in each panel: rdma, Nordma.]

Fig B-1 MPI atildeL1024EumlIgrave-1P

[Plots omitted. Panel titles: Gatherv(1024-), Scatter(1024-), Scatterv(1024-), Alltoall(1024-), Alltoallv(1024-), Bcast(1024-), Scan(1024-), SendRecv(1024-Rank0gtRank1023). Series in each panel: rdma, Nordma.]

Fig B-2 MPI atildeL1024EumlIgrave-2P

13

C

RD

MAoacute

NO

RD

MA

Ta

ble

C-13UVOcirc$

(8K

Bccedil

78K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

quw-

uqzy-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

~plusmnreg

plusmn

sup2~

sup2~

sup2cedil~raquo

sup2~Oslashreg~plusmn

middotplusmnsup1

middotplusmnsup1raquo

~plusmnraquo

~plusmn

~plusmn

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

plusmnraquo

Tabl

e C

-2UVOcirc$

(80

KBccedil

584K

B)

ouu-eaOtildeltaOumlgouu-eaOtildeltag

xy-

xo-

oqu-

qtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimesqtimesRoqqtimes

sup2~Oslashreg~plusmn

plusmn

~plusmnreg

middotplusmnmiddotsup1

~plusmnraquo

~plusmn

sup2~

sup2cedil~raquo

sup2~

middotplusmnsup1raquo

Ugraveplusmnsup1

Ugraveplusmnsup1raquo

~plusmn

plusmnraquo

Zordf

100M

RD

MAAgravejordfnotyWXq

0CXM

NO

RD

MA

AgravejordfnotyWXq

D

W Oslash5

Table D-1 Zccedil` MPI 8mpi

qzvquecirciumlq|yvwuecirciumlquvzuecirciumlquovu|oecirciumlqxvwoqecirciumlquyvxqoecirciumlquyvxoecirciumlqu

qwvo

qzvzzecirciumlq|vuecirciumlquwvyecirciumlquovu|oecirciumlqxvwuecirciumlquyvuzoecirciumlquxvy|ecirciumlqu

yqyv|

owvz|oecirciumlq|yvwyyecirciumlquvxzecirciumlquvqxecirciumlquov|uecirciumlquyvuecirciumlquccedilxv|xecirciumlq

ccedilyvq

ozvqoecirciumlq|vxxecirciumlquwvux|ecirciumlquvyzoecirciumlquovxoecirciumlquyvuuqecirciumlquccedilvyecirciumlq|

ccedilwuvy

ovq|ecirciumlquyvzz|ecirciumlquwvq|qecirciumlquovu|qecirciumlqxvw|wecirciumlquyvuyoecirciumlquyvywecirciumlqu

yquv

ovquyecirciumlquvuyqecirciumlquwvxqecirciumlquovu|qecirciumlqxvwecirciumlquyvuwqecirciumlquxvzxecirciumlqu

xx|vw

|ovq|zecirciumlquyvwzuecirciumlquvz||ecirciumlquvyoecirciumlquovwecirciumlquyvuzecirciumlquccedilovozecirciumlq|

ccediloyvy

|ovquecirciumlquv|qecirciumlquwvuoyecirciumlquvyy|ecirciumlquovowyecirciumlquyvuecirciumlquccedilvxwecirciumlq|

ccedilovz

uzvwoecirciumlq|vuxyecirciumlquwvu|uecirciumlquovu|ecirciumlqxvw|ecirciumlquyvuuxecirciumlquxvwwuecirciumlqu

yqovy

uzvywecirciumlq|yvzuzecirciumlquvzoyecirciumlquovuwecirciumlqxvwqyecirciumlquyvuecirciumlquyv|yecirciumlqu

yxvu

xzvqecirciumlq|vuoqecirciumlquwv|wqecirciumlquvyyecirciumlquovowecirciumlquyvuuzecirciumlquccedilvozecirciumlq|

ccedil|vx

xzvyzzecirciumlq|voxuecirciumlquwvouecirciumlquvqqecirciumlquovxuecirciumlquyvuuyecirciumlquccediluv|yecirciumlq|

ccedilu|v

yzvuecirciumlq|vxzecirciumlquwvxecirciumlquovuyecirciumlqxvwecirciumlquyvuxecirciumlquxvyzqecirciumlqu

xw|vw

yzvz|oecirciumlq|yvwzoecirciumlquvwwxecirciumlquovuzecirciumlqxvwyecirciumlquyvuyyecirciumlquyvuqwecirciumlqu

yuxv|

zvwqecirciumlq|v|uzecirciumlquwv||yecirciumlquvyxwecirciumlquovozecirciumlquyvuzecirciumlquccedilyvwuecirciumlq|

ccedilywv

ovqq|ecirciumlquyvzzyecirciumlquvzzzecirciumlquvyzecirciumlquovqoecirciumlquyvuzecirciumlquccedil|vozxecirciumlq|

ccedil|ovz

zvuzecirciumlq|vowqecirciumlquwvoxxecirciumlquovoqqecirciumlqxuvxwecirciumlquyvuqecirciumlquvwu|ecirciumlqu

zovy

zvwecirciumlq|vyxecirciumlquwvuecirciumlquovqzzecirciumlqxuvxoecirciumlquyvuyzecirciumlquvu|ecirciumlqu

wvw

qwv|xzecirciumlq|vzecirciumlquwvywecirciumlquvxyoecirciumlquovqwuecirciumlquyvuecirciumlquccedilovqyecirciumlquccedilov

qzvuuuecirciumlq|v|yecirciumlquwvqecirciumlquovuzecirciumlqxvwqqecirciumlquyvuzoecirciumlquyvqoecirciumlqu

y|vx

oovouwecirciumlquwvqqecirciumlquzvoywecirciumlquovuqwecirciumlqxvyxxecirciumlquyvuwecirciumlquuvzoxecirciumlqu

uwvq

owvwuecirciumlq|vq|qecirciumlquvzqzecirciumlquv|uuecirciumlquzvqoecirciumlq|yvu|ecirciumlquccedilxvyxqecirciumlq|

ccedilyuv|

yvz||ecirciumlq|v|oecirciumlquwvuuecirciumlquvqyecirciumlqxovuowecirciumlqxyvu||ecirciumlquovozecirciumlqxoxwvy

ovouecirciumlquvuxecirciumlquwv|zecirciumlquovu|oecirciumlqxvwxyecirciumlquyvux|ecirciumlquxvzoecirciumlqu

xoyvo

|ov|qoecirciumlquv|zqecirciumlquwvyzoecirciumlquv|oecirciumlquwvu|uecirciumlq|yvu|ecirciumlquccedilov|uecirciumlquccediloqxv

|v|qecirciumlq|vqecirciumlquvxzecirciumlquvxqecirciumlquovq|ecirciumlquyvuzecirciumlquccedilv|zoecirciumlq|

ccedil|vu

uwvxy|ecirciumlq|wvzoyecirciumlquzv|ecirciumlquovuozecirciumlqxvuoqecirciumlquyvwqecirciumlquuvuoecirciumlqu

xoxvw

uovquecirciumlquvwecirciumlquwvqecirciumlquovuouecirciumlqxvyxqecirciumlquyvuzoecirciumlquxvwoecirciumlqu

xy|vx

xzvux|ecirciumlq|vxwecirciumlquwvx|ecirciumlquvx|oecirciumlquovoqoecirciumlquyvu|qecirciumlquccedilzvzoecirciumlq|ccediloquvz

xwvqqecirciumlq|vuuwecirciumlquwv|owecirciumlquvzuzecirciumlquov|wyecirciumlquyvxy|ecirciumlquccedil|vywuecirciumlq|

ccediluv|

yzvqecirciumlq|zv|w|ecirciumlquovq|qecirciumlqxovuxecirciumlqxvyecirciumlquyvzqecirciumlquuvqecirciumlqu

uyuvq

yovqyuecirciumlquv|quecirciumlquwv|ywecirciumlquovuyuecirciumlqxwvqxwecirciumlquyvxwoecirciumlquyvecirciumlqu

xwzvx

ovoooecirciumlquvzwwecirciumlquzvqzzecirciumlquvo|ecirciumlqxovuoecirciumlqxyvyouecirciumlquovecirciumlqxooqqvy

ovqoecirciumlquvqyoecirciumlquwvo|ecirciumlquvy|ecirciumlquovoecirciumlquyvuxecirciumlquccediluvxzecirciumlq|

ccediluvz

zvy|ecirciumlq|wvoqqecirciumlquzvqyecirciumlquov|uqecirciumlqxyvw|ecirciumlquyvxyecirciumlquuv||ecirciumlqu

uzovo

zvyzoecirciumlq|vqwecirciumlquwvoecirciumlquovqzwecirciumlqxuvuwwecirciumlquyvuzyecirciumlquvwqyecirciumlqu

yxvy

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

[Plots omitted. Panel titles: rdma (8mpi)-1, nordma (8mpi)-1, rdma (8mpi)-2, nordma (8mpi)-2.]

Fig D-1 Zccedil` MPI 8mpi

Table D-2 Icirca|ccedil`eacuteYacute~ 1 4mpi×8omp

q|vu|ecirciumlq|yvuqqecirciumlquyvuuecirciumlquyvy|ecirciumlqu|vuuxecirciumlq|yvuozecirciumlquovzzoecirciumlq

xvw

qvuuwecirciumlq|yvuqecirciumlquvouecirciumlquvqyecirciumlquyvqxqecirciumlq|yvuqecirciumlquccedilovoecirciumlq|

ccediloyv|

o|vuuecirciumlq|yvuq|ecirciumlquyvuecirciumlquyvwuwecirciumlquuv|ecirciumlq|yvuuecirciumlquovqquecirciumlq|

zv

oyvqzwecirciumlq|yvuqoecirciumlquvqooecirciumlquvqecirciumlquyvqyecirciumlq|yvuozecirciumlquovxxecirciumlq

vx

|vwuecirciumlq|yvuqoecirciumlquyv|qecirciumlquyvwxecirciumlquuv|uqecirciumlq|yvu|ecirciumlquovoecirciumlq|

|wv

yvqqwecirciumlq|yvuqyecirciumlquvqqecirciumlquvq|oecirciumlquyvoyecirciumlq|yvuouecirciumlquvuowecirciumlq

uvq

||vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvwxecirciumlquuv|yqecirciumlq|yvuoecirciumlquovooecirciumlq|

|yv

|vuzwecirciumlq|yvuqyecirciumlquvoxxecirciumlquvq|ecirciumlquyvoqyecirciumlq|yvuoecirciumlquccedilov|uecirciumlq|

ccediloyvx

|v|yuecirciumlq|yvuq|ecirciumlquyv|zecirciumlquyvw|oecirciumlquuvqzuecirciumlq|yvuecirciumlquzvo|ecirciumlq

vy

yvy|ecirciumlq|yvuquecirciumlquvqwqecirciumlquvqzecirciumlquyvoqecirciumlq|yvuozecirciumlquccedilxvo|uecirciumlq

ccedilyvy

q|vu|ecirciumlq|yvuqoecirciumlquyvu|ecirciumlquyvyyyecirciumlquvx|qecirciumlq|yvuo|ecirciumlquccedilvoqecirciumlq

ccedilvx

qvu||ecirciumlq|yvuquecirciumlquvouwecirciumlquvq|ecirciumlquyvoxecirciumlq|yvuecirciumlquccedilovoquecirciumlq|

ccedilouvz

ovxouecirciumlq|yvuqecirciumlquyvyxzecirciumlquyvyyecirciumlquvu|ecirciumlq|yvuxecirciumlquwvuwwecirciumlqo

|vu

oyvozecirciumlq|yvuq|ecirciumlquvqoyecirciumlquvq|ecirciumlquyvoecirciumlq|yvuxecirciumlquvooqecirciumlq

|vu

|vyoqecirciumlq|yvuoyecirciumlquyvecirciumlquyvxyecirciumlqu|v|xwecirciumlq|yvuqecirciumlquccedilvozecirciumlq

ccedilxvz

xvzzyecirciumlq|yv|wzecirciumlquyvzwzecirciumlquvq|oecirciumlquyvouecirciumlq|yvuoyecirciumlquuvoxwecirciumlq

yvz

||vyecirciumlq|yvuqoecirciumlquyvy|ecirciumlquyvxecirciumlqu|v|xxecirciumlq|yvuoecirciumlquccedilxvwwzecirciumlqo

ccedilovy

|vuzuecirciumlq|yvuquecirciumlquvox|ecirciumlquvq||ecirciumlquyvo|ecirciumlq|yvuozecirciumlquccedilovq|ecirciumlq|

ccediloyvo

|vzecirciumlq|yvuqyecirciumlquyv|xecirciumlquyvooecirciumlquvzoyecirciumlq|yvuqecirciumlquccedilv|zxecirciumlq

ccedilyv

yvy|ecirciumlq|yvuqqecirciumlquvqecirciumlquvq|xecirciumlquyvouecirciumlq|yvuqecirciumlquccediluvqoecirciumlq

ccedilxvo

q|vozyecirciumlq|yv|zzecirciumlquyvowecirciumlquyvyzecirciumlqu|vxu|ecirciumlq|yvuouecirciumlquxvq|wecirciumlq

oxvw

qv|wyecirciumlq|yvuqxecirciumlquvouuecirciumlquvquuecirciumlquyvoqyecirciumlq|yvu||ecirciumlquccedilovqqqecirciumlq|

ccedilo|vx

ouvoq|ecirciumlq|yv|zecirciumlquyvwqecirciumlquyvyxecirciumlquvuoecirciumlq|yvuoyecirciumlquccedilovxqoecirciumlq|

ccedil|yvy

ovuqzecirciumlq|yv|zoecirciumlquvo|ecirciumlquvq|zecirciumlquyvo|ecirciumlq|yvuowecirciumlquccedilzv|ecirciumlq

ccedilovy

uvoecirciumlq|yvuqqecirciumlquyvwo|ecirciumlquyvxoecirciumlqu|vzwecirciumlq|yvuoecirciumlquccedilyvowecirciumlq

ccediloxvq

vxozecirciumlq|yvuoecirciumlquvo|ecirciumlquvquuecirciumlquyvoyecirciumlq|yvuecirciumlquccedilovzzecirciumlq|

ccedilov|

||vuoqecirciumlq|yvuqxecirciumlquyvuecirciumlquyvywecirciumlqu|vuyuecirciumlq|yvuoecirciumlquvooqecirciumlq

yv

|vxoecirciumlq|yvuquecirciumlquvoxyecirciumlquvq||ecirciumlquyvqwyecirciumlq|yvuuecirciumlquccedilov||ecirciumlq|

ccediloyvu

|vqwecirciumlq|yvuqqecirciumlquyvoecirciumlquyv|yecirciumlqu|vozecirciumlq|yvuowecirciumlquccedil|vxooecirciumlq

ccedilvu

vuxwecirciumlq|yvuqxecirciumlquvoxoecirciumlquvquqecirciumlquyvoxxecirciumlq|yvuuecirciumlquccedilovooyecirciumlq|

ccedilouvz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

jetimesg

XpoundCcedil

Refg

Ccedil

efg

[Plots omitted. Panel titles: rdma (4mpi×8omp)-1, nordma (4mpi×8omp)-1.]

Fig D-2 Icirca|ccedil`eacuteYacute~ 1 4mpi×8omp (1)

[Plots omitted. Panel titles: rdma (4mpi×8omp)-2, nordma (4mpi×8omp)-2, rdma (4mpi×8omp)-3, nordma (4mpi×8omp)-3.]

Fig D-3 Icirca|ccedil`eacuteYacute~ 1 4mpi×8omp (2)

Table D-3 Icirca|ccedil`eacuteYacute~ 2 4mpi×8omp

q|v|oxecirciumlq|wvwz|ecirciumlq|ovoecirciumlquovou|ecirciumlquvu|qecirciumlq|zvqqecirciumlq|ccedilvxwecirciumlq

ccedil|vu

qvx||ecirciumlq|wvzquecirciumlq|ovyuuecirciumlquovyuyecirciumlquvxqoecirciumlq|wvzyecirciumlq|vy|ecirciumlqo

qv|

ovyecirciumlq|wvwzqecirciumlq|ovoyxecirciumlquovouuecirciumlquvuwecirciumlq|wvzyuecirciumlq|ccedilvoqqecirciumlq

ccedilvy

ovxqoecirciumlq|wvwzecirciumlq|ovyuqecirciumlquovy|yecirciumlquvuoqecirciumlq|wvzxqecirciumlq|ccedil|vzoecirciumlqo

ccedilqvx

|vywoecirciumlq|wvz|ecirciumlq|ovyoecirciumlquov|ecirciumlqu|v|wyecirciumlq|wvz|oecirciumlq|ccedilvzxzecirciumlq

ccedilwvq

v|zoecirciumlq|wvwzyecirciumlq|ovyzecirciumlquovuzyecirciumlquxvzzwecirciumlq|wvzxwecirciumlq|ccedilov||oecirciumlq|

ccedilowvq

||vywecirciumlq|wvwwzecirciumlq|ovxecirciumlquov||ecirciumlqu|v|xwecirciumlq|wvzoecirciumlq|ccedilv|zecirciumlq

ccedilyvx

|vuooecirciumlq|wvzqwecirciumlq|ovy|ecirciumlquovxqxecirciumlquyvqzecirciumlq|wvz|ecirciumlq|ccedilovyecirciumlq|

ccedilovo

|v|xzecirciumlq|wvzqoecirciumlq|ovyecirciumlquovowwecirciumlquvzo|ecirciumlq|wvzyecirciumlq|ccedil|vzzecirciumlq

ccediloovu

vuxzecirciumlq|wvzqoecirciumlq|ovy|yecirciumlquovxoecirciumlquyvuecirciumlq|wvzyoecirciumlq|ccedilyvxuecirciumlq

ccedilwvw

q|v|xzecirciumlq|wvwwoecirciumlq|ovuecirciumlquovxqecirciumlqu|vxxuecirciumlq|wvzuuecirciumlq|vxwecirciumlq

v

qvuxecirciumlq|wvzqyecirciumlq|ovy||ecirciumlquovuzwecirciumlquyvqqecirciumlq|wvzy|ecirciumlq|ccedilov|uwecirciumlq|

ccedilowv

ovuwqecirciumlq|wvwzecirciumlq|ovo|yecirciumlquovo|zecirciumlquvuuqecirciumlq|wvzuyecirciumlq|vowecirciumlqo

ovo

oyvoozecirciumlq|wvz|qecirciumlq|ovxqxecirciumlquovuzecirciumlquyvqoecirciumlq|wvzxzecirciumlq|ccedilvzyecirciumlqo

ccedilov|

|v||zecirciumlq|wvzqzecirciumlq|ovxecirciumlquov|oecirciumlqu|v|oxecirciumlq|wvzzyecirciumlq|yv|owecirciumlqo

ovz

xvzyuecirciumlq|wvzqoecirciumlq|ovuwecirciumlquovxqxecirciumlquyvqxecirciumlq|wvzuecirciumlq|ovwuoecirciumlq

|vo

||v|xzecirciumlq|wvwzoecirciumlq|ovxecirciumlquovuzecirciumlqu|vxozecirciumlq|wvzqecirciumlq|v|zoecirciumlq

vo

|vuwxecirciumlq|wvzqyecirciumlq|ovy|zecirciumlquovxquecirciumlquyvqwoecirciumlq|wvzxzecirciumlq|ccedilov|xoecirciumlq|

ccedilowvo

|vo|uecirciumlq|wvwzqecirciumlq|ovqecirciumlquovoecirciumlqu|vqecirciumlq|wvzyuecirciumlq|ovuyzecirciumlq

uvu

yvuwecirciumlq|wvzooecirciumlq|ovxyyecirciumlquovxqoecirciumlquyvquecirciumlq|wvzyuecirciumlq|ccedilyvuw|ecirciumlq

ccedilwvy

q|v|oyecirciumlq|wvwzqecirciumlq|ovoecirciumlquov|yecirciumlqu|v|yzecirciumlq|wvzwzecirciumlq|ovxoyecirciumlq

uvy

qxvzyyecirciumlq|wvzq|ecirciumlq|ovuwecirciumlquovy|wecirciumlquvu|qecirciumlq|wvzx|ecirciumlq|ovxouecirciumlq|

xvu

ovxyecirciumlq|wvwqecirciumlq|ovoy|ecirciumlquov|oecirciumlquuvuuecirciumlq|wvzyecirciumlq|ovxwqecirciumlq|

xv|

oyvqqecirciumlq|wvzqecirciumlq|ovuzecirciumlquovy|wecirciumlquvuyecirciumlq|wvzuzecirciumlq|ovux|ecirciumlq|

uvo

|vyecirciumlq|wvwzqecirciumlq|ovxecirciumlquov|zecirciumlquuv||ecirciumlq|wvzxuecirciumlq|vowyecirciumlq

ozvx

vuuuecirciumlq|wvzuuecirciumlq|ovy|zecirciumlquovuwzecirciumlquxvzxzecirciumlq|wvz|yecirciumlq|ccedilovuz|ecirciumlq|

ccedilqvo

||vywecirciumlq|wvww|ecirciumlq|ovxyecirciumlquov|zecirciumlquuv|owecirciumlq|wvzoecirciumlq|vwoecirciumlq

ozvw

|vuqqecirciumlq|wvzqwecirciumlq|ovy|oecirciumlquovxqqecirciumlquyvquqecirciumlq|wvzxwecirciumlq|ccedilov|oqecirciumlq|

ccedilov

|v|xecirciumlq|wvww|ecirciumlq|ovuecirciumlquov|q|ecirciumlquuvqyyecirciumlq|wvzyzecirciumlq|vzuyecirciumlq

xv|

yvqwecirciumlq|wvzouecirciumlq|ovxyecirciumlquovxyyecirciumlquyvouecirciumlq|wvzuzecirciumlq|uvqzxecirciumlqo

vz

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

efg

+Ccedil

efg

jetimesg

XpoundCcedil

Refg

Ccedil

efg

XpoundCcedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyD|

sup2sup3plusmnshyD|

Ccedil

Refg

Ccedil

efg

Ccedil

efg

+

efg

e+g

efg

efg

+Ccedil

efg

jetimesg

sup2sup3plusmnshyDo

sup2sup3plusmnshyDo

sup2sup3plusmnshyD

sup2sup3plusmnshyD

Xpound

XpoundCcedil

Refg

Ccedil

efg

[Plots omitted. Panel titles: rdma (4mpi×8omp)-1, nordma (4mpi×8omp)-1.]

Fig D-4 Icirca|ccedil`eacuteYacute~ 2 4mpi×8omp (1)

[Plots omitted. Panel titles: rdma (4mpi×8omp)-2, nordma (4mpi×8omp)-2, rdma (4mpi×8omp)-3, nordma (4mpi×8omp)-3.]

Fig D-5 Icirca|ccedil`eacuteYacute~ 2 4mpi×8omp (2)

E 1oacute 1

Fig E-1 QRfeaZQR MPIbrvbar

nmb = 128 intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39305 sec Ave Speed = 32566145 MBs
RANK = 2 TIME = 39197 sec Ave Speed = 32655168 MBs
RANK = 3 TIME = 39212 sec Ave Speed = 32643219 MBs
RANK = 4 TIME = 34341 sec Ave Speed = 37272765 MBs
RANK = 5 TIME = 34175 sec Ave Speed = 37453885 MBs
RANK = 6 TIME = 34086 sec Ave Speed = 37552470 MBs
RANK = 7 TIME = 34329 sec Ave Speed = 37286226 MBs
RANK = 8 TIME = 24021 sec Ave Speed = 53287420 MBs
RANK = 9 TIME = 23794 sec Ave Speed = 53794939 MBs
RANK = 10 TIME = 23805 sec Ave Speed = 53770236 MBs
RANK = 11 TIME = 23798 sec Ave Speed = 53786914 MBs
RANK = 12 TIME = 23942 sec Ave Speed = 53462490 MBs
RANK = 13 TIME = 23931 sec Ave Speed = 53488046 MBs
RANK = 14 TIME = 23950 sec Ave Speed = 53443826 MBs
RANK = 15 TIME = 24066 sec Ave Speed = 53186052 MBs
RANK = 16 TIME = 23798 sec Ave Speed = 53786327 MBs
RANK = 17 TIME = 23802 sec Ave Speed = 53775886 MBs
RANK = 18 TIME = 23821 sec Ave Speed = 53733399 MBs
RANK = 19 TIME = 23794 sec Ave Speed = 53795866 MBs
RANK = 20 TIME = 23939 sec Ave Speed = 53468837 MBs
RANK = 21 TIME = 23938 sec Ave Speed = 53470775 MBs
RANK = 22 TIME = 23939 sec Ave Speed = 53470333 MBs
RANK = 23 TIME = 23942 sec Ave Speed = 53462889 MBs
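Each RANK line in this output pairs an elapsed time with an average speed for the 128MB transfer from rank 0, and the two quantities are tied together by speed = data size / elapsed time. The sketch below shows only that relation; the function name `average_speed` and the elapsed time in the example call are hypothetical and are not taken from the measured output.

```python
def average_speed(size_mb, elapsed_sec):
    """Average transfer speed in MB/s for size_mb megabytes sent in elapsed_sec."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed time must be positive")
    return size_mb / elapsed_sec


# Hypothetical timing, chosen only to illustrate the formula.
print(average_speed(128, 0.5))
```

Since the message size is fixed at 128MB for every rank, a shorter time on one rank translates directly into a proportionally higher average speed on that rank.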

Fig E-2 afeaZQR MPIbrvbar

nmb = 128 intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes
all time
RANK = 1 TIME = 39370 sec Ave Speed = 32511775 MBs
RANK = 2 TIME = 39475 sec Ave Speed = 32425231 MBs
RANK = 3 TIME = 39415 sec Ave Speed = 32474846 MBs
RANK = 4 TIME = 34392 sec Ave Speed = 37217925 MBs
RANK = 5 TIME = 34543 sec Ave Speed = 37055373 MBs
RANK = 6 TIME = 34250 sec Ave Speed = 37372742 MBs
RANK = 7 TIME = 34709 sec Ave Speed = 36878286 MBs
RANK = 8 TIME = 26967 sec Ave Speed = 47465876 MBs
RANK = 9 TIME = 26964 sec Ave Speed = 47471018 MBs
RANK = 10 TIME = 26955 sec Ave Speed = 47486289 MBs
RANK = 11 TIME = 26947 sec Ave Speed = 47499771 MBs
RANK = 12 TIME = 26957 sec Ave Speed = 47483647 MBs
RANK = 13 TIME = 27020 sec Ave Speed = 47373054 MBs
RANK = 14 TIME = 26948 sec Ave Speed = 47498023 MBs
RANK = 15 TIME = 27174 sec Ave Speed = 47104542 MBs
RANK = 16 TIME = 26956 sec Ave Speed = 47485054 MBs
RANK = 17 TIME = 26945 sec Ave Speed = 47504899 MBs
RANK = 18 TIME = 26946 sec Ave Speed = 47502360 MBs
RANK = 19 TIME = 26967 sec Ave Speed = 47466158 MBs
RANK = 20 TIME = 26948 sec Ave Speed = 47499767 MBs
RANK = 21 TIME = 26982 sec Ave Speed = 47438204 MBs
RANK = 22 TIME = 27087 sec Ave Speed = 47254830 MBs
RANK = 23 TIME = 27004 sec Ave Speed = 47399960 MBs


Fig E-3 QRfeaZOpenMPIbrvbar

nmb = 128
intcpu = 1
Number of CPUs = 24
------------------------------------------------
CPU(rank0) -> every 1 CPUs 128MB data communication
------------------------------------------------
The values are correct at all processes                     NODE    CPU   coreid
Rank =  0                                                   bx2050  CPU0  0
all time
RANK =  1  TIME = 3.8110 sec  Ave Speed = 3358.7213 MB/s    bx2050  CPU1  1
RANK =  2  TIME = 2.8656 sec  Ave Speed = 4466.7752 MB/s    bx2050  CPU0  2
RANK =  3  TIME = 3.7959 sec  Ave Speed = 3372.0175 MB/s    bx2050  CPU1  3
RANK =  4  TIME = 2.8658 sec  Ave Speed = 4466.4664 MB/s    bx2050  CPU0  4
RANK =  5  TIME = 3.7829 sec  Ave Speed = 3383.6073 MB/s    bx2050  CPU1  5
RANK =  6  TIME = 2.8641 sec  Ave Speed = 4469.1880 MB/s    bx2050  CPU0  6
RANK =  7  TIME = 3.7892 sec  Ave Speed = 3377.9924 MB/s    bx2050  CPU1  7
RANK =  8  TIME = 3.5140 sec  Ave Speed = 3642.5620 MB/s    bx2051  CPU0  0
RANK =  9  TIME = 3.5017 sec  Ave Speed = 3655.4163 MB/s    bx2051  CPU1  1
RANK = 10  TIME = 3.8916 sec  Ave Speed = 3289.1010 MB/s    bx2051  CPU0  2
RANK = 11  TIME = 4.5416 sec  Ave Speed = 2818.4186 MB/s    bx2051  CPU1  3
RANK = 12  TIME = 3.5172 sec  Ave Speed = 3639.2511 MB/s    bx2051  CPU0  4
RANK = 13  TIME = 3.5053 sec  Ave Speed = 3651.6468 MB/s    bx2051  CPU1  5
RANK = 14  TIME = 3.5200 sec  Ave Speed = 3636.3472 MB/s    bx2051  CPU0  6
RANK = 15  TIME = 3.5009 sec  Ave Speed = 3656.1987 MB/s    bx2051  CPU1  7
RANK = 16  TIME = 2.2996 sec  Ave Speed = 5566.1441 MB/s    bx2052  CPU0  0
RANK = 17  TIME = 3.5587 sec  Ave Speed = 3596.8575 MB/s    bx2052  CPU1  1
RANK = 18  TIME = 3.5587 sec  Ave Speed = 3596.8151 MB/s    bx2052  CPU0  2
RANK = 19  TIME = 3.5712 sec  Ave Speed = 3584.2525 MB/s    bx2052  CPU1  3
RANK = 20  TIME = 2.2996 sec  Ave Speed = 5566.1978 MB/s    bx2052  CPU0  4
RANK = 21  TIME = 3.6856 sec  Ave Speed = 3472.9458 MB/s    bx2052  CPU1  5
RANK = 22  TIME = 2.2994 sec  Ave Speed = 5566.6382 MB/s    bx2052  CPU0  6
RANK = 23  TIME = 2.2991 sec  Ave Speed = 5567.3257 MB/s    bx2052  CPU1  7
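Because this listing tags every rank with its node, CPU socket and core id, the per-node average speed can be extracted mechanically. The following is a small Python sketch, not part of the original report. The four sample lines are taken from the listing with decimal points assumed, and the regular expression and the function name `mean_speed_by_node` are illustrative assumptions about the line layout.

```python
import re
from collections import defaultdict

# Sample lines from the listing (decimal points assumed).
SAMPLE = """\
RANK = 1 TIME = 3.8110 sec Ave Speed = 3358.7213 MB/s bx2050 CPU1 1
RANK = 2 TIME = 2.8656 sec Ave Speed = 4466.7752 MB/s bx2050 CPU0 2
RANK = 8 TIME = 3.5140 sec Ave Speed = 3642.5620 MB/s bx2051 CPU0 0
RANK = 16 TIME = 2.2996 sec Ave Speed = 5566.1441 MB/s bx2052 CPU0 0
"""

# rank, time [s], speed [MB/s], node name, CPU socket, core id
LINE = re.compile(
    r"RANK\s*=\s*(\d+)\s+TIME\s*=\s*([\d.]+)\s+sec\s+"
    r"Ave Speed\s*=\s*([\d.]+)\s+MB/s\s+(\S+)\s+(\S+)\s+(\d+)"
)


def mean_speed_by_node(text):
    """Return {node name: mean Ave Speed in MB/s} for all matching lines."""
    speeds = defaultdict(list)
    for match in LINE.finditer(text):
        speeds[match.group(4)].append(float(match.group(3)))
    return {node: sum(values) / len(values) for node, values in speeds.items()}


for node, mean in mean_speed_by_node(SAMPLE).items():
    print(f"{node}: {mean:.1f} MB/s")
```

Grouping by `match.group(5)` instead of `match.group(4)` would give the average per CPU socket rather than per node.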


F |eacuteIuml -microX|eacuteIuml^_Z)JCQRfeaZW Intel feaZegrave

pound5poundDqQRfeaZgt|YatildeGfeaUuml^13~

WCeumlUKntstiacuteordflaquoqmnaringaeligccedil13heumlX store amp5GUuml^13~gtlaquoq2csup1egraveBordfOcirc$ storegtszligordflaquoqiAgraveIntelmnntildeGfeaUuml^13~centsup3CXqCcCcedilEgravegtacircOslash5Cagraveaacute5pound

Dq WCIntelfeaZmUuml^13~ordfccdQRfeaZ

UKntst egraveBCDoumlWordfqiAgraveUKntst ordfQRfeaZEGmWordfethCDq

F1 Agraveograve o^_UgraveDo7ccedilIumlgt 128MB copyIgrave0]lyacuteOslash5GZccedil` MPI

OpenMP^_Z)JCDqm0]lyacute 100plusmnatildesectaumlC5mnccC|eacuteIuml (MBs)iquestHq^_ZiIacute[oumlYacutebrvbarsectGqOslash5IumlWCMPI gt 12348 ^_OpenMP gt 123487ccedilIumlgtOslash5UgraveD IFyacute1EC1CPUcd2^_UgraveD7ccedilIuml 2laquonotgtOslash55poundEacuteDqQRfeaZUKntst CDoumlWCXouml 2sectCDqIntelucircfeagt5poundDq


Fig F-1 ^_ZugraveuacuteLFlat MPIP

(main)
      n = nmb*1024*1024/8            ! nmb = 128 (from input data)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      nsize = n                      ! size
c     NNN : Repetition number of times
      NNN = 100
      do 100 i = 1, nsize
        a(i) = 1.0d0
        b(i) = 0.0d0
  100 continue
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      ttime1 = MPI_WTIME()
      do 120 II = 1, NNN
        call atob(a,b,n)
  120 continue
      ttime2 = MPI_WTIME()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
        b(i) = a(i)
      end do
      return
      end
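The measurement pattern in this excerpt is: fill a source array, copy it to a destination array a fixed number of times between two timer calls, then divide the bytes copied per repetition by the time per repetition. The sketch below restates that pattern in Python for illustration only. It is not the benchmark used in the report: the function name `copy_bandwidth_mb_s`, the default sizes and the use of `bytearray` are assumptions, and the copy is done by a single slice assignment rather than an element-by-element loop, so its absolute numbers are not comparable with the Fortran results.

```python
import time


def copy_bandwidth_mb_s(size_mb=8, repetitions=10):
    """Time `repetitions` copies of a `size_mb` MB buffer and return MB/s."""
    nbytes = size_mb * 1024 * 1024
    a = bytearray(b"\x01") * nbytes   # source buffer, filled with a nonzero byte
    b = bytearray(nbytes)             # destination buffer, zero-initialised
    start = time.perf_counter()
    for _ in range(repetitions):
        b[:] = a                      # plays the role of the atob copy loop
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise RuntimeError("timer resolution too coarse for this buffer size")
    # bytes per repetition / seconds per repetition, converted to MB/s
    return (nbytes / (elapsed / repetitions)) / 1024 / 1024


print(f"{copy_bandwidth_mb_s():.0f} MB/s")
```

Dividing by the time per repetition rather than the total time is what makes the result independent of the repetition count, which is the same normalisation as `timew/dfloat(NNN)` in the excerpt.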


Fig F-2 ^_ZugraveuacuteLOpenMPP

(main)
!$OMP threadprivate(iout,ompid,ompnum,a,b,ierr1,ierr2)
      n = nmb*1024*1024/8            ! nmb = 128 (from input data)
!$OMP parallel shared(n)
      allocate(a(n),stat=ierr1)
      allocate(b(n),stat=ierr2)
      if(ierr1.ne.0.or.ierr2.ne.0) then
        write(6,*) 'Array allocation failed'
        stop
      end if
!$OMP end parallel
      nsize = n                      ! size
c     NNN : Repetition number of times
      NNN = 100
!$OMP parallel private(i),shared(nsize)
      do 100 i = 1, nsize
        a(i) = 1.0d0
        b(i) = 0.0d0
  100 continue
!$OMP end parallel
!$OMP parallel private(II,ttime1,ttime2,timew,ave_spd)
!$OMP& shared(NNN,n)
      ttime1 = omp_get_wtime()
      do 120 II = 1, NNN
        call atob(a,b,n)
  120 continue
      ttime2 = omp_get_wtime()
      timew = ttime2 - ttime1
      ave_spd = (dfloat(nsize)*8.d0)/(timew/dfloat(NNN))
     &          /1024.d0/1024.d0

      subroutine atob(a,b,n)
      implicit none
      integer(4) i,n
      real(8) a(n),b(n)
      do i = 1, n
        b(i) = a(i)
      end do
      return
      end


F2 mnUgravegttuDOslash5CcedilEgraveOslash5Iumlgt[oumlYacutebrvbarsect 4gtGqmn

d QRUKntstlaquooumlWoumlWr 2 2^_laquonotgtUKntstlaquooumlWoumlWr 3KntstCc 1ccedil8^_gtQRW IntelWr QR(-Kntst)W Intel2 2^_Wr CqOMP Intel EcircHGu MPI Wiacute

XWXpoundDDHyzGqmndbrvbarsect[raquozfrac14frac12CDq (1) MPI^_ZOpenMP^_ZWQRfeaZgtUKntstordflaquooumlW

oumlgtordfaacuteCq (2) IntelfeaZUKntstCDQRfeaZWbAEligq (3) MPI^_ZOpenMP^_ZW 1CPUSgto coremacrdnXq (4) UKntstordfQRfeaZOslash5thornlt]G(1plusmnW 2plusmn

Oslash5ordfaEacuteD)q ^_ZbrvbarPXOcirc$sup1yBordfAgraveXCc store ampordfCXoumlQR

feaZUKntst Uuml^13~CXnordfyacuteordfdXqOcirc$sup1yBordfcAgraveXcpoundecircsectCXoumlmUuml^13~CagraveaacuteGaeliguordflaquoWnq

IntelfeaZdegIacuteQRUKntstlaquosectoumlWGmWordfgtecircq(XETHNtildefeaOslash5|YatildeG4X

yen5oslashIacuteD|Yordf9WXpoundbrvbarPmacrltdnDq


Table F-1 |eacuteIumlWeumlUKntstiacuteUuml^13~

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqvo uuuwvq u|xvq qvzu

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |o|v wq|vy xxwvu qvXpoundo |xwxvz |xwvq |xw|vz Xpoundo |xvy wovw xov qv

|xyvu |xuvo |xxv| |ozvu woqv xyxvq qv

Xpoundq z|vx zvz z|v Xpoundq ouuwvz ozwov ooxv| qvyoXpoundo woqv wqxvo wqvy Xpoundo oux|vu ozwv| oqvu qvyoXpound wqzv| wqvo wqxv Xpound oux|v| ozzqvo oov qvyo

wquv| wqqvo wqv ouxovz ozwyvu oozvo qvyo

Xpoundq o||vo o|xvq o|uvq Xpoundq oqoxvz oxquvu oyqvo qvxzXpoundo ouv ouov ouov Xpoundo oqovw oxoqvo oy|vz qvxzXpound ouov ouyv| ouuvq Xpound oqowvz oxqzv oyuv| qvxzXpound| ouvy ouuvw ou|v Xpound| oqowvz oxqzv oyuvq qvxz

o|zvz ouovw ouqvz oqovz oxqwv| oy|vo qvxz

Xpoundq qwyvq oqyvw qzyvu Xpoundq ouxvq ouwuvy ouywv| qvqXpoundo qz|v ooyvy oqxv Xpoundo ouxvz ouwuv ouovq qvqXpound qzuvy oovq oqxvw Xpound ouxzv ouz|v ouyv qvqXpound| qzxvy ooxvu oqxvx Xpound| ouxuvw ouzv ou|vx qvqXpoundu qwzv| o||v ooovx Xpoundu ouqxvo ouuv| ouyv qvyXpoundx qwzvu o|vw oo|vy Xpoundx ouq|vo ouuxv ouuv qvyXpoundy oqvy o|xvu oovx Xpoundy ouquvw ouuyvq ouxvu qvyXpound ooqvu o|yvo o|v Xpound ouqxvx ouuyv ouyvo qvy

qzxvw ouvw ooqv| ou|qv| ouyvu ouuwvz qvyw

usup3deg

wsup3deg wsup3deg

usup3deg

|sup3deg

sup3deg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

sup3deg

|sup3deg

Table F-2 |eacuteIumlLiacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq ywxvu xyuvw yxvo qvuXpoundo |xywv |xwvx |x|vy Xpoundo yzwvz xvu y|wvo qvuXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz yzvo xovo y|ovy qvu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq yqv xywvy yozvy qvuXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound ywyvz xwv y|uvw qvuXpound| Xpound|

|xyov| |xovu |xyyv| ywvw xxvy yv qvu

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq yoov xvu ywovw qvxXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| yxv qvo yzvy qvx

|xyv| |xqvo |xyyv yowv yov ywzv qvx

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo uyv| wqvo y|v qvuXpound |xyvz |xuvu |xovo Xpound uqvu wqyv| y|wv| qvuXpound| Xpound|

|xyyv| |xuvu |xqv| uywvw wqyv y|vw qvu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo uyxvy wqyv y|yvo qvuXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| uywvq wqxvz y|vq qvu

|xyuvw |x|vo |xywvz uyyvw wqyv| y|yvx qvu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound xq|v wovo yxvz qvuXpound| |xyxvu |x|v| |xyzv| Xpound| uzzvx wqyvu yxvz qvu

|xyuvw |xuvy |xyzv xqovy wqzv yxxvu qvu

ntildeR| ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

ontildeR

ontildeR|

qntildeRo

qntildeR

qntildeR|

qntildeRo

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg ReccedilograveregatildeNg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg


Table F-3 |eacuteIumlLQRafeaZrsacute 1ccedil8 ^_P

osup3deg Xpoundq uuzyvo uuzvx uuzyvw osup3deg Xpoundq uqxvw uqzvq uqvu ovqx

Xpoundq |xyvq |xyyv |xyyvy Xpoundq |ywqvx |ywovy |ywovo ovq|Xpoundo |xwxvz |xwvq |xw|vz Xpoundo |yzyvz |yzvo |yzvq ovq|

|xyvu |xuvo |xxv| |ywwv |ywzv| |ywzvq ovq|

Xpoundq z|vx zvz z|v Xpoundq zv| zxvq z|v ovqqXpoundo woqv wqxvo wqvy Xpoundo wqxvx wqwvy wqvo ovqqXpound wqzv| wqvo wqxv Xpound wquv| wqwv| wqyv| ovqq

wquv| wqqvo wqv wqqv wquvq wqv| ovqq

Xpoundq o||vo o|xvq o|uvq Xpoundq ov ouqv| o|ov| ovqqXpoundo ouv ouov ouov Xpoundo o|ovy ouwvw ouqv ovqqXpound ouov ouyv| ouuvq Xpound o|ovy oxqvo ouqvw ovqqXpound| ouvy ouuvw ou|v Xpound| o|qv ouzvo o|zvz ovqq

o|zvz ouovw ouqvz ozvq ouvo o|wvq ovqq

Xpoundq qwyvq oqyvw qzyvu Xpoundq qzv ooxvq oqyvo ovqqXpoundo qz|v ooyvy oqxv Xpoundo oquv oouv| oqzv ovqqXpound qzuvy oovq oqxvw Xpound oqxv| oxvx ooxvu ovqqXpound| qzxvy ooxvu oqxvx Xpound| oqxvw oxv ooxvx ovqqXpoundu qwzv| o||v ooovx Xpoundu oqv qzuv| oqvx qvzzXpoundx qwzvu o|vw oo|vy Xpoundx oov qzov oquvu qvzwXpoundy oqvy o|xvu oovx Xpoundy ooxvu qwzv oqvy qvzwXpound ooqvu o|yvo o|v Xpound oouvo qzqvo oqvo qvzw

qzxvw ouvw ooqv| ooqvq oqxv oqvz qvzz

wsup3deg

usup3deg

|sup3deg

sup3deg

13aumlecedilg

sup3deg

|sup3deg

usup3deg

eccedilograveregYacuteg

efrac12cedilregg

oefrac12cedilregg

wsup3deg

efrac12cedilregg

efrac12cedilregg

oefrac12cedilregg

efrac12cedilregg

Table F-4 |eacuteIumlLQRafeaZrsacute iacuteX^_P

Xpoundq |xxovu |xyqvz |xxyvo Xpoundq |quvy |quvx |quvx ovquXpoundo |xywv |xwvx |x|vy Xpoundo ||vx |ovw |vy ovquXpound XpoundXpound| Xpound|

|xyqvo |xyzv |xyuvz |ouvq |o|vo |o|vy ovqu

Xpoundq |xx|vy |xyvo |xxvw Xpoundq ||zzv |qv| |xxqvw ovqqXpoundo XpoundoXpound |xyzvq |xwqvw |xuvz Xpound |zwv| ||v |yovq ovqxXpound| Xpound|

|xyov| |xovu |xyyv| |xzwvw |o|vq |yxxvz ovq

Xpoundq |xxvo |xyqvz |xxyvx Xpoundq |quv |quvq |quvu ovquXpoundo XpoundoXpound XpoundXpound| |xvu |xzv |xxvw Xpound| |ovx |ovy |ovy ovqu

|xyv| |xqvo |xyyv |o|vo |ovw |o|vq ovqu

Xpoundq XpoundqXpoundo |xyuv |xuvu |xyzvx Xpoundo |oyvw |oyvu |oyvy ovquXpound |xyvz |xuvu |xovo Xpound |oyv |oyvq |oyvo ovquXpound| Xpound|

|xyyv| |xuvu |xqv| |oyvx |oyv |oyv| ovqu

Xpoundq XpoundqXpoundo |xyuv| |x|vy |xywvz Xpoundo |ov |oyvo |oyvz ovquXpound XpoundXpound| |xyxv |xvx |xywvz Xpound| |ouv| |ovx |o|vu ovqu

|xyuvw |x|vo |xywvz |oyvq |ouv| |oxvo ovqu

Xpoundq XpoundqXpoundo XpoundoXpound |xyuv |xxvw |xqvq Xpound |oyv |oxvz |oyvo ovquXpound| |xyxvu |x|v| |xyzv| Xpound| |oxvx |oyvw |oyvo ovqu

|xyuvw |xuvy |xyzv |oxvw |oyv| |oyvo ovqu

ntildeR|

ontildeR|

ontildeR

qntildeR|

qntildeR

qntildeRoqntildeRo

qntildeR

qntildeR|

ontildeR

ontildeR|

ntildeR|

oefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

13aumleograveregatildeNcedilograveregYacuteg

eccedilograveregYacuteg

Xpoundoefrac12cedilregg

efrac12cedilregg

efrac12cedilregg

Xpound


G Zccedil` MPIWIcirca|ccedil`gt MPI atilde 6MPI ETHNtildegtegraveBCD^_ZiIacute^regCZccedil`MPI

WIcirca|ccedil`gtOslash55poundDq

G1 Agraveograve MPI ETHNtildegtegraveBCD^_ZiIacute^regC1024 ^_Zccedil

` MPIW 128^_j87ccedilIumlIcirca|ccedil`128^_Zccedil` MPIW128 ^_j8 7ccedilIumlIcirca|ccedil`acircgtCc Ocirc$-a5gtrs5poundDqUgraveDRDMANORDMAIumlgt5poundDq )acircEacute coregtZccedil` MPIWIcirca|ccedil`rs acircEacute MPI^_gt

coreCCDouml rsWGq Ocirc$-a5n~n 8eacutea`ccedil1Keacutea`256Keacutea`ccedil1Meacutea`WCDq

G2 [oumlYacutebrvbarsectGqmbrvbarsect[raquozfrac14frac12CDq (1) 1024Zccedil` MPIW 128j8Icirca|ccedil`gtGu MPIatildegtIcirca|ccedil`

ordfFqUgraveDatildebrvbarpoundaacuteCCordflaquoq (2) 128Zccedil` MPIW 128j8Icirca|ccedil`gtplusmn Ocirc$-a5(256KB13yacute)

gtlaquoWGu MPIatildegtaeligIcirca|ccedil`ordfnotyW wXordfaacuteCCIcirca|ccedil`ordfZccedil` MPIbrvbarsectordfmWq

(2)gt ordfAgravecedilCXpoundegraveBG core ordfCXgtOslashf

IumlntildeHDoumlfIumlOslash5FCXWeumlnqmndbrvbarsect

BX900gtIcirca|ccedil`ordfnotygtlaquoqfIumlUTUacuteUcircbgtIcirca|ccedil`Ucirc5Gaeliguordflaquoq


Table G-1 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (1)

sup2~ w |y |o ovq| zx o| qv |vu vuqsup2~ oy |o |qz ovq| oqq o|x qv| |vow vwsup2~ u |ow |o ovqq zz o|x qv| |vo v|xsup2~ | |y |o ovqu zw o|x qv| |v|u v|sup2~ yu |o |oy ovqq zy o|z qvyz |v| vsup2~ ow ||u |o| ovq zw o| qv |vuq vwsup2~ xy ||u |o ovqx oqx o|z qvx |voz vsup2~ xo ||u |q ovqu oq o|w qv |vo| v|sup2~ ontildequ |uo |q ovq oqz ou| qvy |vo vu

sup2~ w owq qw qvw q oo qvy| vxy ovwxsup2~ oy owu o| qvw q oqw qvyy vyo ovzwsup2~ u owq qu qvww yw oqz qvy vyx ovwsup2~ | o| qy qvwu o oqw qvyy vu| ovzosup2~ yu o| qu qvwx o oqw qvyy vu| ovwzsup2~ ow owx qq qvz| oqz qvy vxy ovwusup2~ xy oz qo qvwz wy ooo qvw vqz ovwsup2~ xo ow qu qvw w oo qv vqx ovwsup2~ ontildequ ow ou qvwx z oq qv ovzw ovz

sup2~Oslashreg~plusmn w untildeqoy untilde|z ov| ontildeuw ontildeyz| qvu xzv|q xvusup2~Oslashreg~plusmn oy |ntildeyuw uqntilde ovwo ontildezw ontildeyuu qvz xyvx uvwsup2~Oslashreg~plusmn u untildeqxu |zntildezu| ovwx ontildexo ontildeyoq qvw xzvow uvwsup2~Oslashreg~plusmn | |ntilde|| uqntildeyq ovw ontilde|q ontildeyq qvwo xxvxy uvwysup2~Oslashreg~plusmn yu untildey uqntildeuuu ovwu ontildeww ontildeyuy qvw xvyu uvxsup2~Oslashreg~plusmn ow |ntildeqq uqntildezw| ovw ontildew ontildeyzx qv xzvxq uvowsup2~Oslashreg~plusmn xy untildeyyx uqntildeywz ovwu ontildeuxw ontildeyy qvw xovq uvuosup2~Oslashreg~plusmn xo untildeqz |zntildeoww ovwz ontildexq ontildeqx qvwz uwvyz vzzsup2~Oslashreg~plusmn ontildequ yntildexwx |ntildeuuz vqx ontildey ontildewx qvw uvo ozvzw

middotplusmnsup1 w |w | ovqq oqq o|y qv| |vw vuqmiddotplusmnsup1 oy |u |uo qvzx oqo o|y qvu |vq vxqmiddotplusmnsup1 u |uo |yq qvzx oqo ouq qv |v|w vxwmiddotplusmnsup1 | |xo |yw qvzy oq o|w qvu |vux vyymiddotplusmnsup1 yu xzo u|| ov|y oqx ouw qvo xvy| vzmiddotplusmnsup1 ow ntildez yqu uvz| ooy oxx qvx xvq |vzqmiddotplusmnsup1 xy xntilde z xv| o| oy qvz |zvyx xvwumiddotplusmnsup1 xo wntildeqoz ontildew uvxo oy oz ovo| |voy zvmiddotplusmnsup1 ontildequ oqntildezy |ntildexxo |vqz zo |w uvqz oovz ouvzx

middotplusmnsup1raquo w ntildey| oxntildequq qvo || oy qvu vz ovqomiddotplusmnsup1raquo oy ntildeuyw oxntildeqx qvoy ||x oq qvu v| ovoxmiddotplusmnsup1raquo u ntilde|y oxntildequq qvow |uo o| qvuw wvq| ovqzmiddotplusmnsup1raquo | ntildexw| oxntildeq|| qvo || ox qvu vyy ovqomiddotplusmnsup1raquo yu ntildeyyz oxntildeqqz qvow |xq yw qvuy vy ozvxumiddotplusmnsup1raquo ow |ntildeqqx oxntildeqy| qvq u|q uo qvxw yvzz qv|middotplusmnsup1raquo xy |ntilde|uy oxntildeoqu qv xyo y qvu xvzy ozvw|middotplusmnsup1raquo xo |ntilde|zq oxntildeoxq qv xzw woq qvu xvy owvomiddotplusmnsup1raquo ontildequ |ntildex| oxntildeuox qvu yy z|z qv xvxx oyvuo

Ugraveplusmnsup1 w ontildeuox yntildeyo| vy| u q ovq ovzq |vUgraveplusmnsup1 oy owntildeqz yntildewu vy |q qo ovou wv|q |uvoxUgraveplusmnsup1 u owntilde| yntildewzz v |qo q ovuy yvoy ||v|wUgraveplusmnsup1 | ozntildeqzy yntildezq v uz q ovq yvwq ||v|wUgraveplusmnsup1 yu ozntilde|wx yntildewx| vw| z oy ov| yxv|o |ovqUgraveplusmnsup1 ow qntildewo yntildez|y vz ||q o ovxq yovxq |ovuxUgraveplusmnsup1 xy qntilde|o ntildeqo vzq |wy y ovo xvx |ovqqUgraveplusmnsup1 xo ontildeo|u ntildeow| vzu |xy |u ovx| xzv| |qvyUgraveplusmnsup1 ontildequ ontildey|q ntilde|z vz| |zw ow| vow xuv|o uqvu

Ugraveplusmnsup1raquo w ozntildexw yntildez| vw| |z q ovow wvqu |uvuqUgraveplusmnsup1raquo oy ozntildexoz yntildezq vw| |q q ovou wuvyz |uvoUgraveplusmnsup1raquo u ozntildexqy yntildezu vwo u q ov| ov|q ||vyqUgraveplusmnsup1raquo | ozntildeuu yntildewz vwy o qy ov|o vwx ||vuqUgraveplusmnsup1raquo yu ozntildeqw yntildezq| vwx y oy ovu |vy |ovzy

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

sectWOcirceg

oqu$XAcircsup3deg owsup3degIcircwsup3+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g


Table G-2 MPIatilderaquo Lflat MPI and hybridacute8ccedil1024B) (2)

Ugraveplusmnsup1raquo ow qntildeqx yntildewz| |vqq |u q ovq xxv|| |ovzUgraveplusmnsup1raquo xy qntildew| ntildeq| vzy ||y y ovuz yvqx |ovoUgraveplusmnsup1raquo xo ontildew ntildexy vz| u|| |u ovwx uzvoo |qvzxUgraveplusmnsup1raquo oqu ontildeww| ntildeuqz vzx |oy owu ov yzv|q uqv||

~plusmn w yntildez zntildeuw qvo ontildeozu uux vyw xvwq ovzo~plusmn oy ntildeoqz zntildeu| qv| ontildeoy| uu vyq yvoo ovz~plusmn u ntildeq|z zntildeuy qv ontildeo uuy vy| yvqq ovwx~plusmn | ntildeoz zntildeyq qvu ontildequ uu vq xvzw ovwy~plusmn yu yntildezoq zntildey qvo ontildeoyu uu vyq xvzu ovw|~plusmn ow zntilde|qz zntildew| qvzx ntildeo uuy uvw uvz ovz~plusmn xy zntilde|wz zntildewqw qvzy ntildey uuy uvzz uv ovz~plusmn xo zntildeqx zntildewuq qvzu ntildeowx uuz uvw uvo ovzu~plusmn oqu zntildexz zntildezq| qvzz ntildeo uyo uv uvuz ovxq

~plusmnraquo w yntildewox zntilde| qvq ontildeuo uux vz xvuz ovw~plusmnraquo oy yntildezq zntildeu| qvo ontildeoz uu vyw xvw ovwo~plusmnraquo u yntildeyz| zntildeux qvyz ontildeoyw uuu vy| xv| ovz|~plusmnraquo | yntildewou zntildexx qvq ontildeu uux vx xvx ovzo~plusmnraquo yu yntildezx| zntildex qvo ontildeoyo uuu vyo xvzz ovzx~plusmnraquo ow zntildeuux zntildew qvz ntildeoxx uu| uvwy uv|w vqx~plusmnraquo xy zntilde|x zntildewqo qvzx ntildeq uuy uvzx uvu ovzw~plusmnraquo xo zntildeyw zntildewuy qvzw ntildeqy uxq uvzq uv|z ovwz~plusmnraquo oqu zntildeyxq zntildezoo qvz ntildeoxy uyq uvyw uvuw ovx|

plusmn w uy uo ovo zy oow qvw uvwq |vxqplusmn oy xq uw ovqy oqo oq qvwx xvqq |vzzplusmn u xwz x|| ovoq oqy o| qvwy xvx uv|uplusmn | oo yqy ovo ooq o|q qvwx yvuy uvywplusmn yu ntilde zoz vu o| ouo qvw owvqu yvxqplusmn ow ntildeww ntildeqqz ov|z oxo ox qvwy owvu oovxqplusmn xy untildeouy |ntildeoy ovo q| oz qvz| qvuy oyvzplusmn xo |zntildeo| |xntilde|u ovoo ontildeuox ontildeuy qvwo vyz qvowplusmn oqu uqntildeqwy |xntilde|w ovo ontildeuww ontildewqw qvw yvzu ozv

plusmnraquo w oqqntildexy yqntilde|y ovy ntildeuou wy vx uovyy ywvwzplusmnraquo oy oqntildey| yontildeq ovyw ntildeu|w ww| vy uvqz yzv|oplusmnraquo u oqntildexyw yontildeqz| ovyw ntilde|zy wz| vyw uvw ywvuqplusmnraquo | oq|ntildeqo yontildeo ovyw ntildeuqq www vq uvzx ywvzoplusmnraquo yu oq|ntildeqzw yontildeoou ovyz ntildeuyx wzx vx uovw| ywv|qplusmnraquo ow oq|ntildezo yontildeq| ovq |ntildexqx wz |vzo zvyx ywvoplusmnraquo xy oquntildeqy yontildeqq ovo |ntildexyw wzz |vz zvo yvwxplusmnraquo xo oqwntilde|zq yqntildewo ovw |ntildexyx wzx |vzw |qvuo yvzplusmnraquo oqu ooqntildeqq yontildeoxx ovwq |ntildeyu zo |vzz zvzx yyv|z

~plusmnreg w ooz o|o qvzq q x qvwq xvwo xvo~plusmnreg oy oo| ou qvzo y qvwx xvq uvwu~plusmnreg u oow o|o qvzo q x qvw xvz xvy~plusmnreg | oq ou qvz q x qvwq xvzu uvwz~plusmnreg yu oq ox qvzy q x qvw yvo uvzz~plusmnreg ow o oz qvzz y qvw xvy xvq|~plusmnreg xy ox o|o qvzx y qvzx uvwy uvw|~plusmnreg xo ox o|o qvzx | y qvww xv|x uvzx~plusmnreg oqu ou o|| qvz| y z qvz uvq uvy|

~plusmn w oontildewxu |ntildeyuy |vx yu ontildeoow qvx owvuy |vy~plusmn oy oontildewo |ntildeyq |vw y|z ontildeoox qvx owvuw |v|~plusmn u oontildey |ntildexw |vz yuo ontildeoox qvx owv| |vo~plusmn | oontilde| |ntildexzx |v y| ontildeo| qvx owvuz |vq~plusmn yu oontildew|o |ntildey |v y| ontildeo qvx owvxw |vo~plusmn ow ontildeqw |ntilde| |v| yy ontildeouq qvxx ozv|o |vw~plusmn xy ontilde |ntildewox |v zz ontildeoyw qvyw oxv| |v~plusmn xo ontildeuo |ntildezzo |voo wu ontilde qvyz ouvy |v~plusmn oqu ontildequ untilde|w vzq zu ontilde|u qvo o|vuz |v|o

sup2cedil~raquo w y o qvx o | qvu| xvoy uvusup2cedil~raquo oy z o qvwq o qvxx vzx xvuxsup2cedil~raquo u x oq qvuy o qvyo |v| uvuzsup2cedil~raquo | | oo qv o qvxu vuz uvzosup2cedil~raquo yu oo qvyo o qvuz xvyo uvxusup2cedil~raquo ow oo qvyw o qvxx yvqq uvwxsup2cedil~raquo xy y oo qvxy o qvy uvyq xvoosup2cedil~raquo xo y oo qvxy | qvy| |vq| |vusup2cedil~raquo oqu x o qv|z o qvx |vyz xvuq


Table G-3 MPIatilderaquo LMPI and hybrid acute256ccedil1024KBP

sup2~ yntildeouu |ntildewzz |ntildeuou ovou ontildezx ontildeqyw ovw| ovzz |vqsup2~ xuntildeww ntilde|o ntildeuo qvzz |ntildeqo |ntildey qvz vu vsup2~ wyntildeu| oqntildexx oontildeqyy qvz |ntildezwy untildeuyw qvwz vq vuwsup2~ ontildequwntildexy ountildexxy oxntildeuy qvzu untildezqq xntildewyw qvwu vz vyu

sup2~ yntildeouu zou zo qvzu ontildeqqq ontildeq qvzw qvzo qvzxsup2~ xuntildeww ontildeo ontildezy qvwz ontildew ntildeuuz qv| qvz qvzsup2~ wyntildeu| ntildexuo ntildey| qvz ntildeyzz |ntildeuqw qvz qvzu qvwosup2~ ontildequwntildexy |ntildex untildeqox qvzu |ntildexww untilde|| qvw ovqx qvz

sup2~Oslashreg~plusmn yntildeouu yntildeuw| wontildexoz qvzu xntildeuo |ntildeoqw ovoq |vqq |vx|sup2~Oslashreg~plusmn xuntildeww ou|ntildewx ountildeqo qvw| uyntildeqzq yyntildewxo qvyz |vo vyosup2~Oslashreg~plusmn wyntildeu| oontildezoo yzntildeuz qvz yntildeqz zxntildeox qvo |voy vw|sup2~Oslashreg~plusmn ontildequwntildexy w|ntildeyox |yontildeuq| qvw wntildexo o|ntildexqz qvo |v| vz|

middotplusmnsup1 yntildeouu |ntildezw ||ntilde|x| qvzz oontildexww oxntildeyzu qvu vwx vo|middotplusmnsup1 xuntildeww y|ntildex yyntildeyy qvzx ontildez|x untilde|x qvuy vww ovuomiddotplusmnsup1 wyntildeu| wzntildewy z|ntildeyux qvzy |qntildeyx yntildeyqu qvuy vz ov|zmiddotplusmnsup1 ontildequwntildexy oqntildeux| o|untildeqox qvzq uqntildeyxx wwntildez| qvuy vzy ovxo

middotplusmnsup1raquo yntildeouu yntildew|u ||ntildeu ovww ntildewxy oxntildeyuw uvyy qvwy voumiddotplusmnsup1raquo xuntildeww wxntildeoz yuntildeo| ov|| wxntildex|| untildeuy ovwq ovqq ov|xmiddotplusmnsup1raquo wyntildeu| ooxntildeuw z|ntildezyu ov| z|ntildeyow yntildexuz ov|z ov| ov|zmiddotplusmnsup1raquo ontildequwntildexy o|ntildexx o|untildeuqz ovq zyntildexoq wntildeuuy ovoq ovu| ovxu

Ugraveplusmnsup1 yntildeouu |zntildeyu| oqntildexwz |vu uontildeuyq ountildewy vz qvzy qvoUgraveplusmnsup1 xuntildeww untildewxx ntildeqo voy uzntildeoy| yntildewo ovw| qvz qvw|Ugraveplusmnsup1 wyntildeu| xxntildewoq zntildewyu ovw xyntildewuu |yntildexqx ovxy qvzw qvwUgraveplusmnsup1 ontildequwntildexy yntildewxy |ntildexxw ovy yuntildewzx uyntildeoq ovuo qvz qvwo

Ugraveplusmnsup1raquo yntildeouu uqntildeq|u oqntildeyxq |vy uontildey| ountildez|z vz qvzy qvoUgraveplusmnsup1raquo xuntildeww uwntilde|u ntildeqz vo uzntildew yntildewx ovw| qvzw qvw|Ugraveplusmnsup1raquo wyntildeu| xyntildezy zntildez|y ovww xntildeo| |yntildeuwq ovx qvzw qvwUgraveplusmnsup1raquo ontildequwntildexy y|ntildeo| |ntildeyxq ovyw yxntilde|yw uyntildeqx ovuo qvz qvwo

~plusmn yntildeouu untildeyw wntildezu uvo uuntildeo wntildeuq| xvx qvzy ovq~plusmn xuntildeww xqntildeqyx qntilde|xq vuy xontildexwy untildezux vq qvz qvw~plusmn wyntildeu| xwntildeq wntildeux vqy yqntildeox |untildeu ov| qvzw qvw~plusmn ontildequwntildexy yyntildequq |yntilde|q| ovw ywntildeqx uuntildeuyy ovx| qvz qvw

~plusmnraquo yntildeouu zntilde|q zntildeqq ovqu zntildexq wntildeu|w ovo| qvzw ovq~plusmnraquo xuntildeww oyntildeu| qntilde|wy qvwo oyntildeoz untildezuy qvyx ovq qvw~plusmnraquo wyntildeu| |ntildewo wntildeuxu qvw ntildewz |untilde qvyy ovq qvw~plusmnraquo ontildequwntildexy zntildezy |yntildexo| qvw zntildezxq uuntildeu qvy ovqq qvw

plusmn yntildeouu yxntildewu wzntildeq| qv| ontildeyoz yntildezz qvwq |vq |v|qplusmn xuntildeww ountildeqoz oxntilde|o qvwo |zntildexu xxntilde|ux qv |vo vxplusmn wyntildeu| ow|ntildeuwy untildeyu qvw xntildeoyu untildezu| qvy |vo |vqqplusmn ontildequwntildexy u|ntildewo zxntildezqo qvw xntildezy zuntildeoy qvwq |vo |vo

plusmnraquo yntildeouu x|ontildeyx wyntildeqoz yvow o|ntildeqox owntilde|xu vu |vww uvyzplusmnraquo xuntildeww yozntildeyz| oz|ntildeuwu |vq oyntildeyw ywntildeyw v| |vwo vwplusmnraquo wyntildeu| yw|ntildewxu wwntildeooz v| owyntilde|q| oqxntildeyw ovy |vy vplusmnraquo ontildequwntildexy |yntildezo uqxntildeoyq ovw oontildexo ouontildezzx ovuz |vuw vwx

~plusmnreg yntildeouu uzu y| qvw uwo xzw qvwq ovq| ovqy~plusmnreg xuntildeww zq| ontildeoz qvy wo| ontildexuy qvx| ovoo qv~plusmnreg wyntildeu| ontilde|q ontildeyq qvw ontildeoxo ntildeoqz qvxx ovo| qvz~plusmnreg ontildequwntildexy ontildezqu ntilde|ux qvwo ontildeuxu ntildeyx qvxu ov|o qvww

~plusmn yntildeouu ountildewqx ountildez|x qvzz ontildeqq owntildeuw| ovou qvq qvwo~plusmn xuntildeww zntildeuy zntildewxz qvzw uqntildex|o uwntildexyw qvw| qv qvyo~plusmn wyntildeu| uuntildeq uuntildeo qvzz xyntildeyoq yzntildexzw qvwo qvw qvyu~plusmn ontildequwntildexy qntildeqqx qntildeu|q qvzz ntildeo zqntilde|ux qvwo qvzy qvw

sup2cedil~raquo yntildeouu x w qvz zo ux vq| qvw ov|sup2cedil~raquo xuntildeww ox o| qv| o|w o| qvyx qvzo qvwosup2cedil~raquo wyntildeu| oo |o qvu owz w| qvy qvzo qvwsup2cedil~raquo ontildequwntildexy ou wz qvu | |x qvyx qvz qvwo

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+Cfgsup2sup3plusmn

+efgsup2sup3plusmn

+aumlesup2cedilsup2g

+aumlemplusmncedilegraveugravegsup2sup3plusmn

+aumlemplusmncedilegraveugravegsup2sup3plusmn

sectWOcirceg

ow$XAcircsup3deg owsup3degIcircwsup3


Page 47: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 48: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 49: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 50: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 51: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 52: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 53: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 54: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 55: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 56: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 57: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 58: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 59: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 60: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 61: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 62: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 63: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 64: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 65: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 66: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 67: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 68: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 69: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 70: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 71: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 72: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 73: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 74: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 75: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 76: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 77: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 78: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 79: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 80: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 81: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 82: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 83: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 84: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 85: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 86: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 87: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 88: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 89: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 90: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 91: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 92: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 93: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 94: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 95: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 96: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター
Page 97: JAEA- 原子力機構の新大型計算機システムにおけるJAEA-Testing JAEA-Testing 2011-005 Center for Computational Science & e-Systems システム計算科学センター