Description: the contents of my lightning talk at the 15th Sakura no Yube in Sapporo, slightly supplemented.
4th RICC workshopのご案内
An invitation to the 4th RICC workshop
第15回さくらの夕べ in 札幌 (the 15th Sakura no Yube in Sapporo)
柏崎 礼生 Hiroki Kashiwazaki
I came here from Katsuraoka-cho, Otaru city.
4th RICC workshop @Okinawa 2014/3/27(Thu)~28(Fri)
俵屋宗達「風神雷神図」(c. 1624?) Soutatsu Tawaraya: Fujin Raijin-zu (the Wind God and Thunder God screens)
RICC: 地域間インタークラウド分科会 (the regional inter-cloud working group)
雲内放電 "Inter Cloud Lightening" (intra-cloud lightning)
82 pages / 5 min
[Figure: TOYAMA, OSAKA, and TOKYO sites before and after migration, with "Copy to DR-sites" arrows]
Live migration of a VM between distributed areas: the real-time, active-active features appear to the hosts as just a simple "shared storage", so live migration is also possible between DR sites
(it requires a common subnet and a fat pipe for the memory copy, of course).
広域分散仮想化環境 Distcloud: a wide-area distributed virtualization environment
DR: Disaster Recovery
A short history of DR:
- 1978: Sun Information Systems offers a mainframe hot site
- '80s-'90s: real-time processing, POS (point of sales)
- '90s-'00s: the Internet
- 2001.9.11: September 11 attacks
- 2003.8.14: Northeast blackout of 2003
- In Japan, 2011.3.11: the aftermath of the 2011 Tohoku earthquake and tsunami
- BCP: Business Continuity Plan
群馬 Gunma prefecture
石狩 Ishikari city
「2つで十分ですよ?」 "Two is enough, isn't it?"
国立情報学研究所 National Institute of Informatics (NII)
北見工大 Kitami Institute of Technology
琉球大学 University of the Ryukyus
the longest path on SINET
XenServer 6.0.2 + CloudStack 4.0.0 (the same stack at each site)
Problems: a conventional shared storage assumes an RTT of roughly 50 ms, but between these sites the RTT exceeds 200 ms, so a 分散ストレージ distributed storage is needed.
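To see why such an RTT rules out naive synchronous sharing, here is a minimal back-of-the-envelope sketch (Python; the 50 ms and 200 ms figures come from the slide, the queue depth of 1 is an assumption):

```python
# Each synchronous write must reach the remote replica and come back
# before it is acknowledged, so one in-flight write completes per RTT.

def sync_write_iops(rtt_ms: float, queue_depth: int = 1) -> float:
    """Upper bound on acknowledged writes per second."""
    return queue_depth * 1000.0 / rtt_ms

for rtt_ms in (0.5, 50.0, 200.0):   # local disk-ish, tolerable, inter-site
    print(f"RTT {rtt_ms:6.1f} ms -> at most {sync_write_iops(rtt_ms):7.1f} writes/sec")
# RTT    0.5 ms -> at most  2000.0 writes/sec
# RTT   50.0 ms -> at most    20.0 writes/sec
# RTT  200.0 ms -> at most     5.0 writes/sec
```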
要求性能 Required performance:
[Figure: iozone throughput (KBytes/sec) as a function of file size and record size, both in 2^n KBytes]
High random R/W performance
POSIX-compliant interface protocols: NFS, CIFS, iSCSI
Global VM migration also becomes available when the VM host machines share the "storage space": real-time availability makes it possible, and the actual data copy follows.
(The VM operator needs a virtually common Ethernet segment and a fat pipe for the memory copy.)
[The migration figure again: TOYAMA, OSAKA, and TOKYO sites before and after migration; live migration between DR sites over the shared storage, given a common subnet and a fat pipe for the memory copy]
[Figure: a file is split into blocks; the Meta Data maps each block onto the backend (core servers) with a consistent hash; the storage is exported over NFS, CIFS, and iSCSI]
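A minimal sketch of the consistent-hash placement named above (Python; the ring layout, virtual-node count, and server names are assumptions, not the project's actual implementation):

```python
import bisect
import hashlib

def h(key: str) -> int:
    """Hash a key to a point on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, servers, vnodes=64):
        # Each server owns many virtual points, smoothing the distribution.
        self._points = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
        self._keys = [p for p, _ in self._points]

    def lookup(self, block_id: str) -> str:
        # A block belongs to the first server point at or after its hash.
        i = bisect.bisect(self._keys, h(block_id)) % len(self._points)
        return self._points[i][1]

ring = Ring(["core-osaka", "core-kanazawa", "core-hiroshima", "core-nii"])
print(ring.lookup("fileA/block-0"))  # a block always maps to the same server
```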
Write path with redundancy = 3: the client's write is acknowledged as soon as the first replica lands (the remaining-copy count r goes 2 → 1 → 0 as copies complete); failed copies raise an error count e instead (the figure shows states from r = 2, e = 0 down to r = -1, e = 2).
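One reading of that figure, as a hedged sketch (Python; the Replica class and its write() method are hypothetical stand-ins for the real storage nodes):

```python
class Replica:
    """Hypothetical stand-in for a storage node."""
    def __init__(self, name: str, healthy: bool = True):
        self.name, self.healthy = name, healthy

    def write(self, block: bytes) -> bool:
        return self.healthy  # a real node would persist the block here

def replicated_write(block: bytes, replicas, redundancy: int = 3):
    r, e, acked = redundancy, 0, False
    for target in replicas[:redundancy]:
        ok = target.write(block)
        r -= 1                      # one fewer copy outstanding
        if ok and not acked:
            acked = True            # the client unblocks here (r == 2)
        elif not ok:
            e += 1                  # failed copy; repaired asynchronously later
    return acked, r, e

print(replicated_write(b"blk", [Replica("a"), Replica("b"), Replica("c", healthy=False)]))
# (True, 0, 1): acknowledged early, one copy left to repair
```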
Testbed: Cisco UCS servers (1/4U server x4) with 10 Gbps external links, each running a hypervisor that hosts the VMs.
Sites: 大阪大学 Osaka University, 金沢大学 Kanazawa University, 広島大学 Hiroshima University, and 国立情報学研究所 NII.
Benchmark: iozone -aceI
- a: full automatic mode
- c: include close() in the timing calculations
- e: include flush (fsync, fflush) in the timing calculations
- I: use DIRECT I/O if possible for all file operations
[Figures: iozone write throughput (KBytes/sec) vs. file size and record size (2^n KBytes), and per-operation throughput (MB/sec) vs. file size (10MB to 10GB) for write, rewrite, read, reread, random read, random write, bkwd read, stride read, record rewrite, fwrite, and fread]
従来方式 conventional Exage/Storage vs. 広域対応 wide-area Exage/Storage
Each site (Hiroshima University, Kanazawa University, NII) reaches the EXAGE L3VPN over SINET4.
[Figure: throughput (MB/sec) of the proposed method vs. a shared NFS; read and write, before and after migration]
SC2013, 2013/11/17~22 @Colorado Convention Center
We have been developing a widely distributed cluster storage system and evaluating the storage along with various applications. The main advantage of our storage is its very fast random I/O performance, even though it provides a POSIX-compatible file system interface on top of the distributed cluster storage.
Contacts: Distcloud Project, E-mail: distcloud@ricc.itrc.net
• Long Distance: sharing data across geographically dispersed locations
• Multi-sites: replicating data over at least three different locations (see the sketch below)
• All Active: simultaneous access from multiple locations
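A minimal sketch of the "Multi-sites" property (Python; greedy placement with hypothetical server and site names, not the project's actual placement algorithm):

```python
SERVERS = [  # (server, location) - hypothetical inventory
    ("ucs1", "osaka"), ("ucs2", "osaka"),
    ("ucs3", "kanazawa"), ("ucs4", "hiroshima"), ("ucs5", "nii"),
]

def pick_replicas(servers, redundancy: int = 3):
    """Place each copy in a distinct location, so redundancy = 3 spans 3 sites."""
    chosen, used = [], set()
    for server, location in servers:
        if location not in used:
            chosen.append(server)
            used.add(location)
        if len(chosen) == redundancy:
            return chosen
    raise RuntimeError("fewer distinct locations than required copies")

print(pick_replicas(SERVERS))  # ['ucs1', 'ucs3', 'ucs4']
```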
We have successfully performed a long-distance live migration experiment: we migrated VMs using our storage without significant performance degradation of read/write operations.
[Figure labels: migrated to remote site / migrated to local site]
Fig. 1: Comparison of disk write performance during VM migration on our platform and on NFS. (The distance between the two sites is about 450 km and the RTT is about 18 ms.)
当初の予定 The original plan:
下條真司 Shinji Shimojo (@Osaka Univ / NICT JGN-X leader): 「面白くないよね!」 "That's just not interesting, is it!"
本番 The real run: RTT = 244 ms at 1 Gbps, looping back over the international circuit.
- migration over an international link
- access tests of the wide-area distributed storage over an international link
- verifying DR when a data center goes down
- this year: adding a site in the United States
Future Works: Big Data analysis
- behavior data from mobile devices (モバイルデバイスからの行動データ)
- data collected in non-electrified areas (電源非供給地域で収集されるデータ)
Today, mobile devices and sensor devices reach a distant personal data aggregation service: high latency, high power consumption.
With regional exchanges and regional data centers on the wide-area distributed platform, the personal data aggregation service moves close to the devices: low latency.
経路最適化 Route optimization
【今後の展開】 [Future directions] Toward improving the mobility of virtual machines: achieving route optimization for VM migration between sites.
Approaches to keeping a migrated VM reachable, by layer:
- L3: dynamically advertise or update the VM's route after migration.
- L2 extension: build a wide-area L2 network (VPLS, IEEE 802.1ad PB (Q-in-Q), IEEE 802.1ah (Mac-in-Mac)); scalability over a wide area is a concern.
- L2 over L3: tunnel Ethernet frames over IP (VXLAN, OTV, NVGRE).
- SDN: a controller rewrites forwarding state on migration (OpenFlow).
- ID/Locator separation: separate the node ID from its locator and update the locator on migration (LISP).
- IP mobility: mobility transparency within the IP layer (MAT, NEMO, MIP (Kagemusha)).
- L4: mSCTP switches paths using SCTP's multihoming; independent of L2/L3, but the transport is limited to SCTP.
- L7: DNS + Reverse NAT: dynamic path switching via Dynamic DNS; the VM keeps a private address and an outer gateway performs reverse NAT (sketched below); independent of L2/L3, but the global IP address changes and sessions are reset.
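A minimal sketch of that L7 approach (Python; assumes a BIND-style DNS server that accepts dynamic updates via nsupdate; the zone, name, and gateway address are hypothetical):

```python
import subprocess

def repoint_vm(fqdn: str, new_gateway_ip: str, ttl: int = 5) -> None:
    """After a VM migrates, repoint its name at the new site's gateway.

    The gateway reverse-NATs the public name to the VM's private address;
    a short TTL keeps the client-side switch-over fast.
    """
    script = "\n".join([
        f"update delete {fqdn} A",
        f"update add {fqdn} {ttl} A {new_gateway_ip}",
        "send",
    ])
    # nsupdate (from BIND) reads update commands on stdin.
    subprocess.run(["nsupdate"], input=script.encode(), check=True)

# e.g., after vm01 migrates to the Osaka site's gateway:
# repoint_vm("vm01.dc.example.net", "203.0.113.10")
```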
2011.3.11: the aftermath of the 2011 Tohoku earthquake and tsunami. Japan, Taiwan, Indonesia, New Zealand.
4th RICC workshop @Okinawa 2014/3/27(Thu)~28(Fri)
http://ricc.itrc.net
Cybermedia Center, Osaka University
Come to think of it, my term runs only until 2014/3/31, so…
I'm looking for my next job.
おあとがよろしいようで (and on that note, I'll close)
Go to next stage