Transcript
Page 1: Cloud Computing in Our Common Life (生活中的云计算) Zhenhua Li (李振华) Tsinghua University lizhenhua1983@gmail.com Dec. 21st, 2014

Cloud Computing in Our Common Life (生活中的云计算)

Zhenhua Li (李振华), Tsinghua University

lizhenhua1983@gmail.com, http://www.greenorbs.org/people/lzh/

Dec. 21st, 2014

The 8th International Workshop on IOT and Cloud Computing

Page 2:

Outline

① Huge Background
② Google Play Security
③ Cloud Storage Traffic
④ OpenStack Bottleneck (Intro)
⑤ ConflictBox System (Intro)
■ Short Summary

Page 3:

Cloud Computing in Industry

EC2, S3, SQS, RDS
GFS, BigTable, MapReduce
Blue Cloud (蓝云), Smarter Planet (智慧地球)
Azure, Office365
CloudServers, OpenStack
iCloud, iTunes

Page 4:

Cloud Computing in Academia

……

Page 5:

Money and Papers

Trillions invested!

Tens of thousands of papers!

Page 6:

With such enormous investment, has cloud computing greatly improved our daily lives?

Page 7:

② Google Play Security

A Measurement Study of Google Play

Nicolas Viennot, Edward Garcia, Jason Nieh

Columbia University

Page 8:

Android Dominates Market

Page 9:

Google Play for Android

ONLY Official Market for Android Apps

Page 10:

Is Google Play Really Secure?

Nicolas Viennot

Page 11:

Gmail Code Hacked!

Page 12:

Finding 1: Rating is Ridiculous

Where is Google’s Big Data Analytics?

Page 13:

Finding 2: Clone Apps are Pervasive

Clone apps are ALMOST malicious apps ~~

Page 14:

Finding 3: OAuth is Almost Useless

Android apps heavily rely on the OAuth protocol to guarantee security

Developers often store secret authentication keys in their Android applications without realizing that their credentials are easily compromised through decompilation.

Page 15:

Even the Google Play Cloud is soooooo… insecure

Our Key Idea

Page 16:

③ Cloud Storage Traffic

Towards Network-level Efficiency for Cloud Storage Services

Zhenhua Li, Tianyin Xu, Yunhao Liu, et al.

Tsinghua University and other institutions

Page 17:

Cloud Storage Services

store share

Over 200M users 1B files per day

Over 200M users Over 14 PB data

Page 18:

Key Operation: data sync

File operation (Create, Delete, Modify) → data sync event (Index, Content, Notify) → data sync traffic

Tremendous!

Page 19:

How Tremendous for a Provider?

Over 200M users, 1B files per day

[IMC’12] Drago et al.: Large-scale Measurement of Dropbox

Sync traffic ≈ 1/3 of […] traffic

Sync traffic of one file operation = 5.18 MB out + 2.8 MB in

Monetary cost of Dropbox sync traffic in one day ≈ $0.05/GB × 1 Billion × 5.18 MB = $260,000

* We assume there is no special pricing contract between Dropbox and Amazon S3, so our calculation may overestimate the traffic cost.
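
The daily cost estimate above can be rechecked with a few lines of arithmetic. This is only a sketch: the price, file count, and per-operation outbound traffic come from the slide, while the decimal conversion (1 GB = 1000 MB) and the function name are assumptions made for illustration.

```python
# Back-of-the-envelope check of the daily sync traffic cost quoted on the slide.
PRICE_PER_GB_USD = 0.05             # price per GB used on the slide
FILE_OPS_PER_DAY = 1_000_000_000    # "1B files per day"
OUTBOUND_MB_PER_OP = 5.18           # outbound sync traffic per file operation
MB_PER_GB = 1000                    # assumption: decimal units


def daily_sync_cost_usd(price_per_gb, ops_per_day, mb_per_op, mb_per_gb=MB_PER_GB):
    """Return the daily cost of outbound sync traffic in US dollars."""
    total_gb = ops_per_day * mb_per_op / mb_per_gb
    return price_per_gb * total_gb


cost = daily_sync_cost_usd(PRICE_PER_GB_USD, FILE_OPS_PER_DAY, OUTBOUND_MB_PER_OP)
print(f"${cost:,.0f} per day")
```

Under these assumptions the result is about $259,000, which the slide rounds to $260,000. Only the outbound 5.18 MB enters the product, matching the formula on the slide.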

Page 20:

How Tremendous for End Users?

Traffic-capped (Mobile) Users

“Keep a close eye on your data usage if you have a mobile cloud storage app!”

Bandwidth-constrained Users

“Dirty Secret”: Tremendous sync traffic almost saturates the slow-speed network link!

Page 21:

Fundamental Problem

Is the current data sync traffic of cloud storage services efficiently used?

Is the tremendous data sync traffic basically necessary or unnecessary?

Further broaden today’s broadband network

Enhance network-level design of today’s services

Page 22:

A Novel Metric

To quantify the efficiency of data sync traffic usage of cloud storage services.

Power Usage Efficiency:

$$\mathrm{PUE} = \frac{\text{Total facility power}}{\text{IT equipment power}}$$

Traffic Usage Efficiency:

$$\mathrm{TUE} = \frac{\text{Total data sync traffic}}{\text{Data update size}}$$

Page 23:

Data Update Size

The user’s intuitive perception of how much traffic should be consumed

Compared with the absolute value of sync traffic, TUE better reveals the essential traffic harnessing capability of cloud storage services

* If data compression is utilized, the data update size denotes the compressed size of altered bits.
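
As a minimal sketch of how the metric could be computed from a measurement, the function below divides the observed sync traffic by the data update size. The function name and the sample numbers are hypothetical and are not taken from the measurements in this talk.

```python
def tue(total_sync_traffic_bytes, data_update_bytes):
    """Traffic Usage Efficiency: total data sync traffic / data update size."""
    if data_update_bytes <= 0:
        raise ValueError("data update size must be positive")
    return total_sync_traffic_bytes / data_update_bytes


KB = 1024
# Hypothetical example: a 100 KB update that costs 120 KB on the wire.
example_tue = tue(120 * KB, 100 * KB)
print(example_tue)
```

A value close to 1 means almost all traffic carries the update itself. Larger values mean more overhead per byte of useful data.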

Page 24:

Benchmark Experiments

(a) Close setup: Client @ MN, Cloud
(b) Remote setup: Client @ BJ, Cloud
(c) Network controllable setup: Client @ MN, Cloud, with controlled bandwidth or latency

MN = Minneapolis, BJ = Beijing

Various Hardware: Powerful PC, Common PC, Outdated PC, Android Phone

Various Access Methods: PC client, Web browser, Mobile App

Various File Operations: Create, Delete, (Frequent) Modify, Compressed and Uncompressed

Page 25:

File Creation - Finding 1

The majority (77%) of files in our collected trace are small in size, which may result in poor TUE. Meanwhile, nearly two thirds (66%) of small files can be logically combined into large files.

Small files: < 100 KB. Large files: > 1 MB.

Page 26:

File Creation - Implication 1

Small files should be properly combined into larger files for batched data sync (BDS) to reduce sync traffic. However, only Dropbox and Ubuntu One have partially implemented BDS so far.

What if we create one hundred 1-KB files in a batch?
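
The effect of batching can be illustrated with a toy traffic model. This is a sketch under a strong assumption: every sync costs a fixed overhead on top of its payload. The overhead value and the function name are hypothetical and are not measured numbers from this talk.

```python
KB = 1024


def sync_traffic(file_sizes, per_sync_overhead, batched):
    """Model total sync traffic as payload plus a fixed overhead per sync."""
    if not file_sizes:
        return 0
    payload = sum(file_sizes)
    # With BDS all files share one sync; without it each file syncs separately.
    syncs = 1 if batched else len(file_sizes)
    return payload + syncs * per_sync_overhead


files = [1 * KB] * 100      # one hundred 1-KB files, as in the question above
overhead = 10 * KB          # hypothetical per-sync overhead

unbatched_traffic = sync_traffic(files, overhead, batched=False)
batched_traffic = sync_traffic(files, overhead, batched=True)

print("TUE without BDS:", unbatched_traffic / sum(files))
print("TUE with BDS:", batched_traffic / sum(files))
```

With this hypothetical overhead the unbatched TUE is roughly ten times the batched one. The real gap depends on each service's actual per-sync overhead.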

Page 27:

File Modification - Finding 2

84% of files are modified by users at least once. Most cloud storage services employ full-file sync, while Dropbox and SugarSync utilize incremental data sync (IDS) to save traffic for PC clients.

What if we modify 1 byte in a 1-MB file?

Chart labels: 50 KB, 1.1 MB, “No IDS at all!”

Page 28:

Why Not IDS for Most PC Clients?

Conflicts between IDS and RESTful infrastructures

RESTful infrastructures typically support data access operations only at the full-file level, like PUT, GET, and DELETE.

MODIFY = Local Modify + PUT + DELETE

Page 29:

File Modification - Implication 2

For a cloud storage service built on top of RESTful infrastructure, enabling IDS requires an extra, (maybe) complicated mid-layer. Given that file modifications frequently happen, implementing such a mid-layer is worthwhile.

Diagram labels: “Extra mid-layer to enable IDS”, “Also RESTful”
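
One way to picture what such a mid-layer has to do is a fixed-size block comparison: only blocks whose digest changed are sent, and the mid-layer rebuilds the full file before issuing a full-file PUT. The sketch below is an assumption made for illustration, not the design of any service named in this talk. The block size and all function names are hypothetical, and unlike rsync it has no rolling checksum, so an insertion that shifts later bytes makes every following block look changed.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size


def block_digests(data, block_size=BLOCK_SIZE):
    """Return the SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Map block index -> new block content for blocks that differ from old."""
    old_digests = block_digests(old, block_size)
    delta = {}
    for index, offset in enumerate(range(0, len(new), block_size)):
        block = new[offset:offset + block_size]
        is_new = index >= len(old_digests)
        if is_new or hashlib.sha256(block).hexdigest() != old_digests[index]:
            delta[index] = block
    return delta


def apply_delta(old, delta, new_length, block_size=BLOCK_SIZE):
    """Rebuild the new file from the old file plus the changed blocks."""
    blocks = [old[i:i + block_size] for i in range(0, new_length, block_size)]
    for index, block in delta.items():
        blocks[index] = block
    return b"".join(blocks)[:new_length]


old_file = bytes(1024 * 1024)        # a 1-MB file of zero bytes
modified = bytearray(old_file)
modified[500_000] = 1                # modify a single byte
new_file = bytes(modified)

delta = changed_blocks(old_file, new_file)
print(len(delta), "block(s) to send,", sum(len(b) for b in delta.values()), "bytes")
assert apply_delta(old_file, delta, len(new_file)) == new_file
```

In this sketch a 1-byte modification costs one block of payload instead of the whole file, which is the saving IDS aims for.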

Page 30:

File Compression - Finding 3

52% of files can be effectively compressed. However, Google Drive, OneDrive, Box, and SugarSync never compress data, while Dropbox is the only one that compresses data for every access method.

A file counts as effectively compressed when:

$$\frac{\text{Compressed file size}}{\text{Original file size}} < 90\%$$

Implication 3: For providers, data compression is able to reduce 24% of the total sync traffic.

For users, PC clients are more likely to support compression.
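
The 90% criterion can be sketched as a simple check. The choice of zlib and its compression level are assumptions made for illustration. The talk does not say which compressor was used, and the function name is hypothetical.

```python
import os
import zlib


def is_effectively_compressible(data, threshold=0.9, level=6):
    """True if compressed size / original size is below the threshold."""
    if not data:
        return False
    ratio = len(zlib.compress(data, level)) / len(data)
    return ratio < threshold


text_like = b"cloud storage sync traffic " * 1000   # highly repetitive content
random_like = os.urandom(100_000)                   # stands in for already-compressed data

print(is_effectively_compressible(text_like))
print(is_effectively_compressible(random_like))
```

Repetitive text passes the check, while random bytes (similar to already compressed media) do not.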

Page 31:

File Deduplication - Finding 4

Although we observe that 18% of user files can be deduplicated, most cloud storage services do not support data deduplication.

Web browsers never dedup data

For security concerns

Page 32:

Full-file vs. Block-level Dedup

Block-level dedup is only marginally superior to full-file dedup, but is much more complex

* We divide files into blocks in a simple and natural way, i.e., starting from the head of a file with a fixed block size. So clearly, we are not dividing files into blocks in the best possible manner, which is much more complicated and computation intensive.

Implication 4: We suggest providers just implement full-file deduplication since it is both simple and efficient.
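
The two granularities can be compared with a small sketch that follows the footnote's method: blocks are cut from the head of each file with a fixed block size. The block size, the function names, and the sample files are hypothetical.

```python
import hashlib


def full_file_dedup_savings(files):
    """Bytes saved when identical whole files are stored only once."""
    seen, saved = set(), 0
    for data in files:
        digest = hashlib.sha256(data).digest()
        if digest in seen:
            saved += len(data)
        else:
            seen.add(digest)
    return saved


def block_level_dedup_savings(files, block_size=4096):
    """Bytes saved when identical fixed-size blocks are stored only once."""
    seen, saved = set(), 0
    for data in files:
        # Cut blocks from the head of the file with a fixed block size.
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).digest()
            if digest in seen:
                saved += len(block)
            else:
                seen.add(digest)
    return saved


file_a = b"A" * 4096 + b"B" * 4096
file_c = b"A" * 4096 + b"C" * 4096       # shares only its first block with file_a
sample = [file_a, file_a, file_c]

print("full-file savings:", full_file_dedup_savings(sample))
print("block-level savings:", block_level_dedup_savings(sample))
```

Block-level dedup additionally catches the block shared between different files, at the price of tracking a digest per block instead of one per file.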

Page 33:

Frequent Modifications - Finding

Frequent, short data updates

[Figure: network traffic for data synchronization over time]

Session maintenance traffic far exceeds the real data update size

The Traffic Overuse Problem

For 8.5% of Dropbox users, more than 10% of their traffic is generated in response to frequent modifications

Zhenhua Li et al. Efficient Batched Sync in Dropbox-like Cloud Storage Services. In Proc. of ACM Middleware, 2013.

Page 34:

Sync Deferment

What if we append X KB per X sec until 1 MB?

Finding 5:

1) Frequent modifications to a file often lead to large TUE.

2) Some services deal with this issue by batching file updates using a fixed sync deferment. However, fixed sync deferments are limited in applicable scenarios.

Page 35:

Frequent Modifications - Implication 5

To fix the problem of fixed sync deferment, we propose an adaptive sync defer (ASD) mechanism that dynamically adjusts the sync deferment.

[Figure: timeline of data updates, with inter-update intervals labeled Δt_{i-1} and Δt_{i+1}, and the sync deferment]

$$T_i = \min\left(\frac{T_{i-1}}{2} + \frac{\Delta t_i}{2} + \epsilon,\ T_{\max}\right)$$

Chart labels: 4.2 sec, .5 sec, 6 sec
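
The ASD update rule can be tried out with a short simulation. This is a sketch of the formula only: the values chosen for ε and T_max, the starting deferment, and the function name are hypothetical.

```python
def next_sync_deferment(prev_deferment, inter_update_time, epsilon=0.5, t_max=10.0):
    """ASD update rule: T_i = min(T_{i-1}/2 + dt_i/2 + epsilon, T_max)."""
    return min(prev_deferment / 2 + inter_update_time / 2 + epsilon, t_max)


# Hypothetical workload: a data update arrives every 3 seconds.
deferment = 0.0
history = []
for _ in range(20):
    deferment = next_sync_deferment(deferment, inter_update_time=3.0)
    history.append(deferment)

print([round(t, 3) for t in history[:5]])
print(round(history[-1], 3))
```

For a constant inter-update time Δt, the recurrence has the fixed point T = Δt + 2ε (as long as that stays below T_max). The deferment therefore settles slightly above the update interval, so consecutive updates fall within one deferment window and can be batched.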

Page 36:

Network & Hardware Impact

Network and hardware do not affect the TUE of simple file operations, but significantly affect the TUE of frequent modifications

6. In the case of frequent file modifications, today’s cloud storage services actually bring good news (in terms of TUE) to those users with relatively poor hardware or Internet access.

Page 37:

6 Findings: A considerable portion of the data sync traffic is, in a sense, wasteful

6 Implications: The wasted (tremendous) traffic can be effectively avoided or significantly reduced via carefully designed sync mechanisms

Our Key Idea

Page 38:

④ OpenStack Bottleneck (Intro)

Thierry (切瑞)

Page 39:

http://www.thucloud.com

Page 40:

OpenStack: Handling Enormous Objects

Time increasing with # objects

CPU increasing with # objects

Page 41:

LightSync: Addressing Sync Bottleneck

500 more lines of code added/modified (2 files)

5 Million objects, r=3

Page 42:

⑤ ConflictBox System (Intro)

Haihua Zhong (钟海华)

Page 43:

Have You Experienced …?

Sample file.docx
Sample file (Jim’s conflicted copy 2014-12-21).docx
Sample file (Jim’s conflicted copy 2014-12-21) (Bob’s conflicted copy 2014-12-21).docx

Page 44:

How to Avoid?

Real-time synchronization between the web client (view) and the cloud

Page 45:

ConflictBox: UI

Screenshot labels: “correct”, “file conflict”, “some conflicts in this folder”

Timestamps shown: 2014/12/16 07:17:24, 2014/12/16 13:18:37

Page 46:

Short Summary

Cloud computing in our daily life still falls far, far short of our expectations

That is exactly why academic rookies have a reason to exist and room to survive

Thank you!

Page 47:

Why Not IDS for Web & Mobile?

IDS is hard to implement in a script language, particularly JavaScript

JavaScript is unable to directly invoke file-level system calls/APIs like open, close, read, write, stat, rsync, and gzip.

Instead, JavaScript can only access users’ local files in an indirect and constrained manner.

(Probably) energy concerns, since IDS is usually computation intensive

Page 48:

File Compression - Implication 3

For providers, data compression is able to reduce 24% of the total sync traffic.

For users, PC clients are more likely to support compression.

High-level compression, and cloud-side compression level seems higher

No user-side compression, while high-level cloud-side compression

Low-level user-side compression due to energy concerns of smartphones

Page 49:

The Case of iCloud Drive

Released in Oct. 2014 with:

Efficient BDS (batched data sync) for OS X, but not for web browser or iOS 8

IDS (incremental data sync) for OS X, but not for web browser or iOS 8

No compression at all

Fine-grained (KBs) level dedup for OS X, but not for web browser or iOS 8

Quite unstable at the moment

Page 50:

Working Principle of Dropbox Client

First, the Dropbox client must re-index the updated file, which is computation intensive

A file is considered “synchronized” to the cloud only when the cloud returns an ACK

Sometimes, when data updates happen even faster than the file re-indexing speed, they are also “batched” for synchronization

This is why some data updates are “batched” for synchronization unintentionally

The four basic components of Dropbox client behavior

