An Update Model for Network Coding in Cloud Storage Systems. 2012 50th Annual Allerton Conference on Communication, Control, and Computing. Mohammad Reza Zakerinasab, Mea Wang. Department of Computer Science, University of Calgary.

An Update Model for Network Coding in Cloud Storage Systems


Page 1: An Update Model for Network Coding in Cloud Storage Systems

An Update Model for Network Coding in Cloud Storage Systems

2012 50th Annual Allerton Conference on Communication, Control, and Computing

Mohammad Reza Zakerinasab, Mea Wang

Department of Computer Science, University of Calgary

Page 2: An Update Model for Network Coding in Cloud Storage Systems

Outline

• Introduction
• Related Works
• Proposed System
• Differential Update Model
• Evaluation
• Conclusion

Page 3: An Update Model for Network Coding in Cloud Storage Systems

Network Coding (1/2)

• There are several mechanisms for arranging file copies among storage nodes or devices:
  ◦ standard RAID architectures
  ◦ erasure codes
  ◦ network coding

• Network coding in cloud storage systems allows storage nodes to collectively host multiple copies of a file.

Page 4: An Update Model for Network Coding in Cloud Storage Systems

Network Coding (2/2)

• In a network-coding-assisted cloud storage system:
  ◦ a file is divided into n blocks, which are encoded using random coefficients;
  ◦ the encoded blocks are distributed in the Cloud;
  ◦ the file is decoded from n encoded blocks retrieved from any subset of the storage nodes.

Page 5: An Update Model for Network Coding in Cloud Storage Systems

Problem Definition

• Existing works have focused on mechanisms for preserving the level of redundancy.

• However, coded information in the system must also be kept up to date under the most frequent operation performed on files:
  ◦ file updates

• Any change in the file impacts all coded blocks in the system.
  ◦ All traces of the file must be replaced.

Page 6: An Update Model for Network Coding in Cloud Storage Systems

Application

• Google Docs: an online collaborative office suite that lets users create, edit, and publish documents collaboratively from around the world.

• When a file is updated, changing even a single byte can outdate all coded blocks in the system, which requires:
  ◦ re-computations
  ◦ re-deliveries

Page 7: An Update Model for Network Coding in Cloud Storage Systems

Problems

• Re-computing coded blocks is very CPU intensive.

• Replacing all the coded blocks consumes a large amount of bandwidth.

Page 8: An Update Model for Network Coding in Cloud Storage Systems

Proposed Model

• Send only the modified parts, with the minimum possible overhead.

• The paper presents the mathematical model of the Differential Update Mechanism (DUM).
  ◦ The update algorithms can be performed on all nodes.

• The simulation results show that the proposed DUM saves a significant amount of bandwidth in a cloud storage system.


Page 10: An Update Model for Network Coding in Cloud Storage Systems

Related Works (1/2)

• Commercial cloud storage systems, such as Microsoft Azure [8] and Google Cloud [9], utilize source erasure codes.

• Network coding was originally proposed in information theory in 2000 [1].

• In contrast to source erasure codes, network coding applies coding at intermediate relay nodes throughout the network.

Page 11: An Update Model for Network Coding in Cloud Storage Systems

Related Works (2/2)

• The benefits of coding at intermediate nodes include:
  ◦ high throughput [1], [3]
  ◦ efficient routing algorithm design [17]
  ◦ energy savings in wireless networking [18]
  ◦ security [19]

• The works most closely related to the update problem address the repair problem. They:
  ◦ provide mechanisms for handling the failure of one or more nodes [25];
  ◦ preserve the level of redundancy.

Page 12: An Update Model for Network Coding in Cloud Storage Systems

References

• [1] R. Ahlswede, N. Cai, S. R. Li, and R. W. Yeung, “Network Information Flow,” IEEE Transactions on Information Theory, vol. 46, no. 4, pp. 1204–1216, July 2000.

• [3] R. Koetter and M. Medard, “An Algebraic Approach to Network Coding,” IEEE/ACM Transactions on Networking, vol. 11, no. 5, pp. 782–795, October 2003.

• [8] B. Calder, J. Wang, A. Ogus, N. Nilakantan, A. Skjolsvold, S. McKelvie, Y. Xu, S. Srivastav, J. Wu, H. Simitci, J. Haridas, C. Uddaraju, H. Khatri, A. Edwards, V. Bedekar, S. Mainali, R. Abbasi, A. Agarwal, M. F. ul Haq, M. I. ul Haq, D. Bhardwaj, S. Dayanand, A. Adusumilli, M. McNett, S. Sankaran, K. Manivannan, and L. Rigas, “Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency,” in Proc. of the 23rd ACM Symposium on Operating Systems Principles (SOSP), Cascais, Portugal, October 23-26, 2011, pp. 143–157.

Page 13: An Update Model for Network Coding in Cloud Storage Systems

References

• [9] D. Ford, F. Labelle, F. I. Popovici, M. Stokely, V.-A. Truong, L. Barroso, C. Grimes, and S. Quinlan, “Availability in Globally Distributed Storage Systems,” in Proc. of the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Vancouver, BC, October 4-6, 2010, pp. 1–14.

• [17] D. S. Lun, N. Ratnakar, R. Koetter, M. Medard, E. Ahmed, and H. Lee, “Achieving Minimum Cost Multicast: A Decentralized Approach Based on Network Coding,” in Proc. of the 24th Conference of the IEEE Communications Society (INFOCOM), Miami, FL, March 13-17, 2005, pp. 1607–1617.

• [18] S. Katti, H. Rahul, W. Hu, D. Katabi, M. Medard, and J. Crowcroft, “XORs in the Air: Practical Wireless Network Coding,” IEEE/ACM Transactions on Networking, vol. 16, no. 3, pp. 497–510, June 2008.

• [19] C. Gkantsidis and P. Rodriguez, “Cooperative Security for Network Coding File Distribution,” in Proc. of the 25th Conference of the IEEE Communications Society (INFOCOM), Barcelona, Spain, April 23-29, 2006, pp. 1–13.


Page 15: An Update Model for Network Coding in Cloud Storage Systems

Modeling the Storage Cloud System

[Figure: system model with the Storage Cloud and the End Hosts]

Page 16: An Update Model for Network Coding in Cloud Storage Systems

Modeling the Storage Cloud System

• Model simplification assumptions:
  1. A single original copy of each file is hosted among the source nodes in the Cloud.
     ◦ Each source node owns a disjoint set of files.
  2. Each node can be only a source node, a storage node, or a target node at a time.
     ◦ Nodes of the same type do not connect to each other.
  3. It is common for a storage system to distribute R ≥ 1 copies of each file to provide data redundancy, where R is the replication factor.

Page 17: An Update Model for Network Coding in Cloud Storage Systems

Network Coding in the Storage Cloud System

• With randomized network coding, a file is divided into n original blocks B = [b1, b2, …, bn], where each bi has a fixed number of bytes s.

• To encode a new block ci:
  ◦ the source node first independently and randomly chooses a set of coding coefficients εi = [εi,1, εi,2, …, εi,n] in the Galois field GF(2^8);
  ◦ the coded block is then the linear combination $c_i = \sum_{j=1}^{n} \varepsilon_{i,j} \, b_j$.

[Figure: the original blocks b1, b2, …, bn of B are encoded into the coded blocks c1, c2, …, cR·n]
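The encoding step above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the reduction polynomial 0x11B and the helper names `gf_mul`, `split_into_blocks`, `encode_block`, and `random_coded_block` are assumptions introduced here, since the slides only state that coefficients are drawn from GF(2^8).

```python
import random


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by 0x11B (an assumed polynomial)."""
    product = 0
    while b:
        if b & 1:
            product ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= 0x11B            # reduce back into 8 bits
        b >>= 1
    return product


def split_into_blocks(data: bytes, n: int) -> list[bytes]:
    """Split data into n blocks of equal size s, zero-padding the tail."""
    s = -(-len(data) // n)        # ceiling division gives the block size s
    padded = data.ljust(n * s, b"\x00")
    return [padded[i * s:(i + 1) * s] for i in range(n)]


def encode_block(blocks: list[bytes], coeffs: list[int]) -> bytes:
    """Return c = sum_j coeffs[j] * blocks[j], computed byte-wise over GF(2^8)."""
    coded = bytearray(len(blocks[0]))
    for coeff, block in zip(coeffs, blocks):
        for k, byte in enumerate(block):
            coded[k] ^= gf_mul(coeff, byte)
    return bytes(coded)


def random_coded_block(blocks: list[bytes], rng: random.Random):
    """Draw one random coefficient vector and return it with its coded block."""
    coeffs = [rng.randrange(256) for _ in blocks]
    return coeffs, encode_block(blocks, coeffs)
```

Calling `random_coded_block` R·n times on the output of `split_into_blocks` would produce the R·n coded blocks that the source node distributes.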

Page 18: An Update Model for Network Coding in Cloud Storage Systems

Network Coding in the Storage Cloud System

• Decoding: any n of the R·n coded blocks are linearly independent (with high probability, given random coefficients) and can be used to recover all original blocks of the corresponding file.
  ◦ A target node locates and downloads n coded blocks, C = [c1, c2, …, cn], from the storage nodes.

• Given the encoding matrix ξ = [ε1, ε2, …, εn], the original blocks B = [b1, b2, …, bn] can be recovered by:
  ◦ $B = \xi^{-1} \cdot C$
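A target node can recover B without forming the inverse matrix explicitly, by running Gauss-Jordan elimination on the coefficient rows and the coded payloads together. The sketch below assumes the reduction polynomial 0x11B for GF(2^8); the function names are illustrative and do not come from the paper.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by 0x11B (an assumed polynomial)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def gf_inv(a: int) -> int:
    """Multiplicative inverse of a non-zero byte, using a^254 = a^-1."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def encode_block(blocks, coeffs):
    """Return sum_j coeffs[j] * blocks[j] over GF(2^8)."""
    coded = bytearray(len(blocks[0]))
    for coeff, block in zip(coeffs, blocks):
        for k, byte in enumerate(block):
            coded[k] ^= gf_mul(coeff, byte)
    return bytes(coded)


def decode(coeff_rows, coded_blocks):
    """Recover the n original blocks from n coded blocks and their coefficients."""
    n = len(coeff_rows)
    # Augmented rows: [coefficient vector | coded payload bytes]
    rows = [list(c) + list(p) for c, p in zip(coeff_rows, coded_blocks)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("coded blocks are not linearly independent")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, x) for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [x ^ gf_mul(factor, y)
                           for x, y in zip(rows[r], rows[col])]
    # The coefficient part is now the identity, so the payloads are b1..bn.
    return [bytes(row[n:]) for row in rows]
```

If the downloaded blocks happen to be linearly dependent, `decode` raises `ValueError`, and the target node would need to fetch a different coded block.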

Page 19: An Update Model for Network Coding in Cloud Storage Systems

The Update Problem

• For every single update, we must:
  ◦ transmit R·n new coded blocks from the source nodes to the storage nodes;
  ◦ transmit K·n coded blocks from the storage nodes to the target nodes.
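To get a feel for the cost of this naive update, the block counts above can be turned into byte counts. The sketch below is an illustration only: K is not defined on the slide and is read here as the number of target nodes (an assumption), and only payload bytes are counted, ignoring the coefficient vectors shipped with each coded block.

```python
def full_update_cost(n: int, s: int, R: int, K: int) -> tuple[int, int]:
    """Payload bytes sent when every coded block is regenerated and redelivered.

    n: blocks per file, s: bytes per block, R: replication factor,
    K: assumed to be the number of target nodes (not defined in the slides).
    """
    to_storage = R * n * s   # R*n new coded blocks, source -> storage
    to_targets = K * n * s   # K*n coded blocks, storage -> targets
    return to_storage, to_targets


# Example: a file of 100 blocks of 1 KiB, R = 3, K = 5.
storage_bytes, target_bytes = full_update_cost(n=100, s=1024, R=3, K=5)
```

Note that the cost does not depend on how much of the file actually changed, which is the inefficiency the update problem refers to.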


Page 21: An Update Model for Network Coding in Cloud Storage Systems

Differential Update Model (DUM)

• The authors argue that the update problem is just as essential as the repair problem.

• They propose the DUM to update coded blocks by delivering only the blocks that are affected by an update.
  ◦ This avoids transmitting the entire file for each update.

Page 22: An Update Model for Network Coding in Cloud Storage Systems

Updating Coded Blocks

• Assume that the current version number of a file is v; version v+1 then involves arbitrary updates in n' ≤ n blocks of the file.
  ◦ Let B = [b1, b2, …, bn] be the original file of version v.
  ◦ Let B' = [b1', b2', …, bn'] be the updated file of version v+1.
  ◦ Each block bi' in version v+1 can be expressed as bi + δi, where δi is the differential vector.
  ◦ With the differential matrix Δ = [δ1, δ2, δ3, …, δn], this gives $B' = B + \Delta$.
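Because addition in GF(2^8) is a byte-wise XOR, each differential vector δi can be obtained by XOR-ing the old and new versions of block i. The sketch below computes the per-block differences and a 0/1 flag per block marking which ones changed; the function name and return shape are illustrative assumptions, not the paper's notation.

```python
def differential(old_blocks: list[bytes], new_blocks: list[bytes]):
    """Return (flags, nonzero_deltas) for two versions of the same file.

    flags[i] is 1 when block i changed and 0 otherwise;
    nonzero_deltas lists delta_i = old_i XOR new_i for changed blocks only.
    """
    flags = []
    nonzero_deltas = []
    for old, new in zip(old_blocks, new_blocks):
        delta = bytes(a ^ b for a, b in zip(old, new))
        changed = any(delta)
        flags.append(1 if changed else 0)
        if changed:
            nonzero_deltas.append(delta)
    return flags, nonzero_deltas


old = [b"aaaa", b"bbbb", b"cccc"]
new = [b"aaaa", b"bXbb", b"cccc"]
flags, deltas = differential(old, new)   # only block 2 differs, so n' = 1
```

Here n' is simply `len(deltas)`, the number of blocks touched by the update.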

Page 23: An Update Model for Network Coding in Cloud Storage Systems

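Updating Coded Blocks

• To encode a new block for version v+1, the source node again randomly chooses a set of coding coefficients εi' = [εi,1', εi,2', …, εi,n'] in the Galois field GF(2^8).

• The new coded block is then $c_i' = \sum_{j=1}^{n} \varepsilon_{i,j}' \, b_j' = \sum_{j=1}^{n} \varepsilon_{i,j}' \, (b_j + \delta_j)$.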

Page 24: An Update Model for Network Coding in Cloud Storage Systems

Updating Storage Nodes

• A significant amount of bandwidth can be saved, since most updates affect only a small portion of a file.

• Recovering Δ from Δ':
  ◦ Δ is reconstructed by inserting the zero δ-vectors into Δ' according to the update vector u.
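The reconstruction of Δ from Δ' can be sketched as follows, assuming the update vector is a list of 0/1 flags with a 1 at every position whose δ-vector is non-zero. The helper name `expand_delta` and the explicit `block_size` parameter are assumptions made for this illustration.

```python
def expand_delta(update_vector: list[int],
                 nonzero_deltas: list[bytes],
                 block_size: int) -> list[bytes]:
    """Rebuild the full differential matrix by re-inserting zero delta-vectors."""
    if sum(1 for flag in update_vector if flag) != len(nonzero_deltas):
        raise ValueError("update vector does not match the number of deltas")
    remaining = iter(nonzero_deltas)
    zero = bytes(block_size)                 # an all-zero delta-vector
    return [next(remaining) if flag else zero for flag in update_vector]


full_delta = expand_delta([0, 1, 0, 1], [b"\x01\x02", b"\x03\x04"], 2)
```

The result has one entry per original block, so it can be added block by block to the previous version of the file.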

Page 25: An Update Model for Network Coding in Cloud Storage Systems

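Updating Storage Nodes

• Send only the non-zero rows of Δ, that is Δ' = [δ1, δ2, δ3, …, δn'], together with the update vector uv+1 = [uv+1,1, uv+1,2, …, uv+1,n], where uv+1,i marks whether block i is affected by the update.

• Encode the matrix Δ': [equation not recovered]

• Decode the matrix Δ': [equation not recovered]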
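Since the encoding is linear, a node that already holds a coded block and its coefficient vector can fold the non-zero δ-vectors into that block instead of receiving a fresh one. The sketch below illustrates this linearity argument only; it assumes the node keeps its existing coefficients and uses the polynomial 0x11B, neither of which is confirmed by the slides, whose exact encode and decode rules for Δ' were not recovered.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by 0x11B (an assumed polynomial)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def encode_block(blocks, coeffs):
    """Return sum_j coeffs[j] * blocks[j] over GF(2^8)."""
    coded = bytearray(len(blocks[0]))
    for coeff, block in zip(coeffs, blocks):
        for k, byte in enumerate(block):
            coded[k] ^= gf_mul(coeff, byte)
    return bytes(coded)


def apply_delta(coded, coeffs, update_vector, nonzero_deltas):
    """Update a coded block in place of re-encoding: c' = c + sum_j coeffs[j] * delta_j."""
    updated = bytearray(coded)
    remaining = iter(nonzero_deltas)
    for coeff, flag in zip(coeffs, update_vector):
        if not flag:
            continue                      # zero delta-vector: nothing to add
        delta = next(remaining)
        for k, byte in enumerate(delta):
            updated[k] ^= gf_mul(coeff, byte)
    return bytes(updated)
```

Only the n' changed blocks contribute work, so the cost of the update scales with n' rather than n.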

Page 26: An Update Model for Network Coding in Cloud Storage Systems


Updating Storage Nodes

Page 27: An Update Model for Network Coding in Cloud Storage Systems


Updating Target Nodes

Page 28: An Update Model for Network Coding in Cloud Storage Systems

Aggregating Updates Across Multiple Versions (1/4)

• Storage nodes and target nodes may not always be synchronized to the latest version.
  ◦ They may miss several updates for various reasons.

• Assume that a node has missed m updates:
  ◦ its current version is v;
  ◦ the actual version of the file is v+m.

Page 29: An Update Model for Network Coding in Cloud Storage Systems

Aggregating Updates Across Multiple Versions (2/4)

• A coded block in version v may be expressed in terms of the coded blocks of version 0 and the summation of coded δ-blocks from version 0 to version m.
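Because the δ-blocks are summed over GF(2^8), where addition is XOR, several missed updates can be collapsed into one. The sketch below aggregates uncoded per-version deltas for simplicity; the input format (a list of update-vector and δ-list pairs, oldest first) is an assumption made for this illustration and is not the update table described in the paper.

```python
def aggregate_updates(updates, block_size: int):
    """Collapse several missed updates into one (update_vector, nonzero_deltas).

    updates: list of (update_vector, nonzero_deltas) pairs, oldest first,
    all describing the same n blocks of block_size bytes.
    """
    n = len(updates[0][0])
    total = [bytearray(block_size) for _ in range(n)]
    for update_vector, nonzero_deltas in updates:
        remaining = iter(nonzero_deltas)
        for j, flag in enumerate(update_vector):
            if flag:
                delta = next(remaining)
                for k, byte in enumerate(delta):
                    total[j][k] ^= byte        # deltas add up by XOR
    # Blocks whose changes cancelled out are dropped from the aggregate.
    agg_vector = [1 if any(block) else 0 for block in total]
    agg_deltas = [bytes(b) for b, flag in zip(total, agg_vector) if flag]
    return agg_vector, agg_deltas


missed = [
    ([1, 0, 0], [b"\x01\x01"]),
    ([1, 1, 0], [b"\x01\x00", b"\x05\x05"]),
]
agg_vector, agg_deltas = aggregate_updates(missed, block_size=2)
```

A node that missed both versions then needs a single aggregated update rather than one transfer per missed version.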

Page 30: An Update Model for Network Coding in Cloud Storage Systems

Aggregating Updates Across Multiple Versions (3/4)

• To support such an aggregated update, an update table stores:
  ◦ the update vectors
  ◦ the coded δ-blocks

• If a storage node misses one or more updates, the first non-empty entry following the empty entries is located,
  ◦ yielding the aggregated Δ' that contains the changes across the missing versions.

Page 31: An Update Model for Network Coding in Cloud Storage Systems

Aggregating Updates Across Multiple Versions (4/4)

• Computational overhead:
  ◦ generation of the aggregated update vector
  ◦ generation of n' aggregated coded δ-vectors


Page 33: An Update Model for Network Coding in Cloud Storage Systems

Numerical Analysis

• The bandwidth saving in updating the storage nodes using DUM: [equation not recovered]

• The bandwidth saving in updating the target nodes using DUM: [equation not recovered]
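The paper's expressions for these savings did not survive in this transcript. As a rough stand-in, and explicitly not the authors' formula, the sketch below estimates the saving per delivery under two assumptions: a conventional update ships n blocks of s bytes, while a differential update ships n' δ-blocks of s bytes plus an n-bit update vector.

```python
def estimated_saving(n: int, n_changed: int, s: int) -> float:
    """Fraction of payload bytes saved by sending deltas instead of all blocks.

    A back-of-the-envelope estimate under the assumptions stated above,
    not the expression from the paper. Negative values mean no saving.
    """
    full_bytes = n * s
    update_vector_bytes = -(-n // 8)          # n flags packed into bytes
    delta_bytes = n_changed * s + update_vector_bytes
    return 1 - delta_bytes / full_bytes


saving = estimated_saving(n=100, n_changed=10, s=1024)
```

Under this estimate the saving shrinks as n' approaches n and turns slightly negative when every block changes, which matches the caveat in the conclusion that DUM helps unless the update affects almost the entire file.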

Page 34: An Update Model for Network Coding in Cloud Storage Systems

Experiment Results (1/7)

• The number of blocks n should be no more than 100 to ensure that network coding operates at a rate faster than a typical transmission rate in a network.

• We compare the performance of conventional network coding update (NC) and DUM.

Page 35: An Update Model for Network Coding in Cloud Storage Systems

Experiment Results (2/7)

• Bandwidth usage

Page 36: An Update Model for Network Coding in Cloud Storage Systems

Experiment Results (3/7)

• Bandwidth usage and computational cost

Page 37: An Update Model for Network Coding in Cloud Storage Systems

Experiment Results (4/7)

• Computational cost on storage nodes dominates the overall cost.

Page 38: An Update Model for Network Coding in Cloud Storage Systems

Experiment Results (5/7)

• Aggregated updates

Page 39: An Update Model for Network Coding in Cloud Storage Systems

Experiment Results (6/7)

• Update affects

Page 40: An Update Model for Network Coding in Cloud Storage Systems

Experiment Results (7/7)

• Simulation study

• Diff [31], bsDiff [32]

[31] J. W. Hunt and M. D. McIlroy, “An Algorithm for Differential File Comparison,” Bell Laboratories, Computing Science Technical Report 41, June 1976.
[32] C. Percival, “Matching with Mismatches and Assorted Applications,” Ph.D. dissertation, Wadham College, University of Oxford, 2006.


Page 42: An Update Model for Network Coding in Cloud Storage Systems

Conclusion

• DUM saves both communication and computational costs, unless the update affects almost the entire file.

• DUM conserves CPU cycles for large files and when the data is more scattered in the Cloud.

• The paper only considers the case where n' is smaller than n; what happens if n' is larger than n?