
Durability Simulator Design for OpenStack Swift


Page 1: Durability Simulator Design for OpenStack Swift

Copyright©2014 NTT Corp. All Rights Reserved.

Durability Simulator Design for OpenStack Swift (Interactive Durability Calculation Tools)

Kota Tsuyuzaki [IRC: kota_] [email protected] NTT Software Innovation Center

Copyright(c)2009-2014 NTT CORPORATION. All Rights Reserved.

Page 2: Durability Simulator Design for OpenStack Swift


NTT Confidential

Outline

• Goal & Benefits

• How to calculate?

• Demo

Etherpad: https://etherpad.openstack.org/p/kilo-swift-durability-simulator

Page 3: Durability Simulator Design for OpenStack Swift


Issue

User: "I want to build a durable object storage system using OpenStack Swift. I also want to know its durability, to confirm it will be enough for our SLA."

Page 4: Durability Simulator Design for OpenStack Swift


Issue

User, to OpenStack Providers A, B, and C: "Hey, guys. Could you tell me your Swift system architecture and the storage durability you support?"

Page 5: Durability Simulator Design for OpenStack Swift


Issue

OpenStack Providers:

• Provider A: 7-9s durability with 3 copies

• Provider B: 9-9s durability with 3 copies

• Provider C: 11-9s durability with 3 copies

User: "WHAT'S HAPPENING!? WHICH IS CORRECT?"

Page 6: Durability Simulator Design for OpenStack Swift


Goal & Benefits

• Goal

• Build durability calculation tools supported (or recommended) by the Swift community

• Make it easy to get the calculation result from both the specs of the system's hardware components and the Swift configuration (e.g. # of disks, size of each disk, # of partitions)

• Benefits

• Swift administrators (even beginners) can easily find their own system's durability

• The calculation definition can be standardized among Swift providers

• Swift users can choose the policy for their use case (Replica? EC? Which # of parities is best for you?)

Page 7: Durability Simulator Design for OpenStack Swift


How to calculate the durability?

Page 8: Durability Simulator Design for OpenStack Swift


For Replica Case

Page 9: Durability Simulator Design for OpenStack Swift


How to Calculate EC Durability?

• Calculation Using Markov Model (Markov Process) [1]

• 2 Replica -> k = 1, m = 1, i.e. data is lost when 2 fragments are lost

• 3 Replica -> k = 1, m = 2, i.e. data is lost when 3 fragments are lost

• Reference:

• [1]: "Reliability Mechanisms for Very Large Storage Systems"

• http://www.ssrc.ucsc.edu/Papers/xin-mss03.pdf

Page 10: Durability Simulator Design for OpenStack Swift


Consideration for Swift's Partition

• Redundancy Set [1]:

• Definition (from [1])

• A block group composed of data blocks or objects and their associated replicas or parity blocks. A single redundancy set will typically contain 1MB to 1TB, though we expect that redundancy sets will be at least 1GB to minimize bookkeeping overhead and reduce the likelihood that two redundancy sets will be stored on the same set of object storage systems.

• Assuming a Redundancy Set as a Partition

Figure: Ring. MD5*(URL) = index into the partitions; the partition table maps each partition to device ids.

idx   Copy 1   Copy 2   Copy 3
0     1        5        7
…     …        …        …
8     3        2        6
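The lookup in the ring figure can be sketched roughly as follows. This is a simplified illustration, not Swift's actual ring code: the part power, the table contents, and the helper names `partition_of` and `devices_of` are hypothetical, and the sketch assumes that the partition index is taken from the top bits of the MD5 digest of the object path.

```python
import hashlib
import struct

# Hypothetical ring parameters: 2**PART_POWER partitions.
PART_POWER = 2
PART_SHIFT = 32 - PART_POWER

# Hypothetical partition table: partition index -> device ids for copy 1..3.
PART_TO_DEVS = {
    0: (1, 5, 7),
    1: (0, 2, 4),
    2: (4, 6, 1),
    3: (3, 2, 6),
}


def partition_of(path: str) -> int:
    """Map an object path to a partition index via the top bits of its MD5."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top32 = struct.unpack_from(">I", digest)[0]  # first 4 bytes, big-endian
    return top32 >> PART_SHIFT


def devices_of(path: str):
    """Return the device ids that hold the copies of the given object path."""
    return PART_TO_DEVS[partition_of(path)]


if __name__ == "__main__":
    path = "/account/container/object"
    print(partition_of(path), devices_of(path))
```

Under this view, every object that hashes to the same partition shares the same set of devices, which is why a partition can play the role of a redundancy set in the calculation.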

Page 11: Durability Simulator Design for OpenStack Swift


Markov Process (1)

• Definition:

• Absorbing State: The end state in the state transition model.

• P: Transition Probability Matrix

$$
P = \begin{pmatrix} Q & U \\ O & I \end{pmatrix}
  = \begin{pmatrix} 1-2\mu & 2\mu & 0 \\ v & 1-(\mu+v) & \mu \\ 0 & 0 & 1 \end{pmatrix}
$$

• Q: Transition Probability Matrix among Temporary States

• U: Probability Matrix from Temporary States into the Absorbing State

• O: Zero Matrix, I: Identity Matrix

States: State0 and State1 are Temporary States; State2 is the Absorbing State.
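As a small illustration of the block structure, the sketch below builds the 3-state matrix P for the 2-replica case and splits it into Q and U. The numeric values of μ and v are hypothetical, and the helper names are made up for this example.

```python
from fractions import Fraction


def two_replica_P(mu, v):
    """Transition matrix over State0, State1 (temporary) and State2 (absorbing)."""
    return [
        [1 - 2 * mu, 2 * mu, 0],
        [v, 1 - (mu + v), mu],
        [0, 0, 1],
    ]


def split_blocks(P, n_temp):
    """Split P into Q (temporary -> temporary) and U (temporary -> absorbing)."""
    Q = [row[:n_temp] for row in P[:n_temp]]
    U = [row[n_temp:] for row in P[:n_temp]]
    return Q, U


# Hypothetical per-step probabilities, chosen only for illustration.
mu = Fraction(1, 1000)  # disk failure
v = Fraction(1, 10)     # repair

P = two_replica_P(mu, v)
Q, U = split_blocks(P, 2)

if __name__ == "__main__":
    print("Q =", Q)
    print("U =", U)
```

Each row of P sums to 1, and the last row keeps the chain in State2 once data is lost.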

Page 12: Durability Simulator Design for OpenStack Swift


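Markov Process (2)

• The limit of the State Transition Matrix (P) as time t goes to infinity gives the average # of state transitions (M) from the initial state to the absorbing state

$$
\lim_{t \to \infty} P^{t} = \begin{pmatrix} 0 & MU \\ 0 & I \end{pmatrix},
\qquad M = (I - Q)^{-1}
$$

• MTTDL (time until the absorbing state is reached) is calculated from the sum of each row of M

$$
\mathrm{MTTDL}_{rs} = M \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}
$$

State Transition Matrix for 2 replica:

$$
P = \begin{pmatrix} 1-2\mu & 2\mu & 0 \\ v & 1-(\mu+v) & \mu \\ 0 & 0 & 1 \end{pmatrix}
$$

M and MTTDLrs:

$$
M = \frac{1}{2\mu^{2}} \begin{pmatrix} \mu+v & 2\mu \\ v & 2\mu \end{pmatrix},
\qquad
\mathrm{MTTDL}_{rs} = \frac{1}{2\mu^{2}} \begin{pmatrix} 3\mu+v \\ 2\mu+v \end{pmatrix}
$$

Durability = 1 – N / MTTDLrs, where N / MTTDLrs is the probability of data loss:

$$
\mathrm{Durability} = 1 - 2N\mu^{2} \begin{pmatrix} \dfrac{1}{3\mu+v} \\[1ex] \dfrac{1}{2\mu+v} \end{pmatrix}
$$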
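The 2-replica result can be checked numerically with a minimal sketch like the one below. It inverts I − Q with the closed-form 2x2 inverse, sums the rows of M to get MTTDLrs, and applies Durability = 1 − N / MTTDLrs. The values of μ, v and N are hypothetical, N is simply taken as an input exactly as it appears in the formula, and the function names are made up for this example.

```python
from fractions import Fraction


def fundamental_matrix_2x2(Q):
    """Return M = (I - Q)^-1 for a 2x2 block Q, using the 2x2 inverse formula."""
    a, b = 1 - Q[0][0], -Q[0][1]
    c, d = -Q[1][0], 1 - Q[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mttdl_rs(mu, v):
    """Row sums of M: expected steps to data loss from State0 and State1."""
    Q = [[1 - 2 * mu, 2 * mu], [v, 1 - (mu + v)]]
    M = fundamental_matrix_2x2(Q)
    return [sum(row) for row in M]


def durability(mu, v, n):
    """Durability = 1 - N / MTTDLrs, evaluated per starting state."""
    return [1 - n / t for t in mttdl_rs(mu, v)]


if __name__ == "__main__":
    # Hypothetical rates, chosen only for illustration.
    mu = Fraction(1, 1000)
    v = Fraction(1, 10)
    print(mttdl_rs(mu, v))        # equals (3*mu + v)/(2*mu**2), (2*mu + v)/(2*mu**2)
    print(durability(mu, v, 1))
```

Exact fractions are used so that the numeric result can be compared against the closed forms (3μ + v) / 2μ² and (2μ + v) / 2μ² without rounding error.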

Page 13: Durability Simulator Design for OpenStack Swift


For EC Case

Page 14: Durability Simulator Design for OpenStack Swift


Erasure Code Definition

• Object Size (bytes): n

• # of Sliced Raw Objects (data fragments): k

• # of Parities: m

• Total # of Fragments: k + m

• Fragment Size (bytes): n / k (+ checksum)

• Total Stored Size (bytes): Fragment Size * (k + m)

Figure: encode turns an object into k data fragments and m parity fragments; decode rebuilds the object from the fragments.

Terminology Reference: http://specs.openstack.org/openstack/swift-specs/specs/swift/erasure_coding.html
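The size definitions above amount to simple arithmetic, sketched below. Rounding the fragment size up to a whole byte and modelling the checksum as a fixed per-fragment byte count are assumptions of this sketch, and `ec_sizes` is a made-up helper name.

```python
def ec_sizes(n: int, k: int, m: int, checksum: int = 0):
    """Return (fragment size, total stored size) in bytes.

    n: object size, k: # of data fragments, m: # of parities.
    checksum: hypothetical per-fragment checksum overhead in bytes.
    """
    fragment = -(-n // k) + checksum   # ceil(n / k) using integer arithmetic
    total = fragment * (k + m)
    return fragment, total


if __name__ == "__main__":
    # A 1,000,000-byte object with k = 10 data fragments and m = 4 parities.
    print(ec_sizes(1_000_000, 10, 4))
    # 3 replica expressed as k = 1, m = 2.
    print(ec_sizes(1_000_000, 1, 2))
```

Ignoring the checksum, the storage overhead is (k + m) / k, so k = 10, m = 4 stores 1.4 times the object size while 3 replica stores 3 times.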

Page 15: Durability Simulator Design for OpenStack Swift


How to Calculate EC Durability?

• Basic Idea

• Expansion of the Durability Calculation for the Replica Model

• Calculation Using Markov Model (Markov Process) [1]

• Replica Model based on Markov Process:

• 2 Replica -> k = 1, m = 1, i.e. data is lost when 2 fragments are lost

• 3 Replica -> k = 1, m = 2, i.e. data is lost when 3 fragments are lost

※ A Markov Process makes it possible to calculate the durability with matrix calculation. [3]

Page 16: Durability Simulator Design for OpenStack Swift


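Durability Calculation Algorithms

• Algorithms

• State: Status (exists or lost) of all fragments; state i means that i fragments are lost

• Each state transition occurs at a constant rate

• μ = Disk Failure Rate, v = Fragment Repair Rate

• Each rate is related to the # of fragments

• E.g. for RAID, the rate is related to the # of devices: D = # of Devices (RAID5)

• For EC: N = k + m (N fragments located in the system)

• Expand the states up to m + 1 (i.e. data lost)

State transitions for "m" parities EC: 0 ⇄ 1 ⇄ … ⇄ m-1 ⇄ m → m+1

$$
i \to i+1:\ (N-i)\mu \quad (0 \le i \le m),
\qquad
i \to i-1:\ i\,v \quad (1 \le i \le m)
$$

In the transition rate matrix, the diagonal entry of state i is the negative sum of its outgoing rates:

$$
-(N-i)\mu - i\,v
$$

For example, state 0 leaves to state 1 at rate Nμ (diagonal −Nμ), state 1 returns to state 0 at rate v (diagonal −(N−1)μ − v), state m−1 moves to state m at rate (N−(m−1))μ, and state m returns to state m−1 at rate mv or moves to state m+1 at rate (N−m)μ (diagonal −(N−m)μ − mv).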
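The chain above can be turned into a short calculation, sketched here under stated assumptions: state i means i fragments are lost, the failure rate out of state i is (N − i)μ, the repair rate is i·v, and the expected time to reach state m + 1 is obtained by solving a linear system over states 0..m. The function names and the numeric rates are hypothetical; this is not the simulator's actual code.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact when given Fractions)."""
    n = len(A)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def ec_mttdl(k, m, mu, v):
    """Expected time from each state 0..m until state m+1 (data loss)."""
    n_frag = k + m
    size = m + 1  # temporary states 0..m
    A = [[Fraction(0)] * size for _ in range(size)]
    for i in range(size):
        fail = (n_frag - i) * mu   # rate of i -> i+1
        repair = i * v             # rate of i -> i-1
        A[i][i] = fail + repair    # negated diagonal of the rate matrix
        if i + 1 < size:
            A[i][i + 1] = -fail
        if i > 0:
            A[i][i - 1] = -repair
    return solve(A, [Fraction(1)] * size)


if __name__ == "__main__":
    mu, v = Fraction(1, 1000), Fraction(1, 10)  # hypothetical rates
    print(ec_mttdl(1, 1, mu, v))  # 2 replica
    print(ec_mttdl(1, 2, mu, v))  # 3 replica
```

With k = 1 and m = 1 this reproduces the 2-replica result (3μ + v) / 2μ² from state 0, which is a useful sanity check before trying larger k and m.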

Page 17: Durability Simulator Design for OpenStack Swift


Demo

Page 18: Durability Simulator Design for OpenStack Swift


Demo

Page 19: Durability Simulator Design for OpenStack Swift


Demo

Page 20: Durability Simulator Design for OpenStack Swift


Kota Tsuyuzaki [IRC: kota_] [email protected]

NTT Software Innovation Center

Questions?

Etherpad: https://etherpad.openstack.org/p/kilo-swift-durability-simulator