StarFish: highly-available block storage
李益昌 (B00902051) and 何柏勳 (B00902097), third-year Computer Science students
Introduction
  Data protection
    Disk failure vs. catastrophic site failure
    Low price of disk drives and high-speed networking infrastructure
  StarFish
    Survives catastrophic site failure
    Uses an IP network: (1) geographically dispersed, (2) inexpensive
    Good performance
    Block level
Architecture
  One Host Element (HE)
    Provides storage virtualization and a read cache
  N Storage Elements (SEs)
    Q: write quorum size
    Synchronous updates to a quorum of Q SEs, and asynchronous updates to the rest
  Communicate by TCP/IP over a high-speed network
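
The quorum write rule above can be illustrated with a small sketch. It is not the StarFish implementation: `StorageElement` and `HostElement` are hypothetical in-memory stand-ins, threads replace the TCP/IP links, and error handling is reduced to the minimum. The write returns once Q SEs have acknowledged; the remaining SEs finish in the background.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class StorageElement:
    """Hypothetical in-memory stand-in for an SE (no network, no disk)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, seq, block_no, data):
        # Store the block together with the sequence number of the update.
        self.blocks[block_no] = (seq, data)
        return self.name


class HostElement:
    """Hypothetical HE that waits for Q acknowledgements per write."""

    def __init__(self, ses, quorum):
        if not 1 <= quorum <= len(ses):
            raise ValueError("quorum must be between 1 and the number of SEs")
        self.ses = ses
        self.quorum = quorum
        self.seq = 0
        self.pool = ThreadPoolExecutor(max_workers=len(ses))

    def write(self, block_no, data):
        self.seq += 1
        # Send the update to every SE at once.
        futures = [
            self.pool.submit(se.write, self.seq, block_no, data)
            for se in self.ses
        ]
        acked = []
        for fut in as_completed(futures):
            try:
                acked.append(fut.result())
            except OSError:
                continue  # a failed SE does not count toward the quorum
            if len(acked) == self.quorum:
                # Quorum reached: return now, the other writes keep running.
                return acked
        raise RuntimeError("write quorum not reached")


he = HostElement([StorageElement(n) for n in ("near1", "near2", "far")], quorum=2)
acked = he.write(7, b"payload")
he.pool.shutdown(wait=True)  # let the asynchronous write finish
```

With N = 3 and Q = 2, the caller sees the latency of the second-fastest SE rather than the slowest one.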
Architecture
  Recommended configuration
    N = 3, Q = 2
Architecture
  Another configuration
Data consistency and SE recovery
  Log
    Sequence number
    NVRAM
Data consistency
  Failure
    RAID or network connection fails
  SE recovery
    Quick recovery
    Replay recovery
    Full recovery
Availability and reliability analysis
  Parameters
    SE failure rate: λ
    SE recovery rate: μ
    Number of SEs: N
    Quorum size: Q
  Model
    SE failures are i.i.d. Poisson processes with mean rate λ
    SE recoveries are i.i.d. Poisson processes with mean rate μ
    HE failure is a Poisson process with mean rate λ_he
    HE recovery is a Poisson process with mean rate μ_he
Availability
  An HE or SE is available if it can serve data
  Availability of StarFish, A(Q, N): the steady-state probability that at least Q SEs are available
  ρ = λ/μ is called the load (machine-repairman model)
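
As a rough numerical illustration of A(Q, N), the sketch below assumes that every SE fails and recovers independently, so that a single SE is available with steady-state probability 1/(1 + load), where the load is the failure rate divided by the recovery rate. Under that assumption A(Q, N) is a binomial tail sum. The function names are made up for this example, and HE failures are ignored.

```python
from math import comb, log10


def se_availability(load):
    """Steady-state availability of one SE, assuming independent repair."""
    return 1.0 / (1.0 + load)


def starfish_availability(q, n, a):
    """Probability that at least q of n independent SEs are available."""
    if not 1 <= q <= n:
        raise ValueError("need 1 <= q <= n")
    return sum(comb(n, k) * a**k * (1 - a) ** (n - k) for k in range(q, n + 1))


def nines(avail):
    """Number of 9s in an availability figure, e.g. 0.999 is about 3."""
    return -log10(1.0 - avail)


a = se_availability(1 / 99)  # load of 1/99 corresponds to a 99% available SE
for q in (1, 2, 3):
    avail = starfish_availability(q, 3, a)
    print(f"N=3 Q={q}: A={avail:.6f} ({nines(avail):.1f} nines)")
```

For a fixed N the sum loses terms as Q grows, which is why availability drops with a larger quorum.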
Availability(cont.)
  SE availability = 1- X ★ 9 : the number of 9s in an availability measure
  For a fixed N, availability decreases as Q grows
  Trade off availability for reliability
Reliability
  Probability of data loss
    Data is lost if the HE and Q SEs fail
  Reliability increases with larger Q
  Two approaches
    Require Q > floor(N/2) and at least Q SEs available; this reduces availability
    Read-only consistency
Read-only consistency
  StarFish stays available in read-only mode during failures
  Read-only mode obviates the need for Q SEs to be available to handle updates
  Increases availability
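
To get a feel for the gain, the sketch below compares two cases for N = 3 and 99% available SEs. It rests on assumptions that are not stated on the slides: SEs fail independently, the HE never fails, and read-only mode can serve data as long as at least one SE is reachable. The helper name `at_least` is hypothetical.

```python
from math import comb


def at_least(k_min, n, a):
    """Probability that at least k_min of n independent SEs are available."""
    return sum(comb(n, k) * a**k * (1 - a) ** (n - k) for k in range(k_min, n + 1))


n, q, a = 3, 2, 0.99

# Read-write service needs a full write quorum of q SEs.
read_write = at_least(q, n, a)

# Assumed here: read-only service needs only one reachable SE.
read_only = at_least(1, n, a)

print(f"read-write availability: {read_write:.6f}")
print(f"read-only availability:  {read_only:.6f}")
```

Under these assumptions the only unavailable state in read-only mode is the one where all three SEs are down at once.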
Availability with Read-only Consistency
Implementation
Performance measurements
  Setting
    Gigabit Ethernet (GbE) with dummynet controlling delays and bandwidth limits to model Internet links
    Network delays: 1, 2, 4, 8, 23, 36, 65 ms
    Bandwidth limits: 31, 51, 62, 93, 124 Mb/s
  Benchmarks
    Micro-benchmark
    PostMark
Effects of network delays and HE cache size
  A larger cache improves performance
  A larger cache does not change the response time of write requests
Normal Operation and placement of the far SE
Observation
  Performance is affected by two parameters
    Write quorum size Q
    Delay to the SE
  StarFish performs adequately when one of the SEs is placed in a remote location
    At least 85% of the performance of a direct-attached RAID
Recovery
Performance degrades more during full recovery
Conclusion
  The StarFish system reveals significant benefits from a third copy of data at an intermediate distance
  A StarFish system with 3 replicas, a write quorum size of 2, and read-only consistency yields better than 99.9999% availability, assuming individual Storage Element availability of 99%