
Data Consistency in the Structured Peer-to-Peer Network

Cheng-Ying Ou, Polly Huang
Network and Systems Lab

Graduate Institute of Electrical Engineering, College of Electrical Engineering and Computer Science, National Taiwan University

Roadmap

Background, Related Works, Problem Definition, Chord Protocols, Replica Management System, Consistency Enhancements, Evaluation, Conclusion

Structured P2P: Distributed Hash Table (DHT)

Each peer and each data item has a unique ID (key); keys are generated by a hash function

Each peer is responsible for a key space and stores the data whose keys fall within that range

Example (Chord): the peer responsible for key space (8, 14] stores D9, D11, and D14
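A minimal sketch of this key-assignment rule, under the assumption of a Chord-style ring: keys are produced by a hash function, and each key belongs to the first peer whose ID is greater than or equal to it (wrapping around the ring). The 4-bit ID space, the peer IDs, and the data name below are illustrative only.

    # Sketch of Chord-style key assignment; real Chord uses a 160-bit SHA-1 space.
    import hashlib

    RING_BITS = 4                      # tiny ID space: 0..15, for readability
    RING_SIZE = 1 << RING_BITS

    def chord_id(name: str) -> int:
        # Key generated by a hash function, truncated to the ring size.
        digest = hashlib.sha1(name.encode()).digest()
        return int.from_bytes(digest, "big") % RING_SIZE

    def successor(key: int, peer_ids: list[int]) -> int:
        # The responsible peer is the first one whose ID >= key,
        # wrapping around to the smallest ID if necessary.
        for pid in sorted(peer_ids):
            if pid >= key:
                return pid
        return min(peer_ids)

    peers = [2, 8, 14]                  # peer 14 owns the key space (8, 14]
    for data_key in [9, 11, 14]:        # D9, D11, D14 all map to peer 14
        print(f"D{data_key} -> peer {successor(data_key, peers)}")
    print("hash('movie.avi') ->", chord_id("movie.avi"))   # hypothetical data name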

We want to keep data available on the P2P network

The Challenge of Replica Consistency

Churn: peers join, leave, and fail randomly

Fault tolerance: replicas are used to restore lost data

Frequent updates lead to inconsistency between replicas

Data consistency problem

Roadmap

Background, Related Works, Problem Definition, Chord Protocols, Replica Management System, Consistency Enhancements, Evaluation, Conclusion

Replica Consistency Enhancements

B. Temkow, A.-M. Bosneag, X. Li, and M. Brockmeyer, “PaxonDHT: achieving consensus in distributed hash tables,” in Proceedings of the 2005 symposium on Applications and the Internet, 2006

A. Muthitacharoen, S. Gilbert, and R. Morris, “Etna: a fault-tolerant algorithm for atomic mutable DHT data,” MIT-LCS-TR-933, Jun 2005

Paxos: each peer can run consensus
Request the values of all replicas from the other peers
Pick the latest version
Commit this value back to all replicas

Distributed consensus algorithm: 3-phase message exchanges (atomic updates)
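A hedged sketch of the consensus step described above, not a full Paxos implementation and not PaxonDHT's or Etna's actual API: collect every replica's value, pick the latest version, and commit that value back to all replicas. The Replica structure and version numbers are hypothetical simulation objects.

    from dataclasses import dataclass

    @dataclass
    class Replica:
        value: str
        version: int          # logical version number of the stored value

    def consensus(replicas: list[Replica]) -> Replica:
        # Phase 1 (collect): request every replica's current value.
        # Phase 2 (decide): pick the value with the highest version.
        latest = max(replicas, key=lambda r: r.version)
        # Phase 3 (commit): write the chosen value back to all replicas.
        for r in replicas:
            r.value, r.version = latest.value, latest.version
        return latest

    replicas = [Replica("v1", 1), Replica("v2", 2), Replica("v1", 1)]
    consensus(replicas)
    print(all(r.version == 2 for r in replicas))   # True: replicas now agree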

Roadmap

Background, Related Works, Problem Definition, Chord Protocols, Replica Management System, Consistency Enhancements, Evaluation, Conclusion

Problem Definition

How much do these enhancements help to improve data consistency?

Roadmap

Background, Related Works, Problem Definition, Chord Protocols, Replica Management System, Consistency Enhancements, Evaluation, Conclusion

Chord Protocols
Join

Find the successor of the new peer's key ID

Stabilize
Periodically check whether the successor's predecessor is still itself
If not, change the successor to that new peer

[2] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, “Chord: A Scalable Peer-To-Peer Lookup Service for Internet Applications,” in Proceedings of the 2001 ACM SIGCOMM Conference, 2001, pp. 149-160

[Figure from [2]: the ring before joining, before stabilizing, and after stabilizing]
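A rough Python rendering of the join/stabilize idea from [2]; the Peer class, the linear successor lookup, and the in_interval helper are simplifications for illustration, not the paper's exact pseudocode.

    def in_interval(x: int, a: int, b: int, ring: int = 16) -> bool:
        # True if x lies in the open interval (a, b) on a ring of size `ring`.
        a, b, x = a % ring, b % ring, x % ring
        if a < b:
            return a < x < b
        return x > a or x < b          # interval wraps around zero

    class Peer:
        def __init__(self, ident: int):
            self.id = ident
            self.successor = self       # a lone peer points to itself
            self.predecessor = None

        def join(self, known: "Peer"):
            # Find the successor responsible for the new peer's key ID.
            self.successor = known.find_successor(self.id)

        def find_successor(self, key: int) -> "Peer":
            # Linear walk for clarity; real Chord uses finger tables.
            node = self
            while not in_interval(key, node.id, node.successor.id) and key != node.successor.id:
                node = node.successor
            return node.successor

        def stabilize(self):
            # Periodically ask the successor for its predecessor; if a new
            # peer has slipped in between, adopt it as the successor.
            x = self.successor.predecessor
            if x is not None and in_interval(x.id, self.id, self.successor.id):
                self.successor = x
            self.successor.notify(self)

        def notify(self, candidate: "Peer"):
            if self.predecessor is None or in_interval(candidate.id, self.predecessor.id, self.id):
                self.predecessor = candidate

    # Usage: peer 11 joins a stabilized ring {2, 8, 14}; after one round of
    # stabilize calls, the 8 -> 11 -> 14 links are repaired.
    a, b, c = Peer(2), Peer(8), Peer(14)
    a.successor, b.successor, c.successor = b, c, a
    a.predecessor, b.predecessor, c.predecessor = c, a, b
    new = Peer(11)
    new.join(a)
    new.stabilize()
    b.stabilize()
    print(b.successor.id, new.successor.id)   # 11 14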

Roadmap

Background, Related Works, Problem Definition, Chord Protocols, Replica Management System, Consistency Enhancements, Evaluation, Conclusion

Replica Management System

Relies on the underlying DHT
N replicas: stored on the first N successors
Update: update the value of the data in the backup storages

Restore
Ping the predecessor periodically
If the predecessor doesn't respond:

Fetch the backup objects within the new key space
Restore them into its own storage
Expand its key space

Remove other backups

[Figure: the peer responsible for (8, 14] detects that its predecessor, responsible for (2, 8], has failed; it restores the backups for (2, 8] and expands its key space to (2, 14]]

This provides only a partial ability to keep replicas consistent
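A hedged sketch of the update and restore steps listed above. The ReplicaPeer class and its method names are hypothetical and only mirror the steps; wrap-around of the key space at zero is ignored for brevity.

    class ReplicaPeer:
        def __init__(self, low: int, high: int):
            self.key_space = (low, high)     # responsible for keys in (low, high]
            self.storage = {}                # primary copies
            self.backup = {}                 # backup copies of predecessors' data

        def update(self, key: int, value, successors: list["ReplicaPeer"]):
            # Write the new value locally and into the first N successors' backups.
            self.storage[key] = value
            for s in successors:
                s.backup[key] = value

        def on_predecessor_failure(self, pred_space: tuple[int, int]):
            low, _ = pred_space
            # 1. Fetch the backup objects that fall within the new key space.
            promoted = {k: v for k, v in self.backup.items()
                        if low < k <= self.key_space[0]}
            # 2. Restore them into the peer's own storage.
            self.storage.update(promoted)
            # 3. Expand the key space to cover the failed predecessor's range.
            self.key_space = (low, self.key_space[1])
            # 4. Remove the promoted objects from the backup store.
            for k in promoted:
                del self.backup[k]

    # Usage: the peer for (8, 14] takes over (2, 8] when its predecessor fails.
    p = ReplicaPeer(8, 14)
    p.backup = {3: "a", 5: "b"}
    p.on_predecessor_failure((2, 8))
    print(p.key_space, sorted(p.storage))    # (2, 14) [3, 5]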

Roadmap

Background, Related Works, Problem Definition, Chord Protocols, Replica Management System, Consistency Enhancements, Evaluation, Conclusion

Replica Consistency Enhancement

Global information; no message exchanges

Do consensus before restoring: compare all replicas across the whole set of backup storages

Keep the one with the latest value

Restore the lost ones to the backup storage

[Figure: one backup storage holds only D3 while the others hold D3 and D5; after consensus, all backup storages hold D3 and D5]
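A hedged sketch of this enhancement as it could be modelled in a simulator with global information (so no messages need to be exchanged): for each object, compare all replicas across the backup storages, keep the latest version, and restore missing copies. The per-peer dictionaries and the (version, value) layout are assumptions for illustration.

    def consensus_before_restore(backup_storages: list[dict]):
        # backup_storages: one dict per replica peer, mapping key -> (version, value)
        keys = set().union(*backup_storages)
        for key in keys:
            # Compare all existing replicas and keep the one with the latest version.
            latest = max((s[key] for s in backup_storages if key in s),
                         key=lambda vv: vv[0])
            # Restore the latest version to every backup storage,
            # including peers where the object was lost.
            for s in backup_storages:
                s[key] = latest

    storages = [{"D3": (1, "x")},
                {"D3": (1, "x"), "D5": (2, "y")},
                {"D3": (1, "x"), "D5": (2, "y")}]
    consensus_before_restore(storages)
    print(all("D5" in s for s in storages))    # True: the lost D5 is restored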

Roadmap

Background, Related Works, Problem Definition, Chord Protocols, Replica Management System, Consistency Enhancements, Evaluation, Conclusion

Simulation Setup
Rocketfuel topology (AS1755)

492 peers
Simulation time: 240,000 s (5 rounds)
5 replicas [3]

Churn
Session life length

Distribution: Pareto [4][5]

Average: 1, 2, 3, 4, 5 hours [4] [5]

From file-sharing peers (eMule) to Skype superpeers

Inter-session life length
Distribution: exponential [4][5]

Average: 438.5, 877, 1315.5, 1754, 2192.5 minutes [4] [5]
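A hedged sketch of how churn could be generated for the setup above: Pareto-distributed session life lengths and exponentially distributed inter-session lengths [4][5]. The Pareto shape parameter below is an assumption; the slides specify only the distribution and its mean.

    import random

    SHAPE = 2.0                               # assumed Pareto shape (alpha > 1)

    def session_length(mean_hours: float) -> float:
        # For a Pareto distribution with shape a and scale m, the mean is
        # a*m/(a-1); solve for the scale m that matches the requested mean.
        scale = mean_hours * (SHAPE - 1) / SHAPE
        return scale * random.paretovariate(SHAPE)     # hours

    def intersession_length(mean_minutes: float) -> float:
        return random.expovariate(1.0 / mean_minutes)  # minutes

    # e.g. the 1-hour / 438.5-minute configuration from the setup
    print(session_length(1.0), intersession_length(438.5))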

[3] J. Li, J. Stribling, T. Gil, R. Morris, and F. Kaashoek, “Comparing the Performance of Distributed Hash Tables under Churn,” 2004

[4] S. Guha, N. Daswani, and R. Jain, “An Experimental Study of the Skype Peer-to-Peer VoIP System,” 2006
[5] M. Steiner, E. W. Biersack, and T. Ennajjary, “Actively Monitoring Peers in KAD,” 2006

Metrics

Ratio of availability: data that are not lost and are stored on the correct peers

Ratio of inconsistency: data that do not have the latest version of the value

Collected at the end of the simulation; averaged over 5 runs
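A hedged sketch of how the two metrics above could be computed at the end of a run; the data structures are hypothetical.

    def availability_ratio(stored: dict, expected: dict) -> float:
        # Fraction of data items that are not lost and sit on the correct peer.
        # `expected` maps key -> responsible peer; `stored` maps key -> peer
        # actually holding the primary copy (missing keys were lost).
        ok = sum(1 for k, peer in expected.items() if stored.get(k) == peer)
        return ok / len(expected)

    def inconsistency_ratio(versions: dict, latest: dict) -> float:
        # Fraction of data items whose stored version is not the latest one.
        stale = sum(1 for k, v in versions.items() if v != latest[k])
        return stale / len(versions)

    print(availability_ratio({"D9": 14, "D11": 14},
                             {"D9": 14, "D11": 14, "D14": 14}))   # 2/3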

Availability

The enhancement does not help: with the replica management system alone, data are seldom lost from the backup storages

Inconsistency

Packets use acknowledgement and re-sending: if an update packet cannot be delivered successfully, the target peer is treated as a failed peer

Roadmap

Background, Related Works, Problem Definition, Chord Protocols, Replica Management System, Consistency Enhancements, Evaluation, Conclusion

Conclusion

Even when the churn rate is high, the replica consistency enhancement does not perform better than the original system
A simple replica management mechanism is enough

Replica consistency enhancement appears to be overkill given its implementation complexity

We hope our findings provide insight for making cost-effective design decisions

Q & A
