P2P Systems and Distributed Hash Tables
Section 9.4.2
COS 461: Computer Networks, Spring 2011
Mike Freedman
http://www.cs.princeton.edu/courses/archive/spring11/cos461/
P2P as Overlay Networking
• P2P applications need to:
  – Track identities & IP addresses of peers
    • May be many and may have significant churn
  – Route messages among peers
    • If you don't keep track of all peers, this is "multi-hop"
• Overlay network
  – Peers doing both naming and routing
  – IP becomes "just" the low-level transport
Early P2P
Early P2P I: Client-Server
• Napster
  – Client-server search
  – "P2P" file transfer
[Figure: Napster example for xyz.mp3, with steps 1. insert, 2. search, 3. transfer]
Early P2P II: Flooding on Overlays
[Figure: a search for xyz.mp3 is flooded across the overlay, followed by the transfer]
Early P2P II: "Ultra/super peers"
• Ultra-peers can be installed (KaZaA) or self-promoted (Gnutella)
  – Also useful for NAT circumvention, e.g., in Skype
Lessons and Limitations
• Client-Server performs well
  – But not always feasible: performance is often not the key issue!
• Things that flood-based systems do well
  – Organic scaling
  – Decentralization of visibility and liability
  – Finding popular stuff
  – Fancy local queries
• Things that flood-based systems do poorly
  – Finding unpopular stuff
  – Fancy distributed queries
  – Vulnerabilities: data poisoning, tracking, etc.
  – Guarantees about anything (answer quality, privacy, etc.)
Structured Overlays: Distributed Hash Tables
Basic Hashing for Partitioning?
• Consider problem of data partition:
  – Given document X, choose one of k servers to use
• Suppose we use modulo hashing
  – Number servers 1..k
  – Place X on server i = (X mod k)
    • Problem? Data may not be uniformly distributed
  – Place X on server i = hash(X) mod k
    • Problem?
      – What happens if a server fails or joins (k → k±1)?
      – What if different clients have different estimates of k?
      – Answer: Nearly all entries get remapped to new nodes!
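The remapping problem with hash(X) mod k can be checked numerically. The sketch below is illustrative only: the helper names (`stable_hash`, `server_for`, `fraction_moved`), the use of SHA-1, and the 0-based server numbering are assumptions, not part of the slides. When k goes from 10 to 11, a key stays put only if its hash gives the same remainder under both moduli, which happens for roughly 1 key in 11.

```python
import hashlib

def stable_hash(name: str) -> int:
    """Deterministic integer hash of a string (SHA-1 is an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def server_for(name: str, k: int) -> int:
    """Modulo placement: index of the server (0..k-1) that stores this name."""
    return stable_hash(name) % k

def fraction_moved(names, k_old: int, k_new: int) -> float:
    """Fraction of names whose server changes when k_old servers become k_new."""
    moved = sum(1 for x in names if server_for(x, k_old) != server_for(x, k_new))
    return moved / len(names)

docs = [f"doc-{i}" for i in range(10_000)]
print(f"k: 10 -> 11 moves {fraction_moved(docs, 10, 11):.1%} of keys")
```

Running it should report that about nine in ten keys change servers after a single join.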
Consistent Hashing
• Consistent hashing partitions key-space among nodes
• Contact appropriate node to lookup/store key
  – Blue node determines red node is responsible for key1
  – Blue node sends lookup or insert to red node
[Figure: key-space holding key1, key2, key3; the blue node sends insert(key1, value) and lookup(key1) to the red node, which stores key1 = value]
Consistent Hashing
• Partitioning key-space among nodes
  – Nodes choose random identifiers: e.g., hash(IP)
  – Keys randomly distributed in ID-space: e.g., hash(URL)
  – Keys assigned to node "nearest" in ID-space
  – Spreads ownership of keys evenly across nodes
[Figure: ID-space with node IDs 0000, 0010, 0110, 1010, 1100, 1110, 1111; keys URL1, URL2, URL3 at 0001, 0100, 1011]
Consistent Hashing
[Figure: hash circle labeled 0, 4, 8, 12, 14, with a bucket marked]
• Construction
  – Assign n hash buckets to random points on mod 2^k circle; hash key size = k
  – Map object to random position on circle
  – Hash of object = closest clockwise bucket
  – successor(key) → bucket
• Desired features
  – Balanced: No bucket has disproportionate number of objects
  – Smoothness: Addition/removal of bucket does not cause movement among existing buckets (only immediate buckets)
  – Spread and load: Small set of buckets that lie near object
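A minimal sketch of the construction above, to make the smoothness property concrete. The `HashRing` class, the `position` helper, the 32-bit circle size, and the bucket names are all hypothetical choices for illustration; each bucket gets one point on the circle and `successor(key)` returns the closest clockwise bucket.

```python
import bisect
import hashlib

K_BITS = 32  # illustrative circle size: positions are taken mod 2**K_BITS

def position(name: str) -> int:
    """Map a name to a point on the mod 2**K_BITS circle."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** K_BITS)

class HashRing:
    def __init__(self, buckets):
        # Sorted (position, bucket name) pairs, one point per bucket.
        self.points = sorted((position(b), b) for b in buckets)
        self.positions = [p for p, _ in self.points]

    def successor(self, key: str) -> str:
        """Closest bucket clockwise from the key's position, wrapping at the top."""
        i = bisect.bisect_left(self.positions, position(key))
        return self.points[i % len(self.points)][1]

keys = [f"url-{i}" for i in range(5_000)]
before = HashRing(["a", "b", "c", "d"])
after = HashRing(["a", "b", "c", "d", "e"])  # bucket "e" joins
moved = [k for k in keys if before.successor(k) != after.successor(k)]
# Smoothness: every key that moved now belongs to the new bucket "e".
print(len(moved), all(after.successor(k) == "e" for k in moved))
```

Keys only ever move into the newly added bucket; no key is shuffled between the buckets that already existed.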
Consistent hashing and failures
• Consider network of n nodes
• If each node has 1 bucket
  – Owns 1/n-th of keyspace in expectation
  – Says nothing of request load per bucket
• If a node fails:
  – Its successor takes over bucket
  – Achieves smoothness goal: Only localized shift, not O(n)
  – But now successor owns 2 buckets: keyspace of size 2/n
• Instead, if each node maintains v random node IDs, not 1
  – "Virtual" nodes spread over ID space, each of size 1/(vn)
  – Upon failure, v successors take over, each now stores (v+1)/(vn)
[Figure: hash circle labeled 0, 4, 8, 12, 14, with a bucket marked]
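The arithmetic behind the virtual-node argument can be written out as a small calculation. This is a sketch in expectation only: the function name `share_after_failure` is hypothetical, and it assumes the failed node's v ranges land on v distinct successor nodes, which the slide implies but which is not guaranteed for random IDs.

```python
from fractions import Fraction

def share_after_failure(n: int, v: int) -> Fraction:
    """Expected keyspace share of one successor after a single node fails.

    Each of the n nodes holds v virtual IDs, each covering 1/(v*n) of the
    keyspace in expectation. Assumes the failed node's v ranges go to
    v distinct successor nodes, so each successor gains exactly one range.
    """
    per_virtual = Fraction(1, v * n)
    return v * per_virtual + per_virtual  # equals (v + 1) / (v * n)

n = 100
for v in (1, 4, 16):
    share = share_after_failure(n, v)
    print(f"v={v:2d}: successor share = {share}, {float(share * n):.2f}x the average 1/n")
```

With v = 1 the successor's share doubles to 2/n; as v grows, the extra load on any one successor shrinks toward the average.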
Consistent hashing vs. DHTs

                              Consistent Hashing   Distributed Hash Tables
Routing table size            O(n)                 O(log n)
Lookup / Routing              O(1)                 O(log n)
Join/leave: Routing updates   O(n)                 O(log n)
Join/leave: Key movement      O(1)                 O(1)
Distributed Hash Table
[Figure: ID-space with node IDs 0000, 0010, 0110, 1010, 1100, 1110, 1111 and points 0001, 0100, 1011]
• Nodes' neighbors selected from particular distribution
  – Visualize keyspace as a tree in distance from a node
  – At least one neighbor known per subtree of increasing size/distance from node
• Route greedily towards desired key via overlay hops
The Chord DHT
• Chord ring: ID space mod 2^160
  – nodeid = SHA1(IP address, i) for i = 1..v virtual IDs
  – keyid = SHA1(name)
• Routing correctness:
  – Each node knows successor and predecessor on ring
• Routing efficiency:
  – Each node knows O(log n) well-distributed neighbors
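As a rough illustration of how the IDs above could be derived, the sketch below hashes strings with SHA-1 from Python's `hashlib`. The slide only says SHA1(IP address, i); the `"ip,i"` string encoding, the helper names, and the example address are assumptions made here for the sake of a runnable example.

```python
import hashlib

ID_BITS = 160  # SHA-1 digests are 160 bits, so IDs fall in [0, 2**160)

def sha1_id(data: str) -> int:
    """Interpret the SHA-1 digest of a string as an integer on the Chord ring."""
    return int.from_bytes(hashlib.sha1(data.encode()).digest(), "big")

def node_ids(ip: str, v: int) -> list[int]:
    """v virtual IDs for one node; the "ip,i" encoding is an assumed convention."""
    return [sha1_id(f"{ip},{i}") for i in range(1, v + 1)]

def key_id(name: str) -> int:
    """ID of a key, placed in the same space as the node IDs."""
    return sha1_id(name)

ids = node_ids("192.0.2.7", v=3)
print([hex(i)[:12] for i in ids], hex(key_id("xyz.mp3"))[:12])
```

Because nodes and keys share one ID space, a key can be assigned to whichever node ID follows it on the ring.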
Basic lookup in Chord

  lookup (id):
    // comparisons are in ring order: id lies in (pred.id, my.id]
    if ( id > pred.id && id <= my.id )
      return my.id;
    else
      return succ.lookup(id);

• Route hop by hop via successors
  – O(n) hops to find destination id
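A runnable sketch of successor-only lookup follows. It uses a small 8-bit ring rather than 2^160, and the `Node` class, `build_ring`, and `in_half_open` helpers are hypothetical names introduced here. The interval test is done with modular arithmetic so that it also works across the wrap-around point of the ring.

```python
RING = 2 ** 8  # small illustrative ID space; Chord uses 2**160

def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b], walking clockwise from a."""
    if a == b:  # single-node ring: the interval covers everything
        return True
    return 0 < (x - a) % RING <= (b - a) % RING

class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.pred = None
        self.succ = None

    def lookup(self, key_id: int, hops: int = 0):
        """Return (owner id, hops) by following successor pointers only."""
        if in_half_open(key_id, self.pred.id, self.id):
            return self.id, hops
        return self.succ.lookup(key_id, hops + 1)

def build_ring(ids):
    """Link nodes into a ring in ID order (a stand-in for the join protocol)."""
    nodes = [Node(i) for i in sorted(ids)]
    for i, node in enumerate(nodes):
        node.succ = nodes[(i + 1) % len(nodes)]
        node.pred = nodes[i - 1]
    return nodes

nodes = build_ring([5, 40, 90, 130, 200, 250])
print(nodes[0].lookup(100))  # (130, 3): key 100 is owned by node 130
print(nodes[0].lookup(252))  # (5, 0): wraps around the ring to node 5
```

The hop count grows linearly with the number of nodes between the starting node and the owner, which is the O(n) cost the slide mentions.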
Efficient lookup in Chord

  lookup (id):
    // comparisons are in ring order
    if ( id > pred.id && id <= my.id )
      return my.id;
    else
      // fingers() by decreasing distance
      for finger in fingers():
        if finger.id <= id    // farthest finger that does not overshoot id
          return finger.lookup(id);
      return succ.lookup(id);

• Route greedily via distant "finger" nodes
  – O(log n) hops to find destination id
Building routing tables

  For i in 1...log n:
    finger[i] = successor((my.id + 2^i) mod 2^160)
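The finger rule and greedy routing can be combined in one small simulation. This is a sketch under stated assumptions, not Chord's actual implementation: it uses an 8-bit ring, builds all tables centrally in a hypothetical `build` helper instead of via joins, and indexes fingers from 0 so that `fingers[0]` is the immediate successor (the slide indexes from 1).

```python
M = 8          # illustrative ID size in bits; Chord uses 160
RING = 2 ** M

def dist(a: int, b: int) -> int:
    """Clockwise distance from a to b on the ring."""
    return (b - a) % RING

class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.pred = self.succ = None
        self.fingers = []

    def lookup(self, key_id: int, hops: int = 0):
        """Return (owner id, overlay hops) using greedy finger routing."""
        if self.pred is self or 0 < dist(self.pred.id, key_id) <= dist(self.pred.id, self.id):
            return self.id, hops
        # Fingers by decreasing distance: take the farthest one not past the key.
        by_distance = sorted(self.fingers, key=lambda f: dist(self.id, f.id), reverse=True)
        for finger in by_distance:
            if 0 < dist(self.id, finger.id) <= dist(self.id, key_id):
                return finger.lookup(key_id, hops + 1)
        return self.succ.lookup(key_id, hops + 1)

def build(ids):
    """Create nodes and fill in pred, succ, and finger tables centrally."""
    nodes = [Node(i) for i in sorted(ids)]

    def successor(point: int) -> Node:
        # First node at or clockwise after the point.
        return min(nodes, key=lambda n: dist(point, n.id))

    for i, n in enumerate(nodes):
        n.succ, n.pred = nodes[(i + 1) % len(nodes)], nodes[i - 1]
        # 0-based exponent here (an assumption), so fingers[0] is the successor.
        n.fingers = [successor((n.id + 2 ** j) % RING) for j in range(M)]
    return nodes

nodes = build(range(0, 256, 4))  # 64 evenly spaced nodes
owner, hops = nodes[0].lookup(203)
print(owner, hops)  # 204 4
```

On this 64-node ring the lookup reaches the owner in 4 hops, where walking successors alone from node 0 would take 51.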
Joining and managing routing
• Join:
  – Choose nodeid
  – Lookup(my.id) to find place on ring
  – During lookup, discover future successor
  – Learn predecessor from successor
  – Update succ and pred that you joined
  – Find fingers by lookup((my.id + 2^i) mod 2^160)
• Monitor:
  – If a neighbor doesn't respond for some time, find a new one
• Leave: Just go, already!
  – (Warn your neighbors if you feel like it)
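The join steps above can be sketched as a ring splice. This is a simplified, single-process illustration: the `Node`, `owns`, `join`, and `ring_ids` names are hypothetical, finger construction and failure monitoring are left out, and it assumes no two nodes pick the same ID and that no other joins happen concurrently.

```python
RING = 2 ** 8  # illustrative ID space; Chord uses 2**160

class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.pred = self.succ = self  # a lone node is its own neighbor

    def owns(self, key_id: int) -> bool:
        """True if key_id lies in (pred.id, id] going clockwise."""
        if self.pred is self:
            return True
        return 0 < (key_id - self.pred.id) % RING <= (self.id - self.pred.id) % RING

    def lookup(self, key_id: int) -> "Node":
        """Walk successors until the node responsible for key_id is found."""
        node = self
        while not node.owns(key_id):
            node = node.succ
        return node

    def join(self, bootstrap: "Node") -> None:
        """Splice this node into the ring that `bootstrap` belongs to."""
        succ = bootstrap.lookup(self.id)  # lookup(my.id) finds the future successor
        pred = succ.pred                  # learn predecessor from successor
        self.succ, self.pred = succ, pred
        pred.succ = self                  # update pred and succ that we joined
        succ.pred = self

def ring_ids(start: Node) -> list[int]:
    """IDs seen when walking successor pointers once around the ring."""
    ids, node = [start.id], start.succ
    while node is not start:
        ids.append(node.id)
        node = node.succ
    return ids

first = Node(10)
for new_id in (200, 90, 150):
    Node(new_id).join(first)
print(ring_ids(first))  # [10, 90, 150, 200]
```

Each join touches only the new node and its two neighbors, which is why routing correctness needs nothing beyond accurate successor and predecessor pointers.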
DHT Design Goals
• An "overlay" network with:
  – Flexible mapping of keys to physical nodes
  – Small network diameter
  – Small degree (fanout)
  – Local routing decisions
  – Robustness to churn
  – Routing flexibility
  – Decent locality (low "stretch")
• Different "storage" mechanisms considered:
  – Persistence w/ additional mechanisms for fault recovery
  – Best effort caching and maintenance via soft state
Storage models
• Store only on key's immediate successor
  – Churn, routing issues, packet loss make lookup failure more likely
• Store on k successors
  – When nodes detect succ/pred fail, re-replicate
• Cache along reverse lookup path
  – Provided data is immutable
  – …and performing recursive responses
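For the "store on k successors" model, the replica set of a key can be computed from a sorted list of node IDs. The sketch below is one plausible reading, in which the key's immediate successor counts as the first of the k replicas; the `replica_nodes` function and the example IDs are hypothetical.

```python
import bisect

def replica_nodes(node_ids, key_id: int, k: int) -> list[int]:
    """The k nodes that follow key_id clockwise; the first is its immediate successor."""
    ring = sorted(node_ids)
    start = bisect.bisect_left(ring, key_id)  # first node ID at or after the key
    k = min(k, len(ring))                     # cannot replicate on more nodes than exist
    return [ring[(start + j) % len(ring)] for j in range(k)]

ring = [5, 40, 90, 130, 200, 250]
print(replica_nodes(ring, 100, 3))  # [130, 200, 250]
print(replica_nodes(ring, 240, 3))  # [250, 5, 40], wrapping past the top of the ring
```

If node 130 fails, the key is still available on 200 and 250, and re-replication would extend the set to the next node clockwise.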
Summary
• Peer-to-peer systems
  – Unstructured systems
    • Finding hay, performing keyword search
  – Structured systems (DHTs)
    • Finding needles, exact match
• Distributed hash tables
  – Based around consistent hashing with views of O(log n)
  – Chord, Pastry, CAN, Koorde, Kademlia, Tapestry, Viceroy, …
• Lots of systems issues
  – Heterogeneity, storage models, locality, churn management, underlay issues, …
  – DHTs deployed in wild: Vuze (Kademlia) has 1M+ active users