md-khairul-anam
Replica Set Roles
• Heartbeats
• Priority Comparisons
• Optime
• Connections
• Network Partitions
Factors and Conditions that Affect Elections
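Network partitions matter because a member can only become primary if it can reach a majority of the voting members. The following is a minimal Python sketch of that idea, not MongoDB's actual election code. The function names, the dictionary keys, and the choice to compare priority before optime are all assumptions made for illustration.

```python
def has_majority(reachable_votes, total_votes):
    """Return True if the reachable voters form a strict majority of the set."""
    return reachable_votes > total_votes // 2


def pick_candidate(members):
    """Pick a candidate primary from the reachable side of a partition.

    `members` is a list of dicts with hypothetical keys:
    name, priority, optime, reachable.
    Returns None when the reachable side has no majority, or when no
    reachable member has a priority above 0.
    """
    reachable = [m for m in members if m["reachable"]]
    if not has_majority(len(reachable), len(members)):
        return None
    # Priority-0 members never become primary.
    eligible = [m for m in reachable if m["priority"] > 0]
    if not eligible:
        return None
    # Assumption of this sketch: highest priority wins, optime breaks ties.
    best = max(eligible, key=lambda m: (m["priority"], m["optime"]))
    return best["name"]
```

With three members and one unreachable, the remaining two still form a majority and can elect a primary. With two unreachable, the lone survivor cannot.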
Replica Set – 1 Data Center
Single datacenter
Single switch & power
Points of failure:
– Power
– Network
– Data center
Automatic recovery of single node crash
Replica Set – 2 Data Centers
Multi data center
DR node for safety
Can’t do a durable multi data center write safely, since there is only 1 node in the distant DC
Replica Set – 3 Data Centers
Three data centers
Can survive full data center loss
Can do w = { dc : 2 } to guarantee a write in 2 data centers
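A `{ dc : 2 }` rule means a write counts as acknowledged only once members in two distinct data centers have confirmed it. The sketch below models that check in plain Python rather than calling a MongoDB driver. The member names and `dc` tag values are made up for illustration.

```python
def satisfies_tag_rule(acked_members, member_tags, rule):
    """Check a rule such as {"dc": 2}.

    The acknowledging members must cover at least N distinct values
    of each tag named in the rule.
    """
    for tag, required in rule.items():
        distinct_values = {
            member_tags[m][tag] for m in acked_members if tag in member_tags[m]
        }
        if len(distinct_values) < required:
            return False
    return True


# Hypothetical three data center layout.
member_tags = {
    "node1": {"dc": "east"},
    "node2": {"dc": "east"},
    "node3": {"dc": "west"},
    "node4": {"dc": "central"},
}
```

Two acknowledgments from the same data center do not satisfy the rule, which is why the count applies to distinct tag values and not to members.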
User Growth
– 1995: 0.4% of the world’s population
– Today: 30% of the world is online (~2.2B)
Data Set Growth
– Facebook’s data set is around 100 petabytes
– 4 billion photos taken in the last year (4x a decade ago)
Examining Growth
Read/Write Throughput Exceeds I/O
[Figure labels: Working Set, Indexes, Data]
Working Set Exceeds Physical Memory
Vertical Scalability (Scale Up)
Custom Hardware
– Oracle
Custom Software
– Facebook + MySQL
– Google
MongoDB Auto-Sharding
A data store that is
– Free
– Publicly available
– Open Source (https://github.com/mongodb/mongo)
– Horizontally scalable
– Application independent
Data Store Scalability Solutions
Config Server
– Stores cluster chunk ranges and locations
– Can have only 1 or 3 (production must have 3)
– Not a replica set
Meta Data Storage
Mongos
– Acts as a router / balancer
– No local data (persists to config database)
– Can have 1 or many
Routing and Managing Data
• User defines shard key
• Shard key defines range of data
• Key space is like points on a line
• Range is a segment of that line
Partitioning
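The "points on a line" picture can be sketched as a lookup over sorted range boundaries. This is an illustrative Python model only. The split points, shard names, and the integer shard key are hypothetical.

```python
import bisect

# Sorted lower bounds of each range; a range covers [bound, next_bound).
# The first range starts at negative infinity so every key falls somewhere.
RANGE_LOWER_BOUNDS = [float("-inf"), 100, 200, 300]
RANGE_SHARDS = ["shard0", "shard1", "shard2", "shard0"]


def shard_for_key(key):
    """Return the shard that owns the segment of the line containing `key`."""
    idx = bisect.bisect_right(RANGE_LOWER_BOUNDS, key) - 1
    return RANGE_SHARDS[idx]
```

For example, `shard_for_key(150)` falls in the segment `[100, 200)` and resolves to `shard1`.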
• Shard key is used to partition your collection
• Shard key must exist in every document
• Shard key must be indexed
• Shard key is used to route requests to shards
What is a Shard Key
• Initially 1 chunk
• Default max chunk size: 64 MB
• MongoDB automatically splits & migrates chunks when the max is reached
Data Distribution
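The split step can be sketched as follows. This is a simplified Python model, assuming a chunk is a key-sorted list of `(shard_key, doc_size_bytes)` pairs and that a split happens at the middle document. How MongoDB actually picks split points is not modeled here, and neither is migration.

```python
MAX_CHUNK_BYTES = 64 * 1024 * 1024  # default max chunk size stated on the slide


def split_if_needed(chunk, max_bytes=MAX_CHUNK_BYTES):
    """Split a chunk in two when its total size exceeds max_bytes.

    `chunk` is a list of (shard_key, doc_size_bytes) sorted by shard_key.
    Returns a list holding either the original chunk or its two halves.
    """
    total = sum(size for _, size in chunk)
    if total <= max_bytes or len(chunk) < 2:
        return [chunk]
    mid = len(chunk) // 2
    return [chunk[:mid], chunk[mid:]]
```

Starting from a single chunk, repeated inserts followed by this check would gradually produce many smaller chunks that a balancer could then move between shards.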
• Targeted Queries
• Scatter Gather Queries
• Scatter Gather Queries with Sort
Cluster Request Routing
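The three routing cases can be illustrated with a small in-memory model. This is a sketch in Python, not how mongos is implemented. The shard contents, the `user_id` shard key, and the range split at 100 are assumptions.

```python
import heapq

# Hypothetical shards holding documents partitioned by user_id.
SHARDS = {
    "shard0": [{"user_id": 1, "age": 30}, {"user_id": 2, "age": 25}],
    "shard1": [{"user_id": 101, "age": 41}, {"user_id": 102, "age": 19}],
}


def owner_of(user_id):
    """Hypothetical range split: keys below 100 live on shard0."""
    return "shard0" if user_id < 100 else "shard1"


def route(query, shard_key="user_id"):
    """Targeted if the query includes the shard key, scatter gather otherwise."""
    if shard_key in query:
        return [owner_of(query[shard_key])]
    return sorted(SHARDS)


def find(query, sort_by=None):
    """Run the query on each target shard and combine the results."""
    per_shard = []
    for name in route(query):
        docs = [
            d for d in SHARDS[name]
            if all(d.get(k) == v for k, v in query.items())
        ]
        if sort_by:
            # Each shard sorts its own results first.
            docs.sort(key=lambda d: d[sort_by])
        per_shard.append(docs)
    if sort_by:
        # The router merges the already sorted per-shard streams.
        return list(heapq.merge(*per_shard, key=lambda d: d[sort_by]))
    return [d for docs in per_shard for d in docs]
```

A query on `user_id` touches one shard, a query on `age` alone touches every shard, and adding a sort means each shard sorts locally before the router merges the streams.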