VMworld 2013 Jeff Hunter, VMware Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare Vahid Fereydouny, VMware
VMware vSphere Replication:
Technical Walk-Through with Engineering
Aleksey Pershin, VMware
Ken Werneburg, VMware
BCO4977
#BCO4977
2
Agenda
Introduction to vSphere Replication
What’s New in 2013
vSphere Replication and SRM
Configuring VR replication
VR internals
Failover and test
Automated reprotect and failback
Summary
4
vSphere Replication: Protection Built-in to the Platform!
Standalone Protection
VM-by-VM Protection and Recovery
Replication Engine Integrated with the vSphere Platform
Bundled with most vSphere Editions
vSphere Replication enables simple and reliable protection for all Virtual Machines
5
Introduction to vSphere Replication: Protection for SRM
Replication built into vSphere
Replicates individual VMs
Replicates between heterogeneous datastores
Asynchronous replication with RPO >= 15 min
Alternative or augmentation for array-based replication (ABR)
Recovery and test are done through SRM recovery plans
vSphere Replication can be used by SRM as the replication engine
6
vSphere Replication Architecture
[Figure: Protected Site and Recovery Site, each with a vCenter, a VR Appliance, and vSphere (ESX) hosts running VMs. Labelled components: VR Agent, VR vSCSI Filter, NFC Service, further VR Servers, vSphere Web UI.]
8
Top New Features in vSphere Replication
Multiple Points in Time
Multiple vSphere Replication Appliances per vCenter
Support for Storage vMotion
New User Interface Location
Support for vSAN and other VM Storage Policies
Dramatic Speed Improvement
9
Open Topologies with up to 10 vSphere Replication Appliances
Replicate to or between remote sites with or without a vCenter server present!
[Figure: Main Office Datacenter, Secondary DC, and Remote Office. Each site has storage and vSphere hosts running the VR Agent; VR Appliances, a VR Server, and vCenter Servers are shown, with VMDK1, VMDK2, and VMDK3 each replicated to another site as (VMDK1), (VMDK2), and (VMDK3).]
10
Up to 24 Points in Time Retained to Allow Reversion of VM State
Retention policy is specified during configuration of replication
11
Protected Site Storage vMotion Now Supported
Manually migrate VMs or even use Storage DRS to ease management
[Figure: replication from Protected Site to Recovery Site]
Storage vMotion can now be used for protected virtual machines.
Only protected site VMDKs can be migrated: recovery ‘shadow’ objects are fixed.
12
VM Storage Policy and vSAN Interoperability
Administrator chooses a VM Storage Policy: only valid datastores are selectable
13
VR Now Found Under the Corresponding vCenter
vSphere Replication now easier to find and more intuitive to manage
14
Each vCenter Now Has “Monitor” and “Manage” for VR
vSphere Replication now easier to find and more intuitive to manage
15
Dramatic Performance Improvement
[Figure: VR Agent to VR Server data path, 5.5 behaviour compared with 5.1 behaviour]
Increased parallelism and more efficient throughput mean faster replication,
pushing more data. Replicate more, with no performance cost!
New TCP Stack Optimized for Latency
Buffered IO for NFC Writes
Coalesced Contiguous Writes
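The "Coalesced Contiguous Writes" item can be illustrated with a minimal sketch. The slides do not describe how vSphere Replication merges writes, so the function below is only a generic extent-merging example; the name coalesce_extents and the (offset, length) representation are assumptions made for illustration.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (offset, length) in bytes; hypothetical representation


def coalesce_extents(extents: List[Extent]) -> List[Extent]:
    """Merge extents that touch or overlap into single larger extents."""
    merged: List[Extent] = []
    for offset, length in sorted(extents):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # This extent starts inside or right at the end of the previous one.
            last_offset, last_length = merged[-1]
            new_end = max(last_offset + last_length, offset + length)
            merged[-1] = (last_offset, new_end - last_offset)
        else:
            merged.append((offset, length))
    return merged


# Four 4 KiB writes, three of which are contiguous.
writes = [(0, 4096), (4096, 4096), (16384, 4096), (8192, 4096)]
coalesced = coalesce_extents(writes)
print(coalesced)
```

Three of the four writes collapse into one 12 KiB extent, so the target sees two larger writes instead of four small ones.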
17
vSphere Replication and SRM
SVR is now independent of SRM
SVR can replicate within a single vCenter
SRM can be installed after SVR
• SRM will discover and use SVR and its replication
Upgrade to SRM
• Gain automation, test recovery, failback, customization, reporting...
SVR and SRM can coexist
See a more detailed session on using VR and SRM
INF-BCO5129 “Protection for All – vSphere Replication + SRM Technical Update”
18
Architecture: vSphere Replication with Site Recovery Manager
“Protected” Site “Recovery” Site
[Figure: “Protected” Site and “Recovery” Site. Each site has a vSphere Client with the SRM Plug-In, an SRM Server, a vCenter Server, and a VR App, each with its own DB, plus ESX hosts on VMFS storage. VRA components on the ESX hosts and a VR Server are shown along the replication path.]
20
Configuring VR Replication
VR replication is configured per VM in vCenter
Selectable RPO from 15 min up to 24 hours
Selectable destination datastore (per virtual disk)
Select MPIT policy
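The limits on these settings can be captured in a small validation sketch. ReplicationSettings is a hypothetical record invented for this example and is not a VMware API type. The 15 minute to 24 hour RPO range comes from this slide, and the cap of 24 retained points in time comes from the earlier MPIT slide.

```python
from dataclasses import dataclass
from typing import Dict, List

MIN_RPO_MINUTES = 15        # from the slide: RPO selectable from 15 min
MAX_RPO_MINUTES = 24 * 60   # ... up to 24 hours
MAX_POINTS_IN_TIME = 24     # from the MPIT slide: up to 24 points in time


@dataclass
class ReplicationSettings:
    """Hypothetical per-VM settings record, for illustration only."""
    vm_name: str
    rpo_minutes: int
    disk_datastores: Dict[str, str]  # virtual disk label -> destination datastore
    points_in_time: int = 0          # 0 means keep only the latest replica

    def problems(self) -> List[str]:
        """Return a list of human-readable validation problems (empty if valid)."""
        found = []
        if not MIN_RPO_MINUTES <= self.rpo_minutes <= MAX_RPO_MINUTES:
            found.append(f"RPO {self.rpo_minutes} min is outside 15 min to 24 h")
        if not 0 <= self.points_in_time <= MAX_POINTS_IN_TIME:
            found.append(f"{self.points_in_time} points in time exceeds {MAX_POINTS_IN_TIME}")
        for disk, datastore in self.disk_datastores.items():
            if not datastore:
                found.append(f"{disk}: no destination datastore selected")
        return found


ok = ReplicationSettings("web01", 240, {"Hard disk 1": "recovery-ds-01"}, 3)
bad = ReplicationSettings("db01", 5, {"Hard disk 1": ""}, 30)
print(ok.problems())
print(bad.problems())
```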
21
Configuring VR Replication: Multiple VMs
All VMs will have the same settings (RPO, quiescence, etc.)
22
Datastore Mappings Ease Mass Protection of Systems
23
Seeding the Initial Copy to Save Time and Bandwidth
The user can provide the seed for the initial copy
The seed can be delivered through any out-of-band channel
The more recent, the better
The user directs the wizard to the seed files when configuring replication
If using seeds when configuring en masse:
• The seed files must be placed in a specific way at the target
• Refer to the VR user manual for more details
25
First, It Does an Initial Full Sync of Source and Target
Compares disk IDs to avoid mismatches
Calculates checksum of all blocks at source and target
Exchanges and compares checksums to determine delta
Replicates all changed blocks necessary to align VMDKs
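[Figure: the Source Disk holds blocks A B C D E and the Seed Disk holds A C; the missing blocks B D E are sent over tcp/31031, leaving the target with A B C D E]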
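The checksum comparison in the initial full sync can be sketched as follows. This is a minimal illustration, not the VR implementation. The slides do not name the checksum algorithm or block size, so SHA-256 and a 4-byte block are assumptions chosen to keep the example small, and all function names are hypothetical.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4  # tiny block size, for illustration only


def block_checksums(disk: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Checksum every block of a disk image."""
    return [hashlib.sha256(disk[i:i + block_size]).digest()
            for i in range(0, len(disk), block_size)]


def blocks_to_replicate(source: bytes, seed: bytes) -> List[int]:
    """Compare checksums and return the indexes of blocks that differ."""
    src = block_checksums(source)
    tgt = block_checksums(seed)
    return [i for i, checksum in enumerate(src)
            if i >= len(tgt) or checksum != tgt[i]]


def full_sync(source: bytes, seed: bytes) -> bytes:
    """Copy only the differing blocks from source onto the seed."""
    target = bytearray(seed.ljust(len(source), b"\0")[:len(source)])
    for i in blocks_to_replicate(source, seed):
        start = i * BLOCK_SIZE
        target[start:start + BLOCK_SIZE] = source[start:start + BLOCK_SIZE]
    return bytes(target)


# Source holds blocks A B C D E; the seed already holds A and C.
source_disk = b"AAAABBBBCCCCDDDDEEEE"
seed_disk = b"AAAA" + b"\0" * 4 + b"CCCC" + b"\0" * 8
delta = blocks_to_replicate(source_disk, seed_disk)
print(delta)
print(full_sync(source_disk, seed_disk) == source_disk)
```

Only blocks 1, 3, and 4 (B, D, and E) cross the wire, which is why a recent seed saves time and bandwidth.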
26
After Full Sync, We Switch to Sending the Delta
Light-Weight Deltas
• Crash consistent if quiescing is turned off
• Allows cross-disk consistency within a VM
• Ongoing I/O not penalized with replication active
• Lightweight snapshots are not the same as VM snapshots (redo logs)
27
Normally Sends Only Changed Blocks
Switches to delta after first sync
VR Agent tracks all changing blocks via vSCSI filter
Changed blocks replicated as per RPO
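[Figure: Source Disk and Target Disk both hold A B C D E; blocks A, C, and D change at the source, giving AII B CI DI E; only A, C, and D are sent over tcp/44046, and the target becomes AII B CI DI E]
Disks are always consistent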
28
Lightweight Snapshots and the LWD Protocol
Writes tracked by vSCSI filter driver
Each replica corresponds to a lightweight snapshot
Bitmap of changed blocks is maintained between replications
During a sync changed blocks are read and sent to the target
LWD protocol – Light Weight Deltas
• Port 31031 – Initial replication traffic
• Port 44046 – Ongoing replication traffic
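The bitmap of changed blocks can be sketched as a small tracker. This is a simplified model, not the vSCSI filter itself. The class and method names are hypothetical, and freezing the bitmap in begin_sync stands in for taking a lightweight snapshot.

```python
from typing import List


class ChangedBlockTracker:
    """Hypothetical model of a per-disk bitmap of changed blocks."""

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        self.dirty = [False] * num_blocks

    def record_write(self, offset: int, length: int) -> None:
        """Mark every block touched by a guest write as changed."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        if last >= len(self.dirty):
            raise ValueError("write extends past the end of the disk")
        for block in range(first, last + 1):
            self.dirty[block] = True

    def begin_sync(self) -> List[int]:
        """Freeze the current bitmap for sending and start a fresh one.

        Writes that arrive while the sync is running land in the new bitmap
        and are picked up by the next sync.
        """
        frozen = [i for i, changed in enumerate(self.dirty) if changed]
        self.dirty = [False] * len(self.dirty)
        return frozen


tracker = ChangedBlockTracker(num_blocks=5, block_size=4096)
tracker.record_write(0, 100)        # touches block 0 (A)
tracker.record_write(8192, 8192)    # touches blocks 2 and 3 (C and D)
first_sync = tracker.begin_sync()
tracker.record_write(4096, 1)       # arrives during the sync, touches block 1
second_sync = tracker.begin_sync()
print(first_sync, second_sync)
```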
29
Replication Consistency
Maintains point-in-time consistency
• VM has a known RPO
Guarantees cross-disk consistency
• All disks within a VM treated as an entity
Every replica is a crash consistent image of the VM
• A VMDK will never be corrupt
Guest quiescing adds file system consistency
• Improves OS recoverability with VSS
App-level quiescing adds application level consistency
• Flush application writers with VSS
30
Protecting against Network Failures
VR vSCSI filter discards a snapshot only after a sync is completed
VR Server writes each replica into a separate redo log
A redo log is snapshotted only after a sync is completed
Old replicas are collapsed only after a sync is completed
There is always at least one valid replica that corresponds to a valid lightweight snapshot
[Figure: blocks changed, LWD shipped, redo log collected, write committed to replica VMDKs]
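The rule that nothing is committed until a sync completes can be sketched with a toy model of the recovery-site side. ReplicaTarget and its methods are hypothetical names. The sketch also simplifies the design on the slide by folding a completed redo log straight into a single replica instead of keeping one redo log per replica.

```python
class ReplicaTarget:
    """Hypothetical model of one replicated disk at the recovery site."""

    def __init__(self, blocks):
        self.valid_replica = dict(blocks)  # last fully synced image
        self.redo_log = None               # blocks received for the sync in flight

    def begin_sync(self):
        self.redo_log = {}

    def receive(self, block_index, data):
        if self.redo_log is None:
            raise RuntimeError("no sync in progress")
        self.redo_log[block_index] = data

    def complete_sync(self):
        """Commit the redo log only once every block of the sync has arrived."""
        if self.redo_log is None:
            raise RuntimeError("no sync in progress")
        self.valid_replica.update(self.redo_log)
        self.redo_log = None

    def abort_sync(self):
        """Drop a partial redo log; the last valid replica is untouched."""
        self.redo_log = None


target = ReplicaTarget({0: "A", 1: "B", 2: "C"})

target.begin_sync()
target.receive(0, "A2")
target.abort_sync()                      # network failure mid-sync
after_failure = dict(target.valid_replica)

target.begin_sync()
target.receive(0, "A2")
target.receive(2, "C1")
target.complete_sync()
after_success = dict(target.valid_replica)
print(after_failure, after_success)
```

On the source side the same idea applies in reverse: the changed-block record for a sync is kept until the sync is confirmed complete, so an interrupted sync can be retried.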
31
The Replication Scheduler
The scheduler runs in the VR agent on each ESX host
Minimizes RPO violations across all VMs on the host
Tries to minimize the overall bandwidth usage within RPO constraints
Statistical analysis to predict sync durations
Can do “early syncs” in anticipation of large syncs
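The slides do not give the scheduler's algorithm, so the following is only a rough sketch of the idea. It assumes the predicted sync duration is the mean of recent sync durations and that the host syncs the VM with the least slack before its RPO would be violated. All names and the slack formula are assumptions.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class VmReplicationState:
    """Hypothetical per-VM bookkeeping kept by a scheduler."""
    name: str
    rpo_minutes: float
    minutes_since_last_snapshot: float
    recent_sync_minutes: List[float] = field(default_factory=list)

    def predicted_sync_minutes(self) -> float:
        """Predict the next sync duration from recent history (simple mean)."""
        return mean(self.recent_sync_minutes) if self.recent_sync_minutes else 0.0

    def slack_minutes(self) -> float:
        """Time left before a sync must start to finish within the RPO."""
        return (self.rpo_minutes
                - self.minutes_since_last_snapshot
                - self.predicted_sync_minutes())


def next_vm_to_sync(vms: List[VmReplicationState]) -> VmReplicationState:
    """Pick the VM closest to (or furthest past) an RPO violation."""
    return min(vms, key=lambda vm: vm.slack_minutes())


vms = [
    VmReplicationState("web01", rpo_minutes=15, minutes_since_last_snapshot=5,
                       recent_sync_minutes=[2, 4]),
    VmReplicationState("db01", rpo_minutes=60, minutes_since_last_snapshot=50,
                       recent_sync_minutes=[12, 8]),
]
chosen = next_vm_to_sync(vms)
print(chosen.name, chosen.slack_minutes())
```

Here db01 has a longer RPO but also longer syncs, so it has no slack left and goes first. Starting a large sync before its slack reaches zero is one way to read the "early syncs" item.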
32
Retain Historical Replications as Snapshots
[Figure: vSphere host with VR Agent]
After recovery, use the snapshot manager to revert to earlier points
Retention of multiple points in time allows reversion to earlier known good states
33
Multiple Points in Time Saved Intelligently
Ongoing Protection: replication running; the current replica and previous replicas are retained
During Recovery: replication halted; recovers to the most recent replica, the others are snapshots
34
Replication Differs from Retention - Example
MPIT retention policy: keep 3 replicas per 24 hour retention period = 1 retained every 8 hours
4 hour RPO = ~6 replications during the day
Of the 6 replica snapshots created, only 3 are kept during the 24 hour period
Retains the most recent up-to-date snapshot within an 8 hour period
Replications: 12AM 4AM 8AM 12PM 4PM 8PM
Retained: 4AM 12PM 8PM
Retains only a subset of the replicas in accordance with policy
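The arithmetic of this example can be checked with a short sketch. It assumes the rule stated on this slide (keep the most recent replica in each retention slot) and uses an arbitrary calendar date. The function name retained_instances is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def retained_instances(instances: List[datetime], window_start: datetime,
                       retention_hours: int = 24, keep: int = 3) -> List[datetime]:
    """Keep the newest replica in each of `keep` equal slots of the window."""
    slot = timedelta(hours=retention_hours) / keep   # 24 h / 3 = 8 h per slot
    newest_per_slot = {}
    for taken_at in sorted(instances):
        index = (taken_at - window_start) // slot    # which slot this replica is in
        if 0 <= index < keep:
            newest_per_slot[index] = taken_at        # later replicas overwrite earlier ones
    return [newest_per_slot[i] for i in sorted(newest_per_slot)]


day = datetime(2013, 8, 26)                          # arbitrary date for the example
replicas = [day + timedelta(hours=h) for h in (0, 4, 8, 12, 16, 20)]  # 4 hour RPO
kept = retained_instances(replicas, window_start=day)
print([t.strftime("%I%p") for t in kept])
```

With a 4 hour RPO and three 8 hour slots, the replicas kept are the ones taken at 4AM, 12PM, and 8PM, matching the timeline on the slide.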
35
Replication Slots Differ from Replication Instances
The most recent complete instance is *always* preserved even though it might be the second instance in the slot.
This ensures you can always failover to the most recent copy.
36
Replication Slots Differ from Replication Instances
The oldest instance in any given retention slot is preserved, as is the most recent replication.
37
MPIT Presented as VM Snapshots after Failover
Use the snapshot manager to revert to earlier points, an interface all administrators have been comfortable with for many years.
38
SRM and VR Interop Resolution
Point in time recovery is available in SRM when using vSphere Replication
Use the SRM Advanced Settings dialog to instruct SRM to preserve the MPIT images
vrReplication.preserveMpitImagesAsSnapshots
On by default, change at both sites if desired
40
Failover and Test
During a failover, a replica is surfaced as a VM in vCenter
• Replication is automatically stopped
• MPIT replicas are either collapsed, to avoid a performance penalty at runtime, or preserved as VM snapshots
During a test (SRM only), a snapshot of a replica is surfaced as a VM
• Replication continues to run while test is in progress
• The test VM can write to the disks without affecting the replicas
• After the test, the test snapshot is discarded
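The way a test VM can write without affecting the replica can be illustrated with a copy-on-write overlay. This is a conceptual sketch using Python's collections.ChainMap, not how ESX implements snapshots. The block contents and variable names are made up.

```python
from collections import ChainMap

# Replica disk at the recovery site, modelled as block index -> contents.
replica = {0: "A", 1: "B", 2: "C"}

# The test snapshot: writes go into the empty overlay, reads fall through
# to the replica for any block the test VM has not written.
test_writes = {}
test_disk = ChainMap(test_writes, replica)

test_disk[1] = "B-test"            # the test VM writes to its disk
seen_by_test_vm = dict(test_disk)  # what the test VM reads back

test_writes.clear()                # after the test, discard the snapshot
print(seen_by_test_vm, replica)
```

Because every write lands in the overlay, replication can keep updating the replica while the test runs, and discarding the overlay leaves the replica exactly as it was.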
42
SRM Allows for Automated Reprotect and Failback
SRM provides additional automation workflows
• Reprotect
• Test recovery after reprotect
• Failback
A successful planned migration is required for reprotect
• Failover shuts down protected VMs and disables power-on
• All VM files are left at the protected site
Reprotect automatically configures VMs for replication in the opposite direction
• All replication settings preserved
• Original VMs used as seeds
• Detects manually configured replications
[Figure: VMDK1 and VMDK2 with their replica copies (VMDK1) and (VMDK2)]
44
Summary
vSphere Replication provides robust and cost-effective replication
More features and improvements coming in 2013
• Multiple Point In Time
• Multiple replication appliances per vCenter
• SDRS and Storage vMotion support
• New and improved UI
• Support for vSAN and storage classes
• Dramatic performance improvements
vSphere Replication for SMBs
• Offered with Essentials Plus licenses and above
• Can be upgraded to SRM to provide automation, test, failback
45
More Good Stuff!
http://blogs.vmware.com/vSphere/Uptime
46
Other VMware Activities Related to This Session
HOL:
HOL-SDC-1305
Business Continuity and Disaster Recovery In Action
Group Discussions:
BCO1003-GD
Disaster Recovery and Replication with Ken Werneburg
THANK YOU