
Varrow Madness 2014 - VSAN Deep Dive


  • © 2014 VMware Inc. All rights reserved.

    VMware Virtual SAN: Below the Marketing

    Jason Nash, VCDX, Chief Technology Officer, Varrow [email protected] @TheJasonNash

  • Storage & Availability Technical Marketing Team

    Group Manager: Charu Chaubal [email protected]
    Storage: Rawlinson Rivera [email protected], Wade Holmes [email protected]
    Availability: Ken Werneburg [email protected], Jeff Hunter [email protected], GS Khalsa [email protected]
    Big Data: Justin Murray [email protected]

    Resources:
    Blog: http://blogs.vmware.com/vsphere
    Tablet & Mobile app: http://vmware.com/go/vmkp
    Hands-on Labs: http://hol.vmware.com
    Product Walkthrough Demos: http://vmwarewalkthroughs.com
    VM Vault: search by topic or author

  • Virtual SAN Technical Walkthrough

    1 SDS and Virtual SAN Overview
    2 Use Cases
    3 Hardware Requirements
    4 Technical Characteristics and Architecture
    5 Configuration Walkthrough
    6 Virtual Machine Provisioning Operations
    7 Resiliency and Failure Scenarios
    8 Interoperability
    9 Design and Sizing
    10 Troubleshooting

  • VMware Storage Innovations

    2005-2007, VI 3.x: VMFS, Snapshots, Storage vMotion, NAS & iSCSI support
    2008-2010, vSphere 4.x: Thin Provisioning, Storage I/O Control, Boot from SAN, VAAI, Linked mode
    2011-2013, vSphere 5.x: Storage DRS, Profile-driven Storage, VASA, vSphere Storage Appliance, vSphere Data Protection, vSphere Replication, vSphere Flash Read Cache
    2014+: Software-defined Storage

  • Hypervisor-Converged Opportunities

    Why can the virtualization platform play a critical role in solving storage problems?

    Inherent knowledge of applications
    Global view of the infrastructure
    Hardware agnostic

    Hypervisor-converged storage solutions abstract the plumbing to optimize storage for applications.

    (Diagram: vSphere layered over SAN & NAS, all-flash, BLOB, DAS, and server-side flash.)

  • Virtual SAN

    Also written as vSAN or VSAN.

  • VMware Software-defined Storage

    Bringing compute's operational model to storage:
    Abstraction and pooling
    Infrastructure integration
    New storage tiers
    VM-centric data services
    Third-party services integration
    Common policy-based automation and orchestration

    (Diagram: a Policy-driven Control Plane and Virtual Data Services (Data Protection, Mobility, Performance) sit above the Virtual Data Plane, which spans the SAN/NAS pool of LUNs (including VVOL), the hypervisor-converged storage pool of Virtual SAN on x86 servers, and an Object Storage Pool of cloud object storage.)

  • VMware Virtual SAN

    Software-defined storage solution.
    Aggregates locally attached storage from each ESXi host in a cluster.
    Flash-optimized storage solution.
    VM-centric data operations and policy-driven management principles.
    Resilient design based on a distributed RAID architecture: no single point of failure.
    Fully integrated with vSphere.

    (Diagram: vSphere + Virtual SAN pooling each host's SSDs and hard disks into the Virtual SAN shared datastore.)

    A hypervisor-converged storage platform.

  • Deeply Integrated with the VMware Stack

    Bringing the benefits of VMware's products to make storage easy:
    Data Protection: VDP Advanced, vSphere Replication, snapshots and linked clones
    Disaster Recovery: Site Recovery Manager
    Cloud Ops and Automation: vCenter Operations Manager, vCloud Automation Center (IaaS)
    Virtual Desktop: VMware Horizon View
    vSphere: vMotion, vSphere HA, DRS, Storage vMotion
    All driven through Storage Policy-Based Management.

  • Virtual SAN is NOT a Virtual Storage Appliance

    Virtual SAN is fully integrated with vSphere (ESXi & vCenter); drivers embedded in ESXi 5.5 contain the Virtual SAN smarts.
    Kernel modules:
    Provide the shortest path for I/O
    Remove the unnecessary management overhead of dealing with an appliance
    Do not consume resources unnecessarily

    Virtual SAN is embedded into vSphere; it is not a VSA.

  • VMware Virtual SAN

    Hybrid storage solution: magnetic disks (HDD) plus flash-based disks (SSD).
    Storage scale-out architecture built into the hypervisor.
    Dynamic capacity and performance scalability.
    Object-based storage architecture.
    Interoperable with vSphere and enterprise features: vMotion, DRS, vSphere HA.

    (Diagram: the Virtual SAN shared datastore built from each host's SSDs and hard disks.)

    Radically simple hypervisor-converged storage software.

  • Virtual SAN Key Benefits

    Radically Simple:
    Installs in two clicks
    Managed from the vSphere Web Client
    Policy-based management
    Self-tuning and elastic
    Deep integration with the VMware stack
    Embedded in the vSphere kernel

    High Performance:
    Flash-accelerated
    Up to 915K IOPS from a 16-node cluster
    Matches the VDI density of an all-flash array
    Best price/performance

    Lower TCO:
    Eliminates large upfront investments (CAPEX)
    Grow-as-you-go (OPEX)
    Flexible choice of industry-standard hardware
    Does not require specialized skills

  • Storage Policy-Based Management

    Simplifies and automates storage management:
    Per-VM storage service levels (capacity, performance, availability) from a single self-tuning datastore
    Policies set based on application needs (SLAs)
    Software automates control of service levels
    No more LUNs!

    (Diagram: per-VM storage policies applied to the VSAN shared datastore through vSphere + VSAN.)

  • Virtual SAN Puts the App in Charge

    Simpler, automated storage management through an application-centric approach.

    Today:
    1. Pre-define storage configurations
    2. Pre-allocate static bins
    3. Expose pre-allocated bins
    4. Select appropriate bin
    5. Consume from pre-allocated bin
    Result: overprovisioning (better safe than sorry!), wasted resources, wasted time, frequent data migrations.

    With VSAN:
    1. Define storage policy
    2. Apply policy at VM creation
    Resources and data services are automatically provisioned and maintained.
    Result: no overprovisioning, fewer resources and less time, easy to change.

  • Oh yeah! Scalability…

    vsanDatastore:
    4.4 petabytes
    2 million IOPS
    32 hosts

  • VMware Virtual SAN Use Cases

  • Use Cases

    Virtual desktop (VDI)
    Tier 2 / Tier 3, Test / Dev / Staging
    Management clusters
    Backup and DR target (replication from Site A to Site B)
    DMZ / isolated environments
    Private cloud
    ROBO (remote office / branch office)

  • VMware Virtual SAN Hardware Requirements

  • Hardware Requirements

    Any server on the VMware Compatibility Guide.
    SSDs, HDDs, and storage controllers must be listed on the VMware Compatibility Guide for VSAN:
    http://www.vmware.com/resources/compatibility/search.php?deviceCategory=vsan

    ESXi 5.5 hosts: minimum 3, maximum 32
    1Gb/10Gb NIC
    SAS/SATA controllers (RAID controllers must work in pass-through or RAID0 mode)
    SAS/SATA/PCIe SSDs and SAS/NL-SAS/SATA HDDs: at least 1 of each per host
    4GB to 8GB USB or SD cards (for ESXi boot)

  • Flash-Based Devices

    In Virtual SAN, ALL read and write operations always go directly to the flash tier.

    Flash-based devices serve two purposes in Virtual SAN:
    1. Non-volatile write buffer (30%): writes are acknowledged when they enter the prepare stage on the SSD, reducing write latency.
    2. Read cache (70%): cache hits reduce read latency; a cache miss retrieves the data from HDD.

    Choice of hardware is the #1 performance differentiator between Virtual SAN configurations.
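
    To make the split concrete, here is a minimal Python sketch (illustration only, not VMware tooling) of how a single SSD's capacity divides between the two roles described above:

    # Minimal sketch: how one SSD's capacity is split between Virtual SAN's
    # write buffer (30%) and read cache (70%).
    def flash_split(ssd_gb: float) -> dict:
        return {
            "write_buffer_gb": ssd_gb * 0.30,  # non-volatile write buffer
            "read_cache_gb": ssd_gb * 0.70,    # read cache
        }

    print(flash_split(400))  # a 400 GB SSD -> 120 GB buffer, 280 GB cache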

  • Flash-Based Devices

    VMware SSD performance classes (writes per second):
    Class A: 2,500-5,000
    Class B: 5,000-10,000
    Class C: 10,000-20,000
    Class D: 20,000-30,000
    Class E: 30,000+

    Examples:
    Intel DC S3700 SSD: ~36,000 writes per second -> Class E
    Toshiba SAS SSD MK2001GRZB: ~16,000 writes per second -> Class C

    Workload definition: queue depth 16 or less; transfer length 4KB; operation: write; pattern: 100% random; latency less than 5 ms.

    Endurance: 10 Drive Writes per Day (DWPD), and random write endurance up to 3.5 PB at 8KB transfer size per NAND module, or 2.5 PB at 4KB transfer size per NAND module.

  • Magnetic Disks (HDD)

    SAS/NL-SAS/SATA HDDs supported:
    7,200 RPM for capacity
    10,000 RPM for performance
    15,000 RPM for additional performance

    NL-SAS provides higher HDD controller queue depth at the same rotational speed and a similar price point; NL-SAS is recommended if choosing between SATA and NL-SAS.

    Differentiate performance between clusters with SSD selection and the SSD:HDD ratio. The rule-of-thumb guideline is flash equal to 10% of anticipated capacity usage.

  • Storage Controllers

    SAS/SATA storage controllers: pass-through or RAID0 mode supported.
    Performance in RAID0 mode is controller dependent; check with your vendor for SSD performance behind a RAID controller.
    Storage controller queue depth matters: a higher controller queue depth increases performance.
    Validate the number of drives supported by each controller.

  • Storage Controllers: RAID0 Mode

    Configure all disks in RAID0 mode: flash-based devices (SSD) and magnetic disks (HDD).
    Disable the storage controller cache: performance is better when caching is controlled by Virtual SAN.
    Device cache support: flash-based devices leverage write-through caching.
    ESXi may not be able to differentiate flash-based devices from magnetic devices behind a RAID0 volume; use ESXCLI to manually flag those devices as SSD.

  • Network

    1Gb / 10Gb supported; 10Gb shared with NIOC for QoS will support most environments.
    If 1Gb, dedicated links for Virtual SAN are recommended.
    Jumbo frames provide a nominal performance increase; enable them for greenfield deployments.
    Virtual SAN supports both VSS & VDS (NIOC requires VDS). Nexus 1000v should work but hasn't been fully tested.
    Network bandwidth has more impact on host evacuation and rebuild times than on workload performance.

  • Firewalls

    Virtual SAN Vendor Provider (VSANVP): inbound and outbound, TCP 8080
    Cluster Monitoring, Membership, and Directory Services (CMMDS): inbound and outbound, UDP 12345 and 23451
    Reliable Datagram Transport (RDT): inbound and outbound, TCP 2233
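
    When troubleshooting connectivity, the TCP ports above can be probed from a management host. A minimal Python sketch (the IP address is a hypothetical example; the UDP CMMDS ports cannot be verified with a plain connect, so only TCP is checked):

    # Probe the Virtual SAN TCP ports listed above from a management host.
    import socket

    VSAN_TCP_PORTS = {8080: "VSANVP", 2233: "RDT"}

    def check_host(ip: str, timeout: float = 2.0) -> None:
        for port, service in VSAN_TCP_PORTS.items():
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.settimeout(timeout)
                state = "open" if s.connect_ex((ip, port)) == 0 else "unreachable"
                print(f"{ip}:{port} ({service}) -> {state}")

    check_host("10.4.90.27")  # hypothetical ESXi host VSAN VMkernel IP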

  • VMware Compatibility Guide

  • Two Ways to Build a Virtual SAN Node

    1. VSAN Ready Node: a preconfigured server ready to run VSAN, with 10 different options across multiple third-party vendors available at GA.
    2. Build your own: choose individual components using the VSAN Compatibility Guide*:
    Any server on the vSphere Hardware Compatibility List
    Multi-level cell SSD (or better), or PCIe SSD
    SAS/NL-SAS HDDs, or select SATA HDDs
    6Gb enterprise-grade HBA/RAID controller

    * Note: for additional details, please refer to the Virtual SAN VMware Compatibility Guide page.

    Radically simple hypervisor-converged storage.

  • VMware Virtual SAN Technical Characteristics and Architecture

  • Technical Characteristics

    Virtual SAN is a cluster-level feature, similar to vSphere DRS and vSphere HA.
    Deployed, configured, and managed from vCenter through the vSphere Web Client (ONLY!).
    Radically simple: configure a VMkernel interface for Virtual SAN, then enable Virtual SAN by clicking "Turn On".

  • Virtual SAN Implementation Requirements

    Virtual SAN requires:
    Minimum of 3 hosts in a cluster configuration; all 3 hosts MUST contribute storage
    vSphere 5.5 U1 or later
    Maximum of 32 hosts
    Locally attached disks: magnetic disks (HDD) and flash-based devices (SSD)
    Network connectivity: 1Gb Ethernet, or 10Gb Ethernet (preferred)

    (Diagram: esxi-01, esxi-02, and esxi-03 in a vSphere 5.5 U1 cluster, each contributing local SSD + HDD storage.)

  • Storage Policy-Based Management

    SPBM is a storage policy framework built into vSphere that enables policy-driven virtual machine provisioning.

    Virtual SAN leverages this framework in conjunction with the VASA APIs to expose storage characteristics to vCenter:
    Storage capabilities: the underlying storage surfaces to vCenter what it is capable of offering.
    Virtual machine storage requirements: requirements can only be set against available capabilities.
    VM Storage Policies: the construct that stores a virtual machine's storage provisioning requirements, expressed in terms of storage capabilities.

  • Virtual SAN SPBM Object Provisioning Mechanism

    (Diagram: a datastore profile defined in the Storage Policy Wizard flows through SPBM to the VSAN object manager, which lays out each virtual disk as a VSAN object.)

    VSAN objects may be (1) mirrored across hosts and (2) striped across disks/hosts to meet VM storage profile policies.

  • Virtual SAN Constructs and Artifacts

    New Virtual SAN constructs, artifacts, and terminology:
    Disk Groups, the VSAN Datastore, Objects, Components, and the Virtual SAN Network.

  • Virtual SAN Disk Groups

    Virtual SAN uses the concept of disk groups to pool flash devices and magnetic disks together as single management constructs.
    A disk group is composed of at least 1 flash device and 1 magnetic disk.
    Flash devices are used for performance (read cache + write buffer).
    Magnetic disks are used for storage capacity.
    Disk groups cannot be created without a flash device.

    Each host: 5 disk groups max. Each disk group: 1 SSD + 1 to 7 HDDs.

  • Virtual SAN Datastore

    Virtual SAN is an object store solution that is presented to vSphere as a file system.
    The object store mounts the local VMFS-L volumes from all hosts in a cluster and presents them as a single shared datastore.
    Only members of the cluster can access the Virtual SAN datastore.
    Not all hosts need to contribute storage, but it's recommended.

    (Diagram: five hosts, each with up to 5 disk groups of 1 SSD + 1 to 7 HDDs, joined over the VSAN network into one vsanDatastore.)

  • Virtual SAN Objects

    Virtual SAN manages data in the form of flexible data containers called objects; virtual machine files are referred to as objects.
    There are four different types of virtual machine objects: VM Home, VM swap, VMDK, and Snapshots.
    Virtual machine objects are split into multiple components based on the performance and availability requirements defined in the VM Storage Policy.

    (Diagram: the same vsanDatastore cluster layout as above.)

  • Virtual SAN Components

    Virtual SAN components are chunks of objects distributed across multiple hosts in a cluster in order to tolerate simultaneous failures and meet performance requirements.
    Virtual SAN utilizes a distributed RAID architecture to distribute data across the cluster.
    Components are distributed with the use of two main techniques: striping (RAID0) and mirroring (RAID1).
    The number of component replicas and copies created is based on the object's policy definition.

    (Diagram: replica-1 and replica-2 of an object held in a RAID1 relationship across hosts on the vsanDatastore.)

  • Object and Component Layout

    (Diagram: a VM's Virtual SAN storage objects laid out across five hosts. Availability is defined as the number of copies: a RAID1 mirror whose low-level storage objects reside on different hosts. Performance may include a stripe width: RAID0 stripes within each replica.)

    The VM Home directory object is formatted with VMFS to allow a VM's configuration files (.vmx, logs, etc.) to be stored on it.
    Example path: /vmfs/volumes/vsanDatastore/rolo/rolo.vmdk

  • Virtual SAN Network

    New Virtual SAN traffic VMkernel interface, dedicated to Virtual SAN intra-cluster communication and data replication.
    Supports both Standard and Distributed vSwitches; leverage NIOC for QoS in shared scenarios.
    NIC teaming is used for availability, not for bandwidth aggregation.
    Layer 2 multicast must be enabled on physical switches; it is much easier to manage and implement than Layer 3 multicast.

    (Diagram: Management, Virtual Machines, vMotion, and Virtual SAN traffic on one Distributed Switch with 20/30/50/100 NIOC shares across uplink1 and uplink2.)

  • Virtual SAN Network

    NIC teaming and load balancing algorithms:
    Route based on Port ID: active/passive with explicit failover
    Route based on IP Hash: active/active with an LACP port channel
    Route based on Physical NIC Load: active/active with an LACP port channel
    Active/active designs require multi-chassis link aggregation capable switches.

    (Diagram: the same Distributed Switch layout with 100/150/250/500 NIOC shares.)

  • Virtual SAN Scalable Architecture

    Scale-up and scale-out architecture: granular and linear scaling of storage, performance, and compute:
    Per magnetic disk for capacity
    Per flash-based device for performance
    Per disk group for performance and capacity
    Per node for compute capacity

    (Diagram: add disks or disk groups to scale up within a host; add hosts to scale out the vsanDatastore.)

  • VMware Virtual SAN Configuration Walkthrough Demo

  • Configuring VMware Virtual SAN

    Radically simple configuration procedure:
    1. Set up the Virtual SAN network
    2. Enable Virtual SAN on the cluster
    3. Select Manual or Automatic mode
    4. If Manual, create disk groups

  • Configure Network

    Configure the new dedicated Virtual SAN network using the vSphere Web Client network template configuration feature.

  • Enable Virtual SAN: One Click Away!!!

    With Virtual SAN configured in Automatic mode, all empty local disks are claimed by Virtual SAN for the creation of the distributed vsanDatastore.
    With Virtual SAN configured in Manual mode, the administrator must manually select disks to add to the distributed vsanDatastore by creating disk groups.

  • Disk Management

    Each host in the cluster creates one or more disk groups, each containing a combination of HDDs and an SSD.

  • Virtual SAN Datastore

    A single Virtual SAN datastore is created and mounted, using storage from all hosts and disk groups in the cluster.
    The Virtual SAN datastore is automatically presented to all hosts in the cluster.
    The Virtual SAN datastore enforces thin-provisioned storage allocation by default.

  • VM Storage Policies

    VM Storage Policies are accessible from the vSphere Web Client Home screen.

  • Virtual SAN Capabilities

    Virtual SAN currently surfaces five unique storage capabilities to vCenter.

  • Number of Failures to Tolerate

    Defines the number of host, disk, or network failures a storage object can tolerate.
    For n failures tolerated, n+1 copies of the object are created, and 2n+1 hosts contributing storage are required.

    (Diagram: with "Number of failures to tolerate = 1", a vmdk becomes a RAID1 pair of replicas on esxi-01 and esxi-03 plus a witness on esxi-02, with ~50% of I/O going to each replica.)
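
    A quick sanity check of that arithmetic in Python (an illustration of the n+1 / 2n+1 rule above, not VMware tooling):

    # "Failures to tolerate" arithmetic: n failures -> n+1 replicas,
    # and 2n+1 hosts contributing storage (replicas plus witnesses).
    def ftt_requirements(n: int) -> tuple:
        return n + 1, 2 * n + 1

    for n in range(4):  # the policy maximum is FTT = 3
        replicas, hosts = ftt_requirements(n)
        print(f"FTT={n}: {replicas} copies, {hosts} hosts contributing storage")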

  • Number of Disk Stripes Per Object

    The number of HDDs across which each replica of a storage object is distributed. Higher values may result in better performance.

    (Diagram: with "Number of failures to tolerate = 1" and "Stripe width = 2", each RAID1 replica is itself a RAID0 pair: stripe-1a/stripe-1b on one side, stripe-2a/stripe-2b on the other, plus a witness.)

  • Virtual SAN Storage Capabilities

    Force provisioning: if yes, the object is provisioned even if the policy specified in the storage policy is not satisfiable with the resources currently available.
    Flash read cache reservation (%): flash capacity reserved as read cache for the storage object, specified as a percentage of the logical size of the object.
    Object space reservation (%): percentage of the logical size of the storage object that will be reserved (thick provisioned) upon VM provisioning; the rest of the storage object is thin provisioned.

  • Virtual SAN I/O Flow: Write Acknowledgement

    VSAN mirrors write I/Os to all active mirrors; writes are acknowledged when they hit the flash buffer!
    Destaging from flash to HDD is done independently on each host.

    (Diagram: a write to a RAID1 vmdk pair on esxi-01 and esxi-03, with the witness on esxi-02.)

  • Virtual SAN I/O Flow: 1MB Increment Striping

    VSAN is thin provisioned by default; stripes grow in increments of 1MB.

    (Diagram: RAID0 stripes (stripe-1a/stripe-1b) filling with 1MB segments round-robin: 1MB(1), 1MB(2), 1MB(3), 1MB(4), 1MB(5), where (x) indicates the stripe segment.)

  • Components and Objects Visualization

    Visualize the mapping and layout of all objects and components via the vSphere Web Client or RVC.

  • Storage Capabilities: Recommended Practices

    Storage Capability                                 Use Case             Value
    Number of failures to tolerate (RAID1 mirror)      Redundancy           Default 1, Max 3
    Number of disk stripes per object (RAID0 stripe)   Performance          Default 1, Max 12
    Object space reservation                           Thick provisioning   Default 0, Max 100%
    Flash read cache reservation                       Performance          Default 0, Max 100%
    Force provisioning                                 Override policy      Disabled

  • VM Storage Policies Recommendations

    Number of disk stripes per object: leave at 1, unless the IOPS requirements of the VM are not being met by the flash layer.
    Flash read cache reservation: leave at 0, unless there is a specific performance requirement to be met by a VM.
    Proportional capacity (object space reservation): leave at 0, unless thick provisioning of virtual machines is required.
    Force provisioning: leave disabled, unless the VM must be provisioned even when out of compliance.

  • VMware Virtual SAN Virtual Machine Provisioning Operations Demo

  • Virtual Machine Provisioning Operations

    All VM provisioning operations include access to VM Storage Policies.

  • Virtual Machine Provisioning Operations

    If the VSAN datastore understands the capabilities in the VM Storage Policy, it is displayed as a matching resource.

  • Virtual Machine Provisioning Operations

    If the VSAN datastore can satisfy the VM Storage Policy, the VM Summary tab displays the VM as compliant.
    If not (due to failures, or the force provisioning capability), the VM is shown as non-compliant.

  • Virtual Machine Policy Management

    Modify VM performance, capacity, and availability requirements without downtime.

  • VMware Virtual SAN Resiliency & Failure Scenarios

  • Understanding Failure Events

    Virtual SAN recognizes two different types of hardware device events in order to classify a failure scenario: Absent and Degraded.

    Absent events trigger the delayed (60-minute) recovery operations:
    Virtual SAN waits 60 minutes before starting object and component recovery
    60 minutes is the default setting for all absent events
    Configurable via host advanced settings (VSAN.ClomRepairDelay)

  • Understanding Failure Events

    Degraded events trigger immediate recovery operations:
    Recovery of objects and components starts immediately
    Not configurable

    Detected I/O errors from any of the following are always deemed degraded: magnetic disk failures, flash-based device failures, storage controller failures.
    Any of the following are always deemed absent: network failures, network interface card (NIC) failures, host failures.

  • Failure Handling Philosophy

    Traditional SANs:
    A physical drive must be replaced to get back to full redundancy
    Hot-spare disks are set aside to take the role of failed disks immediately
    In both cases: 1:1 replacement of the disk

    Virtual SAN:
    The entire cluster is a hot spare; the goal is always to get back to full redundancy
    When a disk fails, many small components (stripes or mirrors of objects) fail
    New copies of these components can be spread around the cluster for balancing
    Replacing the physical disk just adds back resources

  • Managing Failure Scenarios

    Through policies, VMs on Virtual SAN can tolerate multiple failures:
    Disk failure: degraded event, rebuild starts immediately
    SSD failure: degraded event, rebuild starts immediately
    Controller failure: degraded event, rebuild starts immediately
    Network failure: absent event, rebuild starts after 60 minutes
    Server failure: absent event, rebuild starts after 60 minutes

    VMs continue to run; parallel rebuilds minimize the performance pain.

  • Virtual SAN Access Rules

    Component access rules (the logic is implemented per object):
    At least 1 mirror copy must be intact
    All stripes must be intact
    Greater than 50% of components must be available, including witnesses

    A power-on operation requires: 1 mirror copy, all stripes available, and > 50% of components and witnesses.
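
    The per-object logic can be sketched conceptually in Python (an illustration of the rules above, not VSAN's actual implementation; witnesses are modeled as plain availability flags):

    # Conceptual sketch of the access rules: an object is accessible when at
    # least one mirror has all of its stripes intact AND more than 50% of all
    # components, witnesses included, are available.
    def object_accessible(mirrors, witnesses):
        # mirrors: list of mirrors, each a list of per-stripe availability flags
        full_mirror_intact = any(all(stripes) for stripes in mirrors)
        votes = [s for stripes in mirrors for s in stripes] + witnesses
        quorum = sum(votes) > len(votes) / 2
        return full_mirror_intact and quorum

    # FTT=1, stripe width 2: one stripe of mirror A lost, mirror B intact.
    print(object_accessible([[True, False], [True, True]], [True]))  # True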

  • Magnetic Disk Failure: Instant Mirror Copy

    Degraded: all impacted components on the failed disk are instantaneously recreated on other disks, disk groups, or hosts.

    (Diagram: disk failure on one host; an instant mirror copy of the impacted vmdk component is created elsewhere in the cluster.)

  • Flash-Based Device Failure: Instant Mirror Copy

    Degraded: all impacted components in the failed device's disk group are instantaneously recreated on other disks, disk groups, or hosts.
    Greater impact on the cluster's overall storage capacity, since the entire disk group is affected.

    (Diagram: the same instant mirror-copy recovery flow as a magnetic disk failure.)

  • Host Failure: 60 Minute Delay

    Absent: Virtual SAN waits the default 60 minutes before starting to copy the affected objects and components to other disks, disk groups, or hosts.
    Greater impact on the cluster's overall compute and storage capacity.

    (Diagram: a host fails; after the 60-minute wait, a new mirror copy of the impacted component is created on a surviving host.)

  • Network Failure: 60 Minute Delay

    Absent: Virtual SAN waits the default 60 minutes before starting to copy the affected objects and components to other disks, disk groups, or hosts.
    NIC failures and physical network failures can lead to network partitions; multiple hosts could be impacted in the cluster.

    (Diagram: the same 60-minute-wait recovery flow as a host failure.)

  • Virtual SAN: 1 Host Isolated, HA Restart

    (Diagram: one host is isolated from the VSAN network; vSphere HA restarts the VM on a surviving host that still has access to the object.)

  • Virtual SAN: 2 Hosts Isolated, HA Restart

    (Diagram: two hosts are isolated; vSphere HA restarts the VM on esxi-02 / esxi-03, because together they own > 50% of the object's components.)

  • Virtual SAN Partition with HA Restart

    (Diagram: the cluster splits into Partition 1 and Partition 2; vSphere HA restarts the VM in Partition 2, because it owns > 50% of the object's components.)

  • Maintenance Mode: Planned Downtime

    3 maintenance mode options: Ensure accessibility, Full data migration, No data migration.

  • VMware Virtual SAN Interoperability Technologies and Products

  • Technology Interoperability

    Virtual SAN is fully integrated with many of VMware's storage and vSphere availability enterprise features.

    Supported: Virtual Machine Snapshots, vSphere HA, vSphere DRS, vMotion
    Not applicable: Storage I/O Control (SIOC), Storage DRS, Distributed Power Management (DPM)
    Future: 62 TB VMDKs, vCOps

  • Horizon View

    Virtual SAN and Horizon View:
    Handles peak performance events such as boot, login, and read/write storms
    Seamless granular scaling without huge upfront investments
    Supports high VDI density
    Supports high-end virtual desktop GPU requirements

    Virtual SAN is compatible with Horizon View 5.3 (SPBM manually implemented).
    Policies are maintained across operations such as refresh/recompose: no need to re-associate.

    Full clone policies:
    FTT = 1 for persistent desktops
    FTT = 0 for non-persistent desktops
    Provisioning: 100% reserved

    Linked clone policies:
    OS disk: FTT = 1 for dedicated pools, FTT = 0 for floating pools
    Replica disk: FTT = 1, read cache reservation 10%
    Provisioning: thin

  • vSphere Replication and Site Recovery Manager

    Virtual SAN is compatible with:
    vSphere Replication 5.5 (vSphere Web Client): SPBM configured as part of replication
    vCenter Site Recovery Manager 5.5 (vSphere C# client): SRM configuration based on VR replication

    vSphere Replication & vCenter Site Recovery Manager:
    Asynchronous replication: 15-minute RPO
    VM-centric protection
    Automated DR operation & orchestration
    Automated failover: execution of user-defined plans
    Automated failback: reverses the original recovery plan
    Planned migration: ensures zero data loss
    Point-in-time recovery: multiple recovery points
    Non-disruptive test: automated tests on an isolated network

    (Diagram: a production site on vSphere + Virtual SAN replicating to a recovery site.)

  • vSphere Data Protection

    Virtual SAN and vSphere Data Protection:
    Radically simple to deploy and manage
    Integrated user interface in the vSphere Web Client
    Highly available storage solution
    Increased operational efficiency

    vSphere Data Protection Advanced 5.5:
    Source and target de-duplication capabilities
    Bidirectional replication capabilities
    Secure, easy, reliable, network-efficient replication
    Application-consistent backup and recovery capabilities
    Higher RTO and RPO: 24-hour RTO, minutes-to-hours RPO

    Incorporated technologies:
    vStorage APIs for Data Protection
    Changed Block Tracking (CBT)
    Avamar variable-length segment algorithm

  • vCloud Automation Center

    vCloud Automation Center provides Virtual SAN:
    Centralized provisioning, governance, and infrastructure management capabilities
    Simple and self-service consumption capabilities
    Entitlement, compliance monitoring, and enforcement
    Leverages existing business processes and tools
    Delegated control of resources

    Custom use of VM Storage Policies:
    Virtual SAN default policy
    Blueprints / VM templates
    Via vCenter Orchestrator with a custom workflow
    Via vCloud Automation Center Designer, modifying the provisioning workflow

  • OpenStack

    Virtual SAN and the OpenStack framework:
    Cloud-ready, app-to-hypervisor converged solution
    Leverages flash-optimized storage in OpenStack
    Resiliency for legacy and cloud-ready applications
    vSphere Web Client plug-in for the OpenStack UI

    Virtual SAN interoperates with the OpenStack framework through the vSphere drivers:
    Nova (compute node): vSphere driver
    Cinder (volume service): vSphere driver
    Glance (image store): vSphere datastore driver
    Neutron (networking): NSX driver
    Plus Swift (object store), Keystone (identity service), and the Horizon dashboard.

  • VMware Virtual SAN Design & Sizing Guidelines Exercise

  • Virtual SAN Datastore

    The distributed datastore's capacity is determined by aggregating the disk groups across all hosts that are members of the vSphere cluster, based on the size of the magnetic disks.
    Only the usable capacity of the magnetic disks counts toward the total capacity of the Virtual SAN datastore.
    The capacity of the flash-based devices is dedicated entirely to Virtual SAN's caching layer.

    (Diagram: up to 5 disk groups per host, each 1 SSD + 1 to 7 HDDs, aggregated into the vsanDatastore.)

  • Objects

    An object is an individual storage block device compatible with SCSI semantics. Each object that resides on the Virtual SAN datastore is comprised of multiple components.
    Objects are assigned storage performance and availability requirements through VM Storage Profiles.

    Object types:
    VM Home: location where a virtual machine's configuration files reside (.vmx, log files, etc.)
    Swap: unique storage object, only created when the virtual machine is powered on
    VMDK: virtual machine disk file
    Snapshots: unique storage objects created for virtual machine snapshots

  • Components

    Objects are comprised of components that are distributed across the hosts in the vSphere cluster.
    Virtual SAN 5.5 currently supports a maximum of 3,000 components per host.
    Objects greater than 255 GB in capacity are automatically divided into multiple components.
    Each component consumes 2 megabytes of disk capacity for metadata.
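
    A rough estimating sketch of that component math in Python (an approximation built from the rules above, not VSAN's exact placement logic; witnesses are excluded because their count depends on layout):

    import math

    # Data components for one object: replicas x stripes x 255 GB chunks.
    def data_components(size_gb, ftt=1, stripes=1):
        replicas = ftt + 1                 # mirror copies
        chunks = math.ceil(size_gb / 255)  # objects split at 255 GB
        return replicas * stripes * chunks

    print(data_components(1024, ftt=1, stripes=2))  # 1 TB VMDK -> 2*2*5 = 20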

  • Witness

    Witness components are part of every storage object and contain only object metadata.
    They serve as tiebreakers when availability decisions are made in the Virtual SAN cluster, in order to avoid split-brain behavior.
    Each Virtual SAN witness component also consumes 2 megabytes of capacity.

  • Virtual SAN Datastore Sizing Considerations

    It is important to understand the impact of the availability and performance storage capabilities on the consumption of storage capacity:
    Number of Failures to Tolerate
    Number of Disk Stripes per Object
    Flash Read Cache Reservation
    Object Space Reservation

  • Disk Groups

    A disk group is a single flash-based device (SAS/SATA/PCIe SSD) plus one or more magnetic disks (SAS/SATA HDD).
    Disk groups make up the distributed flash tier and the storage capacity of the Virtual SAN datastore.
    Disk groups are formatted with a modified on-disk file system (VMFS-L) and then mounted into the object store file system as a single datastore.
    VMFS-L on-disk formatting consumes a total of 750 megabytes of capacity per disk.

    Artifact                            Minimum                Maximum
    Disk groups                         1 per host             5 per host
    Flash devices (SAS/SATA/PCIe SSD)   1 per disk group       1 per disk group
    Magnetic disk devices               1 HDD per disk group   7 HDDs per disk group
    Disk formatting overhead            750 MB per HDD         750 MB per HDD

  • Number of Failures to Tolerate

    Largest impact on the consumption of storage capacity in Virtual SAN.
    Based on a virtual machine's availability requirements, the setting defined in a VM Storage Policy can lead to the consumption of up to four times the capacity of the virtual machine or individual disks.
    Example at FTT = 1: 2 full copies of the data + 1 witness.

  • Number of Disk Stripes Per Object

    If Number of Disk Stripes per Object is increased beyond the default value of 1, each stripe counts as a separate component.
    This has an impact on the total number of components supported per host (3,000); see the component-count sketch above.

  • Disk Group Design

    One flash device per disk group: with multiple flash-based devices, multiple disk groups are created to leverage the additional flash.
    The higher the ratio of flash-device capacity to magnetic-disk capacity, the greater the size of the cache layer.
    Multiple disk groups also define and reduce the storage failure domains: an SSD failure affects only its own disk group.

    (Diagram: each disk group as a failure domain; 5 disk groups max per host, each 1 SSD + 1 to 7 HDDs.)

  • Flash Capacity Sizing

    The general recommendation for sizing Virtual SAN's flash capacity is 10% of the anticipated consumed storage capacity, before Number of Failures to Tolerate is considered.
    The total flash capacity percentage should be based on use case, capacity, and performance requirements: 10% is a general recommendation; it could be too much, or it may not be enough.

    Measurement Requirements                  Values
    Projected VM space usage                  20 GB
    Projected number of VMs                   1,000
    Total projected space consumption         20 GB x 1000 = 20,000 GB = 20 TB
    Target flash capacity percentage          10%
    Total flash capacity required             20 TB x 0.10 = 2 TB
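
    The same rule of thumb as a small Python helper (the 10% default mirrors the guideline above):

    # Rule-of-thumb flash sizing: 10% of anticipated consumed capacity,
    # measured before Number of Failures to Tolerate is applied.
    def flash_required_gb(vm_space_gb, vm_count, flash_pct=0.10):
        return vm_space_gb * vm_count * flash_pct

    print(flash_required_gb(20, 1000))  # 20 GB x 1000 VMs x 10% = 2000.0 GB (2 TB)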

  • Sizing Exercise: Formulas

    Constraints:
    VSAN component and VMFS metadata overhead (VSANmetaDataOverhead): 1 GB per disk

    Variables:
    Number of hosts per cluster (Hst) = 8
    Number of disk groups per host (NumDskGrpPerHst) = 5
    Number of disks per disk group (NumDskPerDskGrp) = 7
    Size of disks (SzHDD) = 4,000 GB
    Number of Failures to Tolerate (ftt) = 1
    Number of virtual machines (VMs) = 800
    Number of disks per virtual machine (NumOfVMDK) = 1
    Memory per virtual machine (vmSwp) = 10 GB

    Cluster raw capacity:
    Formula: Hst x NumDskGrpPerHst x NumDskPerDskGrp x SzHDD
    Example: 8 x 5 x 7 x 4,000 GB = 1,120,000 GB = 1,120 TB

  • Sizing Exercise: Formulas

    VMFS metadata:
    Formula: VMFSMetadata x NumDskGrpPerHst x NumDskPerDskGrp
    Example: 750 MB x 5 x 7 = 26,250 MB = 26.2 GB VMFS metadata

    Objects:
    Formula: VMs x [VMnamespace + vmSwap + NumOfVMDK]
    Example: 800 x [1 + 1 + 1] = 2,400 objects
    Note: snapshots, clones, and disk stripes > 1 would add more objects.

    Components:
    Formula: Objects x [ftt x 2 + 1]
    Example: 2,400 x (1 x 2 + 1) = 7,200 components = 900 average components per host (max is 3,000 per host)

    Component metadata:
    Formula: NumComponents x compMetadata
    Example: 7,200 components x 2 MB = 14.4 GB component metadata

  • Sizing Exercise: Formulas

    VSAN metadata:
    Formula: compMetadata + VMFSMetadata
    Example: 14.4 GB + 26.2 GB = 40.6 GB VSAN metadata
    Simplified formula: NumDskGrpPerHst x NumDskPerDskGrp x NumHosts x 1 GB
    Simplified example: 5 x 7 x 8 x 1 GB = 280 GB
    (The simplified formula accounts for factors such as snapshots, additional stripes, etc.)

    Swap utilization:
    Formula: VMs x vmSwp x 2
    Example: swap space = 800 x 10 GB x 2 = 16,000 GB

    Disk capacity:
    Available capacity = raw capacity - swap capacity = 1,120,000 - 16,000 = 1,104,000 GB = 1,104 TB

    Usable capacity:
    Formula: (DiskCapacity - VSANMetadata) / (ftt + 1)
    Example: (1,104,000 GB - 280 GB) / 2 = 1,103,720 GB / 2 = 551,860 GB usable capacity
    Best practice: allocate no more than 80% to virtual disks.
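
    The whole exercise collected into one runnable Python sketch (it mirrors the deck's own formulas, including the simplified 1 GB-per-disk metadata rule; it is not an official VMware sizing tool):

    # Virtual SAN 5.5 sizing sketch following the formulas above (GB units).
    hosts = 8              # Hst
    disk_groups = 5        # NumDskGrpPerHst
    disks_per_group = 7    # NumDskPerDskGrp
    hdd_size = 4000        # SzHDD
    ftt = 1                # Number of Failures to Tolerate
    vms = 800
    vmdk_per_vm = 1        # NumOfVMDK
    vm_swap = 10           # vmSwp

    raw = hosts * disk_groups * disks_per_group * hdd_size        # 1,120,000
    objects = vms * (1 + 1 + vmdk_per_vm)  # namespace + swap + VMDKs: 2,400
    components = objects * (ftt * 2 + 1)   # replicas + witness: 7,200
    per_host = components / hosts          # 900 (limit: 3,000 per host)

    vsan_metadata = hosts * disk_groups * disks_per_group * 1     # 280
    swap_capacity = vms * vm_swap * 2                             # 16,000

    available = raw - swap_capacity                               # 1,104,000
    usable = (available - vsan_metadata) / (ftt + 1)              # 551,860
    print(f"raw={raw:,} GB  components/host={per_host:.0f}  usable={usable:,.0f} GB")
    print(f"80% guideline: {usable * 0.8:,.0f} GB for virtual disks")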

  • Memory and CPU

    Memory requirements for Virtual SAN are defined by the number of disk groups and disks managed by the hypervisor.
    vSphere hosts with more than 32 gigabytes of RAM can support the maximum disk group and disk configuration supported by Virtual SAN.
    Virtual SAN is designed to introduce no more than 10% CPU overhead per host. Consider this in Virtual SAN implementations with high consolidation ratios and CPU-intensive application requirements.

  • Network

    Virtual SAN network activity can potentially saturate and overwhelm an entire 1GbE network, particularly during rebuild and synchronization operations.
    Separate the different traffic types (Management, vMotion, Virtual Machine, Virtual SAN) onto different VLANs, and use shares as a quality-of-service mechanism to sustain the expected level of performance during possible contention scenarios.
    Virtual SAN requires IP multicast to be enabled on the Layer 2 physical network segment utilized for Virtual SAN communication.

  • VMware Virtual SAN Monitoring & Troubleshooting

  • Network Status Reports

    "Misconfiguration detected": verify the physical network:
    Enable multicast
    Disable IGMP snooping, or configure IGMP snooping for selective traffic
    Validate the virtual switch configuration: VLAN, VSAN traffic service enabled, NIC team failover policy

  • Failover Policy

    NIC teaming failover / load balancing: with a Route based on Port ID policy, use Active/Standby with explicit failover.

  • Command Line Tools

    vmkping: validate network reachability between hosts' VSAN VMkernel interfaces, e.g. vmkping 10.4.90.27
    ESXCLI: list the VSAN network configuration with esxcli vsan network list

  • Disk Claiming Operation

    When the automatic disk claiming operation fails to claim disks:
    "Is local: true" disks are automatically claimed
    "Is local: false" disks are seen as shared and thus not automatically claimed, but they can be manually marked as local

  • Ruby vSphere Console

    The RVC command vsan.disks_info shows size, disk type, manufacturer, model, and local/non-local status.

  • Disk Group Creation Fails

    Disk group creation fails:
    A VSAN license needs to be added to the cluster: Home > Licenses > Clusters tab > select the cluster object > Assign License Key
    vSphere Web Client refresh time-out: log out and back in

    Unable to delete a disk group:
    VSAN disk claiming is set to Automatic; change it to Manual
    vsan.host_wipe_vsan_disks --force wipes disks used by VSAN

  • Observing Performance

    Monitor performance with the Ruby vSphere Console (RVC) & VSAN Observer: in-depth monitoring of VSAN's physical disk layer performance, cache hit rates, latencies, etc.

  • VSAN Observer

    Start the VSAN Observer from RVC (vsan.observer) to collect and display performance statistics.

  • VSAN Observer: Monitoring Flash Devices

    Monitor the read cache hit rate and flash-device evictions to magnetic disks.

  • VSAN Observer

    Monitor disk groups at the aggregate and per-disk layers.

  • Virtual SAN Logs

    Virtual SAN related logs are individually maintained per host.

  • Ruby vSphere Console

    Monitor disk capacity (used and reserved) and VSAN component limits.

  • Ruby vSphere Console

    Virtual SAN what-if failure analysis (vsan.whatif_host_failures): simulate the impact of a host failure on the cluster.

  • Ruby vSphere Console

    VSAN Observer recommendations:
    Deploy a VCVA (vCenter Server Appliance) to use for the Observer
    Run the Observer session on the newly deployed or a remote VCVA appliance
    Increase the data-gathering time beyond the default (2 hours) if necessary

  • THANK YOU!!!

    Jason Nash
    [email protected]
    @TheJasonNash

    Graphics by Duncan Epping & Rawlinson Rivera