IaaS Cloud Conceptual Design 2012.04.01 Terry.Cho

IaaS Cloud Architecture Design

DESCRIPTION

Cloud IaaS architecture design

Table Of Contents
1. Overview
2. Architecture Principles
3. Patterns of Cloud Architecture
4. Infrastructure Domain Model
5. User & Service Domain Model
6. Domain Model Mapping
7. Software Architecture
8. User Interface Reference Scenario
9. Infrastructure Architecture

1. Overview

Overview
[Slide text not recoverable. Surviving term: IaaS.]

Design Strategy
[Slide text mostly not recoverable. Surviving terms: Amazon EC2 (multicast, VLAN, VM access), public view, global rollout.]

Use Case
[Diagram. Surviving labels: IaaS use case; ACCOUNT ADMIN, CLOUD ADMIN, INFRA ADMIN, USER; IaaS + Fundamental Service.]

2. Architecture Principles

Architecture Principles
Fundamental principles that drive the cloud architecture design:
1. Infinite capacity
2. Continuous availability
3. Predictability
4. Take a service provider's approach to delivering infrastructure
5. Resiliency over redundancy mindset
6. Minimize human involvement
7. Optimize resource usage
8. Incentivize desired resource consumption behavior

1. Infinite Capacity
From the customer's perspective, the cloud service appears to have infinite capacity.
Objectives
- Drive a change of thinking in the EA, Service Delivery and Operations teams
- Place high emphasis on capacity planning

2. Continuous Availability
From the customer's perspective, the cloud service should never exhibit any interruption to service, even if failures occur within the cloud environment.
Objectives
- Drive a change of thinking in the EA, Service Delivery and Operations teams
- New approach to resiliency/redundancy
- Place high emphasis on availability planning

3. Predictability
Remove as much variation from the environment as possible to increase predictability.
Objectives
- Increased predictability translates to lower costs and higher quality
- Reduce variations across infrastructure, system management and operations

4. Take a service provider's approach to delivering infrastructure
SDS should adopt a service provider model, where the provider delivers infrastructure on demand.
Objectives
- Drive a change of thinking in the EA, Service Delivery and Operations teams
- Consciously think about reusable, on-demand services as opposed to project-oriented services
- Provider and consumer have different perspectives and needs
- New approach to budgeting: blended central and project-based budgets

5. Resiliency over redundancy mindset
From the provider's perspective, focus on maintaining service availability through resiliency rather than redundancy.
Objectives
- Reduce redundancy at the infrastructure level, which is highly costly
- Eliminate duplicated redundancy, which is typical across several layers of the stack
- Move toward a resilience model, which is more cost effective
- Push resilience up the stack; designing applications for resilience is less costly than redundant infrastructure

5. Resiliency over redundancy mindset (comparison)
Availability through redundancy
- Aim: avoid hardware component failure
- Redundancy at hardware layers
- Fewer failures, but greater impact
- Measured by Mean-Time-Between-Failures
Availability through resiliency
- Aim: avoid service failure
- Automated detection-and-response
- Resiliency through software failover
- More failures, but less impact
- Measured by Mean-Time-to-Restore-Service

6. Minimize human involvement
A highly automated environment is required to achieve resiliency.
Objectives
- Well-defined, mature procedures can be automated
- Move further up the automation continuum
- A high-fidelity end-to-end health model is also required for automation
- Automation is necessary to achieve resiliency

7. Optimize resource usage
From the provider's perspective, resources should be optimized to maximize utilization and minimize waste.
Objectives
- Provide the highest ROI by maximizing resource utilization
- Drive efficiency and reduce cost

8. Incentivize desired resource consumption behavior
Leverage cost, quality and agility to influence consumer behavior in ways that facilitate the cloud architecture principles.
Objectives
- Avoid unlimited consumption
- Get consumers to release resources when no longer needed
- Exposing the cost of the resources allocated to a consumer allows consumers to act responsibly

3. Patterns of Cloud Architecture

Patterns of Cloud Architecture
Concepts which support the principles and enable IaaS:
1. Homogenization of physical infrastructure
2. Provisioning on demand
3. Cloud Management
4. Consumption based pricing
5. Virtualized Infrastructure
6. Server classification
7. Holistic approach to availability
8. Compute resource decay
9. Elastic Infrastructure
10. Partitioning of shared resource

1. Homogenization of physical infrastructure
- Eliminate hardware variation
- Reduce complexity of the environment
- VMs get a consistent experience from all hosts
- Simplifies automation
- Cost savings may be achieved through bulk purchase
Blockers
- Might not be realistic for many specific services. Some services require high performance or a unique hardware configuration such as GPUs
- Hardware vendors replace product models frequently

2. Provisioning on Demand
- Provides agility to the service consumer
- Infrastructure is provided to the consumer on demand
- The provisioning process is automated and completes within a few minutes
- Basic software installation (DBMS, middleware) is also supported

3. Cloud Management
Managing workloads on a pool of compute, network and storage resources.
- Virtualized infrastructure alone is insufficient: it has no concept of fault domain and depends upon redundancy for availability
- Need to manage a group of VMs which collectively provide a service (e.g. database servers, email servers, J2EE WAS servers)
- Ensure deployment across fault domains (racks, groups of racks)

4. Consumption based pricing
- Charge the service fee based on actual resource usage

5. Virtualized Infrastructure
Abstraction at multiple levels enables step changes in service design and delivery.
- Virtualize network, storage and compute independently
- Decouple the consumer from provider resources
- New approach to resilience, less dependent on redundancy
- Any virtualized service can run and function identically on any physical server
- Necessary for achieving resiliency
- Virtualization alone is not enough for cloud

6. Server classification
The definition of a service and its non-functional characteristics.
- Limit the number of service classifications in the same way the number of server specifications should be limited
- Analyze existing workloads to determine the classifications
- Split stateful and stateless workloads into separate classifications (stateless costs a lot less)
- Expose the actual cost of each classification to incentivize consumer behavior

7. Holistic approach to availability
Availability is achieved through resiliency, redundancy and application design.
- Traditional availability is delivered via redundancy, which requires expensive hardware
- Availability should be considered across the whole stack: infrastructure, cloud platform, application and data
- Designing an application to expect and handle failure reduces infrastructure costs
- Resiliency (Mean-Time-to-Restore-Service) minimizes the need for redundant hardware
- Even if the number of outages increases, the duration of each outage is very low, maintaining a high-availability experience for the user

8. Compute Resource Decay
Compute failures decrease capacity rather than cause an incident.
- Virtualization allows workloads to be moved around, so VM outages are short-lived or, if workloads can be moved proactively, non-existent
- The concept depends upon the homogenization and pool-of-resources concepts

9. Elastic Infrastructure
Ability to expand and contract capacity on demand.
- Deep understanding of the business is required to ensure maximum capacity requirements are maintained efficiently
- Requires triggers for when to scale out and when to scale back
- Triggers may be automated or request-driven
- Scale down is important to avoid waste
- Auto scale-out needs deep technical understanding of the specific application or middleware type

4. Infrastructure Domain Model

Data Center
[Slide text mostly not recoverable. Surviving terms: GC (Global Center), RC (Regional Center), LC (Local Center), MC, OSS, BSS, OSS+BSS, Self Service Portal, Zone, regulation, CDN edge node.]

Data Center (Global Center and Regional Center)
[Diagram and text mostly not recoverable. Surviving labels: API flow and data flow between centers; Self Service Portal; Cloud Admin Portal; API UI; Portal Interface; API Routing (Proxy or ESB); Request Center; Cloud Controller; Cloud Infrastructure; Operation; Billing; Operational Data (monitoring history etc.); Business Data (billing etc.); Data Bus (ETL, CDC etc.); network via dedicated line or VPN tunneling.]

Zone
[Diagram and text mostly not recoverable. Surviving labels: Server Rack; Load Balancer; Virtual Machine; Physical Server; NW Switch; SAN Switch; Storage Rack; Network Rack; Backbone Network Switch; Data Center Router; Zone Router; SAN Controller; Disk Array; SAN (10G TCP) backbone; Management Network; Cloud Infra Management; dedicated zone. Surviving statement: multicast is allowed within a zone (VLAN).]

Zone (types)
[Diagram and text mostly not recoverable. Surviving labels: Public Zone (Zone 1, Zone 2, public services); Private Zone; Dedicated Zone (Zone 3, Zone 4, Zone 5; Customer 1, Customer 1 (DR), Customer 2); Firewall; on-premise connection over dedicated network or VPN; inter-zone connection; Fault Zone; Stand-by Zone; Rack Type; "Costly"; alternative: fault rack in zone; Customer 1 (On Premise) data center.]

Rack
[Diagram and text mostly not recoverable. Surviving terms: RACK (holds the physical servers that host VMs); RACK TYPE (defined by the physical server spec of the rack); Scale Out Unit (a group of racks); bulk-buy discount; Scale Out Unit size. Diagram labels: NW Switch, SAN Switch, Physical Server, RACK, SCALE OUT UNIT.]

Rack Type
[Surrounding text mostly not recoverable. Surviving attributes that define a rack type: NIC, SAN, CPU clock, RAID and disk.]
Rack Type A (High Performance & High Availability)
- 2.6 GHz CPU, 1:4 density
- 10G NIC * 2 (Teaming)
- 10G NIC iSCSI * 2 (MPIO)
- Storage: RAID 1+0, RAID 5 option; dynamic, static and pass-through disk
Rack Type B (Mid Performance & Low Availability)
- 2.6 GHz CPU, 1:8 density
- 10G NIC
- 10G NIC iSCSI
- Storage: RAID 5, dynamic disk only

Resource Pool
[Diagram and text mostly not recoverable. Surviving terms: Physical Server; VM live migration; fail over; hypervisor clustering; hypervisor cluster node mapping; Rack #1, Rack #2; Resource Pool 1, Resource Pool 2; VMs of a service role are deployed to a resource pool.]
Infrastructure
Data center
- A data center is a physical building which hosts physical infrastructure
- Data centers are located in multiple regions
Zone
- It is a logical unit
- Set of physical server racks
- Multicast is allowed within the same zone
- One zone has one SAN and one network rack (SAN, NW switch)
- Load balancing between VMs can be done within a zone
Rack
- It is a physical server rack
- It contains physical servers for compute
- A rack has a type for performance and redundancy, for example:
  TYPE 1 (high performance & redundancy): 10G NIC * 2, 10G iSCSI * 2
  TYPE 2 (low IO & no redundancy): 10G NIC * 1, 1G iSCSI * 1
Resource Pool
- Logical unit of a physical server pool
- One resource pool consists of one or more physical servers which reside in different racks
- VMs in the same service role are deployed in the same resource pool
- Load balancing and failover occur within the resource pool boundary
[Diagram labels: Data Center #1; Zone #1, Zone #2; Router, NAT, VPN, Firewall, IPS; network rack for the data center; Rack #1 to Rack #3 with NW Switch, SAN Switch and physical servers; Resource Pool 1, Resource Pool 2; VLAN for Service A in Infra A (maximum is a single zone); service boundary; service role boundary; network rack and storage rack for Zone #1 with Load Balancer, NW Switch, SAN Switch, SAN Controller, Disk Array; management network.]

Infrastructure Architecture
Reference Architecture

Infrastructure Architecture: Infrastructure Hierarchy
- Cloud Service Management: Central Cloud Management
- Cloud Data Center: Cloud Management, Fundamental Service
- Zone
  - Network Rack: Load Balancer, NW Switch, SAN Switch
  - Compute Rack: SAN Switch, NW Switch, Rack Type, Resource Pool, Physical Servers
  - Storage Rack: SAN Controller, Disk Array (RAID)
- Network Gateway: Router, NAT, VPN, Firewall, IPS
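The zone, rack and resource pool rules above can be written down as a small data model. The following Python sketch is illustrative only: the class names, field names and the `validate` helper are assumptions made for this example and are not part of the deck. It checks two rules stated on the Infrastructure slide: a resource pool draws its servers from racks of its zone, and a pool with more than one server spans different racks.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PhysicalServer:
    name: str
    rack: str  # name of the compute rack that holds this server


@dataclass
class Rack:
    name: str
    rack_type: str  # e.g. "TYPE 1" (high performance & redundancy)


@dataclass
class ResourcePool:
    name: str
    servers: List[PhysicalServer] = field(default_factory=list)

    def racks(self) -> Set[str]:
        """Names of the racks this pool's servers reside in."""
        return {server.rack for server in self.servers}


@dataclass
class Zone:
    name: str
    racks: List[Rack] = field(default_factory=list)
    pools: List[ResourcePool] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of rule violations (empty when the zone is consistent)."""
        problems = []
        known_racks = {rack.name for rack in self.racks}
        for pool in self.pools:
            unknown = pool.racks() - known_racks
            if unknown:
                problems.append(f"{pool.name}: servers in racks outside the zone: {sorted(unknown)}")
            # Hypothetical reading of "servers which reside in different racks":
            # a pool with several servers should not sit in a single rack.
            if len(pool.servers) > 1 and len(pool.racks()) < 2:
                problems.append(f"{pool.name}: all servers are in one rack")
        return problems
```

A zone whose pool spans `rack-1` and `rack-2` validates cleanly, while a pool with two servers in the same rack is reported, since a single rack failure would then take out the whole failover boundary.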
Infrastructure Architecture: Component Description

Level: Cloud Service Management
- Central Cloud Management: The cloud management system across data centers in multiple regions. Only the central cloud management system has an end-user interface (portal). Notes: includes BSS.

Level: Data Center
- Cloud Management: The individual management system for each data center. It communicates with Central Cloud Management through a remote API. Notes: includes DHCP, DNS, Cloud OS, OSS etc.
- Fundamental Service: Additional functional services such as RDBMS service, Blob Storage, Map & Reduce, Notification etc.

Level: Zone
- Storage Rack: Storage for EBS, which is attached to a VM as its main repository. One zone has only one logical storage rack (it can be multiple racks physically). Notes: SAN is preferred.

Level: Network Rack
- Load Balancer: Balances incoming load across VMs. Notes: software L4 is preferred.
- NW Switch: Simple network switch which aggregates network traffic from the individual compute racks. Notes: VLAN support required; L2 is preferred.
- SAN Switch: Aggregates SAN traffic from the individual compute racks. Notes: storage virtualization should be considered; an intelligent switch can be used.

Level: Compute Rack
- SAN Switch: Connects physical servers to the SAN as the SAN backbone in the rack.
- NW Switch: Connects the physical servers' network as the network backbone in the rack.
- Resource Pool: Logical unit of a set of physical servers. Within the resource pool boundary, a VM can be moved for failover. Notes: depends on the hypervisor solution.
- Physical Server: Physical server which hosts VMs.

Level: Storage Rack
- SAN Controller: SAN controller. Notes: storage virtualization should be considered.
- Disk Array: RAID-based disk array. Notes: an IO segregation architecture should be considered.
Infrastructure Architecture: Component Description (continued)

Level: Network Gateway
- Router: Routes network traffic between the internet and the cloud-internal network.
- NAT: Network address translator.
- VPN: Provides secure access to internal VM instances.
- Firewall: Permits or denies network transmissions based upon a set of rules; frequently used to protect networks from unauthorized access while permitting legitimate communications to pass.
- IPS (Intrusion Prevention System): Monitors network and/or system activities for malicious activity. The main functions of intrusion prevention systems are to identify malicious activity, log information about it, attempt to block or stop it, and report it.
- NW Switch: Network backbone; aggregates traffic from the zones. Notes: L3 is preferred.

5. User & Service Domain Model

Use Case: ACCOUNT & INFRA
[Diagram and text mostly not recoverable. Surviving labels: Account; ACCOUNT ADMIN; INFRA; CLOUD ADMIN; INFRA ADMIN; INFRA (resource); USER.]

SERVICE & SERVICE ROLE
[Diagram and text partly recoverable.]
- ACCOUNT 1..N INFRA (an INFRA is created for a business unit; ACCOUNT ADMIN: create INFRA)
- INFRA 1..N SERVICE (e.g. ERP, CRP; INFRA ADMIN: create SERVICE)
- SERVICE 1..N SERVICE ROLE (e.g. Web Front End, DBMS, CMS, IDM); a service has a VLAN ID
- SERVICE ROLE 1..N VM and 0..1 LOAD BALANCER (USER: create, configure, start/stop VM)
- CF. Shared_IP_Group in OpenStack
[Diagram labels: services made of service roles such as Web Front End and DBMS, each with a load balancer in front of VM1, VM2, VM3.]

DOMAIN
[Diagram and text partly recoverable.]
- Account; Infrastructure (e.g. Production, Dev/Test) with a firewall policy
- Service with a VLAN ID
- Service Role (e.g. Web Front End, Reporting Service) with an LB policy, a rack type, load balancing and virtual machines
- Relationships: ACCOUNT 1..N INFRA; INFRA 1..N SERVICE; SERVICE 1..N SERVICE ROLE; SERVICE ROLE 1..N VM and 0..1 LOAD BALANCER
- The service role determines the physical server and networking through its rack type; VMs are scaled out within a service role
- A VLAN can be shared between services in the same infrastructure

Use Case: Domain Model Sample
[Diagram. Surviving labels: Account + IaaS; ACCOUNT ADMIN; Legacy Account; Account Domain; Cloud Domain; CLOUD ADMIN; DOMAIN ADMIN; DOMAIN USER; configure; manage.]
- Service: CMS Service, with Service Role: Web Front End (load balancer, VM1 to VM3) and Service Role: DBMS (load balancer, VM1 to VM3)
- Service: Push Service, with Service Role: Push Server (load balancer, VM1 to VM3) and Service Role: MySQL Cluster (load balancer, VM1 to VM3)

6. Domain Model Mapping

Provisioning Scenario
VM provisioning end-to-end workflow:
1. Create an Infra and assign it to a business unit
2. Create a Service and its associated Service Roles
3. Assign the Service Roles to a User
4. Create VMs in the Service Role
5. Configure the load balancer in the Service Role
6. Metering & charging
[Diagram labels: ACCOUNT ADMIN creates Infra; INFRA ADMIN creates Service and VLAN; USER creates VMs; Firewall; Load Balancer; Service Role (Front End) and Service Role (DB Server), each mapped to a Rack Type.]
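The cardinalities of the domain model (ACCOUNT 1..N INFRA, INFRA 1..N SERVICE, SERVICE 1..N SERVICE ROLE, SERVICE ROLE 1..N VM and 0..1 LOAD BALANCER) and the provisioning steps map naturally onto nested records. The Python sketch below is an illustration under assumptions: every class, field and value name is hypothetical, the actor attached to each step is inferred from the diagram labels, and steps 3 (assigning roles to a user) and 6 (metering and charging) are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ServiceRole:
    name: str                                        # e.g. "Web Front End", "DBMS"
    rack_type: str                                   # rack type the role's VMs run on
    vms: List[str] = field(default_factory=list)     # SERVICE ROLE 1..N VM
    load_balancer: Optional[str] = None              # SERVICE ROLE 0..1 LOAD BALANCER


@dataclass
class Service:
    name: str
    vlan_id: int                                     # VLAN ID attached to the service
    roles: List[ServiceRole] = field(default_factory=list)


@dataclass
class Infra:
    name: str
    business_unit: str
    services: List[Service] = field(default_factory=list)


@dataclass
class Account:
    name: str
    infras: List[Infra] = field(default_factory=list)


def provision_example() -> Account:
    """Walk through workflow steps 1, 2, 4 and 5 with made-up names."""
    account = Account("example-account")

    # Step 1 (ACCOUNT ADMIN): create an Infra and assign it to a business unit.
    infra = Infra("production", business_unit="example-unit")
    account.infras.append(infra)

    # Step 2 (INFRA ADMIN): create a Service and its Service Roles.
    service = Service("CMS Service", vlan_id=100)
    front_end = ServiceRole("Web Front End", rack_type="B")
    dbms = ServiceRole("DBMS", rack_type="A")
    service.roles.extend([front_end, dbms])
    infra.services.append(service)

    # Step 4 (USER): create VMs inside each Service Role.
    front_end.vms.extend(["VM1", "VM2", "VM3"])
    dbms.vms.extend(["VM1", "VM2"])

    # Step 5: configure the optional load balancer of a Service Role.
    front_end.load_balancer = "front-end-lb"
    return account
```

Keeping the load balancer as an optional field of the service role mirrors the 0..1 relationship: a role such as DBMS can exist without one, while scale-out of the front end only means appending VMs to the role.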
Domain Concept Mapping
Concept boundary mapping
[Diagram mapping the domain concept onto the infrastructure concept. Domain side: ACCOUNT ADMIN (create account, create service), INFRA ADMIN (create and configure), USER (create, configure, manage); Infrastructure with firewall policy; Service with VLAN ID; Service Role with LB policy and rack type; virtual machines. Infrastructure side: Central Cloud Mgmt; Data Center Cloud Management; Fundamental Service; Zone; Network Rack (Load Balancer, NW Switch, SAN Switch); Compute Rack (SAN Switch, NW Switch, Rack Type, Resource Pool, Physical Servers); Storage Rack (SAN Controller, Disk Array, RAID); Network Gateway (Router, NAT, VPN, Firewall, IPS).]

7. Software Architecture

Software Architecture Level 1: Conceptual Architecture
[Diagram. Components: User Portal; Cloud Admin Portal; Platform Service; OSS; IDM; Orchestration; Virtual Machine Manager; Configuration Manager; Monitoring; Backup Manager; Fundamental Services (RDS, Blob Storage, NoSQL etc.); BSS; Infrastructure (low cost, low reliability, no redundancy support); Infrastructure (high cost, high reliability, redundancy support).]

Software Architecture Level 1: Conceptual Architecture and Priority
[Same diagram annotated with implementation priorities (1, 1.5, 2, 3). The mapping of priorities to components is not recoverable from the extracted text.]

Software Architecture Level 2: Conceptual Architecture
[Diagram. Components by group:]
- User Portal: Web based Management, CLI, Reporting
- Cloud Admin Portal: Web based Management, CLI, Reporting
- Orchestration: Service Bus, Work Flow Engine, Adapter
- Virtual Machine Manager: Domain Manager, VM Manager, Network Manager, Storage Manager
- Monitoring: Monitoring Interface, Memory Grid, Event Trigger, Alert & Notification
- Configuration Manager: Server Profile Management, VM Profile Management, Patch Management, Software Asset Management, Software Install Engine, Bare Metal Provisioning
- Back Up Manager: Snapshot Management
- IDM: User Profile Management, Role Management, Authentication & Authorization, Sync (Propagation)
- OSS: NMS, SMS
- BSS: Metering, Charging, Billing, Payment
- Networking: Router, Switch, Load Balancer, Firewall, NAT, VPN
- Storage: SAN Switch, SAN Controller, Disk Array
- Server: Physical Server

Component Description
[Table of Level 1 and Level 2 components. Most description text is not recoverable; the surviving fragments are listed.]
User Portal
- CLI: command line interface for VMs
- Reporting: KPI and OLAP reporting (CF. BI; Microsoft MS-SQL and Excel BI as reference)
- Web Based Management: management of the infra (VM, storage, network etc.)
Cloud Admin Portal (OSS)
- CLI: command line interface (VM and host server)
- Reporting
- Web Based Management
Orchestration: API hub that integrates the infrastructure (network, storage, server) and automates cloud business processes (provisioning, patching) and operations
- Service Bus: API hub; mediation, routing, transformation. CF. SOA Enterprise Service Bus (e.g. Oracle Service Bus)
- Work Flow Engine: workflow for integrating infrastructure (network, storage, server), BSS and OSS into business processes. CF. SOA BPM, cloud provisioning engines (e.g. Microsoft Opalis, HP Matrix)
- Adapter: connects infrastructure (network, storage, server), BSS and OSS to the cloud (REST or SOAP/HTTP). Optional layer: with a Virtual Machine Manager such as CloudStack, the storage and network management APIs are already provided
Virtual Machine Manager: exposes the infrastructure API for VMs
- Domain Manager: maps the infrastructure concept and the domain concept to the hardware infrastructure
- VM Manager: manages VMs on physical servers and exposes an open API. Sub-functions: Template Manager (VM templates), Live Motion (moving VMs between physical hosts), Life Cycle Management (VM life cycle), VM Mgmt (VM control: start, stop etc.), VM Locator (resolves the hosting server for a VM from the domain model and infra)
- Network Manager: abstracts networking and exposes an open API. Sub-functions: VPN Management, IP Pool, NAT Management, Router Management, Firewall Management, Load Balancer Management, VLAN Management. External device integration is required
- Storage Manager: abstracts SAN storage and exposes an open API. Sub-function: Volume Management
Monitoring
- Monitoring Interface: interface to infrastructure, BSS, OSS and cloud
- Memory Grid: clustered memory. CF. Oracle Coherence, open-source memcache, Microsoft Windows Server AppFabric Cache
- Event Trigger: evaluates predefined rules against monitoring data and triggers actions such as failover and scale-out
- Alert & Notification: sends alert and notification messages for triggered events
Back Up: VM snapshot and VM backup
- Snapshot Management: VM snapshot and restore
Configuration Management: software and patch installation on physical servers and VMs
- Server Profile Management: physical server profile (server, software list, patch list)
- VM Profile Management: VM profile (VM, software list, patch list)
- Patch Management: OS and software patch lists; patching of VMs and physical servers
- Software Asset Management
- Software Install Engine: installs software and patches on physical servers and VMs
- Bare Metal Provisioning: OS installation on physical servers
IDM (Identity Management): identity management for global deployment
- User Profile Management
- Role Management
- Authentication & Authorization
- Sync (Propagation): synchronization between centers
OSS (Operation Support System)
- Monitoring Interface: interface between OSS and servers, storage devices and network devices
- NMS: Network Management System
- SMS: System Management System. CF. HP OpenView, CA Unicenter
Networking
- Router (backbone)
- Switch (VLAN)
- Load Balancer: CF. L4 or L7
- Firewall
- NAT: network address translator
- VPN
Storage: disk for VM hosts, comparable to Amazon EC2 EBS
- SAN Switch: connects physical servers to SAN storage
- SAN Controller: controller for the disk array (iSCSI controller)
- Disk Array
Server
- Physical Server: VM host

Architecture concept: VM Manager / VM Locator
[Diagram and text partly recoverable.]
- A resource pool corresponds to a hypervisor cluster; hypervisor clustering limits the number of managed nodes (16 to 64), so the cloud groups clusters into resource pools
- Diagram example: RACK #1 to RACK #3; cluster hosting the ERP Web Front End service role with 70 of 90 VMs hosted; a second cluster with 80 of 90 VMs hosted
- VM Placer, on a VM provisioning request: find another rack in the resource pool, then find the most idle server in that rack
- VM placement policy: VMs of the same service role are placed on servers in different racks so that a server or rack failure can be handled by failover; the policy is set by the Infra Admin; if no rack or physical server is available, an error is returned

Architecture concept: Orchestration / Service Bus
[Diagram and text partly recoverable.] The service bus follows the SOA Enterprise Service Bus generic proxy pattern and acts as the API hub between service consumers and business services.
- Edge Proxy: API entry point (REST-XML, REST-JSON, SOAP/HTTP etc.)
- Common Proxy: common handling for all APIs, such as logging, tracing, auditing, SLA, throttling and alerts
- Local Proxy: mediation and message transformation
- Business Service: delegator to the actual service, with auto retry and error handling
- Components shown: IDM, logging, tracing, auditing, orchestration logic, transformation, compensation, SLA, alert, throttling, exception handling, ignore, monitoring, reporting, auto retry, human error handling

Architecture concept: Orchestration / Work Flow Engine
[Text partly recoverable.] The work flow engine integrates the cloud's physical infrastructure and automates cloud processes (provisioning, patching, resiliency).
- UI-based work flow: GUI-based work flow design
- Work flow runtime: runs the designed work flows and provides process tracking
- Adapter: device interface that abstracts devices behind an open API (CF. EAI legacy adapter)
- Agility: cloud processes can be changed through the GUI
- Flexibility: processes can be changed without coding; the hardware infrastructure is abstracted
- Sample: Microsoft Opalis

Solution Candidate: Virtual Machine Manager
CloudStack
- Open source cloud OS
- Supported by Cloud.com (subscription model: software subscription, patches and bug fixes, 24x7 tech support, professional services)
- Characteristics: well-defined domain model; well-defined open API set; software-based networking devices (firewall, NAT, router, LB etc.); storage tiering; depends on the hypervisor cluster feature (resource pool size restriction)
OpenStack / Nova
- Open source cloud OS
- Sponsored by RackSpace.com
- Characteristics: very simple and not yet mature; focused on the VM provisioning scenario only; external networking device integration is required (see Zeus); the storage architecture needs research (no IO segregation or storage tiering feature); not enough features to realize this architecture (much customization and enhancement is required)
- Recommendation: RackSpace.com professional services are mandatory for delivery

Solution Candidate: CloudStack vs OpenStack Nova (Cactus)
[Comparison table, only partly recoverable from the extracted text.]
- Domain Manager: CloudStack has Domain, Account, Security Group, Zone, Pod (domain model customization needed); OpenStack Nova: X (mapping research needed)
- VM Manager: CloudStack has Template, ISO Manager and live migration of VMs within a cluster; OpenStack Nova has Flavor (memory, disk size, CPU priority time, virtual cores) and Image (Glance)
- Network Manager: CloudStack has Router, VPN, Load Balancer, NAT, Firewall, VLAN through an open API; the OpenStack Nova cell mentions IP, Load Balancer and Firewall
- Storage Manager: CloudStack has Storage Pool and Volume; the cells also mention IO segregation, storage tiering, a storage API and CLI
- Orchestration (Service Bus, Work Flow Engine, Adapter): N/A (X) for both
- Back up / Snapshot Management: CloudStack has scheduled backup and VM snapshot management; OpenStack Nova: X
- BSS / Metering: CloudStack has a usage metering API; OpenStack Nova: X
- IDM: SSO integration
- Software networking device: CloudStack has software Router, VPN, Load Balancer, NAT, Firewall, VLAN; OpenStack Nova: X
- Comment: CloudStack live migration depends on the cluster (resource pool; VM movement within the resource pool); OpenStack Nova focuses on VM provisioning, with the Rackspace production deployment as reference implementation; very high risk
- Conclusion [mostly not recoverable]: mentions CloudStack; OpenStack Nova production deployment with Rackspace; monitoring and configuration management as additional-feature products; CloudStack software networking; an alternative solution; CloudStack hypervisor clustering

Solution Candidate: Networking Infrastructure
[Text partly recoverable.] Physical network devices affect delivery time, so a software networking infrastructure is considered. CloudStack provides a software networking infrastructure; an alternative solution is listed for the load balancer.
Zeus (http://www.zeus.com)
- Software load balancer, traffic manager, firewall
- Pros: Rackspace delivery partner; global-scale global load balancer
- Cons: additional cost (compared with CloudStack)

Solution Candidate Level 2: Conceptual Architecture
[The Level 2 conceptual architecture diagram annotated with candidate products. Recoverable assignments: User Portal and Cloud Admin Portal: CloudStack customization; Orchestration area: open source CloudStack; Virtual Machine Manager: CloudStack; other labels in the manager row: MS System Center, Chef, CloudStack; Networking: Zeus; Storage: NetApp or LeftHand.]

8. User Interface Reference Scenario

Cloud Admin
Screen flow:
- CL-ADM-INFRA-DCLIST (Data Center List): select a data center
- CL-ADM-INFRA-ZONELIST (Zone List): lists zones with physical hardware status and the network and storage status in the zone; select a zone
- CL-ADM-INFRA-RACKLIST (Rack List): lists rack status; select a rack
- CL-ADM-INFRA-SERVERLIST (Server List): lists servers in the rack; select a server
- CL-ADM-INFRA-SERVICELIST (Service List): lists the services deployed to the zone (data center level)
- CL-ADM-INFRA-VMLIST (VM List): lists VMs in the rack; select a VM
- CL-ADM-INFRA-VMDETAIL (VM Detail)
[Events are shown on the list screens.]

Cloud Admin: Data Center List View (CL-ADM-INFRA-DCLIST)
[Screen mock-up.]

Cloud Admin: Zone List View (CL-ADM-INFRA-ZONELIST)
[Screen mock-up. Shows the zones and, per zone, the compute racks, storage devices and network devices.]

Cloud Admin: Rack List View (CL-ADM-INFRA-RACKLIST)
[Screen mock-up. Shows the server racks in a zone and, per rack, the physical servers, network switch and SAN switch.]

Cloud Admin: Server List View (CL-ADM-INFRA-SERVERLIST)
[Screen mock-up. Shows the physical servers in a server rack; for maintenance, a running server can be emptied by live migration within the zone.]
Cloud Admin: VM List View (CL-ADM-INFRA-VMLIST)
[Screen mock-up. Lists VMs.]

Cloud Admin: VM Detail View (CL-ADM-INFRA-VMDETAIL; the slide title reads "Server List View")
[Screen mock-up. Surviving labels: VM VDI, CLI, VM resource, software install & update, back up & restore.]

User Portal (INFRA ADMIN, USER)
[Text partly recoverable: the UI for VM management by the account's users.]
Screen flow:
- CL-USR-INFRA-SVCLIST: select a service
- CL-USR-INFRA-SVCDETAIL: select a VM
- CL-ADM-INFRA-VMDETAIL (VM Detail)

User Portal
[Screen mock-up.]

User Portal (CL-ADM-INFRA-VMDETAIL)
[Screen mock-up.]

9. Infrastructure Architecture

Physical Server Architecture
[Diagram labels: 10G NIC #1 and 10G NIC #2 (redundancy) with NIC teaming to the NW switches, carrying the service network, management and Live Motion; SAN IF #1 and #2 with MPIO to the SAN switches; CPU; memory; local disk; physical server; rack.]
Physical server components
CPU
- An architecture with SLAT (Second Level Address Translation) is recommended for virtualization environments (Intel EPT, AMD NPT)
- The current hypervisor limit for physical core : virtual core is 1:8
- The recommended ratio is 1:4
Disk
- The local disk is used for booting the hypervisor
- Disk mirroring is recommended to protect against disk failure
- Only a small disk is required (a SATA disk is enough)
Memory
- Commonly, 2 GB of memory per VM is recommended
- 2 GB of base memory is required for the hypervisor (this differs depending on the hypervisor)
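The CPU and memory guidance above (a 1:4 physical-to-virtual core ratio, 2 GB per VM, 2 GB reserved for the hypervisor) gives a quick upper bound on VMs per host. The Python sketch below is a back-of-the-envelope illustration; the function name, its parameters and the example server sizes are assumptions, and the one-vCPU-per-VM default is not stated in the deck.

```python
def vm_capacity(
    physical_cores: int,
    memory_gb: int,
    vcpus_per_vm: int = 1,          # assumption: one virtual core per VM
    vcpu_ratio: int = 4,            # recommended physical:virtual core ratio of 1:4
    memory_per_vm_gb: int = 2,      # commonly recommended memory per VM
    hypervisor_memory_gb: int = 2,  # base memory reserved for the hypervisor
) -> int:
    """Estimate how many VMs one physical server can host."""
    # CPU bound: virtual cores available at the chosen ratio, shared out per VM.
    by_cpu = (physical_cores * vcpu_ratio) // vcpus_per_vm
    # Memory bound: what is left after the hypervisor's base memory.
    by_memory = max(memory_gb - hypervisor_memory_gb, 0) // memory_per_vm_gb
    # The tighter of the two bounds decides.
    return min(by_cpu, by_memory)


# A hypothetical 16-core host with 96 GB RAM is memory-bound:
# CPU allows 64 VMs, memory allows (96 - 2) // 2 = 47.
print(vm_capacity(physical_cores=16, memory_gb=96))   # 47

# With 256 GB RAM the same host becomes CPU-bound at 64 VMs.
print(vm_capacity(physical_cores=16, memory_gb=256))  # 64
```

Passing `vcpu_ratio=8` models the 1:8 hypervisor limit instead of the recommended 1:4, which only helps when the host is CPU-bound.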
Physical Server Architecture
Physical server components: network interface
Ethernet interface: It serves TCP/IP-based connections, including the service network for VMs and the management and Live Motion networks for the hypervisor. NIC teaming is recommended for failover and to maximize throughput.
SAN interface: It is connected to SAN storage. TCP/IP-based iSCSI is recommended (FC HBAs are expensive). MPIO is recommended for failover and to maximize throughput.
Network types (Type / Description / Required features)
Management Network
  Description: A dedicated network for managing hypervisors, SAN devices, and network devices. VMs cannot access this network.
  Required features: NIC teaming required; TCP/IP offload (optional).
Live Motion Network
  Description: A dedicated network for VM movement, which moves a VM to another physical machine. It is used to optimize hardware utilization and for maintenance (planned hardware shutdown, patching, etc.). VMs cannot access this network.
  Required features: NIC teaming required; TCP/IP offload (optional).
Service Network
  Description: It is used by VMs. It is connected to the public internet.
  Required features: VLAN support required; NIC teaming required.
SAN
  Required features: Jumbo frame support required; MPIO support required; VLAN support required (iSCSI case); TCP/IP offload (optional, iSCSI case).
80. Physical Server Architecture
Physical server types
Rackmount server: The most common server type. Easy to configure and deploy. Easy to extend, for example with PCI expansion cards. Low density and high power consumption.
Blade server: Provides a high-density, low-power compute node configuration. Low cost when deploying a very large number of compute nodes. Hard to configure and manage; a mature operations team is required. The enclosure is the container that holds the computing units, so an enclosure failure brings down every node in the enclosure. It can be integrated with a virtual switch that is integrated with the hypervisor environment.
81.
Physical Server Architecture
Physical Server NIC Teaming, Option 1: Bonding multiple 1 Gbps NICs
[Diagram: physical server hosting VMs on a virtual network switch, with NICs #1 to #10 and NIC Team 1 serving the Service Network, Management Network, Live Motion Network, and SAN. Legend: 1G NIC, 10G NIC (iSCSI).]
This option uses the teaming feature of the NIC. It provides transmit and receive traffic load balancing across the virtual machines bound to the team interface, as well as fault tolerance in the event of switch port, cable, or adapter failure. This teaming type works with any switch.
If cost is a problem, the redundant NICs for the Management, Live Motion, and SAN networks can be removed. Doing so can reduce performance and removes hardware-based failover, so failover has to be provided by a software cluster or a resiliency architecture.
Pros: A NIC fault is confined to the failed NIC; the fault is not propagated. 1 Gbps NICs are cheap.
Cons: One physical server requires eight 1G ports, so this option needs many 1G switch ports and complex cabling.
82. Physical Server Architecture
Physical Server NIC Teaming, Option 2: Bonding 2 x 10 Gbps NICs
[Diagram: physical server hosting VMs, with VLAN #1, VLAN #2, and VLAN #3 (not visible to the VMs) carrying the Service Network, Management Network, and Live Motion Network over NIC Team 1 (NIC #1, NIC #2); NICs #9 and #10 are 10G NICs (iSCSI) carrying SAN traffic.]
Each VLAN creates a new virtual network adapter on the parent partition. Bond the V-Switches to the appropriate VLAN-tagged adapter. There is no need to VLAN-tag individual VMs.
Pros: It reduces the total number of switch ports (simple cabling).
Cons: A 10 Gbps NIC is much more expensive than a 1 Gbps NIC (cost issue). Traffic on one network can affect the bandwidth of the other networks. Trouble in a specific NIC propagates to the other interfaces.
Industry trend: virtual switch (see appendix). Cisco Nexus 1000v for VMware; HP Flex 10 for the Windows hypervisor. A virtual switch separates a shared 10G NIC into logically dedicated NICs and provides throttling for each virtual NIC.
83.
Networking Architecture
Networking components
[Diagram: Internet or WAN connects through the router to a network rack (NAT, VPN, firewall, IPS, load balancer, network switch, SAN switch), which connects over 10G to the physical servers in the compute rack and over 10G iSCSI to the storage rack (SAN controller, SAN switch, disk array).]
Router: It connects the internal cloud computing nodes to the public internet. It also connects the internal networks between data centers.
NAT: Translates public IP addresses to internal IP addresses.
VPN: Provides secure network access to compute nodes for a specific Infra.
Firewall
IPS
Load Balancer: Balances network traffic between compute nodes.
Ethernet Switch: It is the TCP/IP backbone. It aggregates TCP/IP connections from the compute racks and covers inter-rack communication. It can be separated into Service Network, Management Network, and Live Motion. The Management Network is in the same subnet within a data center. Live Motion runs on a dedicated network. The Service Network is dedicated by VLAN for each Infra.
SAN Switch: TCP/IP-based iSCSI SAN switch.
84. Networking Architecture
[Diagram: network topology at the rack, zone, and data center levels.]
85. Networking Architecture
VLAN
The network switch, hypervisor, VM, and NICs in the physical server must support VLAN.
Each Infra can have 1..* VLANs.
One service can have one VLAN.
Multicast is supported within a VLAN (within the service boundary).
86. Networking Architecture
10G-based architecture
[Diagram: two routers connect to NW Switch #1 and NW Switch #2 (shared 10G network). Each physical server connects to both network switches and to both SAN switches, with traffic separated by VLAN. SAN Switch #1 and SAN Switch #2 each connect to a SAN.]
87.
Networking Architecture
Networking features
Support for TCP/IP offload: The physical network adapter handles TCP/IP packet processing, which reduces processor cycles under heavy network traffic.
Support for jumbo frames: Allows Ethernet frames with a payload (MTU) larger than 1500 bytes. This increases the packet payload size and reduces the number of packets that the stack must process. Jumbo frames are highly recommended for an iSCSI-based SAN network. The benefit is greatest for applications that require high IOPS, such as databases and mailboxes.
VLAN: The network switch, hypervisor, VM, and NICs in the physical server must support VLAN.
Teaming: NICs in the physical server should support teaming.
88. Storage Architecture
[Diagram: VMs on a physical server use dynamic disks, static disks, or pass-through disks. The physical server connects through two SAN switches to two SAN controllers, which are managed by cloud management. The disk array behind the controllers is organized into disk groups, each configured as RAID N.]
89. Storage Architecture
I/O Segregation
90. Storage Architecture
I/O Segregation (private cloud case)
One potential configuration uses an additional SAN, segregating Prod from Dev and OS from Data, and providing dedicated I/O for one application.
91. Storage Architecture
Capacity and performance planning
Storage performance (IOPS and throughput) is the most important factor in storage design. It is hard to estimate and measure.
Required IOPS and throughput differ by the type of application that uses the storage (for example, DBMS or TP monitor), so it is recommended to test and measure IOPS and throughput for each application type.
Example: quick and dirty SAN benchmark using 1 host and Iometer. Total test setup time: 2 hrs. Iometer produces lots of other useful data.
Target Name   Access Specification Name   IOps      MBps
VM1           Default                       20.97     0.05
VM2           Default                       43.95     0.09
VM3           Default                       43.96     0.09
VM4           MAXTP                        874.24   114.84
VM5           MAXTP                        519.11    72.45
VM6           MAXTP                        933.28   121.65
VM7           MAX I/O                     1023.89     0.56
VM8           MAX I/O                     1109.32     0.75
VM9           MAX I/O                     1005.18     0.39
Total                                     5573.90   310.87
(Total IOps = total transactions per second; total MBps = iSCSI throughput in MB/s.)
92. Storage Architecture
Test tooling
IOMeter (http://iometer.org)
It is used to validate the actual maximum IOPS and maximum throughput as pushed from VMs using the production host and VM configuration. More testing is needed to determine disk write performance, etc., but this is a good, quick starting point.
P. 10 of the Iometer guide: "The Edit Access Specification dialog shows you how the disk will be accessed. The default is 2-Kilobyte random I/Os with a mix of 67% reads and 33% writes, which represents a typical database workload. You can leave it alone or change it. Press OK to close the dialog when you are through. For maximum throughput (Megabytes per second), try changing the Transfer Request Size to 64K, the Percent Read/Write Distribution to 100% Read, and the Percent Random/Sequential Distribution to 100% Sequential. For the maximum I/O rate (I/O operations per second), try changing the Transfer Request Size to 512 bytes, the Percent Read/Write Distribution to 100% Read, and the Percent Random/Sequential Distribution to 100% Sequential."
Reference MAP data (https://microsoft.com/map): ServerVirtRecommendations, IOPS & Capacity, PerfMetricsResults, Physical Disk Perf Counters.
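The Iometer example on slide 91 can be summarized per access specification to see which test profile drives IOPS and which drives throughput. The following is a minimal sketch, not part of the original deck: the nine rows are copied from the slide, while the function names (normalize_spec, totals_by_spec, grand_total) and the folding of the spelling variants "Max I/O", "MAX I/O", and "MAXIO" into one key are assumptions made for illustration.

```python
from collections import defaultdict

# Rows copied from the slide 91 Iometer example:
# (target name, access specification name, IOps, MBps)
RESULTS = [
    ("VM1", "Default", 20.97, 0.05),
    ("VM2", "Default", 43.95, 0.09),
    ("VM3", "Default", 43.96, 0.09),
    ("VM4", "MAXTP", 874.24, 114.84),
    ("VM5", "MAXTP", 519.11, 72.45),
    ("VM6", "MAXTP", 933.28, 121.65),
    ("VM7", "Max I/O", 1023.89, 0.56),
    ("VM8", "MAX I/O", 1109.32, 0.75),
    ("VM9", "MAXIO", 1005.18, 0.39),
]


def normalize_spec(name):
    """Fold spelling variants of a spec name into one key (assumption:
    'Max I/O', 'MAX I/O' and 'MAXIO' refer to the same access specification)."""
    return name.upper().replace(" ", "").replace("/", "")


def totals_by_spec(rows):
    """Return {spec: (total IOps, total MBps)}, rounded to two decimals."""
    sums = defaultdict(lambda: [0.0, 0.0])
    for _target, spec, iops, mbps in rows:
        key = normalize_spec(spec)
        sums[key][0] += iops
        sums[key][1] += mbps
    return {key: (round(v[0], 2), round(v[1], 2)) for key, v in sums.items()}


def grand_total(rows):
    """Return (total IOps, total MBps) across all targets."""
    return (
        round(sum(row[2] for row in rows), 2),
        round(sum(row[3] for row in rows), 2),
    )


if __name__ == "__main__":
    for spec, (iops, mbps) in totals_by_spec(RESULTS).items():
        print(f"{spec:8s} IOps={iops:9.2f} MBps={mbps:7.2f}")
    total_iops, total_mbps = grand_total(RESULTS)
    print(f"{'TOTAL':8s} IOps={total_iops:9.2f} MBps={total_mbps:7.2f}")
```

Grouping the rows this way makes the pattern in the table explicit: the three MAXTP targets account for almost all of the measured MB/s, while the three maximum-I/O targets contribute the largest share of IOPS at very low throughput. Sizing against only one of the two numbers would therefore give a misleading picture of the SAN.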