OpenStack Control Services High Availability
OpenStack HA Design & Deployment - Kilo
Oct. 2015
Four types of HA in an OpenStack Cloud
[Diagram: HA layers from bottom to top – physical infrastructure (physical nodes, physical network, physical storage, hypervisor, host OS), OpenStack control services (compute controller, network controller, database, message queue, storage), VMs / OpenStack Compute (virtual machine, virtual network, virtual storage, VM mobility), and applications – weighed against service resiliency, QoS, cost, transparency, and data integrity]
Physical Infrastructure HA
• Redundancy across locations – minimize downtime due to power or network issues
• All components distributed across different labs
  • Controller, Network, Compute, Storage
  • Minimize downtime
  • Minimize the risk of data loss
• Eliminate single points of failure
• Extensibility
Physical Infrastructure
Physical Infrastructure - example
[Diagram: example layout across Zone 1 and Zone 2 – redundant controllers (APIs/orchestration/dashboard), compute nodes, and storage in each zone, with the control-plane database (MySQL), message queue, and network spanning both zones]
OpenStack Control Services High Availability Technology
Stateless / Stateful Services
Stateless services – e.g., nova-api, nova-conductor, glance-api, keystone-api, neutron-api, nova-scheduler, Apache web server, cinder-scheduler:
• There is no dependency between requests
• No need for data replication/synchronization; a failed request simply needs to be retried, possibly on a different node

Stateful services – e.g., MySQL, RabbitMQ, cinder-volume, Ceilometer central agent, Neutron L3 and DHCP agents:
• An action typically comprises multiple requests
• Data needs to be replicated and synchronized between redundant service instances (to preserve state and consistency)
Active/Passive or Active/Active

• Active/Passive
  • There is a single master
  • Stateless services are load balanced using a VIP and a load balancer such as HAProxy
  • For stateful services, a replacement resource can be brought online; a separate application monitors these services and brings the backup online as necessary
  • After a failover the system hits a "speed bump": the passive node must first notice the fault on the active node before it becomes active
• Active/Active
  • There are multiple masters
  • Stateless services are load balanced using a VIP and a load balancer such as HAProxy
  • Stateful services are managed so that all instances are redundant and hold an identical state
  • Updates to one database instance propagate to all other instances
  • After a failover the system keeps functioning, possibly in a "degraded" state
Overall Philosophy

Do not reinvent the wheel:
• Leverage time-tested Linux utilities such as Keepalived, HAProxy, and virtual IPs (using VRRP)
• Leverage hardware load balancers
• Leverage replication services for RabbitMQ/MySQL such as RabbitMQ clustering, MySQL master-master replication, Corosync, Pacemaker, DRBD, Galera, and so on
Keepalived, VRRP, HAProxy – for APIs (Active/Active – 2 nodes)

• Keepalived
  • Based on the Linux Virtual Server (IPVS) kernel module, providing layer-4 load balancing
  • Implements a set of checkers to maintain health and load balancing
  • HA is implemented with the VRRP protocol; used to load balance the API services
• VRRP (Virtual Router Redundancy Protocol)
  • Eliminates the SPOF in a statically routed default-gateway environment
• HAProxy
  • Load balancing and proxying for HTTP and TCP applications
  • Works over multiple connections
  • Used to load balance the API services
Corosync, Pacemaker and DRBD – for APIs and MySQL (Active/Passive)

• Corosync
  • Totem single-ring ordering and membership protocol
  • Provides UDP- and InfiniBand-based messaging, quorum, and cluster membership to Pacemaker
• Pacemaker
  • High-availability and load-balancing stack for the Linux platform
  • Interacts with applications through Resource Agents (RAs)
• DRBD (Distributed Replicated Block Device)
  • Synchronizes data at the block-device level
  • Used with a journaling file system (such as ext3 or ext4)
MySQL Galera (Active/Active)

Synchronous multi-master cluster technology for MySQL/InnoDB:
• MySQL patched for wsrep (Write Set REPlication)
• Active/active multi-master topology
• Read and write to any cluster node
• True parallel replication, at row level
• No slave lag or integrity issues
RabbitMQ HA – native
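Native RabbitMQ HA is clustering plus mirrored queues; a minimal sketch of enabling queue mirroring across all cluster nodes (the policy name "ha-all" is arbitrary, not from the original deck):

  # run once on any cluster member: mirror every queue to all nodes
  rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all"}'
  rabbitmqctl list_policies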
Data Redundancy (storage HA)

• Cinder (Block Storage) backend support
  • LVM driver: the default Linux iSCSI server
  • Vendor software plugins: GlusterFS, Ceph, VMware VMDK driver
  • Vendor storage plugins: EMC VNX, IBM Storwize, SolidFire, etc.
  • Local RAID support
• Swift (Object Storage) – done
  • Replication
  • Erasure coding (not enabled)
Networking – Vanilla Neutron L3 agent

• No HA support is needed for L2 networking, which lives on the compute nodes
• Problems
  • Routing on a Linux server (max. bandwidth approximately 3–4 Gbit/s)
  • Limited distribution across multiple network nodes
  • East-west and north-south traffic both pass through the network node
• High availability options
  • Pacemaker & Corosync
  • Keepalived VRRP
  • DVR + VRRP – should be available as of the Juno release
References: Neutron/DVR; L3 High Availability; Configuring DVR in OpenStack Juno
HA methods in different vendors

Vendor | Cluster/Replication Technique | Characteristics
RackSpace | Keepalived, HAProxy, VRRP, DRBD, native clustering | Automatic (Chef); 2-controller-node installation
Red Hat | Pacemaker, Corosync, Galera | Manual installation / Foreman
Cisco | Keepalived, HAProxy, Galera | Manual installation; at least 3 controllers
tcp cloud | Pacemaker, Corosync, HAProxy, Galera, Contrail | Automatic SaltStack deployment
Mirantis | Pacemaker, Corosync, HAProxy, Galera | Automatic (Puppet)
HP | Microsoft Windows based installation with Hyper-V | MS SQL Server and other Windows-based methods
Ubuntu | Juju Charms, Corosync, Percona XtraDB | Juju + MAAS
Comparison

Solution | Database replication method | Strengths | Weaknesses / Limitations
Keepalived/HAProxy/VRRP | Works on MySQL master-master replication | Simple to implement and understand; works for any storage system | Master-master replication does not work beyond 2 nodes
Pacemaker/Corosync/DRBD | Mirroring at the block-device level | Well tested | More complex to set up; split-brain possibility
Galera | Based on write-set replication (wsrep) | No slave lag | Needs at least 3 nodes; relatively new
Others (MySQL Cluster, RHCS with DAS/SAN storage) | – | Well tested | More complex setup
Sample OpenStack HA architecture – 1

• HAProxy for load balancing
• MySQL Galera – active/active
• RabbitMQ cluster
Sample OpenStack HA architecture – 2

• HAProxy for load balancing
• MySQL Galera – active/active
• RabbitMQ cluster
• DVR + VRRP for the network
[Diagram: a Keepalived-managed VIP in front of two HAProxy instances, load balancing two controllers (keystone, glance, cinder, horizon, rabbitmq, nova) backed by a MySQL Galera cluster, redundant block/object storage nodes, and combined network/compute nodes running DVR + VRRP]
Reference

• OpenStack High Availability Guide
• Ubuntu OpenStack HA wiki
• RackSpace OpenStack Control Plane High Availability
• TCP Cloud OpenStack High Availability
• Configuring DVR in OpenStack Juno
• OpenStack High Availability – Controller Stack, by Brian Seltzer
Design & Build-up - Practice
Overall Picture
[Diagram: overall picture – two portal hosts (PortalHost1/2), each running an HAProxy instance and jumpbox VMs, with Keepalived holding VIP1 on the external network and VIP2 on the internal network; behind them, on the internal cloud network, two controllers (keystone, glance, cinder, horizon, rabbitmq, nova, MySQL), multiple compute nodes, two L3 network nodes sharing a Keepalived VIP, and storage nodes; the portal layer connects out to the external network]
Equipment and Software

• OpenStack release: Kilo
• Host computers
  • Cisco UCS for controller, compute, and network nodes
  • SuperMicro servers for storage nodes
• Host OS: Ubuntu 14.04 Server
• Network switches: Cisco Nexus – N7K, N5K, N2K
• IP assignment
  • All hosts use lab-internal IP addresses (for the management/tunnel/storage/… cloud networks) to conserve the IP address pool
  • Jumpboxes are used to reach all cloud hosts from outside; four jumpboxes are set up for redundancy
  • HAProxy provides internal load balancing and the dashboard portal for outside access
• Two portal hosts for redundancy and load balancing
  • Same configuration on both
  • Each node hosts 3 VMs: 2 jumpboxes and 1 HAProxy
• Jumpboxes for cloud management
  • All IP addresses in the cloud are private, reachable from outside only via a jumpbox
  • Applications: VNC, Java, Wireshark, …
  • Repository mirrors for Linux (Ubuntu 14.04) and OpenStack (Kilo)
    • Mirrors are required since the internal network cannot reach the Internet directly
    • Located on the jumpboxes
• Dashboard portal (on the HAProxy nodes)
• VIPs for load balancing
  • VIP1 for external network access
  • VIP2 for load balancing all cloud APIs, the database, the message queue, …
  • Important: the two VIPs should be in one VRRP group
Set up Portal Hosts – Step 1

[Diagram: PortalHost1 and PortalHost2, each running an HAProxy instance and two jumpbox VMs; Keepalived holds VIP1 on the external network (Cisco) and VIP2 on the internal network (cloud)]
Portal Hosts Index Example
[Diagram: IP addressing example – PortalHost1 (IPMI 10.10.10.6, host 10.10.10.9) and PortalHost2 (IPMI 10.10.10.7, host 10.10.10.8); VIP1 on the external network and VIP2 on the internal network; external addresses in 10.10.10.x (10.10.10.10–10.10.10.12, default gw 10.10.10.1) for the VIP and HAProxy VMs, plus jumpboxes at Ubuntu 10.10.10.13, Windows 10.10.10.14, Ubuntu 10.10.10.15, Windows 10.10.10.16; internal addresses in 192.168.222.x (192.168.222.240 with default gw 192.168.222.1, plus 192.168.222.241–244 and 192.168.222.251/252) for the VIP, jumpboxes, and HAProxy VMs]
Assume the 10.x.x.x addresses are company-network IPs and 192.168.*.* is for lab-internal use.
HAProxy and Keepalived set up – Step 1.1

• HAProxy configuration for VIP1 (external network)
• Keepalived configuration
  • VIP1 and VIP2 should be in one VRRP group
  • If one interface fails, the other host takes over the whole function
• HAProxy configuration for VIP2 (internal network)
  • http://docs.openstack.org/high-availability-guide/content/ha-aa-haproxy.html
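A minimal sketch of the Keepalived and HAProxy pieces described above, using illustrative addresses (VIP1 10.10.10.10 external, VIP2 192.168.222.240 internal, controller backends 192.168.222.11/12); adjust to the real addressing plan:

  # /etc/keepalived/keepalived.conf on PortalHost1 (PortalHost2 is identical except
  # state BACKUP and a lower priority); both VIPs sit in ONE vrrp_instance so they
  # always fail over together, as required above
  vrrp_script chk_haproxy {
      script "killall -0 haproxy"
      interval 2
  }
  vrrp_instance portal {
      state MASTER
      interface eth1
      virtual_router_id 51
      priority 101
      advert_int 1
      track_interface { eth0 eth1 }     # any NIC failure triggers a full failover
      track_script { chk_haproxy }
      virtual_ipaddress {
          10.10.10.10/24 dev eth0       # VIP1, external
          192.168.222.240/24 dev eth1   # VIP2, internal
      }
  }

  # /etc/haproxy/haproxy.cfg: dashboard on VIP1 (set net.ipv4.ip_nonlocal_bind=1 so the
  # standby HAProxy can bind a VIP it does not currently hold)
  listen dashboard
      bind 10.10.10.10:80
      balance source
      option httpchk
      server controller1 192.168.222.11:80 check inter 2000 rise 2 fall 5
      server controller2 192.168.222.12:80 check inter 2000 rise 2 fall 5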
Jumpbox and Repository Mirroring – Step 1.2

• Four jumpboxes set up: 2 Windows and 2 Linux
• Software installed: VNC, Wireshark, Vmclient, PuTTY, …
• Repository mirrors for Ubuntu 14.04 and OpenStack Kilo set up on the 2 Linux jumpboxes
  • Hosts on the internal network fetch packages directly from the jumpboxes
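The deck does not name the mirroring tool; as one possible sketch, apt-mirror plus a web server on a Linux jumpbox could serve both repositories:

  apt-get install apt-mirror apache2
  # /etc/apt/mirror.list
  set base_path /var/spool/apt-mirror
  deb http://archive.ubuntu.com/ubuntu trusty main restricted universe multiverse
  deb http://archive.ubuntu.com/ubuntu trusty-updates main restricted universe multiverse
  deb http://ubuntu-cloud.archive.canonical.com/ubuntu trusty-updates/kilo main
  # sync, then expose the mirror tree over HTTP for the internal network
  su - apt-mirror -c apt-mirror
  ln -s /var/spool/apt-mirror/mirror/archive.ubuntu.com/ubuntu /var/www/html/ubuntu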
NIS set up on the HAProxy hosts (optional)

• NIS servers set up for the cloud infrastructure, for host configuration, authentication, …
• Two NIS servers set up on the HAProxy hosts: master and slave
Cloud hosts – Step 2

• 3 UCS hosts for controller, database, and message-queue services, located in two racks
• Better to have the network / compute nodes all located on UCS-B hosts
• Be sure the MAC pools are set differently on different fabric interconnects (FIs); otherwise there will be MAC address conflicts
• Complete all cabling and network configuration on the UCS chassis and upstream switches
• Verify network connectivity of all IPMI ports
• Write down the whole configuration in a detailed document
• Similar settings apply when using other compute hosts
VLANs / IP design

• 2 portal hosts (2 HAProxy, 4 jumpbox VMs); on each portal host:
  • IPMI: VLAN aaa (external network)
  • eth0: 7 external IPs – VLAN eee
  • eth1: 7 internal IPs – VLAN mmm (VLAN access port)
• All cloud hosts
  • IPMI VLAN/network (lab internal) – accessible from the external network via a jumpbox
  • Management VLAN/network (lab internal) – accessible from the external network via a jumpbox
  • Tunnel VLAN/network (lab internal) – not externally accessible
  • Storage VLAN/network (lab internal) – not externally accessible
  • Other internal networks (e.g. an internal VLAN network)
Host system preparation – check on each host

• Network configuration for each node
  • Hosts on the same network can reach each other
  • Each host can reach the HAProxy nodes and jumpboxes via the management interface
  • Each host can reach HAProxy VIP2 via the management interface
  • Hosts setup: the name controller-vip is used for the APIs
• Install the Ubuntu Cloud Archive keyring and repository
  • Use the mirror address instead of the standard one
• Update packages on each system via the mirror on the jumpbox
• Verification
  1. NTP: ntpq -c peers
  2. Connectivity: can reach HAProxy and the jumpboxes
  3. Repository setup: /etc/apt/sources.list.d/…
  4. Packages upgraded
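A minimal sketch of the repository and name setup checked above (the mirror name and the VIP2 address are illustrative):

  # /etc/hosts: every host resolves the API name to VIP2
  192.168.222.240   controller-vip
  # /etc/apt/sources.list.d/cloudarchive-kilo.list: Kilo packages via the jumpbox mirror
  deb http://<jumpbox-mirror>/ubuntu trusty-updates/kilo main
  # then on each host
  apt-get install ubuntu-cloud-keyring
  apt-get update && apt-get dist-upgrade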
NTP set up – Step 3

• NTP source: select a stable NTP source on the external network as the standard time server
• The VMs on the portal hosts (jumpboxes and HAProxy) should be configured to follow that standard time server
• All internal hosts in the cloud should follow the HAProxy hosts, using VIP2
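A minimal sketch of the two-tier NTP layout (the upstream server name and the VIP2 address are illustrative):

  # /etc/ntp.conf on the portal-host VMs (jumpboxes, HAProxy): follow the external standard time server
  server 0.ubuntu.pool.ntp.org iburst
  # /etc/ntp.conf on every internal cloud host: follow the portal layer via VIP2
  server 192.168.222.240 iburst
  # verify on each host
  ntpq -c peers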
MySQL/MariaDB Galera Setup – Step 4

• MySQL/MariaDB Galera is deployed on 3 hosts: the 2 controllers plus one more
  • Make sure InnoDB is configured
• Configure HAProxy to listen for the Galera cluster and load balance it (port 3306)
• Verification
  • A table created on one node can be accessed/manipulated from another
  • MySQL works through VIP2; verify tolerance of a single-node failure
  • Access from a jumpbox works fine
• References:
  • http://docs.openstack.org/high-availability-guide/content/ha-aa-db-mysql-galera.html
  • Product webpage: http://www.codership.com/content/using-galera-cluster/
  • Download: http://www.codership.com/downloads/download-mysqlgalera
  • Documentation: http://www.codership.com/wiki
  • More information about wsrep: https://launchpad.net/wsrep
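A minimal sketch of the Galera and HAProxy configuration for this step (node names, addresses, and the check user are illustrative; option names follow the Kilo-era Galera packages):

  # /etc/mysql/conf.d/galera.cnf on each of the three database nodes
  [mysqld]
  binlog_format            = ROW
  default_storage_engine   = InnoDB
  innodb_autoinc_lock_mode = 2
  wsrep_provider           = /usr/lib/galera/libgalera_smm.so
  wsrep_cluster_name       = "openstack_db"
  wsrep_cluster_address    = "gcomm://controller1,controller2,db3"
  wsrep_node_name          = "controller1"        # per node
  wsrep_node_address       = "192.168.222.11"     # per node

  # /etc/haproxy/haproxy.cfg: one writer at a time avoids certification conflicts;
  # the haproxy_check MySQL user must exist for the health check
  listen galera
      bind 192.168.222.240:3306
      balance source
      option mysql-check user haproxy_check
      server controller1 192.168.222.11:3306 check
      server controller2 192.168.222.12:3306 check backup
      server db3         192.168.222.13:3306 check backup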
RabbitMQ Cluster Setup – Step 5

• Deploy on 3 nodes, including the two controllers
• Configure them as a cluster; all nodes are disk nodes
• Instead of configuring HAProxy for load balancing (port 5672), configure the services with multiple rabbit_hosts
• Verification
  • rabbitmqadmin tool
  • rabbitmqctl status
• References:
  • http://docs.openstack.org/high-availability-guide/content/ha-aa-rabbitmq.html
  • http://88250.b3log.org/rabbitmq-clustering-ha
  • OpenStack High Availability: RabbitMQ
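A minimal sketch of forming the 3-node cluster and verifying it (node names are illustrative; /var/lib/rabbitmq/.erlang.cookie must already be identical on all nodes):

  # on each of the two joining nodes
  rabbitmqctl stop_app
  rabbitmqctl join_cluster rabbit@controller1    # joins as a disk node by default
  rabbitmqctl start_app
  # verification
  rabbitmqctl cluster_status
  rabbitmqctl status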
Control services set up

• Services covered:
  • All OpenStack API services
  • All OpenStack schedulers
  • Memcached service (multiple instances can be configured; consider later)
• API services
  • Use VIP2 when configuring the Keystone endpoints
  • All configuration files should refer to VIP2
• Schedulers: use RabbitMQ as the message system, with the hosts configured as in
  http://docs.openstack.org/admin-guide-cloud/content/section_telemetry-cetral-compute-agent-ha.html
• The Telemetry central agent setup can be load balanced
• See also: http://docs.openstack.org/high-availability-guide/content/ha-aa-controllers.html
Identity Service – Keystone – Step 6

• Installation
  • Add the database to MySQL and grant privileges (once)
  • Install the Keystone components on each node (once per node)
  • Configure keystone.conf
    • Configure the backend to the SQL database
    • Disable caching if needed (to be tried)
  • Configure HAProxy for the API
  • Configure the Keystone token backend to SQL (the default is memcached)
• Services/endpoints/users/roles/projects & verification
  • Create the admin, demo, and service projects and the corresponding users/roles, …
  • Create on one node and verify on another node
  • Verify they work through VIP2
• See also:
  • http://docs.openstack.org/high-availability-guide/content/s-keystone.html
  • http://docs.openstack.org/kilo/config-reference/content/section_keystone.conf.html
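A minimal sketch of the HA-relevant Keystone settings and a VIP-based endpoint registration (the password and the controller-vip name are illustrative; command syntax follows the Kilo install guide):

  # /etc/keystone/keystone.conf on each controller
  [database]
  connection = mysql://keystone:KEYSTONE_DBPASS@controller-vip/keystone
  [token]
  driver = keystone.token.persistence.backends.sql.Token

  # register the identity endpoint against the VIP, not an individual controller
  openstack endpoint create \
    --publicurl   http://controller-vip:5000/v2.0 \
    --internalurl http://controller-vip:5000/v2.0 \
    --adminurl    http://controller-vip:35357/v2.0 \
    --region RegionOne identity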
Image Service – Glance – Step 7

• Shared storage is required for Glance HA
  • In the pilot cloud the controller's local file system was used as image storage; for HA this does not work
• Use Swift as the Glance backend
  • Swift itself needs to be HA
    • At least two storage nodes
    • At least two Swift proxy nodes, installed on the controllers together with Glance
  • Use Keystone for authentication instead of Swauth
[Diagram: HAProxy1/HAProxy2 expose VIP2 in front of two controllers, each running keystone, glance, MySQL, and a Swift proxy, with the Swift storage cluster behind them]
Object Storage Installation – Step 7.1

• Installation
  • Install two Swift proxies
    • The proxies can be located on the controller nodes; configure VIP2 for them for load balancing
  • Install two storage nodes on B-series hosts, two disks each (four disks in total)
  • Configure 3 replicas for HA
  • "No account" error fix – upgrade the Swift client to 2.3.2
    • There is a bug, fixed in python-swiftclient 2.3.2 (not in the Kilo release)
• Verification
  • A file can be put into storage via one proxy and fetched via the other
  • Object write/get via VIP2
  • Failure cases
• See also:
  • http://docs.openstack.org/kilo/install-guide/install/apt/content/ch_swift.html
  • https://bugs.launchpad.net/python-swiftclient/+bug/1372465
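A minimal sketch of building the object ring with 3 replicas across the two storage nodes and four disks (IPs and device names are illustrative):

  swift-ring-builder object.builder create 10 3 1
  swift-ring-builder object.builder add r1z1-192.168.222.21:6000/sdb 100
  swift-ring-builder object.builder add r1z1-192.168.222.21:6000/sdc 100
  swift-ring-builder object.builder add r1z2-192.168.222.22:6000/sdb 100
  swift-ring-builder object.builder add r1z2-192.168.222.22:6000/sdc 100
  swift-ring-builder object.builder rebalance
  # repeat for container.builder (port 6001) and account.builder (port 6002)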
Image Service – Glance – Step 7.2

• Install Glance on each controller
  • Use the file system as the backend first and verify that Glance works with the local file system
• Configure HAProxy for the Glance API and Glance Registry services
  • There will be an "Unknown version" warning in the Glance API log
  • Change the HAProxy httpchk setting to fix it:
    • option httpchk GET /versions
• See also:
  • http://docs.openstack.org/juno/config-reference/content/section_glance-api.conf.html
  • https://bugzilla.redhat.com/show_bug.cgi?id=1245572
  • http://docs.openstack.org/high-availability-guide/content/s-glance-api.html
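A minimal sketch of the Glance API stanza with the health-check fix applied (addresses are illustrative):

  # /etc/haproxy/haproxy.cfg
  listen glance-api
      bind 192.168.222.240:9292
      balance source
      option httpchk GET /versions
      server controller1 192.168.222.11:9292 check inter 2000 rise 2 fall 5
      server controller2 192.168.222.12:9292 check inter 2000 rise 2 fall 5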
Integration of Glance and Swift – Step 7.3

• Prerequisites
  • The Swift object store has been installed and verified
  • Glance is installed on the controllers and verified with the local file system as the backend
• Integration
  • Configure the Swift store as the Glance backend
  • Configure the Keystone token backend to SQL (important)
    • Or configure multiple memcached hosts in the configuration file
• Verification
  • Upload an image and list images successfully on each controller node
• See also:
  • http://behindtheracks.com/2014/05/openstack-high-availability-glance-and-swift/
  • http://thornelabs.net/2014/08/03/use-openstack-swift-as-a-backend-store-for-glance.html
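A minimal sketch of pointing Glance at Swift (Kilo-era single-tenant store options; the service credentials and controller-vip name are illustrative):

  # /etc/glance/glance-api.conf on each controller
  [glance_store]
  default_store = swift
  stores = glance.store.swift.Store
  swift_store_auth_address = http://controller-vip:5000/v2.0/
  swift_store_user = service:glance
  swift_store_key = GLANCE_PASS
  swift_store_create_container_on_put = True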
Install Compute Services – Step 8

• Install the Nova-related packages
  • On the two controller nodes and on one compute node
  • Compute nodes need to be set up as a "Virtualization Host"; otherwise the installation will later fail due to dependency issues
• Configure HAProxy for the Nova services
• List the Nova services
  • Verify that RabbitMQ works in the HA environment
  • There should be redundant Nova APIs, schedulers, conductors, … listed
  • Further verification requires the network nodes to be set up
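A minimal sketch of the HA-relevant nova.conf lines (Kilo-era option names; host names are illustrative; multiple rabbit_hosts are listed instead of a RabbitMQ VIP, as in Step 5):

  # /etc/nova/nova.conf on the controllers (compute nodes reuse the same messaging settings)
  [DEFAULT]
  rabbit_hosts = controller1:5672,controller2:5672,controller3:5672
  rabbit_ha_queues = true
  [database]
  connection = mysql://nova:NOVA_DBPASS@controller-vip/nova
  [glance]
  host = controller-vip
  # point the [keystone_authtoken] URLs at controller-vip as well

  # quick check for redundant nova-api / nova-scheduler / nova-conductor entries
  nova service-list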
Network Redundancy – Step 9

• DVR + 3 network nodes (distributed SNAT / DHCP redundancy)
• Multiple options, but none is perfect
  • Pacemaker & Corosync
  • Keepalived VRRP
  • DVR + VRRP – should be available in the Juno/Kilo releases
• References:
  • http://docs.openstack.org/networking-guide/scenario_dvr_ovs.html
  • https://blogs.oracle.com/ronen/entry/diving_into_openstack_network_architecture
  • http://assafmuller.com/2014/08/16/layer-3-high-availability/
Service Layout for DVR mode
• Network node
  • Services required are the same as in the centralized mode
• Compute node
  • The compute node does networking too: L3 services are added on the compute node, which takes over most networking functions
• Supports GRE/VXLAN/VLAN/FLAT networks
  • In our system, GRE is used for tunneling between instances and for SNAT
  • A VLAN network is not required if we do not use it

DVR General Architecture

• The network node mainly hosts centralized network services such as DHCP, metadata, and SNAT
  • Only north/south traffic with a fixed IP needs forwarding by the network node
• Compute nodes handle DNAT
  • East/west traffic, and north/south traffic with a floating IP, does not go through the network node
Install one Network and two Compute nodes – Step 9.1

• This is just load balancing of the networking, not real HA
• Install all services listed in the service-layout picture, on the network node and the compute nodes respectively
• Configuration
  • router_distributed = True
  • ……
• Create routers, networks, and instances
• Verification
  • North/south for instances with a fixed IP address (SNAT, via the network node)
  • North/south for instances with a floating IP address (DNAT, via the compute node only)
  • East/west for instances using different networks on the same router (via the compute nodes only)
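A minimal sketch of the Kilo DVR settings behind "router_distributed = True" (file paths follow the stock Ubuntu packages; only the DVR-relevant lines are shown):

  # /etc/neutron/neutron.conf on the controllers: new routers are distributed by default
  [DEFAULT]
  router_distributed = True

  # /etc/neutron/l3_agent.ini
  [DEFAULT]
  agent_mode = dvr_snat      # on the network node (hosts centralized SNAT)
  # agent_mode = dvr         # on each compute node

  # OVS agent configuration (ml2_conf.ini): L2 population is required for DVR over tunnels
  [agent]
  enable_distributed_routing = True
  l2_population = True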
L3 network redundancy (TBD) – Step 9.2

• Add one more network node
  • DHCP redundancy
  • Networking L3 agent redundancy
  • Networking metadata agent
• Kilo does not support combining the DVR and L3 HA mechanisms
• This is not implemented in our practice, but it should be feasible
  • The key is to keep all configuration (static/dynamic) in sync
  • Two ways to go:
    • Pacemaker + Corosync …
    • VRRP + Keepalived (the network node needs a reboot when one goes down)
Volume redundancy – Step 10

• Cinder services installation
  • Install the Cinder services on each controller
  • Configure HAProxy for the API
• Storage node setup
  • SuperMicro equipment as storage
  • Linux soft RAID for disk redundancy
  • GlusterFS for node redundancy
• Verification
  • Create and access a volume through a client on the jumpbox – try both controllers
  • Run failover cases at the disk level
  • Run failover cases at the node level
[Diagram: HAProxy1/HAProxy2 expose VIP2 in front of two controllers, each running keystone, MySQL, and Cinder, backed by SuperMicro storage nodes with local RAID joined into a GlusterFS volume]
Any other storage node (e.g. a normal computer) would work the same way; SuperMicro is used because it provides >100 TB of storage on one node.
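A minimal sketch of the GlusterFS backend for cinder-volume in Kilo (the Gluster host and volume name are illustrative; the GlusterFS driver was still shipped in the Kilo release):

  # /etc/cinder/cinder.conf on the nodes running cinder-volume
  [DEFAULT]
  enabled_backends = glusterfs
  [glusterfs]
  volume_driver = cinder.volume.drivers.glusterfs.GlusterfsDriver
  glusterfs_shares_config = /etc/cinder/glusterfs_shares
  volume_backend_name = GLUSTER

  # /etc/cinder/glusterfs_shares: one replicated Gluster volume spanning the storage nodes
  storage1:/cinder-volumes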
Dashboard redundancy – Step 11

• Horizon services installation
  • Install the Horizon services on each controller
  • Configure the Horizon services
    • Use the external URL name for the console setting (instead of "controller")
  • Configure memcached
    • /etc/openstack-dashboard/local_settings.py – change the CACHES LOCATION to VIP2
    • /etc/memcached.conf – change 127.0.0.1 to the controller's own IP
  • Configure HAProxy for the API
    • Configure VIP1 to proxy to the internal controllers
    • Make sure the dashboard is accessible from the external network
• Verification
  • Access the dashboard from a jumpbox
  • Access the dashboard from the external network
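A minimal sketch of the cache changes described above (the VIP2 and controller addresses are illustrative):

  # /etc/openstack-dashboard/local_settings.py on each controller
  CACHES = {
      'default': {
          'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
          'LOCATION': '192.168.222.240:11211',   # VIP2
      }
  }
  # /etc/memcached.conf on each controller: listen on the controller's own IP instead of 127.0.0.1
  -l 192.168.222.11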
Orchestration redundancy – Step 12

• Heat services installation
  • Install the Heat services on each controller
  • Configure the Heat services
• Configure HAProxy for the Heat API
• Verification
Telemetry redundancy (TBD) – Step 13

• Ceilometer services installation
  • Install the Ceilometer services on each controller
  • Configure the Ceilometer services
• Configure HAProxy for the Ceilometer API
• Verification