8/2/2019 Cluster Sun Conceptos
http://slidepdf.com/reader/full/cluster-sun-conceptos 1/116
Sun Cluster Concepts Guide for Solaris OS
Sun Microsystems, Inc.
4150 Network Circle
Santa Clara, CA 95054
U.S.A.
Part No: 820–4676–10
January 2009, Revision A
Copyright 2009 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.
Sun Microsystems, Inc. has intellectual property rights relating to technology embodied in the product that is described in this document. In particular, and without limitation, these intellectual property rights may include one or more U.S. patents or pending patent applications in the U.S. and in other countries.
U.S. Government Rights – Commercial software. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements.
This distribution may include materials developed by third parties. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd.
Sun, Sun Microsystems, the Sun logo, the Solaris logo, the Java Coffee Cup logo, docs.sun.com, OpenBoot, Solaris Volume Manager, StorEdge, Sun Fire, Java, and Solaris are trademarks or registered trademarks of Sun Microsystems, Inc. or its subsidiaries in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun's licensees who implement OPEN LOOK GUIs and otherwise comply with Sun's written license agreements.
Products covered by and information contained in this publication are controlled by U.S. Export Control laws and may be subject to the export or import laws in other countries. Nuclear, missile, chemical or biological weapons or nuclear maritime end uses or end users, whether direct or indirect, are strictly prohibited. Export or reexport to countries subject to U.S. embargo or to entities identified on U.S. export exclusion lists, including, but not limited to, the denied persons and specially designated nationals lists is strictly prohibited.
DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
Copyright 2009 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, CA 95054 U.S.A. Tous droits réservés.
Sun Microsystems, Inc. détient les droits de propriété intellectuelle relatifs à la technologie incorporée dans le produit qui est décrit dans ce document. En particulier, et ce sans limitation, ces droits de propriété intellectuelle peuvent inclure un ou plusieurs brevets américains ou des applications de brevet en attente aux Etats-Unis et dans d'autres pays.
Cette distribution peut comprendre des composants développés par des tierces personnes.
Certains composants de ce produit peuvent être dérivés du logiciel Berkeley BSD, licencié par l'Université de Californie. UNIX est une marque déposée aux Etats-Unis et dans d'autres pays; elle est licenciée exclusivement par X/Open Company, Ltd.
Sun, Sun Microsystems, le logo Sun, le logo Solaris, le logo Java Coffee Cup, docs.sun.com, OpenBoot, Solaris Volume Manager, StorEdge, Sun Fire, Java et Solaris sont des marques de fabrique ou des marques déposées de Sun Microsystems, Inc., ou ses filiales, aux Etats-Unis et dans d'autres pays. Toutes les marques SPARC sont utilisées sous licence et sont des marques de fabrique ou des marques déposées de SPARC International, Inc. aux Etats-Unis et dans d'autres pays. Les produits portant les marques SPARC sont basés sur une architecture développée par Sun Microsystems, Inc.
L'interface d'utilisation graphique OPEN LOOK et Sun a été développée par Sun Microsystems, Inc. pour ses utilisateurs et licenciés. Sun reconnaît les efforts de pionniers de Xerox pour la recherche et le développement du concept des interfaces d'utilisation visuelle ou graphique pour l'industrie de l'informatique. Sun détient une licence non exclusive de Xerox sur l'interface d'utilisation graphique Xerox, cette licence couvrant également les licenciés de Sun qui mettent en place l'interface d'utilisation graphique OPEN LOOK et qui, en outre, se conforment aux licences écrites de Sun.
Les produits qui font l'objet de cette publication et les informations qu'elle contient sont régis par la legislation américaine en matière de contrôle des exportations et peuvent être soumis au droit d'autres pays dans le domaine des exportations et importations. Les utilisations finales, ou utilisateurs finaux, pour des armes nucléaires, des missiles, des armes chimiques ou biologiques ou pour le nucléaire maritime, directement ou indirectement, sont strictement interdites. Les exportations ou réexportations vers des pays sous embargo des Etats-Unis, ou vers des entités figurant sur les listes d'exclusion d'exportation américaines, y compris, mais de manière non exclusive, la liste de personnes qui font l'objet d'un ordre de ne pas participer, d'une façon directe ou indirecte, aux exportations des produits ou des services qui sont régis par la legislation américaine en matière de contrôle des exportations et la liste de ressortissants spécifiquement désignés, sont rigoureusement interdites.
LA DOCUMENTATION EST FOURNIE "EN L'ETAT" ET TOUTES AUTRES CONDITIONS, DECLARATIONS ET GARANTIES EXPRESSES OU TACITES SONT FORMELLEMENT EXCLUES, DANS LA MESURE AUTORISEE PAR LA LOI APPLICABLE, Y COMPRIS NOTAMMENT TOUTE GARANTIE IMPLICITE RELATIVE A LA QUALITE MARCHANDE, A L'APTITUDE A UNE UTILISATION PARTICULIERE OU A L'ABSENCE DE CONTREFACON.
081112@21288
Contents
Preface .....................................................................................................................................................7

1 Introduction and Overview ...............................................................................................................13
Introduction to the Sun Cluster Environment ................................................................................ 14
Three Views of the Sun Cluster Software ......................................................................................... 15
Hardware Installation and Service View ................................................................................... 15
System Administrator View ....................................................................................................... 16
Application Developer View ...................................................................................................... 18
Sun Cluster Software Tasks ................................................................................................................ 18

2 Key Concepts for Hardware Service Providers ............................................................................... 21
Sun Cluster System Hardware and Software Components ............................................................ 21
Cluster Nodes ............................................................................................................................... 22
Software Components for Cluster Hardware Members .......................................................... 23
Multihost Devices ........................................................................................................................ 24
Multi-Initiator SCSI .................................................................................................................... 25
Local Disks .................................................................................................................................... 25
Removable Media ......................................................................................................................... 26
Cluster Interconnect .................................................................................................................... 26
Public Network Interfaces ........................................................................................................... 27
Client Systems .............................................................................................................................. 27
Console Access Devices ............................................................................................................... 28
Administrative Console ............................................................................................................... 28
SPARC: Sun Cluster Topologies ........................................................................................................ 29
SPARC: Clustered Pair Topology ............................................................................................... 30
SPARC: Pair+N Topology ........................................................................................................... 31
SPARC: N+1 (Star) Topology ..................................................................................................... 31
SPARC: N*N (Scalable) Topology ............................................................................................. 32
SPARC: LDoms Guest Domains: Cluster in a Box Topology ................................................. 33
SPARC: LDoms Guest Domains: Single Cluster Spans Two Different Hosts Topology ..... 34
SPARC: LDoms Guest Domains: Clusters Span Two Different Hosts Topology ................. 35
SPARC: LDoms Guest Domains: Redundant I/O Domains ................................................... 37
x86: Sun Cluster Topologies ............................................................................................................... 38
x86: Clustered Pair Topology ..................................................................................................... 39
x86: N+1 (Star) Topology ............................................................................................................ 39
3 Key Concepts for System Administrators and Application Developers ..................................... 41
Administrative Interfaces ................................................................................................................... 42
Cluster Time ........................................................................................................................................ 42
High-Availability Framework ............................................................................................................ 43
Zone Membership ........................................................................................................................ 44
Cluster Membership Monitor .................................................................................................... 44
Failfast Mechanism ...................................................................................................................... 44
Cluster Configuration Repository (CCR) ................................................................................. 45
Global Devices ..................................................................................................................................... 46
Device IDs and DID Pseudo Driver ........................................................................................... 46
Device Groups ..................................................................................................................................... 47
Device Group Failover ................................................................................................................. 47
Multiported Device Groups ........................................................................................................ 48
Global Namespace ............................................................................................................................... 49
Local and Global Namespaces Example .................................................................................... 50
Cluster File Systems ............................................................................................................................ 51
Using Cluster File Systems .......................................................................................................... 51
HAStoragePlus Resource Type ................................................................................................... 52
syncdir Mount Option ................................................................................................................ 53
Disk Path Monitoring ......................................................................................................................... 53
DPM Overview ............................................................................................................................. 54
Monitoring Disk Paths ................................................................................................................ 55
Quorum and Quorum Devices .......................................................................................................... 56
About Quorum Vote Counts ...................................................................................................... 58
About Quorum Configurations ................................................................................................. 58
Adhering to Quorum Device Requirements ............................................................................ 59
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
Adhering to Quorum Device Best Practices ............................................................................. 60
Recommended Quorum Configurations .................................................................................. 61
Atypical Quorum Configurations .............................................................................................. 62
Bad Quorum Configurations ..................................................................................................... 63
Data Services .......................................................................................................................... .............. 64
Data Service Methods .................................................................................................................. 67
Failover Data Services .................................................................................................................. 67
Scalable Data Services .................................................................................................................. 68
Load-Balancing Policies .............................................................................................................. 69
Failback Settings ........................................................................................................................... 71
Data Services Fault Monitors ...................................................................................................... 71
Developing New Data Services .......................................................................................................... 71
Characteristics of Scalable Services ............................................................................................ 72
Data Service API and Data Service Development Library API .............................................. 73
Using the Cluster Interconnect for Data Service Traffic ................................................................. 73
Resources, Resource Groups, and Resource Types ......................................................................... 74
Resource Group Manager (RGM) ............................................................................................. 75
Resource and Resource Group States and Settings .................................................................. 75
Resource and Resource Group Properties ................................................................................ 77
Support for Solaris Zones ................................................................................................................... 77
Support for Global-Cluster Non-Voting Nodes (Solaris Zones) Directly Through the RGM .............................................................................................................................................. 78
Support for Solaris Zones on Sun Cluster Nodes Through Sun Cluster HA for Solaris Containers .................................................................................................................................... 79
Service Management Facility ............................................................................................................. 80
System Resource Usage ....................................................................................................................... 81
System Resource Monitoring ..................................................................................................... 82
Control of CPU ............................................................................................................................ 82
Viewing System Resource Usage ................................................................................................ 83
Data Service Project Configuration ................................................................................................... 83
Determining Requirements for Project Configuration ........................................................... 85
Setting Per-Process Virtual Memory Limits ............................................................................. 86
Failover Scenarios ........................................................................................................................ 87
Public Network Adapters and IP Network Multipathing ............................................................... 92
SPARC: Dynamic Reconfiguration Support .................................................................................... 94
SPARC: Dynamic Reconfiguration General Description ....................................................... 94
SPARC: DR Clustering Considerations for CPU Devices ....................................................... 95
SPARC: DR Clustering Considerations for Memory ............................................................... 95
SPARC: DR Clustering Considerations for Disk and Tape Drives ........................................ 95
SPARC: DR Clustering Considerations for Quorum Devices ................................................ 96
SPARC: DR Clustering Considerations for Cluster Interconnect Interfaces ....................... 96
SPARC: DR Clustering Considerations for Public Network Interfaces ................................ 96
4 Frequently Asked Questions .............................................................................................................99
High Availability FAQs ....................................................................................................................... 99
File Systems FAQs ............................................................................................................................. 100
Volume Management FAQs ............................................................................................................. 101
Data Services FAQs ............................................................................................................................ 101
Public Network FAQs ........................................................................................................................ 102
Cluster Member FAQs ...................................................................................................................... 103
Cluster Storage FAQs ........................................................................................................................ 104
Cluster Interconnect FAQs ............................................................................................................... 104
Client Systems FAQs ......................................................................................................................... 105
Administrative Console FAQs ......................................................................................................... 105
Terminal Concentrator and System Service Processor FAQs ...................................................... 106
Index ................................................................................................................................................... 109
Preface
The Sun Cluster Concepts Guide for Solaris OS contains conceptual and reference information about the Sun™ Cluster product on both SPARC® and x86 based systems.
Note – This Sun Cluster release supports systems that use the SPARC and x86 families of processor architectures: UltraSPARC, SPARC64, AMD64, and Intel 64. In this document, x86 refers to the larger family of 64-bit x86 compatible products. Information in this document pertains to all platforms unless otherwise specified.
Who Should Use This Book

This document is intended for the following audiences:
■ Service providers who install and service cluster hardware
■ System administrators who install, configure, and administer Sun Cluster software
■ Application developers who develop failover and scalable services for applications that are not currently included with the Sun Cluster product
To understand the concepts that are described in this book, you need to be familiar with the Solaris Operating System and also have expertise with the volume manager software that you can use with the Sun Cluster product.
Before reading this document, you need to have already determined your system requirements and purchased the equipment and software that you need. The Sun Cluster Data Services Planning and Administration Guide for Solaris OS contains information about how to plan, install, set up, and use the Sun Cluster software.
How This Book Is Organized

The Sun Cluster Concepts Guide for Solaris OS contains the following chapters:
Chapter 1, “Introduction and Overview,” provides an overview of the overall concepts that you need to know about Sun Cluster.
Chapter 2, “Key Concepts for Hardware Service Providers,” describes the concepts with which hardware service providers need to be familiar. These concepts can help service providers understand the relationships between hardware components. These concepts can also help service providers and cluster administrators better understand how to install, configure, and administer cluster software and hardware.
Chapter 3, “Key Concepts for System Administrators and Application Developers,” describes the concepts that system administrators and developers who intend to use the Sun Cluster application programming interface (API) need to know. Developers can use this API to turn a standard user application, such as a web browser or database, into a highly available data service that can run in the Sun Cluster environment.
Chapter 4, “Frequently Asked Questions,” provides answers to frequently asked questions about the Sun Cluster product.
Related Documentation

Information about related Sun Cluster topics is available in the documentation that is listed in the following table. All Sun Cluster documentation is available at http://docs.sun.com.
Topic                                     Documentation

Overview                                  Sun Cluster Overview for Solaris OS
                                          Sun Cluster 3.2 1/09 Documentation Center
Concepts                                  Sun Cluster Concepts Guide for Solaris OS
Hardware installation and administration  Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS
                                          Individual hardware administration guides
Software installation                     Sun Cluster Software Installation Guide for Solaris OS
                                          Sun Cluster Quick Start Guide for Solaris OS
Data service installation and             Sun Cluster Data Services Planning and Administration Guide for Solaris OS
administration                            Individual data service guides
Topic                                     Documentation

Data service development                  Sun Cluster Data Services Developer's Guide for Solaris OS
System administration                     Sun Cluster System Administration Guide for Solaris OS
                                          Sun Cluster Quick Reference
Software upgrade                          Sun Cluster Upgrade Guide for Solaris OS
Error messages                            Sun Cluster Error Messages Guide for Solaris OS
Command and function references           Sun Cluster Reference Manual for Solaris OS
                                          Sun Cluster Data Services Reference Manual for Solaris OS
                                          Sun Cluster Quorum Server Reference Manual for Solaris OS
For a complete list of Sun Cluster documentation, see the release notes for your release of Sun Cluster software at http://wikis.sun.com/display/SunCluster/Home/.
Getting Help

If you have problems installing or using the Sun Cluster software, contact your service provider and provide the following information:
■ Your name and email address (if available)
■ Your company name, address, and phone number
■ The model and serial numbers of your systems
■ The release number of the operating system (for example, the Solaris 10 OS)
■ The release number of Sun Cluster software (for example, 3.2 1/09)
Use the following commands to gather information about your systems for your service provider.
Command                                   Function

prtconf -v                                Displays the size of the system memory and reports
                                          information about peripheral devices
psrinfo -v                                Displays information about processors
showrev -p                                Reports which patches are installed
SPARC: prtdiag -v                         Displays system diagnostic information
/usr/cluster/bin/clnode show-rev          Displays Sun Cluster release and package version
                                          information
Also have available the contents of the /var/adm/messages file.
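The commands above can be run one at a time, but the following POSIX sh loop is a hypothetical helper (not part of the Sun Cluster product) that captures the output of each command into one file, noting any command that is not installed on the host:

```shell
# collect_sysinfo: run each diagnostic command that the service
# provider asks for, printing a banner per command and a note when
# a command is not installed on this machine.
collect_sysinfo() {
    for cmd in "prtconf -v" "psrinfo -v" "showrev -p" \
               "prtdiag -v" "/usr/cluster/bin/clnode show-rev"; do
        printf '==== %s ====\n' "$cmd"
        name=${cmd%% *}                  # first word is the executable
        if command -v "$name" >/dev/null 2>&1; then
            $cmd 2>&1                    # word splitting supplies the flags
        else
            printf '(%s is not available on this system)\n' "$name"
        fi
    done
}

# Capture everything in one file to send along with /var/adm/messages.
collect_sysinfo > sysinfo.txt
```

On a non-Solaris or non-clustered machine the script still completes, recording which diagnostic commands were unavailable.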
Documentation, Support, and Training

The Sun web site provides information about the following additional resources:
Typographic Conventions

The following table describes the typographic conventions that are used in this book.
TABLE P–1  Typographic Conventions

Typeface      Meaning                                           Example

AaBbCc123     The names of commands, files, and directories,    Edit your .login file.
              and onscreen computer output                      Use ls -a to list all files.
                                                                machine_name% you have mail.
AaBbCc123     What you type, contrasted with onscreen           machine_name% su
              computer output                                   Password:
aabbcc123     Placeholder: replace with a real name or value    The command to remove a file is rm filename.
AaBbCc123     Book titles, new terms, and terms to be           Read Chapter 6 in the User's Guide.
              emphasized                                        A cache is a copy that is stored locally.
                                                                Do not save the file.
                                                                Note: Some emphasized items appear bold online.
Shell Prompts in Command Examples

The following table shows the default UNIX® system prompt and superuser prompt for the C shell, Bourne shell, and Korn shell.
TABLE P–2  Shell Prompts

Shell                                        Prompt

C shell                                      machine_name%
C shell for superuser                        machine_name#
Bourne shell and Korn shell                  $
Bourne shell and Korn shell for superuser    #
Introduction and Overview
The Sun Cluster product is an integrated hardware and software solution that you use to create highly available and scalable services. The Sun Cluster Concepts Guide for Solaris OS provides the conceptual information that you need to gain a more complete picture of the Sun Cluster product. Use this book with the entire Sun Cluster documentation set for a complete view of the Sun Cluster software.
This chapter provides an overview of the general concepts that underlie the Sun Cluster product.
This chapter does the following:
■ Provides an introduction and high-level overview of the Sun Cluster software
■ Describes the several views of the Sun Cluster audience
■ Identifies key concepts that you need to understand before you use the Sun Cluster software
■ Maps key concepts to the Sun Cluster documentation that includes procedures and related information
■ Maps cluster-related tasks to the documentation that contains procedures that you use to complete those tasks
This chapter contains the following sections:

■ “Introduction to the Sun Cluster Environment” on page 14
■ “Three Views of the Sun Cluster Software” on page 15
■ “Sun Cluster Software Tasks” on page 18
Introduction to the Sun Cluster Environment
The Sun Cluster environment extends the Solaris Operating System into a cluster operating system. A cluster is a collection of one or more nodes that belong exclusively to that collection.

In a cluster that runs on the Solaris 10 OS, a global cluster and a zone cluster are types of clusters.

In a cluster that runs on any version of the Solaris OS that was released before the Solaris 10 OS, a node is a physical machine that contributes to cluster membership and is not a quorum device.

In a cluster that runs on the Solaris 10 OS, the concept of a node changes. In this environment, a node is a Solaris zone that is associated with a cluster. In this environment, a Solaris host, or simply host, is one of the following hardware or software configurations that runs the Solaris OS and its own processes:
■ A “bare metal” physical machine that is not configured with a virtual machine or as a hardware domain
■ A Sun Logical Domains (LDoms) guest domain
■ A Sun Logical Domains (LDoms) I/O domain
■ A hardware domain
These processes communicate with one another to form what looks like (to a network client) a single system that cooperatively provides applications, system resources, and data to users.
In a Solaris 10 environment, a global cluster is a type of cluster that is composed only of one or more global-cluster voting nodes and optionally, zero or more global-cluster non-voting nodes.
Note – A global cluster can optionally also include solaris8, solaris9, lx (linux), or native brand, non-global zones that are not nodes, but high availability containers (as resources).
A global-cluster voting node is a native brand, global zone in a global cluster that contributes votes to the total number of quorum votes, that is, membership votes in the cluster. This total determines whether the cluster has sufficient votes to continue operating. A global-cluster non-voting node is a native brand, non-global zone in a global cluster that does not contribute votes to the total number of quorum votes, that is, membership votes in the cluster.
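The "sufficient votes" test is simple majority arithmetic. The sketch below is an illustration of that rule, not an interface of the product: a cluster partition keeps operating only while the votes it holds form a strict majority of all configured quorum votes.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Return True when a cluster partition holds a strict majority
    of all configured quorum votes and so may continue operating."""
    return votes_present > total_votes / 2

# A three-vote configuration keeps running with two votes present,
# but a single surviving vote out of three must halt.
print(has_quorum(2, 3))   # True
print(has_quorum(1, 3))   # False
```

The strict-majority requirement is what prevents two partitions of the same cluster from both claiming quorum at the same time.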
In a Solaris 10 environment, a zone cluster is a type o cluster that is composed only o one or
more cluster brand, voting nodes. A zone cluster depends on, and thereore requires, a global
cluster. A global cluster does not contain a zone cluster. You cannot congure a zone cluster
without a global cluster. A zone cluster has, at most, one zone cluster node on a machine.
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
Note – A zone-cluster node continues to operate only as long as the global-cluster voting node on the same machine continues to operate. If a global-cluster voting node on a machine fails, all zone-cluster nodes on that machine fail as well.

A cluster offers several advantages over traditional single-server systems. These advantages include support for failover and scalable services, capacity for modular growth, and low entry price compared to traditional hardware fault-tolerant systems.

The goals of the Sun Cluster software are:

■ Reduce or eliminate system downtime because of software or hardware failure
■ Ensure availability of data and applications to end users, regardless of the kind of failure that would normally take down a single-server system
■ Increase application throughput by enabling services to scale to additional processors by adding nodes to the cluster
■ Provide enhanced availability of the system by enabling you to perform maintenance without shutting down the entire cluster

For more information about fault tolerance and high availability, see “Making Applications Highly Available With Sun Cluster” in Sun Cluster Overview for Solaris OS.

Refer to “High Availability FAQs” on page 99 for questions and answers on high availability.
Three Views of the Sun Cluster Software

This section describes three different views of the Sun Cluster software and the key concepts and documentation relevant to each view.

These views are typical for the following professionals:

■ Hardware installation and service personnel
■ System administrators
■ Application developers

Hardware Installation and Service View

To hardware service professionals, the Sun Cluster software looks like a collection of off-the-shelf hardware that includes servers, networks, and storage. These components are all cabled together so that every component has a backup and no single point of failure exists.
Chapter 1 • Introduction and Overview
Key Concepts – Hardware

Hardware service professionals need to understand the following cluster concepts.

■ Cluster hardware configurations and cabling
■ Installing and servicing (adding, removing, replacing):
  ■ Network interface components (adapters, junctions, cables)
  ■ Disk interface cards
  ■ Disk arrays
  ■ Disk drives
  ■ The administrative console and the console access device
■ Setting up the administrative console and console access device

More Hardware Conceptual Information

The following sections contain material relevant to the preceding key concepts:

■ “Cluster Nodes” on page 22
■ “Multihost Devices” on page 24
■ “Local Disks” on page 25
■ “Cluster Interconnect” on page 26
■ “Public Network Interfaces” on page 27
■ “Client Systems” on page 27
■ “Administrative Console” on page 28
■ “Console Access Devices” on page 28
■ “SPARC: Clustered Pair Topology” on page 30
■ “SPARC: N+1 (Star) Topology” on page 31

Sun Cluster Documentation for Hardware Professionals

The Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS includes procedures and information that are associated with hardware service concepts.

System Administrator View

To the system administrator, the Sun Cluster product is a set of Solaris hosts that share storage devices.

The system administrator sees software that performs specific tasks:

■ Specialized cluster software that is integrated with Solaris software to monitor the connectivity between Solaris hosts in the cluster
■ Specialized software that monitors the health of user application programs that are running on the cluster nodes
■ Volume management software that sets up and administers disks
■ Specialized cluster software that enables all Solaris hosts to access all storage devices, even those Solaris hosts that are not directly connected to disks
■ Specialized cluster software that enables files to appear on every Solaris host as though they were locally attached to that Solaris host

Key Concepts – System Administration

System administrators need to understand the following concepts and processes:

■ The interaction between the hardware and software components
■ The general flow of how to install and configure the cluster, including:
  ■ Installing the Solaris Operating System
  ■ Installing and configuring Sun Cluster software
  ■ Installing and configuring a volume manager
  ■ Installing and configuring application software to be cluster ready
  ■ Installing and configuring Sun Cluster data service software
■ Cluster administrative procedures for adding, removing, replacing, and servicing cluster hardware and software components
■ Configuration modifications to improve performance

More System Administrator Conceptual Information

The following sections contain material relevant to the preceding key concepts:

■ “Administrative Interfaces” on page 42
■ “Cluster Time” on page 42
■ “High-Availability Framework” on page 43
■ “Global Devices” on page 46
■ “Device Groups” on page 47
■ “Global Namespace” on page 49
■ “Cluster File Systems” on page 51
■ “Disk Path Monitoring” on page 53
■ “Data Services” on page 64

Sun Cluster Documentation for System Administrators

The following Sun Cluster documents include procedures and information associated with the system administration concepts:

■ Sun Cluster Software Installation Guide for Solaris OS
■ Sun Cluster System Administration Guide for Solaris OS
■ Sun Cluster Error Messages Guide for Solaris OS
■ Sun Cluster Release Notes for Solaris OS
Application Developer View

The Sun Cluster software provides data services for such applications as Oracle, NFS, DNS, Sun Java System Web Server, Apache Web Server (on SPARC based systems), and Sun Java System Directory Server. Data services are created by configuring off-the-shelf applications to run under control of the Sun Cluster software. The Sun Cluster software provides configuration files and management methods that start, stop, and monitor the applications. If you need to create a new failover or scalable service, you can use the Sun Cluster Application Programming Interface (API) and the Data Service Enabling Technologies API (DSET API) to develop the necessary configuration files and management methods that enable the application to run as a data service on the cluster.

Key Concepts – Application Development

Application developers need to understand the following:

■ The characteristics of their application to determine whether it can be made to run as a failover or scalable data service.
■ The Sun Cluster API, DSET API, and the “generic” data service. Developers need to determine which tool is most suitable for them to use to write programs or scripts to configure their application for the cluster environment.

More Application Developer Conceptual Information

The following sections contain material relevant to the preceding key concepts:

■ “Data Services” on page 64
■ “Resources, Resource Groups, and Resource Types” on page 74
■ Chapter 4, “Frequently Asked Questions”

Sun Cluster Documentation for Application Developers

The following Sun Cluster documents include procedures and information associated with the application developer concepts:

■ Sun Cluster Data Services Developer's Guide for Solaris OS
■ Sun Cluster Data Services Planning and Administration Guide for Solaris OS

Sun Cluster Software Tasks

All Sun Cluster software tasks require some conceptual background. The following table provides a high-level view of the tasks and the documentation that describes task steps. The concepts sections in this book describe how the concepts map to these tasks.
TABLE 1–1 Task Map: Mapping User Tasks to Documentation

Install cluster hardware – Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS

Install Solaris software on the cluster – Sun Cluster Software Installation Guide for Solaris OS

SPARC: Install Sun Management Center software – Sun Cluster Software Installation Guide for Solaris OS

Install and configure Sun Cluster software – Sun Cluster Software Installation Guide for Solaris OS

Install and configure volume management software – Sun Cluster Software Installation Guide for Solaris OS; your volume management documentation

Install and configure Sun Cluster data services – Sun Cluster Data Services Planning and Administration Guide for Solaris OS

Service cluster hardware – Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS

Administer Sun Cluster software – Sun Cluster System Administration Guide for Solaris OS

Administer volume management software – Sun Cluster System Administration Guide for Solaris OS and your volume management documentation

Administer application software – Your application documentation

Problem identification and suggested user actions – Sun Cluster Error Messages Guide for Solaris OS

Create a new data service – Sun Cluster Data Services Developer's Guide for Solaris OS
Key Concepts for Hardware Service Providers

This chapter describes the key concepts that are related to the hardware components of a Sun Cluster configuration.

This chapter covers the following topics:

■ “Sun Cluster System Hardware and Software Components” on page 21
■ “SPARC: Sun Cluster Topologies” on page 29
■ “x86: Sun Cluster Topologies” on page 38

Sun Cluster System Hardware and Software Components

This information is directed primarily to hardware service providers. These concepts can help service providers understand the relationships between the hardware components before they install, configure, or service cluster hardware. Cluster system administrators might also find this information useful as background to installing, configuring, and administering cluster software.

A cluster is composed of several hardware components, including the following:

■ Solaris hosts with local disks (unshared)
■ Multihost storage (disks are shared between Solaris hosts)
■ Removable media (tapes and CD-ROMs)
■ Cluster interconnect
■ Public network interfaces
■ Client systems
■ Administrative console
■ Console access devices

The Sun Cluster software enables you to combine these components into a variety of configurations. The following sections describe these configurations.

■ “SPARC: Sun Cluster Topologies” on page 29
■ “x86: Sun Cluster Topologies” on page 38
For an illustration of a sample two-host cluster configuration, see “Sun Cluster Hardware Environment” in Sun Cluster Overview for Solaris OS.

Cluster Nodes

In a cluster that runs on any version of the Solaris OS that was released before the Solaris 10 OS, a node is a physical machine that contributes to cluster membership and is not a quorum device. In a cluster that runs on the Solaris 10 OS, the concept of a node changes. In this environment, a node is a Solaris zone that is associated with a cluster. In this environment, a Solaris host, or simply host, is one of the following hardware or software configurations that runs the Solaris OS and its own processes:

■ A “bare metal” physical machine that is not configured with a virtual machine or as a hardware domain
■ A Sun Logical Domains (LDoms) guest domain
■ A Sun Logical Domains (LDoms) I/O domain
■ A hardware domain

Depending on your platform, Sun Cluster software supports the following configurations:

■ SPARC: Sun Cluster software supports from one to sixteen Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of SPARC based systems. See “SPARC: Sun Cluster Topologies” on page 29 for the supported configurations.
■ x86: Sun Cluster software supports from one to eight Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of x86 based systems. See “x86: Sun Cluster Topologies” on page 38 for the supported configurations.

Solaris hosts are generally attached to one or more multihost devices. Hosts that are not attached to multihost devices use the cluster file system to access the multihost devices. For example, one scalable services configuration enables hosts to service requests without being directly attached to multihost devices.

In addition, hosts in parallel database configurations share concurrent access to all the disks.

■ See “Multihost Devices” on page 24 for information about concurrent access to disks.
■ See “SPARC: Clustered Pair Topology” on page 30 and “x86: Clustered Pair Topology” on page 39 for more information about parallel database configurations.

All nodes in the cluster are grouped under a common name (the cluster name), which is used for accessing and managing the cluster.
Public network adapters attach hosts to the public networks, providing client access to the cluster.

Cluster members communicate with the other hosts in the cluster through one or more physically independent networks. This set of physically independent networks is referred to as the cluster interconnect.

Every node in the cluster is aware when another node joins or leaves the cluster. Additionally, every node in the cluster is aware of the resources that are running locally as well as the resources that are running on the other cluster nodes.

Hosts in the same cluster should have similar processing, memory, and I/O capability to enable failover to occur without significant degradation in performance. Because of the possibility of failover, every host must have enough excess capacity to support the workload of all hosts for which they are a backup or secondary.

Each host boots its own individual root (/) file system.

Software Components for Cluster Hardware Members

To function as a cluster member, a Solaris host must have the following software installed:

■ Solaris Operating System
■ Sun Cluster software
■ Data service application
■ Volume management (Solaris Volume Manager or Veritas Volume Manager)

An exception is a configuration that uses hardware redundant array of independent disks (RAID). This configuration might not require a software volume manager such as Solaris Volume Manager or Veritas Volume Manager.

■ See the Sun Cluster Software Installation Guide for Solaris OS for information about how to install the Solaris Operating System, Sun Cluster, and volume management software.
■ See the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for information about how to install and configure data services.
■ See Chapter 3, “Key Concepts for System Administrators and Application Developers,” for conceptual information about the preceding software components.

The following figure provides a high-level view of the software components that work together to create the Sun Cluster environment.
Chapter 2 • Key Concepts for Hardware Service Providers
See Chapter 4, “Frequently Asked Questions,” for questions and answers about cluster members.

Multihost Devices

Disks that can be connected to more than one Solaris host at a time are multihost devices. In the Sun Cluster environment, multihost storage makes disks highly available. Sun Cluster software requires multihost storage for two-host clusters to establish quorum. Greater than two-host clusters do not require quorum devices. For more information about quorum, see “Quorum and Quorum Devices” on page 56.

Multihost devices have the following characteristics.

■ Tolerance of single-host failures.
■ Ability to store application data, application binaries, and configuration files.
■ Protection against host failures. If clients request the data through one host and the host fails, the requests are switched over to use another host with a direct connection to the same disks.
■ Global access through a primary host that “masters” the disks, or direct concurrent access through local paths. The only application that uses direct concurrent access currently is Oracle Real Application Clusters Guard.

A volume manager provides for mirrored or RAID-5 configurations for data redundancy of the multihost devices. Currently, Sun Cluster supports Solaris Volume Manager and Veritas Volume Manager as volume managers, and the RDAC RAID-5 hardware controller on several hardware RAID platforms.

Combining multihost devices with disk mirroring and disk striping protects against both host failure and individual disk failure.

See Chapter 4, “Frequently Asked Questions,” for questions and answers about multihost storage.
[FIGURE 2–1 High-Level Relationship of Sun Cluster Software Components: within the Solaris operating environment, kernel and user space layer the data service software, volume management software, and Sun Cluster software]
Multi-Initiator SCSI

This section applies only to SCSI storage devices and not to Fibre Channel storage that is used for the multihost devices.

In a standalone (that is, non-clustered) host, the host controls the SCSI bus activities by way of the SCSI host adapter circuit that connects this host to a particular SCSI bus. This SCSI host adapter circuit is referred to as the SCSI initiator. This circuit initiates all bus activities for this SCSI bus. The default SCSI address of SCSI host adapters in Sun systems is 7.

Cluster configurations share storage between multiple hosts, using multihost devices. When the cluster storage consists of single-ended or differential SCSI devices, the configuration is referred to as multi-initiator SCSI. As this terminology implies, more than one SCSI initiator exists on the SCSI bus.

The SCSI specification requires each device on a SCSI bus to have a unique SCSI address. (The host adapter is also a device on the SCSI bus.) The default hardware configuration in a multi-initiator environment results in a conflict because all SCSI host adapters default to 7.

To resolve this conflict, on each SCSI bus, leave one of the SCSI host adapters with the SCSI address of 7, and set the other host adapters to unused SCSI addresses. Proper planning dictates that these “unused” SCSI addresses include both currently and eventually unused addresses. An example of addresses unused in the future is the addition of storage by installing new drives into empty drive slots.

In most configurations, the available SCSI address for a second host adapter is 6.

You can change the selected SCSI addresses for these host adapters by using one of the following tools to set the scsi-initiator-id property:

■ eeprom(1M)
■ The OpenBoot PROM on a SPARC based system
■ The SCSI utility that you optionally run after the BIOS boots on an x86 based system

You can set this property globally for a host or on a per-host-adapter basis. Instructions for setting a unique scsi-initiator-id for each SCSI host adapter are included in Sun Cluster 3.1 - 3.2 With SCSI JBOD Storage Device Manual for Solaris OS.
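For example, on a two-host cluster you might leave one host's adapters at the default address of 7 and move the other host's adapters to 6. The following configuration fragment is a sketch of the two global approaches named above, not a verified procedure; see the Sun Cluster 3.1 - 3.2 With SCSI JBOD Storage Device Manual for Solaris OS for the supported steps:

```shell
# From the running Solaris OS on the second host, set the property
# globally for all host adapters with eeprom(1M):
eeprom scsi-initiator-id=6

# Equivalently, from the OpenBoot PROM ok prompt on a SPARC based system:
#   ok setenv scsi-initiator-id 6
#   ok reset-all
```

Setting the property globally changes every host adapter in the host, so per-adapter settings remain necessary when a host also has adapters on non-shared buses.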
Local Disks

Local disks are the disks that are only connected to a single Solaris host. Local disks are, therefore, not protected against host failure (they are not highly available). However, all disks, including local disks, are included in the global namespace and are configured as global devices. Therefore, the disks themselves are visible from all cluster hosts.
You can make the file systems on local disks available to other hosts by placing them under a global mount point. If the host that currently has one of these global file systems mounted fails, all hosts lose access to that file system. Using a volume manager lets you mirror these disks so that a failure cannot cause these file systems to become inaccessible, but volume managers do not protect against host failure.

See the section “Global Devices” on page 46 for more information about global devices.
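As an illustration, a global mount point is typically requested through the global mount option in each host's /etc/vfstab. The entry below is only a sketch; the device group name (nfsset), metadevice (d100), and mount point (/global/nfs) are hypothetical and would match your own configuration:

```
# /etc/vfstab entry for a globally mounted UFS file system
/dev/md/nfsset/dsk/d100  /dev/md/nfsset/rdsk/d100  /global/nfs  ufs  2  yes  global,logging
```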
Removable Media

Removable media such as tape drives and CD-ROM drives are supported in a cluster. In general, you install, configure, and service these devices in the same way as in a nonclustered environment. These devices are configured as global devices in Sun Cluster, so each device can be accessed from any node in the cluster. Refer to Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS for information about installing and configuring removable media.

See the section “Global Devices” on page 46 for more information about global devices.
Cluster Interconnect

The cluster interconnect is the physical configuration of devices that is used to transfer cluster-private communications and data service communications between Solaris hosts in the cluster. Because the interconnect is used extensively for cluster-private communications, it can limit performance.

Only hosts in the cluster can be connected to the cluster interconnect. The Sun Cluster security model assumes that only cluster hosts have physical access to the cluster interconnect.

You can set up from one to six cluster interconnects in a cluster. While a single cluster interconnect reduces the number of adapter ports that are used for the private interconnect, it provides no redundancy and less availability. If a single interconnect fails, moreover, the cluster is at a higher risk of having to perform automatic recovery. Whenever possible, install two or more cluster interconnects to provide redundancy and scalability, and therefore higher availability, by avoiding a single point of failure.

The cluster interconnect consists of three hardware components: adapters, junctions, and cables. The following list describes each of these hardware components.

■ Adapters – The network interface cards that are located in each cluster host. Their names are constructed from a device name immediately followed by a physical-unit number, for example, qe2. Some adapters have only one physical network connection, but others, like the qe card, have multiple physical connections. Some adapters also contain both network interfaces and storage interfaces.
A network adapter with multiple interfaces could become a single point of failure if the entire adapter fails. For maximum availability, plan your cluster so that the only path between two hosts does not depend on a single network adapter.

■ Junctions – The switches that are located outside of the cluster hosts. Junctions perform pass-through and switching functions to enable you to connect more than two hosts. In a two-host cluster, you do not need junctions because the hosts can be directly connected to each other through redundant physical cables connected to redundant adapters on each host. Greater than two-host configurations generally require junctions.
■ Cables – The physical connections that you install either between two network adapters or between an adapter and a junction.

See Chapter 4, “Frequently Asked Questions,” for questions and answers about the cluster interconnect.
Public Network Interfaces

Clients connect to the cluster through the public network interfaces. Each network adapter card can connect to one or more public networks, depending on whether the card has multiple hardware interfaces.

You can set up Solaris hosts in the cluster to include multiple public network interface cards that perform the following functions:

■ Are configured so that multiple cards are active.
■ Serve as failover backups for one another.

If one of the adapters fails, IP network multipathing software is called to fail over the defective interface to another adapter in the group.

No special hardware considerations relate to clustering for the public network interfaces.

See Chapter 4, “Frequently Asked Questions,” for questions and answers about public networks.
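As an illustration of the failover-backup arrangement described above, a pair of public network adapters can be placed in one IP network multipathing (IPMP) group through /etc/hostname.* configuration files on Solaris 10. The adapter names (qfe0, qfe1), group name, and hostname below are hypothetical; see your Solaris IP multipathing documentation for the exact syntax on your release:

```
# /etc/hostname.qfe0 – first public network adapter, in IPMP group sc_ipmp0
clusternode1-hostname group sc_ipmp0 up

# /etc/hostname.qfe1 – second adapter in the same group, held as a standby
group sc_ipmp0 standby up
```

With this arrangement, if qfe0 fails, the IPMP daemon moves its addresses to qfe1 without client-visible reconfiguration.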
Client Systems

Client systems include machines or other hosts that access the cluster over the public network. Client-side programs use data or other services that are provided by server-side applications running on the cluster.

Client systems are not highly available. Data and applications on the cluster are highly available.

See Chapter 4, “Frequently Asked Questions,” for questions and answers about client systems.
Console Access Devices

You must have console access to all Solaris hosts in the cluster.

To gain console access, use one of the following devices:

■ The terminal concentrator that you purchased with your cluster hardware
■ The System Service Processor (SSP) on Sun Enterprise E10000 servers (for SPARC based clusters)
■ The system controller on Sun Fire servers (also for SPARC based clusters)
■ Another device that can access ttya on each host

Only one supported terminal concentrator is available from Sun, and use of the supported Sun terminal concentrator is optional. The terminal concentrator enables access to /dev/console on each host by using a TCP/IP network. The result is console-level access for each host from a remote machine anywhere on the network.

The System Service Processor (SSP) provides console access for Sun Enterprise E10000 servers. The SSP is a processor card in a machine on an Ethernet network that is configured to support the Sun Enterprise E10000 server. The SSP is the administrative console for the Sun Enterprise E10000 server. Using the Sun Enterprise E10000 Network Console feature, any machine in the network can open a host console session.

Other console access methods include other terminal concentrators, tip serial port access from another host, and dumb terminals.

Caution – You can attach a keyboard or monitor to a cluster host provided that the keyboard or monitor is supported by the base server platform. However, you cannot use that keyboard or monitor as a console device. You must redirect the console to a serial port, or depending on your machine, to the System Service Processor (SSP) and Remote System Control (RSC) by setting the appropriate OpenBoot PROM parameter.
Administrative Console

You can use a dedicated machine, known as the administrative console, to administer the active cluster. Usually, you install and run administrative tool software, such as the Cluster Control Panel (CCP) and the Sun Cluster module for the Sun Management Center product (for use with SPARC based clusters only), on the administrative console. Using cconsole under the CCP enables you to connect to more than one host console at a time. For more information about how to use the CCP, see Chapter 1, “Introduction to Administering Sun Cluster,” in Sun Cluster System Administration Guide for Solaris OS.
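For example, the multi-console capability mentioned above is typically exercised as follows from the administrative console. The cluster name sc-cluster is illustrative, and the install path assumes the standard Sun Cluster console package location:

```shell
# Start the Cluster Control Panel for a cluster:
/opt/SUNWcluster/bin/ccp sc-cluster &

# Or open console windows to every host in the cluster at once:
/opt/SUNWcluster/bin/cconsole sc-cluster &
```

Each cconsole window tracks one host's console, and a common input window sends keystrokes to all hosts simultaneously.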
The administrative console is not a cluster host. You use the administrative console for remote access to the cluster hosts, either over the public network, or optionally through a network-based terminal concentrator.

If your cluster consists of the Sun Enterprise E10000 platform, you must do the following:

■ Log in from the administrative console to the SSP.
■ Connect by using the netcon command.

Typically, you configure hosts without monitors. Then, you access the host's console through a telnet session from the administrative console. The administrative console is connected to a terminal concentrator, and from the terminal concentrator to the host's serial port. In the case of a Sun Enterprise E10000 server, you connect from the System Service Processor. See “Console Access Devices” on page 28 for more information.

Sun Cluster does not require a dedicated administrative console, but using one provides these benefits:

■ Enables centralized cluster management by grouping console and management tools on the same machine
■ Provides potentially quicker problem resolution by your hardware service provider

See Chapter 4, “Frequently Asked Questions,” for questions and answers about the administrative console.
SPARC: Sun Cluster Topologies

A topology is the connection scheme that connects the Solaris hosts in the cluster to the storage platforms that are used in a Sun Cluster environment. Sun Cluster software supports any topology that adheres to the following guidelines.

■ A Sun Cluster environment that is composed of SPARC based systems supports from one to sixteen Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of SPARC based systems.
■ A shared storage device can connect to as many hosts as the storage device supports.
■ Shared storage devices do not need to connect to all hosts of the cluster. However, these storage devices must connect to at least two hosts.

You can configure logical domains (LDoms) guest domains and LDoms I/O domains as virtual Solaris hosts. In other words, you can create a clustered pair, pair+N, N+1, and N*N cluster that consists of any combination of physical machines, LDoms I/O domains, and LDoms guest domains. You can also create clusters that consist of only LDoms guest domains, LDoms I/O domains, or any combination of the two.
Sun Cluster software does not require you to configure a cluster by using specific topologies. The following topologies are described to provide the vocabulary to discuss a cluster's connection scheme. These topologies are typical connection schemes.

■ Clustered pair
■ Pair+N
■ N+1 (star)
■ N*N (scalable)
■ LDoms Guest Domains: Cluster in a Box
■ LDoms Guest Domains: Single Cluster Spans Two Different Hosts
■ LDoms Guest Domains: Clusters Span Two Different Hosts
■ LDoms Guest Domains: Redundant I/O Domains

The following sections include sample diagrams of each topology.
SPARC: Clustered Pair Topology

A clustered pair topology is two or more pairs of Solaris hosts that operate under a single cluster administrative framework. In this configuration, failover occurs only between a pair. However, all hosts are connected by the cluster interconnect and operate under Sun Cluster software control. You might use this topology to run a parallel database application on one pair and a failover or scalable application on another pair.

Using the cluster file system, you could also have a two-pair configuration. More than two hosts can run a scalable service or parallel database, even though all the hosts are not directly connected to the disks that store the application data.

The following figure illustrates a clustered pair configuration.
FIGURE 2–2 SPARC: Clustered Pair Topology [diagram: Hosts 1–4 connected through two interconnect junctions, with shared storage attached to each pair]
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
SPARC: Pair+N Topology

The pair+N topology includes a pair of Solaris hosts that are directly connected to the following:

■ Shared storage.
■ An additional set of hosts that use the cluster interconnect to access shared storage (they have no direct connection themselves).

The following figure illustrates a pair+N topology where two of the four hosts (Host 3 and Host 4) use the cluster interconnect to access the storage. This configuration can be expanded to include additional hosts that do not have direct access to the shared storage.
SPARC: N+1 (Star) Topology

An N+1 topology includes some number of primary Solaris hosts and one secondary host. You do not have to configure the primary hosts and secondary host identically. The primary hosts actively provide application services. The secondary host need not be idle while waiting for a primary host to fail.

The secondary host is the only host in the configuration that is physically connected to all the multihost storage.

If a failure occurs on a primary host, Sun Cluster fails over the resources to the secondary host. The secondary host is where the resources function until they are switched back (either automatically or manually) to the primary host.

The secondary host must always have enough excess CPU capacity to handle the load if one of the primary hosts fails.

The following figure illustrates an N+1 configuration.
FIGURE 2–3 Pair+N Topology [diagram: Hosts 1 and 2 directly connected to shared storage; Hosts 3 and 4 attached only through the two interconnect junctions]
SPARC: N*N (Scalable) Topology

An N*N topology enables every shared storage device in the cluster to connect to every Solaris host in the cluster. This topology enables highly available applications to fail over from one host to another without service degradation. When failover occurs, the new host can access the storage device by using a local path instead of the private interconnect.

The following figure illustrates an N*N configuration.
FIGURE 2–4 SPARC: N+1 Topology [diagram: three primary hosts (Hosts 1–3) and one secondary host (Host 4), with only the secondary connected to all shared storage through the interconnect junctions]
FIGURE 2–5 SPARC: N*N Topology [diagram: Hosts 1–4 all connected through two interconnect junctions to every shared storage device]
SPARC: LDoms Guest Domains: Cluster in a Box Topology

In this logical domains (LDoms) guest domain topology, a cluster and every node within that cluster are located on the same Solaris host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. To preclude your having to include a quorum device, this configuration includes three nodes rather than only two.

In this topology, you do not need to connect each virtual switch (vsw) for the private network to a physical network because they need only communicate with each other. In this topology, cluster nodes can also share the same storage device, as all cluster nodes are located on the same host. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see “How to Install Sun Logical Domains Software and Create Domains” in Sun Cluster Software Installation Guide for Solaris OS.

This topology does not provide high availability, as all nodes in the cluster are located on the same host. However, developers and administrators might find this topology useful for testing and other non-production tasks. This topology is also called a “cluster in a box”.

The following figure illustrates a cluster in a box configuration.
SPARC: LDoms Guest Domains: Single Cluster Spans Two Different Hosts Topology

In this logical domains (LDoms) guest domain topology, a single cluster spans two different Solaris hosts and each cluster comprises one node on each host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see “How to Install Sun Logical Domains Software and Create Domains” in Sun Cluster Software Installation Guide for Solaris OS.
FIGURE 2–6 SPARC: Cluster in a Box Topology [diagram: three guest domain nodes (Nodes 1–3) on one host, connected through virtual switches (VSW 0 and VSW 1 private, VSW 2 public) to an I/O domain with a physical adapter, the public network, and storage]
The following figure illustrates a configuration in which a single cluster spans two different hosts.
SPARC: LDoms Guest Domains: Clusters Span Two Different Hosts Topology

In this logical domains (LDoms) guest domain topology, each cluster spans two different Solaris hosts and each cluster comprises one node on each host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. In this configuration, because both clusters share the same interconnect switch, you must specify a different private network address on each cluster. Otherwise, if you specify the same private network address on clusters that share an interconnect switch, the configuration fails.
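The private network address is normally chosen when you configure the cluster with scinstall. As a sketch of how a non-default address might be assigned afterward, the cluster command exposes network properties; the property names below (private_netaddr, private_netmask) and the requirement to run the command while the nodes are in noncluster mode are assumptions to verify against the cluster(1CL) man page for your release.

```shell
# Hypothetical sketch: give this cluster a non-default private network
# address so that it does not collide with another cluster that shares
# the same interconnect switch.
cluster set-netprops -p private_netaddr=172.16.4.0 -p private_netmask=255.255.254.0
```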
FIGURE 2–7 SPARC: Single Cluster Spans Two Different Hosts [diagram: one guest domain node on each of two hosts; each host's I/O domain provides private and public virtual switches and a physical adapter; the private virtual switches are joined by the cluster interconnect, and both hosts attach to the public network and shared storage]
To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see “How to Install Sun Logical Domains Software and Create Domains” in Sun Cluster Software Installation Guide for Solaris OS.

The following figure illustrates a configuration in which more than a single cluster spans two different hosts.
FIGURE 2–8 SPARC: Clusters Span Two Different Hosts [diagram: two clusters, each comprising one guest domain node on each of two hosts (Cluster 1 in Guest Domains 1 and 2, Cluster 2 in Guest Domains 3 and 4); both clusters share the same interconnect switch, with I/O domains providing the virtual switches, physical adapters, public network, and storage connections]
SPARC: LDoms Guest Domains: Redundant I/O Domains

In this logical domains (LDoms) guest domain topology, multiple I/O domains ensure that guest domains, or nodes within the cluster, continue to operate if an I/O domain fails. Each LDoms guest domain node acts the same as a Solaris host in a cluster.

In this topology, the guest domain runs IP network multipathing (IPMP) across two public networks, one through each I/O domain. Guest domains also mirror storage devices across different I/O domains. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see “How to Install Sun Logical Domains Software and Create Domains” in Sun Cluster Software Installation Guide for Solaris OS.

The following figure illustrates a configuration in which redundant I/O domains ensure that nodes within the cluster continue to operate if an I/O domain fails.
x86: Sun Cluster Topologies
A topology is the connection scheme that connects the cluster nodes to the storage platforms that are used in the cluster. Sun Cluster supports any topology that adheres to the following guidelines.

■ Sun Cluster software supports from one to eight Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of x86 based systems. See “x86: Sun Cluster Topologies” on page 38 for the supported host configurations.
■ Shared storage devices must connect to hosts.
FIGURE 2–9 SPARC: Redundant I/O Domains [diagram: each of two hosts runs a primary and an alternate I/O domain; each guest domain node runs IPMP across two public virtual switches, one through each I/O domain, and mirrors storage across the I/O domains]
Sun Cluster does not require you to configure a cluster by using specific topologies. The following clustered pair topology, which is a topology for clusters that are composed of x86 based hosts, is described to provide the vocabulary to discuss a cluster's connection scheme. This topology is a typical connection scheme.

The following section includes a sample diagram of the topology.
x86: Clustered Pair Topology

A clustered pair topology is two Solaris hosts that operate under a single cluster administrative framework. In this configuration, failover occurs only between a pair. However, all hosts are connected by the cluster interconnect and operate under Sun Cluster software control. You might use this topology to run a parallel database or a failover or scalable application on the pair.

The following figure illustrates a clustered pair configuration.
x86: N+1 (Star) Topology

An N+1 topology includes some number of primary Solaris hosts and one secondary host. You do not have to configure the primary hosts and secondary host identically. The primary hosts actively provide application services. The secondary host need not be idle while waiting for a primary host to fail.

The secondary host is the only host in the configuration that is physically connected to all the multihost storage.

If a failure occurs on a primary host, Sun Cluster fails over the resources to the secondary host. The secondary host is where the resources function until they are switched back (either automatically or manually) to the primary host.

The secondary host must always have enough excess CPU capacity to handle the load if one of the primary hosts fails.
FIGURE 2–10 x86: Clustered Pair Topology [diagram: Hosts 1 and 2 joined by private interconnects and the public network, each attached to shared storage]
The following figure illustrates an N+1 configuration.
FIGURE 2–11 x86: N+1 Topology [diagram: three primary hosts (Hosts 1–3) and one secondary host (Host 4), with only the secondary connected to all shared storage through the interconnect junctions]
Key Concepts for System Administrators and Application Developers
This chapter describes the key concepts that are related to the software components of the Sun Cluster environment. The information in this chapter is directed primarily to system administrators and application developers who use the Sun Cluster API and SDK. Cluster administrators can use this information in preparation for installing, configuring, and administering cluster software. Application developers can use the information to understand the cluster environment in which they work.
This chapter covers the following topics:

■ “Administrative Interfaces” on page 42
■ “Cluster Time” on page 42
■ “High-Availability Framework” on page 43
■ “Global Devices” on page 46
■ “Device Groups” on page 47
■ “Global Namespace” on page 49
■ “Cluster File Systems” on page 51
■ “Disk Path Monitoring” on page 53
■ “Quorum and Quorum Devices” on page 56
■ “Data Services” on page 64
■ “Developing New Data Services” on page 71
■ “Using the Cluster Interconnect for Data Service Traffic” on page 73
■ “Resources, Resource Groups, and Resource Types” on page 74
■ “Support for Solaris Zones” on page 77
■ “Service Management Facility” on page 80
■ “System Resource Usage” on page 81
■ “Data Service Project Configuration” on page 83
■ “Public Network Adapters and IP Network Multipathing” on page 92
■ “SPARC: Dynamic Reconfiguration Support” on page 94
Administrative Interfaces

You can choose how you install, configure, and administer the Sun Cluster software from several user interfaces. You can accomplish system administration tasks either through the Sun Cluster Manager graphical user interface (GUI) or through the command-line interface. On top of the command-line interface are some utilities, such as scinstall and clsetup, to simplify selected installation and configuration tasks. The Sun Cluster software also has a module that runs as part of Sun Management Center that provides a GUI to particular cluster tasks. This module is available for use in only SPARC based clusters. Refer to “Administration Tools” in Sun Cluster System Administration Guide for Solaris OS for complete descriptions of the administrative interfaces.
Cluster Time

Time between all Solaris hosts in a cluster must be synchronized. Whether you synchronize the cluster hosts with any outside time source is not important to cluster operation. The Sun Cluster software employs the Network Time Protocol (NTP) to synchronize the clocks between hosts.

In general, a change in the system clock of a fraction of a second causes no problems. However, if you run date, rdate, or xntpdate (interactively, or within cron scripts) on an active cluster, you can force a time change much larger than a fraction of a second to synchronize the system clock to the time source. This forced change might cause problems with file modification timestamps or confuse the NTP service.

When you install the Solaris Operating System on each cluster host, you have an opportunity to change the default time and date setting for the host. In general, you can accept the factory default.

When you install Sun Cluster software by using the scinstall command, one step in the process is to configure NTP for the cluster. Sun Cluster software supplies a template file, ntp.cluster (see /etc/inet/ntp.cluster on an installed cluster host), that establishes a peer relationship between all cluster hosts. One host is designated the “preferred” host. Hosts are identified by their private host names and time synchronization occurs across the cluster interconnect. For instructions about how to configure the cluster for NTP, see Chapter 2, “Installing Software on Global-Cluster Nodes,” in Sun Cluster Software Installation Guide for Solaris OS.
Alternately, you can set up one or more NTP servers outside the cluster and change the ntp.conf file to reflect that configuration.
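The peer relationship that ntp.cluster establishes can be sketched as the following fragment. The private host names follow the clusternodeN-priv convention; treat the exact lines as an illustrative sketch and compare them with /etc/inet/ntp.cluster on an installed host.

```
# Illustrative NTP peer configuration for a two-host cluster.
# Each host peers with every other host over the private interconnect,
# using the private host names.
peer clusternode1-priv prefer   # the designated "preferred" host
peer clusternode2-priv
```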
In normal operation, you should never need to adjust the time on the cluster. However, if the time was set incorrectly when you installed the Solaris Operating System and you want to change it, the procedure for doing so is included in Chapter 8, “Administering the Cluster,” in Sun Cluster System Administration Guide for Solaris OS.
High-Availability Framework
The Sun Cluster software makes all components on the “path” between users and data highly available, including network interfaces, the applications themselves, the file system, and the multihost devices. In general, a cluster component is highly available if it survives any single (software or hardware) failure in the system.

The following table shows the kinds of Sun Cluster component failures (both hardware and software) and the kinds of recovery that are built into the high-availability framework.

TABLE 3–1 Levels of Sun Cluster Failure Detection and Recovery
Failed Cluster Component    Software Recovery                        Hardware Recovery
Data service                HA API, HA framework                     Not applicable
Public network adapter      IP network multipathing                  Multiple public network adapter cards
Cluster file system         Primary and secondary replicas           Multihost devices
Mirrored multihost device   Volume management (Solaris Volume        Hardware RAID-5 (for example,
                            Manager and Veritas Volume Manager)      Sun StorEdge A3x00)
Global device               Primary and secondary replicas           Multiple paths to the device,
                                                                     cluster transport junctions
Private network             HA transport software                    Multiple private hardware-independent
                                                                     networks
Host                        CMM, failfast driver                     Multiple hosts
Zone                        HA API, HA framework                     Not applicable
Sun Cluster software's high-availability framework detects a node failure quickly and creates a new equivalent server for the framework resources on a remaining node in the cluster. At no time are all framework resources unavailable. Framework resources that are unaffected by a failed node are fully available during recovery. Furthermore, framework resources of the failed node become available as soon as they are recovered. A recovered framework resource does not have to wait for all other framework resources to complete their recovery.

Most highly available framework resources are recovered transparently to the applications (data services) that are using the resource. The semantics of framework resource access are fully preserved across node failure. The applications cannot detect that the framework resource server has been moved to another node. Failure of a single node is completely transparent to programs on remaining nodes by using the files, devices, and disk volumes that are available to this node. This transparency exists if an alternative hardware path exists to the disks from another host. An example is the use of multihost devices that have ports to multiple hosts.
Chapter 3 • Key Concepts for System Administrators and Application Developers 43
Zone Membership
Sun Cluster software also tracks zone membership by detecting when a zone boots up or halts. These changes also trigger a reconfiguration. A reconfiguration can redistribute cluster resources among the nodes in the cluster.
Cluster Membership Monitor

To ensure that data is kept safe from corruption, all nodes must reach a consistent agreement on the cluster membership. When necessary, the CMM coordinates a cluster reconfiguration of cluster services (applications) in response to a failure.

The CMM receives information about connectivity to other nodes from the cluster transport layer. The CMM uses the cluster interconnect to exchange state information during a reconfiguration.

After detecting a change in cluster membership, the CMM performs a synchronized configuration of the cluster. In a synchronized configuration, cluster resources might be redistributed, based on the new membership of the cluster.
Failfast Mechanism

The failfast mechanism detects a critical problem on either a global-cluster voting node or global-cluster non-voting node. The action that Sun Cluster takes when failfast detects a problem depends on whether the problem occurs in a voting node or a non-voting node.

If the critical problem is located in a voting node, Sun Cluster forcibly shuts down the node. Sun Cluster then removes the node from cluster membership.

If the critical problem is located in a non-voting node, Sun Cluster reboots that non-voting node.

If a node loses connectivity with other nodes, the node attempts to form a cluster with the nodes with which communication is possible. If that set of nodes does not form a quorum, Sun Cluster software halts the node and “fences” the node from the shared disks, that is, prevents the node from accessing the shared disks.

You can turn off fencing for selected disks or for all disks.
Caution – If you turn off fencing under the wrong circumstances, your data can be vulnerable to corruption during application failover. Examine this data corruption possibility carefully when you are considering turning off fencing. If your shared storage device does not support the SCSI protocol, such as a Serial Advanced Technology Attachment (SATA) disk, or if you want to allow access to the cluster's storage from hosts outside the cluster, turn off fencing.
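As a sketch of how fencing might be turned off, the following commands use a per-device default_fencing property and a cluster-wide global_fencing property; treat the exact property names, values, and device name as assumptions to verify against the cldevice(1CL) and cluster(1CL) man pages for your release.

```shell
# Hypothetical sketch: turn off fencing for one DID device ...
cldevice set -p default_fencing=nofencing d5

# ... or turn off fencing for all shared disks cluster-wide
cluster set -p global_fencing=nofencing
```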
If one or more cluster-specific daemons die, Sun Cluster software declares that a critical problem has occurred. Sun Cluster software runs cluster-specific daemons on both voting nodes and non-voting nodes. If a critical problem occurs, Sun Cluster either shuts down and removes the node or reboots the non-voting node where the problem occurred.

When a cluster-specific daemon that runs on a non-voting node fails, a message similar to the following is displayed on the console.
cl_runtime: NOTICE: Failfast: Aborting because "pmfd" died in zone "zone4" (zone id 3)
35 seconds ago.
When a cluster-specific daemon that runs on a voting node fails and the node panics, a message similar to the following is displayed on the console.
panic[cpu1]/thread=2a10007fcc0: Failfast: Aborting because "pmfd" died in zone "global" (zone id 0)
35 seconds ago.
409b8 cl_runtime:__0FZsc_syslog_msg_log_no_argsPviTCPCcTB+48 (70f900, 30, 70df54, 407acc, 0)
%l0-7: 1006c80 000000a 000000a 10093bc 406d3c80 7110340 0000000 4001 fbf0
After the panic, the Solaris host might reboot and the node might attempt to rejoin the cluster. Alternatively, if the cluster is composed of SPARC based systems, the host might remain at the OpenBoot PROM (OBP) prompt. The next action of the host is determined by the setting of the auto-boot? parameter. You can set auto-boot? with the eeprom command, at the OpenBoot PROM ok prompt. See the eeprom(1M) man page.
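For example, the auto-boot? parameter can be read and set from a running Solaris host with the eeprom command; the quotes keep the shell from interpreting the ? character.

```shell
# Display the current setting
eeprom "auto-boot?"

# Have the host boot automatically after a panic instead of
# stopping at the OpenBoot PROM ok prompt
eeprom "auto-boot?=true"
```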
Cluster Configuration Repository (CCR)

The CCR uses a two-phase commit algorithm for updates: An update must be successfully completed on all cluster members or the update is rolled back. The CCR uses the cluster interconnect to apply the distributed updates.

Caution – Although the CCR consists of text files, never edit the CCR files yourself. Each file contains a checksum record to ensure consistency between nodes. Updating CCR files yourself can cause a node or the entire cluster to stop working.
The CCR relies on the CMM to guarantee that a cluster is running only when quorum is established. The CCR is responsible for verifying data consistency across the cluster, performing recovery as necessary, and facilitating updates to the data.
Global Devices

The Sun Cluster software uses global devices to provide cluster-wide, highly available access to any device in a cluster, from any node, without regard to where the device is physically attached. In general, if a node fails while providing access to a global device, the Sun Cluster software automatically discovers another path to the device. The Sun Cluster software then redirects the access to that path. Sun Cluster global devices include disks, CD-ROMs, and tapes. However, the only multiported global devices that Sun Cluster software supports are disks. Consequently, CD-ROM and tape devices are not currently highly available devices. The local disks on each server are also not multiported, and thus are not highly available devices.

The cluster automatically assigns unique IDs to each disk, CD-ROM, and tape device in the cluster. This assignment enables consistent access to each device from any node in the cluster. The global device namespace is held in the /dev/global directory. See “Global Namespace” on page 49 for more information.

Multiported global devices provide more than one path to a device. Because multihost disks are part of a device group that is hosted by more than one Solaris host, the multihost disks are made highly available.
Device IDs and DID Pseudo Driver

The Sun Cluster software manages global devices through a construct known as the DID pseudo driver. This driver is used to automatically assign unique IDs to every device in the cluster, including multihost disks, tape drives, and CD-ROMs.

The DID pseudo driver is an integral part of the global device access feature of the cluster. The DID driver probes all nodes of the cluster, builds a list of unique devices, and assigns each device a unique major and a minor number that are consistent on all nodes of the cluster. Access to the global devices is performed by using the unique device ID instead of the traditional Solaris device IDs, such as c0t0d0 for a disk.

This approach ensures that any application that accesses disks (such as a volume manager or applications that use raw devices) uses a consistent path across the cluster. This consistency is especially important for multihost disks, because the local major and minor numbers for each device can vary from Solaris host to Solaris host, thus changing the Solaris device naming conventions as well. For example, Host1 might identify a multihost disk as c1t2d0, and Host2 might identify the same disk completely differently, as c3t2d0. The DID driver assigns a global name, such as d10, that the hosts use instead, giving each host a consistent mapping to the multihost disk.
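The mapping from per-host Solaris device names to DID names can be inspected with the cldevice command. The following session is an illustrative sketch (the host names, controller numbers, and DID instance are hypothetical, not captured output):

```shell
# List DID devices and the per-host paths each one maps to
cldevice list -v
# DID Device  Full Device Path
# ----------  ----------------
# d10         Host1:/dev/rdsk/c1t2d0
# d10         Host2:/dev/rdsk/c3t2d0
```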
You update and administer device IDs with the cldevice command. See the cldevice(1CL) man page.
Device Groups

In the Sun Cluster software, all multihost devices must be under control of the Sun Cluster software. You first create volume manager disk groups, either Solaris Volume Manager disk sets or Veritas Volume Manager disk groups, on the multihost disks. Then, you register the volume manager disk groups as device groups. A device group is a type of global device. In addition, the Sun Cluster software automatically creates a raw device group for each disk and tape device in the cluster. However, these cluster device groups remain in an offline state until you access them as global devices.

Registration provides the Sun Cluster software information about which Solaris hosts have a path to specific volume manager disk groups. At this point, the volume manager disk groups become globally accessible within the cluster. If more than one host can write to (master) a device group, the data stored in that device group becomes highly available. The highly available device group can be used to contain cluster file systems.

Note – Device groups are independent of resource groups. One node can master a resource group (representing a group of data service processes). Another node can master the disk groups that are being accessed by the data services. However, the best practice is to keep on the same node the device group that stores a particular application's data and the resource group that contains the application's resources (the application daemon). Refer to “Relationship Between Resource Groups and Device Groups” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information about the association between device groups and resource groups.
When a node uses a device group, the volume manager disk group becomes “global” because it provides multipath support to the underlying disks. Each cluster host that is physically attached to the multihost disks provides a path to the device group.

Device Group Failover

Because a disk enclosure is connected to more than one Solaris host, all device groups in that enclosure are accessible through an alternate path if the host currently mastering the device group fails. The failure of the host that is mastering the device group does not affect access to the device group except for the time it takes to perform the recovery and consistency checks. During this time, all requests are blocked (transparently to the application) until the system makes the device group available.
Multiported Device Groups

This section describes device group properties that enable you to balance performance and availability in a multiported disk configuration. Sun Cluster software provides two properties that configure a multiported disk configuration: preferenced and numsecondaries. You can control the order in which nodes attempt to assume control if a failover occurs by using the preferenced property. Use the numsecondaries property to set the number of secondary nodes for a device group that you want.

A highly available service is considered down when the primary node fails and when no eligible secondary nodes can be promoted to primary nodes. If service failover occurs and the preferenced property is true, then the nodes follow the order in the node list to select a secondary node. The node list defines the order in which nodes attempt to assume primary control or transition from spare to secondary. You can dynamically change the preference of a device service by using the clsetup command. The preference that is associated with dependent service providers, for example a global file system, is identical to the preference of the device service.

Secondary nodes are check-pointed by the primary node during normal operation. In a multiported disk configuration, checkpointing each secondary node causes cluster
FIGURE 3–1 Device Group Before and After Failover [diagram: before failover, Host 1 is the primary and handles client and data access to the multihost disk device groups; after failover, Host 2 is promoted to primary and handles the access]
performance degradation and memory overhead. Spare node support was implemented to minimize the performance degradation and memory overhead that checkpointing caused. By default, your device group has one primary and one secondary. The remaining available provider nodes become spares. If failover occurs, the secondary becomes primary and the node highest in priority on the node list becomes secondary.
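The node-list ordering described above can be sketched as a small shell function. This is only an illustration of the selection rule (host names are hypothetical); in a real cluster the replica framework, not a user script, performs the promotion.

```shell
# Sketch: after a failover, the old secondary becomes primary, and the
# new secondary is the highest-priority operational node in the ordered
# node list that is not the new primary.
next_secondary() {
    # $1 = new primary, $2 = failed node, remaining args = node list
    # in preference order
    new_primary=$1
    down=$2
    shift 2
    for node in "$@"; do
        if [ "$node" != "$new_primary" ] && [ "$node" != "$down" ]; then
            echo "$node"
            return 0
        fi
    done
    return 1
}

# Node list: host1 host2 host3 host4. host1 (the primary) fails, so the
# secondary host2 becomes primary, and the highest-priority remaining
# node becomes the new secondary.
next_secondary host2 host1 host1 host2 host3 host4   # prints host3
```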
You can set the number of secondary nodes that you want to any integer between one and the number of operational nonprimary provider nodes in the device group.

Note – If you are using Solaris Volume Manager, you must create the device group before you can set the numsecondaries property to a number other than the default.

The default number of secondaries for device services is 1. The actual number of secondary providers that is maintained by the replica framework is the number that you want, unless the number of operational nonprimary providers is less than the number that you want. You must alter the numsecondaries property and double-check the node list if you are adding or removing nodes from your configuration. Maintaining the node list and number of secondaries prevents conflict between the configured number of secondaries and the actual number that is
allowed by the ramework.■ (Solaris Volume Manager) Use the metaset command or Solaris Volume Manager device
groups, in conjunction with the preferenced and numsecondaries property settings, tomanage the addition o nodes to and the removal o nodes rom your conguration.
■ (Veritas Volume Manager) Use the cldevicegroup command or VxVM device groups, inconjunction with the preferenced and numsecondaries property settings, to manage theaddition o nodes to and the removal o nodes rom your conguration.
■ Reer to “Overview o Administering Cluster File Systems” in SunCluster System Administration Guide or Solaris OS or procedural inormation about changing devicegroup properties.
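The rule above — the framework maintains the desired number of secondaries unless fewer operational nonprimary providers exist — reduces to a single comparison. A minimal sketch (the function name is illustrative, not a Sun Cluster API):

```python
def actual_secondaries(numsecondaries, operational_nonprimary):
    """Number of secondaries the replica framework actually maintains:
    the desired numsecondaries, capped by the number of operational
    nonprimary provider nodes (illustrative arithmetic only)."""
    return min(numsecondaries, operational_nonprimary)
```

For example, with the default numsecondaries of 1 and three operational nonprimary providers, one secondary is maintained; with numsecondaries set to 3 but only two operational nonprimary providers, only two secondaries can be maintained.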
Global Namespace

The Sun Cluster software mechanism that enables global devices is the global namespace. The global namespace includes the /dev/global/ hierarchy as well as the volume manager namespaces. The global namespace reflects both multihost disks and local disks (and any other cluster device, such as CD-ROMs and tapes), and provides multiple failover paths to the multihost disks. Each Solaris host that is physically connected to multihost disks provides a path to the storage for any node in the cluster.
Normally, for Solaris Volume Manager, the volume manager namespaces are located in the /dev/md/diskset/dsk (and rdsk) directories. For Veritas VxVM, the volume manager namespaces are located in the /dev/vx/dsk/disk-group and /dev/vx/rdsk/disk-group directories. These namespaces consist of directories for each Solaris Volume Manager disk set and each VxVM disk group imported throughout the cluster, respectively. Each of these directories contains a device node for each metadevice or volume in that disk set or disk group.

Chapter 3 • Key Concepts for System Administrators and Application Developers    49
In the Sun Cluster software, each device node in the local volume manager namespace is replaced by a symbolic link to a device node in the /global/.devices/node@nodeID file system. nodeID is an integer that represents the nodes in the cluster. Sun Cluster software continues to present the volume manager devices, as symbolic links, in their standard locations as well. Both the global namespace and standard volume manager namespace are available from any cluster node.
The advantages of the global namespace include the following:
■ Each host remains fairly independent, with little change in the device administration model.
■ Devices can be selectively made global.
■ Third-party link generators continue to work.
■ Given a local device name, an easy mapping is provided to obtain its global name.
Local and Global Namespaces Example

The following table shows the mappings between the local and global namespaces for a multihost disk, c0t0d0s0.
TABLE 3–2 Local and Global Namespace Mappings
Component or Path Local Host Namespace Global Namespace
Solaris logical name /dev/dsk/c0t0d0s0 /global/.devices/node@nodeID/dev/dsk/c0t0d0s0
DID name /dev/did/dsk/d0s0 /global/.devices/node@nodeID/dev/did/dsk/d0s0
Solaris Volume Manager /dev/md/diskset/dsk/d0 /global/.devices/node@nodeID/dev/md/diskset/dsk/d0
Veritas Volume Manager /dev/vx/dsk/disk-group/v0 /global/.devices/node@nodeID/dev/vx/dsk/disk-group/v0
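Each row of the table follows one rewrite rule: the global name is the local name prefixed with the per-node /global/.devices/node@nodeID directory. A hypothetical helper (not part of Sun Cluster) makes the pattern explicit:

```python
def to_global_name(local_path, node_id):
    """Map a local device path to its global-namespace equivalent by
    prefixing the per-node /global/.devices directory, as in Table 3-2
    (illustrative helper only)."""
    return "/global/.devices/node@%d%s" % (node_id, local_path)
```

For example, to_global_name("/dev/dsk/c0t0d0s0", 1) yields the first row's global name for node ID 1.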
The global namespace is automatically generated on installation and updated with every
reconfiguration reboot. You can also generate the global namespace by using the cldevice command. See the cldevice(1CL) man page.
Cluster File Systems
The cluster file system has the following features:
■ File access locations are transparent. A process can open a file that is located anywhere in the system. Processes on all Solaris hosts can use the same path name to locate a file.
Note – When the cluster file system reads files, it does not update the access time on those files.
■ Coherency protocols are used to preserve the UNIX file access semantics even if the file is accessed concurrently from multiple nodes.
■ Extensive caching is used along with zero-copy bulk I/O movement to move file data efficiently.
■ The cluster file system provides highly available, advisory file-locking functionality by using the fcntl command interfaces. Applications that run on multiple cluster nodes can synchronize access to data by using advisory file locking on a cluster file system. File locks are recovered immediately from nodes that leave the cluster, and from applications that fail while holding locks.
■ Continuous access to data is ensured, even when failures occur. Applications are not affected by failures if a path to disks is still operational. This guarantee is maintained for raw disk access and all file system operations.
■ Cluster file systems are independent from the underlying file system and volume management software. Cluster file systems make any supported on-disk file system global.
You can mount a file system on a global device globally with mount -g or locally with mount.
Programs can access a file in a cluster file system from any node in the cluster through the same file name (for example, /global/foo).
A cluster file system is mounted on all cluster members. You cannot mount a cluster file system on a subset of cluster members.
A cluster file system is not a distinct file system type. Clients verify the underlying file system (for example, UFS).
Using Cluster File Systems
In the Sun Cluster software, all multihost disks are placed into device groups, which can be Solaris Volume Manager disk sets, VxVM disk groups, or individual disks that are not under control of a software-based volume manager.
For a cluster file system to be highly available, the underlying disk storage must be connected to more than one Solaris host. Therefore, a local file system (a file system that is stored on a host's local disk) that is made into a cluster file system is not highly available.
You can mount cluster file systems as you would mount file systems:
■ Manually. Use the mount command and the -g or -o global mount options to mount the cluster file system from the command line, for example:
SPARC: # mount -g /dev/global/dsk/d0s0 /global/oracle/data
■ Automatically. Create an entry in the /etc/vfstab file with a global mount option to mount the cluster file system at boot. You then create a mount point under the /global directory on all hosts. The directory /global is a recommended location, not a requirement. Here's a sample line for a cluster file system from an /etc/vfstab file:
SPARC: /dev/md/oracle/dsk/d1 /dev/md/oracle/rdsk/d1 /global/oracle/data ufs 2 yes global,logging
Note – While Sun Cluster software does not impose a naming policy for cluster file systems, you can ease administration by creating a mount point for all cluster file systems under the same directory, such as /global/disk-group. See Sun Cluster 3.1 9/04 Software Collection for Solaris OS (SPARC Platform Edition) and Sun Cluster System Administration Guide for Solaris OS for more information.
HAStoragePlus Resource Type
The HAStoragePlus resource type is designed to make local and global file system configurations highly available. You can use the HAStoragePlus resource type to integrate your local or global file system into the Sun Cluster environment and make the file system highly available.
You can use the HAStoragePlus resource type to make a file system available to a global-cluster non-voting node. To enable the HAStoragePlus resource type to do this, you must create a mount point on the global-cluster voting node and in the global-cluster non-voting node. The HAStoragePlus resource type makes the file system available to the global-cluster non-voting node by mounting the file system in the global-cluster voting node. The resource type then performs a loopback mount in the global-cluster non-voting node.
Note – Sun Cluster systems support the following cluster file systems:

■ Solaris ZFS
■ UNIX file system (UFS)
■ Sun StorEdge QFS file system and Sun QFS Shared file system
■ Sun Cluster Proxy file system (PxFS)
■ Veritas file system (VxFS)
The HAStoragePlus resource type provides additional file system capabilities such as checks, mounts, and forced unmounts. These capabilities enable Sun Cluster to fail over local file systems. In order to fail over, the local file system must reside on global disk groups with affinity switchovers enabled.
See “Enabling Highly Available Local File Systems” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for information about how to use the HAStoragePlus resource type.
You can also use the HAStoragePlus resource type to synchronize the startup of resources and device groups on which the resources depend. For more information, see “Resources, Resource Groups, and Resource Types” on page 74.
syncdir Mount Option

You can use the syncdir mount option for cluster file systems that use UFS as the underlying file system. However, performance significantly improves if you do not specify syncdir. If you specify syncdir, the writes are guaranteed to be POSIX compliant. If you do not specify syncdir, you experience the same behavior as in NFS file systems. For example, without syncdir, you might not discover an out-of-space condition until you close a file. With syncdir (and POSIX behavior), the out-of-space condition would have been discovered during the write operation. The cases in which you might have problems if you do not specify syncdir are rare.
If you are using a SPARC based cluster, VxFS does not have a mount option that is equivalent to the syncdir mount option for UFS. VxFS behavior is the same as for UFS when the syncdir mount option is not specified.
See “File Systems FAQs” on page 100 for frequently asked questions about global devices and cluster file systems.
Disk Path Monitoring

The current release of Sun Cluster software supports disk path monitoring (DPM). This section provides conceptual information about DPM, the DPM daemon, and administration tools that you use to monitor disk paths. Refer to Sun Cluster System Administration Guide for Solaris OS for procedural information about how to monitor, unmonitor, and check the status of disk paths.
DPM Overview
DPM improves the overall reliability of failover and switchover by monitoring secondary disk path availability. Use the cldevice command to verify the availability of the disk path that is used by a resource before the resource is switched. Options that are provided with the cldevice command enable you to monitor disk paths to a single Solaris host or to all Solaris hosts in the cluster. See the cldevice(1CL) man page for more information about command-line options.
The following table describes the default location for installation of DPM components.
Component    Location
Daemon /usr/cluster/lib/sc/scdpmd
Command-line interace /usr/cluster/bin/cldevice
Daemon status le (created at runtime) /var/run/cluster/scdpm.status
A multithreaded DPM daemon runs on each host. The DPM daemon (scdpmd) is started by an rc.d script when a host boots. If a problem occurs, the daemon is managed by pmfd and restarts automatically. The following list describes how the scdpmd works on initial startup.
Note – At startup, the status for each disk path is initialized to UNKNOWN.
1. The DPM daemon gathers disk path and node name information from the previous status file or from the CCR database. See “Cluster Configuration Repository (CCR)” on page 45 for more information about the CCR. After a DPM daemon is started, you can force the daemon to read the list of monitored disks from a specified file name.
2. The DPM daemon initializes the communication interface to respond to requests from components that are external to the daemon, such as the command-line interface.
3. The DPM daemon pings each disk path in the monitored list every 10 minutes by using scsi_inquiry commands. Each entry is locked to prevent the communication interface access to the content of an entry that is being modified.
4. The DPM daemon notifies the Sun Cluster Event Framework and logs the new status of the path through the UNIX syslogd command. See the syslogd(1M) man page.
Note – All errors that are related to the daemon are reported by pmfd. All the functions from the API return 0 on success and -1 for any failure.
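The startup and probing behavior in the list above can be modeled in a few lines. The class below is a toy illustration of the bookkeeping only (statuses start at UNKNOWN and are updated after each probe); it is not the scdpmd implementation:

```python
class DiskPathMonitor:
    """Toy model of DPM status bookkeeping (illustrative only)."""

    def __init__(self, monitored_paths):
        # At startup, the status for each disk path is initialized
        # to UNKNOWN, as the note above describes.
        self.status = {path: "UNKNOWN" for path in monitored_paths}

    def record_probe(self, path, reachable):
        """Record one probe result; return True when the status changed
        (a change is what the daemon would report through syslogd)."""
        new_status = "OK" if reachable else "FAIL"
        changed = self.status[path] != new_status
        self.status[path] = new_status
        return changed
```

A probe that flips a path from UNKNOWN to OK is a status change worth logging; a second identical probe is not.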
The DPM daemon monitors the availability of the logical path that is visible through multipath drivers such as Solaris I/O multipathing (MPxIO), which was formerly named Sun StorEdge Traffic Manager, Sun StorEdge 9900 Dynamic Link Manager, and EMC PowerPath. The individual physical paths that are managed by these drivers are not monitored because the multipath driver masks individual failures from the DPM daemon.
Monitoring Disk Paths
This section describes two methods for monitoring disk paths in your cluster. The first method is provided by the cldevice command. Use this command to monitor, unmonitor, or display the status of disk paths in your cluster. You can also use this command to print a list of faulted disks and to monitor disk paths from a file. See the cldevice(1CL) man page.
The second method for monitoring disk paths in your cluster is provided by the Sun Cluster Manager graphical user interface (GUI). Sun Cluster Manager provides a topological view of the monitored disk paths in your cluster. The view is updated every 10 minutes to provide information about the number of failed pings. Use the information that is provided by the Sun Cluster Manager GUI in conjunction with the cldevice command to administer disk paths. See Chapter 12, “Administering Sun Cluster With the Graphical User Interfaces,” in Sun Cluster System Administration Guide for Solaris OS for information about Sun Cluster Manager.
Using the cldevice Command to Monitor and Administer Disk Paths
The cldevice command enables you to perform the following tasks:
■ Monitor a new disk path
■ Unmonitor a disk path
■ Reread the configuration data from the CCR database
■ Read the disks to monitor or unmonitor from a specified file
■ Report the status of a disk path or all disk paths in the cluster
■ Print all the disk paths that are accessible from a node
Issue the cldevice command with the disk path argument from any active node to perform DPM administration tasks on the cluster. The disk path argument consists of a node name and a disk name. The node name is not required. If you do not specify a node name, all nodes are affected by default. The following table describes naming conventions for the disk path.
Note – Always specify a global disk path name rather than a UNIX disk path name because a global disk path name is consistent throughout a cluster. A UNIX disk path name is not. For example, the disk path name can be c1t0d0 on one node and c2t0d0 on another node. To determine a global disk path name for a device that is connected to a node, use the cldevice list command before issuing DPM commands. See the cldevice(1CL) man page.
TABLE 3–3 Sample Disk Path Names
Name Type Sample Disk Path Name Description
Global disk path    schost-1:/dev/did/dsk/d1       Disk path d1 on the schost-1 node
                    all:d1                         Disk path d1 on all nodes in the cluster
UNIX disk path      schost-1:/dev/rdsk/c0t0d0s0    Disk path c0t0d0s0 on the schost-1 node
                    schost-1:all                   All disk paths on the schost-1 node
All disk paths      all:all                        All disk paths on all nodes of the cluster
Using Sun Cluster Manager to Monitor Disk Paths
Sun Cluster Manager enables you to perform the following basic DPM administration tasks:
■ Monitor a disk path
■ Unmonitor a disk path
■ View the status o all monitored disk paths in the cluster
■ Enable or disable the automatic rebooting of a Solaris host when all monitored disk paths fail
The Sun Cluster Manager online help provides procedural information about how to administer disk paths.
Using the clnode set Command to Manage Disk Path Failure
You use the clnode set command to enable and disable the automatic rebooting of a node when all monitored disk paths fail. You can also use Sun Cluster Manager to perform these tasks.
Quorum and Quorum Devices

This section contains the following topics:

■ “About Quorum Vote Counts” on page 58
■ “About Quorum Configurations” on page 58
■ “Adhering to Quorum Device Requirements” on page 59
■ “Adhering to Quorum Device Best Practices” on page 60
■ “Recommended Quorum Configurations” on page 61
■ “Atypical Quorum Configurations” on page 62
■ “Bad Quorum Configurations” on page 63
Note – For a list of the specific devices that Sun Cluster software supports as quorum devices, contact your Sun service provider.
Because cluster nodes share data and resources, a cluster must never split into separate partitions that are active at the same time because multiple active partitions might cause data corruption. The Cluster Membership Monitor (CMM) and quorum algorithm guarantee that, at most, one instance of the same cluster is operational at any time, even if the cluster interconnect is partitioned.
For an introduction to quorum and CMM, see “Cluster Membership” in Sun Cluster Overview for Solaris OS.
Two types of problems arise from cluster partitions:
■ Split brain
■ Amnesia
Split brain occurs when the cluster interconnect between nodes is lost and the cluster becomes partitioned into subclusters. Each partition “believes” that it is the only partition because the nodes in one partition cannot communicate with the node or nodes in the other partition.
Amnesia occurs when the cluster restarts after a shutdown with cluster configuration data that is older than the data was at the time of the shutdown. This problem can occur when you start the cluster on a node that was not in the last functioning cluster partition.
Sun Cluster software avoids split brain and amnesia by:

■ Assigning each node one vote
■ Mandating a majority of votes for an operational cluster
A partition with the majority of votes gains quorum and is allowed to operate. This majority vote mechanism prevents split brain and amnesia when more than two nodes are configured in a cluster. However, counting node votes alone is not sufficient when more than two nodes are configured in a cluster. In a two-host cluster, a majority is two. If such a two-host cluster becomes partitioned, an external vote is needed for either partition to gain quorum. This external vote is provided by a quorum device.
About Quorum Vote Counts
Use the clquorum show command to determine the following information:
■ Total configured votes
■ Current present votes
■ Votes required for quorum
See the cluster(1CL) man page.
Both nodes and quorum devices contribute votes to the cluster to form quorum.
A node contributes votes depending on the node's state:
■ A node has a vote count of one when it boots and becomes a cluster member.
■ A node has a vote count of zero when the node is being installed.
■ A node has a vote count of zero when a system administrator places the node into maintenance state.
Quorum devices contribute votes that are based on the number of votes that are connected to the device. When you configure a quorum device, Sun Cluster software assigns the quorum device a vote count of N - 1, where N is the number of connected votes to the quorum device. For example, a quorum device that is connected to two nodes with nonzero vote counts has a quorum count of one (two minus one).
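The vote arithmetic just described can be checked with two small helpers. These are illustrative sketches for working through the numbers in this section's figures, not Sun Cluster code:

```python
def quorum_device_votes(connected_node_votes):
    """A quorum device is assigned N - 1 votes, where N is the number
    of connected nodes with nonzero vote counts (sketch only)."""
    return connected_node_votes - 1

def votes_required(total_configured_votes):
    """Quorum requires a majority of the total configured votes."""
    return total_configured_votes // 2 + 1
```

For instance, a two-host cluster with one shared quorum device has 2 + 1 = 3 total configured votes and needs 2 votes for quorum, matching the two-host configuration in Figure 3–2.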
A quorum device contributes votes if one of the following two conditions is true:
■ At least one of the nodes to which the quorum device is currently attached is a cluster member.
■ At least one of the hosts to which the quorum device is currently attached is booting, and that host was a member of the last cluster partition to own the quorum device.
You configure quorum devices during the cluster installation, or afterwards, by using the procedures that are described in Chapter 6, “Administering Quorum,” in Sun Cluster System Administration Guide for Solaris OS.
About Quorum Configurations

The following list contains facts about quorum configurations:
■ Quorum devices can contain user data.
■ In an N+1 configuration where N quorum devices are each connected to one of the 1 through N Solaris hosts and the N+1 Solaris host, the cluster survives the death of either all 1 through N Solaris hosts or any of the N/2 Solaris hosts. This availability assumes that the quorum device is functioning correctly.
■ In an N-host configuration where a single quorum device connects to all hosts, the cluster can survive the death of any of the N - 1 hosts. This availability assumes that the quorum device is functioning correctly.
■ In an N-host configuration where a single quorum device connects to all hosts, the cluster can survive the failure of the quorum device if all cluster hosts are available.
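The survivability claims in this list are simple majority arithmetic, and can be checked with a sketch (illustrative only, not Sun Cluster code):

```python
def partition_survives(partition_votes, total_configured_votes):
    """A partition keeps quorum only when it holds a majority of the
    total configured votes (illustrative only)."""
    return partition_votes >= total_configured_votes // 2 + 1

# Four-host cluster with a single quorum device connected to all
# hosts: the device carries 4 - 1 = 3 votes, so total = 4 + 3 = 7 and
# the required majority is 4.  One surviving host plus the device
# holds 1 + 3 = 4 votes and keeps quorum; three hosts without the
# device hold only 3 votes and do not.
```

This matches both bullets: any single surviving host plus the device keeps quorum, while the device itself can fail only if all hosts remain available (4 of 7 votes).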
For examples of quorum configurations to avoid, see “Bad Quorum Configurations” on page 63. For examples of recommended quorum configurations, see “Recommended Quorum Configurations” on page 61.
Adhering to Quorum Device Requirements

Ensure that Sun Cluster software supports your specific device as a quorum device. If you ignore this requirement, you might compromise your cluster's availability.
Note – For a list of the specific devices that Sun Cluster software supports as quorum devices, contact your Sun service provider.
Sun Cluster software supports the following types of quorum devices:
■ Multihosted shared disks that support SCSI-3 PGR reservations.
■ Dual-hosted shared disks that support SCSI-2 reservations.
■ A Network-Attached Storage (NAS) device from Sun Microsystems, Incorporated or from Network Appliance, Incorporated.
■ A quorum server process that runs on the quorum server machine.
■ Any shared disk, provided that you have turned off fencing for this disk, and are therefore using software quorum. Software quorum is a protocol developed by Sun Microsystems that emulates a form of SCSI Persistent Group Reservations (PGR).
Caution – If you are using disks that do not support SCSI, such as Serial Advanced Technology Attachment (SATA) disks, turn off fencing.
Note – You cannot use a replicated device as a quorum device.
In a two-host configuration, you must configure at least one quorum device to ensure that a single host can continue if the other host fails. See Figure 3–2.
For examples of quorum configurations to avoid, see “Bad Quorum Configurations” on page 63. For examples of recommended quorum configurations, see “Recommended Quorum Configurations” on page 61.
Adhering to Quorum Device Best Practices
Use the following information to evaluate the best quorum configuration for your topology:
■ Do you have a device that is capable of being connected to all Solaris hosts of the cluster?
■ If yes, configure that device as your one quorum device. You do not need to configure another quorum device because your configuration is the most optimal configuration.
Caution – If you ignore this requirement and add another quorum device, the additional quorum device reduces your cluster's availability.
■ If no, configure your dual-ported device or devices.
■ Ensure that the total number of votes contributed by quorum devices is strictly less than the total number of votes contributed by nodes. Otherwise, your nodes cannot form a cluster if all disks are unavailable, even if all nodes are functioning.
Note – In particular environments, you might want to reduce overall cluster availability to meet your needs. In these situations, you can ignore this best practice. However, not adhering to this best practice decreases overall availability. For example, in the configuration that is outlined in “Atypical Quorum Configurations” on page 62 the cluster is less available: the quorum votes exceed the node votes. In a cluster, if access to the shared storage between Host A and Host B is lost, the entire cluster fails.
See “Atypical Quorum Configurations” on page 62 for the exception to this best practice.
■ Specify a quorum device between every pair of hosts that shares access to a storage device. This quorum configuration speeds the fencing process. See “Quorum in Greater Than Two-Host Configurations” on page 61.
■ In general, if the addition of a quorum device makes the total cluster vote even, the total cluster availability decreases.
■ Quorum devices slightly slow reconfigurations after a node joins or a node dies. Therefore, do not add more quorum devices than are necessary.
For examples of quorum configurations to avoid, see “Bad Quorum Configurations” on page 63. For examples of recommended quorum configurations, see “Recommended Quorum Configurations” on page 61.
Recommended Quorum Configurations
This section shows examples of quorum configurations that are recommended. For examples of quorum configurations you should avoid, see “Bad Quorum Configurations” on page 63.
Quorum in Two-Host Configurations

Two quorum votes are required for a two-host cluster to form. These two votes can derive from the two cluster hosts, or from just one host and a quorum device.
Quorum in Greater Than Two-Host Configurations

Quorum devices are not required when a cluster includes more than two hosts, as the cluster survives failures of a single host without a quorum device. However, under these conditions, you cannot start the cluster without a majority of hosts in the cluster.

You can add a quorum device to a cluster that includes more than two hosts. A partition can survive as a cluster when that partition has a majority of quorum votes, including the votes of the hosts and the quorum devices. Consequently, when adding a quorum device, consider the possible host and quorum device failures when choosing whether and where to configure quorum devices.
FIGURE 3–2 Two-Host Configuration
[Figure: Host A and Host B, one vote each, share one quorum device with one vote. Total configured votes: 3. Votes required for quorum: 2.]
[Figure: recommended four-host configuration. Host A through Host D, one vote each, with quorum devices. Total configured votes: 6. Votes required for quorum: 4.]
Atypical Quorum Configurations
Figure 3–3 assumes you are running mission-critical applications (Oracle database, for example) on Host A and Host B. If Host A and Host B are unavailable and cannot access shared data, you might want the entire cluster to be down. Otherwise, this configuration is suboptimal because it does not provide high availability.

For information about the best practice to which this exception relates, see “Adhering to Quorum Device Best Practices” on page 60.
[Figure: recommended three-host configurations, each with total configured votes: 5 and votes required for quorum: 3. In one configuration, applications are usually configured to run on Host A and Host B, with Host C as a hot spare. In another, the combination of any one or more hosts and the quorum device can form a cluster. In another, each pair of hosts shares a quorum device, and each pair must be available for either pair to survive.]
FIGURE 3–3 Atypical Configuration
[Figure: Host A through Host D, one vote each, with quorum devices between pairs of hosts and one quorum device, carrying three votes, connected to all four hosts. Total configured votes: 10. Votes required for quorum: 6.]

Bad Quorum Configurations

This section shows examples of quorum configurations you should avoid. For examples of recommended quorum configurations, see “Recommended Quorum Configurations” on page 61.
[Figure: bad quorum configuration. Host A and Host B, one vote each, with quorum devices. Total configured votes: 4. Votes required for quorum: 3.]
Data Services

The term data service describes an application, such as Sun Java System Web Server or Oracle, that has been configured to run on a cluster rather than on a single server. A data service consists of an application, specialized Sun Cluster configuration files, and Sun Cluster management methods that control the following actions of the application:

■ Start
■ Stop
■ Monitor and take corrective measures
[Figure: bad quorum configurations, each with total configured votes: 6 and votes required for quorum: 4. One configuration violates the best practice that you should not add quorum devices to make the total votes even; it does not add availability. The others violate the best practice that quorum device votes should be strictly less than the votes of hosts.]
For information about data service types, see “Data Services” in Sun Cluster Overview for Solaris OS.
Figure 3–4 compares an application that runs on a single application server (the single-server model) to the same application running on a cluster (the clustered-server model). The only difference between the two configurations is that the clustered application might run faster and is more highly available.
In the single-server model, you configure the application to access the server through a particular public network interface (a host name). The host name is associated with that physical server.
In the clustered-server model, the public network interface is a logical host name or a shared address. The term network resources is used to refer to both logical host names and shared addresses.
Some data services require you to specify either logical host names or shared addresses as the network interfaces. Logical host names and shared addresses are not always interchangeable. Other data services allow you to specify either logical host names or shared addresses. Refer to the installation and configuration for each data service for details about the type of interface you must specify.
A network resource is not associated with a specific physical server. A network resource can migrate between physical servers.
A network resource is initially associated with one node, the primary. If the primary fails, the network resource and the application resource fail over to a different cluster node (a secondary). When the network resource fails over, after a short delay, the application resource continues to run on the secondary.
[Figure: a standard client-server application (client and application server) compared with a clustered client-server application (client and clustered application servers)]
FIGURE 3–4 Standard Compared to Clustered Client-Server Configuration
Chapter 3 • Key Concepts for System Administrators and Application Developers
Figure 3–5 compares the single-server model with the clustered-server model. Note that in the clustered-server model, a network resource (logical host name, in this example) can move between two or more of the cluster nodes. The application is configured to use this logical host name in place of a host name that is associated with a particular server.
A shared address is also initially associated with one node. This node is called the global interface node. A shared address (known as the global interface) is used as the single network interface to the cluster.

The difference between the logical host name model and the scalable service model is that in the latter, each node also has the shared address actively configured on its loopback interface. This configuration enables multiple instances of a data service to be active on several nodes simultaneously. The term "scalable service" means that you can add more CPU power to the application by adding additional cluster nodes and the performance scales.

If the global interface node fails, the shared address can be started on another node that is also running an instance of the application (thereby making this other node the new global interface node). Or, the shared address can fail over to another cluster node that was not previously running the application.
Figure 3–6 compares the single-server configuration with the clustered scalable service configuration. Note that in the scalable service configuration, the shared address is present on all nodes. The application is configured to use this shared address in place of a host name that is associated with a particular server. This scheme is similar to how a logical host name is used for a failover data service.
[Figure: a standard client-server application in which the client reaches the application server at hostname=iws-1, compared with a failover clustered client-server application in which the client reaches the clustered application servers through logical hostname=iws-1]
FIGURE 3–5 Fixed Host Name Compared to Logical Host Name
Data Service Methods

The Sun Cluster software supplies a set of service management methods. These methods run under the control of the Resource Group Manager (RGM), which uses them to start, stop, and monitor the application on the cluster nodes. These methods, along with the cluster framework software and multihost devices, enable applications to become failover or scalable data services. The RGM also manages resources in the cluster, including instances of an application and network resources (logical host names and shared addresses).

In addition to Sun Cluster software-supplied methods, the Sun Cluster software also supplies an API and several data service development tools. These tools enable application developers to develop the data service methods that are required to make other applications run as highly available data services with the Sun Cluster software.
Failover Data Services

If the node on which the data service is running (the primary node) fails, the service is migrated to another working node without user intervention. Failover services use a failover resource group, which is a container for application instance resources and network resources (logical host names). Logical host names are IP addresses that can be configured on one node, and at a later time, automatically configured down on the original node and configured on another node.
[Figure: a standard client-server application in which the client reaches the server at hostname=iws-1, compared with a scalable clustered client-server application in which the shared address iws-1 is hosted on the global interface (GIF) node and configured on every clustered application server]
FIGURE 3–6 Fixed Host Name Compared to Shared Address
For failover data services, application instances run only on a single node. If the fault monitor detects an error, it either attempts to restart the instance on the same node, or to start the instance on another node (failover). The outcome depends on how you have configured the data service.
Scalable Data Services
The scalable data service has the potential for active instances on multiple nodes.
Scalable services use the following two resource groups:

■ A scalable resource group contains the application resources.

■ A failover resource group, which contains the network resources (shared addresses) on which the scalable service depends. A shared address is a network address. This network address can be bound by all scalable services that are running on nodes within the cluster. This shared address enables these scalable services to scale on those nodes. A cluster can have multiple shared addresses, and a service can be bound to multiple shared addresses.
A scalable resource group can be online on multiple nodes simultaneously. As a result, multiple instances of the service can be running at once. All scalable resource groups use load balancing.

All nodes that host a scalable service use the same shared address to host the service. The failover resource group that hosts the shared address is online on only one node at a time.

Service requests enter the cluster through a single network interface (the global interface). These requests are distributed to the nodes, based on one of several predefined algorithms that are set by the load-balancing policy. The cluster can use the load-balancing policy to balance the service load between several nodes. Multiple global interfaces can exist on different nodes that host other shared addresses.

For scalable services, application instances run on several nodes simultaneously. If the node that hosts the global interface fails, the global interface fails over to another node. If an application instance that is running fails, the instance attempts to restart on the same node.

If an application instance cannot be restarted on the same node, and another unused node is configured to run the service, the service fails over to the unused node. Otherwise, the service continues to run on the remaining nodes, possibly causing a degradation of service throughput.
Note – TCP state for each application instance is kept on the node with the instance, not on the global interface node. Therefore, failure of the global interface node does not affect the connection.
Figure 3–7 shows an example of failover and a scalable resource group and the dependencies that exist between them for scalable services. This example shows three resource groups. The failover resource group contains application resources for highly available DNS, and network resources used by both highly available DNS and highly available Apache Web Server (used in SPARC-based clusters only). The scalable resource groups contain only application instances of the Apache Web Server. Note that resource group dependencies exist between the scalable and failover resource groups (solid lines). Additionally, all the Apache application resources depend on the network resource schost-2, which is a shared address (dashed lines).
Load-Balancing Policies

Load balancing improves performance of the scalable service, both in response time and in throughput. There are two classes of scalable data services:

■ Pure
■ Sticky

A pure service is capable of having any of its instances respond to client requests. A sticky service is capable of having a client send requests to the same instance. Those requests are not redirected to other instances.
A pure service uses a weighted load-balancing policy. Under this load-balancing policy, client requests are by default uniformly distributed over the server instances in the cluster. The load is distributed among various nodes according to specified weight values. For example, in a three-node cluster, suppose that each node has the weight of 1. Each node services one third of the requests from any client on behalf of that service. The cluster administrator can change weights at any time with an administrative command or with Sun Cluster Manager.
[Figure: a failover resource group containing the failover application resource in.named (DNS resource) and the network resources schost-1 (logical hostname) and schost-2 (shared IP address); two scalable resource groups, each containing the scalable application resource apache (Apache server resource); resource group dependencies (RG_dependencies property) connect the scalable resource groups to the failover resource group, and resource dependencies (Network_resources_used property) connect the Apache resources to schost-2]
FIGURE 3–7 SPARC: Failover and Scalable Resource Group Example
The weighted load-balancing policy is set by using the LB_WEIGHTED value for the Load_balancing_weights property. If a weight for a node is not explicitly set, the weight for that node is set to 1 by default.
The weighted policy redirects a certain percentage of the traffic from clients to a particular node. Given X=weight and A=the total weights of all active nodes, an active node can expect approximately X/A of the total new connections to be directed to the active node. However, the total number of connections must be large enough. This policy does not address individual requests.

Note that the weighted policy is not round robin. A round-robin policy would always cause each request from a client to go to a different node. For example, the first request would go to node 1, the second request would go to node 2, and so on.
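The X/A arithmetic above can be sketched in a few lines of Python. This is purely illustrative; the helper function and node names are invented for the example and are not part of Sun Cluster:

```python
def expected_share(weights, node):
    """Fraction of new connections a node can expect under the weighted
    policy: X/A, where X is the node's weight and A is the total weight
    of all active nodes."""
    total = sum(weights.values())   # A: total weights of all active nodes
    return weights[node] / total    # X/A

# Three-node cluster where every node keeps the default weight of 1:
weights = {"node1": 1, "node2": 1, "node3": 1}
print(expected_share(weights, "node1"))  # each node serves about one third
```

With weights of, say, 3 and 1 on a two-node cluster, the heavier node would receive roughly three quarters of new connections, again only when the total number of connections is large enough.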
A sticky service has two flavors, ordinary sticky and wildcard sticky.

Sticky services enable concurrent application-level sessions over multiple TCP connections to share in-state memory (application session state).
Ordinary sticky services enable a client to share the state between multiple concurrent TCP connections. The client is said to be "sticky" toward that server instance that is listening on a single port.

The client is guaranteed that all requests go to the same server instance, provided that the following conditions are met:

■ The instance remains up and accessible.
■ The load-balancing policy is not changed while the service is online.
For example, a web browser on the client connects to a shared IP address on port 80 using three different TCP connections. However, the connections exchange cached session information between them at the service.
A generalization of a sticky policy extends to multiple scalable services that exchange session information in the background and at the same instance. When these services exchange session information in the background and at the same instance, the client is said to be "sticky" toward multiple server instances on the same node that is listening on different ports.
For example, a customer on an e-commerce web site fills a shopping cart with items by using HTTP on port 80. The customer then switches to SSL on port 443 to send secure data to pay by credit card for the items in the cart.
In the ordinary sticky policy, the set of ports is known at the time the application resources are configured. This policy is set by using the LB_STICKY value for the Load_balancing_policy resource property.
Wildcard sticky services use dynamically assigned port numbers, but still expect client requests to go to the same node. The client is "sticky wildcard" over ports that have the same IP address.
A good example of this policy is passive mode FTP. For example, a client connects to an FTP server on port 21. The server then instructs the client to connect back to a listener port server in the dynamic port range. All requests for this IP address are forwarded to the same node that the server informed the client of through the control information.
The sticky-wildcard policy is a superset of the ordinary sticky policy. For a scalable service that is identified by the IP address, ports are assigned by the server (and are not known in advance). The ports might change. This policy is set by using the LB_STICKY_WILD value for the Load_balancing_policy resource property.
For each one of these sticky policies, the weighted load-balancing policy is in effect by default. Therefore, a client's initial request is directed to the instance that the load balancer dictates. After the client establishes an affinity for the node where the instance is running, future requests are conditionally directed to that instance. The node must be accessible and the load-balancing policy must not have changed.
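The two-phase behavior just described, initial placement by weight followed by affinity, can be modeled with a small sketch. The class, its names, and the random weighted choice are hypothetical illustrations, not Sun Cluster's actual load balancer:

```python
import random

class StickyBalancer:
    """Toy model of an ordinary sticky policy: a client's first request is
    placed by the weighted policy; later requests from the same client keep
    going to the same instance while it remains accessible."""

    def __init__(self, weights):
        self.weights = weights   # node name -> weight
        self.affinity = {}       # client -> node the client is sticky toward

    def route(self, client):
        if client not in self.affinity:
            nodes = list(self.weights)
            # initial request: weighted placement (X/A odds per node)
            picks = random.choices(nodes, weights=[self.weights[n] for n in nodes])
            self.affinity[client] = picks[0]
        return self.affinity[client]

lb = StickyBalancer({"node1": 1, "node2": 1})
first = lb.route("10.0.0.5")
assert all(lb.route("10.0.0.5") == first for _ in range(10))  # requests stick
```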
Failback Settings
Resource groups fail over from one node to another. When this failover occurs, the original secondary becomes the new primary. The failback settings specify the actions that occur when the original primary comes back online. The options are to have the original primary become the primary again (failback) or to allow the current primary to remain. You specify the option you want by using the Failback resource group property setting.

If the original node that hosts the resource group fails and reboots repeatedly, setting failback might result in reduced availability for the resource group.
Data Services Fault Monitors
Each Sun Cluster data service supplies a fault monitor that periodically probes the data service to determine its health. A fault monitor verifies that the application daemon or daemons are running and that clients are being served. Based on the information that probes return, predefined actions such as restarting daemons or causing a failover can be initiated.
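The restart-then-failover escalation described above can be sketched as a small decision function. The probe outcomes, threshold, and action names are invented for illustration; real fault monitors are specific to each data service:

```python
def react(probe_results, max_restarts=2):
    """Given a sequence of probe outcomes (True = the service answered the
    probe), return the predefined action taken after each failed probe:
    restart the daemon locally, then fail over once restarts are exhausted."""
    actions, failures = [], 0
    for ok in probe_results:
        if ok:
            failures = 0                 # a healthy probe resets the count
        else:
            failures += 1
            actions.append("restart" if failures <= max_restarts else "failover")
    return actions

print(react([True, False, False, False]))  # ['restart', 'restart', 'failover']
```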
Developing New Data Services

Sun supplies configuration files and management methods templates that enable you to make various applications operate as failover or scalable services within a cluster. If Sun does not offer the application that you want to run as a failover or scalable service, you have an alternative. Use a Sun Cluster API or the DSET API to configure the application to run as a failover or scalable service. However, not all applications can become a scalable service.
Characteristics of Scalable Services

A set of criteria determines whether an application can become a scalable service. To determine if your application can become a scalable service, see "Analyzing the Application for Suitability" in Sun Cluster Data Services Developer's Guide for Solaris OS.

This set of criteria is summarized as follows:
■ First, such a service is composed of one or more server instances. Each instance runs on a different node. Two or more instances of the same service cannot run on the same node.

■ Second, if the service provides an external logical data store, you must exercise caution. Concurrent access to this store from multiple server instances must be synchronized to avoid losing updates or reading data as it's being changed. Note the use of "external" to distinguish the store from in-memory state. The term "logical" indicates that the store appears as a single entity, although it might itself be replicated. Furthermore, in this data store, when any server instance updates the data store, this update is immediately "seen" by other instances.

The Sun Cluster software provides such an external storage through its cluster file system and its global raw partitions. As an example, suppose a service writes new data to an external log file or modifies existing data in place. When multiple instances of this service run, each instance has access to this external log, and each might simultaneously access this log. Each instance must synchronize its access to this log, or else the instances interfere with each other. The service could use ordinary Solaris file locking through fcntl and lockf to achieve the synchronization that you want.

Another example of this type of store is a back-end database, such as highly available Oracle Real Application Clusters Guard for SPARC based clusters or Oracle. This type of back-end database server provides built-in synchronization by using database query or update transactions. Therefore, multiple server instances do not need to implement their own synchronization.

The Sun IMAP server is an example of a service that is not a scalable service. The service updates a store, but that store is private and when multiple IMAP instances write to this store, they overwrite each other because the updates are not synchronized. The IMAP server must be rewritten to synchronize concurrent access.

■ Finally, note that instances can have private data that is disjoint from the data of other instances. In such a case, the service does not need synchronized concurrent access because the data is private, and only that instance can manipulate it. In this case, you must be careful not to store this private data under the cluster file system because this data can become globally accessible.
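The fcntl/lockf-style synchronization mentioned in the second criterion can be sketched as follows. The log path and record format are placeholders; the point is only that each instance takes an exclusive advisory lock around its update of the shared store:

```python
import fcntl
import os
import tempfile

# Stand-in for the shared external log on a cluster file system:
log_path = os.path.join(tempfile.gettempdir(), "service.log")

def append_record(record):
    with open(log_path, "a") as log:
        fcntl.lockf(log, fcntl.LOCK_EX)   # block until this instance owns the lock
        try:
            log.write(record + "\n")      # critical section: update the store
            log.flush()
        finally:
            fcntl.lockf(log, fcntl.LOCK_UN)

append_record("instance-1: update")
append_record("instance-2: update")
```

Because the lock is advisory, every instance must take it; an instance that skips the lock can still interleave its writes with the others.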
Data Service API and Data Service Development Library API

The Sun Cluster software provides the following to make applications highly available:

■ Data services that are supplied as part of the Sun Cluster software
■ A data service API
■ A development library API for data services
■ A "generic" data service

The Sun Cluster Data Services Planning and Administration Guide for Solaris OS describes how to install and configure the data services that are supplied with the Sun Cluster software. The Sun Cluster 3.1 9/04 Software Collection for Solaris OS (SPARC Platform Edition) describes how to instrument other applications to be highly available under the Sun Cluster framework.

The Sun Cluster APIs enable application developers to develop fault monitors and scripts that start and stop data service instances. With these tools, an application can be implemented as a failover or a scalable data service. The Sun Cluster software provides a "generic" data service. Use this generic data service to quickly generate an application's required start and stop methods and to implement the data service as a failover or scalable service.
Using the Cluster Interconnect for Data Service Traffic

A cluster must usually have multiple network connections between Solaris hosts, forming the cluster interconnect.
Sun Cluster software uses multiple interconnects to achieve the following goals:

■ Ensure high availability
■ Improve performance
For both internal and external traffic such as file system data or scalable services data, messages are striped across all available interconnects. The cluster interconnect is also available to applications, for highly available communication between hosts. For example, a distributed application might have components that are running on different hosts that need to communicate. By using the cluster interconnect rather than the public transport, these connections can withstand the failure of an individual link.
To use the cluster interconnect for communication between hosts, an application must use the private host names that you configured during the Sun Cluster installation. For example, if the private host name for host1 is clusternode1-priv, use this name to communicate with host1 over the cluster interconnect. TCP sockets that are opened by using this name are routed over the cluster interconnect and can be transparently rerouted if a private network adapter fails. Application communication between any two hosts is striped over all interconnects. The traffic for a given TCP connection flows on one interconnect at any point. Different TCP connections are striped across all interconnects. Additionally, UDP traffic is always striped across all interconnects.
An application can optionally use a zone's private host name to communicate over the cluster interconnect between zones. However, you must first set each zone's private host name before the application can begin communicating. Each zone must have its own private host name to communicate. An application that is running in one zone must use the private host name in the same zone to communicate with private host names in other zones. An application in one zone cannot communicate through the private host name in another zone.
Because you can configure the private host names during your Sun Cluster installation, the cluster interconnect uses any name that you choose at that time. To determine the actual name, use the scha_cluster_get command with the scha_privatelink_hostname_node argument. See the scha_cluster_get(1HA) man page.
Each host is also assigned a fixed per-host address. This per-host address is plumbed on the clprivnet driver. The IP address maps to the private host name for the host: clusternode1-priv. See the clprivnet(7) man page.
If your application requires consistent IP addresses at all points, configure the application to bind to the per-host address on both the client and the server. All connections appear then to originate from and return to the per-host address.
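Binding to a fixed local address before connecting can be sketched with ordinary sockets. The addresses below are placeholders; on a cluster node the local address would be the per-host address plumbed on clprivnet:

```python
import socket

def connect_from(local_addr, remote_addr, remote_port):
    """Open a TCP connection whose source address is pinned to local_addr,
    so all traffic appears to originate from that address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((local_addr, 0))             # fix the source address; any free port
    sock.connect((remote_addr, remote_port))
    return sock
```

A server would similarly pass the per-host address to bind() so that replies return to it.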
Resources, Resource Groups, and Resource Types

Data services use several types of resources: applications such as Sun Java System Web Server or Apache Web Server use network addresses (logical host names and shared addresses) on which the applications depend. Application and network resources form a basic unit that is managed by the RGM.
Data services are resource types. For example, Sun Cluster HA for Oracle is the resource type SUNW.oracle-server and Sun Cluster HA for Apache is the resource type SUNW.apache.
A resource is an instantiation of a resource type that is defined cluster wide. Several resource types are defined.
Network resources are either SUNW.LogicalHostname or SUNW.SharedAddress resource types. These two resource types are preregistered by the Sun Cluster software.
The HAStoragePlus resource type is used to synchronize the startup of resources and device groups on which the resources depend. This resource type ensures that before a data service starts, the paths to a cluster file system's mount points, global devices, and device group names are available. For more information, see "Synchronizing the Startups Between Resource Groups and Device Groups" in Sun Cluster Data Services Planning and Administration Guide for Solaris OS. The HAStoragePlus resource type also enables local file systems to be highly available. For more information about this feature, see "HAStoragePlus Resource Type" on page 52.
RGM-managed resources are placed into groups, called resource groups, so that they can be managed as a unit. A resource group is migrated as a unit if a failover or switchover is initiated on the resource group.

Note – When you bring a resource group that contains application resources online, the application is started. The data service start method waits until the application is running before exiting successfully. The determination of when the application is up and running is accomplished the same way the data service fault monitor determines that a data service is serving clients. Refer to the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information about this process.
Resource Group Manager (RGM)

The RGM controls data services (applications) as resources, which are managed by resource type implementations. These implementations are either supplied by Sun or created by a developer with a generic data service template, the Data Service Development Library API (DSDL API), or the Resource Management API (RMAPI). The cluster administrator creates and manages resources in containers called resource groups. The RGM stops and starts resource groups on selected nodes in response to cluster membership changes.

The RGM acts on resources and resource groups. RGM actions cause resources and resource groups to move between online and offline states. A complete description of the states and settings that can be applied to resources and resource groups is located in "Resource and Resource Group States and Settings" on page 75.

Refer to "Data Service Project Configuration" on page 83 for information about how to launch Solaris projects under RGM control.
Resource and Resource Group States and Settings
A system administrator applies static settings to resources and resource groups. You can change these settings only by administrative action. The RGM moves resource groups between dynamic "states."
These settings and states are as follows:
■ Managed or unmanaged settings. These cluster-wide settings apply only to resource groups. The RGM manages resource groups. You can use the clresourcegroup command to request that the RGM manage or unmanage a resource group. These resource group settings do not change when you reconfigure a cluster.

When a resource group is first created, it is unmanaged. A resource group must be managed before any resources placed in the group can become active.

In some data services, for example, a scalable web server, work must be done prior to starting network resources and after they are stopped. This work is done by initialization (INIT) and finish (FINI) data service methods. The INIT methods only run if the resource group in which the resources are located is in the managed state.

When a resource group is moved from unmanaged to managed, any registered INIT methods for the group are run on the resources in the group.

When a resource group is moved from managed to unmanaged, any registered FINI methods are called to perform cleanup.

The most common use of the INIT and FINI methods are for network resources for scalable services. However, a data service developer can use these methods for any initialization or cleanup work that is not performed by the application.
■ Enabled or disabled settings. These settings apply to resources on one or more nodes. A system administrator can use the clresource command to enable or disable a resource on one or more nodes. These settings do not change when the cluster administrator reconfigures a cluster.

The normal setting for a resource is that it is enabled and actively running in the system.

If you want to make the resource unavailable on all cluster nodes, disable the resource on all cluster nodes. A disabled resource is not available for general use on the cluster nodes that you specify.

■ Online or offline states. These dynamic states apply to both resources and resource groups.

Online and offline states change as the cluster transitions through cluster reconfiguration steps during switchover or failover. You can also change the online or offline state of a resource or a resource group by using the clresource and clresourcegroup commands.

A failover resource or resource group can only be online on one node at any time. A scalable resource or resource group can be online on some nodes and offline on others. During a switchover or failover, resource groups and the resources within them are taken offline on one node and then brought online on another node.

If a resource group is offline, all of its resources are offline. If a resource group is online, all of its enabled resources are online.

You can temporarily suspend the automatic recovery actions of a resource group. You might need to suspend the automatic recovery of a resource group to investigate and fix a problem in the cluster. Or, you might need to perform maintenance on resource group services.
A suspended resource group is not automatically restarted or failed over until you explicitly issue the command that resumes automatic recovery. Whether online or offline, suspended data services remain in their current state. You can still manually switch the resource group to a different state on specified nodes. You can also still enable or disable individual resources in the resource group.

Resource groups can contain several resources, with dependencies between resources. These dependencies require that the resources be brought online and offline in a particular order. The methods that are used to bring resources online and offline might take different amounts of time for each resource. Because of resource dependencies and start and stop time differences, resources within a single resource group can have different online and offline states during a cluster reconfiguration.
Resource and Resource Group Properties

You can configure property values for resources and resource groups for your Sun Cluster data services. Standard properties are common to all data services. Extension properties are specific to each data service. Some standard and extension properties are configured with default settings so that you do not have to modify them. Others need to be set as part of the process of creating and configuring resources. The documentation for each data service specifies which resource properties can be set and how to set them.

The standard properties are used to configure resource and resource group properties that are usually independent of any particular data service. For the set of standard properties, see Appendix B, "Standard Properties," in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.

The RGM extension properties provide information such as the location of application binaries and configuration files. You modify extension properties as you configure your data services. The set of extension properties is described in the individual guide for the data service.
Support for Solaris Zones

Solaris zones provide a means of creating virtualized operating system environments within an instance of the Solaris 10 OS. Solaris zones enable one or more applications to run in isolation from other activity on your system. The Solaris zones facility is described in Part II, "Zones," in System Administration Guide: Solaris Containers-Resource Management and Solaris Zones.

When you run Sun Cluster software on the Solaris 10 OS, you can create any number of global-cluster non-voting nodes.

You can use Sun Cluster software to manage the availability and scalability of applications that are running on global-cluster non-voting nodes.
Support for Global-Cluster Non-Voting Nodes (Solaris Zones) Directly Through the RGM

On a cluster where the Solaris 10 OS is running, you can configure a resource group to run on a global-cluster voting node or a global-cluster non-voting node. The RGM manages each global-cluster non-voting node as a switchover target. If a global-cluster non-voting node is specified in the node list of a resource group, the RGM brings the resource group online in the specified node.
Figure 3–8 illustrates the failover of resource groups between nodes in a two-host cluster. In this example, identical nodes are configured to simplify the administration of the cluster.
You can configure a scalable resource group (which uses network load balancing) to run in a cluster non-voting node as well.
In Sun Cluster commands, you specify a zone by appending the name of the zone to the name of the host, and separating them with a colon, for example:
[Figure: Two hosts, pn1 and pn2, each running resource groups RG1 through RG5 across the voting node (on the Sun Cluster foundation) and zones zA, zB, and zC. Resource groups fail over between the corresponding nodes and zones on the two hosts.]

FIGURE 3–8 Failover of Resource Groups Between Nodes
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
phys-schost-1:zoneA
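The same host:zone notation is used wherever a Sun Cluster node list accepts a non-voting node. The following sketch is illustrative only (the resource group name and zone names are invented): it creates a failover resource group whose candidate nodes are a zone on each of two hosts.

```
# Create a resource group whose candidate nodes are zone zoneA
# on each of the two hosts (names are hypothetical).
clresourcegroup create -n phys-schost-1:zoneA,phys-schost-2:zoneA example-rg
```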
Criteria for Using Support for Solaris Zones Directly Through the RGM
Use support for Solaris zones directly through the RGM if any of the following criteria are met:
■ Your application cannot tolerate the additional failover time that is required to boot a zone.
■ You require minimum downtime during maintenance.
■ You require dual-partition software upgrade.
■ You are configuring a data service that uses a shared address resource for network load balancing.
Requirements for Using Support for Solaris Zones Directly Through the RGM
If you plan to use support for Solaris zones directly through the RGM for an application, ensure that the following requirements are met:
■ The application is supported to run in non-global zones.
■ The data service for the application is supported to run on a global-cluster non-voting node.
If you use support for Solaris zones directly through the RGM, ensure that resource groups that are related by an affinity are configured to run on the same Solaris host.
Additional Information About Support for Solaris Zones Directly Through the RGM
For information about how to configure support for Solaris zones directly through the RGM, see the following documentation:
■ “Guidelines for Non-Global Zones in a Global Cluster” in Sun Cluster Software Installation Guide for Solaris OS
■ “Zone Names” in Sun Cluster Software Installation Guide for Solaris OS
■ “Configuring a Non-Global Zone on a Global-Cluster Node” in Sun Cluster Software Installation Guide for Solaris OS
■ Sun Cluster Data Services Planning and Administration Guide for Solaris OS
■ Individual data service guides
Support for Solaris Zones on Sun Cluster Nodes Through Sun Cluster HA for Solaris Containers
The Sun Cluster HA for Solaris Containers data service manages each zone as a resource that is controlled by the RGM.
Criteria for Using Sun Cluster HA for Solaris Containers
Use the Sun Cluster HA for Solaris Containers data service if any of the following criteria are met:
■ You require delegated root access.
■ The application is not supported in a cluster.
■ You require affinities between resource groups that are to run in different zones on the same node.
Requirements for Using Sun Cluster HA for Solaris Containers
If you plan to use the Sun Cluster HA for Solaris Containers data service for an application, ensure that the following requirements are met:

■ The application is supported to run on global-cluster non-voting nodes.
■ The application is integrated with the Solaris OS through a script, a run-level script, or a Solaris Service Management Facility (SMF) manifest.
■ The additional failover time that is required to boot a zone is acceptable.
■ Some downtime during maintenance is acceptable.
Additional Information About Sun Cluster HA for Solaris Containers
For information about how to use the Sun Cluster HA for Solaris Containers data service, see Sun Cluster Data Service for Solaris Containers Guide for Solaris OS.
Service Management Facility

The Solaris Service Management Facility (SMF) enables you to run and administer applications as highly available and scalable resources. Like the Resource Group Manager (RGM), the SMF provides high availability and scalability, but for the Solaris Operating System.
Sun Cluster provides three proxy resource types that you can use to enable SMF services in a cluster. These resource types, SUNW.Proxy_SMF_failover, SUNW.Proxy_SMF_loadbalanced, and SUNW.Proxy_SMF_multimaster, enable you to run SMF services in a failover, scalable, and multi-master configuration, respectively. The SMF manages the availability of SMF services on a single Solaris host. The SMF uses the callback method execution model to run services.
The SMF also provides a set of administrative interfaces for monitoring and controlling services. These interfaces enable you to integrate your own SMF-controlled services into Sun Cluster. This capability eliminates the need to create new callback methods, rewrite existing callback methods, or update the SMF service manifest. You can include multiple SMF resources in a resource group and you can configure dependencies and affinities between them.
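As a hypothetical sketch of how an SMF service could be placed under cluster control, the proxy resource type is registered and a resource is created from it. The resource group name, resource name, file path, and the extension property shown here (Proxied_service_instances, assumed to name a file that lists the proxied SMF services and their manifests) are illustrative and should be checked against the data service documentation.

```
# Register the failover SMF proxy resource type (names are illustrative).
clresourcetype register SUNW.Proxy_SMF_failover

# Create a proxy resource that manages the SMF services listed in a file.
clresource create -g example-rg -t SUNW.Proxy_SMF_failover \
    -p Proxied_service_instances=/var/cluster/svc_list.txt example-smf-rs
```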
The SMF is responsible for starting, stopping, and restarting these services and managing their dependencies. Sun Cluster is responsible for managing the service in the cluster and for determining the hosts on which these services are to be started.
The SMF runs as a daemon, svc.startd, on each cluster host. The SMF daemon automatically starts and stops resources on selected hosts according to preconfigured policies.
The services that are specified for an SMF proxy resource can be located on a global-cluster voting node or a global-cluster non-voting node. However, all the services that are specified for the same SMF proxy resource must be located on the same node. SMF proxy resources work on any node.
System Resource Usage

System resources include aspects of CPU usage, memory usage, swap usage, and disk and network throughput. Sun Cluster enables you to monitor how much of a specific system resource is being used by an object type. An object type includes a host, node, zone, disk, network interface, or resource group. Sun Cluster also enables you to control the CPU that is available to a resource group.
Monitoring and controlling system resource usage can be part of your resource management policy. The cost and complexity of managing numerous machines encourages the consolidation of several applications on larger hosts. Instead of running each workload on separate systems, with full access to each system's resources, you use resource management to segregate workloads within the system. Resource management enables you to lower overall total cost of ownership by running and controlling several applications on a single Solaris system.
Resource management ensures that your applications have the required response times. Resource management can also increase resource use. By categorizing and prioritizing usage, you can effectively use reserve capacity during off-peak periods, often eliminating the need for additional processing power. You can also ensure that resources are not wasted because of load variability.
To use the data that Sun Cluster collects about system resource usage, you must do the following:

■ Analyze the data to determine what it means for your system.
■ Make a decision about the action that is required to optimize your usage of hardware and software resources.
■ Take action to implement your decision.
By default, system resource monitoring and control are not configured when you install Sun Cluster. For information about configuring these services, see Chapter 9, “Configuring Control of CPU Usage,” in Sun Cluster System Administration Guide for Solaris OS.
System Resource Monitoring
By monitoring system resource usage, you can do the following:
■ Collect data that reflects how a service that is using specific system resources is performing.
■ Discover resource bottlenecks or overload and so preempt problems.
■ More efficiently manage workloads.
Data about system resource usage can help you determine the hardware resources that are underused and the applications that use many resources. Based on this data, you can assign applications to nodes that have the necessary resources and choose the node to which to fail over. This consolidation can help you optimize the way that you use your hardware and software resources.
Monitoring all system resources at the same time might be costly in terms of CPU. Choose the system resources that you want to monitor by prioritizing the resources that are most critical for your system.
When you enable monitoring, you choose the telemetry attribute that you want to monitor. A telemetry attribute is an aspect of system resources. Examples of telemetry attributes include the amount of free CPU or the percentage of blocks that are used on a device. If you monitor a telemetry attribute on an object type, Sun Cluster monitors this telemetry attribute on all objects of that type in the cluster. Sun Cluster stores a history of the system resource data that is collected for seven days.
If you consider a particular data value to be critical for a system resource, you can set a threshold for this value. When setting a threshold, you also choose how critical this threshold is by assigning it a severity level. If the threshold is crossed, Sun Cluster changes the severity level of the threshold to the severity level that you choose.
Control of CPU

Each application and service that is running on a cluster has specific CPU needs. Table 3–4 lists the CPU control activities that are available on different versions of the Solaris OS.
TABLE 3–4 CPU Control

Solaris Version   Zone                             Control
Solaris 9 OS      Not available                    Assign CPU shares
Solaris 10 OS     Global-cluster voting node       Assign CPU shares
Solaris 10 OS     Global-cluster non-voting node   Assign CPU shares
                                                   Assign number of CPU
                                                   Create dedicated processor sets
Note – If you want to apply CPU shares, you must specify the Fair Share Scheduler (FSS) as the default scheduler in the cluster.
Controlling the CPU that is assigned to a resource group in a dedicated processor set in a global-cluster non-voting node offers the strictest level of control. If you reserve CPU for a resource group, this CPU is not available to other resource groups.
Viewing System Resource Usage
You can view system resource data and CPU assignments by using the command line or through Sun Cluster Manager. The system resources that you choose to monitor determine the tables and graphs that you can view.
By viewing the output of system resource usage and CPU control, you can do the following:
■ Anticipate failures due to the exhaustion of system resources.
■ Detect unbalanced usage of system resources.
■ Validate server consolidation.
■ Obtain information that enables you to improve the performance of applications.
Sun Cluster does not provide advice about the actions to take, nor does it take action for you based on the data that it collects. You must determine whether the data that you view meets your expectations for a service. You must then take action to remedy any observed performance problems.
Data Service Project Configuration

Data services can be configured to launch under a Solaris project name when brought online by using the RGM. The configuration associates a resource or resource group managed by the RGM with a Solaris project ID. The mapping from your resource or resource group to a project ID gives you the ability to use sophisticated controls that are available in the Solaris OS to manage workloads and consumption within your cluster.
Chapter 3 • Key Concepts or System Administrators and Application Developers 83
Note – You can perform this configuration if you are using Sun Cluster on the Solaris 9 OS or on the Solaris 10 OS.
Using the Solaris management functionality in a Sun Cluster environment enables you to ensure that your most important applications are given priority when sharing a node with other applications. Applications might share a node if you have consolidated services or because applications have failed over. Use of the management functionality described herein might improve availability of a critical application by preventing lower-priority applications from overconsuming system supplies such as CPU time.
Note – The Solaris documentation for this feature describes CPU time, processes, tasks, and similar components as “resources”. Meanwhile, Sun Cluster documentation uses the term “resources” to describe entities that are under the control of the RGM. The following section uses the term “resource” to refer to Sun Cluster entities that are under the control of the RGM. The section uses the term “supplies” to refer to CPU time, processes, and tasks.
This section provides a conceptual description of configuring data services to launch processes on a specified Solaris OS project(4). This section also describes several failover scenarios and suggestions for planning to use the management functionality provided by the Solaris Operating System.
For detailed conceptual and procedural documentation about the management feature, refer to Chapter 1, “Network Service (Overview),” in System Administration Guide: Network Services.
When configuring resources and resource groups to use Solaris management functionality in a cluster, use the following high-level process:

1. Configuring applications as part of the resource.
2. Configuring resources as part of a resource group.
3. Enabling resources in the resource group.
4. Making the resource group managed.
5. Creating a Solaris project for your resource group.
6. Configuring standard properties to associate the resource group name with the project you created in step 5.
7. Bringing the resource group online.
To configure the standard Resource_project_name or RG_project_name properties to associate the Solaris project ID with the resource or resource group, use the -p option with the clresource set and the clresourcegroup set commands. Set the property values to the resource or to the resource group. See Appendix B, “Standard Properties,” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for property definitions. See the r_properties(5) and rg_properties(5) man pages for descriptions of properties.
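Both commands follow the same pattern. The following sketch is illustrative (the resource group, resource, and project names are invented):

```
# Associate the resource group app-rg with the Solaris project Prj_1.
clresourcegroup set -p RG_project_name=Prj_1 app-rg

# Associate an individual resource with the same project.
clresource set -p Resource_project_name=Prj_1 app-rs
```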
The specified project name must exist in the projects database (/etc/project) and the root user must be configured as a member of the named project. Refer to Chapter 2, “Projects and Tasks (Overview),” in System Administration Guide: Solaris Containers-Resource Management and Solaris Zones for conceptual information about the project name database. Refer to project(4) for a description of project file syntax.
When the RGM brings resources or resource groups online, it launches the related processes under the project name.
Note – Users can associate the resource or resource group with a project at any time. However, the new project name is not effective until the resource or resource group is taken offline and brought back online by using the RGM.
Launching resources and resource groups under the project name enables you to configure the following features to manage system supplies across your cluster.
■ Extended Accounting – Provides a flexible way to record consumption on a task or process basis. Extended accounting enables you to examine historical usage and make assessments of capacity requirements for future workloads.
■ Controls – Provide a mechanism for constraint on system supplies. Processes, tasks, and projects can be prevented from consuming large amounts of specified system supplies.
■ Fair Share Scheduling (FSS) – Provides the ability to control the allocation of available CPU time among workloads, based on their importance. Workload importance is expressed by the number of shares of CPU time that you assign to each workload. Refer to the following man pages for more information.

  ■ dispadmin(1M)
  ■ priocntl(1)
  ■ ps(1)
  ■ FSS(7)
■ Pools – Provide the ability to use partitions for interactive applications according to the application's requirements. Pools can be used to partition a host that supports a number of different software applications. The use of pools results in a more predictable response for each application.
Determining Requirements for Project Configuration

Before you configure data services to use the controls provided by Solaris in a Sun Cluster environment, you must decide how to control and track resources across switchovers or failovers. Identify dependencies within your cluster before configuring a new project. For example, resources and resource groups depend on device groups.
Use the nodelist, failback, maximum_primaries, and desired_primaries resource group properties that you configure with the clresourcegroup set command to identify node list priorities for your resource group.
■ For a brief discussion of the node list dependencies between resource groups and device groups, refer to “Relationship Between Resource Groups and Device Groups” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
■ For detailed property descriptions, refer to rg_properties(5).
Use the preferenced property and failback property that you configure with the cldevicegroup and clsetup commands to determine device group node list priorities. See the clresourcegroup(1CL), cldevicegroup(1CL), and clsetup(1CL) man pages.
■ For conceptual information about the preferenced property, see “Multiported Device Groups” on page 48.
■ For procedural information, see “How To Change Disk Device Properties” in “Administering Device Groups” in Sun Cluster System Administration Guide for Solaris OS.
■ For conceptual information about node configuration and the behavior of failover and scalable data services, see “Sun Cluster System Hardware and Software Components” on page 21.
If you configure all cluster nodes identically, usage limits are enforced identically on primary and secondary nodes. The configuration parameters of projects do not need to be identical for all applications in the configuration files on all nodes. All projects that are associated with the application must at least be accessible by the project database on all potential masters of that application. Suppose that Application 1 is mastered by phys-schost-1 but could potentially be switched over or failed over to phys-schost-2 or phys-schost-3. The project that is associated with Application 1 must be accessible on all three nodes (phys-schost-1, phys-schost-2, and phys-schost-3).
Note – Project database information can be a local /etc/project database file or can be stored in the NIS map or the LDAP directory service.
The Solaris Operating System enables flexible configuration of usage parameters, and few restrictions are imposed by Sun Cluster. Configuration choices depend on the needs of the site. Consider the general guidelines in the following sections before configuring your systems.
Setting Per-Process Virtual Memory Limits

Set the process.max-address-space control to limit virtual memory on a per-process basis. See the rctladm(1M) man page for information about setting the process.max-address-space value.
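Per-process resource controls can also be set as attributes in the project entry itself. The following /etc/project line is a sketch (the project name, ID, and the 4-Gbyte value are invented): it caps the virtual address space of each process in the project and denies requests beyond the cap.

```
Prj_db:110:project for database:root::process.max-address-space=(privileged,4294967296,deny)
```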
When you use management controls with Sun Cluster software, configure memory limits appropriately to prevent unnecessary failover of applications and a “ping-pong” effect of applications. In general, observe the following guidelines.

■ Do not set memory limits too low.

  When an application reaches its memory limit, it might fail over. This guideline is especially important for database applications, for which reaching a virtual memory limit can have unexpected consequences.
■ Do not set memory limits identically on primary and secondary nodes.

  Identical limits can cause a ping-pong effect when an application reaches its memory limit and fails over to a secondary node with an identical memory limit. Set the memory limit slightly higher on the secondary node. The difference in memory limits helps prevent the ping-pong scenario and gives the system administrator a period of time in which to adjust the parameters as necessary.
■ Do use the resource management memory limits for load balancing.

  For example, you can use memory limits to prevent an errant application from consuming excessive swap space.
Failover Scenarios

You can configure management parameters so that the allocation in the project configuration (/etc/project) works in normal cluster operation and in switchover or failover situations. The following sections are example scenarios.
■ The first two sections, “Two-Host Cluster With Two Applications” on page 88 and “Two-Host Cluster With Three Applications” on page 89, show failover scenarios for entire hosts.
■ The section “Failover of Resource Group Only” on page 91 illustrates failover operation for an application only.
In a Sun Cluster environment, you configure an application as part of a resource. You then configure a resource as part of a resource group (RG). When a failure occurs, the resource group, along with its associated applications, fails over to another node. In the following examples the resources are not shown explicitly. Assume that each resource has only one application.
Note – Failover occurs in the order in which nodes are specified in the node list and set in the RGM.
The following examples have these constraints:
■ Application 1 (App-1) is configured in resource group RG-1.
■ Application 2 (App-2) is configured in resource group RG-2.
■ Application 3 (App-3) is configured in resource group RG-3.
Although the numbers of assigned shares remain the same, the percentage of CPU time that is allocated to each application changes after failover. This percentage depends on the number of applications that are running on the node and the number of shares that are assigned to each active application.
In these scenarios, assume the following configurations.

■ All applications are configured under a common project.
■ Each resource has only one application.
■ The applications are the only active processes on the nodes.
■ The projects databases are configured the same on each node of the cluster.
Two-Host Cluster With Two Applications
You can configure two applications on a two-host cluster to ensure that each physical host (phys-schost-1, phys-schost-2) acts as the default master for one application. Each physical host acts as the secondary node for the other physical host. All projects that are associated with Application 1 and Application 2 must be represented in the projects database files on both nodes. When the cluster is running normally, each application is running on its default master, where it is allocated all CPU time by the management facility.
After a failover or switchover occurs, both applications run on a single node where they are allocated shares as specified in the configuration file. For example, this entry in the /etc/project file specifies that Application 1 is allocated 4 shares and Application 2 is allocated 1 share.
Prj_1:100:project for App-1:root::project.cpu-shares=(privileged,4,none)
Prj_2:101:project for App-2:root::project.cpu-shares=(privileged,1,none)
The following diagram illustrates the normal and failover operations of this configuration. The number of shares that are assigned does not change. However, the percentage of CPU time available to each application can change. The percentage depends on the number of shares that are assigned to each process that demands CPU time.
[Figure: Normal operation — App-1 (part of RG-1, 4 shares, 100% of CPU resources) runs on phys-schost-1; App-2 (part of RG-2, 1 share, 100% of CPU resources) runs on phys-schost-2. Failover operation (failure of phys-schost-1) — both applications run on phys-schost-2, where App-1 receives 80% and App-2 receives 20% of CPU resources.]
Two-Host Cluster With Three Applications
On a two-host cluster with three applications, you can configure one host (phys-schost-1) as the default master of one application. You can configure the second physical host (phys-schost-2) as the default master for the remaining two applications. Assume the following example projects database file is located on every host. The projects database file does not change when a failover or switchover occurs.
Prj_1:103:project for App-1:root::project.cpu-shares=(privileged,5,none)
Prj_2:104:project for App_2:root::project.cpu-shares=(privileged,3,none)
Prj_3:105:project for App_3:root::project.cpu-shares=(privileged,2,none)
When the cluster is running normally, Application 1 is allocated 5 shares on its default master, phys-schost-1. This number is equivalent to 100 percent of CPU time because it is the only application that demands CPU time on that host. Applications 2 and 3 are allocated 3 and 2 shares, respectively, on their default master, phys-schost-2. Application 2 would receive 60 percent of CPU time and Application 3 would receive 40 percent of CPU time during normal operation.
If a failover or switchover occurs and Application 1 is switched over to phys-schost-2, the shares for all three applications remain the same. However, the percentages of CPU resources are reallocated according to the projects database file.

■ Application 1, with 5 shares, receives 50 percent of CPU.
■ Application 2, with 3 shares, receives 30 percent of CPU.
■ Application 3, with 2 shares, receives 20 percent of CPU.
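These percentages follow directly from the share counts: each application's fraction of CPU time is its shares divided by the total shares of all active applications on the node. The following shell sketch reproduces the failover arithmetic from the example /etc/project entries above:

```shell
# CPU shares from the example /etc/project file.
app1=5; app2=3; app3=2
total=$((app1 + app2 + app3))   # 10 shares now competing on phys-schost-2

# Percentage of CPU time each application receives after failover.
echo "App-1: $((100 * app1 / total))%"   # App-1: 50%
echo "App-2: $((100 * app2 / total))%"   # App-2: 30%
echo "App-3: $((100 * app3 / total))%"   # App-3: 20%
```

During normal operation the same arithmetic over only the applications on a given host yields the 100, 60, and 40 percent figures described above.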
The following diagram illustrates the normal operations and failover operations of this configuration.
[Figure: Normal operation — App-1 (part of RG-1, 5 shares, 100% of CPU resources) runs on phys-schost-1; App-2 (part of RG-2, 3 shares, 60%) and App-3 (part of RG-3, 2 shares, 40%) run on phys-schost-2. Failover operation (failure of phys-schost-1) — all three applications run on phys-schost-2: App-1 receives 50%, App-2 receives 30%, and App-3 receives 20% of CPU resources.]
Failover of Resource Group Only

In a configuration in which multiple resource groups have the same default master, a resource group (and its associated applications) can fail over or be switched over to a secondary node. Meanwhile, the default master is running in the cluster.
Note – During failover, the application that fails over is allocated resources as specified in the configuration file on the secondary host. In this example, the project database files on the primary and secondary hosts have the same configurations.
For example, this sample configuration file specifies that Application 1 is allocated 1 share, Application 2 is allocated 2 shares, and Application 3 is allocated 2 shares.
Prj_1:106:project for App_1:root::project.cpu-shares=(privileged,1,none)
Prj_2:107:project for App_2:root::project.cpu-shares=(privileged,2,none)
Prj_3:108:project for App_3:root::project.cpu-shares=(privileged,2,none)
The following diagram illustrates the normal and failover operations of this configuration, where RG-2, containing Application 2, fails over to phys-schost-2. Note that the number of shares assigned does not change. However, the percentage of CPU time available to each application can change, depending on the number of shares that are assigned to each application that demands CPU time.
[Figure: Normal operation — App-1 (part of RG-1, 1 share, 33.3% of CPU resources) and App-2 (part of RG-2, 2 shares, 66.6%) run on phys-schost-1; App-3 (part of RG-3, 2 shares, 100%) runs on phys-schost-2. Failover operation — after RG-2 fails over to phys-schost-2, App-1 (1 share, 100%) remains on phys-schost-1, while App-2 and App-3 (2 shares each, 50% each) run on phys-schost-2.]
Public Network Adapters and IP Network Multipathing

Clients make data requests to the cluster through the public network. Each cluster Solaris host is connected to at least one public network through a pair of public network adapters.
Solaris Internet Protocol (IP) Network Multipathing software on Sun Cluster provides the basic mechanism for monitoring public network adapters and failing over IP addresses from one adapter to another when a fault is detected. Each host has its own IP network multipathing configuration, which can be different from the configuration on other hosts.
Public network adapters are organized into IP multipathing groups (multipathing groups). Each multipathing group has one or more public network adapters. Each adapter in a multipathing group can be active. Alternatively, you can configure standby interfaces that are inactive unless a failover occurs.
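On Solaris 10, a multipathing group is typically configured in the /etc/hostname.interface file for each adapter. The following fragment is a sketch only (the adapter name ce0, the addresses, and the group name sc_ipmp0 are invented): it places the adapter in a group with a data address and a non-failover test address that the multipathing daemon can use for failure detection.

```
# /etc/hostname.ce0 (hypothetical): data address plus a test address
# marked deprecated and -failover for in.mpathd failure detection.
192.168.1.10 netmask + broadcast + group sc_ipmp0 up \
addif 192.168.1.11 deprecated -failover netmask + broadcast + up
```

Check the IPMP documentation for the Solaris release installed on your cluster for the exact syntax.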
The in.mpathd multipathing daemon uses a test IP address to detect failures and repairs. If a fault is detected on one of the adapters by the multipathing daemon, a failover occurs. All network access fails over from the faulted adapter to another functional adapter in the multipathing group. Therefore, the daemon maintains public network connectivity for the host. If you configured a standby interface, the daemon chooses the standby interface. Otherwise, the daemon chooses the interface with the least number of IP addresses. Because the failover occurs at the adapter interface level, higher-level connections such as TCP are not affected, except for a brief transient delay during the failover. When the failover of IP addresses completes successfully, ARP broadcasts are sent. Therefore, the daemon maintains connectivity to remote clients.
Note – Because of the congestion recovery characteristics of TCP, TCP endpoints can experience further delay after a successful failover. Some segments might have been lost during the failover, activating the congestion control mechanism in TCP.
Multipathing groups provide the building blocks or logical host name and shared addressresources. You can also create multipathing groups independently o logical host name andshared address resources to monitor public network connectivity o cluster hosts. The samemultipathing group on a host can host any number o logical host name or shared addressresources. For more inormation about logical host name and shared address resources, see theSunCluster Data Services PlanningandAdministrationGuide or Solaris OS.
Note – The design o the IP network multipathing mechanism is meant to detect and mask adapter ailures. The design is not intended to recover rom an administrator's use o ifconfig
to remove one o the logical (or shared) IP addresses. The Sun Cluster sotware views the logicaland shared IP addresses as resources that are managed by the RGM. The correct way or anadministrator to add or to remove an IP address is to use clresource and clresourcegroup tomodiy the resource group that contains the resource.
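The RGM-managed procedure can be sketched with the Sun Cluster 3.2 command set. The resource group name, resource name, and host name below are hypothetical examples, not values from this guide:

```shell
# Add a logical-hostname IP address by creating an RGM-managed resource
# (resource group rg-web and host name schost-lh are hypothetical).
clresource create -g rg-web -t SUNW.LogicalHostname \
    -p HostnameList=schost-lh schost-lh-rs

# Remove the address by disabling and deleting the resource,
# rather than by running ifconfig on the adapter directly.
clresource disable schost-lh-rs
clresource delete schost-lh-rs
```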
For more information about the Solaris implementation of IP Network Multipathing, see the appropriate documentation for the Solaris Operating System that is installed on your cluster:

■ Solaris 9 Operating System: Chapter 1, “IP Network Multipathing (Overview),” in IP Network Multipathing Administration Guide
■ Solaris 10 Operating System: Part VI, “IPMP,” in System Administration Guide: IP Services
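As an illustration of the per-host configuration described in this section, a Solaris 10 host might place a public adapter in a multipathing group with a dedicated test address through its /etc/hostname.<adapter> file. This is a minimal sketch: the adapter name qfe0, group name sc_ipmp0, and host names are assumptions, not values from this guide:

```shell
# /etc/hostname.qfe0 (hypothetical): put qfe0 in IPMP group sc_ipmp0
# and configure a non-failover test address for in.mpathd to probe.
phys-schost-1 netmask + broadcast + group sc_ipmp0 up
addif phys-schost-1-test deprecated -failover netmask + broadcast + up
```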
Chapter 3 • Key Concepts for System Administrators and Application Developers

SPARC: Dynamic Reconfiguration Support

Sun Cluster 3.2 1/09 support for the dynamic reconfiguration (DR) software feature is being developed in incremental phases. This section describes concepts and considerations for Sun Cluster 3.2 1/09 support of the DR feature.
All the requirements, procedures, and restrictions that are documented for the Solaris DR feature also apply to Sun Cluster DR support (except for the operating environment quiescence operation). Therefore, review the documentation for the Solaris DR feature before using the DR feature with Sun Cluster software. You should review in particular the issues that affect nonnetwork IO devices during a DR detach operation.
The Sun Enterprise 10000 Dynamic Reconfiguration User Guide and the Sun Enterprise 10000 Dynamic Reconfiguration Reference Manual (from the Solaris 10 on Sun Hardware collection) are both available for download from http://docs.sun.com.

SPARC: Dynamic Reconfiguration General Description

The DR feature enables operations, such as the removal of system hardware, in running systems. The DR processes are designed to ensure continuous system operation with no need to halt the system or interrupt cluster availability.

DR operates at the board level. Therefore, a DR operation affects all the components on a board. Each board can contain multiple components, including CPUs, memory, and peripheral interfaces for disk drives, tape drives, and network connections.

Removing a board that contains active components would result in system errors. Before removing a board, the DR subsystem queries other subsystems, such as Sun Cluster, to determine whether the components on the board are being used. If the DR subsystem finds that a board is in use, the DR remove-board operation is not done. Therefore, it is always safe to issue a DR remove-board operation because the DR subsystem rejects operations on boards that contain active components.

The DR add-board operation is also always safe. CPUs and memory on a newly added board are automatically brought into service by the system. However, the system administrator must manually configure the cluster to actively use components that are on the newly added board.

Note – The DR subsystem has several levels. If a lower level reports an error, the upper level also reports an error. However, when the lower level reports the specific error, the upper level reports Unknown error. You can safely ignore this error.

The following sections describe DR considerations for the different device types.
SPARC: DR Clustering Considerations for CPU Devices

Sun Cluster software does not reject a DR remove-board operation because of the presence of CPU devices.

When a DR add-board operation succeeds, CPU devices on the added board are automatically incorporated in system operation.
SPARC: DR Clustering Considerations for Memory

For the purposes of DR, consider two types of memory:

■ Kernel memory cage
■ Non-kernel memory cage

These two types differ only in usage. The actual hardware is the same for both types. Kernel memory cage is the memory that is used by the Solaris Operating System. Sun Cluster software does not support remove-board operations on a board that contains the kernel memory cage and rejects any such operation. When a DR remove-board operation pertains to memory other than the kernel memory cage, Sun Cluster software does not reject the operation. When a DR add-board operation that pertains to memory succeeds, memory on the added board is automatically incorporated in system operation.
SPARC: DR Clustering Considerations for Disk and Tape Drives

Sun Cluster rejects dynamic reconfiguration (DR) remove-board operations on active drives on the primary host. You can perform DR remove-board operations on inactive drives on the primary host and on any drives in the secondary host. After the DR operation, cluster data access continues as before.
Note – Sun Cluster rejects DR operations that impact the availability of quorum devices. For considerations about quorum devices and the procedure for performing DR operations on them, see “SPARC: DR Clustering Considerations for Quorum Devices” on page 96.

See “Dynamic Reconfiguration With Quorum Devices” in Sun Cluster System Administration Guide for Solaris OS for detailed instructions about how to perform these actions.
SPARC: DR Clustering Considerations for Quorum Devices

If the DR remove-board operation pertains to a board that contains an interface to a device configured for quorum, Sun Cluster software rejects the operation. Sun Cluster software also identifies the quorum device that would be affected by the operation. You must disable the device as a quorum device before you can perform a DR remove-board operation.
See Chapter 6, “Administering Quorum,” in Sun Cluster System Administration Guide for Solaris OS for detailed instructions about how to administer quorum.
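The disable-then-DR sequence can be sketched with the Sun Cluster 3.2 command set. The device name d4 is a hypothetical example, and the exact procedure for your configuration is in the administration guide cited above:

```shell
# Unconfigure the quorum device before removing the board
# that hosts its interface (d4 is a hypothetical device name).
clquorum remove d4

# ... perform the DR remove-board operation here ...

# Reconfigure the device as a quorum device afterward.
clquorum add d4
```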
SPARC: DR Clustering Considerations for Cluster Interconnect Interfaces

If the DR remove-board operation pertains to a board containing an active cluster interconnect interface, Sun Cluster software rejects the operation. Sun Cluster software also identifies the interface that would be affected by the operation. You must use a Sun Cluster administrative tool to disable the active interface before the DR operation can succeed.
Caution – Sun Cluster software requires each cluster node to have at least one functioning path to every other cluster node. Do not disable a private interconnect interface that supports the last path to any Solaris host in the cluster.

See “Administering the Cluster Interconnects” in Sun Cluster System Administration Guide for Solaris OS for detailed instructions about how to perform these actions.
SPARC: DR Clustering Considerations for Public Network Interfaces

If the DR remove-board operation pertains to a board that contains an active public network interface, Sun Cluster software rejects the operation. Sun Cluster software also identifies the interface that would be affected by the operation. Before you remove a board with an active network interface present, switch over all traffic on that interface to another functional interface in the multipathing group by using the if_mpadm command.
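A minimal sketch of that switchover, assuming a hypothetical adapter name bge0:

```shell
# Move all IP addresses off bge0 to another adapter in its
# multipathing group before the DR remove-board operation.
if_mpadm -d bge0

# ... perform the DR operation ...

# Return the addresses to bge0 once the replacement board is in place.
if_mpadm -r bge0
```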
Caution – If the remaining network adapter fails while you are performing the DR remove operation on the disabled network adapter, availability is impacted. The remaining adapter has no place to fail over for the duration of the DR operation.

See “Administering the Public Network” in Sun Cluster System Administration Guide for Solaris OS for detailed instructions about how to perform a DR remove operation on a public network interface.
C H A P T E R 4

Frequently Asked Questions
This chapter includes answers to the most frequently asked questions about the Sun Cluster product.

The questions are organized by topic as follows:

■ “High Availability FAQs” on page 99
■ “File Systems FAQs” on page 100
■ “Volume Management FAQs” on page 101
■ “Data Services FAQs” on page 101
■ “Public Network FAQs” on page 102
■ “Cluster Member FAQs” on page 103
■ “Cluster Storage FAQs” on page 104
■ “Cluster Interconnect FAQs” on page 104
■ “Client Systems FAQs” on page 105
■ “Administrative Console FAQs” on page 105
■ “Terminal Concentrator and System Service Processor FAQs” on page 106
High Availability FAQs

Question: What exactly is a highly available system?
Answer: The Sun Cluster software defines high availability (HA) as the ability of a cluster to keep an application running. The application runs even when a failure occurs that would normally make a host system unavailable.

Question: What is the process by which the cluster provides high availability?
Answer: Through a process known as failover, the cluster framework provides a highly available environment. Failover is a series of steps that are performed by the cluster to migrate data service resources from a failing node to another operational node in the cluster.
Question: What is the difference between a failover and scalable data service?
Answer: There are two types of highly available data services:

■ Failover
■ Scalable

A failover data service runs an application on only one primary node in the cluster at a time. Other nodes might run other applications, but each application runs on only a single node. If a primary node fails, applications that are running on the failed node fail over to another node. They continue running.
A scalable data service spreads an application across multiple nodes to create a single, logical service. Scalable services leverage the number of nodes and processors in the entire cluster on which they run.

For each application, one node hosts the physical interface to the cluster. This node is called a Global Interface (GIF) node. Multiple GIF nodes can exist in the cluster. Each GIF node hosts one or more logical interfaces that can be used by scalable services. These logical interfaces are called global interfaces. One GIF node hosts a global interface for all requests for a particular application and dispatches them to multiple nodes on which the application server is running. If the GIF node fails, the global interface fails over to a surviving node.

If any node on which the application is running fails, the application continues to run on other nodes with some performance degradation. This process continues until the failed node returns to the cluster.
File Systems FAQs

Question: Can I run one or more of the Solaris hosts in the cluster as highly available NFS servers with other Solaris hosts as clients?
Answer: No, do not do a loopback mount.

Question: Can I use a cluster file system for applications that are not under Resource Group Manager control?
Answer: Yes. However, without RGM control, the applications need to be restarted manually after the failure of the node on which they are running.

Question: Must all cluster file systems have a mount point under the /global directory?
Answer: No. However, placing cluster file systems under the same mount point, such as /global, enables better organization and management of these file systems.
Question: What are the differences between using the cluster file system and exporting NFS file systems?
Answer: Several differences exist:

1. The cluster file system supports global devices. NFS does not support remote access to devices.
2. The cluster file system has a global namespace. Only one mount command is required. With NFS, you must mount the file system on each host.
3. The cluster file system caches files in more cases than does NFS. For example, the cluster file system caches files when a file is being accessed from multiple nodes for read, write, file locks, asynchronous I/O.
4. The cluster file system is built to exploit future fast cluster interconnects that provide remote DMA and zero-copy functions.
5. If you change the attributes on a file (using chmod, for example) in a cluster file system, the change is reflected immediately on all nodes. With an exported NFS file system, this change can take much longer.
Question: The file system /global/.devices/node@nodeID appears on my cluster nodes. Can I use this file system to store data that I want to be highly available and global?
Answer: These file systems store the global device namespace. These file systems are not intended for general use. While they are global, these file systems are never accessed in a global manner. Each node only accesses its own global device namespace. If a node is down, other nodes cannot access this namespace for the node that is down. These file systems are not highly available. These file systems should not be used to store data that needs to be globally accessible or highly available.
Volume Management FAQs

Question: Do I need to mirror all disk devices?
Answer: For a disk device to be considered highly available, it must be mirrored, or use RAID-5 hardware. All data services should use either highly available disk devices, or cluster file systems mounted on highly available disk devices. Such configurations can tolerate single disk failures.

Question: Can I use one volume manager for the local disks (boot disk) and a different volume manager for the multihost disks?
Answer: This configuration is supported with the Solaris Volume Manager software managing the local disks and Veritas Volume Manager managing the multihost disks. No other combination is supported.
Data Services FAQs

Question: Which Sun Cluster data services are available?
Answer: The list of supported data services is included in the Sun Cluster Release Notes.

Question: Which application versions are supported by Sun Cluster data services?
Answer: The list of supported application versions is included in the Sun Cluster Release Notes.
Chapter 4 • Frequently Asked Questions 101
Question: Can I write my own data service?
Answer: Yes. See Chapter 11, “DSDL API Functions,” in Sun Cluster Data Services Developer's Guide for Solaris OS for more information.

Question: When creating network resources, should I specify numeric IP addresses or host names?
Answer: The preferred method for specifying network resources is to use the UNIX host name rather than the numeric IP address.
Question: When creating network resources, what is the difference between using a logical host name (a LogicalHostname resource) or a shared address (a SharedAddress resource)?
Answer: Except in the case of Sun Cluster HA for NFS, wherever the documentation recommends the use of a LogicalHostname resource in a Failover mode resource group, a SharedAddress resource or LogicalHostname resource can be used interchangeably. The use of a SharedAddress resource incurs some additional overhead because the cluster networking software is configured for a SharedAddress but not for a LogicalHostname.

The advantage to using a SharedAddress resource is demonstrated when you configure both scalable and failover data services, and want clients to be able to access both services by using the same host name. In this case, the SharedAddress resources along with the failover application resource are contained in one resource group. The scalable service resource is contained in a separate resource group and configured to use the SharedAddress resource. Both the scalable and failover services can then use the same set of host names and addresses that are configured in the SharedAddress resource.
Public Network FAQs

Question: Which public network adapters does the Sun Cluster software support?
Answer: Currently, the Sun Cluster software supports Ethernet (10/100BASE-T and 1000BASE-SX Gb) public network adapters. Because new interfaces might be supported in the future, check with your Sun sales representative for the most current information.

Question: What is the role of the MAC address in failover?
Answer: When a failover occurs, new Address Resolution Protocol (ARP) packets are generated and broadcast to the world. These ARP packets contain the new MAC address (of the new physical adapter to which the host failed over) and the old IP address. When another machine on the network receives one of these packets, it flushes the old MAC-IP mapping from its ARP cache and uses the new one.

Question: Does the Sun Cluster software support setting local-mac-address?=true?
Answer: Yes. In fact, IP Network Multipathing requires that local-mac-address? must be set to true.
You can set local-mac-address with the eeprom command, at the OpenBoot PROM ok prompt in a SPARC based cluster. See the eeprom(1M) man page. You can also set the MAC address with the SCSI utility that you optionally run after the BIOS boots in an x86 based cluster.
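For example, from a root shell on a running SPARC host, you might verify and then set the variable with eeprom. The variable name is the standard OpenBoot one; the invocation is a sketch:

```shell
# Check the current setting, then enable per-adapter MAC addresses.
# The "?" must be quoted so the shell does not treat it as a glob.
eeprom "local-mac-address?"
eeprom "local-mac-address?=true"
```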
Question: How much delay can I expect when IP network multipathing performs a switchover between adapters?
Answer: The delay could be several minutes. The reason is that when an IP network multipathing switchover is performed, the operation sends a gratuitous ARP broadcast. However, you cannot be sure that the router between the client and the cluster uses the gratuitous ARP. So, until the ARP cache entry for this IP address on the router times out, the entry can use the stale MAC address.
Question: How fast are failures of a network adapter detected?
Answer: The default failure detection time is 10 seconds. The algorithm tries to meet the failure detection time, but the actual time depends on the network load.
Cluster Member FAQs

Question: Do all cluster members need to have the same root password?
Answer: You are not required to have the same root password on each cluster member. However, you can simplify administration of the cluster by using the same root password on all nodes.

Question: Is the order in which nodes are booted significant?
Answer: In most cases, no. However, the boot order is important to prevent amnesia. For example, if node two was the owner of the quorum device and node one is down, and then you bring node two down, you must bring up node two before bringing back node one. This order prevents you from accidentally bringing up a node with outdated cluster configuration information.

Question: Do I need to mirror local disks in a cluster node?
Answer: Yes. Though this mirroring is not a requirement, mirroring the cluster node's disks prevents a nonmirrored disk failure from taking down the node. The downside to mirroring a cluster node's local disks is more system administration overhead.

Question: What are the cluster member backup issues?
Answer: You can use several backup methods for a cluster. One method is to have a host as the backup node with a tape drive or library attached. Then use the cluster file system to back up the data. Do not connect this host to the shared disks.
See Chapter 11, “Backing Up and Restoring a Cluster,” in Sun Cluster System Administration Guide for Solaris OS for additional information about how to back up and restore data.
Question: When is a node healthy enough to be used as a secondary node?
Answer: Solaris 9 OS: After a reboot, a node is healthy enough to be a secondary node when the node displays the login prompt.
Solaris 10 OS: A node is healthy enough to be a secondary node if the multi-user-server milestone is running.

# svcs -a | grep multi-user-server:default
Cluster Storage FAQs

Question: What makes multihost storage highly available?
Answer: Multihost storage is highly available because it can survive the loss of a single disk, because of mirroring (or because of hardware-based RAID-5 controllers). Because a multihost storage device has more than one host connection, it can also withstand the loss of a single Solaris host to which it is connected. In addition, redundant paths from each host to the attached storage provide tolerance for the failure of a host bus adapter, cable, or disk controller.
Cluster Interconnect FAQs

Question: Which cluster interconnects does the Sun Cluster software support?
Answer: Currently, the Sun Cluster software supports the following cluster interconnects:

■ Ethernet (100BASE-T Fast Ethernet and 1000BASE-SX Gb) in both SPARC based and x86 based clusters
■ Infiniband in both SPARC based and x86 based clusters
■ SCI in SPARC based clusters only
Question: What is the difference between a “cable” and a transport “path”?
Answer: Cluster transport cables are configured by using transport adapters and switches. Cables join adapters and switches on a component-to-component basis. The cluster topology manager uses available cables to build end-to-end transport paths between hosts. A cable does not map directly to a transport path.
Cables are statically “enabled” and “disabled” by an administrator. Cables have a “state” (enabled or disabled), but not a “status.” If a cable is disabled, it is as if it were unconfigured. Cables that are disabled cannot be used as transport paths. These cables are not probed and therefore their state is unknown. You can obtain the state of a cable by using the cluster status command.
Transport paths are dynamically established by the cluster topology manager. The “status” of a transport path is determined by the topology manager. A path can have a status of “online” or “offline.” You can obtain the status of a transport path by using the clinterconnect status command. See the clinterconnect(1CL) man page.
Consider the following example of a two-host cluster with four cables.
node1:adapter0 to switch1, port0
node1:adapter1 to switch2, port0
node2:adapter0 to switch1, port1
node2:adapter1 to switch2, port1
Two possible transport paths can be formed from these four cables.
node1:adapter0 to node2:adapter0
node1:adapter1 to node2:adapter1
Client Systems FAQs

Question: Do I need to consider any special client needs or restrictions for use with a cluster?
Answer: Client systems connect to the cluster as they would to any other server. In some instances, depending on the data service application, you might need to install client-side software or perform other configuration changes so that the client can connect to the data service application. See Chapter 1, “Planning for Sun Cluster Data Services,” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information about client-side configuration requirements.
Administrative Console FAQs

Question: Does the Sun Cluster software require an administrative console?
Answer: Yes.

Question: Does the administrative console have to be dedicated to the cluster, or can it be used for other tasks?
Answer: The Sun Cluster software does not require a dedicated administrative console, but using one provides these benefits:

■ Enables centralized cluster management by grouping console and management tools on the same machine
■ Provides potentially quicker problem resolution by your hardware service provider

Question: Does the administrative console need to be located “close” to the cluster, for example, in the same room?
Answer: Check with your hardware service provider. The provider might require that the console be located in close proximity to the cluster. No technical reason exists for the console to be located in the same room.
Question: Can an administrative console serve more than one cluster, if any distance requirements are also first met?
Answer: Yes. You can control multiple clusters from a single administrative console. You can also share a single terminal concentrator between clusters.
Terminal Concentrator and System Service Processor FAQs

Question: Does the Sun Cluster software require a terminal concentrator?
Answer: Starting with Sun Cluster 3.0, Sun Cluster software does not require a terminal concentrator. Unlike Sun Cluster 2.2, which required a terminal concentrator for fencing, Sun Cluster 3.0, Sun Cluster 3.1, and Sun Cluster 3.2 do not.
Question: I see that most Sun Cluster servers use a terminal concentrator, but the Sun Enterprise E10000 server does not. Why not?
Answer: The terminal concentrator is effectively a serial-to-Ethernet converter for most servers. The terminal concentrator's console port is a serial port. The Sun Enterprise E10000 server doesn't have a serial console. The System Service Processor (SSP) is the console, either through an Ethernet or jtag port. For the Sun Enterprise E10000 server, you always use the SSP for consoles.
Question: What are the benefits of using a terminal concentrator?
Answer: Using a terminal concentrator provides console-level access to each Solaris host from a remote machine anywhere on the network. This access is provided even when the host is at the OpenBoot PROM (OBP) on a SPARC based host or a boot subsystem on an x86 based host.
Question: If I use a terminal concentrator that Sun does not support, what do I need to know to qualify the one that I want to use?
Answer: The main difference between the terminal concentrator that Sun supports and other console devices is that the Sun terminal concentrator has special firmware. This firmware prevents the terminal concentrator from sending a break to the console when it boots. If you have a console device that can send a break, or a signal that might be interpreted as a break to the console, the break shuts down the host.
Question: Can I free a locked port on the terminal concentrator that Sun supports without rebooting it?
Answer: Yes. Note the port number that needs to be reset and type the following commands:

telnet tc
Enter Annex port name or number: cli
annex: su -
annex# admin
admin : reset port-number
admin : quit
annex# hangup
#
Refer to the following manuals for more information about how to configure and administer the terminal concentrator that Sun supports.

■ “Overview of Administering Sun Cluster” in Sun Cluster System Administration Guide for Solaris OS
■ Chapter 2, “Installing and Configuring the Terminal Concentrator,” in Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS
Question: What if the terminal concentrator itself fails? Must I have another one standing by?
Answer: No. You do not lose any cluster availability if the terminal concentrator fails. You do lose the ability to connect to the host consoles until the concentrator is back in service.

Question: If I do use a terminal concentrator, what about security?
Answer: Generally, the terminal concentrator is attached to a small network that system administrators use, not a network that is used for other client access. You can control security by limiting access to that particular network.
Question: SPARC: How do I use dynamic reconfiguration with a tape or disk drive?
Answer: Perform the following steps:

■ Determine whether the disk or tape drive is part of an active device group. If the drive is not part of an active device group, you can perform the DR remove operation on it.
■ If the DR remove-board operation would affect an active disk or tape drive, the system rejects the operation and identifies the drives that would be affected by the operation. If the drive is part of an active device group, go to “SPARC: DR Clustering Considerations for Disk and Tape Drives” on page 95.
■ Determine whether the drive is a component of the primary node or the secondary node. If the drive is a component of the secondary node, you can perform the DR remove operation on it.
■ If the drive is a component of the primary node, you must switch the primary and secondary nodes before performing the DR remove operation on the device.
Caution – If the current primary node fails while you are performing the DR operation on a secondary node, cluster availability is impacted. The primary node has no place to fail over until a new secondary node is provided.
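The primary/secondary switch in the last step can be sketched with the Sun Cluster 3.2 command set. The device group and node names are hypothetical:

```shell
# Make phys-schost-2 the primary for device group dg-schost-1 so that
# the DR remove operation can proceed on the former primary's board.
cldevicegroup switch -n phys-schost-2 dg-schost-1
```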
Index
A li t t (C ti d)
8/2/2019 Cluster Sun Conceptos
http://slidepdf.com/reader/full/cluster-sun-conceptos 109/116
Aadapters, See network, adaptersadministration, cluster, 41-97
administrative console, 28-29FAQs, 105-106
administrative interaces, 42agents, See data servicesamnesia, 57APIs, 71-73, 75application, See data servicesapplication communication, 73-74
application development, 41-97application distribution, 60attributes,See properties
Bbackup node, 103-104
board removal, dynamic reconguration, 95boot disk, See disks, localboot order, 103-104
Ccable, transport, 104-105
CCP, 28CCR, 45-46CD-ROM drive, 26client-server conguration, 65client systems, 27
client systems (Continued)FAQs, 105restrictions, 105
clprivnet driver, 74cluster
administration, 41-97advantages, 14-15application developer view, 18application development, 41-97backup, 103-104board removal, 95
boot order, 103-104conguration, 45-46, 83-91data services, 64-71description, 14-15le system, 51-53, 100-101
FAQsSeealso le system
HAStoragePlus resource type, 52-53
using, 51-52goals, 14-15hardware, 15-16, 21-29interconnect, 23, 26-27
adapters, 26cables, 27data services, 73-74dynamic reconguration, 96
FAQs, 104-105interaces, 26 junctions, 27supported, 104-105
media, 26
109
  members, 22, 44
    FAQs, 103-104
    reconfiguration, 44
  nodes, 22-23
  password, 103-104
  public network, 27
  public network interface, 65
  service, 15-16
  software components, 23-24
  storage FAQs, 104
  system administrator view, 16-17
  task list, 18-19
  time, 42
  topologies, 29-37, 38-40
Cluster Configuration Repository, 45-46
Cluster Control Panel, 28
cluster in a box topology, 33
Cluster Membership Monitor, 44
clustered pair topology, 30, 39
clustered-server model, 65
clusters span two hosts topology, 35-36
CMM, 44
  failfast mechanism, 44
  See also failfast
concurrent access, 22
configuration
  client-server, 65
  data services, 83-91
  parallel database, 22
  repository, 45-46
  virtual memory limits, 86-87
configurations, quorum, 59-60
console
  access, 28
  administrative, 28
  FAQs, 105-106
  System Service Processor, 28
Controlling CPU, 82
CPU, control, 82
CPU time, 83-91

D
daemons, svc.startd, 81
data, storing, 100-101
data services, 64-71
  APIs, 71-73
  cluster interconnect, 73-74
  configuration, 83-91
  developing, 71-73
  failover, 67-68
  FAQs, 101-102
  fault monitor, 71
  highly available, 43
  library API, 73
  methods, 67
  resource groups, 74-77
  resource types, 74-77
  resources, 74-77
  scalable, 68-69
  supported, 101-102
/dev/global/ namespace, 49-50
developer, cluster applications, 18
device
  global, 46-47
  ID, 46-47
device group, 47-49
  changing properties, 48-49
device groups
  failover, 47
  multiported, 48-49
  primary ownership, 48-49
devices
  multihost, 24
  quorum, 56-63
DID, 46-47
disk path monitoring, 53-56
disks
  dynamic reconfiguration, 95
  global devices, 46-47, 49-50
  local, 25-26, 46-47, 49-50
    mirroring, 103-104
    volume management, 101
  multihost, 46-47, 47-49, 49-50
  SCSI devices, 25
DR, See dynamic reconfiguration
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
driver, device ID, 46-47
DSDL API, 75
dynamic reconfiguration, 94-97
  cluster interconnect, 96
  CPU devices, 95
  description, 94
  disks, 95
  memory, 95
  public network, 96-97
  quorum devices, 96
  tape drives, 95

E
E10000, See Sun Enterprise E10000

F
failback, 71
failfast, 44-45
failover
  data services, 67-68
  device groups, 47
  scenarios, Solaris Resource Manager, 87-91
failure
  detection, 43
  failback, 71
  recovery, 43
FAQs, 99-107
  administrative console, 105-106
  client systems, 105
  cluster interconnect, 104-105
  cluster members, 103-104
  cluster storage, 104
  data services, 101-102
  file systems, 100-101
  high availability, 99-100
  public network, 102-103
  System Service Processor, 106-107
  terminal concentrator, 106-107
  volume management, 101
fault monitor, 71
fencing, 44
file locking, 51
file system
  cluster, 51-53, 100-101
  data storage, 100-101
  FAQs, 100-101
  global
    See file system, cluster
  high availability, 100-101
  local, 52-53
  mounting, 51-53, 100-101
  NFS, 53, 100-101
  syncdir mount option, 53
  UFS, 53
  VxFS, 53
file systems, using, 51-52
framework, high availability, 43-46
Frequently Asked Questions, See FAQs

G
global
  device, 46-47, 47-49
    local disks, 25
    mounting, 51-53
  interface, 66
    scalable services, 68
  namespace, 46, 49-50
    local disks, 25
global file system, See cluster, file system
global interface node, 66
/global mount point, 51-53, 100-101
groups, device, 47-49

H
HA, See high availability
hardware, 15-16, 21-29, 94-97
  See also disks
  See also storage
  cluster interconnect components, 26
  dynamic reconfiguration, 94-97
HAStoragePlus resource type, 52-53, 74-77
high availability
  FAQs, 99-100
  framework, 43-46
highly available, data services, 43
host name, 65

I
ID
  device, 46-47
  node, 50
in.mpathd daemon, 92
interfaces
  See network, interfaces
  administrative, 42
IP address, 101-102
IP Network Multipathing, 92-93
  failover time, 102-103
IPMP, See IP Network Multipathing

K
kernel, memory, 95

L
load balancing, 69-71
local disks, 25-26
local file system, 52-53
local_mac_address, 102-103
local namespace, 50
logical host name, 65
  compared to shared address, 101-102
  failover data services, 67-68
LogicalHostname resource type, See logical host name

M
MAC address, 102-103
mapping, namespaces, 50
media, removable, 26
membership, See cluster, members
memory, 95
mission-critical applications, 62
monitoring
  disk path, 53-56
  object type, 81
  system resources, 82
  telemetry attributes, 82
mounting
  file systems, 51-53
  /global, 100-101
  global devices, 51-53
  with syncdir, 53
multi-initiator SCSI, 25
multihost device, 24
multipathing, 92-93
multiported device groups, 48-49

N
N+1 (star) topology, 31-32, 39-40
N*N (scalable) topology, 32
namespaces, 49-50, 50
network
  adapters, 27, 92-93
  interfaces, 27, 92-93
  load balancing, 69-71
  logical host name, 65
  private, 23
  public, 27
    dynamic reconfiguration, 96-97
    FAQs, 102-103
    interfaces, 102-103
    IP Network Multipathing, 92-93
  resources, 65, 74-77
  shared address, 65
Network Time Protocol, 42
NFS, 53
nodes, 22-23
  backup, 103-104
  boot order, 103-104
  global interface, 66
  nodeID, 50
  primary, 48-49, 65
  secondary, 48-49, 65
NTP, 42
numsecondaries property, 48
O
object type, system resource, 81
Oracle Parallel Server, See Oracle Real Application Clusters
Oracle Real Application Clusters, 72

P
pair+N topology, 31
panic, 44-45, 45
parallel database configurations, 22
password, root, 103-104
path, transport, 104-105
per-host address, 73-74
preferenced property, 48
primary node, 65
primary ownership, device groups, 48-49
private network, 23
projects, 83-91
properties
  changing, 48-49
  resource groups, 77
  Resource_project_name, 85-86
  resources, 77
  RG_project_name, 85-86
proxy resource types, 80
public network, See network, public
pure service, 69

Q
quorum, 56-63
  atypical configurations, 62
  bad configurations, 63
  best practices, 60
  configurations, 58-59, 59-60
  device, dynamic reconfiguration, 96
  devices, 56-63
  recommended configurations, 61
  requirements, 59-60
  vote counts, 58

R
recovery
  failback settings, 71
  failure detection, 43
redundant I/O domains topology, 37
removable media, 26
Resource Group Manager, See RGM
resource groups, 74-77
  failover, 67-68
  properties, 77
  scalable, 68-69
  settings, 75-77
  states, 75-77
resource management, 83-91
Resource_project_name property, 85-86
resource types, 52-53, 74-77
  proxy, 80
  SUNW.Proxy_SMF_failover, 80
  SUNW.Proxy_SMF_loadbalanced, 80
  SUNW.Proxy_SMF_multimaster, 80
resources, 74-77
  properties, 77
  settings, 75-77
  states, 75-77
RG_project_name property, 85-86
RGM, 67, 74-77, 83-91
RMAPI, 75
root password, 103-104

S
scalable data services, 68-69
scha_cluster_get command, 74
scha_privatelink_hostname_node argument, 74
SCSI, multi-initiator, 25
scsi-initiator-id property, 25
secondary node, 65
server models, 65
Service Management Facility (SMF), 80-81
shared address, 65
  compared to logical host name, 101-102
  global interface node, 66
  scalable data services, 68-69
SharedAddress resource type, See shared address
shutdown, 44-45
single cluster spans two hosts topology, 34-35
single-server model, 65
SMF, See Service Management Facility (SMF)
SMF daemon svc.startd, 81
software components, 23-24
Solaris projects, 83-91
Solaris Resource Manager, 83-91
  configuration requirements, 85-86
  configuring virtual memory limits, 86-87
  failover scenarios, 87-91
Solaris Volume Manager, multihost devices, 24
split brain, 57
SSP, See System Service Processor
sticky service, 69
storage, 24
  dynamic reconfiguration, 95
  FAQs, 104
  SCSI, 25
Sun Cluster, See cluster
Sun Cluster Manager, 42
  system resource usage, 83
Sun Enterprise E10000, 106-107
  administrative console, 28
Sun Management Center (SunMC), 42
SUNW.Proxy_SMF_failover, resource types, 80
SUNW.Proxy_SMF_loadbalanced, resource types, 80
SUNW.Proxy_SMF_multimaster, resource types, 80
svc.startd, daemons, 81
syncdir mount option, 53
system resource, threshold, 82
system resource monitoring, 82
system resource usage, 81
system resources
  monitoring, 82
  object type, 81
  usage, 81
System Service Processor, 28
  FAQs, 106-107

T
tape drive, 26
telemetry attribute, system resources, 82
terminal concentrator, FAQs, 106-107
threshold
  system resource, 82
  telemetry attribute, 82
time, between hosts, 42
topologies, 29-37, 38-40
  clustered pair, 30, 39
  logical domains: cluster in a box, 33
  logical domains: clusters span two hosts, 35-36
  logical domains: redundant I/O domains, 37
  logical domains: single cluster spans two hosts, 34-35
  N+1 (star), 31-32, 39-40
  N*N (scalable), 32
  pair+N, 31

U
UFS, 53

V
Veritas Volume Manager, multihost devices, 24
volume management
  FAQs, 101
  local disks, 101
  multihost devices, 24
  multihost disks, 101
  namespace, 49
  RAID-5, 101
  Solaris Volume Manager, 101
  Veritas Volume Manager, 101
vote counts, quorum, 58
VxFS, 53

Z
zones, 77