8/2/2019 Cluster Sun Conceptos
http://slidepdf.com/reader/full/cluster-sun-conceptos 1/116
Sun Cluster Concepts Guide for Solaris OS
Sun Microsystems, Inc.
4150 Network Circle
Santa Clara, CA 95054
U.S.A.
Part No: 820–4676–10
January 2009, Revision A
Copyright 2009 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.
Sun Microsystems, Inc. has intellectual property rights relating to technology embodied in the product that is described in this document. In particular, and without limitation, these intellectual property rights may include one or more U.S. patents or pending patent applications in the U.S. and in other countries.
U.S. Government Rights – Commercial software. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements.
This distribution may include materials developed by third parties. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd.
Sun, Sun Microsystems, the Sun logo, the Solaris logo, the Java Coffee Cup logo, docs.sun.com, OpenBoot, Solaris Volume Manager, StorEdge, Sun Fire, Java, and Solaris are trademarks or registered trademarks of Sun Microsystems, Inc. or its subsidiaries in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun's licensees who implement OPEN LOOK GUIs and otherwise comply with Sun's written license agreements.
Products covered by and information contained in this publication are controlled by U.S. Export Control laws and may be subject to the export or import laws in other countries. Nuclear, missile, chemical or biological weapons or nuclear maritime end uses or end users, whether direct or indirect, are strictly prohibited. Export or reexport to countries subject to U.S. embargo or to entities identified on U.S. export exclusion lists, including, but not limited to, the denied persons and specially designated nationals lists is strictly prohibited.
DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
Copyright 2009 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, CA 95054 U.S.A. Tous droits réservés.
Sun Microsystems, Inc. détient les droits de propriété intellectuelle relatifs à la technologie incorporée dans le produit qui est décrit dans ce document. En particulier, et ce sans limitation, ces droits de propriété intellectuelle peuvent inclure un ou plusieurs brevets américains ou des applications de brevet en attente aux Etats-Unis et dans d'autres pays.
Cette distribution peut comprendre des composants développés par des tierces personnes.
Certains composants de ce produit peuvent être dérivés du logiciel Berkeley BSD, licencié par l'Université de Californie. UNIX est une marque déposée aux Etats-Unis et dans d'autres pays; elle est licenciée exclusivement par X/Open Company, Ltd.
Sun, Sun Microsystems, le logo Sun, le logo Solaris, le logo Java Coffee Cup, docs.sun.com, OpenBoot, Solaris Volume Manager, StorEdge, Sun Fire, Java et Solaris sont des marques de fabrique ou des marques déposées de Sun Microsystems, Inc., ou ses filiales, aux Etats-Unis et dans d'autres pays. Toutes les marques SPARC sont utilisées sous licence et sont des marques de fabrique ou des marques déposées de SPARC International, Inc. aux Etats-Unis et dans d'autres pays. Les produits portant les marques SPARC sont basés sur une architecture développée par Sun Microsystems, Inc.
L'interface d'utilisation graphique OPEN LOOK et Sun a été développée par Sun Microsystems, Inc. pour ses utilisateurs et licenciés. Sun reconnaît les efforts de pionniers de Xerox pour la recherche et le développement du concept des interfaces d'utilisation visuelle ou graphique pour l'industrie de l'informatique. Sun détient une licence non exclusive de Xerox sur l'interface d'utilisation graphique Xerox, cette licence couvrant également les licenciés de Sun qui mettent en place l'interface d'utilisation graphique OPEN LOOK et qui, en outre, se conforment aux licences écrites de Sun.
Les produits qui font l'objet de cette publication et les informations qu'elle contient sont régis par la legislation américaine en matière de contrôle des exportations et peuvent être soumis au droit d'autres pays dans le domaine des exportations et importations. Les utilisations finales, ou utilisateurs finaux, pour des armes nucléaires, des missiles, des armes chimiques ou biologiques ou pour le nucléaire maritime, directement ou indirectement, sont strictement interdites. Les exportations ou réexportations vers des pays sous embargo des Etats-Unis, ou vers des entités figurant sur les listes d'exclusion d'exportation américaines, y compris, mais de manière non exclusive, la liste de personnes qui font l'objet d'un ordre de ne pas participer, d'une façon directe ou indirecte, aux exportations des produits ou des services qui sont régis par la legislation américaine en matière de contrôle des exportations et la liste de ressortissants spécifiquement désignés, sont rigoureusement interdites.
LA DOCUMENTATION EST FOURNIE "EN L'ETAT" ET TOUTES AUTRES CONDITIONS, DECLARATIONS ET GARANTIES EXPRESSES OU TACITES SONT FORMELLEMENT EXCLUES, DANS LA MESURE AUTORISEE PAR LA LOI APPLICABLE, Y COMPRIS NOTAMMENT TOUTE GARANTIE IMPLICITE RELATIVE A LA QUALITE MARCHANDE, A L'APTITUDE A UNE UTILISATION PARTICULIERE OU A L'ABSENCE DE CONTREFACON.
081112@21288
Contents
Preface .....................................................................................................................................................7

1 Introduction and Overview ...............................................................................................................13
Introduction to the Sun Cluster Environment ................................................................................ 14
Three Views of the Sun Cluster Software ......................................................................................... 15
Hardware Installation and Service View ................................................................................... 15
System Administrator View ....................................................................................................... 16
Application Developer View ...................................................................................................... 18
Sun Cluster Software Tasks ................................................................................................................ 18

2 Key Concepts for Hardware Service Providers ............................................................................... 21
Sun Cluster System Hardware and Software Components ............................................................ 21
Cluster Nodes ............................................................................................................................... 22
Software Components for Cluster Hardware Members .......................................................... 23
Multihost Devices ........................................................................................................................ 24
Multi-Initiator SCSI .................................................................................................................... 25
Local Disks .................................................................................................................................... 25
Removable Media ......................................................................................................................... 26
Cluster Interconnect .................................................................................................................... 26
Public Network Interfaces ........................................................................................................... 27
Client Systems .............................................................................................................................. 27
Console Access Devices ............................................................................................................... 28
Administrative Console ............................................................................................................... 28
SPARC: Sun Cluster Topologies ........................................................................................................ 29
SPARC: Clustered Pair Topology ............................................................................................... 30
SPARC: Pair+N Topology ........................................................................................................... 31
SPARC: N+1 (Star) Topology ..................................................................................................... 31
SPARC: N*N (Scalable) Topology ............................................................................................. 32
SPARC: LDoms Guest Domains: Cluster in a Box Topology ................................................. 33
SPARC: LDoms Guest Domains: Single Cluster Spans Two Different Hosts Topology ..... 34
SPARC: LDoms Guest Domains: Clusters Span Two Different Hosts Topology ................. 35
SPARC: LDoms Guest Domains: Redundant I/O Domains ................................................... 37
x86: Sun Cluster Topologies ............................................................................................................... 38
x86: Clustered Pair Topology ..................................................................................................... 39
x86: N+1 (Star) Topology ............................................................................................................ 39
3 Key Concepts for System Administrators and Application Developers ..................................... 41
Administrative Interfaces ................................................................................................................... 42
Cluster Time ........................................................................................................................................ 42
High-Availability Framework ............................................................................................................ 43
Zone Membership ........................................................................................................................ 44
Cluster Membership Monitor .................................................................................................... 44
Failfast Mechanism ...................................................................................................................... 44
Cluster Configuration Repository (CCR) ................................................................................. 45
Global Devices ..................................................................................................................................... 46
Device IDs and DID Pseudo Driver ........................................................................................... 46
Device Groups ..................................................................................................................................... 47
Device Group Failover ................................................................................................................. 47
Multiported Device Groups ........................................................................................................ 48
Global Namespace ............................................................................................................................... 49
Local and Global Namespaces Example .................................................................................... 50
Cluster File Systems ............................................................................................................................ 51
Using Cluster File Systems .......................................................................................................... 51
HAStoragePlus Resource Type ................................................................................................... 52
syncdir Mount Option ................................................................................................................ 53
Disk Path Monitoring ......................................................................................................................... 53
DPM Overview ............................................................................................................................. 54
Monitoring Disk Paths ................................................................................................................ 55
Quorum and Quorum Devices .......................................................................................................... 56
About Quorum Vote Counts ...................................................................................................... 58
About Quorum Configurations ................................................................................................. 58
Adhering to Quorum Device Requirements ............................................................................ 59
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
Adhering to Quorum Device Best Practices ............................................................................. 60
Recommended Quorum Configurations .................................................................................. 61
Atypical Quorum Configurations .............................................................................................. 62
Bad Quorum Configurations ..................................................................................................... 63
Data Services .......................................................................................................................... .............. 64
Data Service Methods .................................................................................................................. 67
Failover Data Services .................................................................................................................. 67
Scalable Data Services .................................................................................................................. 68
Load-Balancing Policies .............................................................................................................. 69
Failback Settings ........................................................................................................................... 71
Data Services Fault Monitors ...................................................................................................... 71
Developing New Data Services .......................................................................................................... 71
Characteristics of Scalable Services ............................................................................................ 72
Data Service API and Data Service Development Library API .............................................. 73
Using the Cluster Interconnect for Data Service Traffic ................................................................. 73
Resources, Resource Groups, and Resource Types ......................................................................... 74
Resource Group Manager (RGM) ............................................................................................. 75
Resource and Resource Group States and Settings .................................................................. 75
Resource and Resource Group Properties ................................................................................ 77
Support for Solaris Zones ................................................................................................................... 77
Support for Global-Cluster Non-Voting Nodes (Solaris Zones) Directly Through the RGM .............................................................................................................................................. 78
Support for Solaris Zones on Sun Cluster Nodes Through Sun Cluster HA for Solaris Containers .................................................................................................................................... 79
Service Management Facility ............................................................................................................. 80
System Resource Usage ....................................................................................................................... 81
System Resource Monitoring ..................................................................................................... 82
Control of CPU ............................................................................................................................ 82
Viewing System Resource Usage ................................................................................................ 83
Data Service Project Configuration ................................................................................................... 83
Determining Requirements for Project Configuration ........................................................... 85
Setting Per-Process Virtual Memory Limits ............................................................................. 86
Failover Scenarios ........................................................................................................................ 87
Public Network Adapters and IP Network Multipathing ............................................................... 92
SPARC: Dynamic Reconfiguration Support .................................................................................... 94
SPARC: Dynamic Reconfiguration General Description ....................................................... 94
SPARC: DR Clustering Considerations for CPU Devices ....................................................... 95
SPARC: DR Clustering Considerations for Memory ............................................................... 95
SPARC: DR Clustering Considerations for Disk and Tape Drives ........................................ 95
SPARC: DR Clustering Considerations for Quorum Devices ................................................ 96
SPARC: DR Clustering Considerations for Cluster Interconnect Interfaces ....................... 96
SPARC: DR Clustering Considerations for Public Network Interfaces ................................ 96
4 Frequently Asked Questions .............................................................................................................99
High Availability FAQs ....................................................................................................................... 99
File Systems FAQs ............................................................................................................................. 100
Volume Management FAQs ............................................................................................................. 101
Data Services FAQs ............................................................................................................................ 101
Public Network FAQs ........................................................................................................................ 102
Cluster Member FAQs ...................................................................................................................... 103
Cluster Storage FAQs ........................................................................................................................ 104
Cluster Interconnect FAQs ............................................................................................................... 104
Client Systems FAQs ......................................................................................................................... 105
Administrative Console FAQs ......................................................................................................... 105
Terminal Concentrator and System Service Processor FAQs ...................................................... 106
Index ................................................................................................................................................... 109
Preface
The Sun Cluster Concepts Guide for Solaris OS contains conceptual and reference information about the Sun™ Cluster product on both SPARC® and x86 based systems.
Note – This Sun Cluster release supports systems that use the SPARC and x86 families of processor architectures: UltraSPARC, SPARC64, AMD64, and Intel 64. In this document, x86 refers to the larger family of 64-bit x86 compatible products. Information in this document pertains to all platforms unless otherwise specified.
Who Should Use This Book

This document is intended for the following audiences:
■ Service providers who install and service cluster hardware
■ System administrators who install, configure, and administer Sun Cluster software
■ Application developers who develop failover and scalable services for applications that are not currently included with the Sun Cluster product
To understand the concepts that are described in this book, you need to be familiar with the Solaris Operating System and also have expertise with the volume manager software that you can use with the Sun Cluster product.
Before reading this document, you need to have already determined your system requirements and purchased the equipment and software that you need. The Sun Cluster Data Services Planning and Administration Guide for Solaris OS contains information about how to plan, install, set up, and use the Sun Cluster software.
How This Book Is Organized

The Sun Cluster Concepts Guide for Solaris OS contains the following chapters:
Chapter 1, “Introduction and Overview,” provides an overview of the overall concepts that you need to know about Sun Cluster.
Chapter 2, “Key Concepts for Hardware Service Providers,” describes the concepts with which hardware service providers need to be familiar. These concepts can help service providers understand the relationships between hardware components. These concepts can also help service providers and cluster administrators better understand how to install, configure, and administer cluster software and hardware.
Chapter 3, “Key Concepts for System Administrators and Application Developers,” describes the concepts that system administrators and developers who intend to use the Sun Cluster application programming interface (API) need to know. Developers can use this API to turn a standard user application, such as a web browser or database, into a highly available data service that can run in the Sun Cluster environment.
Chapter 4, “Frequently Asked Questions,” provides answers to frequently asked questions about the Sun Cluster product.
Related Documentation

Information about related Sun Cluster topics is available in the documentation that is listed in the following table. All Sun Cluster documentation is available at http://docs.sun.com.
Topic                                     Documentation

Overview                                  Sun Cluster Overview for Solaris OS
                                          Sun Cluster 3.2 1/09 Documentation Center
Concepts                                  Sun Cluster Concepts Guide for Solaris OS
Hardware installation and administration  Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS
                                          Individual hardware administration guides
Software installation                     Sun Cluster Software Installation Guide for Solaris OS
                                          Sun Cluster Quick Start Guide for Solaris OS
Data service installation and             Sun Cluster Data Services Planning and Administration Guide for Solaris OS
administration                            Individual data service guides
Topic                                     Documentation

Data service development                  Sun Cluster Data Services Developer's Guide for Solaris OS
System administration                     Sun Cluster System Administration Guide for Solaris OS
                                          Sun Cluster Quick Reference
Software upgrade                          Sun Cluster Upgrade Guide for Solaris OS
Error messages                            Sun Cluster Error Messages Guide for Solaris OS
Command and function references           Sun Cluster Reference Manual for Solaris OS
                                          Sun Cluster Data Services Reference Manual for Solaris OS
                                          Sun Cluster Quorum Server Reference Manual for Solaris OS
For a complete list of Sun Cluster documentation, see the release notes for your release of Sun Cluster software at http://wikis.sun.com/display/SunCluster/Home/.
Getting Help

If you have problems installing or using the Sun Cluster software, contact your service provider and provide the following information:
■ Your name and email address (if available)
■ Your company name, address, and phone number
■ The model and serial numbers of your systems
■ The release number of the operating system (for example, the Solaris 10 OS)
■ The release number of Sun Cluster software (for example, 3.2 1/09)
Use the following commands to gather information about your systems for your service provider.
Command                                   Function

prtconf -v                                Displays the size of the system memory and reports
                                          information about peripheral devices
psrinfo -v                                Displays information about processors
showrev -p                                Reports which patches are installed
SPARC: prtdiag -v                         Displays system diagnostic information
/usr/cluster/bin/clnode show-rev          Displays Sun Cluster release and package version
                                          information
Also have available the contents of the /var/adm/messages file.
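The commands above can be run one at a time, but the following POSIX sh loop is a hypothetical helper (not part of the Sun Cluster product) that captures the output of each command into one file, noting any command that is not installed on the host:

```shell
# collect_sysinfo: run each diagnostic command that the service
# provider asks for, printing a banner per command and a note when
# a command is not installed on this machine.
collect_sysinfo() {
    for cmd in "prtconf -v" "psrinfo -v" "showrev -p" \
               "prtdiag -v" "/usr/cluster/bin/clnode show-rev"; do
        printf '==== %s ====\n' "$cmd"
        name=${cmd%% *}                  # first word is the executable
        if command -v "$name" >/dev/null 2>&1; then
            $cmd 2>&1                    # word splitting supplies the flags
        else
            printf '(%s is not available on this system)\n' "$name"
        fi
    done
}

# Capture everything in one file to send along with /var/adm/messages.
collect_sysinfo > sysinfo.txt
```

On a non-Solaris or non-clustered machine the script still completes, recording which diagnostic commands were unavailable.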
Documentation, Support, and Training

The Sun web site provides information about the following additional resources:
Typographic Conventions

The following table describes the typographic conventions that are used in this book.
TABLE P–1  Typographic Conventions

Typeface      Meaning                                           Example

AaBbCc123     The names of commands, files, and directories,    Edit your .login file.
              and onscreen computer output                      Use ls -a to list all files.
                                                                machine_name% you have mail.
AaBbCc123     What you type, contrasted with onscreen           machine_name% su
              computer output                                   Password:
aabbcc123     Placeholder: replace with a real name or value    The command to remove a file is rm filename.
AaBbCc123     Book titles, new terms, and terms to be           Read Chapter 6 in the User's Guide.
              emphasized                                        A cache is a copy that is stored locally.
                                                                Do not save the file.
                                                                Note: Some emphasized items appear bold online.
Shell Prompts in Command Examples

The following table shows the default UNIX® system prompt and superuser prompt for the C shell, Bourne shell, and Korn shell.
TABLE P–2  Shell Prompts

Shell                                        Prompt

C shell                                      machine_name%
C shell for superuser                        machine_name#
Bourne shell and Korn shell                  $
Bourne shell and Korn shell for superuser    #
Introduction and Overview
The Sun Cluster product is an integrated hardware and software solution that you use to create highly available and scalable services. The Sun Cluster Concepts Guide for Solaris OS provides the conceptual information that you need to gain a more complete picture of the Sun Cluster product. Use this book with the entire Sun Cluster documentation set for a complete view of the Sun Cluster software.
This chapter provides an overview of the general concepts that underlie the Sun Cluster product.
This chapter does the following:
■ Provides an introduction and high-level overview of the Sun Cluster software
■ Describes the several views of the Sun Cluster audience
■ Identifies key concepts that you need to understand before you use the Sun Cluster software
■ Maps key concepts to the Sun Cluster documentation that includes procedures and related information
■ Maps cluster-related tasks to the documentation that contains procedures that you use to complete those tasks
This chapter contains the following sections:

■ “Introduction to the Sun Cluster Environment” on page 14
■ “Three Views of the Sun Cluster Software” on page 15
■ “Sun Cluster Software Tasks” on page 18
Introduction to the Sun Cluster Environment
The Sun Cluster environment extends the Solaris Operating System into a cluster operating system. A cluster is a collection of one or more nodes that belong exclusively to that collection.

In a cluster that runs on the Solaris 10 OS, a global cluster and a zone cluster are types of clusters.

In a cluster that runs on any version of the Solaris OS that was released before the Solaris 10 OS, a node is a physical machine that contributes to cluster membership and is not a quorum device.

In a cluster that runs on the Solaris 10 OS, the concept of a node changes. In this environment, a node is a Solaris zone that is associated with a cluster. In this environment, a Solaris host, or simply host, is one of the following hardware or software configurations that runs the Solaris OS and its own processes:
■ A “bare metal” physical machine that is not configured with a virtual machine or as a hardware domain
■ A Sun Logical Domains (LDoms) guest domain
■ A Sun Logical Domains (LDoms) I/O domain
■ A hardware domain
These processes communicate with one another to form what looks like (to a network client) a single system that cooperatively provides applications, system resources, and data to users.
In a Solaris 10 environment, a global cluster is a type of cluster that is composed only of one or more global-cluster voting nodes and optionally, zero or more global-cluster non-voting nodes.
Note – A global cluster can optionally also include solaris8, solaris9, lx (linux), or native brand, non-global zones that are not nodes, but high availability containers (as resources).
A global-cluster voting node is a native brand, global zone in a global cluster that contributes votes to the total number of quorum votes, that is, membership votes in the cluster. This total determines whether the cluster has sufficient votes to continue operating. A global-cluster non-voting node is a native brand, non-global zone in a global cluster that does not contribute votes to the total number of quorum votes, that is, membership votes in the cluster.
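The "sufficient votes" test is simple majority arithmetic. The sketch below is an illustration of that rule, not an interface of the product: a cluster partition keeps operating only while the votes it holds form a strict majority of all configured quorum votes.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Return True when a cluster partition holds a strict majority
    of all configured quorum votes and so may continue operating."""
    return votes_present > total_votes / 2

# A three-vote configuration keeps running with two votes present,
# but a single surviving vote out of three must halt.
print(has_quorum(2, 3))   # True
print(has_quorum(1, 3))   # False
```

The strict-majority requirement is what prevents two partitions of the same cluster from both claiming quorum at the same time.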
In a Solaris 10 environment, a zone cluster is a type o cluster that is composed only o one or
more cluster brand, voting nodes. A zone cluster depends on, and thereore requires, a global
cluster. A global cluster does not contain a zone cluster. You cannot congure a zone cluster
without a global cluster. A zone cluster has, at most, one zone cluster node on a machine.
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
Note – A zone-cluster node continues to operate only as long as the global-cluster voting node on the same machine continues to operate. If a global-cluster voting node on a machine fails, all zone-cluster nodes on that machine fail as well.

A cluster offers several advantages over traditional single-server systems. These advantages include support for failover and scalable services, capacity for modular growth, and low entry price compared to traditional hardware fault-tolerant systems.

The goals of the Sun Cluster software are:

■ Reduce or eliminate system downtime because of software or hardware failure
■ Ensure availability of data and applications to end users, regardless of the kind of failure that would normally take down a single-server system
■ Increase application throughput by enabling services to scale to additional processors by adding nodes to the cluster
■ Provide enhanced availability of the system by enabling you to perform maintenance without shutting down the entire cluster

For more information about fault tolerance and high availability, see “Making Applications Highly Available With Sun Cluster” in Sun Cluster Overview for Solaris OS.

Refer to “High Availability FAQs” on page 99 for questions and answers on high availability.
Three Views of the Sun Cluster Software

This section describes three different views of the Sun Cluster software and the key concepts and documentation relevant to each view.

These views are typical for the following professionals:

■ Hardware installation and service personnel
■ System administrators
■ Application developers

Hardware Installation and Service View

To hardware service professionals, the Sun Cluster software looks like a collection of off-the-shelf hardware that includes servers, networks, and storage. These components are all cabled together so that every component has a backup and no single point of failure exists.
Chapter 1 • Introduction and Overview
Key Concepts – Hardware

Hardware service professionals need to understand the following cluster concepts.

■ Cluster hardware configurations and cabling
■ Installing and servicing (adding, removing, replacing):
  ■ Network interface components (adapters, junctions, cables)
  ■ Disk interface cards
  ■ Disk arrays
  ■ Disk drives
  ■ The administrative console and the console access device
■ Setting up the administrative console and console access device

More Hardware Conceptual Information

The following sections contain material relevant to the preceding key concepts:

■ “Cluster Nodes” on page 22
■ “Multihost Devices” on page 24
■ “Local Disks” on page 25
■ “Cluster Interconnect” on page 26
■ “Public Network Interfaces” on page 27
■ “Client Systems” on page 27
■ “Administrative Console” on page 28
■ “Console Access Devices” on page 28
■ “SPARC: Clustered Pair Topology” on page 30
■ “SPARC: N+1 (Star) Topology” on page 31

Sun Cluster Documentation for Hardware Professionals

The Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS includes procedures and information that are associated with hardware service concepts.

System Administrator View

To the system administrator, the Sun Cluster product is a set of Solaris hosts that share storage devices.

The system administrator sees software that performs specific tasks:

■ Specialized cluster software that is integrated with Solaris software to monitor the connectivity between Solaris hosts in the cluster
■ Specialized software that monitors the health of user application programs that are running on the cluster nodes
■ Volume management software that sets up and administers disks
■ Specialized cluster software that enables all Solaris hosts to access all storage devices, even those Solaris hosts that are not directly connected to disks
■ Specialized cluster software that enables files to appear on every Solaris host as though they were locally attached to that Solaris host

Key Concepts – System Administration

System administrators need to understand the following concepts and processes:

■ The interaction between the hardware and software components
■ The general flow of how to install and configure the cluster, including:
  ■ Installing the Solaris Operating System
  ■ Installing and configuring Sun Cluster software
  ■ Installing and configuring a volume manager
  ■ Installing and configuring application software to be cluster ready
  ■ Installing and configuring Sun Cluster data service software
■ Cluster administrative procedures for adding, removing, replacing, and servicing cluster hardware and software components
■ Configuration modifications to improve performance

More System Administrator Conceptual Information

The following sections contain material relevant to the preceding key concepts:

■ “Administrative Interfaces” on page 42
■ “Cluster Time” on page 42
■ “High-Availability Framework” on page 43
■ “Global Devices” on page 46
■ “Device Groups” on page 47
■ “Global Namespace” on page 49
■ “Cluster File Systems” on page 51
■ “Disk Path Monitoring” on page 53
■ “Data Services” on page 64

Sun Cluster Documentation for System Administrators

The following Sun Cluster documents include procedures and information associated with the system administration concepts:

■ Sun Cluster Software Installation Guide for Solaris OS
■ Sun Cluster System Administration Guide for Solaris OS
■ Sun Cluster Error Messages Guide for Solaris OS
■ Sun Cluster Release Notes for Solaris OS
Application Developer View

The Sun Cluster software provides data services for such applications as Oracle, NFS, DNS, Sun Java System Web Server, Apache Web Server (on SPARC based systems), and Sun Java System Directory Server. Data services are created by configuring off-the-shelf applications to run under control of the Sun Cluster software. The Sun Cluster software provides configuration files and management methods that start, stop, and monitor the applications. If you need to create a new failover or scalable service, you can use the Sun Cluster Application Programming Interface (API) and the Data Service Enabling Technologies API (DSET API) to develop the necessary configuration files and management methods that enable the application to run as a data service on the cluster.

Key Concepts – Application Development

Application developers need to understand the following:

■ The characteristics of their application to determine whether it can be made to run as a failover or scalable data service.
■ The Sun Cluster API, DSET API, and the “generic” data service. Developers need to determine which tool is most suitable for them to use to write programs or scripts to configure their application for the cluster environment.

More Application Developer Conceptual Information

The following sections contain material relevant to the preceding key concepts:

■ “Data Services” on page 64
■ “Resources, Resource Groups, and Resource Types” on page 74
■ Chapter 4, “Frequently Asked Questions”

Sun Cluster Documentation for Application Developers

The following Sun Cluster documents include procedures and information associated with the application developer concepts:

■ Sun Cluster Data Services Developer's Guide for Solaris OS
■ Sun Cluster Data Services Planning and Administration Guide for Solaris OS

Sun Cluster Software Tasks

All Sun Cluster software tasks require some conceptual background. The following table provides a high-level view of the tasks and the documentation that describes task steps. The concepts sections in this book describe how the concepts map to these tasks.
TABLE 1–1 Task Map: Mapping User Tasks to Documentation

Install cluster hardware – Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS

Install Solaris software on the cluster – Sun Cluster Software Installation Guide for Solaris OS

SPARC: Install Sun Management Center software – Sun Cluster Software Installation Guide for Solaris OS

Install and configure Sun Cluster software – Sun Cluster Software Installation Guide for Solaris OS

Install and configure volume management software – Sun Cluster Software Installation Guide for Solaris OS; your volume management documentation

Install and configure Sun Cluster data services – Sun Cluster Data Services Planning and Administration Guide for Solaris OS

Service cluster hardware – Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS

Administer Sun Cluster software – Sun Cluster System Administration Guide for Solaris OS

Administer volume management software – Sun Cluster System Administration Guide for Solaris OS and your volume management documentation

Administer application software – Your application documentation

Problem identification and suggested user actions – Sun Cluster Error Messages Guide for Solaris OS

Create a new data service – Sun Cluster Data Services Developer's Guide for Solaris OS
Key Concepts for Hardware Service Providers

This chapter describes the key concepts that are related to the hardware components of a Sun Cluster configuration.

This chapter covers the following topics:

■ “Sun Cluster System Hardware and Software Components” on page 21
■ “SPARC: Sun Cluster Topologies” on page 29
■ “x86: Sun Cluster Topologies” on page 38

Sun Cluster System Hardware and Software Components

This information is directed primarily to hardware service providers. These concepts can help service providers understand the relationships between the hardware components before they install, configure, or service cluster hardware. Cluster system administrators might also find this information useful as background to installing, configuring, and administering cluster software.

A cluster is composed of several hardware components, including the following:

■ Solaris hosts with local disks (unshared)
■ Multihost storage (disks are shared between Solaris hosts)
■ Removable media (tapes and CD-ROMs)
■ Cluster interconnect
■ Public network interfaces
■ Client systems
■ Administrative console
■ Console access devices

The Sun Cluster software enables you to combine these components into a variety of configurations. The following sections describe these configurations.

■ “SPARC: Sun Cluster Topologies” on page 29
■ “x86: Sun Cluster Topologies” on page 38
For an illustration of a sample two-host cluster configuration, see “Sun Cluster Hardware Environment” in Sun Cluster Overview for Solaris OS.

Cluster Nodes

In a cluster that runs on any version of the Solaris OS that was released before the Solaris 10 OS, a node is a physical machine that contributes to cluster membership and is not a quorum device. In a cluster that runs on the Solaris 10 OS, the concept of a node changes. In this environment, a node is a Solaris zone that is associated with a cluster. In this environment, a Solaris host, or simply host, is one of the following hardware or software configurations that runs the Solaris OS and its own processes:

■ A “bare metal” physical machine that is not configured with a virtual machine or as a hardware domain
■ A Sun Logical Domains (LDoms) guest domain
■ A Sun Logical Domains (LDoms) I/O domain
■ A hardware domain

Depending on your platform, Sun Cluster software supports the following configurations:

■ SPARC: Sun Cluster software supports from one to sixteen Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of SPARC based systems. See “SPARC: Sun Cluster Topologies” on page 29 for the supported configurations.
■ x86: Sun Cluster software supports from one to eight Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of x86 based systems. See “x86: Sun Cluster Topologies” on page 38 for the supported configurations.

Solaris hosts are generally attached to one or more multihost devices. Hosts that are not attached to multihost devices use the cluster file system to access the multihost devices. For example, one scalable services configuration enables hosts to service requests without being directly attached to multihost devices.

In addition, hosts in parallel database configurations share concurrent access to all the disks.

■ See “Multihost Devices” on page 24 for information about concurrent access to disks.
■ See “SPARC: Clustered Pair Topology” on page 30 and “x86: Clustered Pair Topology” on page 39 for more information about parallel database configurations.

All nodes in the cluster are grouped under a common name (the cluster name), which is used for accessing and managing the cluster.
Public network adapters attach hosts to the public networks, providing client access to the cluster.

Cluster members communicate with the other hosts in the cluster through one or more physically independent networks. This set of physically independent networks is referred to as the cluster interconnect.

Every node in the cluster is aware when another node joins or leaves the cluster. Additionally, every node in the cluster is aware of the resources that are running locally as well as the resources that are running on the other cluster nodes.

Hosts in the same cluster should have similar processing, memory, and I/O capability to enable failover to occur without significant degradation in performance. Because of the possibility of failover, every host must have enough excess capacity to support the workload of all hosts for which they are a backup or secondary.

Each host boots its own individual root (/) file system.

Software Components for Cluster Hardware Members

To function as a cluster member, a Solaris host must have the following software installed:

■ Solaris Operating System
■ Sun Cluster software
■ Data service application
■ Volume management (Solaris Volume Manager or Veritas Volume Manager)

An exception is a configuration that uses hardware redundant array of independent disks (RAID). This configuration might not require a software volume manager such as Solaris Volume Manager or Veritas Volume Manager.

■ See the Sun Cluster Software Installation Guide for Solaris OS for information about how to install the Solaris Operating System, Sun Cluster, and volume management software.
■ See the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for information about how to install and configure data services.
■ See Chapter 3, “Key Concepts for System Administrators and Application Developers,” for conceptual information about the preceding software components.

The following figure provides a high-level view of the software components that work together to create the Sun Cluster environment.
Chapter 2 • Key Concepts for Hardware Service Providers
See Chapter 4, “Frequently Asked Questions,” for questions and answers about cluster members.

Multihost Devices

Disks that can be connected to more than one Solaris host at a time are multihost devices. In the Sun Cluster environment, multihost storage makes disks highly available. Sun Cluster software requires multihost storage for two-host clusters to establish quorum. Greater than two-host clusters do not require quorum devices. For more information about quorum, see “Quorum and Quorum Devices” on page 56.

Multihost devices have the following characteristics.

■ Tolerance of single-host failures.
■ Ability to store application data, application binaries, and configuration files.
■ Protection against host failures. If clients request the data through one host and the host fails, the requests are switched over to use another host with a direct connection to the same disks.
■ Global access through a primary host that “masters” the disks, or direct concurrent access through local paths. The only application that uses direct concurrent access currently is Oracle Real Application Clusters Guard.

A volume manager provides for mirrored or RAID-5 configurations for data redundancy of the multihost devices. Currently, Sun Cluster supports Solaris Volume Manager and Veritas Volume Manager as volume managers, and the RDAC RAID-5 hardware controller on several hardware RAID platforms.

Combining multihost devices with disk mirroring and disk striping protects against both host failure and individual disk failure.

See Chapter 4, “Frequently Asked Questions,” for questions and answers about multihost storage.
[FIGURE 2–1 High-Level Relationship of Sun Cluster Software Components: within the Solaris operating environment, kernel and user space layer the data service software, volume management software, and Sun Cluster software]
Multi-Initiator SCSI

This section applies only to SCSI storage devices and not to Fibre Channel storage that is used for the multihost devices.

In a standalone (that is, non-clustered) host, the host controls the SCSI bus activities by way of the SCSI host adapter circuit that connects this host to a particular SCSI bus. This SCSI host adapter circuit is referred to as the SCSI initiator. This circuit initiates all bus activities for this SCSI bus. The default SCSI address of SCSI host adapters in Sun systems is 7.

Cluster configurations share storage between multiple hosts, using multihost devices. When the cluster storage consists of single-ended or differential SCSI devices, the configuration is referred to as multi-initiator SCSI. As this terminology implies, more than one SCSI initiator exists on the SCSI bus.

The SCSI specification requires each device on a SCSI bus to have a unique SCSI address. (The host adapter is also a device on the SCSI bus.) The default hardware configuration in a multi-initiator environment results in a conflict because all SCSI host adapters default to 7.

To resolve this conflict, on each SCSI bus, leave one of the SCSI host adapters with the SCSI address of 7, and set the other host adapters to unused SCSI addresses. Proper planning dictates that these “unused” SCSI addresses include both currently and eventually unused addresses. An example of addresses unused in the future is the addition of storage by installing new drives into empty drive slots.

In most configurations, the available SCSI address for a second host adapter is 6.

You can change the selected SCSI addresses for these host adapters by using one of the following tools to set the scsi-initiator-id property:

■ eeprom(1M)
■ The OpenBoot PROM on a SPARC based system
■ The SCSI utility that you optionally run after the BIOS boots on an x86 based system

You can set this property globally for a host or on a per-host-adapter basis. Instructions for setting a unique scsi-initiator-id for each SCSI host adapter are included in Sun Cluster 3.1 - 3.2 With SCSI JBOD Storage Device Manual for Solaris OS.
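For example, on a two-host cluster you might leave one host's adapters at the default address of 7 and move the other host's adapters to 6. The following configuration fragment is a sketch of the two global approaches named above, not a verified procedure; see the Sun Cluster 3.1 - 3.2 With SCSI JBOD Storage Device Manual for Solaris OS for the supported steps:

```shell
# From the running Solaris OS on the second host, set the property
# globally for all host adapters with eeprom(1M):
eeprom scsi-initiator-id=6

# Equivalently, from the OpenBoot PROM ok prompt on a SPARC based system:
#   ok setenv scsi-initiator-id 6
#   ok reset-all
```

Setting the property globally changes every host adapter in the host, so per-adapter settings remain necessary when a host also has adapters on non-shared buses.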
Local Disks

Local disks are the disks that are only connected to a single Solaris host. Local disks are, therefore, not protected against host failure (they are not highly available). However, all disks, including local disks, are included in the global namespace and are configured as global devices. Therefore, the disks themselves are visible from all cluster hosts.
You can make the file systems on local disks available to other hosts by placing them under a global mount point. If the host that currently has one of these global file systems mounted fails, all hosts lose access to that file system. Using a volume manager lets you mirror these disks so that a failure cannot cause these file systems to become inaccessible, but volume managers do not protect against host failure.

See the section “Global Devices” on page 46 for more information about global devices.
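As an illustration, a global mount point is typically requested through the global mount option in each host's /etc/vfstab. The entry below is only a sketch; the device group name (nfsset), metadevice (d100), and mount point (/global/nfs) are hypothetical and would match your own configuration:

```
# /etc/vfstab entry for a globally mounted UFS file system
/dev/md/nfsset/dsk/d100  /dev/md/nfsset/rdsk/d100  /global/nfs  ufs  2  yes  global,logging
```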
Removable Media

Removable media such as tape drives and CD-ROM drives are supported in a cluster. In general, you install, configure, and service these devices in the same way as in a nonclustered environment. These devices are configured as global devices in Sun Cluster, so each device can be accessed from any node in the cluster. Refer to Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS for information about installing and configuring removable media.

See the section “Global Devices” on page 46 for more information about global devices.
Cluster Interconnect

The cluster interconnect is the physical configuration of devices that is used to transfer cluster-private communications and data service communications between Solaris hosts in the cluster. Because the interconnect is used extensively for cluster-private communications, it can limit performance.

Only hosts in the cluster can be connected to the cluster interconnect. The Sun Cluster security model assumes that only cluster hosts have physical access to the cluster interconnect.

You can set up from one to six cluster interconnects in a cluster. While a single cluster interconnect reduces the number of adapter ports that are used for the private interconnect, it provides no redundancy and less availability. If a single interconnect fails, moreover, the cluster is at a higher risk of having to perform automatic recovery. Whenever possible, install two or more cluster interconnects to provide redundancy and scalability, and therefore higher availability, by avoiding a single point of failure.

The cluster interconnect consists of three hardware components: adapters, junctions, and cables. The following list describes each of these hardware components.

■ Adapters – The network interface cards that are located in each cluster host. Their names are constructed from a device name immediately followed by a physical-unit number, for example, qe2. Some adapters have only one physical network connection, but others, like the qe card, have multiple physical connections. Some adapters also contain both network interfaces and storage interfaces.
A network adapter with multiple interfaces could become a single point of failure if the entire adapter fails. For maximum availability, plan your cluster so that the only path between two hosts does not depend on a single network adapter.

■ Junctions – The switches that are located outside of the cluster hosts. Junctions perform pass-through and switching functions to enable you to connect more than two hosts. In a two-host cluster, you do not need junctions because the hosts can be directly connected to each other through redundant physical cables connected to redundant adapters on each host. Greater than two-host configurations generally require junctions.
■ Cables – The physical connections that you install either between two network adapters or between an adapter and a junction.

See Chapter 4, “Frequently Asked Questions,” for questions and answers about the cluster interconnect.
Public Network Interfaces

Clients connect to the cluster through the public network interfaces. Each network adapter card can connect to one or more public networks, depending on whether the card has multiple hardware interfaces.

You can set up Solaris hosts in the cluster to include multiple public network interface cards that perform the following functions:

■ Are configured so that multiple cards are active.
■ Serve as failover backups for one another.

If one of the adapters fails, IP network multipathing software is called to fail over the defective interface to another adapter in the group.

No special hardware considerations relate to clustering for the public network interfaces.

See Chapter 4, “Frequently Asked Questions,” for questions and answers about public networks.
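As an illustration of the failover-backup arrangement described above, a pair of public network adapters can be placed in one IP network multipathing (IPMP) group through /etc/hostname.* configuration files on Solaris 10. The adapter names (qfe0, qfe1), group name, and hostname below are hypothetical; see your Solaris IP multipathing documentation for the exact syntax on your release:

```
# /etc/hostname.qfe0 – first public network adapter, in IPMP group sc_ipmp0
clusternode1-hostname group sc_ipmp0 up

# /etc/hostname.qfe1 – second adapter in the same group, held as a standby
group sc_ipmp0 standby up
```

With this arrangement, if qfe0 fails, the IPMP daemon moves its addresses to qfe1 without client-visible reconfiguration.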
Client Systems

Client systems include machines or other hosts that access the cluster over the public network. Client-side programs use data or other services that are provided by server-side applications running on the cluster.

Client systems are not highly available. Data and applications on the cluster are highly available.

See Chapter 4, “Frequently Asked Questions,” for questions and answers about client systems.
Console Access Devices

You must have console access to all Solaris hosts in the cluster.

To gain console access, use one of the following devices:

■ The terminal concentrator that you purchased with your cluster hardware
■ The System Service Processor (SSP) on Sun Enterprise E10000 servers (for SPARC based clusters)
■ The system controller on Sun Fire servers (also for SPARC based clusters)
■ Another device that can access ttya on each host

Only one supported terminal concentrator is available from Sun, and use of the supported Sun terminal concentrator is optional. The terminal concentrator enables access to /dev/console on each host by using a TCP/IP network. The result is console-level access for each host from a remote machine anywhere on the network.

The System Service Processor (SSP) provides console access for Sun Enterprise E10000 servers. The SSP is a processor card in a machine on an Ethernet network that is configured to support the Sun Enterprise E10000 server. The SSP is the administrative console for the Sun Enterprise E10000 server. Using the Sun Enterprise E10000 Network Console feature, any machine in the network can open a host console session.

Other console access methods include other terminal concentrators, tip serial port access from another host, and dumb terminals.

Caution – You can attach a keyboard or monitor to a cluster host provided that the keyboard or monitor is supported by the base server platform. However, you cannot use that keyboard or monitor as a console device. You must redirect the console to a serial port, or depending on your machine, to the System Service Processor (SSP) and Remote System Control (RSC) by setting the appropriate OpenBoot PROM parameter.
Administrative Console

You can use a dedicated machine, known as the administrative console, to administer the active cluster. Usually, you install and run administrative tool software, such as the Cluster Control Panel (CCP) and the Sun Cluster module for the Sun Management Center product (for use with SPARC based clusters only), on the administrative console. Using cconsole under the CCP enables you to connect to more than one host console at a time. For more information about how to use the CCP, see Chapter 1, “Introduction to Administering Sun Cluster,” in Sun Cluster System Administration Guide for Solaris OS.
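For example, the multi-console capability mentioned above is typically exercised as follows from the administrative console. The cluster name sc-cluster is illustrative, and the install path assumes the standard Sun Cluster console package location:

```shell
# Start the Cluster Control Panel for a cluster:
/opt/SUNWcluster/bin/ccp sc-cluster &

# Or open console windows to every host in the cluster at once:
/opt/SUNWcluster/bin/cconsole sc-cluster &
```

Each cconsole window tracks one host's console, and a common input window sends keystrokes to all hosts simultaneously.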
The administrative console is not a cluster host. You use the administrative console for remote access to the cluster hosts, either over the public network, or optionally through a network-based terminal concentrator.

If your cluster consists of the Sun Enterprise E10000 platform, you must do the following:

■ Log in from the administrative console to the SSP.
■ Connect by using the netcon command.

Typically, you configure hosts without monitors. Then, you access the host's console through a telnet session from the administrative console. The administrative console is connected to a terminal concentrator, and from the terminal concentrator to the host's serial port. In the case of a Sun Enterprise E10000 server, you connect from the System Service Processor. See “Console Access Devices” on page 28 for more information.

Sun Cluster does not require a dedicated administrative console, but using one provides these benefits:

■ Enables centralized cluster management by grouping console and management tools on the same machine
■ Provides potentially quicker problem resolution by your hardware service provider

See Chapter 4, “Frequently Asked Questions,” for questions and answers about the administrative console.
SPARC: Sun Cluster Topologies

A topology is the connection scheme that connects the Solaris hosts in the cluster to the storage platforms that are used in a Sun Cluster environment. Sun Cluster software supports any topology that adheres to the following guidelines.

■ A Sun Cluster environment that is composed of SPARC based systems supports from one to sixteen Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of SPARC based systems.
■ A shared storage device can connect to as many hosts as the storage device supports.
■ Shared storage devices do not need to connect to all hosts of the cluster. However, these storage devices must connect to at least two hosts.

You can configure logical domains (LDoms) guest domains and LDoms I/O domains as virtual Solaris hosts. In other words, you can create a clustered pair, pair+N, N+1, and N*N cluster that consists of any combination of physical machines, LDoms I/O domains, and LDoms guest domains. You can also create clusters that consist of only LDoms guest domains, LDoms I/O domains, or any combination of the two.
Sun Cluster software does not require you to configure a cluster by using specific topologies. The following topologies are described to provide the vocabulary to discuss a cluster's connection scheme. These topologies are typical connection schemes.

■ Clustered pair
■ Pair+N
■ N+1 (star)
■ N*N (scalable)
■ LDoms Guest Domains: Cluster in a Box
■ LDoms Guest Domains: Single Cluster Spans Two Different Hosts
■ LDoms Guest Domains: Clusters Span Two Different Hosts
■ LDoms Guest Domains: Redundant I/O Domains

The following sections include sample diagrams of each topology.
SPARC: Clustered Pair Topology

A clustered pair topology is two or more pairs of Solaris hosts that operate under a single cluster administrative framework. In this configuration, failover occurs only between a pair. However, all hosts are connected by the cluster interconnect and operate under Sun Cluster software control. You might use this topology to run a parallel database application on one pair and a failover or scalable application on another pair.

Using the cluster file system, you could also have a two-pair configuration. More than two hosts can run a scalable service or parallel database, even though all the hosts are not directly connected to the disks that store the application data.

The following figure illustrates a clustered pair configuration.
FIGURE 2–2 SPARC: Clustered Pair Topology [diagram: Hosts 1–4 connected through two interconnect junctions, with shared storage attached to each pair]
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
SPARC: Pair+N Topology

The pair+N topology includes a pair of Solaris hosts that are directly connected to the following:

■ Shared storage.
■ An additional set of hosts that use the cluster interconnect to access shared storage (they have no direct connection themselves).

The following figure illustrates a pair+N topology where two of the four hosts (Host 3 and Host 4) use the cluster interconnect to access the storage. This configuration can be expanded to include additional hosts that do not have direct access to the shared storage.
SPARC: N+1 (Star) Topology

An N+1 topology includes some number of primary Solaris hosts and one secondary host. You do not have to configure the primary hosts and secondary host identically. The primary hosts actively provide application services. The secondary host need not be idle while waiting for a primary host to fail.

The secondary host is the only host in the configuration that is physically connected to all the multihost storage.

If a failure occurs on a primary host, Sun Cluster fails over the resources to the secondary host. The secondary host is where the resources function until they are switched back (either automatically or manually) to the primary host.

The secondary host must always have enough excess CPU capacity to handle the load if one of the primary hosts fails.

The following figure illustrates an N+1 configuration.
FIGURE 2–3 Pair+N Topology [diagram: Hosts 1 and 2 directly connected to shared storage; Hosts 3 and 4 attached only through the two interconnect junctions]
SPARC: N*N (Scalable) Topology

An N*N topology enables every shared storage device in the cluster to connect to every Solaris host in the cluster. This topology enables highly available applications to fail over from one host to another without service degradation. When failover occurs, the new host can access the storage device by using a local path instead of the private interconnect.

The following figure illustrates an N*N configuration.
FIGURE 2–4 SPARC: N+1 Topology [diagram: three primary hosts (Hosts 1–3) and one secondary host (Host 4), with only the secondary connected to all shared storage through the interconnect junctions]
FIGURE 2–5 SPARC: N*N Topology [diagram: Hosts 1–4 all connected through two interconnect junctions to every shared storage device]
SPARC: LDoms Guest Domains: Cluster in a Box Topology

In this logical domains (LDoms) guest domain topology, a cluster and every node within that cluster are located on the same Solaris host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. To preclude your having to include a quorum device, this configuration includes three nodes rather than only two.

In this topology, you do not need to connect each virtual switch (vsw) for the private network to a physical network because they need only communicate with each other. In this topology, cluster nodes can also share the same storage device, as all cluster nodes are located on the same host. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see “How to Install Sun Logical Domains Software and Create Domains” in Sun Cluster Software Installation Guide for Solaris OS.

This topology does not provide high availability, as all nodes in the cluster are located on the same host. However, developers and administrators might find this topology useful for testing and other non-production tasks. This topology is also called a “cluster in a box”.

The following figure illustrates a cluster in a box configuration.
SPARC: LDoms Guest Domains: Single Cluster Spans Two Different Hosts Topology

In this logical domains (LDoms) guest domain topology, a single cluster spans two different Solaris hosts and each cluster comprises one node on each host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see “How to Install Sun Logical Domains Software and Create Domains” in Sun Cluster Software Installation Guide for Solaris OS.
FIGURE 2–6 SPARC: Cluster in a Box Topology [diagram: three guest domain nodes (Nodes 1–3) on one host, connected through virtual switches (VSW 0 and VSW 1 private, VSW 2 public) to an I/O domain with a physical adapter, the public network, and storage]
The following figure illustrates a configuration in which a single cluster spans two different hosts.
SPARC: LDoms Guest Domains: Clusters Span Two Different Hosts Topology

In this logical domains (LDoms) guest domain topology, each cluster spans two different Solaris hosts and each cluster comprises one node on each host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. In this configuration, because both clusters share the same interconnect switch, you must specify a different private network address on each cluster. Otherwise, if you specify the same private network address on clusters that share an interconnect switch, the configuration fails.
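The private network address is normally chosen when you configure the cluster with scinstall. As a sketch of how a non-default address might be assigned afterward, the cluster command exposes network properties; the property names below (private_netaddr, private_netmask) and the requirement to run the command while the nodes are in noncluster mode are assumptions to verify against the cluster(1CL) man page for your release.

```shell
# Hypothetical sketch: give this cluster a non-default private network
# address so that it does not collide with another cluster that shares
# the same interconnect switch.
cluster set-netprops -p private_netaddr=172.16.4.0 -p private_netmask=255.255.254.0
```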
FIGURE 2–7 SPARC: Single Cluster Spans Two Different Hosts [diagram: one guest domain node on each of two hosts; each host's I/O domain provides private and public virtual switches and a physical adapter; the private virtual switches are joined by the cluster interconnect, and both hosts attach to the public network and shared storage]
To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see “How to Install Sun Logical Domains Software and Create Domains” in Sun Cluster Software Installation Guide for Solaris OS.

The following figure illustrates a configuration in which more than a single cluster spans two different hosts.
FIGURE 2–8 SPARC: Clusters Span Two Different Hosts [diagram: two clusters, each comprising one guest domain node on each of two hosts (Cluster 1 in Guest Domains 1 and 2, Cluster 2 in Guest Domains 3 and 4); both clusters share the same interconnect switch, with I/O domains providing the virtual switches, physical adapters, public network, and storage connections]
SPARC: LDoms Guest Domains: Redundant I/O Domains

In this logical domains (LDoms) guest domain topology, multiple I/O domains ensure that guest domains, or nodes within the cluster, continue to operate if an I/O domain fails. Each LDoms guest domain node acts the same as a Solaris host in a cluster.

In this topology, the guest domain runs IP network multipathing (IPMP) across two public networks, one through each I/O domain. Guest domains also mirror storage devices across different I/O domains. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see “How to Install Sun Logical Domains Software and Create Domains” in Sun Cluster Software Installation Guide for Solaris OS.

The following figure illustrates a configuration in which redundant I/O domains ensure that nodes within the cluster continue to operate if an I/O domain fails.
x86: Sun Cluster Topologies
A topology is the connection scheme that connects the cluster nodes to the storage platforms that are used in the cluster. Sun Cluster supports any topology that adheres to the following guidelines.

■ Sun Cluster software supports from one to eight Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of x86 based systems. See “x86: Sun Cluster Topologies” on page 38 for the supported host configurations.
■ Shared storage devices must connect to hosts.
FIGURE 2–9 SPARC: Redundant I/O Domains [diagram: each of two hosts runs a primary and an alternate I/O domain; each guest domain node runs IPMP across two public virtual switches, one through each I/O domain, and mirrors storage across the I/O domains]
Sun Cluster does not require you to configure a cluster by using specific topologies. The following clustered pair topology, which is a topology for clusters that are composed of x86 based hosts, is described to provide the vocabulary to discuss a cluster's connection scheme. This topology is a typical connection scheme.

The following section includes a sample diagram of the topology.
x86: Clustered Pair Topology

A clustered pair topology is two Solaris hosts that operate under a single cluster administrative framework. In this configuration, failover occurs only between a pair. However, all hosts are connected by the cluster interconnect and operate under Sun Cluster software control. You might use this topology to run a parallel database or a failover or scalable application on the pair.

The following figure illustrates a clustered pair configuration.
x86: N+1 (Star) Topology

An N+1 topology includes some number of primary Solaris hosts and one secondary host. You do not have to configure the primary hosts and secondary host identically. The primary hosts actively provide application services. The secondary host need not be idle while waiting for a primary host to fail.

The secondary host is the only host in the configuration that is physically connected to all the multihost storage.

If a failure occurs on a primary host, Sun Cluster fails over the resources to the secondary host. The secondary host is where the resources function until they are switched back (either automatically or manually) to the primary host.

The secondary host must always have enough excess CPU capacity to handle the load if one of the primary hosts fails.
FIGURE 2–10 x86: Clustered Pair Topology [diagram: Hosts 1 and 2 joined by private interconnects and the public network, each attached to shared storage]
The following figure illustrates an N+1 configuration.
FIGURE 2–11 x86: N+1 Topology [diagram: three primary hosts (Hosts 1–3) and one secondary host (Host 4), with only the secondary connected to all shared storage through the interconnect junctions]
Key Concepts for System Administrators and Application Developers
This chapter describes the key concepts that are related to the software components of the Sun Cluster environment. The information in this chapter is directed primarily to system administrators and application developers who use the Sun Cluster API and SDK. Cluster administrators can use this information in preparation for installing, configuring, and administering cluster software. Application developers can use the information to understand the cluster environment in which they work.
This chapter covers the following topics:

■ “Administrative Interfaces” on page 42
■ “Cluster Time” on page 42
■ “High-Availability Framework” on page 43
■ “Global Devices” on page 46
■ “Device Groups” on page 47
■ “Global Namespace” on page 49
■ “Cluster File Systems” on page 51
■ “Disk Path Monitoring” on page 53
■ “Quorum and Quorum Devices” on page 56
■ “Data Services” on page 64
■ “Developing New Data Services” on page 71
■ “Using the Cluster Interconnect for Data Service Traffic” on page 73
■ “Resources, Resource Groups, and Resource Types” on page 74
■ “Support for Solaris Zones” on page 77
■ “Service Management Facility” on page 80
■ “System Resource Usage” on page 81
■ “Data Service Project Configuration” on page 83
■ “Public Network Adapters and IP Network Multipathing” on page 92
■ “SPARC: Dynamic Reconfiguration Support” on page 94
Administrative Interfaces

You can choose how you install, configure, and administer the Sun Cluster software from several user interfaces. You can accomplish system administration tasks either through the Sun Cluster Manager graphical user interface (GUI) or through the command-line interface. On top of the command-line interface are some utilities, such as scinstall and clsetup, to simplify selected installation and configuration tasks. The Sun Cluster software also has a module that runs as part of Sun Management Center that provides a GUI to particular cluster tasks. This module is available for use in only SPARC based clusters. Refer to “Administration Tools” in Sun Cluster System Administration Guide for Solaris OS for complete descriptions of the administrative interfaces.
Cluster Time

Time between all Solaris hosts in a cluster must be synchronized. Whether you synchronize the cluster hosts with any outside time source is not important to cluster operation. The Sun Cluster software employs the Network Time Protocol (NTP) to synchronize the clocks between hosts.

In general, a change in the system clock of a fraction of a second causes no problems. However, if you run date, rdate, or xntpdate (interactively, or within cron scripts) on an active cluster, you can force a time change much larger than a fraction of a second to synchronize the system clock to the time source. This forced change might cause problems with file modification timestamps or confuse the NTP service.

When you install the Solaris Operating System on each cluster host, you have an opportunity to change the default time and date setting for the host. In general, you can accept the factory default.

When you install Sun Cluster software by using the scinstall command, one step in the process is to configure NTP for the cluster. Sun Cluster software supplies a template file, ntp.cluster (see /etc/inet/ntp.cluster on an installed cluster host), that establishes a peer relationship between all cluster hosts. One host is designated the “preferred” host. Hosts are identified by their private host names and time synchronization occurs across the cluster interconnect. For instructions about how to configure the cluster for NTP, see Chapter 2, “Installing Software on Global-Cluster Nodes,” in Sun Cluster Software Installation Guide for Solaris OS.
Alternately, you can set up one or more NTP servers outside the cluster and change the ntp.conf file to reflect that configuration.
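The peer relationship that ntp.cluster establishes can be sketched as the following fragment. The private host names follow the clusternodeN-priv convention; treat the exact lines as an illustrative sketch and compare them with /etc/inet/ntp.cluster on an installed host.

```
# Illustrative NTP peer configuration for a two-host cluster.
# Each host peers with every other host over the private interconnect,
# using the private host names.
peer clusternode1-priv prefer   # the designated "preferred" host
peer clusternode2-priv
```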
In normal operation, you should never need to adjust the time on the cluster. However, if the time was set incorrectly when you installed the Solaris Operating System and you want to change it, the procedure for doing so is included in Chapter 8, “Administering the Cluster,” in Sun Cluster System Administration Guide for Solaris OS.
High-Availability Framework
The Sun Cluster software makes all components on the “path” between users and data highly available, including network interfaces, the applications themselves, the file system, and the multihost devices. In general, a cluster component is highly available if it survives any single (software or hardware) failure in the system.

The following table shows the kinds of Sun Cluster component failures (both hardware and software) and the kinds of recovery that are built into the high-availability framework.

TABLE 3–1 Levels of Sun Cluster Failure Detection and Recovery
Failed Cluster Component    Software Recovery                        Hardware Recovery
Data service                HA API, HA framework                     Not applicable
Public network adapter      IP network multipathing                  Multiple public network adapter cards
Cluster file system         Primary and secondary replicas           Multihost devices
Mirrored multihost device   Volume management (Solaris Volume        Hardware RAID-5 (for example,
                            Manager and Veritas Volume Manager)      Sun StorEdge A3x00)
Global device               Primary and secondary replicas           Multiple paths to the device,
                                                                     cluster transport junctions
Private network             HA transport software                    Multiple private hardware-independent
                                                                     networks
Host                        CMM, failfast driver                     Multiple hosts
Zone                        HA API, HA framework                     Not applicable
Sun Cluster software's high-availability framework detects a node failure quickly and creates a new equivalent server for the framework resources on a remaining node in the cluster. At no time are all framework resources unavailable. Framework resources that are unaffected by a failed node are fully available during recovery. Furthermore, framework resources of the failed node become available as soon as they are recovered. A recovered framework resource does not have to wait for all other framework resources to complete their recovery.

Most highly available framework resources are recovered transparently to the applications (data services) that are using the resource. The semantics of framework resource access are fully preserved across node failure. The applications cannot detect that the framework resource server has been moved to another node. Failure of a single node is completely transparent to programs on remaining nodes by using the files, devices, and disk volumes that are available to this node. This transparency exists if an alternative hardware path exists to the disks from another host. An example is the use of multihost devices that have ports to multiple hosts.
Chapter 3 • Key Concepts for System Administrators and Application Developers 43
Zone Membership
Sun Cluster software also tracks zone membership by detecting when a zone boots up or halts. These changes also trigger a reconfiguration. A reconfiguration can redistribute cluster resources among the nodes in the cluster.
Cluster Membership Monitor

To ensure that data is kept safe from corruption, all nodes must reach a consistent agreement on the cluster membership. When necessary, the CMM coordinates a cluster reconfiguration of cluster services (applications) in response to a failure.

The CMM receives information about connectivity to other nodes from the cluster transport layer. The CMM uses the cluster interconnect to exchange state information during a reconfiguration.

After detecting a change in cluster membership, the CMM performs a synchronized configuration of the cluster. In a synchronized configuration, cluster resources might be redistributed, based on the new membership of the cluster.
Failfast Mechanism

The failfast mechanism detects a critical problem on either a global-cluster voting node or global-cluster non-voting node. The action that Sun Cluster takes when failfast detects a problem depends on whether the problem occurs in a voting node or a non-voting node.

If the critical problem is located in a voting node, Sun Cluster forcibly shuts down the node. Sun Cluster then removes the node from cluster membership.

If the critical problem is located in a non-voting node, Sun Cluster reboots that non-voting node.

If a node loses connectivity with other nodes, the node attempts to form a cluster with the nodes with which communication is possible. If that set of nodes does not form a quorum, Sun Cluster software halts the node and “fences” the node from the shared disks, that is, prevents the node from accessing the shared disks.

You can turn off fencing for selected disks or for all disks.
Caution – If you turn off fencing under the wrong circumstances, your data can be vulnerable to corruption during application failover. Examine this data corruption possibility carefully when you are considering turning off fencing. If your shared storage device does not support the SCSI protocol, such as a Serial Advanced Technology Attachment (SATA) disk, or if you want to allow access to the cluster's storage from hosts outside the cluster, turn off fencing.
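As a sketch of how fencing might be turned off, the following commands use a per-device default_fencing property and a cluster-wide global_fencing property; treat the exact property names, values, and device name as assumptions to verify against the cldevice(1CL) and cluster(1CL) man pages for your release.

```shell
# Hypothetical sketch: turn off fencing for one DID device ...
cldevice set -p default_fencing=nofencing d5

# ... or turn off fencing for all shared disks cluster-wide
cluster set -p global_fencing=nofencing
```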
If one or more cluster-specific daemons die, Sun Cluster software declares that a critical problem has occurred. Sun Cluster software runs cluster-specific daemons on both voting nodes and non-voting nodes. If a critical problem occurs, Sun Cluster either shuts down and removes the node or reboots the non-voting node where the problem occurred.

When a cluster-specific daemon that runs on a non-voting node fails, a message similar to the following is displayed on the console.
cl_runtime: NOTICE: Failfast: Aborting because "pmfd" died in zone "zone4" (zone id 3)
35 seconds ago.
When a cluster-specific daemon that runs on a voting node fails and the node panics, a message similar to the following is displayed on the console.
panic[cpu1]/thread=2a10007fcc0: Failfast: Aborting because "pmfd" died in zone "global" (zone id 0)
35 seconds ago.
409b8 cl_runtime:__0FZsc_syslog_msg_log_no_argsPviTCPCcTB+48 (70f900, 30, 70df54, 407acc, 0)
%l0-7: 1006c80 000000a 000000a 10093bc 406d3c80 7110340 0000000 4001 fbf0
After the panic, the Solaris host might reboot and the node might attempt to rejoin the cluster. Alternatively, if the cluster is composed of SPARC based systems, the host might remain at the OpenBoot PROM (OBP) prompt. The next action of the host is determined by the setting of the auto-boot? parameter. You can set auto-boot? with the eeprom command, at the OpenBoot PROM ok prompt. See the eeprom(1M) man page.
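For example, the auto-boot? parameter can be read and set from a running Solaris host with the eeprom command; the quotes keep the shell from interpreting the ? character.

```shell
# Display the current setting
eeprom "auto-boot?"

# Have the host boot automatically after a panic instead of
# stopping at the OpenBoot PROM ok prompt
eeprom "auto-boot?=true"
```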
Cluster Configuration Repository (CCR)

The CCR uses a two-phase commit algorithm for updates: An update must be successfully completed on all cluster members or the update is rolled back. The CCR uses the cluster interconnect to apply the distributed updates.

Caution – Although the CCR consists of text files, never edit the CCR files yourself. Each file contains a checksum record to ensure consistency between nodes. Updating CCR files yourself can cause a node or the entire cluster to stop working.
The CCR relies on the CMM to guarantee that a cluster is running only when quorum is established. The CCR is responsible for verifying data consistency across the cluster, performing recovery as necessary, and facilitating updates to the data.
Global Devices

The Sun Cluster software uses global devices to provide cluster-wide, highly available access to any device in a cluster, from any node, without regard to where the device is physically attached. In general, if a node fails while providing access to a global device, the Sun Cluster software automatically discovers another path to the device. The Sun Cluster software then redirects the access to that path. Sun Cluster global devices include disks, CD-ROMs, and tapes. However, the only multiported global devices that Sun Cluster software supports are disks. Consequently, CD-ROM and tape devices are not currently highly available devices. The local disks on each server are also not multiported, and thus are not highly available devices.

The cluster automatically assigns unique IDs to each disk, CD-ROM, and tape device in the cluster. This assignment enables consistent access to each device from any node in the cluster. The global device namespace is held in the /dev/global directory. See “Global Namespace” on page 49 for more information.

Multiported global devices provide more than one path to a device. Because multihost disks are part of a device group that is hosted by more than one Solaris host, the multihost disks are made highly available.
Device IDs and DID Pseudo Driver

The Sun Cluster software manages global devices through a construct known as the DID pseudo driver. This driver is used to automatically assign unique IDs to every device in the cluster, including multihost disks, tape drives, and CD-ROMs.

The DID pseudo driver is an integral part of the global device access feature of the cluster. The DID driver probes all nodes of the cluster, builds a list of unique devices, and assigns each device a unique major and a minor number that are consistent on all nodes of the cluster. Access to the global devices is performed by using the unique device ID instead of the traditional Solaris device IDs, such as c0t0d0 for a disk.

This approach ensures that any application that accesses disks (such as a volume manager or applications that use raw devices) uses a consistent path across the cluster. This consistency is especially important for multihost disks, because the local major and minor numbers for each device can vary from Solaris host to Solaris host, thus changing the Solaris device naming conventions as well. For example, Host1 might identify a multihost disk as c1t2d0, and Host2 might identify the same disk completely differently, as c3t2d0. The DID driver assigns a global name, such as d10, that the hosts use instead, giving each host a consistent mapping to the multihost disk.
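The mapping from per-host Solaris device names to DID names can be inspected with the cldevice command. The following session is an illustrative sketch (the host names, controller numbers, and DID instance are hypothetical, not captured output):

```shell
# List DID devices and the per-host paths each one maps to
cldevice list -v
# DID Device  Full Device Path
# ----------  ----------------
# d10         Host1:/dev/rdsk/c1t2d0
# d10         Host2:/dev/rdsk/c3t2d0
```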
You update and administer device IDs with the cldevice command. See the cldevice(1CL) man page.
Device Groups

In the Sun Cluster software, all multihost devices must be under control of the Sun Cluster software. You first create volume manager disk groups, either Solaris Volume Manager disk sets or Veritas Volume Manager disk groups, on the multihost disks. Then, you register the volume manager disk groups as device groups. A device group is a type of global device. In addition, the Sun Cluster software automatically creates a raw device group for each disk and tape device in the cluster. However, these cluster device groups remain in an offline state until you access them as global devices.

Registration provides the Sun Cluster software information about which Solaris hosts have a path to specific volume manager disk groups. At this point, the volume manager disk groups become globally accessible within the cluster. If more than one host can write to (master) a device group, the data stored in that device group becomes highly available. The highly available device group can be used to contain cluster file systems.

Note – Device groups are independent of resource groups. One node can master a resource group (representing a group of data service processes). Another node can master the disk groups that are being accessed by the data services. However, the best practice is to keep on the same node the device group that stores a particular application's data and the resource group that contains the application's resources (the application daemon). Refer to “Relationship Between Resource Groups and Device Groups” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information about the association between device groups and resource groups.
When a node uses a device group, the volume manager disk group becomes “global” because it provides multipath support to the underlying disks. Each cluster host that is physically attached to the multihost disks provides a path to the device group.

Device Group Failover

Because a disk enclosure is connected to more than one Solaris host, all device groups in that enclosure are accessible through an alternate path if the host currently mastering the device group fails. The failure of the host that is mastering the device group does not affect access to the device group except for the time it takes to perform the recovery and consistency checks. During this time, all requests are blocked (transparently to the application) until the system makes the device group available.
Multiported Device Groups

This section describes device group properties that enable you to balance performance and availability in a multiported disk configuration. Sun Cluster software provides two properties that configure a multiported disk configuration: preferenced and numsecondaries. You can control the order in which nodes attempt to assume control if a failover occurs by using the preferenced property. Use the numsecondaries property to set the number of secondary nodes for a device group that you want.

A highly available service is considered down when the primary node fails and when no eligible secondary nodes can be promoted to primary nodes. If service failover occurs and the preferenced property is true, then the nodes follow the order in the node list to select a secondary node. The node list defines the order in which nodes attempt to assume primary control or transition from spare to secondary. You can dynamically change the preference of a device service by using the clsetup command. The preference that is associated with dependent service providers, for example a global file system, is identical to the preference of the device service.

Secondary nodes are check-pointed by the primary node during normal operation. In a multiported disk configuration, checkpointing each secondary node causes cluster
FIGURE 3–1 Device Group Before and After Failover [diagram: before failover, Host 1 is the primary and handles client and data access to the multihost disk device groups; after failover, Host 2 is promoted to primary and handles the access]
performance degradation and memory overhead. Spare node support was implemented to minimize the performance degradation and memory overhead that checkpointing caused. By default, your device group has one primary and one secondary. The remaining available provider nodes become spares. If failover occurs, the secondary becomes primary and the node highest in priority on the node list becomes secondary.
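The node-list ordering described above can be sketched as a small shell function. This is only an illustration of the selection rule (host names are hypothetical); in a real cluster the replica framework, not a user script, performs the promotion.

```shell
# Sketch: after a failover, the old secondary becomes primary, and the
# new secondary is the highest-priority operational node in the ordered
# node list that is not the new primary.
next_secondary() {
    # $1 = new primary, $2 = failed node, remaining args = node list
    # in preference order
    new_primary=$1
    down=$2
    shift 2
    for node in "$@"; do
        if [ "$node" != "$new_primary" ] && [ "$node" != "$down" ]; then
            echo "$node"
            return 0
        fi
    done
    return 1
}

# Node list: host1 host2 host3 host4. host1 (the primary) fails, so the
# secondary host2 becomes primary, and the highest-priority remaining
# node becomes the new secondary.
next_secondary host2 host1 host1 host2 host3 host4   # prints host3
```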
You can set the number of secondary nodes that you want to any integer between one and the number of operational nonprimary provider nodes in the device group.

Note – If you are using Solaris Volume Manager, you must create the device group before you can set the numsecondaries property to a number other than the default.

The default number of secondaries for device services is 1. The actual number of secondary providers that is maintained by the replica framework is the number that you want, unless the number of operational nonprimary providers is less than the number that you want. You must alter the numsecondaries property and double-check the node list if you are adding or removing nodes from your configuration. Maintaining the node list and number of secondaries prevents conflict between the configured number of secondaries and the actual number that is
allowed by the ramework.■ (Solaris Volume Manager) Use the metaset command or Solaris Volume Manager device
groups, in conjunction with the preferenced and numsecondaries property settings, tomanage the addition o nodes to and the removal o nodes rom your conguration.
■ (Veritas Volume Manager) Use the cldevicegroup command or VxVM device groups, inconjunction with the preferenced and numsecondaries property settings, to manage theaddition o nodes to and the removal o nodes rom your conguration.
■ Reer to “Overview o Administering Cluster File Systems” in SunCluster System Administration Guide or Solaris OS or procedural inormation about changing devicegroup properties.
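The rule above — the framework maintains the desired number of secondaries unless fewer operational nonprimary providers exist — reduces to a single comparison. A minimal sketch (the function name is illustrative, not a Sun Cluster API):

```python
def actual_secondaries(numsecondaries, operational_nonprimary):
    """Number of secondaries the replica framework actually maintains:
    the desired numsecondaries, capped by the number of operational
    nonprimary provider nodes (illustrative arithmetic only)."""
    return min(numsecondaries, operational_nonprimary)
```

For example, with the default numsecondaries of 1 and three operational nonprimary providers, one secondary is maintained; with numsecondaries set to 3 but only two operational nonprimary providers, only two secondaries can be maintained.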
Global Namespace

The Sun Cluster software mechanism that enables global devices is the global namespace. The global namespace includes the /dev/global/ hierarchy as well as the volume manager namespaces. The global namespace reflects both multihost disks and local disks (and any other cluster device, such as CD-ROMs and tapes), and provides multiple failover paths to the multihost disks. Each Solaris host that is physically connected to multihost disks provides a path to the storage for any node in the cluster.
Normally, for Solaris Volume Manager, the volume manager namespaces are located in the /dev/md/diskset/dsk (and rdsk) directories. For Veritas VxVM, the volume manager namespaces are located in the /dev/vx/dsk/disk-group and /dev/vx/rdsk/disk-group directories. These namespaces consist of directories for each Solaris Volume Manager disk set and each VxVM disk group imported throughout the cluster, respectively. Each of these directories contains a device node for each metadevice or volume in that disk set or disk group.

Chapter 3 • Key Concepts for System Administrators and Application Developers    49
In the Sun Cluster software, each device node in the local volume manager namespace is replaced by a symbolic link to a device node in the /global/.devices/node@nodeID file system. nodeID is an integer that represents the nodes in the cluster. Sun Cluster software continues to present the volume manager devices, as symbolic links, in their standard locations as well. Both the global namespace and standard volume manager namespace are available from any cluster node.
The advantages of the global namespace include the following:
■ Each host remains fairly independent, with little change in the device administration model.
■ Devices can be selectively made global.
■ Third-party link generators continue to work.
■ Given a local device name, an easy mapping is provided to obtain its global name.
Local and Global Namespaces Example

The following table shows the mappings between the local and global namespaces for a multihost disk, c0t0d0s0.
TABLE 3–2 Local and Global Namespace Mappings
Component or Path Local Host Namespace Global Namespace
Solaris logical name /dev/dsk/c0t0d0s0 /global/.devices/node@nodeID/dev/dsk/c0t0d0s0
DID name /dev/did/dsk/d0s0 /global/.devices/node@nodeID/dev/did/dsk/d0s0
Solaris Volume Manager /dev/md/diskset/dsk/d0 /global/.devices/node@nodeID/dev/md/diskset/dsk/d0
Veritas Volume Manager /dev/vx/dsk/disk-group/v0 /global/.devices/node@nodeID/dev/vx/dsk/disk-group/v0
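Each row of the table follows one rewrite rule: the global name is the local name prefixed with the per-node /global/.devices/node@nodeID directory. A hypothetical helper (not part of Sun Cluster) makes the pattern explicit:

```python
def to_global_name(local_path, node_id):
    """Map a local device path to its global-namespace equivalent by
    prefixing the per-node /global/.devices directory, as in Table 3-2
    (illustrative helper only)."""
    return "/global/.devices/node@%d%s" % (node_id, local_path)
```

For example, to_global_name("/dev/dsk/c0t0d0s0", 1) yields the first row's global name for node ID 1.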
The global namespace is automatically generated on installation and updated with every
reconfiguration reboot. You can also generate the global namespace by using the cldevice command. See the cldevice(1CL) man page.
Cluster File Systems
The cluster file system has the following features:
■ File access locations are transparent. A process can open a file that is located anywhere in the system. Processes on all Solaris hosts can use the same path name to locate a file.
Note – When the cluster file system reads files, it does not update the access time on those files.
■ Coherency protocols are used to preserve the UNIX file access semantics even if the file is accessed concurrently from multiple nodes.
■ Extensive caching is used along with zero-copy bulk I/O movement to move file data efficiently.
■ The cluster file system provides highly available, advisory file-locking functionality by using the fcntl command interfaces. Applications that run on multiple cluster nodes can synchronize access to data by using advisory file locking on a cluster file system. File locks are recovered immediately from nodes that leave the cluster, and from applications that fail while holding locks.
■ Continuous access to data is ensured, even when failures occur. Applications are not affected by failures if a path to disks is still operational. This guarantee is maintained for raw disk access and all file system operations.
■ Cluster file systems are independent from the underlying file system and volume management software. Cluster file systems make any supported on-disk file system global.
You can mount a file system on a global device globally with mount -g or locally with mount.
Programs can access a file in a cluster file system from any node in the cluster through the same file name (for example, /global/foo).
A cluster file system is mounted on all cluster members. You cannot mount a cluster file system on a subset of cluster members.
A cluster file system is not a distinct file system type. Clients verify the underlying file system (for example, UFS).
Using Cluster File Systems
In the Sun Cluster software, all multihost disks are placed into device groups, which can be Solaris Volume Manager disk sets, VxVM disk groups, or individual disks that are not under control of a software-based volume manager.
For a cluster file system to be highly available, the underlying disk storage must be connected to more than one Solaris host. Therefore, a local file system (a file system that is stored on a host's local disk) that is made into a cluster file system is not highly available.
You can mount cluster file systems as you would mount file systems:
■ Manually. Use the mount command and the -g or -o global mount options to mount the cluster file system from the command line, for example:
SPARC: # mount -g /dev/global/dsk/d0s0 /global/oracle/data
■ Automatically. Create an entry in the /etc/vfstab file with a global mount option to mount the cluster file system at boot. You then create a mount point under the /global directory on all hosts. The directory /global is a recommended location, not a requirement. Here's a sample line for a cluster file system from an /etc/vfstab file:
SPARC: /dev/md/oracle/dsk/d1 /dev/md/oracle/rdsk/d1 /global/oracle/data ufs 2 yes global,logging
Note – While Sun Cluster software does not impose a naming policy for cluster file systems, you can ease administration by creating a mount point for all cluster file systems under the same directory, such as /global/disk-group. See Sun Cluster 3.1 9/04 Software Collection for Solaris OS (SPARC Platform Edition) and Sun Cluster System Administration Guide for Solaris OS for more information.
HAStoragePlus Resource Type
The HAStoragePlus resource type is designed to make local and global file system configurations highly available. You can use the HAStoragePlus resource type to integrate your local or global file system into the Sun Cluster environment and make the file system highly available.
You can use the HAStoragePlus resource type to make a file system available to a global-cluster non-voting node. To enable the HAStoragePlus resource type to do this, you must create a mount point on the global-cluster voting node and in the global-cluster non-voting node. The HAStoragePlus resource type makes the file system available to the global-cluster non-voting node by mounting the file system in the global-cluster voting node. The resource type then performs a loopback mount in the global-cluster non-voting node.
Note – Sun Cluster systems support the following cluster file systems:

■ Solaris ZFS
■ UNIX file system (UFS)
■ Sun StorEdge QFS file system and Sun QFS Shared file system
■ Sun Cluster Proxy file system (PxFS)
■ Veritas file system (VxFS)
The HAStoragePlus resource type provides additional file system capabilities such as checks, mounts, and forced unmounts. These capabilities enable Sun Cluster to fail over local file systems. In order to fail over, the local file system must reside on global disk groups with affinity switchovers enabled.
See “Enabling Highly Available Local File Systems” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for information about how to use the HAStoragePlus resource type.
You can also use the HAStoragePlus resource type to synchronize the startup of resources and device groups on which the resources depend. For more information, see “Resources, Resource Groups, and Resource Types” on page 74.
syncdir Mount Option

You can use the syncdir mount option for cluster file systems that use UFS as the underlying file system. However, performance significantly improves if you do not specify syncdir. If you specify syncdir, the writes are guaranteed to be POSIX compliant. If you do not specify syncdir, you experience the same behavior as in NFS file systems. For example, without syncdir, you might not discover an out-of-space condition until you close a file. With syncdir (and POSIX behavior), the out-of-space condition would have been discovered during the write operation. The cases in which you might have problems if you do not specify syncdir are rare.
If you are using a SPARC based cluster, VxFS does not have a mount option that is equivalent to the syncdir mount option for UFS. VxFS behavior is the same as for UFS when the syncdir mount option is not specified.
See “File Systems FAQs” on page 100 for frequently asked questions about global devices and cluster file systems.
Disk Path Monitoring

The current release of Sun Cluster software supports disk path monitoring (DPM). This section provides conceptual information about DPM, the DPM daemon, and administration tools that you use to monitor disk paths. Refer to Sun Cluster System Administration Guide for Solaris OS for procedural information about how to monitor, unmonitor, and check the status of disk paths.
DPM Overview
DPM improves the overall reliability of failover and switchover by monitoring secondary disk path availability. Use the cldevice command to verify the availability of the disk path that is used by a resource before the resource is switched. Options that are provided with the cldevice command enable you to monitor disk paths to a single Solaris host or to all Solaris hosts in the cluster. See the cldevice(1CL) man page for more information about command-line options.
The following table describes the default location for installation of DPM components.
Component    Location
Daemon /usr/cluster/lib/sc/scdpmd
Command-line interace /usr/cluster/bin/cldevice
Daemon status le (created at runtime) /var/run/cluster/scdpm.status
A multithreaded DPM daemon runs on each host. The DPM daemon (scdpmd) is started by an rc.d script when a host boots. If a problem occurs, the daemon is managed by pmfd and restarts automatically. The following list describes how the scdpmd works on initial startup.
Note – At startup, the status for each disk path is initialized to UNKNOWN.
1. The DPM daemon gathers disk path and node name information from the previous status file or from the CCR database. See “Cluster Configuration Repository (CCR)” on page 45 for more information about the CCR. After a DPM daemon is started, you can force the daemon to read the list of monitored disks from a specified file name.
2. The DPM daemon initializes the communication interface to respond to requests from components that are external to the daemon, such as the command-line interface.
3. The DPM daemon pings each disk path in the monitored list every 10 minutes by using scsi_inquiry commands. Each entry is locked to prevent the communication interface access to the content of an entry that is being modified.
4. The DPM daemon notifies the Sun Cluster Event Framework and logs the new status of the path through the UNIX syslogd command. See the syslogd(1M) man page.
Note – All errors that are related to the daemon are reported by pmfd. All the functions from the API return 0 on success and -1 for any failure.
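The startup and probing behavior in the list above can be modeled in a few lines. The class below is a toy illustration of the bookkeeping only (statuses start at UNKNOWN and are updated after each probe); it is not the scdpmd implementation:

```python
class DiskPathMonitor:
    """Toy model of DPM status bookkeeping (illustrative only)."""

    def __init__(self, monitored_paths):
        # At startup, the status for each disk path is initialized
        # to UNKNOWN, as the note above describes.
        self.status = {path: "UNKNOWN" for path in monitored_paths}

    def record_probe(self, path, reachable):
        """Record one probe result; return True when the status changed
        (a change is what the daemon would report through syslogd)."""
        new_status = "OK" if reachable else "FAIL"
        changed = self.status[path] != new_status
        self.status[path] = new_status
        return changed
```

A probe that flips a path from UNKNOWN to OK is a status change worth logging; a second identical probe is not.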
The DPM daemon monitors the availability of the logical path that is visible through multipath drivers such as Solaris I/O multipathing (MPxIO), which was formerly named Sun StorEdge Traffic Manager, Sun StorEdge 9900 Dynamic Link Manager, and EMC PowerPath. The individual physical paths that are managed by these drivers are not monitored because the multipath driver masks individual failures from the DPM daemon.
Monitoring Disk Paths
This section describes two methods for monitoring disk paths in your cluster. The first method is provided by the cldevice command. Use this command to monitor, unmonitor, or display the status of disk paths in your cluster. You can also use this command to print a list of faulted disks and to monitor disk paths from a file. See the cldevice(1CL) man page.
The second method for monitoring disk paths in your cluster is provided by the Sun Cluster Manager graphical user interface (GUI). Sun Cluster Manager provides a topological view of the monitored disk paths in your cluster. The view is updated every 10 minutes to provide information about the number of failed pings. Use the information that is provided by the Sun Cluster Manager GUI in conjunction with the cldevice command to administer disk paths. See Chapter 12, “Administering Sun Cluster With the Graphical User Interfaces,” in Sun Cluster System Administration Guide for Solaris OS for information about Sun Cluster Manager.
Using the cldevice Command to Monitor and Administer Disk Paths
The cldevice command enables you to perform the following tasks:
■ Monitor a new disk path
■ Unmonitor a disk path
■ Reread the configuration data from the CCR database
■ Read the disks to monitor or unmonitor from a specified file
■ Report the status of a disk path or all disk paths in the cluster
■ Print all the disk paths that are accessible from a node
Issue the cldevice command with the disk path argument from any active node to perform DPM administration tasks on the cluster. The disk path argument consists of a node name and a disk name. The node name is not required. If you do not specify a node name, all nodes are affected by default. The following table describes naming conventions for the disk path.
Note – Always specify a global disk path name rather than a UNIX disk path name because a global disk path name is consistent throughout a cluster. A UNIX disk path name is not. For example, the disk path name can be c1t0d0 on one node and c2t0d0 on another node. To determine a global disk path name for a device that is connected to a node, use the cldevice list command before issuing DPM commands. See the cldevice(1CL) man page.
TABLE 3–3 Sample Disk Path Names
Name Type Sample Disk Path Name Description
Global disk path    schost-1:/dev/did/dsk/d1       Disk path d1 on the schost-1 node
                    all:d1                         Disk path d1 on all nodes in the cluster
UNIX disk path      schost-1:/dev/rdsk/c0t0d0s0    Disk path c0t0d0s0 on the schost-1 node
                    schost-1:all                   All disk paths on the schost-1 node
All disk paths      all:all                        All disk paths on all nodes of the cluster
Using Sun Cluster Manager to Monitor Disk Paths
Sun Cluster Manager enables you to perform the following basic DPM administration tasks:
■ Monitor a disk path
■ Unmonitor a disk path
■ View the status o all monitored disk paths in the cluster
■ Enable or disable the automatic rebooting of a Solaris host when all monitored disk paths fail
The Sun Cluster Manager online help provides procedural information about how to administer disk paths.
Using the clnode set Command to Manage Disk Path Failure
You use the clnode set command to enable and disable the automatic rebooting of a node when all monitored disk paths fail. You can also use Sun Cluster Manager to perform these tasks.
Quorum and Quorum Devices

This section contains the following topics:

■ “About Quorum Vote Counts” on page 58
■ “About Quorum Configurations” on page 58
■ “Adhering to Quorum Device Requirements” on page 59
■ “Adhering to Quorum Device Best Practices” on page 60
■ “Recommended Quorum Configurations” on page 61
■ “Atypical Quorum Configurations” on page 62
■ “Bad Quorum Configurations” on page 63
Note – For a list of the specific devices that Sun Cluster software supports as quorum devices, contact your Sun service provider.
Because cluster nodes share data and resources, a cluster must never split into separate partitions that are active at the same time because multiple active partitions might cause data corruption. The Cluster Membership Monitor (CMM) and quorum algorithm guarantee that, at most, one instance of the same cluster is operational at any time, even if the cluster interconnect is partitioned.
For an introduction to quorum and CMM, see “Cluster Membership” in Sun Cluster Overview for Solaris OS.
Two types of problems arise from cluster partitions:
■ Split brain
■ Amnesia
Split brain occurs when the cluster interconnect between nodes is lost and the cluster becomes partitioned into subclusters. Each partition “believes” that it is the only partition because the nodes in one partition cannot communicate with the node or nodes in the other partition.
Amnesia occurs when the cluster restarts after a shutdown with cluster configuration data that is older than the data was at the time of the shutdown. This problem can occur when you start the cluster on a node that was not in the last functioning cluster partition.
Sun Cluster software avoids split brain and amnesia by:

■ Assigning each node one vote
■ Mandating a majority of votes for an operational cluster
A partition with the majority of votes gains quorum and is allowed to operate. This majority vote mechanism prevents split brain and amnesia when more than two nodes are configured in a cluster. However, counting node votes alone is not sufficient when more than two nodes are configured in a cluster. In a two-host cluster, a majority is two. If such a two-host cluster becomes partitioned, an external vote is needed for either partition to gain quorum. This external vote is provided by a quorum device.
About Quorum Vote Counts
Use the clquorum show command to determine the following information:
■ Total configured votes
■ Current present votes
■ Votes required for quorum
See the cluster(1CL) man page.
Both nodes and quorum devices contribute votes to the cluster to form quorum.
A node contributes votes depending on the node's state:
■ A node has a vote count of one when it boots and becomes a cluster member.
■ A node has a vote count of zero when the node is being installed.
■ A node has a vote count of zero when a system administrator places the node into maintenance state.
Quorum devices contribute votes that are based on the number of votes that are connected to the device. When you configure a quorum device, Sun Cluster software assigns the quorum device a vote count of N - 1, where N is the number of connected votes to the quorum device. For example, a quorum device that is connected to two nodes with nonzero vote counts has a quorum count of one (two minus one).
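The vote arithmetic just described can be checked with two small helpers. These are illustrative sketches for working through the numbers in this section's figures, not Sun Cluster code:

```python
def quorum_device_votes(connected_node_votes):
    """A quorum device is assigned N - 1 votes, where N is the number
    of connected nodes with nonzero vote counts (sketch only)."""
    return connected_node_votes - 1

def votes_required(total_configured_votes):
    """Quorum requires a majority of the total configured votes."""
    return total_configured_votes // 2 + 1
```

For instance, a two-host cluster with one shared quorum device has 2 + 1 = 3 total configured votes and needs 2 votes for quorum, matching the two-host configuration in Figure 3–2.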
A quorum device contributes votes if one of the following two conditions is true:
■ At least one of the nodes to which the quorum device is currently attached is a cluster member.
■ At least one of the hosts to which the quorum device is currently attached is booting, and that host was a member of the last cluster partition to own the quorum device.
You configure quorum devices during the cluster installation, or afterwards, by using the procedures that are described in Chapter 6, “Administering Quorum,” in Sun Cluster System Administration Guide for Solaris OS.
About Quorum Configurations

The following list contains facts about quorum configurations:
■ Quorum devices can contain user data.
■ In an N+1 configuration where N quorum devices are each connected to one of the 1 through N Solaris hosts and the N+1 Solaris host, the cluster survives the death of either all 1 through N Solaris hosts or any of the N/2 Solaris hosts. This availability assumes that the quorum device is functioning correctly.
■ In an N-host configuration where a single quorum device connects to all hosts, the cluster can survive the death of any of the N - 1 hosts. This availability assumes that the quorum device is functioning correctly.
■ In an N-host configuration where a single quorum device connects to all hosts, the cluster can survive the failure of the quorum device if all cluster hosts are available.
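The survivability claims in this list are simple majority arithmetic, and can be checked with a sketch (illustrative only, not Sun Cluster code):

```python
def partition_survives(partition_votes, total_configured_votes):
    """A partition keeps quorum only when it holds a majority of the
    total configured votes (illustrative only)."""
    return partition_votes >= total_configured_votes // 2 + 1

# Four-host cluster with a single quorum device connected to all
# hosts: the device carries 4 - 1 = 3 votes, so total = 4 + 3 = 7 and
# the required majority is 4.  One surviving host plus the device
# holds 1 + 3 = 4 votes and keeps quorum; three hosts without the
# device hold only 3 votes and do not.
```

This matches both bullets: any single surviving host plus the device keeps quorum, while the device itself can fail only if all hosts remain available (4 of 7 votes).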
For examples of quorum configurations to avoid, see “Bad Quorum Configurations” on page 63. For examples of recommended quorum configurations, see “Recommended Quorum Configurations” on page 61.
Adhering to Quorum Device Requirements

Ensure that Sun Cluster software supports your specific device as a quorum device. If you ignore this requirement, you might compromise your cluster's availability.
Note – For a list of the specific devices that Sun Cluster software supports as quorum devices, contact your Sun service provider.
Sun Cluster software supports the following types of quorum devices:
■ Multihosted shared disks that support SCSI-3 PGR reservations.
■ Dual-hosted shared disks that support SCSI-2 reservations.
■ A Network-Attached Storage (NAS) device from Sun Microsystems, Incorporated or from Network Appliance, Incorporated.
■ A quorum server process that runs on the quorum server machine.
■ Any shared disk, provided that you have turned off fencing for this disk, and are therefore using software quorum. Software quorum is a protocol developed by Sun Microsystems that emulates a form of SCSI Persistent Group Reservations (PGR).
Caution – If you are using disks that do not support SCSI, such as Serial Advanced Technology Attachment (SATA) disks, turn off fencing.
Note – You cannot use a replicated device as a quorum device.
In a two-host configuration, you must configure at least one quorum device to ensure that a single host can continue if the other host fails. See Figure 3–2.
For examples of quorum configurations to avoid, see “Bad Quorum Configurations” on page 63. For examples of recommended quorum configurations, see “Recommended Quorum Configurations” on page 61.
Adhering to Quorum Device Best Practices
Use the following information to evaluate the best quorum configuration for your topology:
■ Do you have a device that is capable of being connected to all Solaris hosts of the cluster?
■ If yes, configure that device as your one quorum device. You do not need to configure another quorum device because your configuration is the most optimal configuration.
Caution – If you ignore this requirement and add another quorum device, the additional quorum device reduces your cluster's availability.
■ If no, configure your dual-ported device or devices.
■ Ensure that the total number of votes contributed by quorum devices is strictly less than the total number of votes contributed by nodes. Otherwise, your nodes cannot form a cluster if all disks are unavailable, even if all nodes are functioning.
Note – In particular environments, you might want to reduce overall cluster availability to meet your needs. In these situations, you can ignore this best practice. However, not adhering to this best practice decreases overall availability. For example, in the configuration that is outlined in “Atypical Quorum Configurations” on page 62 the cluster is less available: the quorum votes exceed the node votes. In a cluster, if access to the shared storage between Host A and Host B is lost, the entire cluster fails.
See “Atypical Quorum Configurations” on page 62 for the exception to this best practice.
■ Specify a quorum device between every pair of hosts that shares access to a storage device. This quorum configuration speeds the fencing process. See “Quorum in Greater Than Two-Host Configurations” on page 61.
■ In general, if the addition of a quorum device makes the total cluster vote even, the total cluster availability decreases.
■ Quorum devices slightly slow reconfigurations after a node joins or a node dies. Therefore, do not add more quorum devices than are necessary.
For examples of quorum configurations to avoid, see “Bad Quorum Configurations” on page 63. For examples of recommended quorum configurations, see “Recommended Quorum Configurations” on page 61.
Recommended Quorum Configurations
This section shows examples of quorum configurations that are recommended. For examples of quorum configurations you should avoid, see “Bad Quorum Configurations” on page 63.
Quorum in Two-Host Configurations

Two quorum votes are required for a two-host cluster to form. These two votes can derive from the two cluster hosts, or from just one host and a quorum device.
Quorum in Greater Than Two-Host Configurations

Quorum devices are not required when a cluster includes more than two hosts, as the cluster survives failures of a single host without a quorum device. However, under these conditions, you cannot start the cluster without a majority of hosts in the cluster.

You can add a quorum device to a cluster that includes more than two hosts. A partition can survive as a cluster when that partition has a majority of quorum votes, including the votes of the hosts and the quorum devices. Consequently, when adding a quorum device, consider the possible host and quorum device failures when choosing whether and where to configure quorum devices.
FIGURE 3–2 Two-Host Configuration
[Figure: Host A and Host B, one vote each, share one quorum device with one vote. Total configured votes: 3. Votes required for quorum: 2.]
[Figure: recommended four-host configuration. Host A through Host D, one vote each, with quorum devices. Total configured votes: 6. Votes required for quorum: 4.]
Atypical Quorum Configurations
Figure 3–3 assumes you are running mission-critical applications (Oracle database, for example) on Host A and Host B. If Host A and Host B are unavailable and cannot access shared data, you might want the entire cluster to be down. Otherwise, this configuration is suboptimal because it does not provide high availability.

For information about the best practice to which this exception relates, see “Adhering to Quorum Device Best Practices” on page 60.
[Figure: recommended three-host configurations, each with total configured votes: 5 and votes required for quorum: 3. In one configuration, applications are usually configured to run on Host A and Host B, with Host C as a hot spare. In another, the combination of any one or more hosts and the quorum device can form a cluster. In another, each pair of hosts shares a quorum device, and each pair must be available for either pair to survive.]
FIGURE 3–3 Atypical Configuration
[Figure: Host A through Host D, one vote each, with quorum devices between pairs of hosts and one quorum device, carrying three votes, connected to all four hosts. Total configured votes: 10. Votes required for quorum: 6.]

Bad Quorum Configurations

This section shows examples of quorum configurations you should avoid. For examples of recommended quorum configurations, see “Recommended Quorum Configurations” on page 61.
[Figure: bad quorum configuration. Host A and Host B, one vote each, with quorum devices. Total configured votes: 4. Votes required for quorum: 3.]
Data Services

The term data service describes an application, such as Sun Java System Web Server or Oracle, that has been configured to run on a cluster rather than on a single server. A data service consists of an application, specialized Sun Cluster configuration files, and Sun Cluster management methods that control the following actions of the application:

■ Start
■ Stop
■ Monitor and take corrective measures
[Figure: bad quorum configurations, each with total configured votes: 6 and votes required for quorum: 4. One configuration violates the best practice that you should not add quorum devices to make the total votes even; it does not add availability. The others violate the best practice that quorum device votes should be strictly less than the votes of hosts.]
For information about data service types, see “Data Services” in Sun Cluster Overview for Solaris OS.
Figure 3–4 compares an application that runs on a single application server (the single-server model) to the same application running on a cluster (the clustered-server model). The only difference between the two configurations is that the clustered application might run faster and is more highly available.
In the single-server model, you configure the application to access the server through a particular public network interface (a host name). The host name is associated with that physical server.
In the clustered-server model, the public network interface is a logical host name or a shared address. The term network resources is used to refer to both logical host names and shared addresses.
Some data services require you to specify either logical host names or shared addresses as the network interfaces. Logical host names and shared addresses are not always interchangeable. Other data services allow you to specify either logical host names or shared addresses. Refer to the installation and configuration for each data service for details about the type of interface you must specify.
A network resource is not associated with a specific physical server. A network resource can migrate between physical servers.
A network resource is initially associated with one node, the primary. If the primary fails, the network resource and the application resource fail over to a different cluster node (a secondary). When the network resource fails over, after a short delay, the application resource continues to run on the secondary.
[Figure: a standard client-server application (client and application server) compared with a clustered client-server application (client and clustered application servers)]
FIGURE 3–4 Standard Compared to Clustered Client-Server Configuration
Chapter 3 • Key Concepts for System Administrators and Application Developers
Figure 3–5 compares the single-server model with the clustered-server model. Note that in the clustered-server model, a network resource (logical host name, in this example) can move between two or more of the cluster nodes. The application is configured to use this logical host name in place of a host name that is associated with a particular server.
A shared address is also initially associated with one node. This node is called the global interface node. A shared address (known as the global interface) is used as the single network interface to the cluster.

The difference between the logical host name model and the scalable service model is that in the latter, each node also has the shared address actively configured on its loopback interface. This configuration enables multiple instances of a data service to be active on several nodes simultaneously. The term "scalable service" means that you can add more CPU power to the application by adding additional cluster nodes and the performance scales.

If the global interface node fails, the shared address can be started on another node that is also running an instance of the application (thereby making this other node the new global interface node). Or, the shared address can fail over to another cluster node that was not previously running the application.
Figure 3–6 compares the single-server configuration with the clustered scalable service configuration. Note that in the scalable service configuration, the shared address is present on all nodes. The application is configured to use this shared address in place of a host name that is associated with a particular server. This scheme is similar to how a logical host name is used for a failover data service.
[Figure: a standard client-server application in which the client reaches the application server at hostname=iws-1, compared with a failover clustered client-server application in which the client reaches the clustered application servers through logical hostname=iws-1]
FIGURE 3–5 Fixed Host Name Compared to Logical Host Name
Data Service Methods

The Sun Cluster software supplies a set of service management methods. These methods run under the control of the Resource Group Manager (RGM), which uses them to start, stop, and monitor the application on the cluster nodes. These methods, along with the cluster framework software and multihost devices, enable applications to become failover or scalable data services. The RGM also manages resources in the cluster, including instances of an application and network resources (logical host names and shared addresses).

In addition to Sun Cluster software-supplied methods, the Sun Cluster software also supplies an API and several data service development tools. These tools enable application developers to develop the data service methods that are required to make other applications run as highly available data services with the Sun Cluster software.
Failover Data Services

If the node on which the data service is running (the primary node) fails, the service is migrated to another working node without user intervention. Failover services use a failover resource group, which is a container for application instance resources and network resources (logical host names). Logical host names are IP addresses that can be configured on one node, and at a later time, automatically configured down on the original node and configured on another node.
[Figure: a standard client-server application in which the client reaches the server at hostname=iws-1, compared with a scalable clustered client-server application in which the shared address iws-1 is hosted on the global interface (GIF) node and configured on every clustered application server]
FIGURE 3–6 Fixed Host Name Compared to Shared Address
For failover data services, application instances run only on a single node. If the fault monitor detects an error, it either attempts to restart the instance on the same node, or to start the instance on another node (failover). The outcome depends on how you have configured the data service.
Scalable Data Services
The scalable data service has the potential for active instances on multiple nodes.
Scalable services use the following two resource groups:

■ A scalable resource group contains the application resources.

■ A failover resource group, which contains the network resources (shared addresses) on which the scalable service depends. A shared address is a network address. This network address can be bound by all scalable services that are running on nodes within the cluster. This shared address enables these scalable services to scale on those nodes. A cluster can have multiple shared addresses, and a service can be bound to multiple shared addresses.
A scalable resource group can be online on multiple nodes simultaneously. As a result, multiple instances of the service can be running at once. All scalable resource groups use load balancing.

All nodes that host a scalable service use the same shared address to host the service. The failover resource group that hosts the shared address is online on only one node at a time.

Service requests enter the cluster through a single network interface (the global interface). These requests are distributed to the nodes, based on one of several predefined algorithms that are set by the load-balancing policy. The cluster can use the load-balancing policy to balance the service load between several nodes. Multiple global interfaces can exist on different nodes that host other shared addresses.

For scalable services, application instances run on several nodes simultaneously. If the node that hosts the global interface fails, the global interface fails over to another node. If an application instance that is running fails, the instance attempts to restart on the same node.

If an application instance cannot be restarted on the same node, and another unused node is configured to run the service, the service fails over to the unused node. Otherwise, the service continues to run on the remaining nodes, possibly causing a degradation of service throughput.
Note – TCP state for each application instance is kept on the node with the instance, not on the global interface node. Therefore, failure of the global interface node does not affect the connection.
Figure 3–7 shows an example of failover and a scalable resource group and the dependencies that exist between them for scalable services. This example shows three resource groups. The failover resource group contains application resources for highly available DNS, and network resources used by both highly available DNS and highly available Apache Web Server (used in SPARC-based clusters only). The scalable resource groups contain only application instances of the Apache Web Server. Note that resource group dependencies exist between the scalable and failover resource groups (solid lines). Additionally, all the Apache application resources depend on the network resource schost-2, which is a shared address (dashed lines).
Load-Balancing Policies

Load balancing improves performance of the scalable service, both in response time and in throughput. There are two classes of scalable data services:

■ Pure
■ Sticky

A pure service is capable of having any of its instances respond to client requests. A sticky service is capable of having a client send requests to the same instance. Those requests are not redirected to other instances.
A pure service uses a weighted load-balancing policy. Under this load-balancing policy, client requests are by default uniformly distributed over the server instances in the cluster. The load is distributed among various nodes according to specified weight values. For example, in a three-node cluster, suppose that each node has the weight of 1. Each node services one third of the requests from any client on behalf of that service. The cluster administrator can change weights at any time with an administrative command or with Sun Cluster Manager.
[Figure: a failover resource group containing the failover application resource in.named (DNS resource) and the network resources schost-1 (logical hostname) and schost-2 (shared IP address); two scalable resource groups, each containing the scalable application resource apache (Apache server resource); resource group dependencies (RG_dependencies property) connect the scalable resource groups to the failover resource group, and resource dependencies (Network_resources_used property) connect the Apache resources to schost-2]
FIGURE 3–7 SPARC: Failover and Scalable Resource Group Example
The weighted load-balancing policy is set by using the LB_WEIGHTED value for the Load_balancing_weights property. If a weight for a node is not explicitly set, the weight for that node is set to 1 by default.
The weighted policy redirects a certain percentage of the traffic from clients to a particular node. Given X=weight and A=the total weights of all active nodes, an active node can expect approximately X/A of the total new connections to be directed to the active node. However, the total number of connections must be large enough. This policy does not address individual requests.

Note that the weighted policy is not round robin. A round-robin policy would always cause each request from a client to go to a different node. For example, the first request would go to node 1, the second request would go to node 2, and so on.
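The X/A arithmetic above can be sketched in a few lines of Python. This is purely illustrative; the helper function and node names are invented for the example and are not part of Sun Cluster:

```python
def expected_share(weights, node):
    """Fraction of new connections a node can expect under the weighted
    policy: X/A, where X is the node's weight and A is the total weight
    of all active nodes."""
    total = sum(weights.values())   # A: total weights of all active nodes
    return weights[node] / total    # X/A

# Three-node cluster where every node keeps the default weight of 1:
weights = {"node1": 1, "node2": 1, "node3": 1}
print(expected_share(weights, "node1"))  # each node serves about one third
```

With weights of, say, 3 and 1 on a two-node cluster, the heavier node would receive roughly three quarters of new connections, again only when the total number of connections is large enough.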
A sticky service has two flavors, ordinary sticky and wildcard sticky.

Sticky services enable concurrent application-level sessions over multiple TCP connections to share in-state memory (application session state).
Ordinary sticky services enable a client to share the state between multiple concurrent TCP connections. The client is said to be "sticky" toward that server instance that is listening on a single port.

The client is guaranteed that all requests go to the same server instance, provided that the following conditions are met:

■ The instance remains up and accessible.
■ The load-balancing policy is not changed while the service is online.
For example, a web browser on the client connects to a shared IP address on port 80 using three different TCP connections. However, the connections exchange cached session information between them at the service.
A generalization of a sticky policy extends to multiple scalable services that exchange session information in the background and at the same instance. When these services exchange session information in the background and at the same instance, the client is said to be "sticky" toward multiple server instances on the same node that is listening on different ports.
For example, a customer on an e-commerce web site fills a shopping cart with items by using HTTP on port 80. The customer then switches to SSL on port 443 to send secure data to pay by credit card for the items in the cart.
In the ordinary sticky policy, the set of ports is known at the time the application resources are configured. This policy is set by using the LB_STICKY value for the Load_balancing_policy resource property.
Wildcard sticky services use dynamically assigned port numbers, but still expect client requests to go to the same node. The client is "sticky wildcard" over ports that have the same IP address.
A good example of this policy is passive mode FTP. For example, a client connects to an FTP server on port 21. The server then instructs the client to connect back to a listener port server in the dynamic port range. All requests for this IP address are forwarded to the same node that the server informed the client of through the control information.
The sticky-wildcard policy is a superset of the ordinary sticky policy. For a scalable service that is identified by the IP address, ports are assigned by the server (and are not known in advance). The ports might change. This policy is set by using the LB_STICKY_WILD value for the Load_balancing_policy resource property.
For each one of these sticky policies, the weighted load-balancing policy is in effect by default. Therefore, a client's initial request is directed to the instance that the load balancer dictates. After the client establishes an affinity for the node where the instance is running, future requests are conditionally directed to that instance. The node must be accessible and the load-balancing policy must not have changed.
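The two-phase behavior just described, initial placement by weight followed by affinity, can be modeled with a small sketch. The class, its names, and the random weighted choice are hypothetical illustrations, not Sun Cluster's actual load balancer:

```python
import random

class StickyBalancer:
    """Toy model of an ordinary sticky policy: a client's first request is
    placed by the weighted policy; later requests from the same client keep
    going to the same instance while it remains accessible."""

    def __init__(self, weights):
        self.weights = weights   # node name -> weight
        self.affinity = {}       # client -> node the client is sticky toward

    def route(self, client):
        if client not in self.affinity:
            nodes = list(self.weights)
            # initial request: weighted placement (X/A odds per node)
            picks = random.choices(nodes, weights=[self.weights[n] for n in nodes])
            self.affinity[client] = picks[0]
        return self.affinity[client]

lb = StickyBalancer({"node1": 1, "node2": 1})
first = lb.route("10.0.0.5")
assert all(lb.route("10.0.0.5") == first for _ in range(10))  # requests stick
```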
Failback Settings
Resource groups fail over from one node to another. When this failover occurs, the original secondary becomes the new primary. The failback settings specify the actions that occur when the original primary comes back online. The options are to have the original primary become the primary again (failback) or to allow the current primary to remain. You specify the option you want by using the Failback resource group property setting.

If the original node that hosts the resource group fails and reboots repeatedly, setting failback might result in reduced availability for the resource group.
Data Services Fault Monitors
Each Sun Cluster data service supplies a fault monitor that periodically probes the data service to determine its health. A fault monitor verifies that the application daemon or daemons are running and that clients are being served. Based on the information that probes return, predefined actions such as restarting daemons or causing a failover can be initiated.
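The restart-then-failover escalation described above can be sketched as a small decision function. The probe outcomes, threshold, and action names are invented for illustration; real fault monitors are specific to each data service:

```python
def react(probe_results, max_restarts=2):
    """Given a sequence of probe outcomes (True = the service answered the
    probe), return the predefined action taken after each failed probe:
    restart the daemon locally, then fail over once restarts are exhausted."""
    actions, failures = [], 0
    for ok in probe_results:
        if ok:
            failures = 0                 # a healthy probe resets the count
        else:
            failures += 1
            actions.append("restart" if failures <= max_restarts else "failover")
    return actions

print(react([True, False, False, False]))  # ['restart', 'restart', 'failover']
```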
Developing New Data Services

Sun supplies configuration files and management methods templates that enable you to make various applications operate as failover or scalable services within a cluster. If Sun does not offer the application that you want to run as a failover or scalable service, you have an alternative. Use a Sun Cluster API or the DSET API to configure the application to run as a failover or scalable service. However, not all applications can become a scalable service.
Characteristics of Scalable Services

A set of criteria determines whether an application can become a scalable service. To determine if your application can become a scalable service, see "Analyzing the Application for Suitability" in Sun Cluster Data Services Developer's Guide for Solaris OS.

This set of criteria is summarized as follows:
■ First, such a service is composed of one or more server instances. Each instance runs on a different node. Two or more instances of the same service cannot run on the same node.

■ Second, if the service provides an external logical data store, you must exercise caution. Concurrent access to this store from multiple server instances must be synchronized to avoid losing updates or reading data as it's being changed. Note the use of "external" to distinguish the store from in-memory state. The term "logical" indicates that the store appears as a single entity, although it might itself be replicated. Furthermore, in this data store, when any server instance updates the data store, this update is immediately "seen" by other instances.

The Sun Cluster software provides such an external storage through its cluster file system and its global raw partitions. As an example, suppose a service writes new data to an external log file or modifies existing data in place. When multiple instances of this service run, each instance has access to this external log, and each might simultaneously access this log. Each instance must synchronize its access to this log, or else the instances interfere with each other. The service could use ordinary Solaris file locking through fcntl and lockf to achieve the synchronization that you want.

Another example of this type of store is a back-end database, such as highly available Oracle Real Application Clusters Guard for SPARC based clusters or Oracle. This type of back-end database server provides built-in synchronization by using database query or update transactions. Therefore, multiple server instances do not need to implement their own synchronization.

The Sun IMAP server is an example of a service that is not a scalable service. The service updates a store, but that store is private and when multiple IMAP instances write to this store, they overwrite each other because the updates are not synchronized. The IMAP server must be rewritten to synchronize concurrent access.

■ Finally, note that instances can have private data that is disjoint from the data of other instances. In such a case, the service does not need synchronized concurrent access because the data is private, and only that instance can manipulate it. In this case, you must be careful not to store this private data under the cluster file system because this data can become globally accessible.
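The fcntl/lockf-style synchronization mentioned in the second criterion can be sketched as follows. The log path and record format are placeholders; the point is only that each instance takes an exclusive advisory lock around its update of the shared store:

```python
import fcntl
import os
import tempfile

# Stand-in for the shared external log on a cluster file system:
log_path = os.path.join(tempfile.gettempdir(), "service.log")

def append_record(record):
    with open(log_path, "a") as log:
        fcntl.lockf(log, fcntl.LOCK_EX)   # block until this instance owns the lock
        try:
            log.write(record + "\n")      # critical section: update the store
            log.flush()
        finally:
            fcntl.lockf(log, fcntl.LOCK_UN)

append_record("instance-1: update")
append_record("instance-2: update")
```

Because the lock is advisory, every instance must take it; an instance that skips the lock can still interleave its writes with the others.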
Data Service API and Data Service Development Library API

The Sun Cluster software provides the following to make applications highly available:

■ Data services that are supplied as part of the Sun Cluster software
■ A data service API
■ A development library API for data services
■ A "generic" data service

The Sun Cluster Data Services Planning and Administration Guide for Solaris OS describes how to install and configure the data services that are supplied with the Sun Cluster software. The Sun Cluster 3.1 9/04 Software Collection for Solaris OS (SPARC Platform Edition) describes how to instrument other applications to be highly available under the Sun Cluster framework.

The Sun Cluster APIs enable application developers to develop fault monitors and scripts that start and stop data service instances. With these tools, an application can be implemented as a failover or a scalable data service. The Sun Cluster software provides a "generic" data service. Use this generic data service to quickly generate an application's required start and stop methods and to implement the data service as a failover or scalable service.
Using the Cluster Interconnect for Data Service Traffic

A cluster must usually have multiple network connections between Solaris hosts, forming the cluster interconnect.
Sun Cluster software uses multiple interconnects to achieve the following goals:

■ Ensure high availability
■ Improve performance
For both internal and external traffic such as file system data or scalable services data, messages are striped across all available interconnects. The cluster interconnect is also available to applications, for highly available communication between hosts. For example, a distributed application might have components that are running on different hosts that need to communicate. By using the cluster interconnect rather than the public transport, these connections can withstand the failure of an individual link.
To use the cluster interconnect for communication between hosts, an application must use the private host names that you configured during the Sun Cluster installation. For example, if the private host name for host1 is clusternode1-priv, use this name to communicate with host1 over the cluster interconnect. TCP sockets that are opened by using this name are routed over the cluster interconnect and can be transparently rerouted if a private network adapter fails. Application communication between any two hosts is striped over all interconnects. The traffic for a given TCP connection flows on one interconnect at any point. Different TCP connections are striped across all interconnects. Additionally, UDP traffic is always striped across all interconnects.
An application can optionally use a zone's private host name to communicate over the cluster interconnect between zones. However, you must first set each zone's private host name before the application can begin communicating. Each zone must have its own private host name to communicate. An application that is running in one zone must use the private host name in the same zone to communicate with private host names in other zones. An application in one zone cannot communicate through the private host name in another zone.
Because you can configure the private host names during your Sun Cluster installation, the cluster interconnect uses any name that you choose at that time. To determine the actual name, use the scha_cluster_get command with the scha_privatelink_hostname_node argument. See the scha_cluster_get(1HA) man page.
Each host is also assigned a fixed per-host address. This per-host address is plumbed on the clprivnet driver. The IP address maps to the private host name for the host: clusternode1-priv. See the clprivnet(7) man page.
If your application requires consistent IP addresses at all points, configure the application to bind to the per-host address on both the client and the server. All connections appear then to originate from and return to the per-host address.
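Binding to a fixed local address before connecting can be sketched with ordinary sockets. The addresses below are placeholders; on a cluster node the local address would be the per-host address plumbed on clprivnet:

```python
import socket

def connect_from(local_addr, remote_addr, remote_port):
    """Open a TCP connection whose source address is pinned to local_addr,
    so all traffic appears to originate from that address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((local_addr, 0))             # fix the source address; any free port
    sock.connect((remote_addr, remote_port))
    return sock
```

A server would similarly pass the per-host address to bind() so that replies return to it.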
Resources, Resource Groups, and Resource Types

Data services use several types of resources: applications such as Sun Java System Web Server or Apache Web Server use network addresses (logical host names and shared addresses) on which the applications depend. Application and network resources form a basic unit that is managed by the RGM.
Data services are resource types. For example, Sun Cluster HA for Oracle is the resource type SUNW.oracle-server and Sun Cluster HA for Apache is the resource type SUNW.apache.
A resource is an instantiation of a resource type that is defined cluster wide. Several resource types are defined.
Network resources are either SUNW.LogicalHostname or SUNW.SharedAddress resource types. These two resource types are preregistered by the Sun Cluster software.
The HAStoragePlus resource type is used to synchronize the startup of resources and device groups on which the resources depend. This resource type ensures that before a data service starts, the paths to a cluster file system's mount points, global devices, and device group names are available. For more information, see "Synchronizing the Startups Between Resource Groups and Device Groups" in Sun Cluster Data Services Planning and Administration Guide for Solaris OS. The HAStoragePlus resource type also enables local file systems to be highly available. For more information about this feature, see "HAStoragePlus Resource Type" on page 52.
RGM-managed resources are placed into groups, called resource groups, so that they can be managed as a unit. A resource group is migrated as a unit if a failover or switchover is initiated on the resource group.

Note – When you bring a resource group that contains application resources online, the application is started. The data service start method waits until the application is running before exiting successfully. The determination of when the application is up and running is accomplished the same way the data service fault monitor determines that a data service is serving clients. Refer to the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information about this process.
Resource Group Manager (RGM)

The RGM controls data services (applications) as resources, which are managed by resource type implementations. These implementations are either supplied by Sun or created by a developer with a generic data service template, the Data Service Development Library API (DSDL API), or the Resource Management API (RMAPI). The cluster administrator creates and manages resources in containers called resource groups. The RGM stops and starts resource groups on selected nodes in response to cluster membership changes.

The RGM acts on resources and resource groups. RGM actions cause resources and resource groups to move between online and offline states. A complete description of the states and settings that can be applied to resources and resource groups is located in "Resource and Resource Group States and Settings" on page 75.

Refer to "Data Service Project Configuration" on page 83 for information about how to launch Solaris projects under RGM control.
Resource and Resource Group States and Settings
A system administrator applies static settings to resources and resource groups. You can change these settings only by administrative action. The RGM moves resource groups between dynamic "states."
These settings and states are as follows:
■ Managed or unmanaged settings. These cluster-wide settings apply only to resource groups. The RGM manages resource groups. You can use the clresourcegroup command to request that the RGM manage or unmanage a resource group. These resource group settings do not change when you reconfigure a cluster.

When a resource group is first created, it is unmanaged. A resource group must be managed before any resources placed in the group can become active.

In some data services, for example, a scalable web server, work must be done prior to starting network resources and after they are stopped. This work is done by initialization (INIT) and finish (FINI) data service methods. The INIT methods only run if the resource group in which the resources are located is in the managed state.

When a resource group is moved from unmanaged to managed, any registered INIT methods for the group are run on the resources in the group.

When a resource group is moved from managed to unmanaged, any registered FINI methods are called to perform cleanup.

The most common use of the INIT and FINI methods are for network resources for scalable services. However, a data service developer can use these methods for any initialization or cleanup work that is not performed by the application.
■ Enabled or disabled settings. These settings apply to resources on one or more nodes. A system administrator can use the clresource command to enable or disable a resource on one or more nodes. These settings do not change when the cluster administrator reconfigures a cluster.

The normal setting for a resource is that it is enabled and actively running in the system.

If you want to make the resource unavailable on all cluster nodes, disable the resource on all cluster nodes. A disabled resource is not available for general use on the cluster nodes that you specify.

■ Online or offline states. These dynamic states apply to both resources and resource groups.

Online and offline states change as the cluster transitions through cluster reconfiguration steps during switchover or failover. You can also change the online or offline state of a resource or a resource group by using the clresource and clresourcegroup commands.

A failover resource or resource group can only be online on one node at any time. A scalable resource or resource group can be online on some nodes and offline on others. During a switchover or failover, resource groups and the resources within them are taken offline on one node and then brought online on another node.

If a resource group is offline, all of its resources are offline. If a resource group is online, all of its enabled resources are online.

You can temporarily suspend the automatic recovery actions of a resource group. You might need to suspend the automatic recovery of a resource group to investigate and fix a problem in the cluster. Or, you might need to perform maintenance on resource group services.
A suspended resource group is not automatically restarted or failed over until you explicitly issue the command that resumes automatic recovery. Whether online or offline, suspended data services remain in their current state. You can still manually switch the resource group to a different state on specified nodes. You can also still enable or disable individual resources in the resource group.

Resource groups can contain several resources, with dependencies between resources. These dependencies require that the resources be brought online and offline in a particular order. The methods that are used to bring resources online and offline might take different amounts of time for each resource. Because of resource dependencies and start and stop time differences, resources within a single resource group can have different online and offline states during a cluster reconfiguration.
Resource and Resource Group Properties

You can configure property values for resources and resource groups for your Sun Cluster data services. Standard properties are common to all data services. Extension properties are specific to each data service. Some standard and extension properties are configured with default settings so that you do not have to modify them. Others need to be set as part of the process of creating and configuring resources. The documentation for each data service specifies which resource properties can be set and how to set them.

The standard properties are used to configure resource and resource group properties that are usually independent of any particular data service. For the set of standard properties, see Appendix B, "Standard Properties," in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.

The RGM extension properties provide information such as the location of application binaries and configuration files. You modify extension properties as you configure your data services. The set of extension properties is described in the individual guide for the data service.
Support for Solaris Zones

Solaris zones provide a means of creating virtualized operating system environments within an instance of the Solaris 10 OS. Solaris zones enable one or more applications to run in isolation from other activity on your system. The Solaris zones facility is described in Part II, "Zones," in System Administration Guide: Solaris Containers-Resource Management and Solaris Zones.

When you run Sun Cluster software on the Solaris 10 OS, you can create any number of global-cluster non-voting nodes.

You can use Sun Cluster software to manage the availability and scalability of applications that are running on global-cluster non-voting nodes.
Support for Global-Cluster Non-Voting Nodes (Solaris Zones) Directly Through the RGM

On a cluster where the Solaris 10 OS is running, you can configure a resource group to run on a global-cluster voting node or a global-cluster non-voting node. The RGM manages each global-cluster non-voting node as a switchover target. If a global-cluster non-voting node is specified in the node list of a resource group, the RGM brings the resource group online in the specified node.
Figure 3–8 illustrates the failover of resource groups between nodes in a two-host cluster. In this example, identical nodes are configured to simplify the administration of the cluster.
You can configure a scalable resource group (which uses network load balancing) to run in a cluster non-voting node as well.
In Sun Cluster commands, you specify a zone by appending the name of the zone to the name of the host, and separating them with a colon, for example:
[Figure: Two hosts, pn1 and pn2, each running resource groups RG1 through RG5 across the voting node (on the Sun Cluster foundation) and zones zA, zB, and zC. Resource groups fail over between the corresponding nodes and zones on the two hosts.]

FIGURE 3–8 Failover of Resource Groups Between Nodes
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
phys-schost-1:zoneA
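The same host:zone notation is used wherever a Sun Cluster node list accepts a non-voting node. The following sketch is illustrative only (the resource group name and zone names are invented): it creates a failover resource group whose candidate nodes are a zone on each of two hosts.

```
# Create a resource group whose candidate nodes are zone zoneA
# on each of the two hosts (names are hypothetical).
clresourcegroup create -n phys-schost-1:zoneA,phys-schost-2:zoneA example-rg
```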
Criteria for Using Support for Solaris Zones Directly Through the RGM
Use support for Solaris zones directly through the RGM if any of the following criteria are met:
■ Your application cannot tolerate the additional failover time that is required to boot a zone.
■ You require minimum downtime during maintenance.
■ You require dual-partition software upgrade.
■ You are configuring a data service that uses a shared address resource for network load balancing.
Requirements for Using Support for Solaris Zones Directly Through the RGM
If you plan to use support for Solaris zones directly through the RGM for an application, ensure that the following requirements are met:
■ The application is supported to run in non-global zones.
■ The data service for the application is supported to run on a global-cluster non-voting node.
If you use support for Solaris zones directly through the RGM, ensure that resource groups that are related by an affinity are configured to run on the same Solaris host.
Additional Information About Support for Solaris Zones Directly Through the RGM
For information about how to configure support for Solaris zones directly through the RGM, see the following documentation:
■ “Guidelines for Non-Global Zones in a Global Cluster” in Sun Cluster Software Installation Guide for Solaris OS
■ “Zone Names” in Sun Cluster Software Installation Guide for Solaris OS
■ “Configuring a Non-Global Zone on a Global-Cluster Node” in Sun Cluster Software Installation Guide for Solaris OS
■ Sun Cluster Data Services Planning and Administration Guide for Solaris OS
■ Individual data service guides
Support for Solaris Zones on Sun Cluster Nodes Through Sun Cluster HA for Solaris Containers
The Sun Cluster HA for Solaris Containers data service manages each zone as a resource that is controlled by the RGM.
Criteria for Using Sun Cluster HA for Solaris Containers
Use the Sun Cluster HA for Solaris Containers data service if any of the following criteria are met:
■ You require delegated root access.
■ The application is not supported in a cluster.
■ You require affinities between resource groups that are to run in different zones on the same node.
Requirements for Using Sun Cluster HA for Solaris Containers
If you plan to use the Sun Cluster HA for Solaris Containers data service for an application, ensure that the following requirements are met:

■ The application is supported to run on global-cluster non-voting nodes.
■ The application is integrated with the Solaris OS through a script, a run-level script, or a Solaris Service Management Facility (SMF) manifest.
■ The additional failover time that is required to boot a zone is acceptable.
■ Some downtime during maintenance is acceptable.
Additional Information About Sun Cluster HA for Solaris Containers
For information about how to use the Sun Cluster HA for Solaris Containers data service, see Sun Cluster Data Service for Solaris Containers Guide for Solaris OS.
Service Management Facility

The Solaris Service Management Facility (SMF) enables you to run and administer applications as highly available and scalable resources. Like the Resource Group Manager (RGM), the SMF provides high availability and scalability, but for the Solaris Operating System.
Sun Cluster provides three proxy resource types that you can use to enable SMF services in a cluster. These resource types, SUNW.Proxy_SMF_failover, SUNW.Proxy_SMF_loadbalanced, and SUNW.Proxy_SMF_multimaster, enable you to run SMF services in a failover, scalable, and multi-master configuration, respectively. The SMF manages the availability of SMF services on a single Solaris host. The SMF uses the callback method execution model to run services.
The SMF also provides a set of administrative interfaces for monitoring and controlling services. These interfaces enable you to integrate your own SMF-controlled services into Sun Cluster. This capability eliminates the need to create new callback methods, rewrite existing callback methods, or update the SMF service manifest. You can include multiple SMF resources in a resource group and you can configure dependencies and affinities between them.
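As a hypothetical sketch of how an SMF service could be placed under cluster control, the proxy resource type is registered and a resource is created from it. The resource group name, resource name, file path, and the extension property shown here (Proxied_service_instances, assumed to name a file that lists the proxied SMF services and their manifests) are illustrative and should be checked against the data service documentation.

```
# Register the failover SMF proxy resource type (names are illustrative).
clresourcetype register SUNW.Proxy_SMF_failover

# Create a proxy resource that manages the SMF services listed in a file.
clresource create -g example-rg -t SUNW.Proxy_SMF_failover \
    -p Proxied_service_instances=/var/cluster/svc_list.txt example-smf-rs
```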
The SMF is responsible for starting, stopping, and restarting these services and managing their dependencies. Sun Cluster is responsible for managing the service in the cluster and for determining the hosts on which these services are to be started.
The SMF runs as a daemon, svc.startd, on each cluster host. The SMF daemon automatically starts and stops resources on selected hosts according to preconfigured policies.
The services that are specified for an SMF proxy resource can be located on a global-cluster voting node or a global-cluster non-voting node. However, all the services that are specified for the same SMF proxy resource must be located on the same node. SMF proxy resources work on any node.
System Resource Usage

System resources include aspects of CPU usage, memory usage, swap usage, and disk and network throughput. Sun Cluster enables you to monitor how much of a specific system resource is being used by an object type. An object type includes a host, node, zone, disk, network interface, or resource group. Sun Cluster also enables you to control the CPU that is available to a resource group.
Monitoring and controlling system resource usage can be part of your resource management policy. The cost and complexity of managing numerous machines encourages the consolidation of several applications on larger hosts. Instead of running each workload on separate systems, with full access to each system's resources, you use resource management to segregate workloads within the system. Resource management enables you to lower overall total cost of ownership by running and controlling several applications on a single Solaris system.
Resource management ensures that your applications have the required response times. Resource management can also increase resource use. By categorizing and prioritizing usage, you can effectively use reserve capacity during off-peak periods, often eliminating the need for additional processing power. You can also ensure that resources are not wasted because of load variability.
To use the data that Sun Cluster collects about system resource usage, you must do the following:

■ Analyze the data to determine what it means for your system.
■ Make a decision about the action that is required to optimize your usage of hardware and software resources.
■ Take action to implement your decision.
By default, system resource monitoring and control are not configured when you install Sun Cluster. For information about configuring these services, see Chapter 9, “Configuring Control of CPU Usage,” in Sun Cluster System Administration Guide for Solaris OS.
System Resource Monitoring
By monitoring system resource usage, you can do the following:
■ Collect data that reflects how a service that is using specific system resources is performing.
■ Discover resource bottlenecks or overload and so preempt problems.
■ More efficiently manage workloads.
Data about system resource usage can help you determine the hardware resources that are underused and the applications that use many resources. Based on this data, you can assign applications to nodes that have the necessary resources and choose the node to which to fail over. This consolidation can help you optimize the way that you use your hardware and software resources.
Monitoring all system resources at the same time might be costly in terms of CPU. Choose the system resources that you want to monitor by prioritizing the resources that are most critical for your system.
When you enable monitoring, you choose the telemetry attribute that you want to monitor. A telemetry attribute is an aspect of system resources. Examples of telemetry attributes include the amount of free CPU or the percentage of blocks that are used on a device. If you monitor a telemetry attribute on an object type, Sun Cluster monitors this telemetry attribute on all objects of that type in the cluster. Sun Cluster stores a history of the system resource data that is collected for seven days.
If you consider a particular data value to be critical for a system resource, you can set a threshold for this value. When setting a threshold, you also choose how critical this threshold is by assigning it a severity level. If the threshold is crossed, Sun Cluster changes the severity level of the threshold to the severity level that you choose.
Control of CPU

Each application and service that is running on a cluster has specific CPU needs. Table 3–4 lists the CPU control activities that are available on different versions of the Solaris OS.
TABLE 3–4 CPU Control

Solaris Version   Zone                             Control
Solaris 9 OS      Not available                    Assign CPU shares
Solaris 10 OS     Global-cluster voting node       Assign CPU shares
Solaris 10 OS     Global-cluster non-voting node   Assign CPU shares
                                                   Assign number of CPU
                                                   Create dedicated processor sets
Note – If you want to apply CPU shares, you must specify the Fair Share Scheduler (FSS) as the default scheduler in the cluster.
Controlling the CPU that is assigned to a resource group in a dedicated processor set in a global-cluster non-voting node offers the strictest level of control. If you reserve CPU for a resource group, this CPU is not available to other resource groups.
Viewing System Resource Usage
You can view system resource data and CPU assignments by using the command line or through Sun Cluster Manager. The system resources that you choose to monitor determine the tables and graphs that you can view.
By viewing the output of system resource usage and CPU control, you can do the following:
■ Anticipate failures due to the exhaustion of system resources.
■ Detect unbalanced usage of system resources.
■ Validate server consolidation.
■ Obtain information that enables you to improve the performance of applications.
Sun Cluster does not provide advice about the actions to take, nor does it take action for you based on the data that it collects. You must determine whether the data that you view meets your expectations for a service. You must then take action to remedy any observed performance problems.
Data Service Project Configuration

Data services can be configured to launch under a Solaris project name when brought online by using the RGM. The configuration associates a resource or resource group managed by the RGM with a Solaris project ID. The mapping from your resource or resource group to a project ID gives you the ability to use sophisticated controls that are available in the Solaris OS to manage workloads and consumption within your cluster.
Chapter 3 • Key Concepts or System Administrators and Application Developers 83
Note – You can perform this configuration if you are using Sun Cluster on the Solaris 9 OS or on the Solaris 10 OS.
Using the Solaris management functionality in a Sun Cluster environment enables you to ensure that your most important applications are given priority when sharing a node with other applications. Applications might share a node if you have consolidated services or because applications have failed over. Use of the management functionality described herein might improve availability of a critical application by preventing lower-priority applications from overconsuming system supplies such as CPU time.
Note – The Solaris documentation for this feature describes CPU time, processes, tasks, and similar components as “resources”. Meanwhile, Sun Cluster documentation uses the term “resources” to describe entities that are under the control of the RGM. The following section uses the term “resource” to refer to Sun Cluster entities that are under the control of the RGM. The section uses the term “supplies” to refer to CPU time, processes, and tasks.
This section provides a conceptual description of configuring data services to launch processes on a specified Solaris OS project(4). This section also describes several failover scenarios and suggestions for planning to use the management functionality provided by the Solaris Operating System.
For detailed conceptual and procedural documentation about the management feature, refer to Chapter 1, “Network Service (Overview),” in System Administration Guide: Network Services.
When configuring resources and resource groups to use Solaris management functionality in a cluster, use the following high-level process:

1. Configuring applications as part of the resource.
2. Configuring resources as part of a resource group.
3. Enabling resources in the resource group.
4. Making the resource group managed.
5. Creating a Solaris project for your resource group.
6. Configuring standard properties to associate the resource group name with the project you created in step 5.
7. Bringing the resource group online.
To configure the standard Resource_project_name or RG_project_name properties to associate the Solaris project ID with the resource or resource group, use the -p option with the clresource set and the clresourcegroup set commands. Set the property values to the resource or to the resource group. See Appendix B, “Standard Properties,” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for property definitions. See the r_properties(5) and rg_properties(5) man pages for descriptions of properties.
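Both commands follow the same pattern. The following sketch is illustrative (the resource group, resource, and project names are invented):

```
# Associate the resource group app-rg with the Solaris project Prj_1.
clresourcegroup set -p RG_project_name=Prj_1 app-rg

# Associate an individual resource with the same project.
clresource set -p Resource_project_name=Prj_1 app-rs
```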
The specified project name must exist in the projects database (/etc/project) and the root user must be configured as a member of the named project. Refer to Chapter 2, “Projects and Tasks (Overview),” in System Administration Guide: Solaris Containers-Resource Management and Solaris Zones for conceptual information about the project name database. Refer to project(4) for a description of project file syntax.
When the RGM brings resources or resource groups online, it launches the related processes under the project name.
Note – Users can associate the resource or resource group with a project at any time. However, the new project name is not effective until the resource or resource group is taken offline and brought back online by using the RGM.
Launching resources and resource groups under the project name enables you to configure the following features to manage system supplies across your cluster.
■ Extended Accounting – Provides a flexible way to record consumption on a task or process basis. Extended accounting enables you to examine historical usage and make assessments of capacity requirements for future workloads.
■ Controls – Provide a mechanism for constraint on system supplies. Processes, tasks, and projects can be prevented from consuming large amounts of specified system supplies.
■ Fair Share Scheduling (FSS) – Provides the ability to control the allocation of available CPU time among workloads, based on their importance. Workload importance is expressed by the number of shares of CPU time that you assign to each workload. Refer to the following man pages for more information.

  ■ dispadmin(1M)
  ■ priocntl(1)
  ■ ps(1)
  ■ FSS(7)
■ Pools – Provide the ability to use partitions for interactive applications according to the application's requirements. Pools can be used to partition a host that supports a number of different software applications. The use of pools results in a more predictable response for each application.
Determining Requirements for Project Configuration

Before you configure data services to use the controls provided by Solaris in a Sun Cluster environment, you must decide how to control and track resources across switchovers or failovers. Identify dependencies within your cluster before configuring a new project. For example, resources and resource groups depend on device groups.
Use the nodelist, failback, maximum_primaries, and desired_primaries resource group properties that you configure with the clresourcegroup set command to identify node list priorities for your resource group.
■ For a brief discussion of the node list dependencies between resource groups and device groups, refer to “Relationship Between Resource Groups and Device Groups” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
■ For detailed property descriptions, refer to rg_properties(5).
Use the preferenced property and failback property that you configure with the cldevicegroup and clsetup commands to determine device group node list priorities. See the clresourcegroup(1CL), cldevicegroup(1CL), and clsetup(1CL) man pages.
■ For conceptual information about the preferenced property, see “Multiported Device Groups” on page 48.
■ For procedural information, see “How To Change Disk Device Properties” in “Administering Device Groups” in Sun Cluster System Administration Guide for Solaris OS.
■ For conceptual information about node configuration and the behavior of failover and scalable data services, see “Sun Cluster System Hardware and Software Components” on page 21.
If you configure all cluster nodes identically, usage limits are enforced identically on primary and secondary nodes. The configuration parameters of projects do not need to be identical for all applications in the configuration files on all nodes. All projects that are associated with the application must at least be accessible by the project database on all potential masters of that application. Suppose that Application 1 is mastered by phys-schost-1 but could potentially be switched over or failed over to phys-schost-2 or phys-schost-3. The project that is associated with Application 1 must be accessible on all three nodes (phys-schost-1, phys-schost-2, and phys-schost-3).
Note – Project database information can be a local /etc/project database file or can be stored in the NIS map or the LDAP directory service.
The Solaris Operating System enables flexible configuration of usage parameters, and few restrictions are imposed by Sun Cluster. Configuration choices depend on the needs of the site. Consider the general guidelines in the following sections before configuring your systems.
Setting Per-Process Virtual Memory Limits

Set the process.max-address-space control to limit virtual memory on a per-process basis. See the rctladm(1M) man page for information about setting the process.max-address-space value.
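Per-process resource controls can also be set as attributes in the project entry itself. The following /etc/project line is a sketch (the project name, ID, and the 4-Gbyte value are invented): it caps the virtual address space of each process in the project and denies requests beyond the cap.

```
Prj_db:110:project for database:root::process.max-address-space=(privileged,4294967296,deny)
```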
When you use management controls with Sun Cluster software, configure memory limits appropriately to prevent unnecessary failover of applications and a “ping-pong” effect of applications. In general, observe the following guidelines.

■ Do not set memory limits too low.

  When an application reaches its memory limit, it might fail over. This guideline is especially important for database applications, for which reaching a virtual memory limit can have unexpected consequences.
■ Do not set memory limits identically on primary and secondary nodes.

  Identical limits can cause a ping-pong effect when an application reaches its memory limit and fails over to a secondary node with an identical memory limit. Set the memory limit slightly higher on the secondary node. The difference in memory limits helps prevent the ping-pong scenario and gives the system administrator a period of time in which to adjust the parameters as necessary.
■ Do use the resource management memory limits for load balancing.

  For example, you can use memory limits to prevent an errant application from consuming excessive swap space.
Failover Scenarios

You can configure management parameters so that the allocation in the project configuration (/etc/project) works in normal cluster operation and in switchover or failover situations. The following sections are example scenarios.
■ The first two sections, “Two-Host Cluster With Two Applications” on page 88 and “Two-Host Cluster With Three Applications” on page 89, show failover scenarios for entire hosts.
■ The section “Failover of Resource Group Only” on page 91 illustrates failover operation for an application only.
In a Sun Cluster environment, you configure an application as part of a resource. You then configure a resource as part of a resource group (RG). When a failure occurs, the resource group, along with its associated applications, fails over to another node. In the following examples the resources are not shown explicitly. Assume that each resource has only one application.
Note – Failover occurs in the order in which nodes are specified in the node list and set in the RGM.
The following examples have these constraints:
■ Application 1 (App-1) is configured in resource group RG-1.
■ Application 2 (App-2) is configured in resource group RG-2.
■ Application 3 (App-3) is configured in resource group RG-3.
Although the numbers of assigned shares remain the same, the percentage of CPU time that is allocated to each application changes after failover. This percentage depends on the number of applications that are running on the node and the number of shares that are assigned to each active application.
In these scenarios, assume the following configurations.

■ All applications are configured under a common project.
■ Each resource has only one application.
■ The applications are the only active processes on the nodes.
■ The projects databases are configured the same on each node of the cluster.
Two-Host Cluster With Two Applications
You can configure two applications on a two-host cluster to ensure that each physical host (phys-schost-1, phys-schost-2) acts as the default master for one application. Each physical host acts as the secondary node for the other physical host. All projects that are associated with Application 1 and Application 2 must be represented in the projects database files on both nodes. When the cluster is running normally, each application is running on its default master, where it is allocated all CPU time by the management facility.
After a failover or switchover occurs, both applications run on a single node where they are allocated shares as specified in the configuration file. For example, this entry in the /etc/project file specifies that Application 1 is allocated 4 shares and Application 2 is allocated 1 share.
Prj_1:100:project for App-1:root::project.cpu-shares=(privileged,4,none)
Prj_2:101:project for App-2:root::project.cpu-shares=(privileged,1,none)
The following diagram illustrates the normal and failover operations of this configuration. The number of shares that are assigned does not change. However, the percentage of CPU time available to each application can change. The percentage depends on the number of shares that are assigned to each process that demands CPU time.
[Figure: Normal operation — App-1 (part of RG-1, 4 shares, 100% of CPU resources) runs on phys-schost-1; App-2 (part of RG-2, 1 share, 100% of CPU resources) runs on phys-schost-2. Failover operation (failure of phys-schost-1) — both applications run on phys-schost-2, where App-1 receives 80% and App-2 receives 20% of CPU resources.]
Two-Host Cluster With Three Applications
On a two-host cluster with three applications, you can configure one host (phys-schost-1) as the default master of one application. You can configure the second physical host (phys-schost-2) as the default master for the remaining two applications. Assume the following example projects database file is located on every host. The projects database file does not change when a failover or switchover occurs.
Prj_1:103:project for App-1:root::project.cpu-shares=(privileged,5,none)
Prj_2:104:project for App_2:root::project.cpu-shares=(privileged,3,none)
Prj_3:105:project for App_3:root::project.cpu-shares=(privileged,2,none)
When the cluster is running normally, Application 1 is allocated 5 shares on its default master, phys-schost-1. This number is equivalent to 100 percent of CPU time because it is the only application that demands CPU time on that host. Applications 2 and 3 are allocated 3 and 2 shares, respectively, on their default master, phys-schost-2. Application 2 would receive 60 percent of CPU time and Application 3 would receive 40 percent of CPU time during normal operation.
If a failover or switchover occurs and Application 1 is switched over to phys-schost-2, the shares for all three applications remain the same. However, the percentages of CPU resources are reallocated according to the projects database file.

■ Application 1, with 5 shares, receives 50 percent of CPU.
■ Application 2, with 3 shares, receives 30 percent of CPU.
■ Application 3, with 2 shares, receives 20 percent of CPU.
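These percentages follow directly from the share counts: each application's fraction of CPU time is its shares divided by the total shares of all active applications on the node. The following shell sketch reproduces the failover arithmetic from the example /etc/project entries above:

```shell
# CPU shares from the example /etc/project file.
app1=5; app2=3; app3=2
total=$((app1 + app2 + app3))   # 10 shares now competing on phys-schost-2

# Percentage of CPU time each application receives after failover.
echo "App-1: $((100 * app1 / total))%"   # App-1: 50%
echo "App-2: $((100 * app2 / total))%"   # App-2: 30%
echo "App-3: $((100 * app3 / total))%"   # App-3: 20%
```

During normal operation the same arithmetic over only the applications on a given host yields the 100, 60, and 40 percent figures described above.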
The following diagram illustrates the normal operations and failover operations of this configuration.
[Figure: Normal operation — App-1 (part of RG-1, 5 shares, 100% of CPU resources) runs on phys-schost-1; App-2 (part of RG-2, 3 shares, 60%) and App-3 (part of RG-3, 2 shares, 40%) run on phys-schost-2. Failover operation (failure of phys-schost-1) — all three applications run on phys-schost-2: App-1 receives 50%, App-2 receives 30%, and App-3 receives 20% of CPU resources.]
Failover of Resource Group Only

In a configuration in which multiple resource groups have the same default master, a resource group (and its associated applications) can fail over or be switched over to a secondary node. Meanwhile, the default master is running in the cluster.
Note – During failover, the application that fails over is allocated resources as specified in the configuration file on the secondary host. In this example, the project database files on the primary and secondary hosts have the same configurations.
For example, this sample configuration file specifies that Application 1 is allocated 1 share, Application 2 is allocated 2 shares, and Application 3 is allocated 2 shares.
Prj_1:106:project for App_1:root::project.cpu-shares=(privileged,1,none)
Prj_2:107:project for App_2:root::project.cpu-shares=(privileged,2,none)
Prj_3:108:project for App_3:root::project.cpu-shares=(privileged,2,none)
The following diagram illustrates the normal and failover operations of this configuration, where RG-2, containing Application 2, fails over to phys-schost-2. Note that the number of shares assigned does not change. However, the percentage of CPU time available to each application can change, depending on the number of shares that are assigned to each application that demands CPU time.
[Figure: Normal operation — App-1 (part of RG-1, 1 share, 33.3% of CPU resources) and App-2 (part of RG-2, 2 shares, 66.6%) run on phys-schost-1; App-3 (part of RG-3, 2 shares, 100%) runs on phys-schost-2. Failover operation — after RG-2 fails over to phys-schost-2, App-1 (1 share, 100%) remains on phys-schost-1, while App-2 and App-3 (2 shares each, 50% each) run on phys-schost-2.]
Public Network Adapters and IP Network Multipathing

Clients make data requests to the cluster through the public network. Each cluster Solaris host is connected to at least one public network through a pair of public network adapters.
Solaris Internet Protocol (IP) Network Multipathing software on Sun Cluster provides the basic mechanism for monitoring public network adapters and failing over IP addresses from one adapter to another when a fault is detected. Each host has its own IP network multipathing configuration, which can be different from the configuration on other hosts.
Public network adapters are organized into IP multipathing groups (multipathing groups). Each multipathing group has one or more public network adapters. Each adapter in a multipathing group can be active. Alternatively, you can configure standby interfaces that are inactive unless a failover occurs.
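On Solaris 10, a multipathing group is typically configured in the /etc/hostname.interface file for each adapter. The following fragment is a sketch only (the adapter name ce0, the addresses, and the group name sc_ipmp0 are invented): it places the adapter in a group with a data address and a non-failover test address that the multipathing daemon can use for failure detection.

```
# /etc/hostname.ce0 (hypothetical): data address plus a test address
# marked deprecated and -failover for in.mpathd failure detection.
192.168.1.10 netmask + broadcast + group sc_ipmp0 up \
addif 192.168.1.11 deprecated -failover netmask + broadcast + up
```

Check the IPMP documentation for the Solaris release installed on your cluster for the exact syntax.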
The in.mpathd multipathing daemon uses a test IP address to detect failures and repairs. If a fault is detected on one of the adapters by the multipathing daemon, a failover occurs. All network access fails over from the faulted adapter to another functional adapter in the multipathing group. Therefore, the daemon maintains public network connectivity for the host. If you configured a standby interface, the daemon chooses the standby interface. Otherwise, the daemon chooses the interface with the least number of IP addresses. Because the failover occurs at the adapter interface level, higher-level connections such as TCP are not affected, except for a brief transient delay during the failover. When the failover of IP addresses completes successfully, ARP broadcasts are sent. Therefore, the daemon maintains connectivity to remote clients.
Note – Because of the congestion recovery characteristics of TCP, TCP endpoints can experience further delay after a successful failover. Some segments might have been lost during the failover, activating the congestion control mechanism in TCP.
Multipathing groups provide the building blocks or logical host name and shared addressresources. You can also create multipathing groups independently o logical host name andshared address resources to monitor public network connectivity o cluster hosts. The samemultipathing group on a host can host any number o logical host name or shared addressresources. For more inormation about logical host name and shared address resources, see theSunCluster Data Services PlanningandAdministrationGuide or Solaris OS.
Note – The design o the IP network multipathing mechanism is meant to detect and mask adapter ailures. The design is not intended to recover rom an administrator's use o ifconfig
to remove one o the logical (or shared) IP addresses. The Sun Cluster sotware views the logicaland shared IP addresses as resources that are managed by the RGM. The correct way or anadministrator to add or to remove an IP address is to use clresource and clresourcegroup tomodiy the resource group that contains the resource.
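The RGM-managed procedure can be sketched with the Sun Cluster 3.2 command set. The resource group name, resource name, and host name below are hypothetical examples, not values from this guide:

```shell
# Add a logical-hostname IP address by creating an RGM-managed resource
# (resource group rg-web and host name schost-lh are hypothetical).
clresource create -g rg-web -t SUNW.LogicalHostname \
    -p HostnameList=schost-lh schost-lh-rs

# Remove the address by disabling and deleting the resource,
# rather than by running ifconfig on the adapter directly.
clresource disable schost-lh-rs
clresource delete schost-lh-rs
```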
For more information about the Solaris implementation of IP Network Multipathing, see the appropriate documentation for the Solaris Operating System that is installed on your cluster:

■ Solaris 9 Operating System: Chapter 1, “IP Network Multipathing (Overview),” in IP Network Multipathing Administration Guide
■ Solaris 10 Operating System: Part VI, “IPMP,” in System Administration Guide: IP Services
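As an illustration of the per-host configuration described in this section, a Solaris 10 host might place a public adapter in a multipathing group with a dedicated test address through its /etc/hostname.<adapter> file. This is a minimal sketch: the adapter name qfe0, group name sc_ipmp0, and host names are assumptions, not values from this guide:

```shell
# /etc/hostname.qfe0 (hypothetical): put qfe0 in IPMP group sc_ipmp0
# and configure a non-failover test address for in.mpathd to probe.
phys-schost-1 netmask + broadcast + group sc_ipmp0 up
addif phys-schost-1-test deprecated -failover netmask + broadcast + up
```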
Chapter 3 • Key Concepts for System Administrators and Application Developers

SPARC: Dynamic Reconfiguration Support

Sun Cluster 3.2 1/09 support for the dynamic reconfiguration (DR) software feature is being developed in incremental phases. This section describes concepts and considerations for Sun Cluster 3.2 1/09 support of the DR feature.
All the requirements, procedures, and restrictions that are documented for the Solaris DR feature also apply to Sun Cluster DR support (except for the operating environment quiescence operation). Therefore, review the documentation for the Solaris DR feature before using the DR feature with Sun Cluster software. You should review in particular the issues that affect nonnetwork IO devices during a DR detach operation.
The Sun Enterprise 10000 Dynamic Reconfiguration User Guide and the Sun Enterprise 10000 Dynamic Reconfiguration Reference Manual (from the Solaris 10 on Sun Hardware collection) are both available for download from http://docs.sun.com.

SPARC: Dynamic Reconfiguration General Description

The DR feature enables operations, such as the removal of system hardware, in running systems. The DR processes are designed to ensure continuous system operation with no need to halt the system or interrupt cluster availability.

DR operates at the board level. Therefore, a DR operation affects all the components on a board. Each board can contain multiple components, including CPUs, memory, and peripheral interfaces for disk drives, tape drives, and network connections.

Removing a board that contains active components would result in system errors. Before removing a board, the DR subsystem queries other subsystems, such as Sun Cluster, to determine whether the components on the board are being used. If the DR subsystem finds that a board is in use, the DR remove-board operation is not done. Therefore, it is always safe to issue a DR remove-board operation because the DR subsystem rejects operations on boards that contain active components.

The DR add-board operation is also always safe. CPUs and memory on a newly added board are automatically brought into service by the system. However, the system administrator must manually configure the cluster to actively use components that are on the newly added board.

Note – The DR subsystem has several levels. If a lower level reports an error, the upper level also reports an error. However, when the lower level reports the specific error, the upper level reports Unknown error. You can safely ignore this error.

The following sections describe DR considerations for the different device types.
SPARC: DR Clustering Considerations for CPU Devices

Sun Cluster software does not reject a DR remove-board operation because of the presence of CPU devices.

When a DR add-board operation succeeds, CPU devices on the added board are automatically incorporated in system operation.
SPARC: DR Clustering Considerations for Memory

For the purposes of DR, consider two types of memory:

■ Kernel memory cage
■ Non-kernel memory cage

These two types differ only in usage. The actual hardware is the same for both types. Kernel memory cage is the memory that is used by the Solaris Operating System. Sun Cluster software does not support remove-board operations on a board that contains the kernel memory cage and rejects any such operation. When a DR remove-board operation pertains to memory other than the kernel memory cage, Sun Cluster software does not reject the operation. When a DR add-board operation that pertains to memory succeeds, memory on the added board is automatically incorporated in system operation.
SPARC: DR Clustering Considerations for Disk and Tape Drives

Sun Cluster rejects dynamic reconfiguration (DR) remove-board operations on active drives on the primary host. You can perform DR remove-board operations on inactive drives on the primary host and on any drives in the secondary host. After the DR operation, cluster data access continues as before.
Note – Sun Cluster rejects DR operations that impact the availability of quorum devices. For considerations about quorum devices and the procedure for performing DR operations on them, see “SPARC: DR Clustering Considerations for Quorum Devices” on page 96.

See “Dynamic Reconfiguration With Quorum Devices” in Sun Cluster System Administration Guide for Solaris OS for detailed instructions about how to perform these actions.
SPARC: DR Clustering Considerations for Quorum Devices

If the DR remove-board operation pertains to a board that contains an interface to a device configured for quorum, Sun Cluster software rejects the operation. Sun Cluster software also identifies the quorum device that would be affected by the operation. You must disable the device as a quorum device before you can perform a DR remove-board operation.
See Chapter 6, “Administering Quorum,” in Sun Cluster System Administration Guide for Solaris OS for detailed instructions about how to administer quorum.
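The disable-then-DR sequence can be sketched with the Sun Cluster 3.2 command set. The device name d4 is a hypothetical example, and the exact procedure for your configuration is in the administration guide cited above:

```shell
# Unconfigure the quorum device before removing the board
# that hosts its interface (d4 is a hypothetical device name).
clquorum remove d4

# ... perform the DR remove-board operation here ...

# Reconfigure the device as a quorum device afterward.
clquorum add d4
```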
SPARC: DR Clustering Considerations for Cluster Interconnect Interfaces

If the DR remove-board operation pertains to a board containing an active cluster interconnect interface, Sun Cluster software rejects the operation. Sun Cluster software also identifies the interface that would be affected by the operation. You must use a Sun Cluster administrative tool to disable the active interface before the DR operation can succeed.
Caution – Sun Cluster software requires each cluster node to have at least one functioning path to every other cluster node. Do not disable a private interconnect interface that supports the last path to any Solaris host in the cluster.

See “Administering the Cluster Interconnects” in Sun Cluster System Administration Guide for Solaris OS for detailed instructions about how to perform these actions.
SPARC: DR Clustering Considerations for Public Network Interfaces

If the DR remove-board operation pertains to a board that contains an active public network interface, Sun Cluster software rejects the operation. Sun Cluster software also identifies the interface that would be affected by the operation. Before you remove a board with an active network interface present, switch over all traffic on that interface to another functional interface in the multipathing group by using the if_mpadm command.
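A minimal sketch of that switchover, assuming a hypothetical adapter name bge0:

```shell
# Move all IP addresses off bge0 to another adapter in its
# multipathing group before the DR remove-board operation.
if_mpadm -d bge0

# ... perform the DR operation ...

# Return the addresses to bge0 once the replacement board is in place.
if_mpadm -r bge0
```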
Caution – If the remaining network adapter fails while you are performing the DR remove operation on the disabled network adapter, availability is impacted. The remaining adapter has no place to fail over for the duration of the DR operation.

See “Administering the Public Network” in Sun Cluster System Administration Guide for Solaris OS for detailed instructions about how to perform a DR remove operation on a public network interface.
C H A P T E R 4

Frequently Asked Questions
This chapter includes answers to the most frequently asked questions about the Sun Cluster product.

The questions are organized by topic as follows:

■ “High Availability FAQs” on page 99
■ “File Systems FAQs” on page 100
■ “Volume Management FAQs” on page 101
■ “Data Services FAQs” on page 101
■ “Public Network FAQs” on page 102
■ “Cluster Member FAQs” on page 103
■ “Cluster Storage FAQs” on page 104
■ “Cluster Interconnect FAQs” on page 104
■ “Client Systems FAQs” on page 105
■ “Administrative Console FAQs” on page 105
■ “Terminal Concentrator and System Service Processor FAQs” on page 106
High Availability FAQs

Question: What exactly is a highly available system?
Answer: The Sun Cluster software defines high availability (HA) as the ability of a cluster to keep an application running. The application runs even when a failure occurs that would normally make a host system unavailable.

Question: What is the process by which the cluster provides high availability?
Answer: Through a process known as failover, the cluster framework provides a highly available environment. Failover is a series of steps that are performed by the cluster to migrate data service resources from a failing node to another operational node in the cluster.
Question: What is the difference between a failover and scalable data service?
Answer: There are two types of highly available data services:

■ Failover
■ Scalable

A failover data service runs an application on only one primary node in the cluster at a time. Other nodes might run other applications, but each application runs on only a single node. If a primary node fails, applications that are running on the failed node fail over to another node. They continue running.
A scalable data service spreads an application across multiple nodes to create a single, logical service. Scalable services leverage the number of nodes and processors in the entire cluster on which they run.

For each application, one node hosts the physical interface to the cluster. This node is called a Global Interface (GIF) node. Multiple GIF nodes can exist in the cluster. Each GIF node hosts one or more logical interfaces that can be used by scalable services. These logical interfaces are called global interfaces. One GIF node hosts a global interface for all requests for a particular application and dispatches them to multiple nodes on which the application server is running. If the GIF node fails, the global interface fails over to a surviving node.

If any node on which the application is running fails, the application continues to run on other nodes with some performance degradation. This process continues until the failed node returns to the cluster.
File Systems FAQs

Question: Can I run one or more of the Solaris hosts in the cluster as highly available NFS servers with other Solaris hosts as clients?
Answer: No, do not do a loopback mount.

Question: Can I use a cluster file system for applications that are not under Resource Group Manager control?
Answer: Yes. However, without RGM control, the applications need to be restarted manually after the failure of the node on which they are running.

Question: Must all cluster file systems have a mount point under the /global directory?
Answer: No. However, placing cluster file systems under the same mount point, such as /global, enables better organization and management of these file systems.
Question: What are the differences between using the cluster file system and exporting NFS file systems?
Answer: Several differences exist:

1. The cluster file system supports global devices. NFS does not support remote access to devices.
2. The cluster file system has a global namespace. Only one mount command is required. With NFS, you must mount the file system on each host.
3. The cluster file system caches files in more cases than does NFS. For example, the cluster file system caches files when a file is being accessed from multiple nodes for read, write, file locks, asynchronous I/O.
4. The cluster file system is built to exploit future fast cluster interconnects that provide remote DMA and zero-copy functions.
5. If you change the attributes on a file (using chmod, for example) in a cluster file system, the change is reflected immediately on all nodes. With an exported NFS file system, this change can take much longer.
Question: The file system /global/.devices/node@nodeID appears on my cluster nodes. Can I use this file system to store data that I want to be highly available and global?
Answer: These file systems store the global device namespace. These file systems are not intended for general use. While they are global, these file systems are never accessed in a global manner. Each node only accesses its own global device namespace. If a node is down, other nodes cannot access this namespace for the node that is down. These file systems are not highly available. These file systems should not be used to store data that needs to be globally accessible or highly available.
Volume Management FAQs

Question: Do I need to mirror all disk devices?
Answer: For a disk device to be considered highly available, it must be mirrored, or use RAID-5 hardware. All data services should use either highly available disk devices, or cluster file systems mounted on highly available disk devices. Such configurations can tolerate single disk failures.

Question: Can I use one volume manager for the local disks (boot disk) and a different volume manager for the multihost disks?
Answer: This configuration is supported with the Solaris Volume Manager software managing the local disks and Veritas Volume Manager managing the multihost disks. No other combination is supported.
Data Services FAQs

Question: Which Sun Cluster data services are available?
Answer: The list of supported data services is included in the Sun Cluster Release Notes.

Question: Which application versions are supported by Sun Cluster data services?
Answer: The list of supported application versions is included in the Sun Cluster Release Notes.
Chapter 4 • Frequently Asked Questions 101
Question: Can I write my own data service?
Answer: Yes. See Chapter 11, “DSDL API Functions,” in Sun Cluster Data Services Developer's Guide for Solaris OS for more information.

Question: When creating network resources, should I specify numeric IP addresses or host names?
Answer: The preferred method for specifying network resources is to use the UNIX host name rather than the numeric IP address.
Question: When creating network resources, what is the difference between using a logical host name (a LogicalHostname resource) or a shared address (a SharedAddress resource)?
Answer: Except in the case of Sun Cluster HA for NFS, wherever the documentation recommends the use of a LogicalHostname resource in a Failover mode resource group, a SharedAddress resource or LogicalHostname resource can be used interchangeably. The use of a SharedAddress resource incurs some additional overhead because the cluster networking software is configured for a SharedAddress but not for a LogicalHostname.

The advantage to using a SharedAddress resource is demonstrated when you configure both scalable and failover data services, and want clients to be able to access both services by using the same host name. In this case, the SharedAddress resources along with the failover application resource are contained in one resource group. The scalable service resource is contained in a separate resource group and configured to use the SharedAddress resource. Both the scalable and failover services can then use the same set of host names and addresses that are configured in the SharedAddress resource.
Public Network FAQs

Question: Which public network adapters does the Sun Cluster software support?
Answer: Currently, the Sun Cluster software supports Ethernet (10/100BASE-T and 1000BASE-SX Gb) public network adapters. Because new interfaces might be supported in the future, check with your Sun sales representative for the most current information.

Question: What is the role of the MAC address in failover?
Answer: When a failover occurs, new Address Resolution Protocol (ARP) packets are generated and broadcast to the world. These ARP packets contain the new MAC address (of the new physical adapter to which the host failed over) and the old IP address. When another machine on the network receives one of these packets, it flushes the old MAC-IP mapping from its ARP cache and uses the new one.

Question: Does the Sun Cluster software support setting local-mac-address?=true?
Answer: Yes. In fact, IP Network Multipathing requires that local-mac-address? must be set to true.
You can set local-mac-address with the eeprom command, at the OpenBoot PROM ok prompt in a SPARC based cluster. See the eeprom(1M) man page. You can also set the MAC address with the SCSI utility that you optionally run after the BIOS boots in an x86 based cluster.
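For example, from a root shell on a running SPARC host, you might verify and then set the variable with eeprom. The variable name is the standard OpenBoot one; the invocation is a sketch:

```shell
# Check the current setting, then enable per-adapter MAC addresses.
# The "?" must be quoted so the shell does not treat it as a glob.
eeprom "local-mac-address?"
eeprom "local-mac-address?=true"
```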
Question: How much delay can I expect when IP network multipathing performs a switchover between adapters?
Answer: The delay could be several minutes. The reason is that when an IP network multipathing switchover is performed, the operation sends a gratuitous ARP broadcast. However, you cannot be sure that the router between the client and the cluster uses the gratuitous ARP. So, until the ARP cache entry for this IP address on the router times out, the entry can use the stale MAC address.
Question: How fast are failures of a network adapter detected?
Answer: The default failure detection time is 10 seconds. The algorithm tries to meet the failure detection time, but the actual time depends on the network load.
Cluster Member FAQs

Question: Do all cluster members need to have the same root password?
Answer: You are not required to have the same root password on each cluster member. However, you can simplify administration of the cluster by using the same root password on all nodes.

Question: Is the order in which nodes are booted significant?
Answer: In most cases, no. However, the boot order is important to prevent amnesia. For example, if node two was the owner of the quorum device and node one is down, and then you bring node two down, you must bring up node two before bringing back node one. This order prevents you from accidentally bringing up a node with outdated cluster configuration information.

Question: Do I need to mirror local disks in a cluster node?
Answer: Yes. Though this mirroring is not a requirement, mirroring the cluster node's disks prevents a nonmirrored disk failure from taking down the node. The downside to mirroring a cluster node's local disks is more system administration overhead.

Question: What are the cluster member backup issues?
Answer: You can use several backup methods for a cluster. One method is to have a host as the backup node with a tape drive or library attached. Then use the cluster file system to back up the data. Do not connect this host to the shared disks.
See Chapter 11, “Backing Up and Restoring a Cluster,” in Sun Cluster System Administration Guide for Solaris OS for additional information about how to back up and restore data.
Question: When is a node healthy enough to be used as a secondary node?
Answer: Solaris 9 OS: After a reboot, a node is healthy enough to be a secondary node when the node displays the login prompt.
Solaris 10 OS: A node is healthy enough to be a secondary node if the multi-user-server milestone is running.

# svcs -a | grep multi-user-server:default
Cluster Storage FAQs

Question: What makes multihost storage highly available?
Answer: Multihost storage is highly available because it can survive the loss of a single disk, because of mirroring (or because of hardware-based RAID-5 controllers). Because a multihost storage device has more than one host connection, it can also withstand the loss of a single Solaris host to which it is connected. In addition, redundant paths from each host to the attached storage provide tolerance for the failure of a host bus adapter, cable, or disk controller.
Cluster Interconnect FAQs

Question: Which cluster interconnects does the Sun Cluster software support?
Answer: Currently, the Sun Cluster software supports the following cluster interconnects:

■ Ethernet (100BASE-T Fast Ethernet and 1000BASE-SX Gb) in both SPARC based and x86 based clusters
■ Infiniband in both SPARC based and x86 based clusters
■ SCI in SPARC based clusters only
Question: What is the difference between a “cable” and a transport “path”?
Answer: Cluster transport cables are configured by using transport adapters and switches. Cables join adapters and switches on a component-to-component basis. The cluster topology manager uses available cables to build end-to-end transport paths between hosts. A cable does not map directly to a transport path.
Cables are statically “enabled” and “disabled” by an administrator. Cables have a “state” (enabled or disabled), but not a “status.” If a cable is disabled, it is as if it were unconfigured. Cables that are disabled cannot be used as transport paths. These cables are not probed and therefore their state is unknown. You can obtain the state of a cable by using the cluster status command.
Transport paths are dynamically established by the cluster topology manager. The “status” of a transport path is determined by the topology manager. A path can have a status of “online” or “offline.” You can obtain the status of a transport path by using the clinterconnect status command. See the clinterconnect(1CL) man page.
Consider the following example of a two-host cluster with four cables.
node1:adapter0 to switch1, port0
node1:adapter1 to switch2, port0
node2:adapter0 to switch1, port1
node2:adapter1 to switch2, port1
Two possible transport paths can be formed from these four cables.
node1:adapter0 to node2:adapter0
node1:adapter1 to node2:adapter1
Client Systems FAQs

Question: Do I need to consider any special client needs or restrictions for use with a cluster?
Answer: Client systems connect to the cluster as they would to any other server. In some instances, depending on the data service application, you might need to install client-side software or perform other configuration changes so that the client can connect to the data service application. See Chapter 1, “Planning for Sun Cluster Data Services,” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information about client-side configuration requirements.
Administrative Console FAQs

Question: Does the Sun Cluster software require an administrative console?
Answer: Yes.

Question: Does the administrative console have to be dedicated to the cluster, or can it be used for other tasks?
Answer: The Sun Cluster software does not require a dedicated administrative console, but using one provides these benefits:

■ Enables centralized cluster management by grouping console and management tools on the same machine
■ Provides potentially quicker problem resolution by your hardware service provider

Question: Does the administrative console need to be located “close” to the cluster, for example, in the same room?
Answer: Check with your hardware service provider. The provider might require that the console be located in close proximity to the cluster. No technical reason exists for the console to be located in the same room.
Question: Can an administrative console serve more than one cluster, if any distance requirements are also first met?
Answer: Yes. You can control multiple clusters from a single administrative console. You can also share a single terminal concentrator between clusters.
Terminal Concentrator and System Service Processor FAQs

Question: Does the Sun Cluster software require a terminal concentrator?
Answer: Starting with Sun Cluster 3.0, Sun Cluster software does not require a terminal concentrator. Unlike Sun Cluster 2.2, which required a terminal concentrator for fencing, Sun Cluster 3.0, Sun Cluster 3.1, and Sun Cluster 3.2 do not.
Question: I see that most Sun Cluster servers use a terminal concentrator, but the Sun Enterprise E10000 server does not. Why not?
Answer: The terminal concentrator is effectively a serial-to-Ethernet converter for most servers. The terminal concentrator's console port is a serial port. The Sun Enterprise E10000 server doesn't have a serial console. The System Service Processor (SSP) is the console, either through an Ethernet or jtag port. For the Sun Enterprise E10000 server, you always use the SSP for consoles.
Question: What are the benefits of using a terminal concentrator?
Answer: Using a terminal concentrator provides console-level access to each Solaris host from a remote machine anywhere on the network. This access is provided even when the host is at the OpenBoot PROM (OBP) on a SPARC based host or a boot subsystem on an x86 based host.
Question: If I use a terminal concentrator that Sun does not support, what do I need to know to qualify the one that I want to use?
Answer: The main difference between the terminal concentrator that Sun supports and other console devices is that the Sun terminal concentrator has special firmware. This firmware prevents the terminal concentrator from sending a break to the console when it boots. If you have a console device that can send a break, or a signal that might be interpreted as a break to the console, the break shuts down the host.
Question: Can I free a locked port on the terminal concentrator that Sun supports without rebooting it?
Answer: Yes. Note the port number that needs to be reset and type the following commands:

telnet tc
Enter Annex port name or number: cli
annex: su -
annex# admin
admin : reset port-number
admin : quit
annex# hangup
#
Refer to the following manuals for more information about how to configure and administer the terminal concentrator that Sun supports.

■ “Overview of Administering Sun Cluster” in Sun Cluster System Administration Guide for Solaris OS
■ Chapter 2, “Installing and Configuring the Terminal Concentrator,” in Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS
Question: What if the terminal concentrator itself fails? Must I have another one standing by?
Answer: No. You do not lose any cluster availability if the terminal concentrator fails. You do lose the ability to connect to the host consoles until the concentrator is back in service.

Question: If I do use a terminal concentrator, what about security?
Answer: Generally, the terminal concentrator is attached to a small network that system administrators use, not a network that is used for other client access. You can control security by limiting access to that particular network.
Question: SPARC: How do I use dynamic reconfiguration with a tape or disk drive?
Answer: Perform the following steps:

■ Determine whether the disk or tape drive is part of an active device group. If the drive is not part of an active device group, you can perform the DR remove operation on it.
■ If the DR remove-board operation would affect an active disk or tape drive, the system rejects the operation and identifies the drives that would be affected by the operation. If the drive is part of an active device group, go to “SPARC: DR Clustering Considerations for Disk and Tape Drives” on page 95.
■ Determine whether the drive is a component of the primary node or the secondary node. If the drive is a component of the secondary node, you can perform the DR remove operation on it.
■ If the drive is a component of the primary node, you must switch the primary and secondary nodes before performing the DR remove operation on the device.
Caution – If the current primary node fails while you are performing the DR operation on a secondary node, cluster availability is impacted. The primary node has no place to fail over until a new secondary node is provided.
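The primary/secondary switch in the last step can be sketched with the Sun Cluster 3.2 command set. The device group and node names are hypothetical:

```shell
# Make phys-schost-2 the primary for device group dg-schost-1 so that
# the DR remove operation can proceed on the former primary's board.
cldevicegroup switch -n phys-schost-2 dg-schost-1
```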
Index
A li t t (C ti d)
8/2/2019 Cluster Sun Conceptos
http://slidepdf.com/reader/full/cluster-sun-conceptos 109/116
Aadapters, See network, adaptersadministration, cluster, 41-97
administrative console, 28-29FAQs, 105-106
administrative interaces, 42agents, See data servicesamnesia, 57APIs, 71-73, 75application, See data servicesapplication communication, 73-74
application development, 41-97application distribution, 60attributes,See properties
Bbackup node, 103-104
board removal, dynamic reconguration, 95boot disk, See disks, localboot order, 103-104
Ccable, transport, 104-105
CCP, 28CCR, 45-46CD-ROM drive, 26client-server conguration, 65client systems, 27
client systems (Continued)FAQs, 105restrictions, 105
clprivnet driver, 74cluster
administration, 41-97advantages, 14-15application developer view, 18application development, 41-97backup, 103-104board removal, 95
boot order, 103-104conguration, 45-46, 83-91data services, 64-71description, 14-15le system, 51-53, 100-101
FAQsSeealso le system
HAStoragePlus resource type, 52-53
using, 51-52goals, 14-15hardware, 15-16, 21-29interconnect, 23, 26-27
adapters, 26cables, 27data services, 73-74dynamic reconguration, 96
FAQs, 104-105interaces, 26 junctions, 27supported, 104-105
media, 26
109
  members, 22, 44
    FAQs, 103-104
    reconfiguration, 44
  nodes, 22-23
  password, 103-104
  public network, 27
  public network interface, 65
  service, 15-16
  software components, 23-24
  storage FAQs, 104
  system administrator view, 16-17
  task list, 18-19
  time, 42
  topologies, 29-37, 38-40
Cluster Configuration Repository, 45-46
Cluster Control Panel, 28
cluster in a box topology, 33
Cluster Membership Monitor, 44
clustered pair topology, 30, 39
clustered-server model, 65
clusters span two hosts topology, 35-36
CMM, 44
  failfast mechanism, 44
  See also failfast
concurrent access, 22
configuration
  client-server, 65
  data services, 83-91
  parallel database, 22
  repository, 45-46
  virtual memory limits, 86-87
configurations, quorum, 59-60
console
  access, 28
  administrative, 28
  FAQs, 105-106
  System Service Processor, 28
Controlling CPU, 82
CPU, control, 82
CPU time, 83-91

D
daemons, svc.startd, 81
data, storing, 100-101
data services, 64-71
  APIs, 71-73
  cluster interconnect, 73-74
  configuration, 83-91
  developing, 71-73
  failover, 67-68
  FAQs, 101-102
  fault monitor, 71
  highly available, 43
  library API, 73
  methods, 67
  resource groups, 74-77
  resource types, 74-77
  resources, 74-77
  scalable, 68-69
  supported, 101-102
/dev/global/ namespace, 49-50
developer, cluster applications, 18
device
  global, 46-47
  ID, 46-47
device group, 47-49
  changing properties, 48-49
device groups
  failover, 47
  multiported, 48-49
  primary ownership, 48-49
devices
  multihost, 24
  quorum, 56-63
DID, 46-47
disk path monitoring, 53-56
disks
  dynamic reconfiguration, 95
  global devices, 46-47, 49-50
  local, 25-26, 46-47, 49-50
    mirroring, 103-104
    volume management, 101
  multihost, 46-47, 47-49, 49-50
  SCSI devices, 25
DR, See dynamic reconfiguration
Sun Cluster Concepts Guide for Solaris OS • January 2009, Revision A
driver, device ID, 46-47
DSDL API, 75
dynamic reconfiguration, 94-97
  cluster interconnect, 96
  CPU devices, 95
  description, 94
  disks, 95
  memory, 95
  public network, 96-97
  quorum devices, 96
  tape drives, 95

E
E10000, See Sun Enterprise E10000

F
failback, 71
failfast, 44-45
failover
  data services, 67-68
  device groups, 47
  scenarios, Solaris Resource Manager, 87-91
failure
  detection, 43
  failback, 71
  recovery, 43
FAQs, 99-107
  administrative console, 105-106
  client systems, 105
  cluster interconnect, 104-105
  cluster members, 103-104
  cluster storage, 104
  data services, 101-102
  file systems, 100-101
  high availability, 99-100
  public network, 102-103
  System Service Processor, 106-107
  terminal concentrator, 106-107
  volume management, 101
fault monitor, 71
fencing, 44
file locking, 51
file system
  cluster, 51-53, 100-101
  data storage, 100-101
  FAQs, 100-101
  global
    See file system, cluster
  high availability, 100-101
  local, 52-53
  mounting, 51-53, 100-101
  NFS, 53, 100-101
  syncdir mount option, 53
  UFS, 53
  VxFS, 53
file systems, using, 51-52
framework, high availability, 43-46
Frequently Asked Questions, See FAQs

G
global
  device, 46-47, 47-49
    local disks, 25
    mounting, 51-53
  interface, 66
    scalable services, 68
  namespace, 46, 49-50
    local disks, 25
global file system, See cluster, file system
global interface node, 66
/global mount point, 51-53, 100-101
groups, device, 47-49

H
HA, See high availability
hardware, 15-16, 21-29, 94-97
  See also disks
  See also storage
  cluster interconnect components, 26
  dynamic reconfiguration, 94-97
HAStoragePlus resource type, 52-53, 74-77
high availability
  FAQs, 99-100
  framework, 43-46
highly available, data services, 43
host name, 65

I
ID
  device, 46-47
  node, 50
in.mpathd daemon, 92
interfaces
  See network, interfaces
  administrative, 42
IP address, 101-102
IP Network Multipathing, 92-93
  failover time, 102-103
IPMP, See IP Network Multipathing

K
kernel, memory, 95

L
load balancing, 69-71
local disks, 25-26
local file system, 52-53
local_mac_address, 102-103
local namespace, 50
logical host name, 65
  compared to shared address, 101-102
  failover data services, 67-68
LogicalHostname resource type, See logical host name

M
MAC address, 102-103
mapping, namespaces, 50
media, removable, 26
membership, See cluster, members
memory, 95
mission-critical applications, 62
monitoring
  disk path, 53-56
  object type, 81
  system resources, 82
  telemetry attributes, 82
mounting
  file systems, 51-53
  /global, 100-101
  global devices, 51-53
  with syncdir, 53
multi-initiator SCSI, 25
multihost device, 24
multipathing, 92-93
multiported device groups, 48-49

N
N+1 (star) topology, 31-32, 39-40
N*N (scalable) topology, 32
namespaces, 49-50, 50
network
  adapters, 27, 92-93
  interfaces, 27, 92-93
  load balancing, 69-71
  logical host name, 65
  private, 23
  public, 27
    dynamic reconfiguration, 96-97
    FAQs, 102-103
    interfaces, 102-103
    IP Network Multipathing, 92-93
  resources, 65, 74-77
  shared address, 65
Network Time Protocol, 42
NFS, 53
nodes, 22-23
  backup, 103-104
  boot order, 103-104
  global interface, 66
  nodeID, 50
  primary, 48-49, 65
  secondary, 48-49, 65
NTP, 42
numsecondaries property, 48
O
object type, system resource, 81
Oracle Parallel Server, See Oracle Real Application Clusters
Oracle Real Application Clusters, 72

P
pair+N topology, 31
panic, 44-45, 45
parallel database configurations, 22
password, root, 103-104
path, transport, 104-105
per-host address, 73-74
preferenced property, 48
primary node, 65
primary ownership, device groups, 48-49
private network, 23
projects, 83-91
properties
  changing, 48-49
  resource groups, 77
  Resource_project_name, 85-86
  resources, 77
  RG_project_name, 85-86
proxy resource types, 80
public network, See network, public
pure service, 69

Q
quorum, 56-63
  atypical configurations, 62
  bad configurations, 63
  best practices, 60
  configurations, 58-59, 59-60
  device, dynamic reconfiguration, 96
  devices, 56-63
  recommended configurations, 61
  requirements, 59-60
  vote counts, 58

R
recovery
  failback settings, 71
  failure detection, 43
redundant I/O domains topology, 37
removable media, 26
Resource Group Manager, See RGM
resource groups, 74-77
  failover, 67-68
  properties, 77
  scalable, 68-69
  settings, 75-77
  states, 75-77
resource management, 83-91
Resource_project_name property, 85-86
resource types, 52-53, 74-77
  proxy, 80
  SUNW.Proxy_SMF_failover, 80
  SUNW.Proxy_SMF_loadbalanced, 80
  SUNW.Proxy_SMF_multimaster, 80
resources, 74-77
  properties, 77
  settings, 75-77
  states, 75-77
RG_project_name property, 85-86
RGM, 67, 74-77, 83-91
RMAPI, 75
root password, 103-104

S
scalable data services, 68-69
scha_cluster_get command, 74
scha_privatelink_hostname_node argument, 74
SCSI, multi-initiator, 25
scsi-initiator-id property, 25
secondary node, 65
server models, 65
Service Management Facility (SMF), 80-81
shared address, 65
  compared to logical host name, 101-102
  global interface node, 66
  scalable data services, 68-69
SharedAddress resource type, See shared address
shutdown, 44-45
single cluster spans two hosts topology, 34-35
single-server model, 65
SMF, See Service Management Facility (SMF)
SMF daemon svc.startd, 81
software components, 23-24
Solaris projects, 83-91
Solaris Resource Manager, 83-91
  configuration requirements, 85-86
  configuring virtual memory limits, 86-87
  failover scenarios, 87-91
Solaris Volume Manager, multihost devices, 24
split brain, 57
SSP, See System Service Processor
sticky service, 69
storage, 24
  dynamic reconfiguration, 95
  FAQs, 104
  SCSI, 25
Sun Cluster, See cluster
Sun Cluster Manager, 42
  system resource usage, 83
Sun Enterprise E10000, 106-107
  administrative console, 28
Sun Management Center (SunMC), 42
SUNW.Proxy_SMF_failover, resource types, 80
SUNW.Proxy_SMF_loadbalanced, resource types, 80
SUNW.Proxy_SMF_multimaster, resource types, 80
svc.startd, daemons, 81
syncdir mount option, 53
system resource, threshold, 82
system resource monitoring, 82
system resource usage, 81
system resources
  monitoring, 82
  object type, 81
  usage, 81
System Service Processor, 28
  FAQs, 106-107

T
tape drive, 26
telemetry attribute, system resources, 82
terminal concentrator, FAQs, 106-107
threshold
  system resource, 82
  telemetry attribute, 82
time, between hosts, 42
topologies, 29-37, 38-40
  clustered pair, 30, 39
  logical domains: cluster in a box, 33
  logical domains: clusters span two hosts, 35-36
  logical domains: redundant I/O domains, 37
  logical domains: single cluster spans two hosts, 34-35
  N+1 (star), 31-32, 39-40
  N*N (scalable), 32
  pair+N, 31

U
UFS, 53

V
Veritas Volume Manager, multihost devices, 24
volume management
  FAQs, 101
  local disks, 101
  multihost devices, 24
  multihost disks, 101
  namespace, 49
  RAID-5, 101
  Solaris Volume Manager, 101
  Veritas Volume Manager, 101
vote counts, quorum, 58
VxFS, 53

Z
zones, 77