IT Service Management: 2011 Ministry of Education-IBM Excellent Course
School of Software Engineering, Tongji University. Yan Haizhou, [email protected]
Chapter 5 Service Operation
Tivoli Software
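Service Operation
• Service Operation provides guidance on achieving effectiveness and efficiency in service delivery and service support, so that value is realized for both the customer and the service provider.
• The Service Operation volume covers the following topics and processes:
• Service Operation Principles
• Service Operation Processes
– Event Management
– Incident Management
– Request Fulfillment
– Problem Management
– Access Management
• Common Service Operation Activities
– IT Operations (Console, Job Scheduling, etc.)
– Mainframe Support
– Server Management and Support
– Desktop Support, Middleware Management, Internet/Web Management
– Application Management Activities
• Organizing Service Operation
– Service Desk
– Technical Management
– IT Operations Management
– Application Management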
[Figure: ITIL service lifecycle: Service Strategy, Service Design, Service Transition, Service Operation]
5.1 Introduction
Service Operation (SO)
• Coordinate and carry out the day-to-day activities and processes to deliver and manage services at agreed levels
• Ongoing management of the technology that is used to deliver and support services
• Where the plans, designs and optimizations are executed and measured
Service Operation Goals
• Coordinate and execute all ongoing activities required to deliver and support services
--Execute the services
--Coordinate Service Management processes
--Manage the technology infrastructure used to deliver services
--Coordinate the people who manage the technology, processes, and services
Scope of SO
• Ongoing management of:
– The services themselves
– The Service Management processes
– Technology
– People
Value to business of SO
• Where the actual value of strategy, design and transition is realized by customers and users
• Where business dependency usually commences
ACHIEVING BALANCE IN SERVICE OPERATION
• Service Operation: More than repetitive execution
--Services delivered in a changing environment
--Conflict between status quo and adaptation
--Balance between conflicting sets of priorities
• Balance Areas of Conflict:
--Internal IT View vs. External Business View
--Stability vs. Responsiveness
--Quality of Service vs. Cost of Service
--Reactive vs. Proactive
ACHIEVING BALANCE IN SERVICE OPERATION
• Internal IT View vs. External Business View
--Internal: IT components and systems
--External: Users and customer experiences
• Stability vs. Responsiveness
--Stability: Stable platform and consistent
--Responsiveness: Quick response and flexible
• Quality of Service vs. Cost of Service
--Quality: Consistent delivery of service
--Cost: Costs and resource utilization optimal
• Reactive vs. Proactive
--Reactive: Does not act until prompted
--Proactive: Always looking to improve
[Figure: Internal IT View vs. External Business View]
[Figure: Stability vs. Responsiveness]
[Figure: Quality of Service vs. Cost of Service]
[Figure: Reactive vs. Proactive]
Operational Health
• What is operational health?
• Who should pay attention to operational health?
• Think of your own health: the heart? the brain? or something else?
Communication
• Good communication is needed between all ITSM personnel and with users/customers/partners
• Issues can often be mitigated or avoided through good communication
• All communication should have:
– Intended purpose and/or resultant action
– Clear audience, who should be involved in deciding the need/format
5.2 Service Operation Processes
Event Management
• Objectives
• Basic concepts
• Roles
Event Management — Objectives
• Detect events, make sense of them, and determine the appropriate control action
• Event Management is the basis for Operational Monitoring and Control
Event Management — Basic concepts
• Event
– An alert or notification created by any IT Service, Configuration Item or monitoring tool, for example when a batch job has completed. Events typically require IT Operations personnel to take action, and often lead to Incidents being logged.
• Event Management
– The process responsible for managing Events throughout their lifecycle.
• Alert
Event Management — Logging and Filtering
[Figure: events pass through a Filter and are classified as Information, Warning or Exception, which may lead to an Incident/Problem/Change]
Event Management — Managing Exceptions
[Figure: an Exception is routed to Incident Management (Incident), Problem Management (Problem) or Change Management (RFC)]
Event Management — Information and Warnings
[Figure: Information goes to the Log; a Warning leads to any one or a combination of Alert, Auto Response, Human Intervention and Incident/Problem/Change (Incident, Problem, RFC)]
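The routing outlined in the Event Management slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Significance`, `Event`, `CANDIDATE_ACTIONS` and `candidate_actions`, and the action labels, are hypothetical and do not come from ITIL or from any Tivoli product. Which of the candidate actions is actually taken for a warning or an exception remains a policy decision.

```python
from dataclasses import dataclass
from enum import Enum


class Significance(Enum):
    INFORMATION = "information"
    WARNING = "warning"
    EXCEPTION = "exception"


@dataclass
class Event:
    source: str                 # CI or monitoring tool that raised the event
    message: str
    significance: Significance


# Hypothetical mapping from event type to candidate follow-up actions.
# Every event is logged; warnings and exceptions have further options.
CANDIDATE_ACTIONS = {
    Significance.INFORMATION: ["log"],
    Significance.WARNING: ["log", "alert", "auto_response"],
    Significance.EXCEPTION: ["log", "incident", "problem", "rfc"],
}


def candidate_actions(event: Event) -> list[str]:
    """Return a copy of the candidate actions for this event's type."""
    return list(CANDIDATE_ACTIONS[event.significance])
```

For example, an event reporting that a batch job has completed would be classified as information and only logged.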
Event Management — Roles
• Event management roles are filled by people in the following functions
– Service Desk
– Technical Management
– Application Management
– IT Operations Management
Metrics of Event Management
Designing for event management
1. Instrumentation
2. Error Messaging
3. Event Detection and Alert Mechanisms
4. Identification of thresholds
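As a rough sketch of threshold identification, a monitored reading can be compared against a warning threshold and an exception threshold. The function name `classify_reading` and the example thresholds (80 and 95 percent disk usage) are assumptions made for this illustration, not values taken from the course.

```python
def classify_reading(value: float, warning_at: float, exception_at: float) -> str:
    """Classify a reading where higher values are worse (e.g. disk usage in %)."""
    if warning_at > exception_at:
        raise ValueError("warning threshold must not exceed exception threshold")
    if value >= exception_at:
        return "exception"
    if value >= warning_at:
        return "warning"
    return "information"


# Hypothetical thresholds: warn at 80 % disk usage, raise an exception at 95 %.
print(classify_reading(85.0, warning_at=80.0, exception_at=95.0))
```

In practice the thresholds would be agreed per Configuration Item and revisited as usage patterns change.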
Incident Management
• Objectives
• Scope
• Business value
• Basic concepts
• Activities
• Interfaces
• Key metrics
• Roles
• Challenges
Incident Management — Objective
• To restore normal service operation as quickly as possible and minimize adverse impact on the business
Incident Management — Scope
• Managing any disruption or potential disruption to live IT services
• Incidents are identified
– Directly by users through the Service Desk
– Through an interface from Event Management to Incident Management tools
• Reported and/or logged by technical staff
Incident Management — Business value
• Quicker incident resolution
• Improved quality
• Reduced support costs
Why Incident Management
• Ensure the best use of resources to support the business
• Develop and maintain meaningful records relating to incidents
• Devise and apply a consistent approach to all incidents reported
Incident Definition
An incident is an event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service.
Incident Management — Basic concepts
• An Incident
– An unplanned interruption or reduction in the quality of an IT Service
– Any event which could affect an IT Service in the future is also an Incident
• Timescales
• Incident Models
• Major Incidents
Incident Management — Activities
Impact, Urgency & Priority
IMPACT
- The likely effect the incident will have on the business (e.g. number of users affected, magnitude)
URGENCY
- An assessment of how quickly an incident or problem requires resolution (i.e. how much delay the resolution can bear)
PRIORITY
- The relative sequence in which an incident or problem needs to be resolved, based on impact and urgency
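Because priority is derived from impact and urgency, it is often implemented as a simple lookup. The sketch below assumes three levels for each input and a priority scale of 1 (most urgent) to 5; the matrix values and the names `PRIORITY_MATRIX` and `priority` are illustrative assumptions, and each organization agrees its own mapping.

```python
# Illustrative matrix: (impact, urgency) -> priority, where 1 is handled first.
PRIORITY_MATRIX = {
    ("high", "high"): 1, ("high", "medium"): 2, ("high", "low"): 3,
    ("medium", "high"): 2, ("medium", "medium"): 3, ("medium", "low"): 4,
    ("low", "high"): 3, ("low", "medium"): 4, ("low", "low"): 5,
}


def priority(impact: str, urgency: str) -> int:
    """Look up the priority for an incident from its impact and urgency."""
    key = (impact.lower(), urgency.lower())
    if key not in PRIORITY_MATRIX:
        raise ValueError(f"unknown impact/urgency combination: {key}")
    return PRIORITY_MATRIX[key]


# An incident affecting many users (high impact) that can wait a while (low urgency).
print(priority("High", "Low"))
```

Keeping the mapping in a table rather than in nested conditions makes it easy to review with the business and to change without touching the logic.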
Example
Incident Management — Interfaces
• Problem Management
• Service Asset and Configuration Management (SACM)
• Change Management
• Capacity Management
• Availability Management
• Service Level Management
Incident Management — Key metrics
• Total number of incidents (as a control measure)
• Breakdown of incidents at each stage (for example, logged, WIP, closed, etc.)
• Size of incident backlog
• Mean elapsed time to resolution
• % resolved by the Service Desk (first-line fix)
• % handled within agreed response time
• % resolved within agreed Service Level Agreement target
• No. and % of Major Incidents
• No. and % of incidents correctly assigned
• Average cost of incident handling
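Several of these metrics can be computed directly from incident records. The following is a minimal sketch: the `Incident` record and its fields are hypothetical, and a real Service Desk tool would expose its own data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    opened: datetime
    resolved: Optional[datetime] = None        # None while the incident is open
    resolved_by_service_desk: bool = False     # True for a first-line fix


def backlog_size(incidents: list[Incident]) -> int:
    """Number of incidents that are still open."""
    return sum(1 for i in incidents if i.resolved is None)


def mean_time_to_resolution(incidents: list[Incident]) -> Optional[timedelta]:
    """Mean elapsed time from opening to resolution, over resolved incidents."""
    durations = [i.resolved - i.opened for i in incidents if i.resolved is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


def first_line_fix_rate(incidents: list[Incident]) -> Optional[float]:
    """Percentage of resolved incidents that were fixed by the Service Desk."""
    closed = [i for i in incidents if i.resolved is not None]
    if not closed:
        return None
    return 100.0 * sum(i.resolved_by_service_desk for i in closed) / len(closed)
```

Both averages return `None` when no incident has been resolved yet, so that an empty reporting period is not mistaken for a perfect one.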
Incident Management — Roles
• Incident Manager
– May be performed by Service Desk Supervisor
• Super Users
• First-Line Support
– Usually Service Desk Analysts
• Second-Line Support
• Third-Line Support (Technical Management, IT Operations, Applications Management, Third-party suppliers)
Benefits
• Reduced business impact of Incidents through timely resolution
• Improved monitoring of performance against targets
• Elimination of lost Incidents and Service Requests
• More accurate CMDB information
• Improved User satisfaction
• Less disruption to both IT support staff and Users
Possible Problems
• Lack of Management commitment
• Lack of agreed Customer service levels
• Lack of knowledge or resources for resolving incidents
• Poorly integrated processes
• Unsuitable software tools
• Users and IT staff bypassing the process
Incident Management — Challenges
• Ability to detect incidents as quickly as possible (dependency on Event Management)
• Ensuring all incidents are logged
• Ensuring previous history is available (Incidents, Problems, Known Errors, Changes)
• Integration with the Configuration Management System (CMS), Service Level Management (SLM) and the Known Error Database (KEDB)
Request Fulfillment
• Objectives
• Basic concepts
• Roles
Request Fulfillment — Objectives
• To provide a channel for users to request and receive standard services for which a predefined approval and qualification process exists
• To provide information to users and customers about the availability of services and the procedure for obtaining them
• To source and deliver the components of requested standard services (for example licenses and software media)
• To assist with general information, complaints or comments
Request Fulfillment — Basic concepts
• Service Request
– A request from a User for information or advice, or for a Standard Change, for example to reset a password or to provide standard IT Services for a new User
• Request Model
Request Fulfillment — Roles
• Not usually dedicated staff
• Service Desk staff
• Incident Management staff
• Service Operations teams
Problem Management
• Objectives
• Basic concepts
• Roles
Problem Management — Objectives
• To prevent problems and resulting Incidents from happening
• To eliminate recurring incidents
• To minimize the impact of incidents that cannot be prevented
Problem Management — Basic concepts (1 of 2)
• Problem
– The unknown cause of one or more incidents
• Problem Models
• Workaround
• Known Error
• Known Error Database
Problem Management — Basic concepts (2 of 2)
• Reactive Problem Management
– Resolution of underlying cause(s)
– Covered in Service Operation
• Proactive Problem Management
– Prevention of future problems
– Generally undertaken as part of CSI
Proactive Activities
Trend Analysis
- Post-Change occurrence of particular Problems
- Recurring Problems per type or per component
- Training, documentation issues
Preventative Action
- Raising RFC to prevent occurrence/recurrence
- Initiate education and training
- Ensure adherence to procedures
- Initiate process improvement
- Provide feedback to testing, training and documentation
Problem Management — Roles
• Problem Manager
• Supported by technical groups
– Technical Management
– IT Operations
– Applications Management
– Third-party suppliers
Problem Management — Problem Investigation and Diagnosis
• Objectives
• Basic concepts
• Roles
Chronological Analysis
Pain Value Analysis
Kepner and Tregoe
Brainstorming
It can often be valuable to gather together the relevant people, either physically or by electronic means, and to ‘brainstorm’ the problem – with people throwing in ideas on what the potential cause may be and potential actions to resolve the problem. Brainstorming sessions can be very constructive and innovative but it is equally important that someone, perhaps the Problem Manager, documents the outcome and any agreed actions and keeps a degree of control in the session(s).
Ishikawa Diagram
Pareto Analysis
Pareto Analysis: Example
[Figure: Network Failures]
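A Pareto analysis ranks causes by frequency and tracks the cumulative percentage, so that the few causes responsible for most failures stand out. The sketch below uses made-up counts for network failure causes; the cause names, the figures and the function name `pareto_table` are assumptions for illustration and are not the data from the original slide.

```python
def pareto_table(counts: dict[str, int]) -> list[tuple[str, int, float]]:
    """Return (cause, count, cumulative %) rows, most frequent cause first."""
    total = sum(counts.values())
    if total == 0:
        return []
    rows = []
    running = 0
    for cause, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((cause, count, round(100.0 * running / total, 1)))
    return rows


# Hypothetical failure counts for one reporting period.
failures = {
    "Server OS": 12,
    "Network controller": 35,
    "Other": 8,
    "File corruption": 26,
    "Addressing conflicts": 19,
}

for cause, count, cumulative in pareto_table(failures):
    print(f"{cause:<22}{count:>4}{cumulative:>8}%")
```

With these sample figures the first three causes account for 80 % of the failures, which is where preventative action would be targeted first.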
Access Management
• Objectives
• Basic concepts
• Roles
Access Management — Objectives
• Granting authorized users the right to use a service
• Preventing access by non-authorized users
Access Management — Basic concepts
• Access
• Identity
• Rights
• Service or Service Groups
• Directory Services
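These concepts fit together roughly as follows: an identity belongs to one or more service groups, and a directory records which rights each group grants on each service. The sketch below is a simplified illustration; the `Identity` class, the `SERVICE_GROUP_RIGHTS` table, the group and service names, and `has_right` are all hypothetical and do not represent any real directory service API.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    user_id: str
    groups: set[str] = field(default_factory=set)   # service groups the user belongs to


# Hypothetical directory content: service group -> service -> granted rights.
SERVICE_GROUP_RIGHTS = {
    "finance_staff": {"payroll": {"read"}, "email": {"read", "write"}},
    "finance_admins": {"payroll": {"read", "write"}},
}


def has_right(identity: Identity, service: str, right: str) -> bool:
    """True if any of the identity's groups grants the right on the service."""
    return any(
        right in SERVICE_GROUP_RIGHTS.get(group, {}).get(service, set())
        for group in identity.groups
    )


clerk = Identity("u1001", {"finance_staff"})
print(has_right(clerk, "payroll", "read"))    # granted via finance_staff
print(has_right(clerk, "payroll", "write"))   # not granted to this group
```

Granting rights to groups rather than to individuals keeps access requests for standard roles simple to fulfil and to audit.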
Access Management — Roles
• Not usually dedicated staff
• Access Management is an execution of Availability Management and Information Security Management
• Service Desk staff
• Technical Management staff
• Application Management staff
• IT Operations staff
5.3 Common Service Operation activities
MONITORING AND CONTROL
• Monitoring refers to the activity of observing a situation to detect changes that happen over time.
• Control refers to the process of managing the utilization or behaviour of a device, system or service. It is important to note, though, that simply manipulating a device is not the same as controlling it. Control requires three conditions:
1. The action must ensure that behaviour conforms to a defined standard or norm
2. The conditions prompting the action must be defined, understood and confirmed
3. The action must be defined, approved and appropriate for these conditions
The Monitor Control Loop
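One pass of a monitor control loop takes a measurement, compares it with the defined norm, and triggers the approved control action only when the measurement falls outside that norm. The sketch below is a minimal illustration; the function name `monitor_control_step` and its parameters are assumptions for this example.

```python
from typing import Callable


def monitor_control_step(
    measure: Callable[[], float],
    norm_low: float,
    norm_high: float,
    control: Callable[[float], None],
) -> bool:
    """Run one monitor/compare/control pass; return True if control was applied."""
    value = measure()                       # monitor: observe the output
    if norm_low <= value <= norm_high:      # compare: within the defined norm
        return False
    control(value)                          # control: apply the approved action
    return True


# Hypothetical example: CPU utilization should stay between 0 % and 80 %.
actions: list[float] = []
monitor_control_step(lambda: 95.0, 0.0, 80.0, actions.append)
print(actions)
```

Passing the measurement and the control action in as functions keeps the loop itself independent of the device or service being managed.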
MAINFRAME MANAGEMENT
The following activities are likely to be undertaken:
• Mainframe operating system maintenance and support
• Third-level support for any mainframe-related incidents/problems
• Writing job scripts
• System programming
• Interfacing to hardware (H/W) support: arranging maintenance, agreeing slots, identifying H/W failure, liaison with H/W engineering
• Provision of information and assistance to Capacity Management to help achieve optimum throughput, utilization and performance from the mainframe
SERVER MANAGEMENT AND SUPPORT
• Operating system support
• Licence management
• Third-level support
• Procurement advice
• System security
• Definition and management of virtual servers
• Capacity and Performance
• Ongoing maintenance
• Decommissioning and disposal of old server equipment
FACILITIES AND DATA CENTRE MANAGEMENT
• Building Management
• Equipment Hosting
• Power Management
• Environmental Conditioning and Alert Systems
• Safety
• Physical Access Control
• Shipping and Receiving
• Involvement in Contract Management
• Maintenance
5.4 Organizing for Service Operation
Service Operation functions
• Service Desk
• IT Operations Management
– Operations Control
– Facilities Management
• Technical Management
• Application Management
Service Desk
• Primary point of contact
• Deals with all user issues (incidents, requests, standard changes)
• Coordinates actions across the IT organization to meet user requirements
• Different options (Local, Centralized, Virtual, Follow-the-Sun, specialized groups)
[Figure: Local Service Desk]
[Figure: Centralized Service Desk]
[Figure: Virtual Service Desk]
Service Desk objectives
• Logging and categorizing Incidents, Service Requests and some categories of change
• First-line investigation and diagnosis
• Escalation
• Communication with Users and IT Staff
• Closing calls
• Customer satisfaction
• Updating the CMS if so agreed
Service Desk staffing
• Correct number and qualifications at any given time, considering:
– Customer expectations and business requirements
– Number of users to support, their language and skills
– Coverage period, out-of-hours, time zones/locations, travel time
– Processes and procedures in place
• Minimum qualifications
– Interpersonal skills
– Business understanding
– IT understanding
– Skill sets: Customer emphasis, Technical emphasis, Expert
Service Desk metrics
• Periodic evaluations of health, maturity, efficiency, effectiveness and any opportunity to improve
• Realistic and carefully chosen: the total number of calls is not in itself good or bad
• Some examples:
– First-line resolution rate
– Average time to resolve and/or escalate an incident
– Total costs for the period divided by total call duration minutes
– The number of calls broken down by time of day and day of week, combined with the average call time
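Two of these examples reduce to simple arithmetic over call records. The sketch below is illustrative only: the function names and the sample figures are assumptions, and weekdays are numbered as Python's `datetime.weekday()` does (0 is Monday).

```python
from collections import Counter
from datetime import datetime


def cost_per_call_minute(total_cost: float, total_call_minutes: float) -> float:
    """Total costs for the period divided by total call duration in minutes."""
    if total_call_minutes <= 0:
        raise ValueError("total call minutes must be positive")
    return total_cost / total_call_minutes


def calls_by_weekday_and_hour(call_starts: list[datetime]) -> Counter:
    """Count calls per (weekday, hour) slot to reveal peak periods."""
    return Counter((t.weekday(), t.hour) for t in call_starts)


# Hypothetical period: 12,000 in costs and 8,000 minutes of calls.
print(cost_per_call_minute(12_000.0, 8_000.0))

calls = [
    datetime(2011, 3, 7, 9, 15),   # a Monday morning
    datetime(2011, 3, 7, 9, 45),
    datetime(2011, 3, 8, 14, 5),
]
print(calls_by_weekday_and_hour(calls).most_common(1))
```

The per-slot counts feed directly into staffing decisions, since they show when the Service Desk needs the most analysts available.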
Technical Management
• The groups, departments or teams that provide technical expertise and overall management of the IT Infrastructure
– Custodians of technical knowledge and expertise related to managing the IT Infrastructure
– Provide the actual resources to support the IT Service Management Lifecycle
– Perform many of the common activities already outlined
– Execute most ITSM processes
Technical Management organization
• Technical teams are usually aligned to the technology they manage
• Can include operational activities
• Examples
– Mainframe Management
– Server Management
– Internet / Web Management
– Network Management
– Database Administration
Technical Management — Objectives
• Design of resilient, cost-effective infrastructure configuration
• Maintenance of the infrastructure
• Support during technical failures
Technical Management — Roles
• Technical Managers
• Team Leaders
• Technical Analysts / Architects
• Technical Operator
IT Operations Management
• The department, group or team of people responsible for performing the organization’s day-to-day operational activities, such as:
– Console Management
– Job Scheduling
– Backup and Restore
– Print and Output Management
– Performance of maintenance activities
– Facilities Management
– Operations Bridge
– Network Operations Center
– Monitoring the infrastructure, applications and services
IT Operations Management — Objectives
• Maintaining the “status quo” to achieve infrastructure stability
• Identifying opportunities to improve operational performance and save costs
• Initial diagnosis and resolution of operational Incidents
IT Operations Management — Roles
• IT Operations Manager
• Shift Leaders
• IT Operations Analysts
• IT Operators
Applications Management
• Manages Applications throughout their Lifecycle
• Performed by any department, group or team managing and supporting operational Applications
• Role in the design, testing and improvement of Applications that form part of IT Services
• Involved in development projects, but not usually the same as the Application Development teams
• Custodian of expertise for Applications
• Provides resources throughout the lifecycle
• Guidance to IT Operations Management
Applications Management — Objectives
• Well designed, resilient, cost effective applications
• Ensuring availability of functionality
• Maintain operational applications
• Support during application failures
Applications Management — Roles
• Application Manager / Team leaders
• Applications Analyst / Architect
Note: Application Management teams are usually aligned to the applications they manage
SERVICE OPERATION ROLES
• Service Desk roles
• Technical Management roles
• IT Operations Management roles
• Application Management roles
• Event Management roles
• Incident Management roles
• Request Fulfilment roles
• Problem Management roles
• Access Management roles
SERVICE OPERATION ORGANIZATION STRUCTURES
• Organization by technical specialization
• Organization by activity
• Organizing to manage processes
• Organizing IT Operations by Geography
• Hybrid organization structures
Organization by technical specialization
Organization by activity
Organizing to manage processes
It is not a good idea to structure the whole organization according to processes. Processes are used to overcome the ‘silo effect’ of departments, not to create silos. However, there are a number of processes that will need a dedicated organization structure to support and manage them. For example, it will be very difficult for Financial Management to be successful without a dedicated Finance department, even if that department consists of a small number of staff.
In process-based organizations people are organized into groups or departments that perform or manage a specific process. This is similar to the activity-based structure, except that its departments focus on end-to-end sets of activities rather than on one individual type of activity.
Organizing IT Operations by Geography
Hybrid organization structures
It is unlikely that IT Operations Management will be structured using only one type of organization structure. Most organizations use a technical specialization, with some additional activity- or process-based departments.
The type of structure used and the exact combination of technical specialization, activity-based and process-based departments will depend on a number of organizational variables.
Centralized IT Operations, Technical and Application Management structure