November 2018
IT Operation Analytics Tools, Multi-cloud Ready Environment
Nov-28-2018, 31st TWNIC OPM and 2nd TWNOG Meeting
錢小山
Chief Technical Consultant
Cisco Greater China, Data Center Architecture Business Unit
IT Operation Analytics Tools
Nov. 2018
© 2018 Cisco and/or its affiliates. All rights reserved.
Challenges for IT
New apps
The average enterprise has at least 13 cloud-native business apps
Complexity
New users
20M developers today growing to 25M by 2020
Compliance
New attack surfaces
6 months to detect a breach
Compromise
Business-Centric Solution Stack
IT Services Consumption
Automation UCS Director ACI
UCS Nexus HyperFlex
CloudCenter
Model Benchmark Deploy Manage
Application Services
Automation Cloud Services
Data Center Private Cloud Public Cloud
Development (DevOps)
Business (ITSM)
AppDynamics
Tetration
CWOM
SDWAN
Your apps (and brand) are judged on 4 key areas
Is your app compliant?
Is your app secure?
Is your app as fast as Google or FB?
Is your app useful?
PROTECTION
POLICY
OUTCOMES
PERFORMANCE
Workload Management & Assurance
IT Operations Analytics - Your apps are judged on 5 key areas
IT Operations Analytics – Problems Solved
BUSINESS
• Lost Revenue and Brand Damage Due To Service Outages and Slowdowns
• Visibility of Revenue and Impact per Feature, Release or Idea
• Applications Don’t Scale to Peak Business or Campaign Requirements
OPERATIONS
• Production Slowdowns and Outages
• High Mean-Time-To-Repair (MTTR)
• High Number of False-Positive Alerts and Alert Storms
DEVELOPMENT
• Excessive Developer Time Spent on Production Problems
• Low Quality and Robustness of Code
• Maintenance of Architecture and Maintenance Diagrams
IT Operations Analytics – Three Major Tools
Business / App Owner: Acquire and retain happy users; drive business outcomes
App Dev/Ops: Iterate and release faster with confidence; focus on building + rapid time to resolution
Machine Operations: Run virtualized apps with high availability/performance; ensure virtualized apps have resources
Infrastructure Operations: Understand dependencies & migrate across infrastructure; troubleshoot data center, latency issues
Security Operations: Secure apps across data center and network; regulate & enforce security/whitelist policies
CWOM
IT Operations Analytics, Part 1: AppDynamics
Application Intelligence Platform
See Act Know
See: End-to-end visibility of business transactions
Tag, Trace, Learn, Follow
• Instrument every user transaction
• Collect application and business data
• Baseline behavior and performance
• Follow through complex systems
NoSQL
Business Transaction: Book A Flight, Response Time: 2.1s
Java Heap Usage: 76% | </SearchFlight/>: 32ms | From: LON, To: LAS, Out: Thursday 10th
Network Errors: 1.3% | </GetCustLevel/>: 12ms | Platinum Customer, Lives: CA, USA, Using: Chrome
CPU Usage: 36% | </GetPrice/>: 56ms | Class: Business, Price: $3,269, Special Meals: No
Database Time: 156ms | </WPProcess/>: 340ms | Payment: Mastercard, Merchant: WorldPay, Confirmed: True
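The "Book A Flight" example ties each traced call to both its timing and its business data. A minimal sketch of such a record is shown here, using only the values on the slide. The class and field names (`Segment`, `BusinessTransaction`, `business_data`) are hypothetical and are not AppDynamics data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One traced call inside a business transaction (illustrative only)."""
    name: str
    duration_ms: int
    business_data: dict = field(default_factory=dict)


@dataclass
class BusinessTransaction:
    """A user-facing transaction with its traced segments (illustrative only)."""
    name: str
    response_time_ms: int
    segments: list = field(default_factory=list)

    def slowest_segment(self) -> Segment:
        # The segment with the largest duration is the first place to look.
        return max(self.segments, key=lambda s: s.duration_ms)

    def traced_ms(self) -> int:
        # Time accounted for by the traced segments only.
        return sum(s.duration_ms for s in self.segments)


book_a_flight = BusinessTransaction(
    name="Book A Flight",
    response_time_ms=2100,
    segments=[
        Segment("SearchFlight", 32, {"from": "LON", "to": "LAS"}),
        Segment("GetCustLevel", 12, {"tier": "Platinum"}),
        Segment("GetPrice", 56, {"class": "Business", "price_usd": 3269}),
        Segment("WPProcess", 340, {"payment": "Mastercard", "confirmed": True}),
    ],
)

print(book_a_flight.slowest_segment().name)
print(book_a_flight.traced_ms())
```

Keeping the business fields next to the timings is what lets a slow call be reported as "a Platinum customer's payment step was slow" rather than only as a latency number.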
Act: Be proactive Unified Troubleshooting
Ops
Dev
Biz
Monitor
Troubleshoot
Resolve
End user slow transaction
96% of unhappy customers don’t complain [1]
[1] 1st Financial Training Services, Trainer’s Tool Kit, Are You Undervaluing Customer Service?, 2009 (http://www.1stfinancialtraining.com/Newsletters/trainerstoolkit1Q2009.pdf)
Act: Act fast by enabling collaboration
Unified Troubleshooting
Without AppDynamics (DAYS, WEEKS or MONTHS): Customer Complains → Log Ticket → Identify → Isolate → Repair
Enabling Collaboration With AppDynamics (MINUTES): PROACTIVE ALERT → COLLABORATIVE IDENTIFY AND ISOLATE → AUTOMATED REPAIR
“Virtual War Room supports our multi-vendor structure, enabling efficient, real-time collaboration”
Know: What if you knew in real-time?
Unified Analytics
Total Revenues: $532,390
Revenue by Average Response Time: Normal $377,997; Slow $111,802; Very Slow $42,591
Customers by Tier: Diamond 11%, Platinum 38%, Gold 25%, Silver 16%, Bronze 10%
Top Product Categories: PERSONAL CARDS, SMALL BUSINESS, CORPORATE CARDS, PREPAID CARDS, TRAVEL, REWARDS, MERCHANTS
Top Cities (revenue by cities, chart axis $0 to $100,000): New York, San Francisco, Honolulu, Bangalore, London, Paris
• Track customer drop-off
• Performance trending towards problem
• Top product categories generating highest revenue
• Most customers experiencing issues are Platinum
• Where are you losing customers?
• Intelligent logs: combine intelligent log information
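The dashboard figures on this slide can be turned into a single "revenue flowing through degraded transactions" number. The sketch below assumes the three dollar amounts map to the Normal, Slow, and Very Slow buckets in that order (they sum to the stated total of $532,390). The function name `degraded_revenue_share` is hypothetical.

```python
def degraded_revenue_share(revenue_by_bucket, healthy_bucket="Normal"):
    """Return (total, degraded, degraded share) for revenue grouped by response-time bucket."""
    total = sum(revenue_by_bucket.values())
    degraded = total - revenue_by_bucket[healthy_bucket]
    return total, degraded, degraded / total


# Values as read from the slide's chart labels.
revenue = {"Normal": 377_997, "Slow": 111_802, "Very Slow": 42_591}

total, degraded, share = degraded_revenue_share(revenue)
print(f"total=${total:,} degraded=${degraded:,} share={share:.1%}")
```

Under these assumptions, roughly 29% of revenue passed through slow or very slow transactions, which is the kind of figure a business owner can act on.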
Know: Real-time analytics
Identify failed individual transactions and respond within minutes
• Capture all slow and failed transactions + revenue impact
• List impacted Platinum customers
• Marketing runs win-back campaigns (10% off)
Performance + Business + User: automatically correlate data via the business transaction
AppDynamics production architecture
• Correlated transaction view
• No code changes required
• Low overhead in production
User Interface & Reporting
SaaS/On-Prem Controller (Application Intelligence Platform)
End user agent: Browser / Mobile / IoT (One-Way HTTP/S)
Application agent: Java, .NET, PHP, Node.js, C++ (One-Way HTTP/S)
Machine agent: OS (One-Way HTTP/S)
Database: SQL (Remote JDBC)
Garbage pickup
Area Garbage service complete
Street list check
Next service scheduled
Approved
Move Fast, Follow Everything & Focus on What Matters Most
Live Customer Journeys for Every Business Transaction
• Lightweight Agents
• Deployed Throughout Application Environment
• Flexible instrumentation / Easy configuration
Diagram labels: Network, Private Cloud, Public Cloud, App iQ, Business iQ, Enterprise, Map, Baseline, Diagnose
Move Fast, Follow Everything & Focus on What Matters Most
Live Customer Journeys for Every Business Transaction
• Every Device, Every Transaction
• Discovery and Correlation
• Dynamically Updated
Move Fast, Follow Everything & Focus on What Matters Most
Detection of Issues Before Customers Notice
• Automatic Baselines for Every Metric
• Machine Learning Anomaly Detection
• Prevent IT Alert Storms
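The idea of baselining a metric and alerting only on real deviations can be illustrated with a simple trailing-window check. This is a generic statistical sketch, not the algorithm AppDynamics uses. The window size, the threshold of three standard deviations, and the function name `find_anomalies` are all assumptions.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the trailing-window baseline.

    A sample is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` samples.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            deviates = samples[i] != baseline
        else:
            deviates = abs(samples[i] - baseline) > threshold * spread
        if deviates:
            anomalies.append(i)
    return anomalies


# Hypothetical response times in milliseconds, with one spike.
response_ms = [100, 102, 98, 101, 99, 100, 103, 97, 400, 101]
print(find_anomalies(response_ms))
```

Comparing against a learned baseline rather than a fixed limit is what keeps ordinary fluctuation from raising alerts. In this simple version a spike also inflates the baseline for the next few samples, which a production system would need to handle.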
Move Fast, Follow Everything & Focus on What Matters Most
Immediate and Automated Code Level Diagnostics
Down to the line-of-code: 3-CLICKS to ROOT-CAUSE
• No Sifting Through Log Files
• Low Overhead in Production
• Fix Problems in Live and Pre-Production
Move Fast, Follow Everything & Focus on What Matters Most
Customer Experiences Managed at Scale
Application Components: 10s, 100s, 1,000s, 10,000s
• Deploy at Scale
• Visualize at Scale
• Manage at Scale
Move Fast, Follow Everything & Focus on What Matters Most
Real Time Performance Intelligence
• Automatically Collected
• Fully Correlated
• Ready for Real-Time Insights
Example: Mike Smith, Platinum Customer, London, Travel Airways API, Total Cost $1,800
Move Fast
Auto-discover and Map
No Manual Configuration
Baseline Every Metric
Follow Everything
Production Monitoring
Low Overhead
All User Transactions
Focus on What Matters Most
Unified Platform
One Consistent UI
Real-time Context
App iQ Business iQ
Without AppDynamics… it would be like driving a car at 100 miles per hour with your eyes closed. You can’t afford not having the visibility that AppDynamics provides…
IT Operations Analytics, Part 2: Cisco Tetration Use cases
Operations
Security
Cisco Tetration™
Visibility and forensics
Application insight
Policy
Neighborhood graphs
Application segmentation
Compliance
Policy simulation
Process inventory
Cisco Tetration Architecture overview
Data collection layer:
• Software sensor and enforcement
• Embedded network sensors (telemetry only)
• ERSPAN sensors (telemetry only)
• Third-party sources (configuration data)
• Bring your own data (streaming telemetry)
Analytics engine
Access mechanism: Web GUI, REST API, Event notification, Cisco Tetration apps
Cisco Tetration Data Sources
Software sensors
• Universal* (basic sensor for other OS)
• Linux servers (virtual machine and bare metal)
• Windows servers (virtual machines and bare metal)
• Windows Desktop VM (virtual desktop infrastructure only)
*Note: No per-packet telemetry; not an enforcement point
Main features
• Low CPU overhead (SLA enforced)
• Low network overhead
• New: Enforcement point (software agents)
• Highly secure (code signed and authenticated)
• Every flow (no sampling) and no payload
Network sensors
• Next-generation Cisco Nexus® Series Switches: Cisco Nexus 9300 EX, Cisco Nexus 9300 FX
Third-party sources (third-party data sources available today)
• Asset tagging
• Load balancers
• IP address management
• CMDB
• …
Cisco Tetration Data Analytics
• Application Insight
• Policy Simulation and Impact Assessment
• Automated Whitelist Policy Generation
• Forensics: Every Packet, Every Flow, Every Speed
• Policy Compliance and Auditability
Data Replay & Forensics
Replay flow details like a DVR
Information mapped across 25 different dimensions
• Thick lines indicate common flows
• Faint lines indicate uncommon flows
Application Group Discovery (ADM)
Cisco Tetration Analytics™ Platform
• Network-only sensors, host-only sensors, or both (preferred)
• Bare metal and VM
• On-premises and cloud workloads (AWS)
• Unsupervised machine learning
• Behavior analysis
Diagram labels: Brownfield; Cisco Nexus® 9000 Series; Bare-metal, VM, & switch telemetry; VM telemetry (AMI …); Bare-metal & VM telemetry; EPG
White List Recommendation
Application Discovery: WebTier, AppTier, DB Tier, Storage
Whitelist Policy Recommendation (Available in JSON, XML, and YAML)
Policy Enforcement (Future Roadmap)
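A whitelist recommendation boils down to "allow the flows that were actually observed between tiers, deny everything else". The sketch below derives such a policy from observed flows and prints it as JSON. The schema, the port numbers, and the function name `recommend_whitelist` are hypothetical and do not reflect the Tetration export format.

```python
import json


def recommend_whitelist(flows):
    """Build a default-deny policy that allows each distinct observed flow.

    `flows` is an iterable of (source tier, destination tier, port) tuples.
    """
    rules = sorted(set(flows))  # de-duplicate and order deterministically
    return {
        "default_action": "deny",
        "rules": [
            {"src": src, "dst": dst, "port": port, "action": "allow"}
            for src, dst, port in rules
        ],
    }


# Hypothetical observations between the tiers named on the slide.
observed = [
    ("WebTier", "AppTier", 8080),
    ("AppTier", "DB Tier", 3306),
    ("WebTier", "AppTier", 8080),
    ("DB Tier", "Storage", 2049),
]

policy = recommend_whitelist(observed)
print(json.dumps(policy, indent=2))
```

Because the result is a plain dictionary, the same structure could be serialized to XML or YAML as well as JSON.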
Policy Simulation with Real-Time and Historical Data
Cisco Tetration Analytics™ Platform
• Validating policy impact assessment in real time
• Simulating policy changes over historic traffic
• View traffic “outliers” for quick intelligence
• Audit becomes a function of continuous machine learning
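Simulating a policy change over historic traffic can be pictured as replaying recorded flows against a candidate whitelist and seeing what would have been dropped. The sketch below is a minimal illustration of that idea. The flow tuples, ports, and the function name `simulate` are assumptions, not Tetration APIs.

```python
def simulate(whitelist, historic_flows):
    """Split recorded flows into those a candidate whitelist permits and those it drops."""
    permitted, dropped = [], []
    for flow in historic_flows:
        if flow in whitelist:
            permitted.append(flow)
        else:
            dropped.append(flow)
    return permitted, dropped


# Hypothetical candidate policy: (source tier, destination tier, port).
candidate = {
    ("WebTier", "AppTier", 8080),
    ("AppTier", "DB Tier", 3306),
}

# Hypothetical recorded traffic, including one outlier flow.
history = [
    ("WebTier", "AppTier", 8080),
    ("AppTier", "DB Tier", 3306),
    ("WebTier", "DB Tier", 3306),
    ("WebTier", "AppTier", 8080),
]

permitted, dropped = simulate(candidate, history)
print(len(permitted), dropped)
```

The dropped list is the impact assessment: each entry is either an outlier worth investigating or a legitimate flow the policy still needs to allow.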
Compliance Testing
Cisco Tetration Analytics™ Platform
• Identify policy deviations in real-time
• Review and update whitelist policy with one click
• Policy lifecycle management
Cisco Tetration Telemetry: ERSPAN Option
• Dedicated virtual machines on each host with 4 software sensors in each virtual machine
• Each sensor binds to a separate vNIC
• ERSPAN terminates on the virtual machine vNIC
• Each sensor terminates one ERSPAN session
• Sensor generates telemetry based on the data-plane traffic
• Horizontally scalable
Layer 3 connection
ERSPAN
Layer 3 switch
Expanded telemetry collection option
• Augment telemetry from other parts of the network
• Useful when software sensor or hardware sensor is not feasible
Cisco Tetration telemetry
Cisco Tetration™ platform
Production network
Production network
IT Operations Analytics, Part 3: CWOM
• Workload Optimization is a Complex Problem
• Solving it requires advanced operational analytics and automation
CWOM Workload Optimization
Cisco
Capacity Applications
Compute
Storage
Public Cloud
Databases
CWOM Workload Optimization
1 Abstraction: Supply Chain
2 Analytics: Supply, Demand, Price
3 Automation: Real-time action • Placement • Scaling • Capacity
Idea Solution
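The supply, demand, and price abstraction can be illustrated with a toy placement decision: each host quotes a price that climbs steeply as it fills up, and the workload goes to the cheapest host that can still fit it. The price curve below is an illustrative assumption, not CWOM's actual formula, and the names `price` and `cheapest_host` are hypothetical.

```python
def price(utilization):
    """Illustrative price curve: rises sharply as a resource nears saturation."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (1.0 - utilization) ** 2


def cheapest_host(hosts, demand):
    """Pick the host with the lowest price after adding `demand`.

    `hosts` maps a host name to (used, capacity). Hosts that the demand
    would saturate are skipped. Returns None if no host can fit it.
    """
    quotes = {}
    for name, (used, capacity) in hosts.items():
        projected = (used + demand) / capacity
        if projected < 1:
            quotes[name] = price(projected)
    if not quotes:
        return None
    return min(quotes, key=quotes.get)


# Hypothetical hosts: (units used, units of capacity).
hosts = {"host-a": (60, 100), "host-b": (30, 100), "host-c": (95, 100)}
print(cheapest_host(hosts, demand=10))
```

With a price signal like this, the placement, scaling, and capacity actions listed on the slide can all be framed as the same question: where is the resource cheapest right now?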
How does it work?
• Installs within 20 minutes from an .ova file
• Connect with your browser of choice (Firefox, Chrome)
• Add license key and select targets
• Add IP addresses, user names, and password credentials
• Connects to virtualization, compute, storage
• Detects all service entities
• Overlays constraints
• Result: virtual market of buyers and sellers
• Topological map = relationships; arrows show buying/selling associations
• Recommended actions designed to optimize virtual market
• Three types of actions: Placement, Scaling, Provisioning
• Actions are global
Actions
Example: Reduce Vmem capacity for Virtual Machine test-AppSvr1 from 2097152.0 to 1048576.0 to address underutilization
Support for Popular Enterprise Apps
Simply add application server, database, and hypervisor targets in the CWOM settings
Multi-cloud Ready Environment
Nov. 2018
Business gets Choice. But IT gets Management Complexity.
Higher Costs
Increased Complexity
Data Center
Public Cloud
Private Cloud
Growth in Applications and Infrastructure Choices
Internet of Things
Data centers
HQ Branch Hosting / Colocation
Devices Private cloud Public Clouds
Reimagine your hybrid IT world
Secure Connectivity
Visibility and Insights
Hybrid IT Operations
Risk Management
AGILITY
COST
PERFORMANCE
RISK
New Requirements
Example: Cloud Broker
Cisco CloudCenter Portal
On-premise, Private Cloud, Any Public Cloud, Other Cloud Services
• Same Applications Blueprint
• Life Cycle Management
• Benchmark Cost/Performance
• APIs Translation
CloudCenter Uncomplicates the Cloud. Model Once. Deploy and Manage Anywhere.
Data Center
Public Cloud
Private Cloud
One Integrated Platform
“Day 2” Actions
New and Existing Applications
Deploy
Manage
Model
Multi-tenancy Enterprise-Class Governance and Security
Users Clouds Applications
Deploy
Manage
Model
One Tenancy (Federation)
• Single user: may need access to multiple cloud accounts
• Single cloud account: may be used by various users/teams/tenants
• Roles and permissions may vary based on environment, e.g. HIPAA data OK in datacenter but not cloud
• Roles and permissions may vary based on project stage (dev, test, prod)
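The governance points above amount to a lookup keyed by role, environment, and project stage, with an extra condition for regulated data. The sketch below shows one way to express that. The roles, environment names, rule table, and the function name `may_deploy` are hypothetical and are not CloudCenter configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One permission entry (illustrative only)."""
    role: str
    environment: str   # e.g. "datacenter" or "public-cloud"
    stage: str         # "dev", "test", or "prod"
    allow_regulated_data: bool


# Hypothetical rule table: regulated data is allowed in the datacenter only.
RULES = [
    Rule("developer", "public-cloud", "dev", False),
    Rule("developer", "datacenter", "dev", True),
    Rule("operator", "datacenter", "prod", True),
]


def may_deploy(role, environment, stage, regulated_data):
    """Return True if a matching rule exists and permits this kind of data."""
    for rule in RULES:
        if (rule.role, rule.environment, rule.stage) == (role, environment, stage):
            return rule.allow_regulated_data or not regulated_data
    return False  # no matching rule means no access


print(may_deploy("developer", "public-cloud", "dev", regulated_data=True))
print(may_deploy("developer", "datacenter", "dev", regulated_data=True))
```

Defaulting to "no access" when nothing matches keeps a missing rule from silently granting permissions.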
Application Profile Based Services
Cloud Agnostic
Cloud API-Specific
Orchestrator
Extendable
Multi-tenant
Secure
Scalable
Orchestrator
Orchestrator
Manager Application Profile
Benchmarking: Workload Placement Matters
PetClinic Three tier web application
Blender 5 task rendering job
How it works
Cloud specific, multi-tenant, dedicated or shared
• Manager sends Profile to Orchestrator
• Provisions infrastructure and services: network, storage, compute
• Launches VMs and mounts storage to each
• Installs agent in each VM
• Applies security policies to configure port settings and firewall rules
• Links to artifact repository
• Deploys and orchestrates components and services
• Monitors and triggers run-time policies
Diagram labels: AGENT (one per VM), ARTIFACT REPOSITORY
CloudCenter with AppDynamics bundle
Intelligent Application Orchestration
Deploy - AppDynamics Agent seamlessly as part of CloudCenter Application Profile
Monitor – Application ecosystem and identify emerging issues
Optimize – Automate scale out to preserve performance and minimize cost
User
CloudCenter Manager
CloudCenter Orchestrator
AppDynamics Agents
AppDynamics Controller
Monitor
Deploy Optimize
CloudCenter and AppDynamics
• Deploy AppDynamics Agent as part of Application Profile deployment
• Monitor the application using the AppD Controller
• Scale up or scale down the application based on policies configured in the AppD Controller
App Owner
Cloud-A
Cloud-B
Scale the App
CloudCenter Application Profile
Controller
• Self Operating Applications
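The monitor-then-scale loop on this slide can be reduced to a small decision function: a monitored metric crosses a policy threshold, and the node count moves by one within fixed bounds. All thresholds, bounds, and the function name `scaling_decision` are assumptions for illustration. Neither the AppDynamics nor the CloudCenter API is used here.

```python
def scaling_decision(avg_response_ms, nodes, *, scale_out_above=500,
                     scale_in_below=100, min_nodes=1, max_nodes=10):
    """Return the desired node count for the observed average response time."""
    if avg_response_ms > scale_out_above and nodes < max_nodes:
        return nodes + 1   # too slow: add a node to preserve performance
    if avg_response_ms < scale_in_below and nodes > min_nodes:
        return nodes - 1   # comfortably fast: remove a node to cut cost
    return nodes           # within policy, or already at a bound


print(scaling_decision(800, 2))
print(scaling_decision(50, 2))
print(scaling_decision(300, 4))
```

Leaving a gap between the scale-out and scale-in thresholds keeps the application from bouncing between sizes when the metric hovers near a single limit.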
CloudCenter with Tetration
Greenfield – Application Profile
Brownfield - Import VM
Use Action Library to deploy Tetration sensor
En Masse
Selectively
App Owner
Tetration
Deploy
Manage
Model
Hooks Scripts Events
Security SSO HSM
Infrastructure IPAM DNS
Docker Puppet, Chef Components
User Content Vendor Content
Content Integration
Tool Integration
Extendable
Cloud APIs: Datacenter, Private and Public Cloud
Platform Integration ITSM | Build Automation (Jenkins)
Secure
Multi-tenant
Extendable
Scalable
Summary
Nov. 2018
Find the Balance
The Cloud Consumption Continuum
Cloud Only
Cloud Never
Cloud First
Cloud Ready
Cloud DR
Understand The Metrics That Matter To You
Cost Security DR Connectivity Timeliness
Scalability Performance Repatriation Compliance Support
Get The Right Tools For The Right Job
You Don’t Have To Go It Alone
Cisco Is There To Help
Multicloud Portfolio
Cloud Connect
Cloud Protect
Cloud Advisory
Cloud Consume
Thank You