Monitoring 改造計畫:流程觀點 (Monitoring Overhaul Project: A Process Perspective)

William Yeh, DevOps Summit 2016 (2016-07-06)

Page 1: Monitoring 改造計畫:流程觀點

William Yeh
DevOps Summit 2016 (2016-07-06)

Monitoring

Monitoring: a Process Perspective

Page 2: Monitoring 改造計畫:流程觀點
Page 3: Monitoring 改造計畫:流程觀點

CRT (Current Reality Tree)

[Figure: Current Reality Tree connecting items #1 through #11 with AND junctions; surviving node labels: "best practice", "Murphy exists", "DevOps", "Op".]

http://www.slideshare.net/williamyeh/devops-63711710

Page 4: Monitoring 改造計畫:流程觀點

[Figure: Current Reality Tree (CRT) connecting items #1 through #11 with AND junctions.]

Page 5: Monitoring 改造計畫:流程觀點

[Figure: Current Reality Tree (CRT) connecting items #1 through #11 with AND junctions.]

Page 6: Monitoring 改造計畫:流程觀點

[Figure: items #7, #9, #1, #2, #11, #8, #4.]

Page 7: Monitoring 改造計畫:流程觀點

Risk management

• Threats
  • avoid
  • transfer
  • mitigate

• Opportunities
  • exploit
  • enhance
  • share

👍👎

http://www.slideshare.net/williamyeh/whoscall-realtime-monitoring

Page 8: Monitoring 改造計畫:流程觀點

William Yeh
DevOps Summit 2016 (2016-07-06)

Monitoring

Monitoring: a Process Perspective

Page 9: Monitoring 改造計畫:流程觀點

Process Monitoring

Monitoring

Page 10: Monitoring 改造計畫:流程觀點

Process

?

?

?

?

Monitoring

Monitoring

Page 11: Monitoring 改造計畫:流程觀點

?

?

Process

Monitoring

Page 12: Monitoring 改造計畫:流程觀點

#5

Part 2

Page 13: Monitoring 改造計畫:流程觀點

Efrat Goldratt-Ashlag

Page 14: Monitoring 改造計畫:流程觀點
Page 15: Monitoring 改造計畫:流程觀點
Page 16: Monitoring 改造計畫:流程觀點
Page 17: Monitoring 改造計畫:流程觀點

Efrat Goldratt-Ashlag

Page 18: Monitoring 改造計畫:流程觀點

What to change
To what to change
How to cause the change

Page 19: Monitoring 改造計畫:流程觀點

CRT (Current Reality Tree)

Page 20: Monitoring 改造計畫:流程觀點
Page 21: Monitoring 改造計畫:流程觀點

DevOps

Page 22: Monitoring 改造計畫:流程觀點

DevOps

leverage

TOC

CCPM

FRT (Future Reality Tree)

Page 23: Monitoring 改造計畫:流程觀點

DevOps

leverage

TOC

CCPM

FRT (Future Reality Tree)

Page 24: Monitoring 改造計畫:流程觀點

TOC

CCPM

Page 25: Monitoring 改造計畫:流程觀點

Stephen R. Covey

Page 26: Monitoring 改造計畫:流程觀點
Page 27: Monitoring 改造計畫:流程觀點
Page 28: Monitoring 改造計畫:流程觀點

What gets measured gets done.

Peter Drucker

Page 29: Monitoring 改造計畫:流程觀點

Policy

What gets measured gets done.

Page 30: Monitoring 改造計畫:流程觀點

Policy

Page 31: Monitoring 改造計畫:流程觀點

Policy

Page 32: Monitoring 改造計畫:流程觀點

Policy Buy-in

Policy

Page 33: Monitoring 改造計畫:流程觀點
Page 34: Monitoring 改造計畫:流程觀點
Page 35: Monitoring 改造計畫:流程觀點

What to change
To what to change
How to cause the change

Page 36: Monitoring 改造計畫:流程觀點

Adrian Cockcroft

Page 37: Monitoring 改造計畫:流程觀點
Page 38: Monitoring 改造計畫:流程觀點
Page 39: Monitoring 改造計畫:流程觀點

CloudFront ELB API servers MongoDB

Cloud Manager

CloudWatch

log in S3

StatsD

BigQuery

Page 40: Monitoring 改造計畫:流程觀點

CloudFront ELB API servers MongoDB

Cloud Manager

CloudWatch

log in S3

StatsD

BigQuery

Page 41: Monitoring 改造計畫:流程觀點

CloudFront ELB API servers MongoDB

Cloud Manager

CloudWatch

log in S3

BigQuery

Page 42: Monitoring 改造計畫:流程觀點

CloudFront ELB API servers MongoDB

Cloud Manager

CloudWatch

log in S3

BigQuery

Page 43: Monitoring 改造計畫:流程觀點
Page 44: Monitoring 改造計畫:流程觀點

http://school.soft-arch.net/blog/125009/change-viewpoint-on-lord-of-rings

Lean Change Canvas

Page 45: Monitoring 改造計畫:流程觀點

Lean Change Canvas

Commitment Wins/Benefits

Urgency

Target State

Success Criteria

Vision

Communication

Action

Change Recipients

FYI: http://kojenchieh.pixnet.net/blog/post/442550432-firstthing_of_agile_promotion
FYI: http://leankit.com/blog/2015/02/lean-change-method/

Monitoring Q1 (brainstorming) 2016-Jan-06 Iteration #1

TO DO LIST details

Augmented

Page 46: Monitoring 改造計畫:流程觀點

Lean Change Canvas

Urgency

Target State

Success Criteria

Vision

Communication

Action

Monitoring Q1 (brainstorming) 2016-Jan-06 Iteration #1

Page 47: Monitoring 改造計畫:流程觀點

What to change
To what to change
How to cause the change

Page 48: Monitoring 改造計畫:流程觀點

Lean Change Canvas

Urgency

Target State

Success Criteria

Vision

Communication

Action

Monitoring Q1 (brainstorming) 2016-Jan-06 Iteration #1

Flow

Tech

Monitoring

Page 49: Monitoring 改造計畫:流程觀點

Buy-in Flow

Buy-in Policy

Page 50: Monitoring 改造計畫:流程觀點

Flow

TOC Lean Thinking

CCPM

Page 51: Monitoring 改造計畫:流程觀點

TOC

Page 52: Monitoring 改造計畫:流程觀點

Lean Thinking

Value, Value stream, Flow, Pull, Perfection

http://school.soft-arch.net/blog/115652/devops-a-lean-perspective

Page 53: Monitoring 改造計畫:流程觀點

“The Three Ways”

1. Create fast flow of work from Dev into IT Ops.
2. Shorten and amplify feedback loops.
3. Create a culture that simultaneously fosters two things:
   1. continual experimentation and learning from failure;
   2. understanding that repetition and practice are the prerequisite to mastery.

Create fast flow of work from Dev into IT Ops.

Shorten and amplify feedback loops.

Page 54: Monitoring 改造計畫:流程觀點

CCPM

Critical Chain Project Management

Page 55: Monitoring 改造計畫:流程觀點

Flow

TOC Lean Thinking

CCPM

Page 56: Monitoring 改造計畫:流程觀點

VPC

CloudFront ELB API servers DB

Simplified version

Page 57: Monitoring 改造計畫:流程觀點

CloudFront ELB API servers DB

ELB API servers DB

Microservices

Simplified version

Page 58: Monitoring 改造計畫:流程觀點

Flow

Page 59: Monitoring 改造計畫:流程觀點
Page 60: Monitoring 改造計畫:流程觀點
Page 61: Monitoring 改造計畫:流程觀點
Page 62: Monitoring 改造計畫:流程觀點

Flow

Page 63: Monitoring 改造計畫:流程觀點

Flow

Page 64: Monitoring 改造計畫:流程觀點
Page 65: Monitoring 改造計畫:流程觀點

Flow

Page 66: Monitoring 改造計畫:流程觀點
Page 67: Monitoring 改造計畫:流程觀點

Overview

Page 68: Monitoring 改造計畫:流程觀點

Incoming requests

Page 69: Monitoring 改造計畫:流程觀點

API servers

Page 70: Monitoring 改造計畫:流程觀點

DB servers

Page 71: Monitoring 改造計畫:流程觀點

DB servers

API servers

Incoming requests

Overview

Flow

Page 72: Monitoring 改造計畫:流程觀點

Lean Change Canvas

Urgency

Target State

Success Criteria

Vision

Communication

Action

Monitoring Q1 (brainstorming) 2016-Jan-06 Iteration #1

Flow

Page 73: Monitoring 改造計畫:流程觀點

TOC

Flow TOC

Page 74: Monitoring 改造計畫:流程觀點

Flow Buy-in

Policy

Tech Flow

Page 75: Monitoring 改造計畫:流程觀點

Lean Change Canvas

Urgency

Target State

Success Criteria

Vision

Communication

Action

Monitoring Q1 (brainstorming) 2016-Jan-06 Iteration #1

Tech

Page 76: Monitoring 改造計畫:流程觀點

Personal Preferences

• Golang

• Microservices

• Composability

• OSS ecosystem

of server technologies

Page 77: Monitoring 改造計畫:流程觀點

Personal Preferences

• Golang

• Microservices

• Composability

• OSS ecosystem

Runtime dependency

william Ansible

Page 78: Monitoring 改造計畫:流程觀點

Personal Preferences

• Golang

• Microservices

• Composability

• OSS ecosystem

Scalability

Overhead

Page 79: Monitoring 改造計畫:流程觀點
Page 80: Monitoring 改造計畫:流程觀點
Page 81: Monitoring 改造計畫:流程觀點
Page 82: Monitoring 改造計畫:流程觀點
Page 83: Monitoring 改造計畫:流程觀點
Page 84: Monitoring 改造計畫:流程觀點
Page 85: Monitoring 改造計畫:流程觀點

Personal Preferences

• Golang

• Microservices

• Composability

• OSS ecosystem

Node/system metrics exporter, AWS CloudWatch exporter, Blackbox exporter, Collectd exporter, Consul exporter, Graphite exporter, HAProxy exporter, InfluxDB exporter, JMX exporter, Memcached exporter, Mesos task exporter, MySQL server exporter, SNMP exporter, StatsD exporter

cAdvisor, Doorman, Etcd, Kubernetes-Mesos, Kubernetes, RobustIRC, SkyDNS, Weave Flux

Aerospike exporter, Apache exporter, BIG-IP exporter, BIND exporter, Ceph exporter, CouchDB exporter, Django exporter, Google's mtail log data extractor, Heka dashboard exporter, Heka exporter, IoT Edison exporter, Jenkins exporter, knxd exporter, Meteor JS web framework exporter, Minecraft exporter module, Mirth Connect exporter, MongoDB exporter, Munin exporter, New Relic exporter, Nginx metric library, NSQ exporter, OpenWeatherMap exporter, Passenger exporter, PgBouncer exporter, PostgreSQL exporter, PowerDNS exporter, RabbitMQ exporter, RabbitMQ Management Plugin exporter, Rancher exporter, Redis exporter, RethinkDB exporter, rTorrent exporter, scollector exporter, SMTP/Maildir MDA blackbox prober, Speedtest.net exporter, SQL query result set metrics exporter, Ubiquiti UniFi exporter, Varnish exporter, Zookeeper exporter

Page 86: Monitoring 改造計畫:流程觀點

CloudFront ELB API servers MongoDB

Cloud Manager

CloudWatch

log in S3

StatsD

BigQuery

Page 87: Monitoring 改造計畫:流程觀點

ELB API servers MongoDB

Cloud Manager

CloudWatch

Page 88: Monitoring 改造計畫:流程觀點

Prometheus vs Graphite/StatsD

Page 89: Monitoring 改造計畫:流程觀點

abs(), absent(), bottomk(), ceil(), changes(), clamp_max(), clamp_min(), count_scalar(), delta(), deriv(), drop_common_labels(), exp(), floor(), histogram_quantile(), holt_winters(), increase(), irate(), label_replace(), ln(), log2(), log10(), predict_linear(), rate(), resets(), round(), scalar(), sort(), sort_desc(), sqrt(), time(), topk(), vector(), <aggregation>_over_time()

Page 90: Monitoring 改造計畫:流程觀點

node_cpu

time

number

Page 91: Monitoring 改造計畫:流程觀點

node_cpu

time

number

node_cpu{mode="idle"}

node_cpu{mode="irq"}

node_cpu{instance="10.0.37.12", service="web", zone="ap-northeast-1a"}
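A minimal sketch of the label model behind these selectors, assuming equality matchers only (Prometheus also supports `!=`, `=~`, and `!~`). The `select` helper and every series other than the slide's `10.0.37.12` instance are hypothetical, for illustration only.

```python
from typing import Dict, List

# A time series is identified by its metric name plus a set of key/value labels.
# Prometheus stores the metric name in the reserved label "__name__".
Series = Dict[str, str]

# Hypothetical series; only instance 10.0.37.12 and service "web" come from the slide.
SERIES: List[Series] = [
    {"__name__": "node_cpu", "mode": "idle", "instance": "10.0.37.12", "service": "web"},
    {"__name__": "node_cpu", "mode": "irq", "instance": "10.0.37.12", "service": "web"},
    {"__name__": "node_cpu", "mode": "idle", "instance": "10.0.37.13", "service": "api"},
]


def select(series: List[Series], name: str, **matchers: str) -> List[Series]:
    """Return the series whose name and labels equal every given matcher."""
    return [
        s for s in series
        if s.get("__name__") == name
        and all(s.get(k) == v for k, v in matchers.items())
    ]


idle = select(SERIES, "node_cpu", mode="idle")                    # node_cpu{mode="idle"}
web_irq = select(SERIES, "node_cpu", mode="irq", service="web")   # node_cpu{mode="irq", service="web"}
print(len(idle), len(web_irq))
```

Adding a matcher narrows the result set; dropping one widens it, which is what makes the later `by (...)` grouping possible.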

Page 92: Monitoring 改造計畫:流程觀點

TCP Timeout

counter: node_netstat_TcpExt_TCPTimeWaitOverflow[1m]

gauge: irate(node_netstat_TcpExt_TCPTimeWaitOverflow[1m])

aggregate, grouping: sum(irate(node_netstat_TcpExt_TCPTimeWaitOverflow[1m])) by (ec2tag_Service)
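The TCP Timeout query composes three steps: take a window of counter samples, turn it into a per-second rate with `irate()`, then `sum ... by` a label. A rough sketch of that arithmetic, assuming `irate` uses only the last two samples in the window and treats a drop as a counter reset; the sample data, instance addresses, and helper names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def irate(samples: List[Sample]) -> float:
    """Per-second rate computed from the last two samples of a counter."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    # A drop means the counter was reset; count from zero in that case.
    delta = v1 if v1 < v0 else v1 - v0
    return delta / (t1 - t0)


def sum_by(rates: List[Tuple[str, float]]) -> Dict[str, float]:
    """Sum the rates that share the same grouping label value."""
    totals: Dict[str, float] = defaultdict(float)
    for group, value in rates:
        totals[group] += value
    return dict(totals)


# Hypothetical 1m windows, keyed by (ec2tag_Service, instance).
windows = {
    ("web", "10.0.37.12"): [(0.0, 100.0), (15.0, 130.0), (30.0, 160.0)],
    ("web", "10.0.37.13"): [(0.0, 50.0), (15.0, 50.0), (30.0, 65.0)],
    ("api", "10.0.38.20"): [(0.0, 900.0), (15.0, 905.0), (30.0, 5.0)],  # reset
}
per_service = sum_by([(svc, irate(s)) for (svc, _), s in windows.items()])
print(per_service)
```

Because only the last two samples matter, `irate` reacts quickly to spikes; `rate()` averages over the whole window and is smoother.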

Page 93: Monitoring 改造計畫:流程觀點

Memory Used

gauge: 1 - node_memory_MemFree/node_memory_MemTotal

aggregate, grouping: avg(1 - node_memory_MemFree/node_memory_MemTotal) by (ec2tag_Service)
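As a worked illustration of the Memory Used expression, the sketch below computes `1 - MemFree/MemTotal` per instance and then averages the ratios per service. The readings and the `avg_memory_used` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical per-instance readings: (ec2tag_Service, MemFree bytes, MemTotal bytes).
readings: List[Tuple[str, float, float]] = [
    ("web", 2.0e9, 8.0e9),
    ("web", 4.0e9, 8.0e9),
    ("api", 1.0e9, 4.0e9),
]


def avg_memory_used(rows: List[Tuple[str, float, float]]) -> Dict[str, float]:
    """Average of (1 - MemFree/MemTotal) over the instances of each service."""
    ratios = defaultdict(list)
    for service, free, total in rows:
        ratios[service].append(1 - free / total)
    return {svc: sum(vals) / len(vals) for svc, vals in ratios.items()}


used = avg_memory_used(readings)
print(used)
```

Gauges need no `irate()` step because each sample is already a point-in-time value. Note that this averages per-instance ratios, which differs from dividing summed free memory by summed total memory when instances have different sizes.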

Page 94: Monitoring 改造計畫:流程觀點

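CPU Utilization

counter to gauge: irate(node_cpu{job="node", mode="idle"}[1m])

aggregate: avg by (ec2tag_Service) (irate(node_cpu{job="node", mode="idle"}[1m]))

100 - (avg by (ec2tag_Service) (irate(node_cpu{job="node", mode="idle"}[1m])) * 100)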
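The CPU Utilization expression works because the idle counter accumulates CPU seconds, so its per-second rate is the fraction of time a CPU sat idle. A small sketch of that arithmetic for a single service, with hypothetical samples and helper names, and no counter-reset handling:

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, cumulative idle CPU seconds)


def idle_fraction(samples: List[Sample]) -> float:
    """Rate of the idle counter over its last two samples: idle seconds per second."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)


def utilization_percent(per_cpu_samples: List[List[Sample]]) -> float:
    """100 - (average idle fraction * 100), mirroring the slide's expression."""
    fractions = [idle_fraction(s) for s in per_cpu_samples]
    return 100 - (sum(fractions) / len(fractions)) * 100


# Hypothetical: two CPUs belonging to one service, sampled 16 s apart.
cpus = [
    [(0.0, 1000.0), (16.0, 1012.0)],  # idle for 12 of 16 s
    [(0.0, 2000.0), (16.0, 2004.0)],  # idle for 4 of 16 s
]
print(utilization_percent(cpus))  # 50.0
```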

Page 95: Monitoring 改造計畫:流程觀點

Latency

summary: request_time_summary

aggregate, grouping: avg(request_time_summary) by (ec2tag_Service, quantile)

Customized metrics with Fluentd plugin for Prometheus
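The Latency query groups by two labels at once, so each service keeps one averaged value per quantile. A hedged sketch with made-up summary samples follows; the values and the `avg_by_service_quantile` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical summary samples: (ec2tag_Service, quantile label, latency in seconds).
samples: List[Tuple[str, str, float]] = [
    ("web", "0.5", 0.020), ("web", "0.5", 0.040),
    ("web", "0.99", 0.200), ("web", "0.99", 0.400),
    ("api", "0.5", 0.010),
]


def avg_by_service_quantile(
    rows: List[Tuple[str, str, float]],
) -> Dict[Tuple[str, str], float]:
    """Average values that share the same (service, quantile) label pair."""
    grouped = defaultdict(list)
    for service, quantile, value in rows:
        grouped[(service, quantile)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}


latency = avg_by_service_quantile(samples)
print(latency)
```

Averaging per-instance quantiles gives a rough indication only; it is not the true quantile across all requests of the service.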

Page 96: Monitoring 改造計畫:流程觀點

Conclusion

Page 97: Monitoring 改造計畫:流程觀點

[Figure: items #7, #9, #1, #2, #11, #8, #4.]

Page 98: Monitoring 改造計畫:流程觀點

Policy Buy-in

Flow Tech

Page 99: Monitoring 改造計畫:流程觀點

Policy

Buy-in

Flow

Tech

???

Issue tracking

Page 100: Monitoring 改造計畫:流程觀點

William Yeh
DevOps Summit 2016 (2016-07-06)

Monitoring

Monitoring: a Process Perspective