Anti-fragility, Microservices & DevOps - A Study By William


Page 1: Antifragile, Microservices and DevOps - A Study

Anti-fragility, Microservices & DevOps - A Study

By William

Page 2: Antifragile, Microservices and DevOps - A Study

Agenda

• The Principle of Anti-fragility
• Microservices Architecture
• The Principle of DevOps

Page 3: Antifragile, Microservices and DevOps - A Study

Topic: What’s the Antonym of Fragile?

• Robust?
• Anti-fragile

Page 4: Antifragile, Microservices and DevOps - A Study

Fragile

Shatters when exposed to even a small stressor.

Page 5: Antifragile, Microservices and DevOps - A Study

Robust

Page 6: Antifragile, Microservices and DevOps - A Study

The Problem of Robust

• Robust is just Fragile with a thicker skin…
• Encourages a defensive, static mindset
• Resistant to change?
• Vulnerable to “Black Swan” events…
  – Something we haven’t anticipated
  – A failure mode we can’t have foreseen
  – A cascade of errors that we did not plan for

Page 7: Antifragile, Microservices and DevOps - A Study

Black Swans

Page 8: Antifragile, Microservices and DevOps - A Study

Anti-fragile

When exposed to stress it gets stronger

Page 9: Antifragile, Microservices and DevOps - A Study

Anti-fragile

Some things benefit from shocks…volatility, randomness, disorder, and stressors and love adventure, risk, and uncertainty… there is no word for the exact opposite of fragile. Let’s call it antifragile.

Nassim N. Taleb, “Antifragile: Things That Gain from Disorder”

Page 10: Antifragile, Microservices and DevOps - A Study

Triple Prism of Fragile, Robust & Anti-fragile

Page 11: Antifragile, Microservices and DevOps - A Study

                    | Fragile            | Robust                  | Anti-Fragile
Icon                | Glass              | Medieval Castle         | DNA/Muscle
Methodology         | “Spaghetti”        | ITIL                    | DevOps
Attitude to change  | Fear Change        | Resist Change           | Embrace Change
Response to Change  | Break              | Repel                   | Adapt
Rate of Change      | Ideally never!     | Slow                    | Rapid
Change initiated by | Needs CEO approval | Change Management Board | User-initiated (via automation)
Focuses on          | Survival           | Process                 | Business Value

http://blog.devopsguys.com/2013/07/17/devops-antifragility-and-the-borg-collective/

Page 12: Antifragile, Microservices and DevOps - A Study

Is the System in Your Company

• Fragile?
• Robust??
• Anti-fragile???

Page 13: Antifragile, Microservices and DevOps - A Study

Anti-fragile Microservices Architecture

Page 14: Antifragile, Microservices and DevOps - A Study

Microservices Architecture – A Case in Practice

Page 15: Antifragile, Microservices and DevOps - A Study

Service Dependency

Page 16: Antifragile, Microservices and DevOps - A Study

Single Dependency Delay Causing Blocking of User Request

Page 17: Antifragile, Microservices and DevOps - A Study

All User Requests will be Blocked at Peak Hour(Cascading Failure)
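One rough way to see why a single slow dependency can block every user request is Little's Law: the average number of in-flight requests equals the arrival rate multiplied by the average latency. The sketch below applies it to a request-thread pool. The pool size, traffic, and latency figures are hypothetical and are not taken from this study.

```python
def threads_in_use(requests_per_second: float, latency_seconds: float) -> float:
    """Little's Law: average concurrent requests = arrival rate * average latency."""
    return requests_per_second * latency_seconds


def pool_is_exhausted(pool_size: int, requests_per_second: float,
                      latency_seconds: float) -> bool:
    """True when steady-state demand needs at least as many threads as the pool has."""
    return threads_in_use(requests_per_second, latency_seconds) >= pool_size


# Hypothetical numbers: a 200-thread pool serving 500 requests per second at peak.
POOL_SIZE = 200
PEAK_RPS = 500.0

healthy = threads_in_use(PEAK_RPS, 0.05)  # dependency answers in 50 ms
degraded = threads_in_use(PEAK_RPS, 2.0)  # the same dependency slows to 2 s

print(f"healthy:  about {healthy:.0f} threads busy")
print(f"degraded: about {degraded:.0f} threads needed, pool exhausted: "
      f"{pool_is_exhausted(POOL_SIZE, PEAK_RPS, 2.0)}")
```

Under these assumed numbers the pool has ample headroom while the dependency is fast, but the slowdown demands several times more threads than exist, so requests that never touch the slow dependency queue up behind those that do.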

Page 18: Antifragile, Microservices and DevOps - A Study

Circuit Breaker & Bulkhead Isolation Pattern

https://github.com/Netflix/Hystrix
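The two patterns can be illustrated with a minimal sketch. This is not the Hystrix API; the class names, thresholds, and the `slow_dependency` function are hypothetical. The circuit breaker stops calling a dependency after repeated failures and answers from a fallback until a reset timeout passes. The bulkhead caps how many callers may wait on one dependency at a time.

```python
import threading
import time


class CircuitOpenError(Exception):
    """Raised when the circuit is open and the caller supplied no fallback."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback=None):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast without touching the dependency.
                if fallback is not None:
                    return fallback()
                raise CircuitOpenError("circuit is open")
            # Reset timeout elapsed: half-open, let one trial call through.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip, or re-trip after a failed trial
            if fallback is not None:
                return fallback()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


class Bulkhead:
    """Caps concurrent calls to one dependency so it cannot use every thread."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, fallback):
        if not self._slots.acquire(blocking=False):
            return fallback()  # compartment full: reject instead of queueing
        try:
            return func()
        finally:
            self._slots.release()


calls = {"n": 0}


def slow_dependency():
    """Hypothetical dependency that always times out."""
    calls["n"] += 1
    raise TimeoutError("dependency timed out")


breaker = CircuitBreaker(failure_threshold=3, reset_timeout=5.0)
results = [breaker.call(slow_dependency, fallback=lambda: "cached") for _ in range(5)]
print(results, "dependency was actually called", calls["n"], "times")
```

Once the breaker trips, later callers get the fallback immediately instead of holding a thread while the dependency times out, which is what stops one failure from cascading.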

Page 19: Antifragile, Microservices and DevOps - A Study

Cross IDC Active - Active

[Diagram: two data centers, DC 1 and DC 2, behind a GSLB. Each DC contains a DC Aware Gateway, an SOA Edge Service, an SOA Middle Tier Service, a Service Registry and a DC Aware Client. Arrows are labelled Invoke, Lookup, Register, Peer Sync (between the two Service Registries) and Fallback Invocation.]
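The labels in this diagram (Register, Lookup, Invoke, Fallback Invocation) suggest a client that prefers instances in its own data center and falls back to the other one on failure. The sketch below is one possible reading of that behaviour, not the system's actual code. `ServiceRegistry`, `DcAwareClient`, and the `orders` service are hypothetical names. Peer Sync is left out, so each registry only knows its own data center.

```python
class ServiceRegistry:
    """In-memory stand-in for the per-DC service registry."""

    def __init__(self):
        self._instances = {}  # service name -> list of callables

    def register(self, service, instance):
        self._instances.setdefault(service, []).append(instance)

    def lookup(self, service):
        return list(self._instances.get(service, []))


class DcAwareClient:
    """Invokes a service in the local DC first, then falls back to other DCs."""

    def __init__(self, local_dc, registries):
        self.local_dc = local_dc
        self.registries = registries  # DC name -> ServiceRegistry

    def invoke(self, service, request):
        dc_order = [self.local_dc] + [dc for dc in self.registries if dc != self.local_dc]
        last_error = None
        for dc in dc_order:
            for instance in self.registries[dc].lookup(service):
                try:
                    return instance(request)
                except Exception as exc:  # treat any error as an unhealthy instance
                    last_error = exc
        raise RuntimeError(f"no healthy instance of {service!r}") from last_error


def broken_instance(request):
    """Hypothetical DC 1 instance that is currently unreachable."""
    raise ConnectionError("dc1 instance unreachable")


registries = {"dc1": ServiceRegistry(), "dc2": ServiceRegistry()}
registries["dc1"].register("orders", broken_instance)
registries["dc2"].register("orders", lambda request: f"dc2 handled {request}")

client = DcAwareClient("dc1", registries)
result = client.invoke("orders", "req-1")
print(result)
```

Trying the local data center first keeps normal traffic off the cross-DC link, so the remote call happens only when the local instances fail.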

Page 21: Antifragile, Microservices and DevOps - A Study

Building Distributed Systems is Extremely Hard

• Even Harder to Test Sufficiently
  – Massive data sets and changing shape
  – Internet-scale traffic
  – Complex interaction and information flow
  – Asynchronous nature
  – 3rd party services
  – All while innovating and building features

Prohibitively expensive, if not impossible, for most large-scale systems.

Page 22: Antifragile, Microservices and DevOps - A Study

There is another Way

• Assume everything will fail
• Cause failure to validate resiliency
• Test design assumptions by stressing them
• Don’t wait for random failure. Remove its uncertainty by forcing it periodically.

Page 23: Antifragile, Microservices and DevOps - A Study

What Netflix has Done – Embrace Chaos!

“One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.”

http://luckyrobot.com/netflix-chaos-monkey-keeps-movies-streaming/

http://www.codinghorror.com/blog/2011/04/working-with-the-chaos-monkey.html
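The idea in the quote can be tried on a toy scale. The sketch below is not Netflix's Chaos Monkey. It is a simulation with hypothetical `Instance` and `Cluster` classes, in which a `chaos_monkey` function kills one randomly chosen live instance and the cluster is then checked to see whether it still serves requests.

```python
import random


class Instance:
    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Cluster:
    """Routes each request to the first instance that answers."""

    def __init__(self, names):
        self.instances = [Instance(name) for name in names]

    def handle(self, request):
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # try the next instance
        raise RuntimeError("total outage: no instance could serve the request")


def chaos_monkey(cluster, rng):
    """Kill one randomly chosen live instance; return it, or None if none are left."""
    alive = [instance for instance in cluster.instances if instance.alive]
    if not alive:
        return None
    victim = rng.choice(alive)
    victim.alive = False
    return victim


cluster = Cluster(["i-a", "i-b", "i-c"])
rng = random.Random(0)  # seeded so a failing run can be replayed

victim = chaos_monkey(cluster, rng)
print(f"killed {victim.name}; cluster still answers: {cluster.handle('GET /')}")
```

If `cluster.handle` raised after a single kill, the failover assumption would be shown false during a controlled experiment rather than during a real outage.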

Page 24: Antifragile, Microservices and DevOps - A Study

Netflix Simian Army

Page 25: Antifragile, Microservices and DevOps - A Study

Representative Anti-fragile Organization

The Netflix cloud architecture is anti-fragile… The Netflix culture is anti-fragile… Getting stronger through failure is the basis of anti-fragility. Avoiding failure at all costs… makes you brittle and vulnerable…

Adrian Cockcroft, “Looking back at 2012 with pointers to 2013”
http://perfcap.blogspot.com/2013/12/looking-back-at-2013-with-pointers-to.html

Page 26: Antifragile, Microservices and DevOps - A Study

Architecture for Imperfection

A highly agile and highly available service constructed from ephemeral and often broken components. It is a service-oriented architecture built on micro-services, none of which are essential to the operation of the whole.

The software is written to run across three Amazon datacenters, and will tolerate the loss of any one. We can lose a third of our infrastructure without our customers noticing and calling customer services, it’s no idle claim, Netflix even tests this aspect of its infrastructure. A few weeks ago the team deliberately killed one of the three zones, knocking out 3000 servers in one fell swoop, just to prove that we could do it.

By Adrian Cockcroft, from “Netflix, HANA and the meaning of cloud”
http://diginomica.com/2013/05/13/netflix-hana-and-the-meaning-of-cloud/
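Tolerating the loss of one zone out of three implies a capacity rule: the surviving zones must be able to carry the whole peak load. The sketch below works through that arithmetic, assuming equally sized zones and evenly spread traffic. The 3000 servers per zone comes from the quote above. The peak-demand figures are hypothetical.

```python
def max_safe_utilization(zones: int, zones_lost: int = 1) -> float:
    """Highest per-zone utilization that still lets the survivors absorb the load."""
    if not 0 <= zones_lost < zones:
        raise ValueError("zones_lost must be between 0 and zones - 1")
    return (zones - zones_lost) / zones


def survives_zone_loss(zones: int, servers_per_zone: int,
                       servers_needed_at_peak: int, zones_lost: int = 1) -> bool:
    """True if the remaining zones still have enough servers for peak demand."""
    remaining = (zones - zones_lost) * servers_per_zone
    return remaining >= servers_needed_at_peak


ZONES = 3
SERVERS_PER_ZONE = 3000  # from the quoted zone-kill test

print(f"run each zone at no more than {max_safe_utilization(ZONES):.0%} of capacity")
# Hypothetical peak demands, expressed in servers:
print("peak of 6000 servers survives:", survives_zone_loss(ZONES, SERVERS_PER_ZONE, 6000))
print("peak of 7000 servers survives:", survives_zone_loss(ZONES, SERVERS_PER_ZONE, 7000))
```

With three zones, each one has to run at no more than two thirds of its capacity for the loss of any one to go unnoticed. That spare third is the cost of this kind of resilience.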

Page 27: Antifragile, Microservices and DevOps - A Study

Netflix Global Active – Active Cloud Architecture

http://awsmedia.s3.amazonaws.com/ARC305.pdf

Page 28: Antifragile, Microservices and DevOps - A Study

What on Earth is DevOps?

Devops means giving a sh*t about your job enough to not pass the buck.
Devops means giving a sh*t about your job enough to want to learn all the parts and not just your little world.
Developers need to understand infrastructure.
Operations people need to understand code.
- John E. Vincent (@Lusis)

http://blog.lusis.org/blog/2013/06/04/devops-the-title-match/

Page 30: Antifragile, Microservices and DevOps - A Study

The First Way

Silo vs. System Thinking, focus on the end to end value flow.

Page 31: Antifragile, Microservices and DevOps - A Study

The Second Way

System improvement via visibility, feedback and data driven decisions

Page 32: Antifragile, Microservices and DevOps - A Study

The Third Way

• Embrace Change
• Be willing to Experiment
• Learn from your mistakes

Page 33: Antifragile, Microservices and DevOps - A Study
Page 34: Antifragile, Microservices and DevOps - A Study

Microservices Organizational Structure

Page 35: Antifragile, Microservices and DevOps - A Study

Take Away

1. Obsessive protection of a system against extremely rare events makes it more fragile.

2. Monoculture is fragile, diversity is anti-fragile.

3. If it hurts, do it more often, and bring the pain forward.

4. To create anti-fragile systems, stress them continuously so that we are forced to simplify and automate.

Page 36: Antifragile, Microservices and DevOps - A Study

Reading for System and Architectural Thinking – recommended by Adrian Cockcroft