How to measure web performance: covering different techniques, the importance of context, real-user measurement, synthetic monitoring, and more.
Steve Thair, Seriti Consulting
@TheOpsMgr

MEASURING WEB PERFORMANCE
Every measurement of web performance you will ever make will be wrong.
MY DEFINITION OF WEB PERFORMANCE
“The human perception of duration is both subjective and variable”
http://en.wikipedia.org/wiki/Time_perception
“PERCEPTION IS VARIABLE…”
Go read Stoyan’s talk! http://velocityconf.com/velocity2010/public/schedule/detail/13019
Web Performance: Subjective versus Objective

Subjective — “Qualitative techniques”:
• Case Studies
• Focus Groups
• Interviews
• Video Analysis
• Surveys

Objective — “Quantitative techniques”:
• JavaScript
• Navigation timing
• Browser Extensions
• Custom Browsers
• Proxy timings
• Web Server mods
• Network sniffing
“I keep six honest serving-men
(They taught me all I knew);
Their names are What and Why and When
And How and Where and Who.”
– Rudyard Kipling, “The Elephant’s Child”
WHAT LEVEL DO YOU MEASURE?
• Journey
• Page
• Object
CHOOSE YOUR METRIC!
https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming/Overview.html
4 Key “Raw” Metrics
• Time to First Byte (TTFB)
• Render Start Time
• DOMContentLoaded
• Page (onLoad) Load Time (PLT)
What about “Above the Fold” time?
• How long to “render the static stuff in the viewable area of the page”?
• Limitations of AFT:
  – Only applicable to a lab setting
  – Does not reflect user-perceived latency based on functionality
http://assets.en.oreilly.com/1/event/62/Above%20the%20Fold%20Time_%20Measuring%20Web%20Page%20Performance%20Visually%20Presentation.pdf
WHAT OTHER METRICS?
• Apdex
• Statistical Metrics
• Counts/Histograms
• Raw Metrics
Apdex(t) = (Satisfied Count + Tolerated Count / 2) / Total Samples
• A number between 0 and 1 that represents “user satisfaction”
• For technical reasons the “Tolerated” threshold is set to four times the “Satisfied” threshold, so if your “Satisfied” threshold (t) was 4 seconds then:
  • 0 to 4 seconds = Satisfied; 4 to 16 seconds = Tolerated; over 16 seconds = Frustrated
http://apdex.org/
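As a worked example, here is a minimal JavaScript sketch of the Apdex formula above (the function name and sample values are illustrative):

// Minimal Apdex sketch: samples are page load times in seconds,
// t is the "Satisfied" threshold; "Tolerated" is fixed at 4t per the spec.
function apdex(samples, t) {
  var satisfied = samples.filter(function (s) { return s <= t; }).length;
  var tolerated = samples.filter(function (s) { return s > t && s <= 4 * t; }).length;
  return (satisfied + tolerated / 2) / samples.length;
}

// With t = 4s: loads of 2s and 3s satisfy, 10s tolerates, 20s frustrates
console.log(apdex([2, 3, 10, 20], 4)); // (2 + 1/2) / 4 = 0.625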
PERFORMANCE IS MULTI-DIMENSIONAL
• Multiple Metrics
• For Multiple URLs
• From Different Locations
• Using Different Tools
• Across the Lifecycle
• Over Time
The importance of CONTEXT
Context:
• Time of Day
• Browser
• Addons & Extensions
• Cached objects
• Wired, WiFi, 3G
• Location
• Bandwidth
• Latency
• Operating System
• Antivirus
• Device
• Resolution
Who?
• User Experience
• Developers
• Testers
• WebOps
• “The Boss”

When? Across the SDLC:
• Design (UX)
• Develop
• Build (CI)
• QA
• Prod Ops
WHERE? – DEPENDS ON THE HOW & WHY… (SYNTHETIC VERSUS REAL-USER)

[Diagram: measurement points along the delivery chain – the “real user’s” web browser or smartphone, WiFi or 3G, Internet proxy server, (reverse) proxy server, firewall / load-balancer, a network “sniffer” on a SPAN port or network tap, the web server, and a synthetic agent. User/browser metrics sit at one end and server-based metrics at the other, with the signal/noise ratio increasing towards the server side.]
THE SYNTHETIC VERSUS REAL-USER DEBATE

“…it's a question of when, not if, active monitoring of websites for availability and performance will be obsolete.”
– Pat Meenan

“Because you’re skipping the ‘last mile’ between the server and the user’s browser, you’re not seeing how your site actually performs in the real world.”
– Josh Bixby

“You can have my active monitoring when you pry it from my cold, dead hands…”
– Steve Thair

http://blog.patrickmeenan.com/2011/05/demise-of-active-website-monitoring.html
http://www.webperformancetoday.com/2011/07/05/web-performance-measurement-island-is-sinking/
http://www.seriticonsulting.com/blog/2011/5/21/you-can-have-my-active-monitoring-when-you-pry-it-from-my-co.html
OBSERVATIONAL STUDY VERSUS EXPERIMENT

• Both typically have the goal of detecting a relationship between the explanatory and response variables.

Experiment
• Create differences in the explanatory variable and examine any resulting changes in the response variable (cause-and-effect conclusion)

Observational Study
• Observe differences in the explanatory variable and notice any related differences in the response variable (association between variables)

http://www.math.utah.edu/~joseph/Chapter_09.pdf
Observational Study = Real-User
• “Watching” what happens in a given population sample
• We can only observe… and try to infer what is actually happening
• Many “confounding variables”
• Low signal-to-noise ratio
• Correlation
Experiment = Synthetic
• We “design” our experiment
• We choose when, where, what, how etc.
• We control the variables (as much as possible)
• Higher signal-to-noise ratio
• Causation*

* OK, real “root cause” analysis will probably take a lot more investigation, I admit… but you get closer!
So which one is better? Neither.

COMPLEMENTARY, NOT COMPETING
“…Ultimately I'd love to see a hybrid model where synthetic tests are triggered based on something detected in the data (slowdown, drop in volume, etc) to validate the issue or collect more data.”
– Pat Meenan
From Observation… to Experiment… by controlling the variables:
1. Real-User Monitoring detects a change in a page’s performance
2. An API call triggers a synthetic, controlled test
3. Compare the result to a baseline
Use RUM as a “Reality Check”
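A minimal sketch of that hybrid loop in JavaScript, assuming a RUM beacon handler on the server side and using WebPageTest’s public runtest.php endpoint (the baseline value, the “slowdown” rule, and the API-key handling are all illustrative):

// Hypothetical handler invoked for each incoming RUM beacon.
var BASELINE_PLT_MS = 3000;   // assumed baseline for this page
var WPT_API_KEY = 'YOUR_KEY'; // placeholder

function onRumBeacon(pageUrl, pltMs) {
  if (pltMs > BASELINE_PLT_MS * 1.5) { // arbitrary "slowdown" trigger
    var api = 'https://www.webpagetest.org/runtest.php' +
              '?url=' + encodeURIComponent(pageUrl) +
              '&k=' + WPT_API_KEY + '&f=json';
    fetch(api)
      .then(function (res) { return res.json(); })
      .then(function (json) {
        // userUrl points at the queued test's results page
        console.log('Synthetic test queued:', json.data.userUrl);
      });
  }
}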
Back to the “How”…
7 WAYS OF MEASURING WEBPERF
1. JavaScript timing e.g. Souders’ Episodes or Yahoo! Boomerang*
2. Navigation-Timing e.g. GA SiteSpeed
3. Browser Extension e.g. HTTPWatch
4. Custom browser e.g. 3pmobile.com or (headless) PhantomJS.org
5. Proxy timing e.g. Charles proxy
6. Web Server Mod e.g. APM solutions
7. Network sniffing e.g. Atomic Labs Pion
COMPARING METHODS…
Measurement Method comparison:

Metric               | JavaScript | Nav-Timing API | Browser Extension | Custom Browser | Proxy Debugger | Web Server Mod | Network Sniffing
Example Product      | WebTuna    | SiteSpeed      | HTTPWatch         | 3PMobile       | Charles Proxy  | APM Modules    | Pion
"Blocked/Wait"       | No         | No             | Yes               | Yes            | Yes            | No             | No
DNS                  | No         | Yes            | Yes               | Yes            | Yes            | No             | No
Connect              | No         | Yes            | Yes               | Yes            | Yes            | No             | Yes
Time to First Byte   | Partially  | Yes            | Yes               | Yes            | Yes            | Yes            | Yes
"Render Start"       | No         | No             | Yes               | Yes            | No             | No             | No
DOMReady             | Partially  | Yes            | Yes               | Yes            | No             | No             | No
"Page/HTTP Complete" | Partially  | Yes            | Yes               | Yes            | Yes            | No             | Partially
OnLoad Event         | Yes        | Yes            | Yes               | Yes            | No             | No             | No
JS Execution Time    | Partially  | No             | Yes               | Yes            | No             | No             | No
Page-Level           | Yes        | Yes            | Yes               | Yes            | Partially      | Partially      | Partially
Object-Level         | No         | No             | Yes               | Yes            | Yes            | Yes            | Yes
Good for RUM?        | Yes        | Yes            | Partially         | No             | No             | Partially      | Yes
Good for Mobile?     | Partially  | Partially      | Partially         | Partially      | Partially      | Partially      | Partially
Affects Measurement  | Yes        | No             | Yes               | Yes            | Yes            | Yes            | No
JAVASCRIPT TIMING – HOW IT WORKS
1. unLoad Event: var start = new Date().getTime()
2. Stick it in a Cookie
3. Load the next page
4. onLoad Event: var end = new Date().getTime()
5. PLT = end - start
6. Send a beacon: beacon.gif?time=plt
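A minimal sketch of this cookie-based pattern (the cookie name and beacon path are illustrative; Episodes/Boomerang handle many more edge cases):

// On unload of page N, stash a timestamp in a cookie; on load of
// page N+1, read it back and beacon the difference as the PLT.
window.addEventListener('beforeunload', function () {
  document.cookie = 'navstart=' + new Date().getTime() + '; path=/';
});

window.addEventListener('load', function () {
  var match = document.cookie.match(/(?:^|; )navstart=(\d+)/);
  if (match) {
    var plt = new Date().getTime() - parseInt(match[1], 10);
    new Image().src = '/beacon.gif?time=' + plt; // fire the beacon
  }
});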
PROS & CONS OF JAVASCRIPT TIMING
• Pros
  • Simple
  • Episodes/Boomerang provide custom timing for developer instrumentation
• Cons
  • Relies on JavaScript and Cookies
  • Only accurate for the 2nd page onwards in a journey
  • Can only really get a “page load metric” and a partial TTFB metric
  • “Observer effect” (and JavaScript can break!)
Measurement Method: JavaScript (Example Product: WebTuna)
• "Blocked/Wait": No
• DNS: No
• Connect: No
• Time to First Byte: Partially
• "Render Start": No
• DOMReady: Partially
• "Page/HTTP Complete": Partially
• OnLoad Event: Yes
• JS Execution Time: Partially
• Page-Level: Yes
• Object-Level: No
• Good for RUM?: Yes
• Good for Mobile?: Partially
• Affects Measurement: Yes
NAVIGATION-TIMING – HOW IT WORKS
1. onLoad Event: var end = new Date().getTime()
2. var plt = end - performance.timing.navigationStart;
3. Send a beacon: beacon.gif?time=plt
NAVIGATION TIMING METRICS
https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming/Overview.html
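For example, a beacon for the key “raw” metrics can be read straight off performance.timing per the spec above (the beacon path is illustrative; note loadEventEnd is only populated after the onload handler returns, hence the setTimeout):

window.addEventListener('load', function () {
  setTimeout(function () { // let loadEventEnd be filled in first
    var t = performance.timing;
    new Image().src = '/beacon.gif' +
      '?ttfb=' + (t.responseStart - t.navigationStart) +
      '&dcl='  + (t.domContentLoadedEventStart - t.navigationStart) +
      '&plt='  + (t.loadEventEnd - t.navigationStart);
  }, 0);
});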
PROS & CONS OF NAVIGATION-TIMING
• Pros
  • Even simpler!
  • Lots more metrics
  • More accurate
• Cons
  • Needs browser support for the API
    • IE9+ / Chrome 6+ / Firefox 7+
  • Relies on JavaScript (for querying the API & sending the beacon)
  • “Observer effect”
  • Page-level only
Measurement Method: Navigation-Timing API (Example Product: SiteSpeed)
• "Blocked/Wait": No
• DNS: Yes
• Connect: Yes
• Time to First Byte: Yes
• "Render Start": No
• DOMReady: Yes
• "Page/HTTP Complete": Yes
• OnLoad Event: Yes
• JS Execution Time: No
• Page-Level: Yes
• Object-Level: No
• Good for RUM?: Yes
• Good for Mobile?: Partially
• Affects Measurement: No
A BIT MORE ABOUT GA SITESPEED…
• Just add one line for basic, free, real-user monitoring!

_gaq.push(['_setAccount', 'UA-12345-1']);
_gaq.push(['_trackPageview']);
_gaq.push(['_trackPageLoadTime']);

• Sampling appears to vary (a lot!)
  • 10% of page visits by design, but reported anywhere from 2% to 100%
• Falls back to the Google Toolbar if available (but NOT JavaScript timing)
• Will probably make you think perf is better than it really is…
BROWSER EXTENSION – HOW IT WORKS
1. Write a browser extension…
2. …that subscribes to a whole lot of API event listeners
https://developer.mozilla.org/en/XPCOM_Interface_Reference
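The slide references Firefox’s XPCOM interfaces; as a rough sketch of the same idea using a different API, a Chrome extension (assuming the webRequest permission is declared in its manifest) can timestamp every request via event listeners:

// Each webRequest event arrives with a timestamp, giving
// object-level timings per request.
var starts = {}; // requestId -> start timestamp

chrome.webRequest.onBeforeRequest.addListener(function (details) {
  starts[details.requestId] = details.timeStamp;
}, { urls: ['<all_urls>'] });

chrome.webRequest.onCompleted.addListener(function (details) {
  var elapsed = details.timeStamp - starts[details.requestId];
  console.log(details.url + ' took ' + elapsed.toFixed(0) + ' ms');
  delete starts[details.requestId];
}, { urls: ['<all_urls>'] });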
PROS & CONS OF BROWSER EXTENSIONS
• Pros
  • Very complete metrics
  • Object and Page level
  • No JavaScript (in the page at least)!!!
  • Great for continuous integration perf testing
• Cons
  • Getting users to install it…
  • Not natively cross-browser
  • Some browsers don’t support extensions
    • Especially mobile browsers!
  • “Observer effect”
Measurement Method: Browser Extension (Example Product: HTTPWatch)
• "Blocked/Wait": Yes
• DNS: Yes
• Connect: Yes
• Time to First Byte: Yes
• "Render Start": Yes
• DOMReady: Yes
• "Page/HTTP Complete": Yes
• OnLoad Event: Yes
• JS Execution Time: Yes
• Page-Level: Yes
• Object-Level: Yes
• Good for RUM?: Partially
• Good for Mobile?: Partially
• Affects Measurement: Yes
CUSTOM BROWSER – HOW IT WORKS
1. Take some open-source browser code, like WebKit or the Android Browser
2. Add your own timing instrumentation to it
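As a concrete taste of the headless route, here is a minimal PhantomJS page-load timing script (essentially the classic loadspeed example from the PhantomJS docs):

// Usage: phantomjs loadspeed.js http://example.com/
var page = require('webpage').create();
var system = require('system');
var start = Date.now();

page.open(system.args[1], function (status) {
  if (status !== 'success') {
    console.log('FAILED to load the page');
  } else {
    console.log('Page load time: ' + (Date.now() - start) + ' ms');
  }
  phantom.exit();
});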
PROS & CONS OF CUSTOM BROWSER
• Pros
  • Great when you can’t use extensions / JavaScript / cookies, i.e. for mobile performance e.g. 3Pmobile.com
  • Great for automation e.g. http://www.PhantomJS.org/
  • Good metrics (depending on OS API availability)
• Cons
  • Requires installation
  • Maintaining fidelity to “real browser” measurements
  • “Observer Effect” (due to instrumentation code)
Measurement Method: Custom Browser (Example Product: 3PMobile)
• "Blocked/Wait": Yes
• DNS: Yes
• Connect: Yes
• Time to First Byte: Yes
• "Render Start": Yes
• DOMReady: Yes
• "Page/HTTP Complete": Yes
• OnLoad Event: Yes
• JS Execution Time: Yes
• Page-Level: Yes
• Object-Level: Yes
• Good for RUM?: No
• Good for Mobile?: Partially
• Affects Measurement: Yes
PROXY DEBUGGER – HOW IT WORKS
1. Change the browser to use a debugging proxy e.g. Charles or Fiddler
2. The debugging proxy records each request
3. Export the data to a log
PROS & CONS OF PROXY DEBUGGER
• Pros
  • One simple change to browser config
  • No JavaScript / Cookies
  • Can offer bandwidth throttling
• Cons
  • Proxies significantly impact HTTP traffic
    • http://insidehttp.blogspot.com/2005/06/using-fiddler-for-performance.html
  • No access to browser events
  • The concept of a “page” can be problematic…
Measurement Method: Proxy Debugger (Example Product: Fiddler Proxy)
• "Blocked/Wait": Yes
• DNS: Yes
• Connect: Yes
• Time to First Byte: Yes
• "Render Start": No
• DOMReady: No
• "Page/HTTP Complete": Yes
• OnLoad Event: No
• JS Execution Time: No
• Page-Level: Partially
• Object-Level: Yes
• Good for RUM?: No
• Good for Mobile?: Partially
• Affects Measurement: Yes
6 Keep-Alive connections per SERVER
versus
8 Keep-Alive connections TOTAL per PROXY
(Firefox 7.0.1)
WEB SERVER MOD – HOW IT WORKS
1. Write a webserver mod or ISAPI filter
2. Start a timer on request; stop it when the response is sent
http://www.apachetutor.org/dev/request
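The mod itself would be C against the Apache APIs (see the link above); as a rough analogy in JavaScript, the same start/stop-timer logic looks like this in a bare Node.js server (the port and response body are placeholders):

var http = require('http');

http.createServer(function (req, res) {
  var start = process.hrtime();          // timer starts with the request
  res.on('finish', function () {         // fires once the response is sent
    var d = process.hrtime(start);
    console.log(req.method + ' ' + req.url + ' served in ' +
                (d[0] * 1000 + d[1] / 1e6).toFixed(1) + ' ms');
  });
  res.end('Hello');                      // stand-in for the real application
}).listen(8080);

Like the server mod, this only measures back-end time: it sees nothing of network RTT or what happens in the browser.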
PROS & CONS OF WEB SERVER MOD
• Pros
  • Great for Application Performance Management (APM)
  • Can be used in a “hybrid mode” with JavaScript timing
  • Measures your “back-end” performance
  • Can be easy to deploy*
• Cons
  • Limited metrics: ignores network RTT and only sees origin requests
  • “Observer Effect” (~5% server perf hit with APM?)
  • The concept of a “page” can be problematic…
  • Can be a pain to deploy*
Measurement Method: Web Server Mod (Example Product: APM Modules)
• "Blocked/Wait": No
• DNS: No
• Connect: No
• Time to First Byte: Yes
• "Render Start": No
• DOMReady: No
• "Page/HTTP Complete": No
• OnLoad Event: No
• JS Execution Time: No
• Page-Level: Partially
• Object-Level: Yes
• Good for RUM?: Partially
• Good for Mobile?: Partially
• Affects Measurement: Yes
NETWORK SNIFFING – HOW IT WORKS
1. Create a SPAN port or network tap
2. Capture traffic with promiscuous-mode packet sniffing
PROS & CONS OF NETWORK SNIFFING
• Pros
  • No “observer effect” (totally “passive”)
  • Very common “appliance-based” RUM solution
  • Can be used in a “hybrid mode” with JavaScript timing
  • Can be easy to deploy*
• Cons
  • Limited metrics, and only sees origin requests
  • Not “cloud friendly” at present
  • The concept of a “page” can be problematic…
  • Can be a pain to deploy*
Measurement Method: Network Sniffing (Example Product: Pion)
• "Blocked/Wait": No
• DNS: No
• Connect: Yes
• Time to First Byte: Yes
• "Render Start": No
• DOMReady: No
• "Page/HTTP Complete": Partially
• OnLoad Event: No
• JS Execution Time: No
• Page-Level: Partially
• Object-Level: Yes
• Good for RUM?: Yes
• Good for Mobile?: Partially
• Affects Measurement: No
SUMMARY
• Performance is subjective (but we try to make it objective)
• Performance is Multi-dimensional
• Context is critical
• “Observational Studies AND Experiments”
• Real User Monitoring AND Synthetic Monitoring
• 7 different measurement techniques, each with pros & cons
@LDNWEBPERF USER GROUP!
• Join our London Web Performance Meetup
  • http://www.meetup.com/London-Web-Performance-Group/
• Next Wednesday 16th Nov, 7pm – London (Bank)
  • WPO case study from www.thetimes.co.uk!
• Follow us on Twitter @LDNWebPerf
  • #LDNWebPerf & #WebPerf
QUESTIONS?
http://mobro.co/TheOpsMgr