Systems Programming in Python - Week 6 Asynchronous Network I/O Brian Dorsey [email protected] .


Page 1: Systems Programming in Python - Week 6 Asynchronous Network I/O

Systems Programming in Python - Week 6

Asynchronous Network I/O

Brian Dorsey [email protected]

Page 2: Systems Programming in Python - Week 6 Asynchronous Network I/O

why  do  we  care?

Page 3: Systems Programming in Python - Week 6 Asynchronous Network I/O

if  you  need  lots  of  simultaneous  connections

chat

service proxies

slow clients

anything  that  takes  a  while

Page 4: Systems Programming in Python - Week 6 Asynchronous Network I/O

you’ve  been  warned

big  topic

&  

I’m  a  newbie

Page 5: Systems Programming in Python - Week 6 Asynchronous Network I/O

CPU  bound  problems  will  need  different  solutions,  processes,  multiple  computers,  etc...

of  course,  once  you  try  to  get  them  all  talking  to  each  other...  you  have  network  I/O  bound  problems  again.  :)

the solutions in this talk are only effective if your application is I/O bound

Page 6: Systems Programming in Python - Week 6 Asynchronous Network I/O

telephone  analogy

• answer
• ask someone to get info (they walk across the building and back)
• relay answer
• hang up

Page 7: Systems Programming in Python - Week 6 Asynchronous Network I/O


here we go

Page 8: Systems Programming in Python - Week 6 Asynchronous Network I/O

we’ve got a fantastic phone system; it can easily handle thousands of incoming calls, but we have only 1 person answering the phones

blocking  I/O

Page 9: Systems Programming in Python - Week 6 Asynchronous Network I/O

what  if  you  need  to  handle  more  than  one  client?

Page 10: Systems Programming in Python - Week 6 Asynchronous Network I/O

1  thread  =  1  person  answering  phones,  10  people  in  our  phone  pool

threads (& thread pools)

Django, CherryPy, etc. (mod_wsgi uses processes and threads)
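A minimal sketch of the "10 people in our phone pool" idea, assuming an echo protocol and a pool size of 10. The names `echo` and `serve`, and the `max_clients` parameter, are made up for illustration; real servers such as the ones named above are far more involved.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def echo(conn):
    """One 'operator': stays on the line with a single client until it hangs up."""
    with conn:
        while True:
            data = conn.recv(4096)      # blocks this worker thread only
            if not data:                # empty read means the client closed
                break
            conn.sendall(data)


def serve(listener, pool, max_clients=None):
    """Accept connections and hand each one to a worker thread from the pool.

    max_clients is a hypothetical knob so the loop can stop; None means run forever.
    """
    served = 0
    while max_clients is None or served < max_clients:
        conn, _addr = listener.accept()
        pool.submit(echo, conn)
        served += 1


if __name__ == "__main__":
    # Port 50000 is an arbitrary choice for this sketch.
    with socket.create_server(("127.0.0.1", 50000)) as listener:
        with ThreadPoolExecutor(max_workers=10) as pool:
            serve(listener, pool)
```

With 10 workers, the 11th simultaneous caller waits in the queue until an operator is free.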

Page 11: Systems Programming in Python - Week 6 Asynchronous Network I/O

We’re  I/O  bound  today,  so  no  worries.

GIL?

(or... the internet says Python can’t do real threading)
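A small sketch of why the GIL is not a worry for I/O-bound work: CPython releases the GIL while a thread is blocked waiting, so the waits overlap. Here `time.sleep` stands in for a blocking network call (an assumption for illustration), and `timed_waits` is a made-up helper name.

```python
import threading
import time


def timed_waits(n_threads=10, wait=0.2):
    """Start n_threads that each block for `wait` seconds; return total elapsed time."""
    threads = [
        threading.Thread(target=time.sleep, args=(wait,))  # sleep stands in for blocking I/O
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # If the waits ran one after another this would take about 10 * 0.2 seconds;
    # because they overlap, it should take roughly one wait.
    print(f"elapsed: {timed_waits():.2f}s")
```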

Page 12: Systems Programming in Python - Week 6 Asynchronous Network I/O

remember  my  apache  bench  stats  last  quarter?

this scales up pretty darn well

Page 13: Systems Programming in Python - Week 6 Asynchronous Network I/O

except  when  it  doesn’t

Page 14: Systems Programming in Python - Week 6 Asynchronous Network I/O

just  keep  adding  people  to  answer  phones

what  if  we  get  more  traffic  than  we  can  handle?

Page 15: Systems Programming in Python - Week 6 Asynchronous Network I/O

as  long  as  each  conversation  is  short,  we’re  OK

Page 16: Systems Programming in Python - Week 6 Asynchronous Network I/O

what if we need to ask a question which takes a while to answer?

Page 17: Systems Programming in Python - Week 6 Asynchronous Network I/O

and  eventually,  we  run  out  of  space  for  more  people  to  answer  the  phones

as the conversations get longer, we’re tying up resources even when we’re not actively working

Page 18: Systems Programming in Python - Week 6 Asynchronous Network I/O

how  many  threads  can  I  make?  

demo
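The demo itself is not in these slides. One way to run such an experiment might look like the sketch below: it parks threads on an event until either the requested count is reached or the interpreter refuses to start another. The helper name `park_threads` is an assumption, and the limit you hit depends on the OS and its settings.

```python
import threading


def park_threads(n):
    """Try to start n threads that all sit idle; return how many actually started."""
    release = threading.Event()
    threads = []
    try:
        for _ in range(n):
            thread = threading.Thread(target=release.wait)  # each thread just waits
            thread.start()
            threads.append(thread)
    except RuntimeError:
        # CPython raises RuntimeError ("can't start new thread") at the limit.
        pass
    started = len(threads)
    release.set()                  # let every parked thread finish
    for thread in threads:
        thread.join()
    return started


if __name__ == "__main__":
    print(park_threads(1000))
```

Each idle thread still reserves its own stack, which is the "running out of space for more people" part of the analogy.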

Page 19: Systems Programming in Python - Week 6 Asynchronous Network I/O

10,000  simultaneous  connections

c10k

Page 20: Systems Programming in Python - Week 6 Asynchronous Network I/O

Note  the  dated  specs...  this  was  said  a  *while*  ago!    http://www.kegel.com/c10k.html

“It's time for web servers to handle ten thousand clients simultaneously, don't you think? After all, the web is a big place now.

And computers are big, too. You can buy a 1000MHz machine with 2 gigabytes of RAM and an 1000Mbit/sec Ethernet card for $1200 or so. Let's see - at 20000 clients, that's 50KHz, 100Kbytes, and 50Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of twenty thousand clients. <...>

So hardware is no longer the bottleneck.”
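The per-client figures in the quote can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 gigabyte = 10^9 bytes), which is what makes the quoted numbers come out round.

```python
# Machine specs from the quote, split evenly across 20,000 clients.
CLIENTS = 20_000
cpu_hz = 1000 * 10**6        # 1000 MHz
ram_bytes = 2 * 10**9        # 2 gigabytes (decimal units assumed)
net_bits_per_sec = 1000 * 10**6   # 1000 Mbit/sec

per_client = {
    "cpu_khz": cpu_hz / CLIENTS / 1000,
    "ram_kbytes": ram_bytes / CLIENTS / 1000,
    "net_kbits_per_sec": net_bits_per_sec / CLIENTS / 1000,
}
print(per_client)
# {'cpu_khz': 50.0, 'ram_kbytes': 100.0, 'net_kbits_per_sec': 50.0}
```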

Page 21: Systems Programming in Python - Week 6 Asynchronous Network I/O

why  does  this  happen?

Page 22: Systems Programming in Python - Week 6 Asynchronous Network I/O

maybe  even  maliciously  slow?

slow  clients

Page 23: Systems Programming in Python - Week 6 Asynchronous Network I/O

nature  of  the  problem

(think  chat  applications,  or    messaging  servers)

Page 24: Systems Programming in Python - Week 6 Asynchronous Network I/O

networks  are  always  ‘slow’,  so  DB  and  other  service  requests  add  up

something  else  is  slow,  and  you  need  to  wait  on  it

Page 25: Systems Programming in Python - Week 6 Asynchronous Network I/O

until recently, most of us didn’t need to worry about this stuff

Page 26: Systems Programming in Python - Week 6 Asynchronous Network I/O

have  you  used  chat  in  gmail?

real time web, chat, events, live updates are getting more and more common

soon  they’ll  be  expected  everywhere

Page 27: Systems Programming in Python - Week 6 Asynchronous Network I/O


Asynchronous I/O

Page 28: Systems Programming in Python - Week 6 Asynchronous Network I/O

       http://en.wikipedia.org/wiki/Asynchronous_I/O

ignoring:

hardware interrupts

I/O callback functions

I/O completion ports

Page 29: Systems Programming in Python - Week 6 Asynchronous Network I/O

processes

Page 30: Systems Programming in Python - Week 6 Asynchronous Network I/O

threads

(OS  threads)

Page 31: Systems Programming in Python - Week 6 Asynchronous Network I/O

user  threads

(lightweight  threads)

Page 32: Systems Programming in Python - Week 6 Asynchronous Network I/O

select  loop

• give list of file descriptors, block until one is ready

• slows down if hundreds or thousands of file descriptors
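A minimal sketch of a select loop, assuming an echo server as the workload. `make_listener`, `select_step` and `serve_forever` are names invented here; only `select.select` and the socket calls come from the standard library.

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening socket (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


def select_step(listener, clients, timeout=1.0):
    """One pass of the loop: hand select every descriptor, then service the ready ones."""
    readable, _, _ = select.select([listener] + clients, [], [], timeout)
    for sock in readable:
        if sock is listener:
            conn, _addr = listener.accept()   # a new caller
            clients.append(conn)
        else:
            data = sock.recv(4096)
            if data:
                sock.sendall(data)            # fine for a sketch; a large write could block
            else:
                clients.remove(sock)          # client hung up
                sock.close()


def serve_forever(listener):
    clients = []
    while True:
        select_step(listener, clients)
```

Note that the whole descriptor list is passed in on every pass, which is where the slowdown with thousands of descriptors comes from.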

Page 33: Systems Programming in Python - Week 6 Asynchronous Network I/O

epoll,  kqueue

• same as select loop, but only returns the file descriptors which are ready

• fast even with many thousands of (idle) file descriptors
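A sketch of the "only the ready ones come back" behaviour, using the standard library's `selectors` module, which picks epoll or kqueue where the platform offers it. The function name `ready_among_idle` and the use of socket pairs as stand-in connections are assumptions for illustration.

```python
import selectors
import socket


def ready_among_idle(n_idle=100):
    """Register n_idle silent connections plus one active one; return who is ready."""
    # DefaultSelector uses epoll on Linux, kqueue on BSD/macOS, else poll/select.
    selector = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(n_idle + 1)]
    try:
        for index, (reader, _writer) in enumerate(pairs):
            selector.register(reader, selectors.EVENT_READ, data=index)
        # Only the last connection has anything to say; the rest stay idle.
        pairs[-1][1].sendall(b"ping")
        events = selector.select(timeout=1.0)
        return [key.data for key, _mask in events]
    finally:
        selector.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()


if __name__ == "__main__":
    print(ready_among_idle())   # the index of the one active connection
```

Descriptors are registered once up front, so each call to `select()` does not have to resend the whole list.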

Page 34: Systems Programming in Python - Week 6 Asynchronous Network I/O

analogy - if the operator needs to do work themselves, all calls are slowed down

these work only as long as we're blocking on I/O

Page 35: Systems Programming in Python - Week 6 Asynchronous Network I/O

for  both  select  and  epoll,  we  have  a  main  event  loop

Page 36: Systems Programming in Python - Week 6 Asynchronous Network I/O

being  programmers,  the  next  step  is  to  add  abstraction  layers

Page 37: Systems Programming in Python - Week 6 Asynchronous Network I/O

loop  with  inline  code?

Page 38: Systems Programming in Python - Week 6 Asynchronous Network I/O

a reactor which owns the loop and calls methods on our objects?

Page 39: Systems Programming in Python - Week 6 Asynchronous Network I/O

register callbacks with the loop to handle each connection?
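A toy version of the callback style, assuming a loop built on the standard `selectors` module. `CallbackLoop`, `on_readable`, `run_once` and `demo` are invented names, not any framework's API.

```python
import selectors
import socket


class CallbackLoop:
    """Tiny event loop: remembers one callback per socket and calls it when readable."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def on_readable(self, sock, callback):
        # The callback rides along as the selector key's `data` field.
        self._selector.register(sock, selectors.EVENT_READ, data=callback)

    def run_once(self, timeout=1.0):
        for key, _mask in self._selector.select(timeout):
            key.data(key.fileobj)      # invoke the registered callback

    def close(self):
        self._selector.close()


def demo():
    """Register a callback on one end of a socket pair and trigger it from the other."""
    ours, theirs = socket.socketpair()
    loop = CallbackLoop()
    received = []
    loop.on_readable(ours, lambda sock: received.append(sock.recv(4096)))
    theirs.sendall(b"ring")
    loop.run_once()
    loop.close()
    ours.close()
    theirs.close()
    return received


if __name__ == "__main__":
    print(demo())
```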

Page 40: Systems Programming in Python - Week 6 Asynchronous Network I/O

use coroutine magic to write code which looks like it blocks, but really hands execution off to the loop with a way to jump back when ready?
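One present-day instance of this style is the standard library's `asyncio`, which the slides do not cover; it is used here only as an illustration. In the sketch, `asyncio.sleep` stands in for a slow network request, and `fetch` and `main` are made-up names.

```python
import asyncio


async def fetch(name, delay, log):
    """Looks like blocking code, but each await hands control back to the loop."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)         # stand-in for waiting on the network
    log.append(f"{name} done")
    return name


async def main():
    log = []
    # Both coroutines are in flight at once on a single thread.
    results = await asyncio.gather(
        fetch("slow", 0.2, log),
        fetch("fast", 0.05, log),
    )
    return results, log


if __name__ == "__main__":
    results, log = asyncio.run(main())
    print(results)   # ['slow', 'fast']
    print(log)       # ['slow start', 'fast start', 'fast done', 'slow done']
```

The log shows the fast request finishing while the slow one is still waiting, even though neither function contains an explicit callback.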

Page 41: Systems Programming in Python - Week 6 Asynchronous Network I/O

Yes.

Page 42: Systems Programming in Python - Week 6 Asynchronous Network I/O

all  of  those  and  more

Page 43: Systems Programming in Python - Week 6 Asynchronous Network I/O

Python  generators  are  limited  coroutines

wait,  what’s  a  coroutine?

subroutines  are  called,  then  exit

coroutines  call  each  other
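A small sketch of generators acting as limited coroutines: each `yield` pauses the function and hands control back, and a scheduler decides who runs next. The names `worker` and `run_round_robin` are assumptions for illustration.

```python
from collections import deque


def worker(name, steps, log):
    """A generator that does a little work, then yields control back."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                          # pause here until the scheduler resumes us


def run_round_robin(tasks):
    """Resume each generator in turn until all of them are finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)                 # run the task up to its next yield
        except StopIteration:
            continue                   # task finished; drop it
        queue.append(task)


if __name__ == "__main__":
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    print(log)   # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

The "limited" part is visible here: a generator can only yield to whoever resumed it, so a scheduler has to sit in the middle rather than the workers calling each other directly.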

Page 44: Systems Programming in Python - Week 6 Asynchronous Network I/O


Python options

Page 45: Systems Programming in Python - Week 6 Asynchronous Network I/O

there  are  far,  far  too  many  to  cover

Page 46: Systems Programming in Python - Week 6 Asynchronous Network I/O

1  operator  answering  the  phones,  getting  answers,  etc

remember  the  echo  server?

this  is  blocking  I/O
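The echo server from earlier in the course is not reproduced in these slides; a minimal blocking version might look like the sketch below. The function names and the port number are assumptions.

```python
import socket


def handle_one_client(listener):
    """Answer one call and stay on it until the caller hangs up."""
    conn, _addr = listener.accept()    # blocks until a client connects
    with conn:
        while True:
            data = conn.recv(4096)     # blocks until the client sends something
            if not data:               # empty read means the client closed
                break
            conn.sendall(data)


def serve(host="127.0.0.1", port=50000):
    """Single operator: clients are served strictly one after another."""
    with socket.create_server((host, port)) as listener:
        while True:
            handle_one_client(listener)


if __name__ == "__main__":
    serve()
```

While one client is connected, every other caller waits in the listen backlog, no matter how little that first client is actually saying.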

Page 47: Systems Programming in Python - Week 6 Asynchronous Network I/O

http://docs.python.org/library/asyncore.html

asyncore

• stdlib
• handles the select loop for you
• you subclass an object, create handlers to do the work

Page 48: Systems Programming in Python - Week 6 Asynchronous Network I/O

       http://twistedmatrix.com/trac/

twisted

• the original async framework in Python
• large, efficient, steep learning curve
• twisted
• callback style programming, with deferreds to keep things clean
• if you already understand JavaScript callbacks and jQuery deferreds, you won’t be confused

Page 49: Systems Programming in Python - Week 6 Asynchronous Network I/O

http://www.gevent.org/

gevent

• best of both worlds: "What you get is all the performance and scalability of an event system with the elegance and straightforward model of blocking IO programing."

• uses greenlets to cooperatively swap state and switch between functions

Page 50: Systems Programming in Python - Week 6 Asynchronous Network I/O


Wrap up

Page 51: Systems Programming in Python - Week 6 Asynchronous Network I/O

if you have a c10k problem, you probably need async I/O

Page 52: Systems Programming in Python - Week 6 Asynchronous Network I/O

async  I/O  is  really  just:

do  one  thing  at  a  time,  very  quickly

Page 53: Systems Programming in Python - Week 6 Asynchronous Network I/O

there are a bunch of ways to implement the async part, with different trade-offs

Page 54: Systems Programming in Python - Week 6 Asynchronous Network I/O

there are a bunch of abstractions to make it easier to understand async code

Page 55: Systems Programming in Python - Week 6 Asynchronous Network I/O

this whole talk is only relevant when you’re I/O bound

Page 56: Systems Programming in Python - Week 6 Asynchronous Network I/O


Questions?