Asynchronous RPC with Gevent and RabbitMQ, Alexander Mokrov


Page 1: Gevent rabbit rpc

Asynchronous RPC with Gevent and RabbitMQ

Alexander Mokrov

Page 2: Gevent rabbit rpc

What this talk is about

Some limitations of Celery

How to work around them

Gevent

RabbitMQ (some specifics)

A model for asynchronous RPC will be proposed

Page 3: Gevent rabbit rpc

Example of building an application with a Celery workflow

get flour

bake pie

get meat

seal pie

create dough

order pie

get milk

get eggs

Page 4: Gevent rabbit rpc

...

entry point 2

entry point 1

entry point 3

service n

app

service 2

service 1

Page 5: Gevent rabbit rpc

sommelier

winery

chateau

Service of degustation 3

app

Service of degustation 2

Service of degustation 1

Page 6: Gevent rabbit rpc

Wine tasting with gevent and RabbitMQ

Page 7: Gevent rabbit rpc

entry task

service task 1

DB

callback task 1

service task n

callback task n

Page 8: Gevent rabbit rpc

What we would like to get

entry task

service task 1

service task 2

service task n

Page 9: Gevent rabbit rpc

long task services

time

Page 10: Gevent rabbit rpc

Celery AsyncResult

async_result = task.apply_async()

print(async_result.ready())

False

result = async_result.get()  # blocks until the task has finished

Page 11: Gevent rabbit rpc

long task service task

persistent queue

exclusive queue

reply_to=amq.gen-E6...

correlation_id

Request

correlation_id

Response reply_to=amq.gen-E6...

RabbitMQ RPC

Page 12: Gevent rabbit rpc

greenlet

greenlet

long task

service response listener

service service service

reply_to exclusive queue

services queues

Page 13: Gevent rabbit rpc

Workers

● solo

● prefork

● eventlet

● gevent

Page 14: Gevent rabbit rpc

Gevent

gevent is a concurrency library based around libev. It provides a clean API for a variety of concurrency and network related tasks.

Page 15: Gevent rabbit rpc

Greenlet

The primary pattern used in gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. Greenlets all run inside of the OS process for the main program but are scheduled cooperatively.

Only one greenlet is ever running at any given time.

Spin-off of Stackless, a version of CPython that supports micro-threads called “tasklets”. Tasklets run pseudo-concurrently (typically in a single or a few OS-level threads) and are synchronized with data exchanges on “channels”.

It is a coroutine.

Page 16: Gevent rabbit rpc

Event loop

Page 17: Gevent rabbit rpc

import gevent

def foo():
    print('Running in foo')
    gevent.sleep(0)
    print('Explicit context switch to foo')

def bar():
    print('Explicit context to bar')
    gevent.sleep()
    print('Implicit context switch to bar')

gevent.joinall([
    gevent.spawn(foo),
    gevent.spawn(bar),
])

Output:

Running in foo
Explicit context to bar
Explicit context switch to foo
Implicit context switch to bar

Page 18: Gevent rabbit rpc

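import random
import gevent

def task(pid):
    gevent.sleep(random.randint(0, 2) * 0.001)
    print('Task %s done' % pid)

def synchronous():
    for i in range(1, 8):
        task(i)

def asynchronous():
    # range(8) matches the eight tasks shown in the output below
    threads = [gevent.spawn(task, i) for i in range(8)]
    gevent.joinall(threads)

print('Synchronous:')
synchronous()

print('Asynchronous:')
asynchronous()

Example output (the asynchronous order varies between runs):

Synchronous:
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Asynchronous:
Task 1 done
Task 5 done
Task 6 done
Task 2 done
Task 4 done
Task 7 done
Task 0 done
Task 3 done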

Page 19: Gevent rabbit rpc

import time

def echo(i):
    time.sleep(0.001)
    return i

# Non Deterministic Process Pool

from multiprocessing.pool import Pool

p = Pool(10)
run1 = [a for a in p.imap_unordered(echo, range(10))]
run2 = [a for a in p.imap_unordered(echo, range(10))]
run3 = [a for a in p.imap_unordered(echo, range(10))]
run4 = [a for a in p.imap_unordered(echo, range(10))]

print(run1 == run2 == run3 == run4)

False

Page 20: Gevent rabbit rpc

# Deterministic Gevent Pool

from gevent.pool import Pool

p = Pool(10)
run1 = [a for a in p.imap_unordered(echo, range(10))]
run2 = [a for a in p.imap_unordered(echo, range(10))]
run3 = [a for a in p.imap_unordered(echo, range(10))]
run4 = [a for a in p.imap_unordered(echo, range(10))]

print(run1 == run2 == run3 == run4)

True

Page 21: Gevent rabbit rpc

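Spawning Greenlets

import gevent
from gevent import Greenlet

# foo is not defined on the slide; a minimal stand-in is assumed here.
def foo(message, n):
    gevent.sleep(n)
    print(message)

thread1 = Greenlet.spawn(foo, "message", 1)
thread2 = gevent.spawn(foo, "message", 2)
thread3 = gevent.spawn(lambda x: (x + 1), 2)

threads = [thread1, thread2, thread3]

# Block until all threads complete.
gevent.joinall(threads)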

Page 22: Gevent rabbit rpc

import gevent
from gevent import Greenlet

class MyGreenlet(Greenlet):

    def __init__(self, message, n):
        Greenlet.__init__(self)
        self.message = message
        self.n = n

    def _run(self):
        print(self.message)
        gevent.sleep(self.n)

g = MyGreenlet("Hi there!", 3)
g.start()
g.join()

Page 23: Gevent rabbit rpc

Greenlet State

started -- Boolean, indicates whether the Greenlet has been started

ready() -- Boolean, indicates whether the Greenlet has halted

successful() -- Boolean, indicates whether the Greenlet has halted and not thrown an exception

value -- arbitrary, the value returned by the Greenlet

exception -- exception, uncaught exception instance thrown inside the greenlet
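The state attributes listed above can be observed on one greenlet that returns normally and one that raises. This is a small sketch, assuming gevent is installed; `succeed` and `fail` are hypothetical helpers. The results of `ready()` and `successful()` are wrapped in `bool()` because gevent only promises true or false values there, not the literal constants. gevent also prints the traceback of the failing greenlet to stderr.

```python
import gevent


def succeed():
    return "ok"


def fail():
    raise ValueError("boom")


ok = gevent.spawn(succeed)
bad = gevent.spawn(fail)

# spawn() only schedules the greenlets; neither has run yet.
states_before = (bool(ok.ready()), bool(bad.ready()))

# joinall() waits for both; by default it does not re-raise their errors.
gevent.joinall([ok, bad])

print(states_before)
print(bool(ok.ready()), bool(ok.successful()), ok.value)
print(bool(bad.ready()), bool(bad.successful()), repr(bad.exception))
```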

Page 24: Gevent rabbit rpc

greenlet

greenlet

long task

service response listener

subscribe(task_id)

Page 25: Gevent rabbit rpc

Timeouts

Page 26: Gevent rabbit rpc

import gevent
from gevent import Timeout

seconds = 10

timeout = Timeout(seconds)
timeout.start()

def wait():
    gevent.sleep(10)

try:
    gevent.spawn(wait).join()
except Timeout:
    print('Could not complete')

Page 27: Gevent rabbit rpc

import gevent
from gevent import Timeout

time_to_wait = 5  # seconds

class TooLong(Exception):
    pass

with Timeout(time_to_wait, TooLong):
    gevent.sleep(10)

Page 28: Gevent rabbit rpc

class Queue(maxsize=None, items=None)

empty()

full()

get(block=True, timeout=None)

get_nowait()

next()

peek(block=True, timeout=None)

peek_nowait()

put(item, block=True, timeout=None)

put_nowait(item)

qsize()
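A short sketch of how a few of these methods behave between two greenlets, assuming gevent is installed. The `producer` and `consumer` functions are hypothetical; the point is that `put` yields to other greenlets while the queue is full and `get` raises `Empty` once its timeout expires.

```python
import gevent
from gevent.queue import Queue, Empty

results = Queue(maxsize=2)
received = []


def producer():
    for i in range(3):
        results.put(i)  # yields to other greenlets while the queue is full


def consumer():
    while True:
        try:
            # Give up if nothing arrives within 0.1 seconds.
            item = results.get(block=True, timeout=0.1)
        except Empty:
            break
        received.append(item)


gevent.joinall([gevent.spawn(producer), gevent.spawn(consumer)])
print(received)
```

The same `get(block=True, timeout=...)` call is what a task greenlet uses later in the talk to wait for a service response.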

Page 29: Gevent rabbit rpc

greenlet

greenlet

greenlet

service results dispatcher

gevent.queues

task_id

reply_to, results_queue

Page 30: Gevent rabbit rpc

Events

Groups and Pools

Locks and Semaphores

Subprocess

Thread Locals

Actors

Page 31: Gevent rabbit rpc
Page 32: Gevent rabbit rpc

Monkey patching

guerrilla patch

gorilla patch

monkey patch

Page 33: Gevent rabbit rpc

import socket

print(socket.socket)

from gevent import monkey

monkey.patch_socket()

print("After monkey patch")

print(socket.socket)

import select

print(select.select)

monkey.patch_select()

print("After monkey patch")

print(select.select)

<class 'socket.socket'>

After monkey patch

<class 'gevent._socket3.socket'>

<built-in function select>

After monkey patch

<function select at 0x7ff7e111c378>
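The mechanism behind this output is plain attribute replacement at runtime. The sketch below uses only the standard library to show the general idea; it is not how gevent implements its patches, and `fake_sleep`, `patched_sleep`, and `legacy_code` are hypothetical names.

```python
import time
from contextlib import contextmanager

calls = []


def fake_sleep(seconds):
    """Stand-in for a cooperative sleep: record the request, do not block."""
    calls.append(seconds)


@contextmanager
def patched_sleep():
    original = time.sleep
    time.sleep = fake_sleep      # every lookup of time.sleep now finds fake_sleep
    try:
        yield
    finally:
        time.sleep = original    # always restore the real function


def legacy_code():
    # Written against the standard library, unaware of any patching.
    time.sleep(5)


with patched_sleep():
    legacy_code()

print(calls, time.sleep is fake_sleep)
```

Because `legacy_code` looks up `time.sleep` at call time, it picks up the replacement without being modified, which is why monkey patching should happen before other modules bind the original functions.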

Page 34: Gevent rabbit rpc

Stack layout for a greenlet

               |     ^^^       |
               |  older data   |
               |               |
  stack_stop . |_______________|
        .      |               |
        .      | greenlet data |
        .      |   in stack    |
        .    * |_______________| . .  _____________  stack_copy + stack_saved
        .      |               |     |             |
        .      |     data      |     |greenlet data|
        .      |   unrelated   |     |    saved    |
        .      |      to       |     |   in heap   |
 stack_start . |     this      | . . |_____________| stack_copy
               |   greenlet    |
               |               |
               |  newer data   |
               |     vvv       |

Page 35: Gevent rabbit rpc

greenlet

greenlet

greenlet

service results dispatcher

service service service

reply_to exclusive queue

services queues

subscribe

gevent.queues

Page 36: Gevent rabbit rpc

Service Result Dispatcher

Page 37: Gevent rabbit rpc

greenlet

greenlet

greenlet

service results dispatcher

reply_to exclusive queue

reply_to, results_queue

gevent.queues

task_id

task_id, reply_to

Page 38: Gevent rabbit rpc

class ServiceResultsDispatcher(Greenlet):

    def __init__(self):
        ...
        self.reply_to = None
        Greenlet.__init__(self)

    def create_connection(self):
        ...
        # The broker generates a name for the exclusive reply queue.
        result = self.channel.queue_declare(exclusive=True)
        self.reply_to = result.method.queue

Page 39: Gevent rabbit rpc

    def subscribe(self, task_id):
        # One gevent queue per task; the dispatcher fills it with replies.
        service_results_queue = gevent.queue.Queue()
        self.service_results[task_id] = service_results_queue
        return service_results_queue, self.reply_to

    def unsubscribe(self, task_id):
        self.service_results.pop(task_id, None)

Page 40: Gevent rabbit rpc

    def _run(self):
        while True:
            try:
                for method_frame, properties, body in self.channel.consume(self.reply_to, no_ack=True):
                    # Route the reply to the queue registered in subscribe().
                    if properties.correlation_id in self.service_results:
                        self.service_results[properties.correlation_id].put_nowait((method_frame, properties, body))
            except ...
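The routing logic of the dispatcher can be tried without a broker. The following is a sketch under stated assumptions: a gevent queue of `(correlation_id, body)` tuples stands in for the pika channel, `InMemoryDispatcher` is a hypothetical class name, and `None` is a shutdown sentinel used only in this example.

```python
import gevent
from gevent import Greenlet
from gevent.queue import Queue


class InMemoryDispatcher(Greenlet):
    """Routes replies from one shared queue to per-task queues."""

    def __init__(self, replies):
        Greenlet.__init__(self)
        self.replies = replies          # stand-in for the exclusive reply_to queue
        self.service_results = {}       # task_id -> gevent queue

    def subscribe(self, task_id):
        results_queue = Queue()
        self.service_results[task_id] = results_queue
        return results_queue

    def unsubscribe(self, task_id):
        self.service_results.pop(task_id, None)

    def _run(self):
        # iter(callable, sentinel) keeps calling get() until it returns None.
        for correlation_id, body in iter(self.replies.get, None):
            results_queue = self.service_results.get(correlation_id)
            if results_queue is not None:   # drop replies nobody waits for
                results_queue.put_nowait(body)


replies = Queue()
dispatcher = InMemoryDispatcher(replies)
dispatcher.start()

first = dispatcher.subscribe("task-1")
second = dispatcher.subscribe("task-2")

# Replies arrive out of order, plus one for an unknown task.
for message in [("task-2", "b"), ("task-unknown", "x"), ("task-1", "a"), None]:
    replies.put(message)

dispatcher.join()
routed = (first.get(timeout=1), second.get(timeout=1))
print(routed)
```

Each task only ever reads from its own queue, so many task greenlets can share the single consumer on the reply queue.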

Page 41: Gevent rabbit rpc

Greenlet task

Page 42: Gevent rabbit rpc

greenlet

greenlet

greenlet

service results dispatcher

services queues

subscribe

gevent.queues

Page 43: Gevent rabbit rpc

self.results_queue, self.reply_to = self.service_publisher.subscribe(self.task_id)

self.channel.basic_publish(
    exchange='',
    routing_key=service_queue,
    properties=BasicProperties(
        reply_to=self.reply_to,
        correlation_id=self.task_id,
    ),
    body=request)

Page 44: Gevent rabbit rpc

try:
    method_frame, properties, body = self.results_queue.get(block=True, timeout=self.timeout)
except Empty:
    logger.info('timeout')
    break
else:
    logger.info('body = {}'.format(body))

Page 45: Gevent rabbit rpc

Services

response = channel.basic_publish(
    exchange='',
    routing_key=props.reply_to,
    properties=BasicProperties(correlation_id=request.task_id),
    body=response
)

Page 46: Gevent rabbit rpc

Alternatives?

Why gevent?

1. Built-in support in Celery (little effort required)

2. The talk was meant to cover gevent specifically. Nothing prevents reimplementing this with, for example, asyncio.
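As an illustration of the second point, here is a rough asyncio sketch of the same dispatcher idea. It is not from the talk: `asyncio.Queue` objects stand in for RabbitMQ queues, a future per `correlation_id` replaces the per-task gevent queue, and all function names are hypothetical.

```python
import asyncio


async def service(requests):
    """Hypothetical service: answers each request via its reply queue."""
    while True:
        correlation_id, reply_to, body = await requests.get()
        await asyncio.sleep(0.01)                 # simulate slow work
        await reply_to.put((correlation_id, body * 2))


async def dispatcher(replies, pending):
    """Resolve the future that matches each reply's correlation_id."""
    while True:
        correlation_id, body = await replies.get()
        future = pending.pop(correlation_id, None)
        if future is not None and not future.done():
            future.set_result(body)


async def call(requests, replies, pending, correlation_id, body, timeout=1.0):
    future = asyncio.get_running_loop().create_future()
    pending[correlation_id] = future
    await requests.put((correlation_id, replies, body))
    try:
        return await asyncio.wait_for(future, timeout)
    finally:
        pending.pop(correlation_id, None)         # same role as unsubscribe()


async def main():
    requests, replies, pending = asyncio.Queue(), asyncio.Queue(), {}
    workers = [
        asyncio.create_task(service(requests)),
        asyncio.create_task(dispatcher(replies, pending)),
    ]
    try:
        # Three calls are in flight at once; gather keeps their order.
        return await asyncio.gather(
            *(call(requests, replies, pending, f"task-{n}", n) for n in range(3))
        )
    finally:
        for worker in workers:
            worker.cancel()


results = asyncio.run(main())
print(results)
```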

Page 47: Gevent rabbit rpc

Conclusion

Page 48: Gevent rabbit rpc

References

http://www.gevent.org

http://sdiehl.github.io/gevent-tutorial/

https://github.com/python-greenlet/greenlet

https://www.rabbitmq.com/

http://www.celeryproject.org/

Page 49: Gevent rabbit rpc

Thank you for your attention!